# Searching for Phosphine in the Venusian Atmosphere with the JWST MIRI Instrument

As computational scientists and astronomers, you have just obtained a (simulated) spectrum of the Venusian atmosphere with the JWST MIRI instrument! This instrument covers the 350-2000 wavenumber (cm$^{-1}$) range.
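
Infrared ranges are often quoted as wavelengths rather than wavenumbers. The short sketch below converts the two ends of the range quoted above, assuming the wavenumbers are in cm$^{-1}$ (the usual unit in infrared spectroscopy). The function name `wavenumber_to_micron` is made up for this illustration and is not part of `SciXFunctions`.

```python
def wavenumber_to_micron(wavenumber_cm):
    """Convert a wavenumber in cm^-1 to a wavelength in micrometres."""
    # 1 cm = 10,000 micrometres, so wavelength (micron) = 10,000 / wavenumber (cm^-1)
    return 1.0e4 / wavenumber_cm

# The two ends of the range quoted in the text
low, high = 350.0, 2000.0

print(f"{high:.0f} cm^-1 -> {wavenumber_to_micron(high):.1f} micron")
print(f"{low:.0f} cm^-1 -> {wavenumber_to_micron(low):.1f} micron")
```

Note that a larger wavenumber means a shorter wavelength, so the 2000 end of the range is the short-wavelength end.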

With this notebook you will:
1. Visualise the component molecules of the Venusian atmosphere so you have an idea of what to look for in the Venusian atmosphere data.
2. Visualise the simulated JWST MIRI spectrum of the Venusian atmosphere to gain an understanding of how the relative abundance of each component molecule impacts the spectrum and look for any unknown molecule spectral features.
3. Use your calculated phosphine data to check if the unknown molecule spectral features could be phosphine.
4. Bring all of your analysis together to decide if you can use your calculated data to detect phosphine in the atmosphere.


In [None]:
# --------------------------------------------------
# Importing Libraries and Connecting to your Drive
# --------------------------------------------------
import re
import numpy as np
import pandas as pd
import seaborn as sns
from scipy import integrate
import matplotlib.pyplot as plt
from ipywidgets import interact

# Setting the figures to be large with bigger font
sns.set_context('talk', font_scale=1.1)

from google.colab import drive
drive.mount("/content/drive")

drivedir = '/content/drive/MyDrive/'

# Pool of Python functions written for our data analysis and processing
! cp /content/drive/MyDrive/SciXIgnite_Astro/SciXFunctions.py /content/SciXFunctions.py

from SciXFunctions import figsettings, readiqmol, makespectra, givewidth

## Master code for plotting (Press the play button below!)

In [None]:
### DO NOT edit this code! ###

# ------------------------
# makespectraphosphine
# ------------------------

def makespectraphosphine(freqs, intens, npts = 200, peakwidth = 5, sflow = 0.970, sfmid = 0.953, sfhigh = 0.9475):

    freqs_scaled = []
    for frequency in freqs:
        if frequency < 1000:
            freqs_scaled.append(frequency*sflow)
        if 1000 <= frequency < 2000:
            freqs_scaled.append(frequency*sfmid)
        if frequency >= 2000:
            freqs_scaled.append(frequency*sfhigh)

    print("Scaled Frequencies: ", freqs_scaled)

    minf = min(freqs_scaled)-(min(freqs_scaled)*0.1)
    maxf = max(freqs_scaled)+(max(freqs_scaled)*0.1)
    x = np.linspace(minf,maxf,npts)
    y = np.zeros(npts)

    for i in range(len(freqs)):
        # This adds each peak sequentially as a Gaussian, based on the intensity of the peak and the peak width
        y = y - np.exp(-(2.77/(2*peakwidth)**2)*(x-freqs_scaled[i])**2)*intens[i]

    # The peaks were subtracted above, so take the absolute value to return them as positive
    return x.tolist(), np.abs(y)


# ------------------------
# MoleculesSpectra
# ------------------------

def MoleculesSpectra(Molecule):

  # --------------
  # Carbon Dioxide
  # --------------

  # Data that we want to plot. In this case, it is the IQMol output file for carbon dioxide.
  iq_CO2 = readiqmol("/content/drive/MyDrive/SciXIgnite_Astro/IQMolFiles/CO2.out", sflow = 0.970, sfmid = 0.953, sfhigh = 0.9475)
  # Adding the width to the frequencies and intensities
  iq_CO2 = makespectra(iq_CO2, peakwidth=20, npts=500)

  # -----
  # Water
  # -----

  # Data that we want to plot. In this case, it is the IQMol output file for water.
  iq_water = readiqmol("/content/drive/MyDrive/SciXIgnite_Astro/IQMolFiles/H2O.out", sflow = 0.970, sfmid = 0.953, sfhigh = 0.9475)
  # Adding the width to the frequencies and intensities
  iq_water = makespectra(iq_water, peakwidth=20, npts=500)

  # ---------------
  # Sulphur Dioxide
  # ---------------

  # Data that we want to plot. In this case, it is the IQMol output file for sulphur dioxide.
  iq_SO2 = readiqmol("/content/drive/MyDrive/SciXIgnite_Astro/IQMolFiles/SO2.out", sflow = 0.970, sfmid = 0.953, sfhigh = 0.9475)
  # Adding the width to the frequencies and intensities
  iq_SO2 = makespectra(iq_SO2, peakwidth=20, npts=500)

  # ---------------
  # Phosphine
  # ---------------

  # Data that we want to plot. In this case, it is the IQMol output file for phosphine.
  iq_PH3 = readiqmol("/content/drive/MyDrive/SciXIgnite_Astro/IQMolFiles/PH3.out", sflow = 0.970, sfmid = 0.953, sfhigh = 0.9475)
  # Adding the width to the frequencies and intensities
  iq_PH3 = makespectra(iq_PH3, peakwidth=20, npts=500)

  # Combining the frequencies and intensities of all four molecules, weighted by abundance: 95% CO2, 1% H2O, 1% SO2, and 3% PH3
  all_freq = iq_CO2['Frequency'] + iq_water['Frequency'] + iq_SO2['Frequency'] + iq_PH3['Frequency']
  all_int = list(np.array(iq_CO2['Intensity_norm'])*0.95) + list(np.array(iq_water['Intensity_norm'])*0.01) + list(np.array(iq_SO2['Intensity_norm'])*0.01) + list(np.array(iq_PH3['Intensity_norm'])*0.03)


  # ----------
  # Collecting the data for each molecule in 'empty_dict' in a form suitable for plotting
  columns = ['CO2', 'H2O', 'SO2']
  colour_list = ['grey', 'blue', 'orange']
  scales = [0.95,0.01,0.01,0.03]

  data_dict = {'CO2':iq_CO2, 'H2O':iq_water, 'SO2':iq_SO2}

  empty_dict = {}

  for mol,colour,scale in zip(columns,colour_list,scales):
      if mol is not None:
          temp_data = data_dict[mol]
          temp_freq = temp_data['Frequency']
          temp_int = list(np.array(temp_data['Intensity_norm'])*scale)
          freq, ints = givewidth(temp_freq, temp_int, peakwidth=20, npts=500)

          empty_dict[mol] = list(freq),list(ints),colour,scale


  # --------------------------
  # SPECTRUM WITH ABUNDANCES
  # --------------------------

  # Figure formatting.
  fig = plt.figure(figsize = (15,7))
  ax = fig.add_subplot(111)

  # Remember that we need to give width to our peaks. We are using the givewidth function for that
  joinf, joini = givewidth(all_freq, all_int, peakwidth=20, npts=500)

  ax.plot(joinf, joini, label = "Venus Atmosphere", color = 'goldenrod')



  # Dictionary to display the chemical formulas with the right formatting.
  mol_names = {'CO2':'CO$_2$ : Carbon Dioxide',
                'H2O':'H$_2$O : Water',
                'SO2':'SO$_2$ : Sulphur Dioxide'}

  wfreqs = np.array(empty_dict[Molecule][0])
  wintens = np.array(empty_dict[Molecule][1])#*empty_dict[Molecule][3]

  ax.plot(wfreqs,wintens, linestyle = '-', label = mol_names[Molecule], color = 'rebeccapurple',linewidth=1.5)

  # Plot legend.
  legend = plt.legend(frameon = False, loc = 'best')

  # Plot axes formatting.
  ax = figsettings(ax);
  ax[0].set_xlim(350,2000)
  plt.ylim(-0.1,max(all_int)+0.1)

  plt.show()

  return


# ------------------------
# PhosphineSpectra
# ------------------------

def PhosphineSpectra(freqs, intens, abundance, sflow = 0.970, sfmid = 0.953, sfhigh = 0.9475):

  # Loading in the data for Venusian atmosphere
  iq_CO2 = readiqmol("/content/drive/MyDrive/SciXIgnite_Astro/IQMolFiles/CO2.out", sflow = 0.970, sfmid = 0.953, sfhigh = 0.9475)
  iq_water = readiqmol("/content/drive/MyDrive/SciXIgnite_Astro/IQMolFiles/H2O.out", sflow = 0.970, sfmid = 0.953, sfhigh = 0.9475)
  iq_SO2 = readiqmol("/content/drive/MyDrive/SciXIgnite_Astro/IQMolFiles/SO2.out", sflow = 0.970, sfmid = 0.953, sfhigh = 0.9475)
  iq_PH3 = readiqmol("/content/drive/MyDrive/SciXIgnite_Astro/IQMolFiles/PH3.out", sflow = 0.970, sfmid = 0.953, sfhigh = 0.9475)

  # Combining the frequencies and intensities of all four molecules, weighted by abundance: 95% CO2, 1% H2O, 1% SO2, and 3% PH3
  all_freq = iq_CO2['Frequency'] + iq_water['Frequency'] + iq_SO2['Frequency'] + iq_PH3['Frequency']
  all_int = list(np.array(iq_CO2['Intensity_norm'])*0.95) + list(np.array(iq_water['Intensity_norm'])*0.01) + list(np.array(iq_SO2['Intensity_norm'])*0.01) + list(np.array(iq_PH3['Intensity_norm'])*0.03)

  # Normalise the intensities to the strongest peak, then scale them by the abundance guess
  intens = (np.asarray(intens, dtype=float)/np.max(intens))*abundance
  print(intens)
  Frequency_spectrum, Intensity_spectrum = makespectraphosphine(freqs, intens, npts = 500, peakwidth = 20, sflow = sflow, sfmid = sfmid, sfhigh = sfhigh)

  # ----------------------------------
  # PLOTTING COMBINED SPECTRAL DATA
  # ----------------------------------

  fig, ax = plt.subplots(1,1,figsize = (15,8))

  # --------------------------
  # SPECTRUM WITH ABUNDANCES
  # --------------------------

  # Remember that we need to give width to our peaks. We are using the givewidth function for that
  joinf, joini = givewidth(all_freq, all_int, peakwidth=20, npts=500)

  ax.plot(joinf, joini, label = "Venus Atmosphere", color = 'goldenrod')


  ax.plot(Frequency_spectrum, Intensity_spectrum, label = "PH$_3$", color = 'green')

  ax = figsettings(ax);
  ax[0].set_xlim(350,2000)
  fig.legend(loc = 1)
  fig.show()

  return

# Loading in the data for the Venusian atmosphere
iq_CO2 = readiqmol("/content/drive/MyDrive/SciXIgnite_Astro/IQMolFiles/CO2.out", sflow = 0.970, sfmid = 0.953, sfhigh = 0.9475)
iq_water = readiqmol("/content/drive/MyDrive/SciXIgnite_Astro/IQMolFiles/H2O.out", sflow = 0.970, sfmid = 0.953, sfhigh = 0.9475)
iq_SO2 = readiqmol("/content/drive/MyDrive/SciXIgnite_Astro/IQMolFiles/SO2.out", sflow = 0.970, sfmid = 0.953, sfhigh = 0.9475)
iq_PH3 = readiqmol("/content/drive/MyDrive/SciXIgnite_Astro/IQMolFiles/PH3.out", sflow = 0.970, sfmid = 0.953, sfhigh = 0.9475)

# Combining the frequencies and intensities of all four molecules, weighted by abundance: 95% CO2, 1% H2O, 1% SO2, and 3% PH3
all_freq = iq_CO2['Frequency'] + iq_water['Frequency'] + iq_SO2['Frequency'] + iq_PH3['Frequency']
all_int = list(np.array(iq_CO2['Intensity_norm'])*0.95) + list(np.array(iq_water['Intensity_norm'])*0.01) + list(np.array(iq_SO2['Intensity_norm'])*0.01) + list(np.array(iq_PH3['Intensity_norm'])*0.03)

# Remember that we need to give width to our peaks. We are using the givewidth function for that
Frequency_Venus, Intensity_Venus = givewidth(all_freq, all_int, peakwidth=20, npts=500)
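
The plotting code above turns each calculated frequency into a peak with a width. In `makespectraphosphine`, that peak is a Gaussian with the factor 2.77 (approximately 4 ln 2) divided by `(2*peakwidth)**2`. The sketch below checks what this means in practice: the peak drops to about half of its height at a distance of `peakwidth` from its centre. The function name `gaussian_peak` and the centre value are assumptions made for this illustration; `givewidth` and `makespectra` from `SciXFunctions` may be implemented differently.

```python
import numpy as np

def gaussian_peak(x, centre, intensity, peakwidth):
    """Gaussian peak with the same exponent as in makespectraphosphine (sign aside)."""
    return intensity * np.exp(-(2.77 / (2 * peakwidth) ** 2) * (x - centre) ** 2)

# Illustrative values only: a peak centred at 1000 wavenumbers with peakwidth = 20
centre, peakwidth = 1000.0, 20.0

at_centre = gaussian_peak(centre, centre, 1.0, peakwidth)
at_one_width = gaussian_peak(centre + peakwidth, centre, 1.0, peakwidth)

print("Height at the centre:", at_centre)
print("Height one peakwidth away:", round(float(at_one_width), 3))
```

With `peakwidth=20`, as used throughout this notebook, each peak is therefore roughly 40 wavenumbers wide at half of its maximum height.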

# Using python in SciX:Ignite

Python is one of the most popular programming languages, not only in science but in many other fields. This is mostly because of its user-friendly syntax and low barrier to entry, which let newcomers learn and explore Python easily.

In SciX, however, our focus is not to teach Python from scratch but to use it as a powerful tool for data processing, analysis, and visualisation, showcasing its applications in a scientific context. You will be using a set of pre-written functions from your SciX mentors that will allow you to explore what Python can do without being a programmer.

If you want to know more about Python and learn the basics to enhance your programming skills, at UNSW we have developed a set of Jupyter notebooks exploring lots of different applications of Python. You can find these Jupyter notebooks here: https://sites.google.com/view/unswpy4sci/home.

The image below presents a proposed roadmap of the notebooks that you might find useful, increasing in complexity and Python knowledge required. If you want to know more, have a look at the entire website; there are lots of options to learn from!

<img src="BackgroudImages/SciXRoadmap.png" alt="Alternative text" width="800"/>

# 1. The Molecules in the Venusian Atmosphere

When measuring the infrared spectrum of a planetary atmosphere, several molecules make up the chemical inventory responsible for the spectral features present in the recorded spectrum. These atmospheric spectra are therefore the combination of the individual infrared spectra of all the molecules present in the atmosphere.
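
As a minimal sketch of what "combination" means here, the example below adds two single-peak spectra on a shared wavenumber grid, each weighted by how much of that molecule is present. The molecules "A" and "B", their peak positions, and their weights are all made up for illustration; they are not data for any real molecule, and the helper `peak` is not part of `SciXFunctions`.

```python
import numpy as np

# Shared wavenumber grid, so the two spectra can be added point by point
x = np.linspace(350, 2000, 500)

def peak(x, centre, peakwidth=20.0):
    """A single Gaussian peak of height 1 (hypothetical helper for this sketch)."""
    return np.exp(-(2.77 / (2 * peakwidth) ** 2) * (x - centre) ** 2)

# Two made-up molecules, each with one spectral feature
spectrum_a = peak(x, 700.0)
spectrum_b = peak(x, 1500.0)

# Made-up relative amounts of each molecule
weights = {"A": 0.9, "B": 0.1}

# The combined spectrum is the weighted sum of the individual spectra
combined = weights["A"] * spectrum_a + weights["B"] * spectrum_b

print("Tallest feature is near", round(float(x[np.argmax(combined)])), "wavenumbers")
```

The feature from the more abundant molecule dominates the combined spectrum, while the feature from the less abundant one is still present but much smaller.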

We can use Python to simulate spectra containing features from multiple molecules at the same time. In this example, we are going to plot a spectrum of the main components of the Venusian atmosphere:
- CO$_2$ : Carbon Dioxide
- H$_2$O : Water
- SO$_2$ : Sulphur Dioxide

The first step is to store the data for carbon dioxide, water, and sulphur dioxide:

In [None]:
### DO NOT edit this code! ###

iq_CO2 = readiqmol("/content/drive/MyDrive/SciXIgnite_Astro/IQMolFiles/CO2.out", sflow = 0.970, sfmid = 0.953, sfhigh = 0.9475)
iq_water = readiqmol("/content/drive/MyDrive/SciXIgnite_Astro/IQMolFiles/H2O.out", sflow = 0.970, sfmid = 0.953, sfhigh = 0.9475)
iq_SO2 = readiqmol("/content/drive/MyDrive/SciXIgnite_Astro/IQMolFiles/SO2.out", sflow = 0.970, sfmid = 0.953, sfhigh = 0.9475)

### Press the play sign in the upper-left of this box to run the code! ###

Next, we can visualise the different components of the Venusian atmosphere to get an idea for which frequencies each molecule absorbs at!

In [None]:
### DO NOT edit this code! ###

## IQMol spectra (with width) for CO2, H2O, and SO2

# Figure size
fig, ax1 = plt.subplots(figsize = (15,8))

# -------
# Carbon Dioxide
# -------

# Data that we want to plot. In this case, it is the IQMol output file for carbon dioxide.
iq_CO2 = readiqmol("/content/drive/MyDrive/SciXIgnite_Astro/IQMolFiles/CO2.out", sflow = 0.970, sfmid = 0.953, sfhigh = 0.9475)
# Adding the width to the frequencies and intensities
iq_CO2 = makespectra(iq_CO2, peakwidth=20, npts=500)

# Plotting the spectrum (with width) for carbon dioxide
ax1.plot(iq_CO2['Frequency_width'], iq_CO2["Intensity_width"],
         label = "Carbon Dioxide", color = 'maroon')

# -----
# Water
# -----

# Data that we want to plot. In this case, it is the IQMol output file for water.
iq_water = readiqmol("/content/drive/MyDrive/SciXIgnite_Astro/IQMolFiles/H2O.out", sflow = 0.970, sfmid = 0.953, sfhigh = 0.9475)
# Adding the width to the frequencies and intensities
iq_water = makespectra(iq_water, peakwidth=20, npts=500)

# Plotting the spectrum (with width) for water
ax1.plot(iq_water['Frequency_width'], iq_water["Intensity_width"],
         label = "Water", color = 'steelblue')

# ---------------
# Sulphur Dioxide
# ---------------

# Data that we want to plot. In this case, it is the IQMol output file for sulphur dioxide.
iq_SO2 = readiqmol("/content/drive/MyDrive/SciXIgnite_Astro/IQMolFiles/SO2.out", sflow = 0.970, sfmid = 0.953, sfhigh = 0.9475)
# Adding the width to the frequencies and intensities
iq_SO2 = makespectra(iq_SO2, peakwidth=20, npts=500)

# Plotting the spectrum (with width) for sulphur dioxide
ax1.plot(iq_SO2['Frequency_width'], iq_SO2["Intensity_width"],
         label = "Sulphur Dioxide", color = 'goldenrod')


ax1 = figsettings(ax1);
ax1[0].set_xlim(350,2000)
fig.legend(loc = 1)
fig.show()

### Press the play sign in the upper-left of this box to run the code! ###

> **Q:** Do any of the spectral features overlap?

> **A:**

> **Q:** If the spectral features of two different molecules overlap, how will that impact your ability to detect those molecules?

> **A:**

# 2. Visualising the Venusian Atmosphere with the JWST MIRI Instrument

Real atmospheric spectra obtained with telescopes **do not** distinguish the molecules by colour.

Additionally, the molecules in the Venusian atmosphere exist in different relative amounts!

The main molecules in the Venusian atmosphere are ranked:
1. CO$_2$ - Vast majority at around 95% of the gas
2. H$_2$O - Trace gas at ~1%
3. SO$_2$ - Trace gas ~1%

The Venusian atmosphere is impacted by the runaway greenhouse effect, which has led to the massive amount of carbon dioxide in its atmosphere. On Earth, only 0.04% of the atmosphere is composed of carbon dioxide.


The following plot is the spectrum of the Venusian atmosphere gathered by the JWST, containing
- CO$_2$,
- H$_2$O,
- SO$_2$,
- and some unknown gas (maybe phosphine).

The plot is limited to the range of the MIRI instrument, which can only gather data between 350 and 2000 wavenumbers.


The spectrum will therefore look like this:

In [None]:
### DO NOT edit this code! ###

# ----------------------------------
# PLOTTING the Venusian Atmosphere
# ----------------------------------

fig, ax = plt.subplots(1,1,figsize = (15,8))

# --------------------------
# SPECTRUM WITH ABUNDANCES
# --------------------------

ax.plot(Frequency_Venus, Intensity_Venus, label = "Venus Atmosphere", color = 'goldenrod')



ax = figsettings(ax);
ax[0].set_xlim(350,2000)
fig.legend(loc = 1)
fig.show()

### Press the play sign in the upper-left of this box to run the code! ###

You can check how the different *known* components of the Venusian atmosphere show up in the MIRI instrument data. Choose CO$_2$, H$_2$O, or SO$_2$ from the drop-down menu to see each molecule plotted on the Venusian atmosphere.

In [None]:
interact(MoleculesSpectra, Molecule = ['CO2', 'H2O', 'SO2'])

> **Q:** Are there any spectral features that don't match the three known molecules? Double-click and type the approximate wavenumbers of the unknown feature(s) in this box.

> **A:**

> **Q:** Which molecule would you guess has the highest abundance in the Venusian atmosphere? Hint: Compare the height of the spectral features between the molecules.

> **A:**

> **Q:** What is your estimate for the abundance of the molecule corresponding to the unknown features in the Venusian atmosphere? The number should be between 0 and 1.

> **A:**

# 3. Checking if the unknown features are phosphine with your calculated data

Now you will compare the data you calculated with WebMO to the Venusian atmosphere! Copy and paste the data you analysed in the previous notebook into the following two lists:

In [None]:
### Start editing here! ###

# Storing the WebMO frequencies and intensities for phosphine from the CompChem_SciXIgnite24 notebook

freqs = [] #YOUR FREQUENCIES IN THE BRACKETS
intens = [] #YOUR INTENSITIES IN THE BRACKETS

### Press the play sign in the upper-left of this box to run the code! ###

Next, take your guess abundance for the unknown molecule from the previous section and put it here:

In [None]:
### Start editing here! ###

abundance = # YOUR ABUNDANCE HERE

### Press the play sign in the upper-left of this box to run the code! ###

Now you will plot **your calculated phosphine** spectrum, scaled by your abundance guess, on the Venusian atmosphere. You will also need to replace the scale factors sflow, sfmid, and sfhigh with the values used in the previous notebook!
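
The three scale factors each apply to a different part of the spectrum. The sketch below mirrors the rule used in `makespectraphosphine`: frequencies below 1000 are multiplied by `sflow`, frequencies from 1000 up to (but not including) 2000 by `sfmid`, and frequencies of 2000 or more by `sfhigh`. The function name `scale_frequency` and the example frequencies are made up for illustration and are not phosphine data.

```python
def scale_frequency(frequency, sflow=0.970, sfmid=0.953, sfhigh=0.9475):
    """Apply the scale factor that matches the frequency range (in wavenumbers)."""
    if frequency < 1000:
        return frequency * sflow
    elif frequency < 2000:
        return frequency * sfmid
    return frequency * sfhigh

# Made-up example frequencies, one in each range
example_freqs = [500.0, 1500.0, 2500.0]
scaled = [scale_frequency(f) for f in example_freqs]

for original, new in zip(example_freqs, scaled):
    print(f"{original:.1f} -> {new:.2f}")
```

All three factors are below 1, so every calculated frequency is shifted to a slightly smaller wavenumber before it is plotted.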


Let's see if the unknown molecule is phosphine!

In [None]:
### Start editing here! ###

PhosphineSpectra(freqs, intens, abundance, sflow = 0.970, sfmid = 0.953, sfhigh = 0.9475)

### Press the play sign in the upper-left of this box to run the code! ###

> **Q:** Is the height of the phosphine lines similar to that of the unknown molecule's spectral features (i.e., is your initial abundance guess correct)? If not, adjust your guess up or down until the heights of the spectral features are similar!

> **A:**

It is unlikely that your calculated data will perfectly match with the Venusian atmosphere (remember there is error on your data!).

> **Q:** What is the approximate distance between the wavenumber of your calculated phosphine data (green) and the unknown molecule spectral features in the Venusian atmosphere? Estimate it from the plot above!

> **A:**

# 4. Assessing the limitations of your spectral comparison procedure

Given the **error on your data recorded in the previous notebook** and **the approximate wavenumber difference between your calculated data and the data of the Venusian atmosphere**, you will assess whether or not you can determine if phosphine is in the Venusian atmosphere with your data. Double-click on the following text boxes to enter your answers.
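
The comparison described above can be sketched in a few lines. This example assumes that the Average Frequency Difference is the mean absolute difference between calculated and experimental frequencies; the previous notebook may define it differently. Every number below is a made-up placeholder, so replace them with your own values before drawing any conclusions.

```python
import numpy as np

# Made-up placeholder values (NOT real phosphine data)
calculated = np.array([1010.0, 1125.0])   # your calculated frequencies
reference = np.array([1000.0, 1120.0])    # experimental frequencies for the same peaks

# Assumed definition: mean of the absolute differences
average_frequency_difference = np.mean(np.abs(calculated - reference))

# Made-up placeholder for the offset you estimated from the plot
offset_from_plot = 12.0

# The notebook states that definitive detections need an error below 1 wavenumber
threshold = 1.0
definitive = average_frequency_difference < threshold

print("Average Frequency Difference:", average_frequency_difference)
print("Offset estimated from the plot:", offset_from_plot)
print("Error small enough for a definitive detection?", definitive)
```

Writing both numbers side by side makes it easier to see which of the two errors is larger when you answer the questions below.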

> **Q:** Visually describe how your calculated phosphine data (green) matches the Venusian atmosphere.

> **A:**

> **Q:** What is the Average Frequency Difference you calculated in the previous notebook?

> **A:**

> **Q:** You just wrote down two *errors* on your data, the **Average Frequency Difference** that tells you the quality of your data with respect to experiment and the **approximate distance between the wavenumber of your calculated phosphine data and the unknown molecule spectral features in the Venusian atmosphere**. Is the first value smaller or larger than the second?

> **A:**



> **Q:** Recall that astronomers (you!) need calculated data with an Average Frequency Difference of less than 1 wavenumber to make definitive detections of molecules using this method. Can you make a definitive detection of phosphine in the atmosphere with your data?

> **A:**

> **Q:** What factors are we missing when matching our calculated data to the Venusian atmosphere? Give two factors. Hint: We are only seeing the low-wavenumber part of the Venusian atmosphere spectrum!

> **A:**

> **Q:** Given your answers to the above questions, have you detected phosphine in the atmosphere of Venus? Choose the scenario that best fits your analysis. Why do you agree with this scenario? Is there anything you would add or change?

> 1. We have not detected phosphine in the Venusian atmosphere. The error on our calculated data is too large to assess whether or not the unknown spectral features correspond to phosphine.

> 2. Although we cannot definitively detect phosphine in the Venusian atmosphere with the error on our calculated data, our visual analysis and estimated difference between the spectral features in our calculated data and the Venusian atmosphere suggest that phosphine warrants further investigation. We suggest that higher quality calculated data is obtained and compared to the Venusian atmosphere.

> 3. We have definitively detected phosphine in the Venusian atmosphere. The error on our calculated data is small enough and the visual match between the two spectra is good enough to make a definitive detection. We suggest follow-up observations at larger wavenumbers to confirm this detection.

> **A:**