# Cross Sections for an Ultra-Hot Jupiter

This notebook demonstrates how to download and process molecular opacity line lists into tabulated cross sections. There are four main steps to compute cross-section files:

1. [Fetch Line Lists](#1.-line-lists)
2. [Fetch Partition Functions](#2.-partition-functions)
3. [Compute TLI Files](#3.-tli-files)
4. [Sample Cross Sections](#4.-sample-cross-sections)


Note that the first three steps are typically executed only once, allowing you to reuse the output files across projects. However, [Step 4](#4.-sample-cross-sections) may need to be repeated on a per-project basis, depending on your specific requirements (e.g., different spectral, temperature, or pressure ranges; or varying resolutions).

---

## 1. Line lists

In this section, we'll download molecular line lists, typically sourced from the ExoMol or HITRAN/HITEMP databases. You will likely only need to complete this step once, unless a new or updated line list becomes available. Thus, it may be better to store this data in a general directory on your machine. To create a folder for storing line lists, run:

```shell
mkdir inputs
cd inputs
```

For this project we will focus on the molecular absorbers relevant for an ultra-hot Jupiter (WASP-18b). The table below lists the molecular line lists to download and their sources.

| Molecule   | Source | Line List / Reference   |
|------------|--------|------------|
| CH4        | HITEMP | Hargreaves et al. (2020) |
| CO         | HITEMP | Li et al. (2019) |
| CO2        | HITEMP | Rothman et al. (2010) |
| H2O        | ExoMol | pokazatel |
| HCN        | ExoMol | larner/harris |
| NH3        | ExoMol | coyute/byte |
| TiO        | ExoMol | toto |
| VO         | ExoMol | vomyt |
| C2H2       | ExoMol | acety |


The file below contains the links to download all of the required data.

<details>
<summary>File: [uhj_line_lists_data.txt](uhj_line_lists_data.txt)</summary>

```
https://zenodo.org/records/14046762/files/C2H2_exomol_acety_1.0-500.0um_100-3500K_threshold_0.03_lbl.dat
https://zenodo.org/records/14046762/files/H2O_exomol_pokazatel_0.24-500.0um_100-3500K_threshold_0.01_lbl.dat
https://zenodo.org/records/14046762/files/HCN_exomol_harris-larner_0.56-500um_100-3500K_threshold_0.01_lbl.dat
https://zenodo.org/records/14046762/files/NH3_exomol_coyute-byte_0.5-500.0um_100-3500K_threshold_0.03_lbl.dat
https://zenodo.org/records/14046762/files/TiO_exomol_toto_0.33-500um_100-3500K_threshold_0.01_lbl.dat
https://zenodo.org/records/14046762/files/VO_exomol_vomyt_0.29-500um_100-3500K_threshold_0.01_lbl.dat
https://hitran.org/hitemp/data/bzip2format/05_HITEMP2019.par.bz2
https://hitran.org/hitemp/data/bzip2format/06_HITEMP2020.par.bz2
https://hitran.org/hitemp/data/HITEMP-2010/CO2_line_list/02_00000-00500_HITEMP2010.zip
https://hitran.org/hitemp/data/HITEMP-2010/CO2_line_list/02_00500-00625_HITEMP2010.zip
https://hitran.org/hitemp/data/HITEMP-2010/CO2_line_list/02_00625-00750_HITEMP2010.zip
https://hitran.org/hitemp/data/HITEMP-2010/CO2_line_list/02_00750-01000_HITEMP2010.zip
https://hitran.org/hitemp/data/HITEMP-2010/CO2_line_list/02_01000-01500_HITEMP2010.zip
https://hitran.org/hitemp/data/HITEMP-2010/CO2_line_list/02_01500-02000_HITEMP2010.zip
https://hitran.org/hitemp/data/HITEMP-2010/CO2_line_list/02_02000-02125_HITEMP2010.zip
https://hitran.org/hitemp/data/HITEMP-2010/CO2_line_list/02_02125-02250_HITEMP2010.zip
https://hitran.org/hitemp/data/HITEMP-2010/CO2_line_list/02_02250-02500_HITEMP2010.zip
https://hitran.org/hitemp/data/HITEMP-2010/CO2_line_list/02_02500-03000_HITEMP2010.zip
https://hitran.org/hitemp/data/HITEMP-2010/CO2_line_list/02_03000-03250_HITEMP2010.zip
https://hitran.org/hitemp/data/HITEMP-2010/CO2_line_list/02_03250-03500_HITEMP2010.zip
https://hitran.org/hitemp/data/HITEMP-2010/CO2_line_list/02_03500-03750_HITEMP2010.zip
https://hitran.org/hitemp/data/HITEMP-2010/CO2_line_list/02_03750-04000_HITEMP2010.zip
https://hitran.org/hitemp/data/HITEMP-2010/CO2_line_list/02_04000-04500_HITEMP2010.zip
https://hitran.org/hitemp/data/HITEMP-2010/CO2_line_list/02_04500-05000_HITEMP2010.zip
https://hitran.org/hitemp/data/HITEMP-2010/CO2_line_list/02_05000-05500_HITEMP2010.zip
https://hitran.org/hitemp/data/HITEMP-2010/CO2_line_list/02_05500-06000_HITEMP2010.zip
https://hitran.org/hitemp/data/HITEMP-2010/CO2_line_list/02_06000-06500_HITEMP2010.zip
https://hitran.org/hitemp/data/HITEMP-2010/CO2_line_list/02_06500-12785_HITEMP2010.zip
```

</details>

Note that for the ExoMol data we will fetch the line lists after they have been processed with ``repack`` ([Cubillos 2017, ApJ 850](https://ui.adsabs.harvard.edu/abs/2017ApJ...850...32C)). This package identifies the strong lines that dominate the spectrum and discards the weak ones. This speeds up the line-sampling process by reducing the line lists from billions of transitions to only a few hundred million.

On Linux/OSX you can copy this file and then download the line-list data using the ``wget`` shell command (note these are several GB of data):

```shell
wget -i uhj_line_lists_data.txt
```

Now unpack the HITEMP data:

```shell
bzip2 -d 05_HITEMP2019.par.bz2
bzip2 -d 06_HITEMP2020.par.bz2
unzip '*.zip'
rm -f *.zip
```

---

## 2. Partition Functions

In addition to the line-list data, to compute cross sections you will need the partition functions for each molecule. The file below contains the links to the partition functions to fetch from the ExoMol database (the rest we will source from HITRAN).

<details>
<summary>File: [partition_function_data.txt](partition_function_data.txt)</summary>

```
http://www.exomol.com/db/H2O/1H2-16O/POKAZATEL/1H2-16O__POKAZATEL.pf
http://www.exomol.com/db/HCN/1H-12C-14N/Harris/1H-12C-14N__Harris.pf
http://www.exomol.com/db/HCN/1H-13C-14N/Larner/1H-13C-14N__Larner.pf
http://www.exomol.com/db/C2H2/12C2-1H2/aCeTY/12C2-1H2__aCeTY.pf
https://www.exomol.com/db/TiO/46Ti-16O/Toto/46Ti-16O__Toto.pf
https://www.exomol.com/db/TiO/47Ti-16O/Toto/47Ti-16O__Toto.pf
https://www.exomol.com/db/TiO/48Ti-16O/Toto/48Ti-16O__Toto.pf
https://www.exomol.com/db/TiO/49Ti-16O/Toto/49Ti-16O__Toto.pf
https://www.exomol.com/db/TiO/50Ti-16O/Toto/50Ti-16O__Toto.pf
https://www.exomol.com/db/VO/51V-16O/VOMYT/51V-16O__VOMYT.pf
```
</details>


Copy this file to your *inputs* folder and then download the partition-function files with this shell command:

```shell
wget -i partition_function_data.txt
```

Now we need to convert the ExoMol partition-function files into the format that ``Pyrat Bay`` expects. To do so, run these shell commands:

```shell
pbay -pf exomol 1H2-16O__POKAZATEL.pf
pbay -pf exomol 1H-12C-14N__Harris.pf 1H-13C-14N__Larner.pf
pbay -pf exomol 12C2-1H2__aCeTY.pf
pbay -pf exomol 46Ti-16O__Toto.pf 47Ti-16O__Toto.pf 48Ti-16O__Toto.pf 49Ti-16O__Toto.pf 50Ti-16O__Toto.pf
pbay -pf exomol 51V-16O__VOMYT.pf
```

For the other molecules, we will use the HITRAN partition functions (Gamache et al. [2017](https://ui.adsabs.harvard.edu/abs/2017JQSRT.203...70G), [2021](https://ui.adsabs.harvard.edu/abs/2021JQSRT.27107713G)), which are readily available in ``Pyrat Bay`` (no need to download files). To generate the partition-function files, run the following shell commands:

```shell
pbay -pf tips CO
pbay -pf tips CO2
pbay -pf tips CH4
pbay -pf tips NH3 as_exomol
```

Note that for NH3 we are using the HITRAN partition functions with the ExoMol line list (because this partition function samples up to 6000K, which we need for atmospheres of ultra-hot Jupiters). The `as_exomol` argument therefore makes the output file label the isotope names following the ExoMol format.

---


## 3. TLI Files

Now we have all the needed inputs. Let's return to our root directory (the one containing the `inputs/` folder).

The next step is to convert the line-list and partition-function input data into the format used by `Pyrat Bay`. These are called transition line information (TLI) files.

As an example, below is the H2O/ExoMol configuration file that runs this step:

<details>
<summary>File: [tli_exomol_H2O_pokazatel.cfg](tli_exomol_H2O_pokazatel.cfg)</summary>

```ini
[pyrat]

# run mode, select from: [tli atmosphere spectrum opacity radeq retrieval]
runmode = tli

# Output file:
logfile = Exomol_H2O_0.24-33.0um.log

# List of line-transition databases:
dblist = inputs/H2O_exomol_pokazatel_0.24-500.0um_100-3500K_threshold_0.01_lbl.dat

# Type of line-transition database, select from: [hitran exomol repack]
dbtype = repack
# List of partition functions for each database:
pflist = inputs/PF_exomol_H2O.dat

# Wavelength ranges:
wllow = 0.24 um
wlhigh = 33.0 um

# Verbosity level [1--5]:
verb = 2
```
</details>

A couple of things to note:

- The configuration file indicates the wavelength range to consider. Best practice is to include the full wavelength range available from the line list. That way you can create a single TLI file that you can use for all of your future projects. In [Step 4](#4.-sample-cross-sections) you will have the option to fine-tune the wavelength range for specific projects.

- The partition-function input file is what determines the available temperature range.


Here are all the TLI configuration files ([config_files_tli.txt](config_files_tli.txt)):

```
https://pyratbay.readthedocs.io/en/ver2.0/cookbooks/cross_sections_uhj/tli_exomol_C2H2_acety.cfg
https://pyratbay.readthedocs.io/en/ver2.0/cookbooks/cross_sections_uhj/tli_exomol_H2O_pokazatel.cfg
https://pyratbay.readthedocs.io/en/ver2.0/cookbooks/cross_sections_uhj/tli_exomol_HCN_harris-larner.cfg
https://pyratbay.readthedocs.io/en/ver2.0/cookbooks/cross_sections_uhj/tli_exomol_NH3_coyute-byte.cfg
https://pyratbay.readthedocs.io/en/ver2.0/cookbooks/cross_sections_uhj/tli_exomol_TiO_toto.cfg
https://pyratbay.readthedocs.io/en/ver2.0/cookbooks/cross_sections_uhj/tli_exomol_VO_vomyt.cfg
https://pyratbay.readthedocs.io/en/ver2.0/cookbooks/cross_sections_uhj/tli_hitemp_CH4_2020.cfg
https://pyratbay.readthedocs.io/en/ver2.0/cookbooks/cross_sections_uhj/tli_hitemp_CO_li2019.cfg
https://pyratbay.readthedocs.io/en/ver2.0/cookbooks/cross_sections_uhj/tli_hitemp_CO2_2010.cfg
```

Copy this file to your current folder and then download the TLI configuration files with:

```shell
wget -i config_files_tli.txt
```

Now you can compute the TLI files using this `Pyrat Bay` shell command:

```shell
pbay -c tli_exomol_C2H2_acety.cfg
pbay -c tli_exomol_H2O_pokazatel.cfg
pbay -c tli_exomol_HCN_harris-larner.cfg
pbay -c tli_exomol_NH3_coyute-byte.cfg
pbay -c tli_exomol_TiO_toto.cfg
pbay -c tli_exomol_VO_vomyt.cfg
pbay -c tli_hitemp_CH4_2020.cfg
pbay -c tli_hitemp_CO_li2019.cfg
pbay -c tli_hitemp_CO2_2010.cfg
```

This may take a while since you are processing several million line transitions, but once you have generated these TLI files, you likely won't need to run this step again.
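If you prefer not to type each command by hand, the list can be generated from the configuration files present in the folder. The following is a minimal sketch, assuming the TLI configuration files all match the `tli_*.cfg` naming pattern used above; the helper name `tli_commands` is hypothetical and not part of `Pyrat Bay`. It only prints the commands (a dry run), so nothing is executed until you run them yourself.

```python
import shlex
from pathlib import Path


def tli_commands(folder="."):
    """Build one `pbay -c <cfg>` command per TLI config file in folder.

    Hypothetical helper: assumes config files are named tli_*.cfg.
    """
    cfg_files = sorted(Path(folder).glob("tli_*.cfg"))
    return [["pbay", "-c", str(cfg)] for cfg in cfg_files]


# Dry run: print the commands instead of executing them
for command in tli_commands():
    print(shlex.join(command))
```

To actually execute them from Python, each command list could be passed to `subprocess.run(command, check=True)`, which stops at the first failing run.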

---


## 4. Sample Cross Sections

The final step is to sample the line lists into tabulated data. In this step, each line transition is processed to compute its Voigt profile, which is then sampled over a specified wavelength range, and coadded with all other lines for the molecule. We do this across a regular grid of temperatures and pressures, enabling later use in radiative-transfer calculations.

Depending on the application for the cross-section data, you may need to set specific parameters. For example, radiative-equilibrium applications typically require broad wavelength coverage (~0.3–30 µm) to capture the spectral regions where most of the stellar and planetary flux is concentrated. In contrast, atmospheric retrievals may focus only on the spectral range covered by the observations, which allows for higher spectral resolutions than would be feasible in a radiative-equilibrium run. It's all a trade-off between science requirements and computational constraints.

Here we will focus on an emission atmospheric retrieval for an ultra-hot Jupiter. Let's use the H2O cross-section configuration file to walk through the relevant parameters to set:

<details>
<summary>File: [opacity_0250-4000K_0.35-12.0um_R020K_exomol_H2O.cfg](opacity_0250-4000K_0.35-12.0um_R020K_exomol_H2O.cfg)</summary>

```ini
[pyrat]

# run mode, select from: [tli atmosphere spectrum opacity radeq retrieval]
runmode = opacity

# Output file names:
logfile = cross_section_0250-4000K_0.35-12.0um_R025K_exomol_H2O.log

# Atmospheric model:
ptop = 1.0e-08 bar
pbottom = 1.0e+02 bar
nlayers = 56

tmodel = isothermal
tpars = 1000.0

chemistry = tea
species =
    H  He  C  O  N  Na  K  S  Si  Fe  Ti  V
    H2  H2O  CH4  CO  CO2  HCN  NH3  C2H2  C2H4  N2  TiO  VO  OH
    S2  SH  SiO  H2S  SO2  SO  TiO2  VO2

# TLI opacity files:
tlifile = Exomol_H2O_0.24-33.0um.tli

# Wavelength sampling boundaries:
wllow  = 0.35 um
wlhigh = 12.0 um
vextent = 300.0
resolution = 25000.0

tmin =  250
tmax = 4000
tstep = 150
ncpu = 128

# Verbosity level (<0:errors, 0:warnings, 1:headers, 2:details, 3:debug):
verb = 2
```
</details>

The lines below are the boilerplate: `runmode` indicates what to run, `logfile` sets the output file names (the output cross-section file will have the same name as `logfile`, but with a .npz extension), and `ncpu` sets how many CPUs to use in parallel (use as many as you can without crashing your machine).

```ini
# run mode, select from: [tli atmosphere spectrum opacity radeq retrieval]
runmode = opacity

# Output file names:
logfile = cross_section_0250-4000K_0.35-12.0um_R025K_exomol_H2O.log

# Parallel computing
ncpu = 128
# Verbosity level (<0:errors, 0:warnings, 1:headers, 2:details, 3:debug):
verb = 2
```

While the configuration file needs to define an atmosphere, the relevant parameters here are the ones for the pressure profile. These set the layers at which we will sample the cross sections. The `tmodel` and `tpars` parameters are just fillers here (the temperature grid will be defined later). Similarly, for the composition (`species`) we only need to make sure that the molecule being sampled is included in the atmospheric composition.

Note that the number of pressure layers in the cross-section table does not need to match the one used later in a radiative-transfer calculation. Here you can set a coarser grid if needed (when you run retrievals, `Pyrat Bay` can evaluate over a finer pressure grid if requested).

```ini
# Atmospheric model:
ptop = 1.0e-08 bar
pbottom = 1.0e+02 bar
nlayers = 56

tmodel = isothermal
tpars = 1000.0

chemistry = tea
species =
    H  He  C  O  N  Na  K  S  Si  Fe  Ti  V
    H2  H2O  CH4  CO  CO2  HCN  NH3  C2H2  C2H4  N2  TiO  VO  OH
    S2  SH  SiO  H2S  SO2  SO  TiO2  VO2
```

The following section defines the wavelength sampling. `wllow` and `wlhigh` set the boundaries (we want to cover the TESS and JWST observing ranges), whereas `resolution` sets the resolving power of the spectra (we want a resolving power R >= 25,000 to avoid sampling biases). Lastly, the `vextent` parameter sets the extent of the Voigt profile when sampling each line transition, i.e., the distance in cm$^{-1}$ from the line center; aim for at least ~300--500 cm$^{-1}$.

```ini
# Wavelength sampling boundaries:
wllow  = 0.35 um
wlhigh = 12.0 um
resolution = 25000.0
vextent = 300.0
```
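These values determine the size of the spectral axis of the output table. As a rough estimate, a grid with constant resolving power $R = \lambda/\Delta\lambda$ is equally spaced in $\ln\lambda$, so the number of samples is approximately $R\,\ln(\lambda_{\rm high}/\lambda_{\rm low})$. The sketch below evaluates this for the values above; the function name is hypothetical, and the exact count that `Pyrat Bay` produces may differ by a sample or so depending on how it treats the boundaries.

```python
import math


def n_samples_constant_resolution(wl_low, wl_high, resolution):
    """Approximate number of samples for a constant-R wavelength grid.

    A constant resolving power R = wl/dwl implies a constant step of
    1/R in ln(wl), so the count is R*ln(wl_high/wl_low), plus one to
    include the first sample.
    """
    return int(math.floor(resolution * math.log(wl_high / wl_low))) + 1


n_wave = n_samples_constant_resolution(0.35, 12.0, 25000.0)
print(f"Approximate number of spectral samples: {n_wave}")
```

For this configuration the estimate is on the order of 88,000 spectral samples per temperature and pressure point. Doubling `resolution` doubles this number.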

Then we set the temperature grid. This is a linear grid from `tmin` to `tmax` with a step size of `tstep`. Note that you cannot sample beyond the temperature range covered by the input partition functions, since that would require extrapolation, which is not reliable.

```ini
# Temperature cross-section grid:
tmin =  250
tmax = 4000
tstep = 150
```
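The sketch below lists the temperatures implied by these settings and checks them against a partition-function range. It assumes that both `tmin` and `tmax` are included in the grid; the function names are hypothetical, and the partition-function limits in the example call are placeholders for illustration (read the actual range from your own partition-function files).

```python
def temperature_grid(tmin, tmax, tstep):
    """Linear temperature grid from tmin to tmax (both included)."""
    n_temps = int((tmax - tmin) // tstep) + 1
    return [tmin + i * tstep for i in range(n_temps)]


def within_pf_range(temps, pf_tmin, pf_tmax):
    """True if every grid temperature lies inside the PF range."""
    return pf_tmin <= min(temps) and max(temps) <= pf_tmax


temps = temperature_grid(250, 4000, 150)
print(f"{len(temps)} temperatures from {temps[0]} K to {temps[-1]} K")

# Placeholder partition-function limits (K), for illustration only:
pf_tmin, pf_tmax = 100.0, 6000.0
print("Grid covered by partition function:",
      within_pf_range(temps, pf_tmin, pf_tmax))
```

With these settings the grid contains 26 temperatures, since (4000 - 250) / 150 = 25 steps.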

---


TBD