# Retrievals: Emission Spectra Retrieval

Written by [Evert Nasedkin](mailto:nasedkinevert@gmail.com?subject=[petitRADTRANS]%20Retrievals).
Please cite pRT's retrieval package [(Nasedkin et al. 2024)](https://ui.adsabs.harvard.edu/abs/2024JOSS....9.5875N/abstract) in addition to pRT [(Mollière et al. 2019)](https://ui.adsabs.harvard.edu/abs/2019A%26A...627A..67M/abstract) if you make use of the retrieval package for your work.

This retrieval is based on the forward model used in [Mollière et al. (2020)](https://ui.adsabs.harvard.edu/abs/2020A%26A...640A.131M/abstract) for HR8799e, and shows a more realistic example of how to set up a retrieval.
Emission spectra retrievals, particularly when multiple datasets are included and scattering is taken into account, can take a lot of computational resources to run.
**Therefore, this notebook outlines the setup for such a retrieval, but we advise only running this retrieval on a cluster** (it can take multiple days on hundreds of cores).

For this example, the module ``chemistry.pre_calculated_chemistry`` is used to solve for disequilibrium chemistry (actually equilibrium chemistry with a simple quenching treatment), and the ``chemistry.clouds`` module is used for condensation. Note that you can import other models from [petitRADTRANS/retrieval/models.py](https://gitlab.com/mauricemolli/petitRADTRANS/-/blob/master/petitRADTRANS/retrieval/models.py); see the [API documentation](../../autoapi/petitRADTRANS/retrieval/models/index.html). This file is included in the pRT package folder; alternatively, you can view it on GitLab by following the link. You can also write your own model, for which we recommend using the existing models as a template. Remember to respect the input and output format of the model functions described in the ["Basic Retrieval Tutorial"](retrieval_basic.html#Model-Functions).

The model here uses a simple adaptive mesh refinement (AMR) algorithm to improve the pressure resolution around the location of the cloud bases.
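
To illustrate the idea, the sketch below refines a coarse log-pressure grid around a single cloud base. It is a generic illustration and not pRT's AMR implementation: the function name `refine_pressure_grid`, the window half-width, and the number of extra layers are all assumptions chosen for this example.

```python
def refine_pressure_grid(log_p_min, log_p_max, n_coarse, log_p_base,
                         half_width=0.5, n_fine=20):
    """Return sorted log10(pressure) values with extra layers near a cloud base.

    Hypothetical helper for illustration only; pRT's own AMR differs in detail.
    """
    step = (log_p_max - log_p_min) / (n_coarse - 1)
    coarse = [log_p_min + i * step for i in range(n_coarse)]

    # Fine layers spanning [base - half_width, base + half_width], clipped to the grid.
    lower = max(log_p_min, log_p_base - half_width)
    upper = min(log_p_max, log_p_base + half_width)
    fine_step = (upper - lower) / (n_fine - 1)
    fine = [lower + i * fine_step for i in range(n_fine)]

    # Merge both sets of layers and drop (near-)duplicates.
    merged = sorted(coarse + fine)
    grid = [merged[0]]
    for value in merged[1:]:
        if value - grid[-1] > 1e-9:
            grid.append(value)
    return grid


# Coarse grid from 1e-6 to 1e3 bar, refined around a cloud base at 1 bar.
log_pressures = refine_pressure_grid(-6.0, 3.0, n_coarse=10, log_p_base=0.0)
```

The refined grid keeps the coarse layers everywhere and only adds resolution where the cloud base sits, which is why AMR costs less than increasing the resolution of the whole pressure grid.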

## Getting started

**Please make sure to have worked through the ["Basic Retrieval Tutorial"](retrieval_basic.html) before looking at the material below.**

In this tutorial, we will outline the process of setting up a RetrievalConfig object, which is the class used to set up a pRT retrieval.
The basic process is always to set up the configuration, and then pass it to the Retrieval class to run the retrieval using, for example, pyMultiNest.
As mentioned in the ["Basic Retrieval Tutorial"](retrieval_basic.html), several standard plotting outputs will also be produced by the Retrieval class.
Most of the classes and functions used in this tutorial have more advanced features than what will be explained here, so it's highly recommended to take a look at the code and API documentation.
There should be enough flexibility built in to cover most typical retrieval studies, but if you have feature requests, please get in touch or open an issue on [GitLab](https://gitlab.com/mauricemolli/petitRADTRANS.git).

In [1]:
# Let's start by importing everything we need.
import os
# To not have numpy start parallelizing on its own
os.environ["OMP_NUM_THREADS"] = "1"

import numpy as np
import matplotlib.pyplot as plt

import petitRADTRANS as prt
from petitRADTRANS import physical_constants as cst

# Import the class used to set up the retrieval.
from petitRADTRANS.retrieval import Retrieval, RetrievalConfig
# Import Prior functions, if necessary.
from petitRADTRANS.retrieval.utils import gaussian_prior
# Import atmospheric model function
from petitRADTRANS.retrieval.models import emission_model_diseq

In [2]:
# Define the pRT run setup
retrieval_config = RetrievalConfig(
    retrieval_name="HR8799e_example",  # give a useful name for your retrieval
    run_mode="retrieve",  # 'retrieve' to run, or 'evaluate' to make plots
    amr=True,  # adaptive mesh refinement, slower if True
    scattering_in_emission=True  # add scattering for emission spectra clouds
)

For this example we include the GRAVITY data as published in Mollière et al. (2020). To reproduce the published results, please also include the archival SPHERE and GPI data from [Zurlo et al. (2015)](https://www.aanda.org/articles/aa/full_html/2016/03/aa26835-15/aa26835-15.html) and [Greenbaum et al. (2016)](https://iopscience.iop.org/article/10.3847/1538-3881/aabcb8).

In [3]:
# Read in Data

# The example files are read relative to the current working directory.
# In general, you can simply put the data files in the folder that you are
# running your script in, or point path_to_data to wherever they are stored.
path_to_data = "./"

retrieval_config.add_data(
    'GRAVITY',
    f"{path_to_data}retrievals/emission/observations/HR8799e_Spectra.fits",
    data_resolution=500,
    model_resolution=1000,
    model_generating_function=emission_model_diseq
)

# Note that in Mollière et al. (2020), additional data sets from SPHERE and GPI were used

## Photometric data

If we want to add photometry, we can do that as well! The photometry file should have the format:

`# Name, lower wavelength edge [um], upper wavelength edge [um], flux density [W/m2/micron], flux error [W/m2/micron]`
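
As a sketch of this layout, the snippet below parses one line of such a file. The filter name matches one of the filters listed later in this notebook, but the numbers are placeholders and not measurements, and the comma delimiter is an assumption based on the header line above. The helper `parse_photometry_line` is hypothetical and not part of pRT.

```python
def parse_photometry_line(line):
    """Split one photometry line into (name, lower_um, upper_um, flux, flux_error)."""
    fields = [field.strip() for field in line.split(",")]
    name = fields[0]
    lower_um, upper_um, flux, flux_error = (float(value) for value in fields[1:5])
    return name, lower_um, upper_um, flux, flux_error


# Placeholder values for illustration only (not real HR 8799 e photometry).
example_line = "Paranal/NACO.Lp, 3.5, 4.1, 1.0e-16, 1.0e-17"
name, lower_um, upper_um, flux, flux_error = parse_photometry_line(example_line)
```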

You are required to provide a model function for calculating the spectrum, as with spectral data, but also a photometric transformation function, which is used to convert the model spectrum into synthetic photometry. This would typically make use of the transmission function for a particular filter. We recommend the use of the `species` package (https://species.readthedocs.io/), in particular the `SyntheticPhotometry` module to provide these functions. If no function is provided, the RetrievalConfig will attempt to import `species` to use this module, using the `name` provided as the filter name.

If you are using transmission spectra, your photometric transformation function should model the difference between the clear and occulted stellar spectrum, returning the difference in (planet radius/stellar radius)^2.
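
For emission photometry, a simple transformation function can also be written by hand. The sketch below averages the model flux over a top-hat bandpass. It assumes the function is called with the model wavelengths (in micron) and the model flux and returns a single value; the name `make_top_hat_photometry` is hypothetical. A real filter has a wavelength-dependent transmission curve, so treat this as a rough approximation of what `species` provides.

```python
import numpy as np


def make_top_hat_photometry(lower_um, upper_um):
    """Build a function that averages a spectrum over a top-hat bandpass."""

    def transform(wavelengths_um, flux):
        wavelengths_um = np.asarray(wavelengths_um, dtype=float)
        flux = np.asarray(flux, dtype=float)
        in_band = (wavelengths_um >= lower_um) & (wavelengths_um <= upper_um)
        if np.count_nonzero(in_band) < 2:
            raise ValueError("model spectrum does not sample the bandpass")
        w = wavelengths_um[in_band]
        f = flux[in_band]
        # Trapezoidal integral of the flux divided by the covered wavelength range.
        integral = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(w))
        return integral / (w[-1] - w[0])

    return transform


# A flat spectrum should give back its own flux level.
wavelengths = np.linspace(1.0, 5.0, 401)
flat_flux = np.full_like(wavelengths, 2.0e-16)
band_flux = make_top_hat_photometry(3.5, 4.1)(wavelengths, flat_flux)
```

Such a function would be handed to `add_photometry` in place of the `species` default; check the API documentation for the exact keyword name and expected call signature before relying on it.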

In [4]:
retrieval_config.add_photometry(
    path_to_data + 'retrievals/emission/observations/hr8799e_photometry.txt',
    emission_model_diseq,
    model_resolution=40
)

species v0.8.3

 -> A new version (0.8.4) is available!
 -> It is recommended to update to the latest version
 -> See https://github.com/tomasstolker/species for details

Working folder: /mnt/c/Users/doria/OneDrive/Documents/programs/Python/petitRADTRANS/docs/content/notebooks

Creating species_config.ini... [DONE]
Creating species_database.hdf5... [DONE]
Creating data folder... [DONE]

Configuration settings:
   - Database: /mnt/c/Users/doria/OneDrive/Documents/programs/Python/petitRADTRANS/docs/content/notebooks/species_database.hdf5
   - Data folder: /mnt/c/Users/doria/OneDrive/Documents/programs/Python/petitRADTRANS/docs/content/notebooks/data
   - Magnitude of Vega: 0.03

Multiprocessing: mpi4py installed
Process number 1 out of 1...


Downloading data from 'https://archive.stsci.edu/hlsps/reference-atlases/cdbs/current_calspec/alpha_lyr_stis_011.fits' to file '/mnt/c/Users/doria/OneDrive/Documents/programs/Python/petitRADTRANS/docs/content/notebooks/data/alpha_lyr_stis_011.fits'.
100%|████████████████████████████████████████| 288k/288k [00:00<00:00, 446MB/s]


Adding spectrum: VegaReference: Bohlin et al. 2014, PASP, 126
URL: https://ui.adsabs.harvard.edu/abs/2014PASP..126..711B/abstract


## Parameters and Priors

Here we add all of the parameters used in the retrieval for HR 8799 e, following the prescription of Mollière et al. (2020).
There are many other approaches we could take: varying the temperature structure parameterisation, retrieving different cloud properties, adding in a blackbody circumplanetary disk (CPD), and so on.
It is highly recommended to look at the API documentation of `models.py`, or at the [Retrieval Models Tutorial](retrieval_models.html) to get a better idea of what options are available.
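
The priors in this retrieval are defined through `transform_prior_cube_coordinate`, a function that maps a sample `x` drawn uniformly from the unit interval to a parameter value. The sketch below shows how such transforms can be built for uniform and Gaussian priors using only the standard library. The helper names are hypothetical and not part of pRT, which ships its own prior utilities in `petitRADTRANS.retrieval.utils`; the Gaussian values are purely illustrative.

```python
from statistics import NormalDist


def uniform_transform(lower, upper):
    """Map x in [0, 1] to a uniform prior on [lower, upper]."""
    return lambda x: lower + (upper - lower) * x


def gaussian_transform(mean, sigma):
    """Map x in (0, 1) to a Gaussian prior via the inverse CDF."""
    distribution = NormalDist(mu=mean, sigma=sigma)
    return lambda x: distribution.inv_cdf(x)


log_g_prior = uniform_transform(2.0, 5.5)    # same range as 2 + 3.5 * x
radius_prior = gaussian_transform(1.0, 0.1)  # illustrative values only

log_g_mid = log_g_prior(0.5)       # midpoint of the uniform range
radius_median = radius_prior(0.5)  # median of the Gaussian, equal to its mean
```

Writing the prior as a transform of the unit cube is what nested samplers such as MultiNest expect, which is why every free parameter takes a function of `x` rather than a probability density.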

In [5]:
# Add parameters, and priors for free parameters

# This run uses the model of Mollière et al. (2020) for HR8799e
# The lambda functions provide uniform priors

# Distance to the planet in cm
retrieval_config.add_parameter(
    name='D_pl', 
    free=False, 
    value=41.2925 * cst.pc
)

# Log of the surface gravity in cgs units.
retrieval_config.add_parameter(
    'log_g',
    True, 
    transform_prior_cube_coordinate=lambda x : 2 + 3.5 * x
)

# Planet radius in cm
retrieval_config.add_parameter(
    'planet_radius', 
    True,
    transform_prior_cube_coordinate=lambda x : (0.7 + 1.3 * x) * cst.r_jup_mean
)

# Interior temperature in Kelvin
retrieval_config.add_parameter(
    'T_int', 
    True,
    transform_prior_cube_coordinate=lambda x : 300.0 + 2000.0 * x,
    value=0.0
)

# Spline temperature structure parameters. T1 < T2 < T3
# As these priors depend on each other, they are implemented in the model function
retrieval_config.add_parameter(
    'T3', 
    True,
    transform_prior_cube_coordinate=lambda x : x,
    value=0.0
)
retrieval_config.add_parameter(
    'T2', 
    True,
    transform_prior_cube_coordinate=lambda x : x,
    value=0.0
)
retrieval_config.add_parameter(
    'T1', 
    True,
    transform_prior_cube_coordinate=lambda x : x
)
# Optical depth model
# power law index in tau = delta * press_cgs**alpha
retrieval_config.add_parameter(
    'alpha',
    True,
    transform_prior_cube_coordinate=lambda x : 1.0 + x
)
# proportionality factor in tau = delta * press_cgs**alpha
retrieval_config.add_parameter(
    'log_delta', 
    True,
    transform_prior_cube_coordinate=lambda x : x
) 
# Chemistry
# A 'free retrieval' would have each line species as a parameter
# Using a (dis)equilibrium model, we only supply bulk parameters.
# Carbon quench pressure
retrieval_config.add_parameter(
    'log_pquench',
    True,
    transform_prior_cube_coordinate=lambda x : -6.0 + 9.0 * x
)
# Metallicity [Fe/H]
retrieval_config.add_parameter(
    'Fe/H',
    True,
    transform_prior_cube_coordinate=lambda x : -1.5 + 3.0 * x
)
# C/O ratio
retrieval_config.add_parameter(
    'C/O',
    True,
    transform_prior_cube_coordinate=lambda x : 0.1 + 1.5 * x
)
# Clouds
# Based on the Ackerman & Marley (2001) cloud model
# Width of particle size distribution
retrieval_config.add_parameter(
    'sigma_lnorm', 
    True,
    transform_prior_cube_coordinate=lambda x : 1.05 + 1.95 * x
) 
# Vertical mixing parameter (log10 of the eddy diffusion coefficient Kzz)
retrieval_config.add_parameter(
    'log_kzz',
    True,
    transform_prior_cube_coordinate=lambda x : 5.0 + 8.0 * x
) 
# Sedimentation parameter
retrieval_config.add_parameter(
    'fsed',
    True,
    transform_prior_cube_coordinate=lambda x : 1.0 + 10.0 * x
)
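
The `alpha` and `log_delta` parameters above enter the optical depth model `tau = delta * press_cgs**alpha` noted in the comments. As a hedged illustration, the snippet below evaluates that power law together with the Eddington relation `T**4 = 3/4 * T_int**4 * (2/3 + tau)`, which Mollière et al. (2020) use for the photospheric region. The function names and numerical values are assumptions made for this example; how `log_delta` is converted to `delta` is handled inside the model function and is not reproduced here.

```python
def optical_depth(pressure_cgs, delta, alpha):
    """Power-law optical depth, tau = delta * pressure**alpha (pressure in cgs)."""
    return delta * pressure_cgs ** alpha


def eddington_temperature(tau, t_int):
    """Eddington approximation: T**4 = 0.75 * T_int**4 * (2/3 + tau)."""
    return (0.75 * t_int ** 4 * (2.0 / 3.0 + tau)) ** 0.25


# Illustrative values only: choose delta so that tau = 1 at 1 bar (1e6 in cgs).
alpha = 1.5
delta = (1.0e6) ** (-alpha)
tau_at_1bar = optical_depth(1.0e6, delta, alpha)

# At tau = 2/3 the Eddington temperature equals the interior temperature.
temperature_at_tau_2_3 = eddington_temperature(2.0 / 3.0, t_int=1000.0)
```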

In [6]:
# Define opacity species to be included

retrieval_config.set_rayleigh_species(['H2', 'He'])
retrieval_config.set_continuum_opacities(['H2-H2', 'H2-He'])
retrieval_config.set_line_species(
    [
        'H2O__POKAZATEL', 
        'CO-NatAbund', 
        'CH4', 
        'CO2', 
        'HCN', 
        'FeH', 
        'H2S', 
        'NH3', 
        'PH3', 
        'Na', 
        'K', 
        'TiO', 
        'VO',
        'SiO'
    ],
    eq=True
)
retrieval_config.add_cloud_species('Fe(s)_crystalline__DHS', eq=True, abund_lim=(-3.5, 1.0))
retrieval_config.add_cloud_species('MgSiO3(s)_crystalline__DHS', eq=True, abund_lim=(-3.5, 1.0))

In [7]:
retrieval = Retrieval(
    retrieval_config,
    output_directory="",
    evaluate_sample_spectra=False
)

Setting up Radtrans object for data 'GRAVITY'...
Setting up AMR pressure grid.
Loading Radtrans opacities...
 Loading line opacities of species 'H2O__POKAZATEL.R1000' from file '/home/dblain/petitRADTRANS/input_data/opacities/lines/correlated_k/H2O/1H2-16O/1H2-16O__POKAZATEL.R1000_0.3-50mu.ktable.petitRADTRANS.h5'... Done.
 Loading line opacities of species 'CO-NatAbund.R1000' from file '/home/dblain/petitRADTRANS/input_data/opacities/lines/correlated_k/CO/C-O-NatAbund/C-O-NatAbund__HITEMP.R1000_0.1-250mu.ktable.petitRADTRANS.h5'... Done.
 Loading line opacities of species 'CH4.R1000' from file '/home/dblain/petitRADTRANS/input_data/opacities/lines/correlated_k/CH4/12C-1H4/12C-1H4__YT34to10.R1000_0.3-50mu.ktable.petitRADTRANS.h5'... Done.
 Loading line opacities of species 'CO2.R1000' from file '/home/dblain/petitRADTRANS/input_data/opacities/lines/correlated_k/CO2/12C-16O2/12C-16O2__UCL-4000.R1000_0.3-50mu.ktable.petitRADTRANS.h5'... Done.
 Loading line opacities of species 'HCN.R1000

In [8]:
# Before we run the retrieval, let's set up plotting.

# Define what to put into corner plot if run_mode == 'evaluate'
retrieval_config.parameters['planet_radius'].plot_in_corner = True
retrieval_config.parameters['planet_radius'].corner_label = r'$R_{\rm P}$ ($\rm R_{Jup}$)'
retrieval_config.parameters['planet_radius'].corner_transform = lambda x : x / cst.r_jup_mean
retrieval_config.parameters['log_g'].plot_in_corner = True
retrieval_config.parameters['log_g'].corner_ranges = [2., 5.]
retrieval_config.parameters['log_g'].corner_label = "log g"
retrieval_config.parameters['fsed'].plot_in_corner = True
retrieval_config.parameters['log_kzz'].plot_in_corner = True
retrieval_config.parameters['log_kzz'].corner_label = "log Kzz"
retrieval_config.parameters['C/O'].plot_in_corner = True
retrieval_config.parameters['Fe/H'].plot_in_corner = True
retrieval_config.parameters['log_pquench'].plot_in_corner = True
retrieval_config.parameters['log_pquench'].corner_label = "log pquench"

for spec in retrieval_config.cloud_species:
    cname = spec.split('_')[0]
    retrieval_config.parameters['eq_scaling_' + cname].plot_in_corner = True
    retrieval_config.parameters['eq_scaling_' + cname].corner_label = cname

# Define axis properties of spectral plot if run_mode == 'evaluate'
retrieval_config.plot_kwargs["spec_xlabel"] = 'Wavelength [micron]'

retrieval_config.plot_kwargs["spec_ylabel"] = "Flux [W/m2/micron]"
retrieval_config.plot_kwargs["y_axis_scaling"] = 1.0
retrieval_config.plot_kwargs["xscale"] = 'log'
retrieval_config.plot_kwargs["yscale"] = 'linear'
retrieval_config.plot_kwargs["resolution"] = 100.0  # maximum resolution, will rebin the data
retrieval_config.plot_kwargs["nsample"] = 100  # if we want a plot with many sampled spectra

# Define from which observation object to take P-T in evaluation mode (if run_mode == 'evaluate'), add PT-envelope plotting options
retrieval_config.plot_kwargs["take_PTs_from"] = 'GRAVITY'
retrieval_config.plot_kwargs["temp_limits"] = [150, 3000]
retrieval_config.plot_kwargs["press_limits"] = [1e2, 1e-5]
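
The loop over cloud species above derives each parameter name from the part of the species name before the first underscore. The standalone snippet below reproduces that string handling for the two cloud species used in this notebook, without requiring pRT, so you can see which keys are looked up in `retrieval_config.parameters`.

```python
cloud_species = ["Fe(s)_crystalline__DHS", "MgSiO3(s)_crystalline__DHS"]

# Same string handling as in the plotting setup: keep the text before the first underscore.
scaling_parameters = ["eq_scaling_" + name.split("_")[0] for name in cloud_species]
print(scaling_parameters)
# → ['eq_scaling_Fe(s)', 'eq_scaling_MgSiO3(s)']
```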

In [9]:
retrieval = Retrieval(
    retrieval_config,
    output_directory="./",
    evaluate_sample_spectra=False
)

Using provided Radtrans object for data 'GRAVITY'...
Using provided Radtrans object for data 'Keck/NIRC2.Ks'...
Using provided Radtrans object for data 'Paranal/NACO.Lp'...
Using provided Radtrans object for data 'Paranal/NACO.NB405'...
Using provided Radtrans object for data 'Paranal/SPHERE.IRDIS_B_J'...
Using provided Radtrans object for data 'Paranal/SPHERE.IRDIS_D_H23_2'...
Using provided Radtrans object for data 'Paranal/SPHERE.IRDIS_D_H23_3'...
Using provided Radtrans object for data 'Paranal/SPHERE.IRDIS_D_K12_1'...
Using provided Radtrans object for data 'Paranal/SPHERE.IRDIS_D_K12_2'...


As mentioned at the beginning of this tutorial, this retrieval is computationally expensive and can take several days to run, even with hundreds of cores on a cluster.

To try to run the retrieval anyway, set `run_retrieval` below to `True`, then execute the cells below.

In [10]:
run_retrieval = False

if run_retrieval:
    retrieval.run(
        n_live_points=2000,
        sampling_efficiency=0.05,
        const_efficiency_mode=True,
        resume=False,
        seed=12345  # ⚠️ seed should be removed or set to -1 in a real retrieval, it is added here for reproducibility
    )

Once the retrieval is complete, the easiest way to generate standard output plots is to use the `plot_all` function.

In [11]:
if run_retrieval:
    retrieval.plot_all(contribution=True)

**Contact**

If you need any additional help, don't hesitate to contact [Evert Nasedkin](mailto:nasedkinevert@gmail.com?subject=[petitRADTRANS]%20Retrievals).