# Introduction

This notebook allows you to run a simplified version of the ANTARESS workflow, for the following purpose:
- format, clean, and correct time-series of echelle spectra (S2D) obtained during an exoplanet transit

This notebook takes as input the S2D FITS files output by an ESPRESSO-like DRS. At the end of the notebook, spectra are still in echelle format on their original spectral grid (defined in the solar barycentric rest frame), have comparable low-frequency spectral profiles, and are corrected for various instrumental, environmental, and stellar effects. The default version of this notebook is set up to process a real dataset, obtained with ESPRESSO during the transit of TOI-421c, but the settings can be adjusted to process your own dataset. In either case, set `working_path`, `star_name`, and `pl_name` to the same values that you used in the [set-up notebook](https://gitlab.unige.ch/spice_dune/antaress/-/blob/main/Notebooks/ANTARESS_nbook_setup.ipynb) to initialize the system and dataset.

To exploit the workflow to its full capabilities (e.g., to process multiple planets and datasets), run its executable with the [configuration files](https://obswww.unige.ch/~bourriev/antaress/doc/html/installation.html).

In [1]:
import ANTARESS_nbook_bground
input_nbook = {
    'working_path' : '/Users/bourrier/Travaux/ANTARESS/Working_dir/',
    'star_name' : 'TOI421',
    'pl_name'   : 'TOI421c'
}
input_nbook = ANTARESS_nbook_bground.load_nbook(input_nbook, 'sp_reduc')

FileNotFoundError: [Errno 2] No such file or directory: '/Users/bourrier/Travaux/ANTARESS/Working_dir//TOI421/TOI421c_Saved_data/init_sys.npz'
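
The `FileNotFoundError` above means that the output of the set-up notebook was not found at the expected location. As a minimal sketch, you can check for this file before calling `load_nbook`. The path layout is taken from the error message itself; the helpers `init_file_path` and `setup_done` are hypothetical and not part of ANTARESS.

```python
import os
import tempfile

def init_file_path(working_path, star_name, pl_name):
    """Path of the set-up file, following the layout seen in the error message."""
    # os.path.join does not double the separator when working_path ends with '/'
    return os.path.join(working_path, star_name, pl_name + '_Saved_data', 'init_sys.npz')

def setup_done(working_path, star_name, pl_name):
    """Return True if the set-up output exists for this star and planet."""
    return os.path.isfile(init_file_path(working_path, star_name, pl_name))

# Demonstration on a throwaway directory (not a real ANTARESS working directory)
demo_dir = tempfile.mkdtemp()
before = setup_done(demo_dir, 'TOI421', 'TOI421c')      # nothing created yet
os.makedirs(os.path.join(demo_dir, 'TOI421', 'TOI421c_Saved_data'))
open(init_file_path(demo_dir, 'TOI421', 'TOI421c'), 'w').close()
after = setup_done(demo_dir, 'TOI421', 'TOI421c')       # empty placeholder file now exists
```

If the check fails for your own `working_path`, run the set-up notebook first with the same `star_name` and `pl_name`.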

# General

**Modules**

The workflow runs each module successively. Outputs of a given module are stored on disk and used as inputs by subsequent modules. 
Running the cell of a given module activates it. The `calc_X` field then allows you to set the module to *calculation* or *retrieval* mode, so that it either processes data (which can take some time with large datasets) or retrieves already processed data. This allows you to run a module (i.e., activate its cell and [run the workflow directly](#Launch_ANTARESS)), check the ANTARESS log and [plots](#Plot_display) to adjust the module settings, and, once satisfied, set the module to *retrieval* mode before moving on to the next module.

If you change a module's settings, do not forget to run the workflow again after setting this module and all modules depending on its outputs to *calculation* mode.
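
The *calculation* / *retrieval* logic described above can be pictured with the following minimal sketch. It is only an illustration of the pattern: the function `run_module`, the pickle storage, and the file naming are assumptions, not the way ANTARESS stores its products.

```python
import os
import pickle
import tempfile

def run_module(name, calc, compute, save_dir):
    """Run a module in calculation mode (calc=True) or retrieval mode (calc=False)."""
    path = os.path.join(save_dir, name + '.pkl')
    if calc:
        result = compute()                      # potentially slow processing
        with open(path, 'wb') as f:
            pickle.dump(result, f)              # stored on disk for later modules
    else:
        if not os.path.isfile(path):
            raise FileNotFoundError(f'{name}: no saved output, run in calculation mode first')
        with open(path, 'rb') as f:
            result = pickle.load(f)             # reuse the saved output
    return result

# Demonstration: the processing function is only called once
save_dir = tempfile.mkdtemp()
calls = []

def slow_step():
    calls.append(1)
    return [1.0, 2.0, 3.0]

first = run_module('demo', True, slow_step, save_dir)    # computes and saves
second = run_module('demo', False, slow_step, save_dir)  # loads from disk
```

The sketch also shows why a change of settings requires *calculation* mode: in *retrieval* mode the saved output is returned whatever the current settings are.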

**Plots**

ANTARESS plots require intermediate data products that are only saved if the plots have been requested when running the workflow. 
If you want plots to be performed for a given module, you thus need to run this module's cell (in *calculation* mode) and the plot's cell before running the workflow.

Most plots are specific to a given module. You can however use the `sp_raw` and `trans_sp` fields to plot either the raw spectrum (`sp_raw`) or the transmission spectrum (`trans_sp`) before and after the telluric correction.




In [None]:
input_nbook['sp_reduc'].update({
    'sp_raw'     : False,
    'trans_sp'   : False,
    'wav_range'  : [3000,7000],
    'y_range'    : None,
    'norm_prof'  : True,
    'plot_master': False,
    'multi_exp'  : True
})

# Data upload and formatting

Run this cell to initialize the module tasked with uploading the dataset and putting it in ANTARESS format. 
Then, [run the workflow directly](#Launch_ANTARESS). 
Once the module task is done, set `calc_proc_data` to `False` so that the module remains activated but retrieves the processed data instead of processing them again.

All modules in the workflow work in this way, with a `calc_X` field that sets them in *calculation* or *retrieval* mode (this avoids unnecessary computation time). The outputs of a given module are used as inputs by other ones. It is recommended to run the modules sequentially, checking the run log and the [plots](#Plot_display) before setting each one to *retrieval* mode and moving on to the next module.

**Important**
- If you change a module's settings, do not forget to run the workflow again after setting this module and all modules depending on its outputs to *calculation* mode.
- Plots must be requested for a module to save the required data products. If you want plots to be performed for a given module, it must be set to *calculation* mode before running the workflow.

In [7]:
input_nbook['sp_reduc']['calc_proc_data'] = True #& False

ANTARESS_nbook_bground.processing_mode(input_nbook)

# Spectral corrections

<a id='inst_cal'></a>
**Instrumental calibration**

Input spectra from the DRS are assumed to be corrected for standard instrumental effects, in particular the blaze from the spectrograph grating.
Run this cell to retrieve the blaze and detector noise profiles, which are used throughout the workflow to temporarily scale spectra back from flux density to blazed count units, and to perform accurate weighted means.

For these operations to be accurate, you need to provide the `S2D_BLAZE_A.fits` files of your dataset in the same directory as the input data.
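
As a quick sanity check before running the module, the sketch below lists the exposures that have no blaze file next to them. It assumes the file naming visible in the run log further down (`..._S2D_A.fits` paired with `..._S2D_BLAZE_A.fits`); the helper `missing_blaze_files` is hypothetical and not part of ANTARESS.

```python
import os
import tempfile

def missing_blaze_files(data_dir):
    """List the blaze files expected in data_dir but not found there."""
    files = set(os.listdir(data_dir))
    suffix = '_S2D_A.fits'
    missing = []
    for name in sorted(files):
        if name.endswith(suffix):
            # Expected counterpart of this exposure
            blaze = name[:-len(suffix)] + '_S2D_BLAZE_A.fits'
            if blaze not in files:
                missing.append(blaze)
    return missing

# Demonstration with empty placeholder files
demo_dir = tempfile.mkdtemp()
for fname in ['exp1_S2D_A.fits', 'exp1_S2D_BLAZE_A.fits', 'exp2_S2D_A.fits']:
    open(os.path.join(demo_dir, fname), 'w').close()
missing = missing_blaze_files(demo_dir)
```

An empty list means that every S2D exposure in the directory has its blaze counterpart.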

In [8]:
input_nbook['sp_reduc']['calc_gcal'] = True # & False

ANTARESS_nbook_bground.inst_cal(input_nbook)

Run the cell below to plot the retrieved calibration and detector noise profiles.

Define the indexes of the exposure and spectrograph order to be plotted.

In [9]:
input_nbook['sp_reduc']['iexp2plot'] = 25
input_nbook['sp_reduc']['iord2plot'] = 113

ANTARESS_nbook_bground.inst_cal_plot(input_nbook)

**Telluric correction**

Run this cell to correct spectra for telluric absorption lines. You can adjust: 

- `tell_species` (list, string): telluric molecules to consider (H$_2$O, O$_2$, CH$_4$, CO, CO$_2$)
- `tell_thresh` (float): telluric lines with contrast deeper than this threshold (0 = no absorption, 1 = full absorption) are masked
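
To make the `tell_thresh` convention concrete, here is a minimal sketch of the masking criterion, assuming it is applied pixel by pixel to a model telluric transmission (1 = no absorption, 0 = full absorption). The function `mask_deep_tellurics` is hypothetical; the actual ANTARESS implementation may differ.

```python
def mask_deep_tellurics(transmission, tell_thresh):
    """Return one boolean per pixel, True where the pixel is kept."""
    kept = []
    for t in transmission:
        contrast = 1.0 - t                    # 0 = no absorption, 1 = full absorption
        kept.append(contrast <= tell_thresh)  # deeper than the threshold -> masked
    return kept

# Demonstration on made-up transmission values
model = [1.0, 0.95, 0.5, 0.05, 0.0]
kept = mask_deep_tellurics(model, 0.9)
```

With `tell_thresh = 0.9`, only the two pixels where more than 90% of the flux is absorbed are masked.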

In [None]:
input_nbook['sp_reduc'].update({
    'tell_species'     : ['H2O', 'O2'],
    'tell_thresh'      : 0.9
})

# Calculate/retrieve
input_nbook['sp_reduc']['calc_tell'] = True & False  # evaluates to False (retrieval mode); remove '& False' to recalculate

ANTARESS_nbook_bground.tell_corr(input_nbook)

Run the cell below to plot:
- the telluric CCFs from the model, and from the raw and corrected data.
- the properties of the best-fit telluric model

Plots are stored in the `/Working_dir/Star/Planet_Saved_data/XX` directory.

In [9]:
ANTARESS_nbook_bground.tell_corr_plot(input_nbook)

**Masking of pixels**

Run this cell to mask the pixels or spectral ranges that should be excluded from the correction. 

Indicate which spectral orders to mask and the corresponding wavelength ranges:

- `order` (list): spectral orders to be masked, e.g., [10,24]
- `range` (list): wavelength ranges to mask in the spectral orders listed in `order`. The list must have the same length as `order`, with one [low, high] pair per order. For `order = [10,24]`, the format is $[[\lambda_{\mathrm{low}, \mathrm{ord}_{10}}, \lambda_{\mathrm{high}, \mathrm{ord}_{10}}], [\lambda_{\mathrm{low}, \mathrm{ord}_{24}}, \lambda_{\mathrm{high}, \mathrm{ord}_{24}}]]$.
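
The pairing between `order` and `range` can be sketched as follows. This is an illustration only: the function `build_pixel_mask`, the dictionary layout, and the inclusive boundaries are assumptions, and the wavelengths are made up.

```python
def build_pixel_mask(wavelengths, order, wav_ranges):
    """Flag pixels to mask.

    wavelengths: dict mapping an order index to the list of its pixel wavelengths.
    Returns a dict mapping each order index to a list of booleans (True = masked).
    """
    if len(order) != len(wav_ranges):
        raise ValueError('`order` and `range` must have the same length')
    mask = {iord: [False] * len(wav) for iord, wav in wavelengths.items()}
    for iord, (low, high) in zip(order, wav_ranges):
        # Boundaries are treated as inclusive in this sketch
        mask[iord] = [m or (low <= w <= high) for m, w in zip(mask[iord], wavelengths[iord])]
    return mask

# Demonstration: one range in order 10, one in order 24
wavelengths = {10: [4000.0, 4001.0, 4002.0], 24: [4500.0, 4501.0, 4502.0]}
mask = build_pixel_mask(wavelengths, [10, 24], [[4000.5, 4001.5], [4501.0, 4502.0]])
```

To mask two ranges in the same order with this sketch, the order index is simply repeated in `order`; whether the notebook accepts this is not stated here.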

If you chose to exclude spectral ranges from the correction, run this cell to indicate the spectral orders and ranges to mask, then go to [ANTARESS retrieval mode](#process_data), activate `proc_data`, and launch ANTARESS from the [Instrumental calibration](#inst_cal).

If you decide to mask pixels after having already run later steps, the previous steps must be rerun in calculation mode. The masking cell will already have been executed once, so you only need to update it and run it again. Then rerun the instrumental calibration cell above and launch the ANTARESS workflow: the instrumental calibration is then performed on the observational data excluding the masked pixels. Afterwards, deactivate the processing of observational data in this cell, run the cells below to perform the corrections once more with the pixels masked, inspect the new corrections, and continue with the workflow.


In [None]:
input_nbook['sp_reduc'].update({
    'order' : [],
    'range'  : [[]]
})

ANTARESS_nbook_bground.mask_pix(input_nbook)

**Global flux balance correction**

This module applies a first correction based on the flux ratio between each exposure and its visit master, computed as the median of all exposures. You can adjust:

- `sigma_clip` (bool): activate to remove outliers automatically (manually excluding orders with `ord_excl_fit` is the recommended alternative)
- `nord` (integer): number of spectral orders. If it is not known, set it to `None`, then run this cell and the cell below to plot the flux colour balance; the top axis of the plot indicates the number of spectral orders.
- `ord_excl_fit` (list): orders to exclude from the fit (not used if sigma clipping is activated)
- `phantom_range` (float): the range of phantom bins in $\nu$, used to avoid divergence in the blue part of the spectrum.
- `unc_scaling` (float): the variance of fitted bins is scaled to the chosen power (0 = equal weights; 1 = no scaling, i.e. original weights). Increase to give more weight to data with low errors.
- `fit_mode` (string): `'pol'` for a polynomial function or `'spline'` for a 1D smoothing spline
- `pol_deg` (integer): degree of the polynomial function (not used when `'spline'` is chosen)
- `smooth_fac` (float): spline smoothing factor, increase to smooth

Phantom bins can be used to mirror a linear fit onto the blue part of the spectrum, which avoids divergence of the model at the end of the spectrum. The best model parameters are usually found iteratively, by varying the different parameters and inspecting the resulting correction.
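
The quantities that enter the fit can be sketched as follows: the master is the median over exposures, the ratio of one exposure to this master is what gets fitted, and `unc_scaling` sets the power applied to the variance when weighting. The weight formula `1 / variance**unc_scaling` is an assumption that reproduces the two limiting cases listed above; the function name and the data are made up, and error propagation on the ratio is ignored.

```python
from statistics import median

def flux_ratio_and_weights(exposures, variances, iexp, unc_scaling):
    """Return the flux ratio of exposure iexp to the median master, and the fit weights.

    exposures: list of exposures, each a list of binned fluxes on common bins.
    variances: variance of each bin of exposure iexp.
    """
    n_bins = len(exposures[0])
    # Visit master: median over all exposures, bin by bin
    master = [median(exp[ibin] for exp in exposures) for ibin in range(n_bins)]
    ratio = [exposures[iexp][ibin] / master[ibin] for ibin in range(n_bins)]
    # unc_scaling = 0 -> equal weights, unc_scaling = 1 -> plain inverse-variance weights
    weights = [1.0 / var ** unc_scaling for var in variances]
    return ratio, weights

# Demonstration with three exposures and two bins
exposures = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
ratio, weights = flux_ratio_and_weights(exposures, [4.0, 16.0], 0, 0.5)
_, equal_weights = flux_ratio_and_weights(exposures, [4.0, 16.0], 0, 0.0)
```

The polynomial or spline model selected with `fit_mode` would then be fitted to `ratio` using `weights`.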


In [None]:
input_nbook['sp_reduc'].update({
    'sigma_clip'    : False,
    'nord'          : 170,
    'ord_excl_fit'  : [0,1,88,89,90,91,145,146,147,164,165],
    'phantom_range' : 10.,
    'unc_scaling'   : 0.25,
    'fit_mode'      : 'spline',
    # When using the polynomial function
    'pol_deg'       : 4.,
    # When using the 1-D smoothing spline
    'smooth_fac'    : 1.5e-5
})

#True to calculate the correction, False to retrieve last result
input_nbook['sp_reduc'].update({
    'calc_Fbal' : False
})

ANTARESS_nbook_bground.fbal_corr(input_nbook)

**Plotting the global flux colour balance**

Set `gap_exp` to `True` to plot the flux balance correction with the exposures separated, or to `False` to plot all exposures on top of each other.

In [None]:
input_nbook['plots'].update({
    'gap_exp': False
})

#ANTARESS_nbook_bground.plot_fbal_corr(input_nbook)

In [None]:
#ANTARESS_launcher(working_path=input_nbook['working_path'], nbook_dic = input_nbook, exec_comm=False)

**Cosmics correction**

Run this cell to perform the cosmic hits correction.

Cosmic rays are detected by comparing the flux of each exposure with that of exposures taken before and after it. A cosmic hit is considered detected when the relative flux difference exceeds the threshold `thresh` multiplied by the standard deviation of the measured or comparison spectra.

- `align_method` (string): alignment method, `kep` to correct for the Keplerian curve using the information entered in the system properties, `pip` to use pipeline RVs if available. The Keplerian curve is preferred, to avoid RVs biased by the RM effect.
- `ncomp` (integer): number of comparison spectra to be used for the cosmic detection
- `thresh` (float): threshold above which a cosmic hit is considered detected
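
The detection criterion can be illustrated with the sketch below. It assumes that spectra are already aligned on a common grid, takes the `ncomp` exposures closest in index as comparison spectra, and uses their standard deviation as the reference dispersion. These choices and the function `flag_cosmics` are assumptions for illustration; the ANTARESS implementation may differ.

```python
from statistics import mean, stdev

def flag_cosmics(flux, iexp, ncomp, thresh):
    """Return one boolean per pixel, True where exposure iexp is flagged as a cosmic hit.

    flux: list of exposures, each a list of pixel fluxes on a common, aligned grid.
    ncomp must be at least 2 so that a standard deviation can be computed.
    """
    # Comparison exposures: the ncomp closest in time (index), excluding iexp itself
    others = [i for i in range(len(flux)) if i != iexp]
    comp_idx = sorted(others, key=lambda i: abs(i - iexp))[:ncomp]
    flagged = []
    for ipix, value in enumerate(flux[iexp]):
        comp = [flux[i][ipix] for i in comp_idx]
        # A cosmic adds flux: only a positive excess above the threshold is flagged
        flagged.append(value - mean(comp) > thresh * stdev(comp))
    return flagged

# Demonstration: five exposures of three pixels, with a spike in exposure 2
flux = [
    [10.0, 10.2, 9.9],
    [10.1, 9.8, 10.0],
    [9.9, 50.0, 10.1],
    [10.0, 10.0, 9.8],
    [10.0, 10.0, 10.2],
]
flagged = flag_cosmics(flux, iexp=2, ncomp=4, thresh=10.)
```

Lowering `thresh` flags more pixels, at the risk of treating noise as cosmic hits.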


In [None]:
input_nbook['sp_reduc'].update({
    'calc_cosm'    : False,
    'align_method' : 'kep',
    'ncomp'        : 10,
    'thresh'       : 10.
})

ANTARESS_nbook_bground.cosm_corr(input_nbook, plot=False)

In [None]:
#ANTARESS_launcher(working_path=input_nbook['working_path'], nbook_dic = input_nbook, exec_comm=False)

**ESPRESSO "wiggles" correction**

Run this cell to characterise and correct the wiggles. Only the screening and filter options are available in this notebook; to use the analytical version, run the workflow with the [configuration files](https://obswww.unige.ch/~bourriev/antaress/doc/html/installation.html).

In this cell, you use screening to visualise the wiggles and, from there, choose what spectral ranges in $\nu$ to include in the fitting. Generally, you will see large spurious features at the centre of the transmission spectrum located at the end and start frequencies of the red and blue detectors. Additionally, at the blue end of the spectrum, the noise levels will be much larger than the amplitude of the wiggle pattern. Hence, parts of these regions generally need to be removed from the fitting. The filter method is then used to characterise the wiggles using a Savitzky-Golay filter of the binned transmission spectrum in each exposure.

Screening:
- `screening` (bool): activate/deactivate the screening and plotting
- `fit_range` (list): indicate lower and upper boundaries of ranges to include in fit, the format is [[low 1, high 1], [low 2, high 2], [etc.]] in $\nu$, leave empty to use the full range
- `y_range` (list): list including the lower and upper y limit, if None automatic scaling is applied

Filter:
- `filter` (bool): activate/deactivate the filter correction and characterisation
- `window` (float): size of smoothing window in $\nu$
- `deg` (integer): polynomial degree used to fit the smoothed spectrum

When using the filter correction, be careful to choose an appropriate combination of window size and polynomial degree to avoid fitting noise and spurious features in the data.
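
The `fit_range` format can be pictured with this minimal sketch, which selects the bins falling inside any of the requested ranges and keeps everything when the list is left empty. The function `select_fit_bins`, the inclusive boundaries, and the $\nu$ values are made up for illustration.

```python
def select_fit_bins(nu, fit_range):
    """Return the indexes of the bins whose nu lies in one of the [low, high] ranges.

    An empty fit_range ([] or [[]]) selects the full range.
    """
    ranges = [r for r in fit_range if len(r) == 2]
    if not ranges:
        return list(range(len(nu)))
    return [i for i, x in enumerate(nu) if any(low <= x <= high for low, high in ranges)]

# Demonstration on arbitrary nu values
nu = [42.0, 47.5, 52.0, 57.0, 63.0]
selected = select_fit_bins(nu, [[45.0, 55.0], [60.0, 65.0]])
full = select_fit_bins(nu, [[]])
```

In practice the boundaries are read off the screening plot, excluding the detector edges and the noisy blue end described above.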

In [None]:
#Screening
input_nbook['sp_reduc'].update({
    'screening'    : False,
    'fit_range'    : [[]],
    'y_range'      : None
})

#Filter
input_nbook['sp_reduc'].update({
    'filter'       : False,
    'window'       : 0.2,
    'deg'          : 3,
})

#To calculate wiggles set to True, to retrieve previous result set to False. First time it has to be True. 
#Set corr_wig to False to deactivate the module
input_nbook['sp_reduc'].update({
    'corr_wig':False,
    'calc_wig':True
})

ANTARESS_nbook_bground.wiggle_corr(input_nbook)

In [None]:
#ANTARESS_launcher(working_path=input_nbook['working_path'], nbook_dic = input_nbook, exec_comm=False)

# Spectral processing

At this step, the original S2D spectra have been cleaned of instrumental effects, telluric contamination, and cosmic ray hits, and consist of a time-series of cleaned 2D spectra. The following steps allow you to compute a 1D master stellar spectrum from out-of-transit data, which can be used either directly for analysis or to build a custom CCF mask.

The following steps include procedures for aligning, scaling, and weighting the spectra to finally produce a master 1D stellar spectrum.

**Detrending of spectral lines**

If this is the first time you process the spectral time series, skip this step: the trend characterisation is performed in a separate notebook, on the CCFs computed here. After identifying any trends in the [characterisation notebook](link), you can activate the module below, define the corrections to be applied to the S2D spectra, and then recompute the CCFs from the detrended S2D spectra.

To perform the detrending in this step, the [characterisation notebook](link) must first have been used to determine the trends in the data and the appropriate corrections. Use here the coefficient values determined with the [characterisation notebook](link). Define which property to correct and the corresponding dependence. If multiple properties are corrected, add them to the list.

- `use` (bool): set to True to perform a trend correction
- `prop` (list, str): each property to correct, with the corresponding dependence. The property to correct is either RV or contrast (`ctrst`), as a function of either phase or S/N (`snrQ` with ESPRESSO)
- `coeff` (list, list): coefficients of each model, given in decreasing order. Add a new list for each additional property to correct. The RV coefficients should be in km/s.


In [None]:
input_nbook['par'].update({
    'use'  : False,
    'prop' : ['ctrst_snrQ', 'RV_phase'],
    'mode' : ['pol', 'pol'],
    'coeff': [[-1.109190e-05], [-6.840494e-02]]
})

ANTARESS_nbook_bground.detrend(input_nbook)

**Converting spectra into CCFs**

Run this cell to cross-correlate the spectra with a chosen mask. You need to define:

+ `start_RV` (float): lower boundary of the CCF RV grid (in km/s), relative to the systemic velocity
+ `end_RV` (float): upper boundary of the CCF RV grid (in km/s), relative to the systemic velocity
+ `dRV` (float): step size of the CCF RV grid (in km/s). Set to `None` to use instrumental resolution
+ `mask_path` (string): path (relative to `'working_path'`) + name of the mask file
+ `calc_CCF` (bool): set to `False` to retrieve the CCFs and not calculate them again
  
As an example, we provide the [CCF mask](https://gitlab.unige.ch/spice_dune/antaress/-/blob/main/Notebooks/ESPRESSO_new_G9.fits) used by the ESPRESSO DRS for G9-type stars. 
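
As an illustration of how `start_RV`, `end_RV`, and `dRV` define the CCF grid, the sketch below builds the list of RV points. The function `build_rv_grid` is hypothetical, the inclusion of the upper boundary is an assumption, and `default_dRV` is only a placeholder standing in for the instrumental resolution used when `dRV` is `None`.

```python
def build_rv_grid(start_RV, end_RV, dRV, default_dRV=0.5):
    """Return the RV grid points (km/s) from start_RV to end_RV inclusive."""
    # default_dRV is a placeholder value, not the actual instrumental step
    step = default_dRV if dRV is None else dRV
    n_steps = int(round((end_RV - start_RV) / step))
    return [start_RV + i * step for i in range(n_steps + 1)]

# Demonstration with the boundaries used in this notebook
grid = build_rv_grid(-150., 150., None)
small = build_rv_grid(-1., 1., 0.5)
```

A finer `dRV` increases the number of grid points, and therefore the computing time, in proportion.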

In [None]:
input_nbook['par'].update({
    'start_RV' : -150.,
    'end_RV'   :  150.,
    'dRV'      : None,
    'mask_path': '/ESPRESSO_new_G9.fits',
    'calc_CCF' : True,
})
ANTARESS_nbook_bground.conv_CCF(input_nbook,'DI')

<a id='Launch_ANTARESS'></a>
# Running ANTARESS

Run this cell to run the ANTARESS workflow.

In [10]:
from antaress.ANTARESS_launch.ANTARESS_launcher import ANTARESS_launcher
ANTARESS_launcher(working_path=input_nbook['working_path'], nbook_dic = input_nbook, exec_comm=False)

****************************************
Launching ANTARESS
****************************************

Multi-threading: 16 threads available
Running with observational data
Study of: TOI421c
Accounting for Keplerian motion from all planets
Automatic definition of T14[TOI421c]=2.76 h
Default nsub_Dpl[TOI421c]=26

-----------------------
Processing instrument : ESPRESSO
-----------------------
  Reading and initializing 2D echelle spectra
   > Errors propagated from raw data
   > Data processed on individual spectral tables for each exposure
         Calculating data
         Initializing visit 20231106
vis_path_exp[iexp] 0 /Volumes/T7/Exoplanet_systems/TESS/TOI-421/ESPRESSO_S2D/2023-11-06/r.ESPRE.2023-11-07T03:22:53.685_S2D_A.fits
vis_path_blaze_exp[iexp] 0 /Volumes/T7/Exoplanet_systems/TESS/TOI-421/ESPRESSO_S2D/2023-11-06/r.ESPRE.2023-11-07T03:22:53.685_S2D_BLAZE_A.fits
vis_path_exp[iexp] 1 /Volumes/T7/Exoplanet_systems/TESS/TOI-421/ESPRESSO_S2D/2023-11-06/r.ESPRE.2023-11-07T03:28:48.23

SystemExit: 

<a id='Plot_display'></a>
# Plot display

In [None]:
from antaress.ANTARESS_launch.ANTARESS_launcher import ANTARESS_launcher
ANTARESS_launcher(working_path=input_nbook['working_path'], nbook_dic = input_nbook, exec_comm=False)

In [None]:
# Both display helpers are exposed by the public IPython.display module
from IPython.display import Image, HTML

Run the cells below to show saved plots.