<img style="float: center;" src='https://github.com/spacetelescope/jwst-pipeline-notebooks/raw/main/_static/stsci_header.png' alt="stsci_logo" width="900px"/>

In [None]:
%matplotlib notebook

# RW-DDT Quicklook Analysis Template Notebook

**Authors**: Taylor James Bell (ESA/AURA for STScI)<br>
**Last Updated**: August 04, 2025<br>
**jwst Pipeline Version**: 1.18.0 (Build 11.3)<br>
**Eureka! Pipeline Version**: https://github.com/kevin218/Eureka/tree/tjb_rwddt

Note that additional contextual information can be found in `README_Quicklook.md`

**Purpose**:<br/>

It should not be necessary to edit any cells other than in the [1. Define your eventlabel and top directory](#1.-Define-your-eventlabel-and-top-directory) section and the [5.2 Setting up the Stage 5 EPF](#5.2-Setting-up-the-Stage-5-EPF) subsection unless you want to manually explore/optimize different data processing steps.

**Data**:<br/>
This notebook assumes the Stage 1 rateints files have already been downloaded from MAST using the `rocky-worlds-utils/download_JWST.py` script.

**JWST pipeline version and CRDS context**:<br/>
This notebook was written for the calibration pipeline version given above and uses the context associated with this version of the JWST Calibration Pipeline. Information about this and other contexts can be found on the JWST Calibration Reference Data System (CRDS) [server](https://jwst-crds.stsci.edu/). If you use a different pipeline version, please refer to the table [here](https://jwst-crds.stsci.edu/display_build_contexts/) to determine what context to use. To learn more about the differences between pipeline versions, read the relevant [documentation](https://jwst-docs.stsci.edu/jwst-science-calibration-pipeline/jwst-operations-pipeline-build-information).

## Table of Contents
- [0. Importing the required components](#0.-Importing-the-required-components)
- [1. Define your eventlabel and top directory](#1.-Define-your-eventlabel-and-top-directory)
- [2. Stage 2](#2.-Stage-2)
- [3. Stage 3 - Pixels to Lightcurve](#3.-Stage-3---Pixels-to-Lightcurve)
- [4. Stage 4 - Removing time-series outliers](#4.-Stage-4---Removing-time-series-outliers)
- [5. Stage 5 - Fitting the lightcurve](#5.-Stage-5---Fitting-the-lightcurve)

## 0. Importing the required components

There should be no need to change anything in this section.

In [None]:
# Importing a bunch of Eureka! components
import eureka.lib.plots
import eureka.S2_calibrations.s2_calibrate as s2
import eureka.S3_data_reduction.s3_reduce as s3
import eureka.S4_generate_lightcurves.s4_genLC as s4
import eureka.S5_lightcurve_fitting.s5_fit as s5

# Set up some parameters to make plots look nicer. You can set usetex=True if you have LaTeX installed
eureka.lib.plots.set_rc(style='eureka', usetex=False, filetype='.png')

# Some imports to interact with outputs within the Jupyter notebook
from IPython.display import Image, display
from glob import glob
import numpy as np
import matplotlib.pyplot as plt
plt.ioff()

## 1. Define your eventlabel and top directory

Next, we need to choose a short, meaningful label (without spaces) that describes the data we're currently working on. This eventlabel will be used to give nicknames to all your output folders and files.

We also need to tell the notebook where all our data is going to be stored.

In [None]:
# Enter in a custom eventlabel that will be used to distinguish the outputs
# from all subsequent processing
eventlabel = '' ## <- ENTER YOUR LABEL HERE

# Specify here the top directory that will contain all ingested and output files
topdir = '/home/rwddt' ## <- ENTER YOUR TOPDIR HERE (leave at /home/rwddt if using Docker)

# Specify your analysis name here (analysis if using Docker, Analysis_A otherwise)
analysis_name = 'analysis'
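As a quick sanity check before moving on, the sketch below tests that a label is non-empty and contains no whitespace. The helper name `is_valid_eventlabel` is hypothetical and is not part of Eureka!; rejecting path separators is an extra assumption made here because the label ends up in file names. The example labels are made up.

```python
def is_valid_eventlabel(label):
    """Return True if label is non-empty and has no whitespace or path separators."""
    if not label:
        return False
    has_whitespace = any(ch.isspace() for ch in label)
    # Assumption: path separators are rejected because the label is used in file names
    has_separator = '/' in label or '\\' in label
    return not (has_whitespace or has_separator)


# Made-up labels, for illustration only
for candidate in ['', 'my target', 'mytarget_visit1']:
    print(repr(candidate), is_valid_eventlabel(candidate))
```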

## 2. Stage 2

### 2.1 Setting up the Stage 2 ECF

For quick-look analyses, there is no need to change any of the Stage 2 settings beyond what is provided in the template, so we'll just read those defaults in.

In [None]:
s2_ecf_contents = f"""# Eureka! Control File for Stage 2: Data Reduction

# Stage 2 Documentation: https://eurekadocs.readthedocs.io/en/latest/ecf.html#stage-2

pmap                1364

# Project directory
topdir              {topdir}

# Directories relative to topdir
inputdir            MAST_Stage1
outputdir           {analysis_name}/Quicklook/Stage2
"""

# This will save the ECF as a file that the next cell can read-in
with open(f'./S2_{eventlabel}.ecf', 'w') as f:
    f.write(s2_ecf_contents)
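Before running Stage 2, it can be worth confirming that the Stage 1 rateints files are where the ECF expects them. The sketch below is a hypothetical helper (not part of Eureka!) that searches the `MAST_Stage1` input directory for files ending in `rateints.fits`. It assumes the files sit either directly in that folder or in subfolders of it, which may not match your download layout.

```python
import os
from glob import glob


def find_rateints(topdir, inputdir='MAST_Stage1'):
    """Return a sorted list of rateints FITS files found under topdir/inputdir."""
    # '**' with recursive=True matches the folder itself and any subfolders
    pattern = os.path.join(topdir, inputdir, '**', '*rateints.fits')
    return sorted(glob(pattern, recursive=True))


files = find_rateints('/home/rwddt')
print(f'Found {len(files)} rateints file(s)')
```

If this reports zero files, check `topdir` and the download step before running Stage 2.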

### 2.2 Running Stage 2

Here we run the Eureka! Stage 2 pipeline using the settings we defined above. This should take <1 minute, but that will depend on the data volume of the observation you're working on and the specifics of your CPU.

In [None]:
s2_meta = s2.calibrateJWST(eventlabel)

## 3. Stage 3 - Pixels to Lightcurve

### 3.1 Setting up the Stage 3 ECF

For quick-look analyses, there is no need to change any of the Stage 3 settings beyond what is provided in the template, so we'll just read those defaults in.

In [None]:
s3_ecf_contents = f"""# Eureka! Control File for Stage 3: Data Reduction

# Stage 3 Documentation: https://eurekadocs.readthedocs.io/en/latest/ecf.html#stage-3

ncpu            16
max_memory      1.5

pmap            1364

# Background parameters
ff_outlier      True        # Set False to use only background region (recommended for deep transits)
                            # Set True to use full frame (works well for shallow transits/eclipses)
bg_thresh       [5,5]
interp_method   linear      # Interpolate bad pixels. Options: None (if no interpolation should be performed), linear, nearest, cubic

# Centroiding parameters
centroid_method mgmc        # Method used for centroiding. Options: mgmc, fgc
ctr_guess       fits        # Initial guess of centroid position. If None, will first perform centroiding on whole frame (can sometimes fail)

# Photometric extraction parameters
phot_method     photutils   # photutils (aperture photometry using photutils), poet (aperture photometry using code from POET), or optimal (for optimal photometric extraction)
aperture_edge   exact       # center (pixel is included only if its center lies within the aperture), or exact (pixel is weighted by the fractional area that lies within the aperture)
photap          5           # Size of photometry aperture radius in pixels
skyin           16          # Inner sky annulus edge, in pixels
skywidth        30          # Width of the sky annulus, in pixels

# Diagnostics
nplots          3

# Project directory
topdir          {topdir}

# Directories relative to topdir
inputdir        {analysis_name}/Quicklook/Stage2
outputdir       {analysis_name}/Quicklook/Stage3
"""

# This will save the ECF as a file that the next cell can read-in
with open(f'./S3_{eventlabel}.ecf', 'w') as f:
    f.write(s3_ecf_contents)
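To make the photometric extraction parameters above more concrete, the sketch below works out the geometry they imply: a circular aperture of radius `photap` and a sky annulus running from `skyin` to `skyin + skywidth`. The function name is hypothetical. The label format is inferred only from the `ap5_bg16_46` folder name that appears in the Stage 5 log path discussed in Section 5.4, so treat the general pattern as an assumption.

```python
import math


def aperture_geometry(photap, skyin, skywidth):
    """Return the outer annulus radius, pixel areas, and a folder-style label."""
    skyout = skyin + skywidth  # Outer edge of the sky annulus, in pixels
    return {
        'skyout': skyout,
        'aperture_area': math.pi * photap**2,
        'annulus_area': math.pi * (skyout**2 - skyin**2),
        # Assumed naming pattern, inferred from the 'ap5_bg16_46' example
        'label': f'ap{photap}_bg{skyin}_{skyout}',
    }


geom = aperture_geometry(photap=5, skyin=16, skywidth=30)
print(geom['label'])  # ap5_bg16_46
print(f"Aperture: {geom['aperture_area']:.1f} px^2, annulus: {geom['annulus_area']:.1f} px^2")
```

The annulus covers far more pixels than the aperture, which is why a contaminating source inside the annulus is worth checking for in the figures below.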

### 3.2 Running Stage 3

Here we run the Eureka! Stage 3 pipeline using the settings we defined above. This should take ~1 minute, but that will depend on the data volume of the observation you're working on and the specifics of your CPU.

In [None]:
spec, s3_meta = s3.reduce(eventlabel)

### 3.3 Investigating the Stage 3 outputs

#### Figures 3306

These figures show the results of the centroiding algorithm. You should examine these and ensure that there is a red "x" approximately centered on the stellar PSF, a red circle encircling most of the core of the stellar PSF, and a white annulus that will be used for background-subtraction that does not contain other nearby stars or galaxies. If you do not see a stellar PSF under the red "x" or see other sources in the background annulus, reach out to your TSO Mentor and/or Taylor Bell to investigate this issue further.

In [None]:
figures = np.sort(glob(s3_meta.outputdir+'figs/fig3306*'))
for figure in figures:
    display(Image(filename=figure, embed=True, width=700))

#### Figure 3109

This is a summary plot of the centroiding results on all of the integrations, showing changes in the centroid position and PSF-width as a function of time. You can expect some noise/jitter in each of these parameters, and possibly a small number of 1-integration spikes that may be caused by unmasked cosmic rays. You can also expect a small, gradual drift in the PSF width (especially at the start of the observation). If, however, you see any sudden and large shifts in the centroid position or PSF width (that may be caused by failed tracking or a mirror tilt event), reach out to Taylor Bell to assess whether this is indicative of a failed observation.

In [None]:
figure = glob(s3_meta.outputdir+'figs/fig3109*')[0]
display(Image(filename=figure, embed=True, width=700))

#### Figure 3108

This is a first look at the lightcurve of our target. You may see a small number of large outliers which may be caused by unmasked cosmic rays or bad pixels; these will be taken care of in Stage 4 and are not immediately of concern. There will likely also be a gradual downward trend in the data, which may be especially steep in the first tens of integrations, as the detector and instrument settle and any persistence decays. If a substantial number of the integrations look to be extreme outliers, contact Taylor Bell to assess whether this is indicative of a major issue (think >10 points that are >100 sigma away or something, and don't worry about >5 sigma outliers at this point).

In [None]:
figure = glob(s3_meta.outputdir+'figs/fig3108*')[0]
display(Image(filename=figure, embed=True, width=700))
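The ">10 points that are >100 sigma away" rule of thumb can be checked numerically as well as by eye. The sketch below is one possible way to do that, not something Eureka! provides: it estimates the scatter robustly from the median absolute deviation (MAD) and counts points beyond a chosen number of sigma. The function name, the use of the MAD, and the synthetic lightcurve are all assumptions for illustration. A strong ramp in the real data would inflate the MAD, so this is only a rough guide.

```python
import numpy as np


def count_extreme_outliers(flux, nsigma=100):
    """Count points more than nsigma robust standard deviations from the median."""
    flux = np.asarray(flux, dtype=float)
    median = np.nanmedian(flux)
    # 1.4826 * MAD approximates the standard deviation for Gaussian noise
    robust_sigma = 1.4826 * np.nanmedian(np.abs(flux - median))
    return int(np.sum(np.abs(flux - median) > nsigma * robust_sigma))


# Synthetic lightcurve: Gaussian noise plus three injected extreme outliers
rng = np.random.default_rng(0)
flux = 1.0 + 1e-3 * rng.standard_normal(500)
flux[[10, 200, 350]] += 0.5
print(count_extreme_outliers(flux))
```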

## 4. Stage 4 - Removing time-series outliers

### 4.1 Setting up the Stage 4 ECF

For quick-look analyses, there is no need to change any of the Stage 4 settings beyond what is provided in the template, so we'll just read those defaults in.

In [None]:
s4_ecf_contents = f"""# Eureka! Control File for Stage 4: Generate Lightcurves

# Stage 4 Documentation: https://eurekadocs.readthedocs.io/en/latest/ecf.html#stage-4

# Number of spectroscopic channels spread evenly over given wavelength range
nspecchan       1
compute_white   False

# Parameters for sigma clipping
clip_binned     True    # Whether or not sigma clipping should be performed on the binned 1D time series
sigma           3.5     # The number of sigmas a point must be from the rolling median to be considered an outlier
box_width       20      # The width of the box-car filter (used to calculate the rolling median) in units of number of data points
maxiters        20      # The number of iterations of sigma clipping that should be performed.
boundary        fill    # Use 'fill' to extend the boundary values by the median of all data points (recommended), 'wrap' to use a periodic boundary, or 'extend' to use the first/last data points
fill_value      mask    # Either the string 'mask' to mask the outlier values (recommended), 'boxcar' to replace data with the mean from the box-car filter, or a constant float-type fill value.

# Project directory
topdir          {topdir}

# Directories relative to topdir
inputdir        {analysis_name}/Quicklook/Stage3
outputdir       {analysis_name}/Quicklook/Stage4
"""

# This will save the ECF as a file that the next cell can read-in
with open(f'./S4_{eventlabel}.ecf', 'w') as f:
    f.write(s4_ecf_contents)
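To build intuition for how `sigma`, `box_width`, and `maxiters` interact, here is a minimal NumPy sketch of iterative sigma clipping against a rolling median. It is a simplified illustration and not Eureka!'s implementation: the function name is hypothetical, the window is rounded up to an odd length, and the edges are padded with the global median (loosely analogous to `boundary fill`). Flagged points are masked, as with `fill_value mask`.

```python
import numpy as np


def rolling_median_clip(flux, sigma=3.5, box_width=20, maxiters=20):
    """Return a boolean mask that is True for points flagged as outliers."""
    flux = np.asarray(flux, dtype=float)
    mask = np.zeros(flux.size, dtype=bool)
    half = box_width // 2
    for _ in range(maxiters):
        good = np.where(mask, np.nan, flux)  # Ignore already-flagged points
        fill = np.nanmedian(good)
        # Pad both ends with the median of the unflagged data
        padded = np.pad(good, half, mode='constant', constant_values=fill)
        # Window length is 2*half + 1 so that it is centred on each point
        windows = np.lib.stride_tricks.sliding_window_view(padded, 2 * half + 1)
        trend = np.nanmedian(windows, axis=1)
        resid = good - trend
        new_outliers = np.abs(resid) > sigma * np.nanstd(resid)
        if not new_outliers.any():
            break  # Converged: no new outliers in this iteration
        mask |= new_outliers
    return mask


# Synthetic example: Gaussian noise with one injected outlier
rng = np.random.default_rng(0)
flux = 1.0 + 1e-3 * rng.standard_normal(500)
flux[100] += 0.05
mask = rolling_median_clip(flux)
print(mask[100], mask.sum())
```

Lowering `sigma` flags more points. Widening `box_width` makes the rolling median less able to follow fast trends such as the initial ramp.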

### 4.2 Running Stage 4

Here we run the Eureka! Stage 4 pipeline using the settings we defined above. This should take << 1 minute.

In [None]:
spec, lc, s4_meta = s4.genlc(eventlabel)

### 4.3 Investigating the Stage 4 outputs

#### Figure 4102

This shows the sigma-clipped version of the lightcurve of our target, which should look very similar to Figure 3108 but without extreme outliers. These are the data points which will be fitted in Stage 5. If there are still substantial outliers, you will need to adjust the `sigma` and `box_width` settings in your Stage 4 ECF in order to properly catch all the outliers. This might take a bit of guess-and-check work, but if you can't find reasonable settings reach out to your TSO Mentor and/or Taylor Bell.

In [None]:
figure = glob(s4_meta.outputdir+'figs/fig4102*')[0]
display(Image(filename=figure, embed=True, width=700))

## 5. Stage 5 - Fitting the lightcurve

### 5.1 Setting up the Stage 5 ECF

For quick-look analyses, there is no need to change any of the Stage 5 ECF settings beyond what is provided in the template, so we'll just read those defaults in.

In [None]:
s5_ecf_contents = f"""# Eureka! Control File for Stage 5: Lightcurve Fitting

# Stage 5 Documentation: https://eurekadocs.readthedocs.io/en/latest/ecf.html#stage-5

ncpu            16    # The number of CPU threads to use when running emcee or dynesty in parallel

fit_par         ./S5_{eventlabel}.epf
fit_method      [dynesty]
run_myfuncs     [batman_ecl, polynomial, expramp, xpos, ypos, xwidth, ywidth]

#GP inputs
kernel_inputs   ['time']  # options: time
kernel_class    ['Matern32']  # options: ExpSquared, Matern32, Exp, RationalQuadratic for george, Matern32 for celerite (sums of kernels possible for george separated by commas)
GP_package      'celerite'  # options: george, celerite

# Manual clipping in time
manual_clip     [[None,50]]   # Remove the first 50 integrations which will be most affected by detector settling

# dynesty fitting parameters
run_nlive       'min'         # Must be > ndim * (ndim + 1) // 2. Use 'min' to use the minimum safe number
run_bound       'multi'
run_sample      'rwalk'
run_tol         0.01

# Plotting controls
interp          True    # Should astrophysical model be interpolated (useful for uneven sampling like that from HST)

# Diagnostics
isplots_S5      5       # Generate few (1), some (3), or many (5) figures (Options: 1 - 5)

# Project directory
topdir          {topdir}

# Directories relative to topdir
inputdir        {analysis_name}/Quicklook/Stage4
outputdir       {analysis_name}/Quicklook/Stage5
"""

# This will save the ECF as a file that the next cell can read-in
with open(f'./S5_{eventlabel}.ecf', 'w') as f:
    f.write(s5_ecf_contents)

### 5.2 Setting up the Stage 5 EPF

Update the astrophysical parameter values and priors based on those provided in the relevant Jira ticket.

In [None]:
s5_epf_contents = """
# Stage 5 Fit Parameters Documentation: https://eurekadocs.readthedocs.io/en/latest/ecf.html#stage-5-fit-parameters

#Name         Value                 Free?            PriorPar1        PriorPar2    PriorType
# "Free?" can be free, fixed, white_free, white_fixed, shared, or independent
# PriorType can be U (Uniform), LU (Log Uniform), or N (Normal).
# If U/LU, PriorPar1 and PriorPar2 represent lower and upper limits of the parameter/log(the parameter).
# If N, PriorPar1 is the mean and PriorPar2 is the standard deviation of a Gaussian prior.
#-------------------------------------------------------------------------------------------------------
rp          YourRadiusHere      'fixed'
fp          YourFpHere          'free'           YourFpHere    2000e-6     N
# ------------------
# Orbital parameters
# ------------------
per         YourPerHere         'fixed'
t_secondary YourTsecHere        'fixed'
inc         YourIncHere         'fixed'
a           YourAHere           'fixed'
ecc         0.                  'fixed'
w           90.                 'fixed'
time_offset 0                   'independent'
# The following two lines are commented out, but you can uncomment them (while commenting out the ecc and w lines above) and edit them if needed for your planet
# ecosw       YourEcoswHere       'fixed'          YourEcoswHere YourEcoswUncertHere N
# esinw       YourEsinwHere       'fixed'          YourEsinwHere YourEsinwUncertHere N
# --------------------------------------------------------------------------
# Systematic variables (these can be left as-is for the Quick-Look analysis)
# --------------------------------------------------------------------------
# Polynomial Parameters
c0          0.999               'free'           0.999         0.01        N
c1          -0.002              'free'           0.0           0.1         N
# Ramp Parameters
r0          0.002               'free'           0.0           0.01        N
r1          50                  'free'           3             300         U
# Centroid decorrelation parameters
ypos        0.0                 'free'           0.0           0.5         N
xpos        0.0                 'free'           0.0           0.5         N
ywidth      0.0                 'free'           0.0           0.5         N
xwidth      0.0                 'free'           0.0           0.5         N
# White noise
scatter_mult 1.4                'free'           0.8           10          U
"""

# This will save the EPF as a file that the next cell can read-in
with open(f'./S5_{eventlabel}.epf', 'w') as f:
    f.write(s5_epf_contents)
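The Stage 5 ECF notes that `run_nlive` must exceed `ndim * (ndim + 1) // 2`, where `ndim` is the number of fitted parameters. The sketch below counts the rows marked `'free'` in an EPF-style string and evaluates that bound. Both helper names are hypothetical, and counting only `'free'` rows (ignoring options such as `shared` or `white_free`) is a simplifying assumption. The numeric values in the short example EPF are made up.

```python
def count_free_params(epf_text):
    """Count non-comment EPF rows whose third column is 'free'."""
    n_free = 0
    for line in epf_text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # Skip blank lines and comments
        columns = line.split()
        if len(columns) >= 3 and columns[2].strip("'\"") == 'free':
            n_free += 1
    return n_free


def nlive_lower_bound(ndim):
    """Lower bound quoted in the Stage 5 ECF comment for run_nlive."""
    return ndim * (ndim + 1) // 2


# Made-up values, for illustration only
example_epf = """
rp   0.1     'fixed'
fp   100e-6  'free'   100e-6  2000e-6  N
c0   0.999   'free'   0.999   0.01     N
# c1 -0.002  'free'   0.0     0.1      N
"""
ndim = count_free_params(example_epf)
print(ndim, nlive_lower_bound(ndim))
```

With the template EPF above left as-is, there are ten `'free'` rows (`fp`, `c0`, `c1`, `r0`, `r1`, `ypos`, `xpos`, `ywidth`, `xwidth`, `scatter_mult`), so the bound evaluates to 55.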

### 5.3 Running Stage 5

Here we run the Eureka! Stage 5 pipeline using the settings we defined above. This should take ~1 minute or less, but that will depend on the data volume of the observation you're working on, the specifics of your CPU, and how well the model matches your data.

In [None]:
s5_meta = s5.fitlc(eventlabel)

### 5.4 Investigating the Stage 5 outputs

#### Figure 5101 _dynestyStartingPoint

This is a plot showing the starting point of the fit to the observations. The top panel shows the raw data points in blue (the same data from Figure 4102), and the starting point of the model fit in grey. You should make sure that there is an eclipse signal shown in the grey model (don't worry whether that same eclipse depth is obvious in the blue points yet); if there is no eclipse signal shown in the grey model, then it is likely that you have incorrectly entered one of the astrophysical parameters, with the most likely culprit being that you've incorrectly specified t_secondary (note that this parameter should be in BMJD_TDB which is equal to BJD_TDB - 2,400,000.5). At this point, do not worry if the model doesn't do a very good job at fitting the data; this is just where the model was initialized, and the fitting process is pretty robust to bad initial guesses.

In [None]:
figure = glob(s5_meta.outputdir+'figs/fig5101*_dynestyStartingPoint*')[0]
display(Image(filename=figure, embed=True, width=700))
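If the grey model shows no eclipse, a units mistake in `t_secondary` is a likely cause. The sketch below shows the BJD_TDB to BMJD_TDB conversion described above. It also uses the standard ephemeris relation to find the eclipse closest to a given observation time, which can help confirm that an eclipse is expected within the observed window. Both function names are hypothetical and the numbers are made up.

```python
def bjd_to_bmjd(bjd_tdb):
    """Convert BJD_TDB to BMJD_TDB by subtracting 2,400,000.5 days."""
    return bjd_tdb - 2400000.5


def nearest_eclipse(t_secondary, per, t_obs):
    """Return the eclipse time closest to t_obs, stepping by whole orbits."""
    n_orbits = round((t_obs - t_secondary) / per)
    return t_secondary + n_orbits * per


# Made-up numbers, for illustration only
t_sec = bjd_to_bmjd(2460000.5)
print(t_sec)
print(nearest_eclipse(t_sec, per=2.0, t_obs=60007.1))
```

All three time inputs to `nearest_eclipse` must be in the same time system, here BMJD_TDB.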

#### Figure 5101

This is the same style of plot as the _dynestyStartingPoint version, but instead shows the final fit to the data. The grey model points in the top panel will likely no longer look like a smooth line but instead exhibit a noticeable amount of jitter which is the result of the centroid-detrending model fit which is able to slightly reduce the final noise level in the lightcurve. In the middle panel, the systematic noise model has been removed from both the data (in blue) and the model (in grey), and the grey model will now show only the fitted eclipse signal which should do a reasonable job at fitting the observations. The residuals of the fit are shown in the bottom panel, which should be centered around 0 and should ideally not show any residual trends (only showing Gaussian scatter). If the model appears to be quite poorly fit to the data, reach out to Taylor Bell who will help investigate the cause of this poor fit.

In [None]:
figure = glob(s5_meta.outputdir+'figs/fig5101*dynesty.png')[0]
display(Image(filename=figure, embed=True, width=700))

#### Figure 5302

This figure shows a histogram of the residuals of the fit to the model in blue, along with a black curve showing the expected Gaussian distribution of the residuals given the fitted noise level. The blue empirical histogram and the black expected distribution should be reasonably well matched, and any points lying beyond -5 or +5 are indicative of outliers that were missed during the Stage 4 sigma-clipping. Related to this figure, you should also check the fitted value for scatter_mult that was printed to your terminal at the end of the quicklook analysis (which is also printed in the Stage5/S5_.../ap5_bg16_46/S5_....log file if you've lost access to the terminal outputs). This parameter is what scales the estimated uncertainties on each integration, with a value of 1.0 meaning that Stage 3 perfectly estimated the noise level. It is entirely expected that this value will be larger than 1.0 (likely something between 1-2) since Stage 3 doesn't account for background noise levels (instead only accounting for Poisson noise from our host star). If, however, the fitted scatter_mult value is very large (>3 or something), this is likely indicative of especially noisy data and may indicate a failed observation; contact Taylor Bell for help in investigating the source of this excess noise.

In [None]:
figure = glob(s5_meta.outputdir+'figs/fig5302*')[0]
display(Image(filename=figure, embed=True, width=700))
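The role of `scatter_mult` can be illustrated with a few lines of NumPy. The sketch below divides the residuals by the uncertainties after scaling them by `scatter_mult`, then counts how many points fall beyond 5 in those normalized units. It assumes the histogram's horizontal axis is in these normalized units, which is an inference from the description above. The function names and input values are hypothetical.

```python
import numpy as np


def scaled_residuals(residuals, uncertainties, scatter_mult):
    """Normalize residuals by the uncertainties inflated by scatter_mult."""
    residuals = np.asarray(residuals, dtype=float)
    uncertainties = np.asarray(uncertainties, dtype=float)
    return residuals / (scatter_mult * uncertainties)


def count_beyond(z, threshold=5.0):
    """Count normalized residuals with absolute value above the threshold."""
    return int(np.sum(np.abs(z) > threshold))


# Made-up residuals and uncertainties, for illustration only
z = scaled_residuals([0.0, 1.5e-3, -9e-3], [1e-3, 1e-3, 1e-3], scatter_mult=1.5)
print(z, count_beyond(z))
```

A larger `scatter_mult` shrinks every normalized residual, so an outlier count that stays high even after the fit has inflated the uncertainties is a stronger hint that Stage 4 clipping missed something.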

## The End!

By now you should have all the details you need to determine whether or not the observations were successful and, if they were not, at least some preliminary ideas of why they might have failed. Make sure to mark your progress on the relevant Jira ticket and write up your findings for a quick report to share with the rest of the CIT.

P.S. Many other figures are produced in Stages 3 through 5, but those are beyond the scope of our quicklook analysis and will be examined in more detail during the deep-dive analysis.