<img src='https://eurekadocs.readthedocs.io/en/latest/_images/Eureka_logo.png' alt="eureka_logo" width="400px"/><img src='https://exoclimes.org/img/exoslam-bg.png' alt="ariel_france_logo" width="220px"/>

# ExoSLAM 2025
## Tutorial - Eureka Data Reduction of MIRI/LRS Data

**Authors**: Taylor James Bell (ESA/AURA for STScI)<br>
**Last Updated**: June 25, 2025<br>
**jwst Pipeline Version**: 1.18.0 (Build 11.3)<br>
**Eureka! Pipeline Version**: [exoslam2025](https://github.com/kevin218/Eureka/tree/exoslam2025) (based on Eureka! version 1.2.1)

**Purpose**:<br/>

The objective of this notebook is to gain familiarity with how space-telescope spectroscopy is "reduced", which is the term used to describe the process of going from raw pixellated images to measurements of the system's brightness as a function of time (called spectroscopic lightcurves).

**Data**:<br/>
This notebook is set up to use an example dataset from [Program ID](https://www.stsci.edu/jwst/science-execution/program-information) 1366 (PI: Batalha, Natalie), which is the JWST Transiting Exoplanet Community ERS program. In particular, we will use the MIRI/LRS full-orbit phase curve of the hot Jupiter WASP-43b, which was first published in [Bell et al. (2024)](https://ui.adsabs.harvard.edu/abs/2024NatAs...8..879B/abstract). These observations continuously monitored the WASP-43 system for 26.5 hours and contain two eclipses of WASP-43b, one transit of WASP-43b, and the orbital phase variations caused by changes in the regions of WASP-43b's atmosphere that were pointed toward JWST throughout the planet's orbit.

For our purposes, we are only going to work on the first 6 segments of the first exposure to keep things fairly speedy while still getting a realistic sense of what it is like to reduce MIRI/LRS data.

**JWST pipeline version and CRDS context**:<br/>
This notebook was written for the calibration pipeline version given above and uses the context associated with this version of the JWST Calibration Pipeline. Information about this and other contexts can be found on the JWST Calibration Reference Data System (CRDS) [server](https://jwst-crds.stsci.edu/). If you use a different pipeline version, please refer to the table [here](https://jwst-crds.stsci.edu/display_build_contexts/) to determine what context to use. To learn more about the differences between pipeline versions, read the relevant [documentation](https://jwst-docs.stsci.edu/jwst-science-calibration-pipeline/jwst-operations-pipeline-build-information).

**Eureka! pipeline version**:<br/>
Eureka! is a pipeline developed to reduce and analyze HST and JWST data ([Bell et al. 2023](https://joss.theoj.org/papers/10.21105/joss.04503.pdf)).
The code is available on [GitHub](https://github.com/kevin218), and there is also detailed [online documentation](https://eurekadocs.readthedocs.io).
This notebook was written for the particular Eureka! version specified above. If you are not using that particular version, you must ensure that all of your ECF and EPF settings are adjusted as necessary (looking at the demos/JWST folder for the Eureka! version you have installed is a good way to check whether any parameters changed).

**Preparation**:<br/>
The installation instructions and data download instructions are specified in the Eureka! setup instructions email.

## Table of Contents
* [Configuration](#0.-Configuration)
* [Stage 1: jwst Pipeline (Eureka!'s wrapper)](#1.-Stage-1:-jwst-Pipeline-(Eureka!'s-wrapper))
* [Stage 2: jwst Pipeline (Eureka!'s wrapper)](#2.-Stage-2:-jwst-Pipeline-(Eureka!'s-wrapper))
* [Eureka!'s Stage 3 (spectral extraction)](#3.-Eureka!'s-Stage-3-(spectral-extraction))
* [Eureka!'s Stage 4 (spectral binning & outlier clipping)](#4.-Eureka!'s-Stage-4-(spectral-binning-&-outlier-clipping))
* [Bonus: Optimizing your reduction](#Bonus:-Optimizing-your-reduction)

___
## 0. Configuration

The first step is to set up the notebook and environment.

We'll first import Eureka! along with some other useful packages.

In [None]:
import eureka
import os
import numpy as np

Next, we need to choose a short, meaningful label (without spaces) that describes the data we're currently working on. For simplicity, we will just set `eventlabel = 'miri_exoslam'`. This same event label should be used throughout all stages.

In [None]:
eventlabel = ''  # <--- Please update this as described above

Please adjust the following variable to specify where you downloaded the data in the Setup.ipynb notebook!

In [None]:
path_to_data_folder_on_your_machine = '../data/'  # <--- Please update this variable if needed to match your data location
path_to_data_folder_on_your_machine = os.path.expanduser(path_to_data_folder_on_your_machine)

---
## 1. Stage 1: jwst Pipeline (Eureka!'s wrapper)

### 1.1 Setting the Stage 1 "Eureka! Control File"

Eureka! works with control files (.ecf) where Stage-specific parameters are defined (e.g. aperture size, path of the data, etc.).

To begin, please first copy below the contents of the ECF template for MIRI/LRS from the `S1_miri_template.ecf` file in the ECF demos folder on [GitHub](https://github.com/kevin218/Eureka/tree/exoslam2025/demos/JWST).

The most important parameters and their recommended settings are described below, but more context can be found on the [Eureka! documentation website](https://eurekadocs.readthedocs.io/en/latest/ecf.html#stage-1).

For this notebook to work, you must:
1. Set `topdir` to `{path_to_data_folder_on_your_machine}` which will use the folder you specified above.
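
Because the ECF text in the cell below is written as a Python f-string, `{path_to_data_folder_on_your_machine}` is replaced by the value of that variable when the cell runs. The following is a minimal sketch of that substitution only. The two-line ECF fragment, the `example_path` variable, and the example directory are placeholders invented for illustration and are not the contents of the real template.

```python
import os

# Hypothetical data location, used only for this illustration
example_path = os.path.expanduser('~/exoslam_data/')

# A stand-in for a real ECF: only the topdir line matters here
example_ecf = f"""
# Project directory
topdir    {example_path}
"""

# The braces are gone and the expanded path appears in their place
print(example_ecf)
```

One consequence of using an f-string is that any literal curly braces in the pasted ECF text would need to be doubled (`{{` and `}}`) so that Python does not try to interpret them.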

Some of the parameters that might be worth varying are as follows:

* `maximum_cores`: If you want to limit the CPU usage of the stage, for example if you are running your data reduction on a shared server, you can set this to an integer number of cores or one of `'none'` (for single-threaded operations), `'quarter'`, `'half'`, or `'all'`.
* `skip_firstframe`: This might be worth changing between `True`/`False`. The MIRI detectors show marked persistence effects, which manifest as deviations from linearity at the start of each ramp. To mitigate the effect of persistence on the measured countrates, this step flags the first group in every integration, instructing the pipeline to ignore the first group when fitting the ramps. For TSOs, this step is skipped by default (`skip_firstframe = True`), which does not remove the first frame. In theory it'd be best to set this to `False` (which removes the first group, which is known to be noisy), but for integrations with small numbers of groups, it might be worth setting `skip_firstframe` to `True`. It is hard to say for sure without experimenting.
* `skip_lastframe`: This might be worth changing between `True`/`False`. In the case of MIRI detectors, the array is reset during the last frame read in each integration, which results in a reduced level of accumulated counts in the last frame. By default, the pipeline applies this step to all MIRI data (`skip_lastframe = False`), flagging the last group in every integration as bad (as long as the number of groups per integration is greater than 2). This step is strongly recommended for TSOs (`skip_lastframe = False`), given that the last frame effect has been shown to vary from integration to integration. However, for integrations with a very small number of groups it might be worth setting this to `True`. It is hard to say for sure without experimenting.
* `skip_emicorr`: It could be worth changing this between `True`/`False`, but this likely won't have a very large impact. MIRI subarrays suffer from electromagnetic interference (EMI) noise patterns in the raw data that imprint periodic noise into each frame image, with the effect particularly apparent in the case of short ramps, which is typical of most MIRI TSOs. There are two algorithms that the pipeline can select from when applying this step: 'sequential' (the default) fits the EMI in the residuals from ramp fits, while 'joint' carries out a simultaneous fit to the ramps and EMI noise using a reference waveform for the EMI oscillations. The 'sequential' method typically requires 10 or more groups per integration for a reliable fit, while the 'joint' method has been demonstrated to successfully fit EMI noise for any ramp length. In practice, the background subtraction we'll do in Stage 3 ends up providing most of the benefits of this step, so it may not always be worth running. By default, the Eureka! pipeline skips this step (`skip_emicorr = True`), while the jwst pipeline's default is to run the step (`skip_emicorr = False`).
* `skip_rscd`: It could be worth changing this between `True`/`False`. This step mitigates nonlinear behavior at the beginning of each ramp due to transient effects on the detector after every reset. While the exact causes of these effects are not fully understood, the severity of the nonlinearity scales with increasing signal, suggesting a type of persistence on the detectors. Currently, this step in the pipeline flags the first N groups at the beginning of all 2nd and higher integrations, where N is fixed to different values for different observing modes (e.g., N=4 in the case of MIRI LRS slitless TSOs). Due to the short ramps of most TSOs, this step is skipped by default (`skip_rscd = True`), as the flagging of a significant fraction of each ramp results in a severe reduction in the signal-to-noise ratio. However, if the TSO includes relatively large-amplitude features (e.g., exoplanet transits), and if the ramps are long enough to allow for the `rscd` step to be run, turning on this step can ameliorate biases in the relative flux levels due to the signal-dependent nonlinearity effects across the time series.
* `skip_jump` & `jump_rejection_threshold`: The `jump` step flags outliers in the ramps due to cosmic ray hits. The algorithm uses an iterative process that examines the countrates between sequential pairs of groups and applies a sigma-clipping filter to flag groups that yield anomalously large countrates. The default sigma threshold used by the pipeline is `4.0`, though in practice, this setting results in a large number of false positives. Users are recommended to adjust the threshold to higher values (e.g., `8`-`10`) or skip this step altogether in favor of outlier masking at later stages of the data processing workflow. The threshold within Eureka! can be set with the `jump_rejection_threshold` parameter, and the step can be skipped by setting `skip_jump` to `True`.
* `skip_dark_current`: By default this step is run for MIRI TSO observations and is likely best left to run (set the skip parameter to `False`), but it's possible better results could be obtained by experimenting with setting the skip parameter to `True`.
* `skip_refpix`: By default this step is run for MIRI TSO observations that are taken with the FULL array. For subarrays, including the LRS Slitless subarray, this step is automatically skipped and will not run even if set to `False`. It is possible better results could be obtained by experimenting with setting the skip parameter to `True` for data read with the FULL array.
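
To build some intuition for what a jump rejection threshold does, here is a minimal sketch of sigma clipping applied to the group-to-group differences of a single synthetic ramp. It is a simplified stand-in and not a reproduction of the `jwst` pipeline's algorithm: the `flag_jumps` helper, the robust scatter estimate, and all numerical values are assumptions chosen for illustration.

```python
import numpy as np

def flag_jumps(ramp, threshold):
    """Flag group-to-group differences that deviate from the median
    difference by more than `threshold` robust sigmas."""
    diffs = np.diff(ramp)
    med = np.median(diffs)
    # Robust scatter estimate from the median absolute deviation
    sigma = 1.4826 * np.median(np.abs(diffs - med))
    return np.abs(diffs - med) > threshold * sigma

rng = np.random.default_rng(0)
ngroups = 40
# Synthetic ramp: constant countrate plus read noise (arbitrary units)
ramp = 10.0 * np.arange(ngroups) + rng.normal(0, 1.0, ngroups)
# Inject a cosmic-ray-like jump between groups 24 and 25
ramp[25:] += 50.0

for threshold in (4.0, 8.0):
    flagged = np.flatnonzero(flag_jumps(ramp, threshold))
    print(f'threshold={threshold}: flagged differences at {flagged}')
```

Anything flagged at the higher threshold is also flagged at the lower one, so raising the threshold can only reduce the number of flagged groups. The injected jump is large enough to be caught at either setting.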

In [None]:
s1_ecf_contents = f"""

# Fill this f-string text block with the contents of the S1 MIRI ECF template
# from https://github.com/kevin218/Eureka/tree/exoslam2025/demos/JWST
# and then adjust the values as described above.

"""

# This will save the ECF as a file that the next cell can read-in
with open(f'./S1_{eventlabel}.ecf', 'w') as f:
    f.write(s1_ecf_contents)

### 1.2 Running Stage 1

In [None]:
s1_meta = eureka.S1_detector_processing.s1_process.rampfitJWST(eventlabel)

---
## 2. Stage 2: jwst Pipeline (Eureka!'s wrapper)

### 2.1 Setting up the Stage 2 "Eureka! Control File"

To begin, please first copy below the contents of the ECF template for MIRI/LRS from the `S2_miri_lrs_template.ecf` file in the ECF demos folder on [GitHub](https://github.com/kevin218/Eureka/tree/exoslam2025/demos/JWST).

The most important parameters and their recommended settings are described below, but more context can be found on the [Eureka! documentation website](https://eurekadocs.readthedocs.io/en/latest/ecf.html#stage-2).

For this notebook to work, you must:
1. Set `topdir` to `{path_to_data_folder_on_your_machine}` which will use the folder you specified above.

For MIRI TSO data, there is very little else of importance in Stage 2. The main steps of note are:
* `assign_wcs`: This non-optional step is what assigns the wavelength solution to your spectra.
* `flat_field`: This step corrects for the non-uniform illumination and response of different pixels across the detector. Given that the location of the target of interest on the detector is typically extremely stable across the time series, this step often isn't very important for TSO data, as long as you are not concerned with producing absolutely calibrated data.
* `photom`: This step converts the units of the images from countrate units (DN/sec) to astrophysical flux units (MJy/sr). Since the analysis of TSO data generally does not require absolutely calibrated fluxes, we normally skip this step (by setting `skip_photom` to `True`) for exoplanet observations. Doing this has the added advantage of more easily being able to estimate the uncertainties in the time series later on. However, if you want to produce an absolutely calibrated spectrum of the host star in order to compare the observed stellar flux with stellar models, then the `photom` step should be turned on.
* `extract_1d`: This step extracts the stellar spectra and converts the 2D detector image to a 1D spectrum for each integration. The extraction methods used in this step are not ideal for TSO data, and the step can be quite slow, so we typically skip this step (by setting `skip_extract1d` to `True`) when working on exoplanet TSO data.
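
The remark above about `flat_field` can be illustrated with a small synthetic example: if the star stays on the same pixels, a static pixel-to-pixel response multiplies the summed flux by a constant, which cancels once the light curve is normalized. The sketch below assumes a perfectly stable pointing, a noise-free signal, and made-up response values.

```python
import numpy as np

rng = np.random.default_rng(1)
n_int, n_pix = 200, 8

# Synthetic time series with a 1% eclipse-like dip (arbitrary units)
time_signal = np.ones(n_int)
time_signal[80:120] -= 0.01
true_flux = time_signal[:, None] * rng.uniform(50, 100, n_pix)[None, :]

# Hypothetical static pixel response (flat field) with a few percent scatter
flat = rng.normal(1.0, 0.03, n_pix)
observed = true_flux * flat[None, :]

# Sum over pixels, then normalize each light curve by its median
lc_true = true_flux.sum(axis=1)
lc_obs = observed.sum(axis=1)
lc_true /= np.median(lc_true)
lc_obs /= np.median(lc_obs)

print(np.allclose(lc_true, lc_obs))
```

The two normalized light curves agree to numerical precision. This would no longer hold if the star drifted across pixels with different responses, which is why the pointing stability matters.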

In [None]:
s2_ecf_contents = f"""

# Fill this f-string text block with the contents of the S2 MIRI/LRS ECF template
# from https://github.com/kevin218/Eureka/tree/exoslam2025/demos/JWST
# and then adjust the values as described above.

"""

# This will save the ECF as a file that the next cell can read-in
with open(f'./S2_{eventlabel}.ecf', 'w') as f:
    f.write(s2_ecf_contents)

### 2.2 Running Stage 2

In [None]:
s2_meta = eureka.S2_calibrations.s2_calibrate.calibrateJWST(eventlabel)

### 2.3 Visualizing the resulting Stage 2 files

Before we start Stage 3 where we'll do spectroscopic extraction, we wish to visualize Stage 2 files to get a sense for what kind of data we are dealing with.

The images you have just produced are Stage 2 output images in Eureka!'s terms. This means calibrated images from MIRI LRS observations of WASP-43 b ready for spectroscopic extraction. These files are the inputs of Stage 3. Now let's import the necessary python packages.

In [None]:
# Import required packages
from astropy.io import fits
import matplotlib.pyplot as plt
import matplotlib.colors as colors
from mpl_toolkits.axes_grid1 import make_axes_locatable

# Setting the filename variable to point to one of the input files
filename = os.path.join(s2_meta.outputdir,
                        'jw01366011001_04103_00001-seg005_mirimage_calints.fits')

# Read in the file
s2_file = fits.open(filename)

# Let's visualize the first integration
integ1 = s2_file[1].data[0]

## Plotting the image
plt.figure(figsize=(3,5))
im = plt.imshow(integ1, cmap='bone', origin='lower',
                norm=colors.LogNorm(vmin=30, vmax=np.nanmax(integ1)))
plt.xlabel('x-axis (pixels)')
plt.ylabel('y-axis (pixels)')

# These lines will show some potentially reasonable ywindow values to use below
plt.axhline(141, color='red')
plt.axhline(390, color='red')

# These lines will show some potentially reasonable xwindow values to use below
plt.axvline(11, color='cyan')
plt.axvline(61, color='cyan')

# Add a colorbar
divider = make_axes_locatable(plt.gca())
cax = divider.append_axes("right", size="10%", pad=0.1)
plt.colorbar(im, cax=cax, label='Flux (DN/s)')

plt.show()

Now run the next cell to look at the lightcurve (flux vs time) of a block of pixels along the spectral trace summed together (to reduce noise).

In [None]:
pixel_lightcurve = np.nansum(s2_file[1].data[:,141:390,32:40], axis=(1,2))
pixel_lightcurve /= np.nanmedian(pixel_lightcurve)

# Plotting the data
plt.figure(figsize=(8,4))
plt.plot(pixel_lightcurve, '.', color='black')
plt.xlabel('Integration Number')
plt.ylabel('Normalized Flux')

plt.show()

Excitingly, there is already a clear astrophysical signal in the uncalibrated data!

For reference, the first seven segments show a secondary eclipse (when the planet passes behind the star from the observer's point of view) of the planet WASP-43b. What we are seeing in the plot above is the ingress into eclipse.

In the following cells, we'll work on getting clean and reliable spectroscopic light curves out of these input files using the Eureka! pipeline.

---
## 3. Eureka!'s Stage 3 (spectral extraction)

Stage 3 performs background subtraction and optimal spectral extraction. This will generate time series of 1D spectra.
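
As a rough illustration of why profile-weighted ("optimal") extraction helps, the sketch below compares a plain box sum with a weighted estimate of the form $f = \sum (P\,D/V) / \sum (P^2/V)$ for a single wavelength row. Everything here is an assumption made for the example: the Gaussian PSF, the noise levels, and the use of the true variance in the weights. It does not reproduce the iteration or outlier rejection that Eureka! performs.

```python
import numpy as np

rng = np.random.default_rng(2)
n_int, n_spatial = 500, 21   # integrations, pixels across the PSF

# Synthetic Gaussian PSF across the spatial direction
x = np.arange(n_spatial)
psf = np.exp(-0.5 * ((x - 10) / 1.5) ** 2)
psf /= psf.sum()

true_flux = 1000.0
read_noise = 10.0
variance = true_flux * psf + read_noise ** 2    # per-pixel variance model
data = true_flux * psf + rng.normal(0, np.sqrt(variance), (n_int, n_spatial))

# Profile from the median frame along the time axis, normalized to unit sum
profile = np.clip(np.median(data, axis=0), 0, None)
profile /= profile.sum()

# Plain box sum versus profile-weighted estimate
box_flux = data.sum(axis=1)
weights = profile / variance
optimal_flux = (weights * data).sum(axis=1) / (weights * profile).sum()

print(f'box scatter:     {box_flux.std():.2f}')
print(f'optimal scatter: {optimal_flux.std():.2f}')
```

The weighted estimate down-weights the noisy wings of the PSF, so its scatter across integrations is lower than that of the box sum while recovering a similar mean flux.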

### 3.1 Setting the Stage 3 "Eureka! Control File" (ECF)

**This determines what will happen during Stage 3**

To begin, please first copy below the contents of the ECF template for MIRI/LRS from the `S3_miri_lrs_template.ecf` file in the ECF demos folder on [GitHub](https://github.com/kevin218/Eureka/tree/exoslam2025/demos/JWST).

The most important parameters and their recommended settings are described below, but more context can be found on the [Eureka! documentation website](https://eurekadocs.readthedocs.io/en/latest/ecf.html#stage-3).

The recommended settings for this dataset are as follows (be careful to validate each setting, as not all are the same as what is in the template):
1.   Set `ncpu` to the number of CPU threads you want to use. If set to `1` no multiprocessing will be done, and this parameter can be increased to ~2x your CPU core count for faster runs.
2.   Set `ywindow` so that it captures the region containing the portion of the star's spectrum you want to extract. A reasonable setting might be `[10, 390]` which captures the entire nominal MIRI/LRS wavelength region (5–14 microns), but let's instead use `[141,390]` to capture the 5–12 micron wavelength range.
3.   Set `xwindow` so that it excludes obviously bad columns (e.g., columns 0–10). Because there is a known linear slope across the MIRI/LRS background, it is important that you either (a) ensure that there are an equal number of background pixels to the left and right of the source and set `bg_deg` equal to `0` or (b) use any reasonable `xwindow` values and set `bg_deg=1` (easier to do, but comes at a penalty of higher noise). A reasonable setting would be `[11, 61]` with `bg_deg` set to `0`.
4.   Choose the `gaussian` centroiding method to be able to measure the spatial position and width of the star's point-spread function (PSF).
5.   Set `record_ypos` to `True` to record the position and width of the star's PSF on the detector for all frames. This will give us useful values to decorrelate against during our light curve analysis in part 2.
6.   Set `dqmask` to `True` to mask bad pixels (e.g., cosmic ray hits) identified by the `jwst` pipeline and marked in the data quality (DQ) array of the input _calints files.
7.   Set `ff_outlier` to `False`. If set to `True`, this step would perform sigma clipping along the entire time axis for the entire frame, while this is only done to the background pixels if `ff_outlier` is set to `False`. Setting `ff_outlier` to `True` is only really safe for observations of shallow eclipses where the sigma-clipping algorithm is unlikely to confuse an astrophysical signal for an outlier.
8.   Set `bg_thresh` (background area outlier threshold) values to `[5,5]` to do two iterations of 5-sigma clipping along the time axis to remove artifacts like cosmic rays.
9.   Set `bg_hw` (background exclusion area half-width) to something large enough to not include significant starlight (something between `8` and `15` should do).
10.  Set `bg_deg` according to point #3 above.
11.  Set `p3thresh` to `5` to sigma-clip 5-sigma outliers along the spatial direction. This will help to remove any remaining starlight.
12.  Set `spec_hw` (spectral aperture half-width) to something large enough to capture much of the starlight (something between `4` and `8` should do).
13.  Set `fittype` to `meddata` to use the median frame (median computed along the time axis) as the profile for the optimal extraction method.
14.  Set `window_len` to `1` so that no spectral smoothing is applied to the optimal extraction profile (at least for now - you can experiment with smoothing later if you want).
15.  Set `median_thresh` to `5` to clip 5-sigma outliers along the spectral-axis when computing the median frame.
16.  You can safely ignore `prof_deg` (only used if `fittype` is `poly`) and `p5thresh` (only used when `fittype` is `smooth`, `poly`, or `gauss`).
17.  Set `p7thresh` to `10` to sigma-clip 10-sigma outlier pixels compared to the optimal profile.
18.  Set `isplots_S3` to `4` to get lots of useful diagnostic figures (increase this to 5 if you need more plots to investigate problems).
19.  Set `nplots` to `5` to make repetitive figures only for the first 5 integrations (you can increase this as needed if you want more figures for troubleshooting).
20.  Set `hide_plots` to `False` so that the figures pop up in this notebook as they're made (set to `True` if you are not running the code in a notebook; otherwise you will have a lot of windows popping up).
21.  Set `verbose` to `True` so you get lots of useful information printed out.
22.  Set `topdir` to `{path_to_data_folder_on_your_machine}` which will use the folder you specified above.
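
The interplay between `xwindow`, `bg_hw`, and `bg_deg` described in point 3 can be sketched with one noise-free synthetic detector row. The `subtract_background` helper below is hypothetical (it only borrows the ECF parameter names for its arguments), and the slope, PSF, and window sizes are made-up values.

```python
import numpy as np

def subtract_background(row, src_center, bg_hw, bg_deg):
    """Fit a polynomial of degree bg_deg to the pixels more than bg_hw
    away from src_center, then subtract that fit from the whole row."""
    x = np.arange(row.size)
    is_bg = np.abs(x - src_center) > bg_hw
    coeffs = np.polyfit(x[is_bg], row[is_bg], bg_deg)
    return row - np.polyval(coeffs, x)

x = np.arange(51)
background = 100.0 + 0.5 * x                          # linear slope across the row
star = 500.0 * np.exp(-0.5 * ((x - 25) / 1.5) ** 2)   # star centered on pixel 25
row = background + star
true_sum = star[20:31].sum()                          # aperture of half-width 5

# (a) Symmetric window with a constant background model
sym0 = subtract_background(row, 25, 10, 0)[20:31].sum()
# (b) Asymmetric window (first 10 columns dropped) with a constant model
asym0 = subtract_background(row[10:], 15, 10, 0)[10:21].sum()
# (c) The same asymmetric window with a linear model
asym1 = subtract_background(row[10:], 15, 10, 1)[10:21].sum()

for label, value in [('symmetric, deg 0', sym0),
                     ('asymmetric, deg 0', asym0),
                     ('asymmetric, deg 1', asym1)]:
    print(f'{label:18s} bias = {value - true_sum:+.2f}')
```

With equal numbers of background pixels on each side, the constant model is unbiased at the source position even though the background is sloped. The asymmetric window needs the linear model to remove the bias. This sketch has no noise, so it does not show the noise penalty of the higher-degree fit mentioned above.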

In [None]:
s3_ecf_contents = f"""

# Fill this f-string text block with the contents of the S3 MIRI/LRS ECF template
# from https://github.com/kevin218/Eureka/tree/exoslam2025/demos/JWST
# and then adjust the values as described above.

"""

# This will save the ECF as a file that the next cell can read-in
with open(f'./S3_{eventlabel}.ecf', 'w') as f:
    f.write(s3_ecf_contents)

### 3.2 Running Eureka!'s Stage 3

The following cell will run Eureka!'s Stage 3 using the settings you defined above. Note that your ECF will be copied to your output folder, making it easy to remember how you produced those outputs hours, days, or years after you reduced the data.

This stage of Eureka! will take ~3 minutes to complete for these particular data.

In [None]:
s3_spec, s3_meta = eureka.S3_data_reduction.s3_reduce.reduce(eventlabel)

---
## 4. Eureka!'s Stage 4 (spectral binning & outlier clipping)

Stage 4 uses Stage 3's output to generate spectroscopic light curves by binning the time series of 1D spectra along the wavelength axis. But first we will have a look at the white light curve (integrated over all wavelengths, here 5–10.5 micrometers). Again you will have to fill in the control file for Stage 4.

### 4.1 Setting the Stage 4 "Eureka! Control File" (ECF) for a white lightcurve

**This determines what will happen during Stage 4**

To begin, please first copy below the contents of the ECF template for MIRI/LRS from the `S4_template.ecf` file in the ECF demos folder on [GitHub](https://github.com/kevin218/Eureka/tree/exoslam2025/demos/JWST).

The most important parameters and their recommended settings are described below, but more context can be found on the [Eureka! documentation website](https://eurekadocs.readthedocs.io/en/latest/ecf.html#stage-4).

The recommended settings for this dataset are as follows (be careful to validate each setting, as not all are the same as what is in the template):
1.   Let's first produce just a single white lightcurve by setting `nspecchan` to `1`
2.   Since we're only computing the white lightcurve, set `compute_white` to `False` as this feature is meant for cases where `nspecchan` is greater than `1`.
3.   Let's set the `wave_min` and `wave_max` to the approximate minimum and maximum usable wavelengths from MIRI/LRS. Normally we can use the full 5–12 micron wavelength range, but some unusual systematics affected the >10.5 micron region, so set `wave_min` to `5.0` and `wave_max` to `10.5`.
4.   Set `recordDrift` and `correctDrift` to `False` since there are too few spectral lines in the MIRI wavelength range to reliably measure drift/jitter in the dispersion direction. The other parameters in that chunk of inputs can then be safely ignored as they're only used if `recordDrift` or `correctDrift` are `True`.
5.   Remove outliers along the time axis (e.g. cosmic rays) by setting `clip_binned` to `True` (basically always helpful). `clip_unbinned` is typically not helpful and should be set to `False`.


To sigma-clip outliers, we need to have a reference time-series against which we are comparing our observations. If we just took the whole time-series and sigma clipped compared to the median level of the observations, we may well sigma-clip the entire transit signal in cases where there is a strong transit. Instead, we use a [box-car](https://en.wikipedia.org/wiki/Boxcar_function) filter which acts as a [high-pass filter](https://en.wikipedia.org/wiki/High-pass_filter); this removes any low-frequency signals (e.g. transit ingress/egress, phase variations, linear trends in time) and leaves behind high-frequency noise (cosmic rays, HGA moves, etc.). The most important parameters that control this box-car clipping are `sigma` and `box_width`, but `boundary` and `fill_value` are also relevant parameters.

6.   Set `sigma` to a value low enough to clip any obviously errant points while ensuring you are not clipping an excessive number of points and not clipping the transit or eclipse's ingress/egress. For these particular data, something around `4` should do, but this is not a strict rule and will change for each different dataset. The focus here is to remove obviously errant points and not to clip a bunch of points.
7.   Set `box_width` to a value small enough to not sigma-clip the transit or eclipse's ingress/egress, while also not setting it so small that the smoothed copy of the signal is excessively noisy. For these particular data, something around `20` should do, but again this is not a strict rule and will change for each different dataset.
8.   Set `boundary` to `fill` as this typically results in reasonable behaviour at the start and end of the observations.
9.   Set `fill_value` to `mask` in order to mask the clipped outliers without replacement. This ensures you remove bad values without requiring you to replace the masked values with a guess as to the value the point should have (a potentially dangerous endeavour).
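
The box-car clipping idea can be sketched in a few lines of NumPy. This is an illustration under stated assumptions rather than Eureka!'s implementation: the `boxcar_clip` helper is hypothetical, the edge padding is a simplification that does not correspond to the `boundary` option, and the light curve is synthetic.

```python
import numpy as np

def boxcar_clip(flux, box_width, sigma):
    """Return a boolean mask of points that deviate from a box-car
    smoothed copy of flux by more than sigma robust standard deviations."""
    kernel = np.ones(box_width) / box_width
    # Pad with edge values so the smoothed curve is defined at both ends
    pad = box_width // 2
    padded = np.pad(flux, pad, mode='edge')
    smoothed = np.convolve(padded, kernel, mode='same')[pad:pad + flux.size]
    residuals = flux - smoothed
    scatter = 1.4826 * np.median(np.abs(residuals - np.median(residuals)))
    return np.abs(residuals) > sigma * scatter

rng = np.random.default_rng(3)
n = 600
flux = 1.0 + rng.normal(0, 2e-4, n)
# Gradual eclipse-like ingress spread over 60 integrations
flux[300:] -= 3e-3 * np.clip((np.arange(n)[300:] - 300) / 60, 0, 1)
# Two cosmic-ray-like outliers
flux[[50, 450]] += 5e-3

mask = boxcar_clip(flux, box_width=20, sigma=4)
clipped = np.where(mask, np.nan, flux)   # mask the outliers without replacing them
print('clipped integrations:', np.flatnonzero(mask))
```

Because the comparison is made against the smoothed curve rather than the global median, the slow ingress is left alone while the two injected outliers are masked.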


Finally, there are some plotting/logging controls you should adjust:

10.  Set `compute_ld` to `False`. Since we're working on an eclipse observation, there is no need to compute theoretical limb-darkening coefficients for the star. The other inputs related to computing limb-darkening coefficients can then safely be ignored.
11.  Set `isplots_S4` to `3` to get some useful diagnostic figures (increase this to 5 if you need more plots to investigate problems).
12.  Set `hide_plots` to `False` so that the figures pop up in this notebook as they're made (set to `True` if you're running the code in the terminal instead of a notebook, otherwise you'll have a lot of windows popping up).
13.  Set `verbose` to `True` so you get lots of useful information printed out.
14.  Set `topdir` to `{path_to_data_folder_on_your_machine}` which will use the folder you specified above.
15.  Set `inputdir` to `Stage3`. If you end up running multiple versions of Eureka!'s Stage 3, you can select the exact one you want as an input to Stage 4 by specifying the folder name in more detail (e.g. `Stage3/S3_2025-06-09_miri_run1`).
16.  Set `outputdir` to `Stage4_white` so that we can distinguish this white lightcurve from the spectroscopic lightcurves we'll compute later.

In [None]:
s4_ecf_contents = f"""

# Fill this f-string text block with the contents of the S4 ECF template
# from https://github.com/kevin218/Eureka/tree/exoslam2025/demos/JWST
# and then adjust the values as described above.

"""

# This will save the ECF as a file that the next cell can read-in
with open(f'S4_{eventlabel}.ecf', 'w') as f:
    f.write(s4_ecf_contents)

### 4.2 Running Eureka!'s Stage 4 for a white lightcurve

The following cell will run Eureka!'s Stage 4 using the settings you defined above. Note that your ECF will be copied to your output folder, making it easy to remember how you produced those outputs hours, days, or years after you reduced the data.

This stage of Eureka! will take less than 1 minute to complete for these particular data.

In [None]:
s4_spec, s4_lc, s4_meta = eureka.S4_generate_lightcurves.s4_genLC.genlc(eventlabel)

As you can see the flux variations of the planet with its phase are very clear. The two secondary eclipses and the transit particularly stand out at maximum and minimum planetary flux. 

___
### 4.3 Setting the Stage 4 "Eureka! Control File" (ECF) for spectroscopic lightcurves

Now that you have successfully computed a white lightcurve, let's compute spectroscopic lightcurves. To do this, copy-paste your Stage 4 ECF contents from above and then:

1.   Change `nspecchan` to `11` to get 11 spectral channels across the specified 5–10.5 micron wavelength range, each of which will have a width of 0.5 microns.
2.   Change `compute_white` to `True` (not critical now, but may be useful to you in Part 2 or during your group project).
3.   Change `outputdir` to `Stage4` (to keep it separate from the white lightcurve outputs in the Stage4_white folder)

That's all we need to change to get spectroscopic lightcurves!
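
As a quick check of the channel arithmetic, the sketch below computes the bin edges implied by these settings, assuming the channels are evenly spaced in wavelength (which is what the 0.5 micron width quoted above implies).

```python
import numpy as np

wave_min, wave_max, nspecchan = 5.0, 10.5, 11

# Evenly spaced bin edges across the chosen wavelength range
edges = np.linspace(wave_min, wave_max, nspecchan + 1)
widths = np.diff(edges)
centers = 0.5 * (edges[:-1] + edges[1:])

for lo, hi in zip(edges[:-1], edges[1:]):
    print(f'{lo:.1f}-{hi:.1f} micron')
```

The 5.5 micron range divided into 11 channels gives 12 edges and a width of 0.5 microns per channel.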

In [None]:
s4_ecf_contents = f"""

# You can copy-paste your Stage 4 ECF contents from above here,
# and then just adjust nspecchan, compute_white, and outputdir as described above

"""

# This will save the ECF as a file that the next cell can read-in.
# This will overwrite your previous Stage 4 ECF, but you'll still have all the
# information here in this notebook as well as a copied version of your ECF in
# your output folder.
with open(f'S4_{eventlabel}.ecf', 'w') as f:
    f.write(s4_ecf_contents)

### 4.4 Running Eureka!'s Stage 4 for spectroscopic lightcurves

The following cell will run Eureka!'s Stage 4 using the settings you defined above. Note that your ECF will be copied to your output folder, making it easy to remember how you produced those outputs hours, days, or years after you reduced the data.

This stage of Eureka! will take less than 1 minute to complete for these particular data.

In [None]:
s4_spec, s4_lc, s4_meta = eureka.S4_generate_lightcurves.s4_genLC.genlc(eventlabel)

Now you can see how the phase curve shape, and particularly the instrumental systematic effect (an increasing or decreasing exponential trend at the beginning of the observations), varies between channels (wavelengths). In Part 2 of this hands-on session, we will focus on finding the best-fit model for the instrumental and astrophysical signals. In particular, our goal will be to derive the spectral emission flux from the planet at various phases.

---
## Bonus: Optimizing your reduction

Seeking the "best" reduction is where we venture into the art of data reduction, with various parameters sometimes having appreciable impacts on the precision we can get from our observations and (frighteningly) sometimes significantly impacting our final planetary spectra. At all times, think critically about the settings you are using and the potential for unintended consequences.

If you want to try to get even better results, focus on trying different values in Stage 3 for:

**Most impactful:**
*   `spec_hw` (with a larger aperture you capture more of the star's light but also capture more noisy background light - there is a trade-off here which will likely change for every different target you observe)
*   `bg_hw` (with a smaller value you have more background pixels but also more stellar contamination - there is a trade-off here which will likely change for every different target you observe)
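
The `spec_hw` trade-off can be illustrated with a simple signal-to-noise estimate for a box aperture. All of the numbers below (PSF width, stellar counts, background level, read noise) are assumed values chosen for illustration and are not measured from the WASP-43 data.

```python
import numpy as np

# Assumed, illustrative values
psf_sigma = 1.5        # PSF width in pixels
star_counts = 5000.0   # total stellar electrons per integration at one wavelength
bg_per_pixel = 400.0   # background electrons per pixel
read_noise = 15.0      # read noise in electrons per pixel

x = np.arange(-20, 21)
psf = np.exp(-0.5 * (x / psf_sigma) ** 2)
psf /= psf.sum()

half_widths = np.arange(1, 11)
snr = []
for spec_hw in half_widths:
    in_aperture = np.abs(x) <= spec_hw
    signal = star_counts * psf[in_aperture].sum()
    n_pix = in_aperture.sum()
    # Photon noise from the star plus background and read noise per pixel
    noise = np.sqrt(signal + n_pix * (bg_per_pixel + read_noise ** 2))
    snr.append(signal / noise)
    print(f'spec_hw={spec_hw:2d}  S/N={snr[-1]:.1f}')

best = half_widths[int(np.argmax(snr))]
print('best half-width for these assumed values:', best)
```

The signal-to-noise ratio rises while the aperture is still gathering starlight and then falls as extra pixels add mostly background noise. Where the peak sits depends on the assumed values, which is why the best `spec_hw` differs from target to target.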

**Intermediate impact:**
*   `p3thresh` (the best value will likely change depending on the value you use for `bg_hw`)
*   `p7thresh` (try larger/smaller values)
*   `window_len` (try values roughly in the range of `2`–`11` - larger values may help to smooth over noise in your profile, but if the value is too large it can result in bad extraction profiles and potentially biased results)
*   `median_thresh` (try larger/smaller values - note that this is only used if `fittype` is set to `meddata` and `window_len` is larger than `1`).

**Uncertain impact:**
*   `bg_thresh` (you likely won't see large changes, but might slightly improve the overall scatter)
*   `fittype` (try `smooth`)

<br/>

Typically the "best" value for each parameter is independent of the values of other parameters, but sometimes there are interactions (e.g. a smaller `bg_hw` may require smaller `p3thresh`). The Stage 4 MAD (Median Absolute Deviation) value is an approximate tool you can use to determine how much your reduction has improved (typically the lower the value, the lower the noise in your lightcurve, and the "better" the reduction). A better indication of an improved reduction is the RMS of the residuals from your Stage 5 fit (discussed in the Hands-On Session #2) since the Stage 4 MAD doesn't account for noise that can be decorrelated when fitting, but depending on your dataset it can be prohibitively time-intensive to continuously run all reduction options through Stage 5.

Again though, remain critical and vigilant at all times when optimizing your reduction. For example, you'll be able to decrease your MAD value by setting your Stage 4 `sigma` value to `0.01`, but you're going to end up discarding nearly all of your data which you definitely don't want to do. A less obvious danger might be setting Stage 3's `bg_hw` parameter to too small a value, resulting in self-subtraction and potentially biased transit or eclipse depths.
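
To make the warning about over-clipping concrete, here is a small sketch using a generic point-to-point MAD estimate on a synthetic, purely Gaussian light curve. The `mad_ppm` helper is an assumption made for illustration and may not match the exact definition used by Eureka!'s Stage 4.

```python
import numpy as np

def mad_ppm(flux):
    """Median absolute value of the point-to-point differences of the
    median-normalized flux, in parts per million."""
    norm = flux / np.nanmedian(flux)
    return 1e6 * np.nanmedian(np.abs(np.diff(norm)))

rng = np.random.default_rng(4)
noise_level = 5e-4
flux = 1.0 + rng.normal(0, noise_level, 2000)

# Overly aggressive clipping: keep only points within 0.5 sigma of the median
keep = np.abs(flux - np.median(flux)) < 0.5 * noise_level

print(f'all points:   MAD = {mad_ppm(flux):.0f} ppm, N = {flux.size}')
print(f'over-clipped: MAD = {mad_ppm(flux[keep]):.0f} ppm, N = {keep.sum()}')
```

The over-clipped light curve has a much lower MAD, but only because most of the perfectly good data were thrown away. A lower MAD is therefore only meaningful when the number of retained points stays roughly the same.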