# SESAMME Demo

## 1) Goals and introduction <a name="Introduction"></a>

This notebook walks users through the functionality of **SESAMME (Simultaneous Estimates of Star-cluster Age, Metallicity, Mass, and Extinction)** v1.0, a Python script for Bayesian inference of the properties of extragalactic star clusters.

SESAMME is built on a combination of the affine-invariant Markov chain Monte Carlo sampler `emcee` (Foreman-Mackey et al. 2013) and user-specified suites of simple stellar population (SSP) models, for example BPASS v2.2 "Tuatara" (Stanway & Eldridge 2018) and/or v2.3 "Broc" (Byrne+ 2022). As the name implies, SESAMME v1.0 fits four physical parameters: the age, the metallicity, the mass of the star cluster (when the spectrum is supplied in luminosity units), and the degree of reddening along the line of sight.

By the end of this tutorial, you will have a working example of how to:
1. Load in a cube of SSP models to compare against your data;
2. Set up and execute a SESAMME run; and
3. Perform standard post-processing analyses of the resulting fits.


For an example of how to prepare a SESAMME model cube, please see the accompanying notebook `SESAMME Cubemaker Demo`. For illustrations of other steps in the modeling process (for example, how we rebinned our high-resolution spectra to the 1 Å resolution of BPASS, or one way to create mock noise-added spectra of extragalactic star clusters), please see the accompanying notebook `Resampling Demo`.

1. [Introduction](#Introduction)
2. [Imports](#Imports)
3. [Setting up SESAMME's meta-parameters](#Setup)
4. [Starting a SESAMME run](#Run)
5. [Post-run analyses](#Post)
    1. [Cleaning the MCMC chain](#Burn)
    2. [Corner plots and statistics](#Corner)
    3. [Visualizing a likely model](#Visualization)

## 2) Imports <a name="Imports"></a>

In [None]:
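# Third-party packages used by the cells below
import numpy as np
import matplotlib.pyplot as plt
from astropy import units as u
from astropy.table import Table
import corner
import emcee
import sesamme.models as models
import sesamme.mcmc as stats
import sesamme.vis as vis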

This notebook (and SESAMME generally) uses the following packages:

* **numpy** and **scipy** for handling array functions
* **astropy** for handling FITS files
* **emcee** (https://emcee.readthedocs.io/en/stable/) for the MCMC machinery
* **extinction** (https://extinction.readthedocs.io/en/latest/) and **dust_extinction** (https://dust-extinction.readthedocs.io/en/stable/) for accessing extinction curves
* **matplotlib** and **corner** for data visualization 

If you do not have these packages installed, you can install them using pip or conda.

## 3) Setting up SESAMME's meta-parameters <a name="Setup"></a>

Before using SESAMME to model your spectrum, there are several meta-parameters that should be tweaked to fit your use case. These include specifying the attenuation curve used in the modeling; the size of the walker ensemble; the "ball" of initial positions for the walker ensemble; and ranges on the priors for each of the four variables.


**Choosing an attenuation curve**

During the modeling process, SESAMME will redden a model SSP by a specified amount using one of several attenuation curves. See the documentation for reference information on each curve. The options include:
* Five options for a Milky Way-like curve ('CCM', 'ODonnell', 'Fitzpatrick99', 'FitzMassa07', 'Gordon23')
* Curves based on the Large Magellanic Cloud ('LMC') and Small Magellanic Cloud ('SMC')
* A starbursting galaxy curve ('Calzetti')

The default option in the script version of SESAMME is the Milky Way attenuation curve of Cardelli, Clayton, & Mathis (1989). 

In [None]:
# Set the extinction curve
models.set_ext_law('CCM')
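
To make the reddening step concrete, here is a minimal, standalone sketch of how an attenuation curve dims a spectrum. It does not use SESAMME's internals: the function names `calzetti_k` and `redden` are hypothetical, and the curve is the published Calzetti et al. (2000) starburst parameterization (valid for roughly 0.12 to 2.2 micron), chosen only because it has a compact closed form.

```python
import numpy as np

def calzetti_k(wl_angstrom, r_v=4.05):
    """Calzetti et al. (2000) starburst curve k(lambda); valid for ~0.12-2.2 micron."""
    lam = np.asarray(wl_angstrom, dtype=float) / 1e4  # Angstrom -> micron
    # Short-wavelength branch (0.12-0.63 micron)
    uv_opt = 2.659 * (-2.156 + 1.509 / lam - 0.198 / lam**2 + 0.011 / lam**3) + r_v
    # Long-wavelength branch (0.63-2.2 micron)
    red = 2.659 * (-1.857 + 1.040 / lam) + r_v
    return np.where(lam < 0.63, uv_opt, red)

def redden(flux, wl_angstrom, ebv):
    """Attenuate an intrinsic spectrum: F_obs = F_int * 10^(-0.4 * k(lambda) * E(B-V))."""
    flux = np.asarray(flux, dtype=float)
    return flux * 10.0 ** (-0.4 * calzetti_k(wl_angstrom) * ebv)

# Flat toy spectrum in the far-UV, reddened by E(B-V) = 0.3
wl_demo = np.linspace(1250.0, 1800.0, 5)
flux_demo = np.ones_like(wl_demo)
reddened_demo = redden(flux_demo, wl_demo, ebv=0.3)
```

Because k(λ) rises toward shorter wavelengths, the bluest pixels are suppressed the most, which is what lets a far-UV continuum slope constrain E(B-V).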

**Creating and initializing the emcee walker ensemble**

emcee, the statistical backbone of SESAMME, uses an ensemble of "walkers" to explore the parameter space and determine what values provide likely fits to your data. You must specify the number of walkers prior to running SESAMME, as well as the number of steps in the chain. In general, more walkers iterating for fewer steps gives better constraints on the properties of a star cluster than few walkers iterating for many steps. The default number of walkers is 128. 

You must also initialize the walkers in parameter space. Here, we do so using normal distributions centered on arbitrary values of `log(age)`, `log(Z)`, `E(B-V)`, and amplitude `log(A)`, each with a relatively narrow dispersion. The values used below are reasonable approximations for a young, massive, and sub-solar metallicity star cluster.

In [None]:
# Set the dimensionality of the emcee walker ensemble to N x 4
stats.set_walker_size(128)

# Set the desired chain length
stats.set_chain_size(5000)
    
# Set the initial positions of the walker ensemble
stats.set_initial_positions([7., -2.1, 0.3, -2.])
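
For readers who want to see what a "ball" of initial positions looks like, the following is a rough sketch using only numpy. The per-parameter dispersions in `spread` are hypothetical values chosen for illustration; the notebook does not state which dispersions SESAMME uses internally.

```python
import numpy as np

rng = np.random.default_rng(42)

nwalkers, ndim = 128, 4
# Centers in the order log(age), log(Z), E(B-V), log(A)
center = np.array([7.0, -2.1, 0.3, -2.0])
# Hypothetical, relatively narrow dispersions for each parameter
spread = np.array([0.1, 0.05, 0.02, 0.1])

# One row per walker, one column per parameter
p0 = center + spread * rng.standard_normal((nwalkers, ndim))
```

An array of this shape, `(nwalkers, ndim)`, is what `emcee` expects as the starting state of an ensemble.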

**Establish the parameter space and restrict the priors**

In order to generate models during the MCMC process, SESAMME needs to know what its options are for values of metallicity and age. These need to be explicitly set by the user (as a dictionary) according to the SSP model set being used. We show an example of the required formatting below, using the BPASS metallicity and age ranges.

Given these intervals, the parameter space explored by the MCMC walker ensemble in a given run can be further restricted by setting priors. By default, the priors for all four variables have relatively broad ranges, but they can be narrowed according to expectations about the age of the star cluster to be modeled, its metallicity, etc.

Priors on the age, metallicity, and rescaling parameter A should be given in log-units, while E(B-V) is linear. E.g., if you're confident that your cluster is between 1 and 10 Myr old, you should respectively set 'age_low' and 'age_hi' to 6.0 and 7.0 in the dictionary defined below. In our example data for M83-8, Hernandez+ 19 estimate a solar-ish metallicity $Z$ = 0.013, so we can reasonably constrain the metallicity to be between $0.008 < Z < 0.03$.

In [None]:
# Set lower and upper boundaries for priors on the four parameters, IN ORDER:
# log(age), log(Z), E(B-V), log(Ampl.)

prior_lowbounds = [6.0, np.log10(0.008), 0.01, -20.]
prior_highbounds = [7.5, np.log10(0.03), 1.0, 1.0]

stats.set_prior_bounds(stats.prior_dict, prior_lowbounds, prior_highbounds)
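
As an illustration of what bounded priors mean for the sampler, here is a minimal sketch of a flat (top-hat) log-prior. The function name `log_flat_prior` is hypothetical, and the assumption that the priors are uniform between the bounds is ours; the notebook only specifies the bounds themselves.

```python
import numpy as np

def log_flat_prior(theta, low, high):
    """Return 0 inside the box defined by (low, high), and -inf outside it."""
    theta = np.asarray(theta, dtype=float)
    inside = np.all((theta > low) & (theta < high))
    return 0.0 if inside else -np.inf

# Same ordering as above: log(age), log(Z), E(B-V), log(Ampl.)
low = np.array([6.0, np.log10(0.008), 0.01, -20.0])
high = np.array([7.5, np.log10(0.03), 1.0, 1.0])

inside_value = log_flat_prior([7.0, -1.9, 0.3, -2.0], low, high)
outside_value = log_flat_prior([8.0, -1.9, 0.3, -2.0], low, high)  # age too old
```

A walker proposal that lands outside any bound receives a log-probability of minus infinity and is always rejected, so tight bounds directly limit where the chain can go.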

## 4) Starting a SESAMME run <a name="Run"></a>

Load in 1) the model cube containing the SSP suite and 2) the table of ionizing production rates you want to use in your modeling. See the supplementary script **cube_maker.py** for information about how these files are created. 

In [None]:
modelcube = models.load_ssp_cube("/path/to/cube_file.fits")

iontable = models.load_ionization_table("/path/to/qtable_file.txt")

Load in the spectrum of your star cluster. SESAMME assumes the spectrum has already been pre-processed (i.e., smoothed and rebinned, if necessary) and shifted to the rest frame. 

The example spectrum file contains HST/COS observations (with the G130M and G160M gratings) of a star cluster in M83. The spectrum has been corrected for the HI absorption profile around Lyman alpha; de-reddened according to the Galactic extinction maps of Schlafly & Finkbeiner 2011; shifted to the rest frame; and smoothed and regridded to the resolution of a BPASS model suite (1 Å per pixel).

In [None]:
# Natively comes in flux density units (erg s-1 cm-2 A-1)...
specfile = Table.read("M83-8_fhrsg.txt", format='ascii')
wl = specfile['WL']
flux = specfile['FLUX']
flux_err = specfile['ERROR']

# But rescaling to luminosity density units (L_Sun A-1) allows SESAMME to infer a stellar mass for the cluster
lum = flux * 4*np.pi * (4.8*u.Mpc.to(u.cm))**2 / 3.83e33
lum_err = flux_err * 4*np.pi * (4.8*u.Mpc.to(u.cm))**2 / 3.83e33

windowlist = np.array([[np.min(wl), 1133], [1172, 1176.5],  [1188, 1202], [1203.8, 1222], 
                       [1257, 1262], [1299, 1305], [1331, 1336.2], [1276, 1286], [1454, 1456], [1465, 1469], 
                       [1390.5, 1393.5], [1399.5, 1403.3],
                       [1523.5, 1529], [1543, 1550.], [1608.5, 1621], [1656, 1659], [1666, 1674], [1795, np.max(wl)] ])

mask = models.get_mask(windowlist, wl)
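
The sketch below shows one way a list of wavelength windows can be turned into a per-pixel boolean array. It is not the implementation of `models.get_mask`: the helper name `windows_to_mask` is hypothetical, and whether SESAMME treats the flagged pixels as included or excluded is not stated in this notebook, so the sketch only reports which pixels fall inside any window.

```python
import numpy as np

def windows_to_mask(windows, wavelengths):
    """Return a boolean array that is True where a pixel lies inside any window."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    inside = np.zeros(wavelengths.shape, dtype=bool)
    for lo, hi in windows:
        inside |= (wavelengths >= lo) & (wavelengths <= hi)
    return inside

# Toy 1 A grid and two of the windows from the list above
wl_demo = np.arange(1150.0, 1231.0, 1.0)
windows_demo = np.array([[1172.0, 1176.5], [1203.8, 1222.0]])
in_window = windows_to_mask(windows_demo, wl_demo)
```

Overlapping or unsorted windows are harmless here, because each window is simply OR-ed into the running result.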

In [None]:
filename = "M83_test.h5"
runname = 'Default'

In [None]:
stats.run_sesamme(filename, runname, wl, lum, lum_err, modelcube, iontable, mask, True)
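
For intuition about what the sampler evaluates at each step, here is a sketch of a standard Gaussian log-likelihood for a rescaled model spectrum. This is an assumption on our part, not a description of SESAMME's internal likelihood: the function name `log_likelihood`, the `keep` array of pixels to compare, and the toy model are all hypothetical.

```python
import numpy as np

def log_likelihood(data, err, model, log_amp, keep):
    """Gaussian log-likelihood of `data` given `model` scaled by 10**log_amp."""
    scaled = 10.0 ** log_amp * model
    resid = (data[keep] - scaled[keep]) / err[keep]
    return -0.5 * np.sum(resid**2 + np.log(2.0 * np.pi * err[keep] ** 2))

# Toy model on a 1 A grid; the "data" are the model scaled by 10**-2
wl_demo = np.linspace(1150.0, 1750.0, 601)
model_demo = 1.0 + 0.1 * np.sin(wl_demo / 50.0)
err_demo = np.full_like(wl_demo, 1e-3)
data_demo = 10.0 ** -2.0 * model_demo

# Ignore pixels around Lyman alpha when comparing model and data
keep_demo = (wl_demo < 1203.8) | (wl_demo > 1222.0)

ll_true = log_likelihood(data_demo, err_demo, model_demo, -2.0, keep_demo)
ll_off = log_likelihood(data_demo, err_demo, model_demo, -1.9, keep_demo)
```

The likelihood peaks at the amplitude that was used to generate the toy data, so `ll_true` exceeds `ll_off`.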

## 5) Post-run analyses <a name="Post"></a>

To retrieve the results of an MCMC run, load in the relevant .h5 file. Don't forget to use the `name` parameter to specify which run to use, if multiple exist.

In [None]:
reader = emcee.backends.HDFBackend(filename, name = 'Default')
samples = reader.get_chain()

Create a time-series plot that shows the parameter values for each walker at each step in the chain. This will also allow you to determine visually how many "burn-in steps" should be excluded from the final analysis (see below).

In [None]:
fig, axes = plt.subplots(4, figsize=(9, 7), sharex=True)

labels = ["log(age/yr)", "log(Z)", "E(B-V)", "log(A)"]

for i in range(stats.ndim):
    ax = axes[i]
    ax.plot(samples[:, :, i], "k", alpha=0.3)
    ax.set_xlim(0, len(samples))
    
    ax.set_ylabel(labels[i])
    ax.yaxis.set_label_coords(-0.1, 0.5)

axes[-1].set_xlabel("step number");

Check the autocorrelation times for each parameter using the function `get_autocorr_time()` from `emcee`, which will determine the amount by which you should 'thin' the chain (see below).

Note that if the MCMC chain is shorter than 50 times the autocorrelation time for any of the four parameters, `get_autocorr_time()` will raise an error (`emcee.autocorr.AutocorrError`). This indicates that the chain should be run for longer if formal convergence is desired, though in practice we find that chains of length 50,000 steps are often sufficient regardless of formal convergence.

In [None]:
tau = reader.get_autocorr_time()
print(tau)
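
The 50-times rule can be checked by hand before deciding whether to extend a run. The sketch below uses only numpy; the helper names and the autocorrelation times in `tau_demo` are hypothetical illustration values, not results from the M83-8 run.

```python
import numpy as np

def chain_is_long_enough(nsteps, tau, tol=50):
    """True if the chain spans at least `tol` autocorrelation times for every parameter."""
    tau = np.asarray(tau, dtype=float)
    return bool(nsteps >= tol * np.max(tau))

def steps_needed(tau, tol=50):
    """Minimum chain length that satisfies the `tol` x tau criterion."""
    tau = np.asarray(tau, dtype=float)
    return int(np.ceil(tol * np.max(tau)))

# Hypothetical autocorrelation times for log(age), log(Z), E(B-V), log(A)
tau_demo = np.array([62.0, 75.0, 41.0, 58.0])

ok_5000 = chain_is_long_enough(5000, tau_demo)
ok_3000 = chain_is_long_enough(3000, tau_demo)
needed = steps_needed(tau_demo)
```

With these toy values the slowest parameter sets the requirement: 50 × 75 = 3750 steps.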

Collapse the array of samples into a 2-D array of shape (`nsamples`, 4), where `nsamples` is the number of retained steps times the number of walkers. This is also where you:
- Discard the burn-in steps, or early iterations in the chain where the walker ensemble begins to explore the parameter space
- Thin the chain. Because the MCMC process retains some memory of previous iterations, you should only consider every *i*-th sample if you want a truly independent sampling of the posterior PDF. 

In [None]:
# Edit the discard and thin parameters according to the results of your run
flat_samples = reader.get_chain(discard=40, thin=10, flat=True) 
print(flat_samples.shape)
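
One common rule of thumb, used in the `emcee` tutorials, is to discard about twice the longest autocorrelation time and thin by about half the shortest. The sketch below applies that rule to hypothetical autocorrelation times; the values in `tau_demo` are made up for illustration, and the sample count assumes that thinning keeps every `thin`-th step after the discarded ones.

```python
import numpy as np

# Hypothetical autocorrelation times for log(age), log(Z), E(B-V), log(A)
tau_demo = np.array([62.0, 75.0, 41.0, 58.0])

discard = int(2 * np.max(tau_demo))          # burn-in steps to drop
thin = max(1, int(0.5 * np.min(tau_demo)))   # keep every thin-th step

# Approximate size of the flattened chain for a 5000-step, 128-walker run
nsteps, nwalkers = 5000, 128
n_kept_steps = len(range(discard + thin - 1, nsteps, thin))
n_flat = n_kept_steps * nwalkers
```

These values would then be passed as `discard=` and `thin=` to `reader.get_chain(..., flat=True)`.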

Inspect the 1-D distributions of the four parameters and their covariances. Values in `range` can be tweaked to better show the breadth of values in the posterior PDF.

You can also print the median and $\pm$1$\sigma$ values for each of the four parameters.

In [None]:
fig = corner.corner(
    flat_samples, labels=labels, truths=[np.log10(10e6), np.log10(0.0127), None, None], quantiles = [.16, .50, .84],
    range=[(6.0,7.5), (-2.2, -1.5), (0.0,0.2), (-3, 0), ], smooth = 0.5, smooth1d = 0.5,
    title_kwargs={"fontsize": 14, 'weight':'semibold'}, label_kwargs={"fontsize": 14, 'weight':'semibold'},
    hist2d_kwargs={"fontsize": 14, 'weight':'semibold'}
);

In [None]:
vis.print_stats(flat_samples)
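
If you prefer to compute the summary statistics yourself, the median and ±1σ values correspond to the 50th, 16th, and 84th percentiles of each column of the flattened chain. The following sketch is not the implementation of `vis.print_stats`; the helper `summarize` and the synthetic samples are hypothetical stand-ins for a real chain.

```python
import numpy as np

def summarize(flat, names):
    """Return {name: (median, upper error, lower error)} from 16/50/84 percentiles."""
    q16, q50, q84 = np.percentile(flat, [16, 50, 84], axis=0)
    return {
        name: (q50[i], q84[i] - q50[i], q50[i] - q16[i])
        for i, name in enumerate(names)
    }

# Synthetic Gaussian "posterior" standing in for flat_samples
rng = np.random.default_rng(0)
fake_samples = rng.normal(
    loc=[7.0, -1.9, 0.1, -2.0], scale=[0.1, 0.05, 0.02, 0.1], size=(20000, 4)
)

names = ["log(age)", "log(Z)", "E(B-V)", "log(A)"]
summary = summarize(fake_samples, names)
for name, (med, up, down) in summary.items():
    print(f"{name}: {med:.3f} +{up:.3f} -{down:.3f}")
```

For a skewed posterior the upper and lower errors will differ, which is why both are reported rather than a single standard deviation.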