# Mapping likelihood parameters to simulation parameters

Our likelihood is parameterized by a set of parameters $\theta$ describing the linear power spectrum at $z_\star=3$.

To evaluate the likelihood, at each redshift we use the parameters $\theta$ and the nuisance parameters $\phi$ to get the corresponding value of the emulator parameters $\mu$. The emulator looks for the snapshots in our archive with similar $\mu$ values, and returns the emulated 1D power spectrum.

While setting up the emulator, and when preparing a "refinement step", we need to decide what simulations to run. These simulations are not specified by either likelihood parameters ($\theta$,$\phi$), or by emulator parameters $\mu$. Instead, the simulations are specified by a different set of parameters $\eta$, including for instance the redshift of reionization or the traditional cosmological parameters.

In other notebooks we have discussed how to go from cosmological parameters to likelihood parameters, or from likelihood parameters to emulator parameters. Here we will describe the inverse mapping, from linear power parameters $\theta$ to simulation parameters $\eta$, in particular the cosmological parameters. In a different notebook we will discuss how to map the nuisance parameters in the likelihood, $\phi$, to the astrophysical parameters in the simulation.

### Likelihood parameters: linear power $\theta$

We will use a maximum of five parameters to describe the linear power spectrum across the redshift range of interest.

$$ \theta = \{ \Delta_p^2, n_p, \alpha_p, f_\star, g_\star \} $$

Three of them will describe the linear power at $z_\star$, in velocity units (km/s). We will use a Taylor expansion of the logarithm of the linear power around $k_p = 0.009$ s/km. 

$$ \Delta_p^2 = \frac{k_p^3}{2 \pi^2} P_L(z_\star,k_p) $$

$$ n_p = \frac{\partial \log P_L(z_\star, k)}{\partial \log(k) } \Bigr\rvert_{k_p} $$

$$ \alpha_p = \frac{\partial^2 \log P_L(z_\star, k)}{\partial \log^2(k) } \Bigr\rvert_{k_p} $$
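The three definitions above can be illustrated with a short sketch. It assumes we already have a tabulated linear power spectrum at $z_\star$ in velocity units, and it estimates the parameters by fitting a quadratic polynomial to $\log P_L$ as a function of $\log(k/k_p)$. The function name `fit_linear_power_params`, the fitting range, and the toy input spectrum are hypothetical choices made for illustration, not the method used elsewhere in these notebooks.

```python
import numpy as np

def fit_linear_power_params(k_kms, P_kms, kp_kms=0.009, fit_range=(0.5, 2.0)):
    """Estimate (Delta_p^2, n_p, alpha_p) from a tabulated linear power spectrum.

    k_kms : wavenumbers in s/km
    P_kms : linear power at z_star, in (km/s)^3
    fit_range : hypothetical choice, fit only k in [0.5, 2] * k_p
    """
    x = np.log(k_kms / kp_kms)
    mask = (k_kms >= fit_range[0] * kp_kms) & (k_kms <= fit_range[1] * kp_kms)
    # ln P = c0 + c1 x + c2 x^2 ; np.polyfit returns the highest order first
    c2, c1, c0 = np.polyfit(x[mask], np.log(P_kms[mask]), deg=2)
    delta2_p = kp_kms**3 * np.exp(c0) / (2.0 * np.pi**2)
    n_p = c1            # first log-derivative at k_p
    alpha_p = 2.0 * c2  # second log-derivative at k_p
    return delta2_p, n_p, alpha_p

# Toy spectrum that is exactly quadratic in ln k (illustrative numbers only)
kp = 0.009
k = np.logspace(-3, -1, 200)
true_amp, true_n, true_alpha = 1.0e7, -2.3, -0.2
x = np.log(k / kp)
P = true_amp * np.exp(true_n * x + 0.5 * true_alpha * x**2)

delta2_p, n_p, alpha_p = fit_linear_power_params(k, P)
```

Note the factor of two between the quadratic coefficient of the fit and $\alpha_p$, which follows from defining $\alpha_p$ as the second derivative rather than as the Taylor coefficient.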

The other two will describe the redshift evolution of the linear growth $D(z)$ and the Hubble expansion $H(z)$ around $z_\star=3$, normalized by the expected evolution in an Einstein-de Sitter universe.

$$ f_\star = f(z_\star) = \frac{\partial \log D(z)}{\partial \log a(z)} \Bigr\rvert_{z_\star} $$

$$ g_\star = g(z_\star) = \frac{\partial \log H(z)}{\partial \log (1+z)^{3/2}} \Bigr\rvert_{z_\star} $$
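As a rough guide to the size of these two parameters, the sketch below evaluates them for a flat $\Lambda$CDM model with matter and a cosmological constant only (radiation and massive neutrinos are ignored, which is an assumption of this example). In that case $g(z) = \Omega_m(z)$ exactly, and the growth rate is taken from the common approximation $f(z) \approx \Omega_m(z)^{0.55}$, which is not exact. The function names are hypothetical.

```python
import numpy as np

def omega_m_of_z(z, omega_m0):
    """Matter fraction at redshift z in flat LCDM (matter + Lambda only)."""
    a3 = (1.0 + z) ** 3
    return omega_m0 * a3 / (omega_m0 * a3 + 1.0 - omega_m0)

def g_star(z_star, omega_m0):
    """d ln H / d ln (1+z)^{3/2} at z_star.

    For H^2 ~ Omega_m0 (1+z)^3 + (1 - Omega_m0) this equals Omega_m(z).
    """
    return omega_m_of_z(z_star, omega_m0)

def f_star(z_star, omega_m0):
    """Approximate logarithmic growth rate, f ~ Omega_m(z)^0.55."""
    return omega_m_of_z(z_star, omega_m0) ** 0.55

# Illustrative value of Omega_m today
f3 = f_star(3.0, 0.3)
g3 = g_star(3.0, 0.3)
```

Both quantities equal one in an Einstein-de Sitter universe, and at $z_\star=3$ they differ from one by only a few percent for reasonable values of $\Omega_m$.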


### Likelihood parameters: nuisance $\phi$
- Mean flux: We will have a handful of parameters to describe the mean transmitted flux fraction (or mean flux) across the relevant redshift range. We will refer to these parameters as the $\phi^\tau$ parameters, since they parameterize the effective optical depth $\tau_{\rm eff} = - \log \bar F$. Several parameterizations are possible (take a look at mean_flux.ipynb), but for now we choose a polynomial in $x_z = \log ((1+z)/(1+z_\star))$, of order $N_\tau-1$.

$$ \ln \tau_{\rm eff}(x_z) = \sum_{n=0}^{N_\tau-1} \phi^\tau_n \, x_z^n  \qquad \qquad x_z = \log \frac{1+z}{1+z_\star} $$

A popular choice in the literature is to assume a power law, i.e., use $N_\tau=2$. We probably want to go beyond that.

- Temperature / density relation (TDR): it is common to describe the thermal state of the IGM with two parameters, describing a power law relation between temperature and density, valid around the mean density:
$$ T(\rho)= \left(\frac{\rho}{\rho_0}\right)^{\gamma-1} T_0$$
Even though this is an oversimplified description, we hope that this parameterization is enough to marginalize over the uncertainties on the thermal state at a given redshift.
Similarly to the mean flux case, we will describe the redshift evolution of $T_0$ and $\gamma$ with polynomials in $x_z$:
$$ \ln T_0(x_z) = \sum_{n=0}^{N_T-1} \phi^T_n \, x_z^n  \qquad \qquad \ln \gamma(x_z) = \sum_{n=0}^{N_\gamma-1} \phi^\gamma_n \, x_z^n $$

Here as well a popular choice is to assume a power law ($N_T=N_\gamma=2$). We might also go beyond that.

- Pressure smoothing: the last nuisance relevant for the emulator is the effect of pressure on the small-scale distribution of the gas.
This is different from the effect of "instantaneous" temperature, since the amount of pressure smoothing at a given time depends on the past thermal history.
The effect of pressure is often parameterized using a "filtering length" $\lambda_F$, in velocity units, or its Fourier equivalent $k_F = 2 \pi / \lambda_F$. There are different ways of defining this length, and of measuring it from simulations, but we will not worry about it here. We assume that whatever we do is enough to marginalize over this uncertainty.
Similarly to the mean flux case, we will describe the redshift evolution of the filtering length with a polynomial in $x_z$:
$$ \ln \lambda_F(x_z) = \sum_{n=0}^{N_\lambda-1} \phi^\lambda_n \, x_z^n $$

Note that this is often not the approach used in the literature, where it is more common to see the effect of pressure smoothing either ignored or parameterized with the redshift of reionization. 
The relation between the two of them is discussed in another notebook.
For now we assume that we will use a polynomial for $\lambda_F$, with $N_\lambda=2$ or $3$.
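All four nuisance quantities ($\tau_{\rm eff}$, $T_0$, $\gamma$, $\lambda_F$) share the same polynomial form in $x_z$, so a single helper can evaluate them. The sketch below is a minimal illustration; the function names and all coefficient values are hypothetical and chosen only to give plausible numbers, not fitted to any data.

```python
import numpy as np

Z_STAR = 3.0

def poly_in_xz(z, phi, z_star=Z_STAR):
    """Evaluate sum_n phi[n] * x_z**n, with x_z = ln((1+z)/(1+z_star))."""
    x_z = np.log((1.0 + z) / (1.0 + z_star))
    return sum(c * x_z**n for n, c in enumerate(phi))

def nuisance_at_z(z, phi_tau, phi_T, phi_gamma, phi_lambda):
    """Map the polynomial coefficients to physical quantities at redshift z."""
    tau_eff = np.exp(poly_in_xz(z, phi_tau))
    lambda_F = np.exp(poly_in_xz(z, phi_lambda))
    return {
        "mean_flux": np.exp(-tau_eff),     # F = exp(-tau_eff)
        "T0": np.exp(poly_in_xz(z, phi_T)),
        "gamma": np.exp(poly_in_xz(z, phi_gamma)),
        "lambda_F": lambda_F,              # velocity units
        "k_F": 2.0 * np.pi / lambda_F,     # Fourier equivalent
    }

# Illustrative coefficients only (hypothetical, not fitted values)
phi_tau = [np.log(0.4), 3.5]            # power law, N_tau = 2
phi_T = [np.log(1.0e4), -1.0]           # T0 in K, N_T = 2
phi_gamma = [np.log(1.5), 0.1]          # N_gamma = 2
phi_lambda = [np.log(80.0), 0.5, 0.0]   # lambda_F in km/s, N_lambda = 3

params = nuisance_at_z(3.0, phi_tau, phi_T, phi_gamma, phi_lambda)
```

At $z = z_\star$ only the zeroth-order coefficients contribute, so each $\phi_0$ is simply the logarithm of the corresponding quantity at $z_\star$.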

### Emulator parameters $\mu$

The relation between the likelihood parameters ($\theta$,$\phi$) and the emulator parameters $\mu$ is discussed in the notebook full_likelihood. The emulator will have a training set containing a large number of simulated spectra, identified by a list of parameters $\mu$:
$$ \mu = \{ \mu_P, \lambda_F, \bar F, \sigma_T, \gamma \} $$
where $\mu_P$ is a set of (3) parameters describing the linear power in the snapshot, in units of Mpc, and $\sigma_T$ is the thermal broadening corresponding to $T_0$:

$$ \sigma_T (T_0) = 9.1  \sqrt{ \frac{T_0}{10000K} } \, \mathrm{km/s} $$

Note that in the emulator $\sigma_T$ will be converted to comoving units in the same way that we have converted the linear power from velocity to comoving units.
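A minimal sketch of this conversion is shown below. It assumes that velocity and comoving separations are related by $dv = H(z)/(1+z) \, dx$, and it uses a flat $\Lambda$CDM expansion rate with illustrative values of $H_0$ and $\Omega_m$; both the function names and the choice of Mpc (rather than Mpc/h) are assumptions of this example.

```python
import numpy as np

def sigma_T_kms(T0_K):
    """Thermal broadening in km/s for a gas temperature T0 in Kelvin."""
    return 9.1 * np.sqrt(T0_K / 1.0e4)

def dkms_dMpc(z, H0=67.0, omega_m0=0.3):
    """Conversion factor H(z)/(1+z) in km/s/Mpc (flat LCDM, illustrative values)."""
    Hz = H0 * np.sqrt(omega_m0 * (1.0 + z) ** 3 + 1.0 - omega_m0)
    return Hz / (1.0 + z)

def sigma_T_Mpc(T0_K, z, **cosmo):
    """Thermal broadening expressed as a comoving length in Mpc."""
    return sigma_T_kms(T0_K) / dkms_dMpc(z, **cosmo)

sigma_v = sigma_T_kms(1.0e4)       # velocity units
sigma_x = sigma_T_Mpc(1.0e4, 3.0)  # comoving units at z = 3
```

Because the conversion factor depends on redshift and cosmology, two snapshots with the same $T_0$ will in general have different values of $\sigma_T$ in comoving units.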


### Simulation parameters $\eta$

From each simulation we run, we will output a number of snapshots, of order 10. 

Moreover, for each snapshot we extract different sets of simulated Lyman-$\alpha$ skewers (normalized quasar spectra), after applying different rescalings of the temperatures in the snapshot. 
These skewers are written to disk.

Finally, from each set of skewers we measure different 1D power spectra, after rescaling the mean optical depth in the spectra. 
This optical depth rescaling is trivial, and it is done on the fly. 
Each of these measured power spectra is fed to the emulator. 
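The optical depth rescaling mentioned above can be sketched as follows: given the optical depth along a set of skewers, we look for a single multiplicative factor $A$ such that the rescaled skewers have the desired mean flux. The function name, the bracketing interval, and the toy lognormal optical depths are hypothetical; bisection is used here only because it is simple, and it is not necessarily what the pipeline does.

```python
import numpy as np

def rescale_tau(tau, target_mean_flux, lo=1e-3, hi=1e3, tol=1e-10):
    """Find A such that mean(exp(-A * tau)) equals target_mean_flux.

    Uses bisection in log(A); the mean flux decreases monotonically with A.
    """
    def mean_flux(A):
        return np.mean(np.exp(-A * tau))

    if not (mean_flux(hi) < target_mean_flux < mean_flux(lo)):
        raise ValueError("target mean flux is outside the bracketed range")
    for _ in range(200):
        mid = np.sqrt(lo * hi)
        if mean_flux(mid) > target_mean_flux:
            lo = mid   # too transparent, need a larger factor
        else:
            hi = mid   # too opaque, need a smaller factor
        if hi / lo - 1.0 < tol:
            break
    return np.sqrt(lo * hi)

# Toy skewers: 16 lines of sight with 256 pixels each (illustrative only)
rng = np.random.default_rng(0)
tau = rng.lognormal(mean=-1.0, sigma=1.0, size=(16, 256))

A = rescale_tau(tau, target_mean_flux=0.7)
flux = np.exp(-A * tau)
```

Since the rescaling only multiplies the stored optical depths, it can be repeated for many target values of $\bar F$ without reprocessing the snapshot.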


Let's discuss the parameterization of each of the simulation products:

- As discussed above, the emulator labels the measured power with the set of parameters $ \mu = \{ \mu_P, \lambda_F, \bar F, \sigma_T, \gamma \} $.

- Each set of simulated skewers, written to disk, is described by a subset of these, $ \{ \mu_P, \lambda_F, \sigma_T, \gamma \} $, since each set of skewers will be rescaled to several values of $\bar F$.

- Each snapshot is described by an even smaller subset of parameters, $ \{ \mu_P, \lambda_F \} $, since we will reprocess the snapshot for different temperature-density relations.

If we assumed that we could rescale the mean flux and the temperature as much as we wanted, then this last set of parameters $ \{ \mu_P, \lambda_F \} $ would be the only relevant one. If we cannot rescale as much as we would like to (because the rescaling breaks down?), then we would need to label the snapshots with a "central temperature" and a "central mean flux", around which we perturb.
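One way to picture this hierarchy of labels is as nested records, where each level adds the parameters introduced by the corresponding post-processing step. The class and field names below are hypothetical, and the numerical values are placeholders.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class SnapshotLabel:
    """Parameters fixed by the simulation snapshot itself."""
    mu_P: Tuple[float, float, float]  # linear power parameters, comoving units
    lambda_F: float                   # filtering length

@dataclass(frozen=True)
class SkewerLabel:
    """A snapshot reprocessed with a given temperature-density relation."""
    snapshot: SnapshotLabel
    sigma_T: float
    gamma: float

@dataclass(frozen=True)
class EmulatorLabel:
    """A set of skewers rescaled to a given mean flux."""
    skewers: SkewerLabel
    mean_flux: float

    def as_mu(self):
        """Flatten to the ordering {mu_P, lambda_F, mean flux, sigma_T, gamma}."""
        s = self.skewers
        return (*s.snapshot.mu_P, s.snapshot.lambda_F,
                self.mean_flux, s.sigma_T, s.gamma)

# Placeholder values, for illustration only
snap = SnapshotLabel(mu_P=(0.35, -2.3, -0.2), lambda_F=0.1)
skew = SkewerLabel(snapshot=snap, sigma_T=0.12, gamma=1.5)
label = EmulatorLabel(skewers=skew, mean_flux=0.67)
mu = label.as_mu()
```

With this layout, several `SkewerLabel` objects can share one `SnapshotLabel`, and several `EmulatorLabel` objects can share one `SkewerLabel`, mirroring the way the data products are generated.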

The question is: how do we relate the emulator parameters $\mu$, or the snapshot parameters, to the parameters specified in the configuration file of the simulation?