Header Box: Title, names, etc.

#### Monte Carlo scattering calculations


#### Basic generative model

The goal of the generative model is to calculate reflection spectra for structural color samples defined by a set of parameters. The Monte Carlo model described above generates theoretical spectra, but these spectra cannot be compared directly with experimental measurements. The full generative model combines the Monte Carlo calculations with additional factors to produce reflection spectra that resemble experimental data.

All of the arguments taken by the Monte Carlo model can be measured for typical samples, except for the particle volume fraction $\phi$, which is included in the generative model as an inference parameter. The Monte Carlo model returns a theoretical reflectance $R^t(\lambda, \phi)$, where $\lambda$ is the wavelength of light at which the reflectance is calculated. A spectrum $\left \{ R^t(\phi) \right \}$ consists of a set of $R^t_i(\lambda_i, \phi)$ values, where $\left \{ \lambda_i \right \}$ represents all wavelengths used to produce the spectrum.

The theoretical reflectance $R^t$ must be transformed into a corrected reflectance $R^c$ before it can be compared with experimental data. For a non-absorbing sample, all incident light must be either reflected or transmitted, yet not all of it is recorded as such. Some light is lost in the experiment, likely due to reflections off coverslip interfaces, escape from the integrating sphere, or absorption within the system. Since the exact reasons for the loss are not known, they cannot be accurately modeled. Instead, we assume only that there is some fractional loss in intensity $L$, and that the loss may vary with wavelength. A corrected spectrum $\left \{ R^c(\phi) \right \}$ is therefore given by the elementwise product of $\left \{ R^t(\phi) \right \}$ and $\left \{ 1 - L \right \}$, or:

$R^c_i=(1-L_i)R^t_i(\lambda_i, \phi)$

The $L_i$ values are not generally known (or physically interesting), so they are marginalized over in the inference calculation. Because the inference is performed with MCMC, this marginalization is trivial.
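The correction step can be sketched in a few lines of NumPy. This is a minimal illustration, assuming `loss` holds the fractional loss at each wavelength so that the retained fraction is `1 - loss` (the convention used by the `likelihood` function in this notebook). The function name `apply_loss` and the numerical values are hypothetical.

```python
import numpy as np

def apply_loss(spect_theory, loss):
    """Return the loss-corrected spectrum (1 - loss) * spect_theory.

    Hypothetical helper: both inputs are arrays of length N, one entry
    per wavelength, and loss is assumed to be a fractional loss.
    """
    spect_theory = np.asarray(spect_theory, dtype=float)
    loss = np.asarray(loss, dtype=float)
    if spect_theory.shape != loss.shape:
        raise ValueError("spect_theory and loss must have the same shape")
    return (1.0 - loss) * spect_theory

# Illustrative values only, not measured data
spect_theory = np.array([0.20, 0.50, 0.30])
loss = np.array([0.10, 0.20, 0.00])
spect_corrected = apply_loss(spect_theory, loss)
```

Each wavelength has its own loss value, so the correction can change the shape of the spectrum and not just its overall amplitude.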

#### Generative model extensions

There are multiple ways to extend the basic model described above. All require modification of the underlying Monte Carlo model.

The highest-priority extension is to calculate transmission spectra $\left \{ T^c(\phi) \right \}$ as well as reflection spectra. Both spectra can be measured experimentally from the same sample, so this extension would increase the amount of information that the generative model can output. By giving the likelihood function more data to work with, we hope to obtain a better estimate of the sample volume fraction $\phi$. In general, there are losses associated with both transmission and reflection, and there is no obvious simple relation between the two. Separate loss parameters $\left \{ L_R \right \}$ and $\left \{ L_T \right \}$ would therefore be introduced at each wavelength and marginalized over in the inference calculation. This extension would require large changes to the Monte Carlo code but few changes to the generative model, and it is expected to dramatically increase the usefulness of the inference calculation.
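One possible way to combine the two spectra is sketched here. It assumes that the reflection and transmission measurement errors are independent and Gaussian, so that the two log-likelihoods simply add, and that each loss is a fractional loss. The function names and argument names are hypothetical and are not part of the existing code.

```python
import numpy as np

def gaussian_loglike(data, model, sigma):
    """Log-likelihood of data given model for independent Gaussian errors."""
    data, model, sigma = (np.asarray(a, dtype=float) for a in (data, model, sigma))
    var = sigma**2
    return -0.5 * np.sum((data - model)**2 / var + np.log(2 * np.pi * var))

def joint_loglike(refl_theory, trans_theory, loss_r, loss_t,
                  refl_data, trans_data, sigma_r, sigma_t):
    """Sum of reflection and transmission log-likelihoods.

    Assumes separate fractional losses loss_r and loss_t at each
    wavelength and independent errors on the two measurements.
    """
    refl_model = (1 - np.asarray(loss_r, dtype=float)) * np.asarray(refl_theory, dtype=float)
    trans_model = (1 - np.asarray(loss_t, dtype=float)) * np.asarray(trans_theory, dtype=float)
    return (gaussian_loglike(refl_data, refl_model, sigma_r)
            + gaussian_loglike(trans_data, trans_model, sigma_t))
```

If the two measurements turn out to share correlated errors, the simple sum would no longer be appropriate.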

Other extensions involve increasing the amount of information that goes into the Monte Carlo calculation by allowing parameters to vary that are currently held fixed. Examples include sphere polydispersity and incident beam divergence, both of which are currently fixed at zero. Both values could be measured experimentally, or sampled as parameters and marginalized over in the inference calculation. These extensions would make the generative model more physical by accounting for phenomena that are currently neglected. They would require relatively minor changes to the Monte Carlo code, but the generative model may need to be reworked to keep computation time manageable. The basic model includes only a single parameter ($\phi$) that affects the Monte Carlo output; adding more parameters would increase both the dimensionality of parameter space and the computation time required to sample it adequately.

#### Model uncertainties


#### Likelihood function

To limit computation time, we hope to calculate the likelihood in two steps. Between successive steps in the volume fraction dimension, walkers will explore the posterior distribution over the loss parameters. For this intermediate likelihood calculation, a theoretical spectrum $\left \{ R^t(\phi) \right \}$ has already been calculated, so the Monte Carlo model does not need to be rerun. There are uncertainties associated with both the stochastic Monte Carlo model and the experimental data. These two uncertainties can be combined into a single likelihood as in Gregory eq. 4.51.
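Written out in this notebook's notation, and assuming independent Gaussian errors at each wavelength with $L_i$ treated as a fractional loss (the convention of the `likelihood` function in the code cell), the combined likelihood takes the form:

$p\left(\left \{ R^d \right \} \mid \phi, \left \{ L \right \}\right) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\left(\sigma_{m,i}^2 + \sigma_{d,i}^2\right)}} \exp\left[-\frac{\left(R^d_i - (1-L_i)R^t_i(\lambda_i, \phi)\right)^2}{2\left(\sigma_{m,i}^2 + \sigma_{d,i}^2\right)}\right]$

The symbols $R^d_i$, $\sigma_{m,i}$, and $\sigma_{d,i}$ are labels introduced here for the measured reflectance, the model uncertainty, and the measurement uncertainty (`spect_data`, `model_sigma`, and `meas_sigma` in the code). The two variances add because the residual is the difference of two independent noisy quantities.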

In [2]:
import numpy as np

def likelihood(spect_theory, loss, spect_data, model_sigma, meas_sigma):
    """
    Returns the Gaussian likelihood of the measured spectrum given the model.
    
    Parameters:
        spect_theory: a spectrum calculated from the MC model (array of length N)
        loss: fractional reflection loss at each wavelength (array of length N)
        spect_data: experimental reflection measurements (array of length N)
        model_sigma: uncertainty associated with the stochastic MC model (array of length N)
        meas_sigma: uncertainty associated with the experimental measurements (array of length N)
    """
    # Residual between the data and the loss-corrected theoretical spectrum
    residual = spect_data - (1 - loss)*spect_theory
    # Model and measurement uncertainties add in quadrature
    var_eff = model_sigma**2 + meas_sigma**2
    chi_square = np.sum(residual**2/var_eff)
    # Normalization of the product of N independent Gaussians
    prefactor = 1/np.prod(np.sqrt(2*np.pi*var_eff))
    return prefactor * np.exp(-chi_square/2)
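For spectra with many wavelengths, the product of N Gaussian factors can underflow or overflow in floating point. A common workaround is to work with the logarithm of the likelihood instead. The sketch below is one way to do that, assuming the same arguments and the same `(1 - loss)` convention as `likelihood`; the name `log_likelihood` is hypothetical.

```python
import numpy as np

def log_likelihood(spect_theory, loss, spect_data, model_sigma, meas_sigma):
    """Natural log of the Gaussian likelihood, summed over wavelengths."""
    spect_theory, loss, spect_data, model_sigma, meas_sigma = (
        np.asarray(a, dtype=float)
        for a in (spect_theory, loss, spect_data, model_sigma, meas_sigma)
    )
    # Residual between the data and the loss-corrected theoretical spectrum
    residual = spect_data - (1 - loss) * spect_theory
    # Model and measurement uncertainties add in quadrature
    var_eff = model_sigma**2 + meas_sigma**2
    # Sum of log-normalization and chi-square terms instead of a product
    return -0.5 * np.sum(residual**2 / var_eff + np.log(2 * np.pi * var_eff))
```

Ensemble samplers such as emcee take a log-probability function, so a function of this form could be passed to the sampler after adding the log-prior.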