# Analysis of DPPC X-ray Reflectometry Data

This example notebook shows how to use the SurfMono class to analyse reflectometry data from surface-active molecules, such as lipids.

In this example, the lipid DPPC is studied at the air-water interface. The X-ray reflectivity data were measured at Diamond Light Source and shared openly.

The SurfMono class constrains the model based on the following relationship. 

$$1-\phi_s = \frac{SLD_t\,d_t\,b_h}{SLD_h\,d_h\,b_t}$$

where $\phi_s$ is the fractional solvent volume in the head layer, the subscripts $t$ and $h$ denote the tail and head layers, and $SLD$, $d$, and $b$ are the scattering length density, thickness, and scattering length, respectively.
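
As a quick numerical check of the relationship above, the sketch below evaluates $\phi_s$ for one set of illustrative values. The function name `head_solvent_fraction` is hypothetical and is not part of SurfMono. Each SLD is assumed here to be the scattering length of the unsolvated component divided by its molecular volume.

```python
def head_solvent_fraction(sld_t, d_t, b_t, sld_h, d_h, b_h):
    """Return phi_s from 1 - phi_s = (SLD_t d_t b_h) / (SLD_h d_h b_t)."""
    occupied = (sld_t * d_t * b_h) / (sld_h * d_h * b_t)
    return 1.0 - occupied


# Illustrative inputs (lengths in Angstrom, volumes in Angstrom^3)
b_h, v_h, d_h = 4674e-6, 319.0, 12.5   # head: scattering length, volume, thickness
b_t, v_t, d_t = 6897e-6, 850.0, 11.6   # tail: scattering length, volume, thickness

# Assumption: SLD of the unsolvated component = scattering length / volume
sld_h = b_h / v_h
sld_t = b_t / v_t

phi_s = head_solvent_fraction(sld_t, d_t, b_t, sld_h, d_h, b_h)
print(f"fractional solvent volume in the head layer: {phi_s:.3f}")
```

With these inputs the scattering lengths cancel, so the result depends only on the ratio of volumes and thicknesses.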

This example shows only one contrast; however, the same class can also analyse multiple contrasts, with the constraint that they all share the same underlying model (e.g. the surface excess should be the same).

In [None]:
# Standard libraries to import
%matplotlib inline
import numpy as np 
import matplotlib.pyplot as plt
from matplotlib import rcParams, rc
from scipy.stats import norm
from IPython.display import Markdown as md
import corner

# The refnx library, and associated classes
import refnx
from refnx.reflect import ReflectModel, SLD
from refnx.dataset import ReflectDataset
from refnx.analysis import Transform, CurveFitter, Objective, process_chain

# The SurfMono class to constrain the monolayer model.
from surfmono import SurfMono

The DPPC monolayer model is built next. The SurfMono class constrains the number density of the lipid heads and tails such that they are held constant throughout the fitting process. This allows the volume fraction of solvent in the head region to be greater than 0, while keeping the volume fraction of solvent in the tails at 0 throughout.
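
The bookkeeping behind this constraint can be sketched with plain arithmetic. This is a minimal illustration under stated assumptions, not the SurfMono implementation: the function `solvated_head_layer` is hypothetical, the tail layer is assumed to contain no solvent (so the area per molecule is the tail volume divided by the tail thickness), and the head layer SLD is assumed to be a volume-weighted mix of head group and solvent.

```python
def solvated_head_layer(v_tail, d_tail, v_head, d_head, b_head, sld_solvent):
    """Return (area per molecule, head solvent fraction, head layer SLD).

    Lengths are in Angstrom, volumes in Angstrom^3, and SLDs in
    units of 10^-6 Angstrom^-2.
    """
    # Tails hold no solvent, so one tail volume fills area * thickness
    apm = v_tail / d_tail
    # One head per tail: the same area applies to the head layer
    phi_solvent = 1.0 - v_head / (apm * d_head)
    # SLD of the unsolvated head group, converted to 10^-6 Angstrom^-2
    sld_head = b_head / v_head * 1e6
    # Assumption: volume-weighted mixing of head group and solvent
    sld_layer = (1.0 - phi_solvent) * sld_head + phi_solvent * sld_solvent
    return apm, phi_solvent, sld_layer


apm_est, phi_est, sld_est = solvated_head_layer(
    v_tail=850.0, d_tail=11.6, v_head=319.0, d_head=12.5,
    b_head=4674e-6, sld_solvent=9.45)
print(f"apm = {apm_est:.1f}, phi = {phi_est:.3f}, head layer SLD = {sld_est:.2f}")
```

Because the heads and tails share one area per molecule, any head layer thicker than the head volume requires is topped up with solvent, which is why the head solvent fraction can exceed 0.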

In [None]:
# Reading dataset into refnx format
dataset = ReflectDataset('dppc_water_xrr.dat')

# area per molecule (Angstrom^2)
apm = 90

# Scattering length of the lipid head group in Angstrom
# (found by summing the electrons in the head group
# and multiplying by the classical radius of an electron)
head_sl = 4674e-6
# volume of head group (Angstrom^3)
v_head = 319
# thickness of head group layer (Angstrom)
head_group_thickness = 12.5
# Scattering length of the lipid tail group (Angstrom)
tail_sl = 6897e-6
# volume of tail group (Angstrom^3)
v_tail = 850
# thickness of tail group layer (Angstrom)
tail_group_thickness = 11.6

rough_head_tail = 3
rough_air_heads = 3

# reverse_monolayer = True says that the tail region is closest to the fronting
# medium (air). If reverse_monolayer = False, then the headgroups are
# closest to the fronting medium.
surfmono = SurfMono(apm, head_sl, v_head, head_group_thickness,
                    tail_sl, v_tail, tail_group_thickness, rough_head_tail,
                    rough_air_heads, name='dppc', reverse_monolayer=True)

# SLDs of the fronting and backing media (in units of 10^-6 Angstrom^-2)
air = SLD(0, 'air')
water = SLD(9.45, 'h2o')

structure_dppc = air(0, 0) | surfmono | water(0, 3)

# the solvation of the membrane is controlled by the structure_dppc.solvent attribute
# By default this is assumed to be the SLD of the backing medium, but you can use
# any value you'd like.

# specify which parameters you'd like to vary, and their bounds.
surfmono.apm.setp(vary=True, bounds=(40, 300))
surfmono.vm_tails.setp(vary=True, bounds=(10, 1200))
surfmono.thickness_heads.setp(vary=True, bounds=(10, 18))
surfmono.thickness_tails.setp(vary=True, bounds=(10, 18))
surfmono.rough_preceding_mono.setp(vary=True, bounds=(1, 9))
surfmono.rough_head_tail.setp(vary=True, bounds=(1, 8))

# roughness of water - monolayer interface
structure_dppc[-1].rough.setp(vary=True, bounds=(1, 8))

model_dppc = ReflectModel(structure_dppc, name='dppc', dq=0)
model_dppc.scale.setp(vary=True, bounds=(0.7, 1.3))
# The background starts from a value determined in a previous fit
# and is allowed to vary within the bounds given
model_dppc.bkg.setp(3.52703e-10, vary=True, bounds=(1e-10, 5e-10))

objective = Objective(model_dppc, dataset, use_weights=True, transform=Transform('logY'))
# A differential evolution algorithm is used to obtain a best fit
fitter = CurveFitter(objective)
# No seed is passed here, so the result may differ slightly between runs
res = fitter.fit('differential_evolution')

In [None]:
objective.plot()

In [None]:
plt.plot(*model_dppc.structure.sld_profile())

Markov chain Monte Carlo (MCMC) sampling begins here. It allows the probability density function of each varying parameter to be estimated.
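
To illustrate how MCMC turns a log-probability into a set of samples, here is a minimal random-walk Metropolis sampler for a single parameter. It is a deliberately simpler technique than the ensemble sampler used by refnx, and the target (a Gaussian with mean 2.0 and width 0.5) and all names are made up for the illustration.

```python
import numpy as np


def log_posterior(theta, mu=2.0, sigma=0.5):
    """Log-probability of a hypothetical Gaussian posterior (unnormalised)."""
    return -0.5 * ((theta - mu) / sigma) ** 2


def metropolis(log_prob, start, n_steps, step_size, rng):
    """Random-walk Metropolis: propose a jump, accept with prob min(1, ratio)."""
    chain = np.empty(n_steps)
    current = start
    current_lp = log_prob(current)
    for i in range(n_steps):
        proposal = current + step_size * rng.normal()
        proposal_lp = log_prob(proposal)
        # Accept if a uniform draw falls below the posterior ratio
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        chain[i] = current
    return chain


rng = np.random.default_rng(0)
mc_chain = metropolis(log_posterior, start=0.0, n_steps=20000,
                      step_size=0.5, rng=rng)
mc_samples = mc_chain[2000:]  # discard the early steps as burn-in
print(f"mean = {mc_samples.mean():.2f}, std = {mc_samples.std():.2f}")
```

The histogram of `mc_samples` approximates the target distribution; the cells that follow do the analogous job for all varying parameters of the reflectometry model at once.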

In [None]:
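# Initialise the walkers with a small jitter around the best-fit values
fitter.initialise('jitter')
# Burn-in: collect one thinned sample, then discard it
fitter.sample(1, nthin=200)
fitter.sampler.reset()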

In [None]:
# Production run: 30 samples are saved, thinning by 50 (30*50 sampler steps)
# No random_state is passed here, so the chain is not exactly reproducible
res = fitter.sample(30, nthin=50)

The 1D probability density functions of each of the parameters are then plotted, and 2D PDFs are used to show the correlations between the different parameters.
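
The same information can be summarised numerically. The sketch below uses a synthetic chain, not the DPPC results: it assumes a chain stored as (steps, walkers, parameters), flattens it, and reports a median with a 16th to 84th percentile interval for each parameter, plus the correlation coefficient between the two. The array shapes and parameter values are hypothetical.

```python
import numpy as np

# Synthetic stand-in for an MCMC chain: two positively correlated parameters
rng = np.random.default_rng(1)
n_steps, n_walkers, n_vary = 30, 200, 2
cov = [[1.0, 0.8], [0.8, 1.0]]
fake_chain = rng.multivariate_normal([12.0, 14.0], cov,
                                     size=(n_steps, n_walkers))

# Flatten steps and walkers into one long list of samples per parameter
flat = fake_chain.reshape(-1, n_vary)

# 1D summary: median and 16th/84th percentiles of each marginal PDF
lower, median, upper = np.percentile(flat, [16, 50, 84], axis=0)
for i in range(n_vary):
    print(f"param {i}: {median[i]:.2f} "
          f"(+{upper[i] - median[i]:.2f} / -{median[i] - lower[i]:.2f})")

# 2D summary: linear correlation between the two parameters
corr = np.corrcoef(flat, rowvar=False)
print(f"correlation coefficient: {corr[0, 1]:.2f}")
```

A correlation coefficient close to 1 or -1 corresponds to a narrow, tilted ellipse in the matching 2D panel of a corner plot.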

In [None]:
print(objective)

In [None]:
process_chain(objective, fitter.chain, nburn=5, nthin=2)

for pvec in objective.pgen(ngen=50):
    objective.setp(pvec)
    calc = objective.generative() * dataset.x**4
    plt.plot(dataset.x, calc, color='k', linewidth=1, alpha=0.05)
data = dataset.y * dataset.x**4
data_err = dataset.y_err * dataset.x**4
plt.errorbar(dataset.x, data, yerr=data_err, linestyle='', marker='x', markersize=5, 
             markeredgecolor='k', markerfacecolor='none', ecolor='k')

plt.ylabel('$Rq^4$/Å$^{-4}$')
plt.yscale('log')
plt.xlabel('$q$/Å$^{-1}$')
plt.tight_layout()

In [None]:
process_chain(objective, fitter.chain, nburn=5, nthin=2)
plt.plot(*structure_dppc.sld_profile(), color='b', linewidth=2)
for pvec in objective.pgen(50):
    objective.setp(pvec)
    plt.plot(*structure_dppc.sld_profile(), color='k', linewidth=2, alpha=0.01)

plt.xlabel('$z$/Å')
plt.ylabel('SLD/$10^{-6}Å^{-2}$');

In [None]:
corner.corner(fitter.chain.reshape(-1, len(objective.varying_parameters())),
             labels=objective.varying_parameters().names());