This notebook is a simple walkthrough of what's possible with xarray for resonant scattering, and a demonstration of the library I've been developing.

The library has a few discrete modules:

1) File loaders.  These abstract away the details of getting raw data into a raw xarray and correcting the intensity (which lives here because it tends to be metadata-intensive).  To add support for a new beamline you simply write a class that inherits from FileLoader with one method that opens the file, formats it into an xarray, and cleans up the metadata to include standard terms like exposure, energy, pos_x, pos_th, etc.

(optionally, data preprocessors like HDRSoXS exposure stitching can live between these two steps.)

2) Integrators.  These abstract away the details of conversion from a raw xarray to a chi,q xarray.  I provide a pyFAI-based general-purpose integrator (for stacks where no parameters change within the array) and an energy series integrator (for stacks where energy changes within the array, which it handles efficiently).  Other geometries (e.g. a moving detector) would be handled by subclassing the general integrator.  Other backends (for example, pyGix) are also easy to add: they just need to support the same integrateSingleImage(img) and integrateImageStack(img_stack) methods.

(you can slice the data here easily with standard xarray selector methods - as demonstrated below - or process it)

3) Fitting.  The fits are very simple to write; I provide a few demos here for Lorentzian and Gaussian peaks.  There are probably better curve fitting backends than scipy.optimize.curve_fit (lmfit, perhaps), but that's what I used.
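To make the loader contract in (1) a bit more concrete, here is a minimal sketch of the pattern.  Everything in it is an assumption made for illustration: ToyLoader, its md_lookup table, and the raw header keys are hypothetical, and it takes in-memory pixels instead of opening a file so that it runs anywhere.  The real base class is FileLoader, whose exact interface is not shown in this notebook.

```python
import numpy as np
import xarray as xr


class ToyLoader:
    """Hypothetical loader that mimics the pattern described above (illustration only)."""

    # Made-up raw header keys mapped onto the standard metadata terms.
    md_lookup = {
        'EXPOSURE': 'exposure',
        'Beamline Energy': 'energy',
        'Sample X': 'pos_x',
        'Sample Theta': 'pos_th',
    }

    def loadSingleImage(self, pixels, header):
        # Keep the raw header and add the standardized names next to it.
        attrs = dict(header)
        for raw_key, std_key in self.md_lookup.items():
            if raw_key in header:
                attrs[std_key] = header[raw_key]
        return xr.DataArray(np.asarray(pixels), dims=['pix_y', 'pix_x'], attrs=attrs)


toy_loader = ToyLoader()
toy_img = toy_loader.loadSingleImage(
    np.zeros((4, 6)),
    {'EXPOSURE': 0.1, 'Beamline Energy': 270.0},
)
print(toy_img.dims, toy_img.attrs['energy'])
```

The point of the lookup table is that everything downstream can rely on names like energy and exposure, whatever the beamline calls them in its file headers.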


First, we import the modules we're going to use: (note the hack to put the library directory on the path)

In [None]:
import sys
sys.path.append("../PyHyperScattering/")

from ALS11012RSoXSLoader import ALS11012RSoXSLoader

from PFEnergySeriesIntegrator import PFEnergySeriesIntegrator

Next, we configure the loader with dark images and correction details.

In [None]:
loader = ALS11012RSoXSLoader(corr_mode='none')
loader.loadSampleSpecificDarks("../example_data/CCD/PSg/",md_filter={'sampleid':11})

Loading a single image is fairly straightforward, so let's pause here to look at the data structure:

In [None]:
test_single_image = loader.loadSingleImage('../example_data/CCD/PSg/L2_PSg_B_PZS_62757-00094.fits')
test_single_image.plot()
test_single_image

As you can see, the image is an xarray with dimensions pix_x and pix_y.  The metadata are all in "attributes" with some extra entries created with standard language.

My stack loader can use any entry in the single image attribute field as a coordinate of the dataset - meaning it can be selected on, sorted, etc.  You can also add arbitrary user data to the attributes via a simple interface.

Here is an example of how one might make an arbitrary user coordinate from info like the file name.  In this case we'll just munge the exposure number from the name:

In [None]:
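import os

files = os.listdir('../example_data/CCD/PSg')

# Map each FITS filename to its exposure number, e.g.
# 'L2_PSg_B_PZS_62757-00094.fits' -> 94 (the five digits before '.fits').
filenumber_coord = {}
for file in files:
    if file.endswith('.fits'):
        filenumber_coord[file] = int(file[-10:-5])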

What we just made is a dictionary of filename:number pairs - a coordinate!  You could use the filename to get other data, e.g. temperature, from an outside source.
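The fixed slice above works as long as every name ends in exactly five digits plus '.fits'.  If that is not guaranteed, a regular expression is a bit more forgiving.  This is a sketch under the assumption that names look like the example file in this notebook ('...-00094.fits'); exposure_number is a hypothetical helper, not part of the library.

```python
import re


def exposure_number(filename):
    """Return the trailing run number of a '...-NNNNN.fits' name, or None if it does not match."""
    match = re.search(r'-(\d+)\.fits$', filename)
    return int(match.group(1)) if match else None


names = ['L2_PSg_B_PZS_62757-00094.fits', 'notes.txt']
# Keep only the files whose name could be parsed.
demo_coord = {
    name: exposure_number(name)
    for name in names
    if exposure_number(name) is not None
}
print(demo_coord)
```

Returning None for non-matching names lets the dictionary comprehension skip stray files without raising.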

To use this coordinate, you do two things. 

(1) Pass the {filename:value,filename:value} dict you created in coords, as an entry of another dict where the key is the name of the parameter, e.g.
'filenumber':filenumber_coord.

(2) Tell the loader you want your new attribute to be used as a dimension of the resulting dataset by adding the name to dims.
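To picture what "used as a dimension" means, here is a toy version of the idea in plain xarray.  It is a sketch of the concept and not necessarily what loadFileSeries does internally: the images are concatenated along a compound 'system' axis, per-image values are attached to that axis, and unstacking turns each of them into its own dimension.  The image contents and metadata values are made up.

```python
import numpy as np
import xarray as xr

# (energy, filenumber) for three made-up images; each image is filled with its energy.
meta = [(270.0, 94), (280.0, 95), (290.0, 96)]
images = [
    xr.DataArray(np.full((4, 5), energy), dims=['pix_y', 'pix_x'])
    for energy, _ in meta
]

# One entry per image along a new 'system' dimension.
stack = xr.concat(images, dim='system')
stack = stack.assign_coords(
    energy=('system', [m[0] for m in meta]),
    filenumber=('system', [m[1] for m in meta]),
)

# Combine both coordinates into a multiindex, then unstack into separate dimensions.
cube = stack.set_index(system=['energy', 'filenumber']).unstack('system')

print(cube.sizes)
print(float(cube.sel(energy=280.0, filenumber=95).mean()))
```

Note that the unstacked cube holds every energy/filenumber combination, and the ones that were never measured are filled with NaN.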

So, next we're going to load a series.  The arguments here are path, dimensions (what axes we want the resulting xarray to have), coordinates (extra axes we generated, as above), and finally a metadata filter to suppress certain files.  In this case we only want a single sample number, so we say to ignore files where sampleid is not 11 (the same filter also requires 'CCD Shutter Inhibit' to be 0).

In [None]:
raw_xr = loader.loadFileSeries(
    '../example_data/CCD/PSg/',
    ['energy', 'polarization', 'exposure', 'filenumber'],
    coords={'filenumber': filenumber_coord},
    md_filter={'sampleid': 11, 'CCD Shutter Inhibit': 0},
)

Let's look at this xarray:

In [None]:
raw_xr

So, this is now a stack of 184 images in a single xarray.  We can use xarray selectors to punch out single images or subsets really easily:

In [None]:
raw_xr.sel(energy=320,polarization=100,method='nearest')

In [None]:
raw_xr.sel(energy=320,polarization=100,exposure=0.003,method='nearest').plot()
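If you want to play with these selectors without the example data, the same calls work on a small made-up array.  The sketch below uses invented energy and exposure values; only sel, method='nearest' and slice are real xarray features.

```python
import numpy as np
import xarray as xr

toy = xr.DataArray(
    np.arange(9.0).reshape(3, 3),
    dims=['energy', 'exposure'],
    coords={'energy': [270.0, 284.9, 320.2], 'exposure': [0.003, 0.1, 1.002]},
)

# Nearest-neighbour lookup: you do not need the exact coordinate value.
nearest = toy.sel(energy=320, exposure=0.1, method='nearest')

# Range lookup: slices are inclusive and need a sorted coordinate.
window = toy.sel(energy=slice(280, 330))

print(float(nearest), float(nearest.energy))
print(window.energy.values)
```

One thing to keep in mind: xarray does not accept method='nearest' and a slice in the same sel call, so range and nearest selections have to be chained as separate calls.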

OK, let's try integrating this to get something useful!

We set up an integrator - to make life easier, the integrator supports both calibrations and masks from Nika.

In [None]:
integrator = PFEnergySeriesIntegrator(
    maskmethod="nika", maskpath="../example_data/LowQ_mask.hdf",
    geomethod="nika", NIdistance=131.06, NIbcx=561.76, NIbcy=(1024 - 452.33),
    integration_method='csr_ocl',  # pyFAI's OpenCL CSR method; needs a working OpenCL setup
)


Let's take our test image from before and integrate it:

In [None]:
test_single_image.plot()

In [None]:
integrator.integrateSingleImage(test_single_image).plot()
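To give a feel for what this step does conceptually, here is a toy remapping from pixel coordinates to (chi, radius) bins using only numpy.  It is not pyFAI and not the library's integrator: it works in pixel radius r instead of calibrated q, applies no mask or corrections, and ToyPolarIntegrator with its constructor arguments is hypothetical.  Only the method name integrateSingleImage is taken from the text above.

```python
import numpy as np
import xarray as xr


class ToyPolarIntegrator:
    """Hypothetical stand-in for an integrator backend (illustration only)."""

    def __init__(self, center_x, center_y, n_r=20, n_chi=36):
        self.center_x = center_x
        self.center_y = center_y
        self.n_r = n_r
        self.n_chi = n_chi

    def integrateSingleImage(self, img):
        ny, nx = img.sizes['pix_y'], img.sizes['pix_x']
        yy, xx = np.mgrid[0:ny, 0:nx]
        dx = xx - self.center_x
        dy = yy - self.center_y
        r = np.hypot(dx, dy).ravel()                  # distance from beam center, in pixels
        chi = np.degrees(np.arctan2(dy, dx)).ravel()  # azimuth in degrees, -180..180
        values = img.transpose('pix_y', 'pix_x').values.ravel()

        r_edges = np.linspace(0, r.max(), self.n_r + 1)
        chi_edges = np.linspace(-180, 180, self.n_chi + 1)
        bins = [chi_edges, r_edges]
        total, _, _ = np.histogram2d(chi, r, bins=bins, weights=values)
        counts, _, _ = np.histogram2d(chi, r, bins=bins)
        with np.errstate(invalid='ignore', divide='ignore'):
            mean = total / counts  # NaN where no pixel falls into a bin

        return xr.DataArray(
            mean,
            dims=['chi', 'r'],
            coords={
                'chi': 0.5 * (chi_edges[:-1] + chi_edges[1:]),
                'r': 0.5 * (r_edges[:-1] + r_edges[1:]),
            },
            attrs=dict(img.attrs),
        )


flat = xr.DataArray(np.full((32, 32), 7.0), dims=['pix_y', 'pix_x'], attrs={'energy': 270.0})
polar = ToyPolarIntegrator(center_x=16, center_y=16).integrateSingleImage(flat)
print(polar.dims, polar.shape)
```

A matching integrateImageStack would, in this toy picture, just loop over the stack and concatenate the results.  The metadata rides along in attrs, so the output can be selected on in the same way as the raw data.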

We can integrate a whole stack of images really easily:

Note, because we're using the energy series integrator, we're internally pre-allocating integrator objects for each energy then using those for each image for speed.  All these details are abstracted.

In [None]:
int_xr = integrator.integrateImageStack(raw_xr)
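The per-energy reuse mentioned above can be pictured as a small cache keyed by energy.  The sketch below is a lazy variant (objects are built on first use instead of up front) and is purely illustrative: EnergyIntegratorCache and build_toy_integrator are hypothetical names, and the "integrator" is just a dict holding the wavelength.

```python
class EnergyIntegratorCache:
    """Hypothetical per-energy cache; not the library's implementation."""

    def __init__(self, build_integrator):
        self._build = build_integrator
        self._cache = {}
        self.n_built = 0

    def get(self, energy):
        # Round so that 270 and 270.0001 share one integrator object.
        key = round(float(energy), 3)
        if key not in self._cache:
            self._cache[key] = self._build(key)
            self.n_built += 1
        return self._cache[key]


def build_toy_integrator(energy_ev):
    # Stand-in for an expensive setup step; wavelength = hc / E with hc = 1.239842e-6 eV m.
    return {'energy': energy_ev, 'wavelength_m': 1.239842e-6 / energy_ev}


cache = EnergyIntegratorCache(build_toy_integrator)
for energy in [270, 270, 285, 285.0, 320]:
    cache.get(energy)
print(cache.n_built)
```

Five images at three distinct energies trigger only three builds, which is where the speedup for an energy series comes from.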

In [None]:
int_xr

A side warning here: if you are using my example data (see [Obtaining Example Data](../docs/source/getting_started/example_data.rst)), at this point the memory usage will be kind of insane (~20 GB peak, about 15.5 GB at rest).  We can clean up by tossing the raw data and integrator, which will take us down to something ~2 GB:

In [None]:
raw_xr = None
integrator = None
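If you want to check how much an array actually holds before and after cleanup, xarray reports it via nbytes.  The sketch below uses a small made-up stack (184 tiny frames, not the real image size) and the standard-library garbage collector.

```python
import gc

import numpy as np
import xarray as xr

# Made-up stand-in for a raw image stack: 184 frames of 64 x 64 float32 pixels.
raw_demo = xr.DataArray(
    np.zeros((184, 64, 64), dtype='float32'),
    dims=['system', 'pix_y', 'pix_x'],
)
size_mb = raw_demo.nbytes / 1e6
print(f'{size_mb:.2f} MB held by the stack')

# Drop the only reference, then ask Python to collect anything left over.
del raw_demo
gc.collect()
```

In a notebook there is one extra wrinkle: IPython keeps references to displayed cell outputs (in Out and in the _ variables), so an array that was shown as the last expression of a cell may stay in memory until the output cache is cleared, for example with %reset -f out.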

As with the raw data, we can slice this using standard select commands, and even plot using the core xarray plotting.

In [None]:
int_xr.sel(energy=270,polarization=100,exposure=1.002,method='nearest').sel(chi=0,method='nearest').plot(xscale='log',yscale='log')

We can use slicing to punch out energy scans, for example: (this gets a little gross, probably I need an xarray cleanup function)

In [None]:
pol100chi0 = int_xr.sel(polarization=100,chi=0,method='nearest')
pol100chi0

One crude step remains: remember that 'filenumber' axis we put in?  We need to get rid of it for auto-plotting to work, so with the system multiindex (a compound axis) unstacked as above, we select all values of the filenumber coordinate and a narrow exposure window.  We also need to put the array in the right order for auto-plotting.

In [None]:
pol100chi0 = pol100chi0.sel(filenumber=slice(0,500)).sel(exposure=slice(0.08,0.11))
pol100chi0 = pol100chi0.drop_vars('chi').sortby('energy').sortby('q')  # drop_vars replaces the deprecated .drop
pol100chi0
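The cleanup function wished for above could look something like the following.  This is a hypothetical helper (tidy_for_plot is not part of the library) and it assumes the array can be reduced to exactly two dimensions of length greater than one; the data are made up.

```python
import numpy as np
import xarray as xr


def tidy_for_plot(da, x='q', y='energy'):
    """Drop length-1 dimensions, sort both axes, and order dims as (y, x) for 2D plotting."""
    da = da.squeeze(drop=True)     # removes every length-1 dimension and its coordinate
    da = da.sortby(x).sortby(y)
    return da.transpose(y, x)      # raises if any other dimension is left over


messy = xr.DataArray(
    np.array([[[1.0], [2.0]], [[3.0], [4.0]]]),  # shape (q=2, energy=2, filenumber=1)
    dims=['q', 'energy', 'filenumber'],
    coords={'q': [0.02, 0.01], 'energy': [285.0, 270.0], 'filenumber': [94]},
)
tidy = tidy_for_plot(messy)
print(tidy.dims, tidy.values.tolist())
```

Letting transpose fail loudly when a third dimension survives is deliberate: it is a reminder that a selection is still missing, which beats silently plotting the wrong thing.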

In [None]:
from matplotlib.colors import LogNorm
pol100chi0.plot(norm=LogNorm(1e-1,5e5))

For fitting, we can use the xarray split-apply-combine paradigm to automate things nicely.
The array prep is skipped for now (so pol100e0p1 below is not built in this notebook), but here is the syntax.  This part is very much a work in progress.

In [None]:
import Fitting
from Fitting import fit_lorentz, fit_lorentz_bg  # both fit functions are used below

In [None]:
lor0p002_p100_e0p1_chi0 = (pol100e0p1
       .coarsen(chi=10).mean()        # either block-average chi (the number is how many
                                      # adjacent chi bins go into each average), or
       #.sel(chi=0,method='nearest')  # select a single chi stripe; if you do, leave 'chi' out of the stack
       .stack(echi=['energy','chi'])
       .sel(q=slice(0.0018,0.0025))
       .groupby('echi')
       .map(fit_lorentz_bg,guess=[0,0,0.0002,2e-8],pos_int_override=True)
       .unstack('echi'))
lor0p002_p100_e0p1_chim90 = (pol100e0p1
       #.coarsen(chi=15).mean()        # either block-average chi (the number is how many
                                       # adjacent chi bins go into each average), or
       .sel(chi=-90,method='nearest')  # select a single chi stripe; 'chi' is then left out of the stack
       .stack(echi=['energy'])#,'chi'])
       .sel(q=slice(0.0018,0.0025))
       .groupby('echi')
       .map(fit_lorentz,guess=[0,0,0.0002],pos_int_override=True)
       .unstack('echi'))
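Since the array prep is skipped here, the following self-contained sketch shows the same split-apply-combine idea on synthetic data.  It is not the library's fit_lorentz: lorentz and fit_lorentz_toy are hypothetical stand-ins, the peak positions and widths are made up, and the grouping is over a plain 'energy' dimension instead of a stacked 'echi' axis to keep the example short.

```python
import numpy as np
import xarray as xr
from scipy.optimize import curve_fit


def lorentz(q, amp, q0, width):
    """Lorentzian peak with height amp, center q0 and half width at half maximum width."""
    return amp * width**2 / ((q - q0) ** 2 + width**2)


def fit_lorentz_toy(group):
    """Fit one I(q) trace and return the parameters along a 'param' dimension."""
    q = group['q'].values
    y = group.values.ravel()
    guess = [y.max(), q[np.argmax(y)], 0.1]   # start from the tallest point
    popt, _ = curve_fit(lorentz, q, y, p0=guess)
    popt[2] = abs(popt[2])                    # the sign of the width is not determined
    return xr.DataArray(popt, dims=['param'], coords={'param': ['amp', 'q0', 'width']})


# Synthetic, noise-free data: one Lorentzian per energy with a shifting center.
q = np.linspace(0, 1, 101)
centers = {270.0: 0.4, 285.0: 0.6}
data = xr.DataArray(
    np.stack([lorentz(q, 2.0, c, 0.05) for c in centers.values()]),
    dims=['energy', 'q'],
    coords={'energy': list(centers), 'q': q},
)

# Split by energy, fit each trace, and combine the results into one array.
fits = data.groupby('energy').map(fit_lorentz_toy)
print(fits.sel(param='q0').values)
```

The result is indexed by energy and param, so fitted peak positions or widths can be pulled out with a single sel and plotted against energy.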

