# SACLA - RIXS example

This notebook provides simple example code for producing Resonant Inelastic X-Ray Scattering (RIXS) maps.

The basic steps are:
- select a suitable set of runs
- find the monochromator energies that were scanned
- produce the spectra for each energy, and the laser on/off maps

*The aim of this tutorial is not to produce extremely efficient code, but code that is as simple and as fast as possible, to support data quality evaluation during the beamtime and afterwards.*

We'll start with the usual set of imports:

In [1]:
import numpy as np
import matplotlib as mpl
# loading customized matplotlib style. If the file is not available, the defaults are kept
try:
    mpl.rcParams = mpl.rc_params_from_file("/swissfel/photonics/sala/sacla/utilities/matplotlibrc")
except Exception:
    pass

# using the NBAgg backend, which allows interactive plots in the Notebook
mpl.use("nbagg")

import matplotlib.pyplot as plt
import sys
import h5py
import pandas as pd

# Loading SACLA tools 
SACLA_LIB = "/swissfel/photonics/sala/sacla"
sys.path.append(SACLA_LIB)
import utilities as ut

# specific converters for the 2014-11 data taking. These should be customized per each beamtime!
from utilities import beamtime_converter_201411XX as sacla_converter

DIR = "/swissfel/photonics/data/2014-11-26_SACLA_ZnO/hdf5/"

Then, we define:
* the SACLA datasets
* $t_0$
* the runs to be analyzed

In [2]:
# Define SACLA quantities - they can change from beamtime to beamtime
daq_labels = {}
daq_labels["I0_down"] = "event_info/bl_3/eh_4/photodiode/photodiode_I0_lower_user_7_in_volt"
daq_labels["I0_up"] = "event_info/bl_3/eh_4/photodiode/photodiode_I0_upper_user_8_in_volt"
daq_labels["TFY"] = "event_info/bl_3/eh_4/photodiode/photodiode_sample_PD_user_9_in_volt"
daq_labels["photon_mono_energy"] = "event_info/bl_3/tc/mono_1_position_theta"
daq_labels["delay"] = "event_info/bl_3/eh_4/laser/delay_line_motor_29"
daq_labels["ND"] = "event_info/bl_3/eh_4/laser/nd_filter_motor_26"
daq_labels["photon_sase_energy"] = "event_info/bl_3/oh_2/photon_energy_in_eV"
daq_labels["x_status"] = "event_info/bl_3/eh_1/xfel_pulse_selector_status"
daq_labels["x_shut"] = "event_info/bl_3/shutter_1_open_valid_status"
daq_labels["laser_status"] = "event_info/bl_3/lh_1/laser_pulse_selector_status"
daq_labels["tags"] = "event_info/tag_number_list"

# the t0, to be found experimentally
t0 = 220.86

# these runs should correspond to delay=5ps
runs = [str(x) for x in range(258852, 258883)]
runs = sorted(runs)

In principle, a single run can contain *multiple mono settings*, so we need to load data from all the runs, and then group them by mono energy. `Pandas` can help us with that...

We load all data from the files, place it in a `DataFrame`, and then add some useful derived quantities. Finally, we use `tags` as the index of the `DataFrame`.

In [3]:
# create a DataFrame
df = pd.DataFrame(columns=daq_labels.keys(), )

for run in runs:
    mydict = {}  # temporary dict, where to store data
    fname = DIR + str(run) +"_roi.h5"  # the file name
    f = h5py.File(fname, "r")
    main_dset = f["run_" + str(run)]
    
    # Loading data from the specified datasets
    for k, v in daq_labels.items():
        if k == "delay":
            # delays are in motor steps
            mydict[k] = sacla_converter.convert("delay", main_dset[v][:], t0=t0)
        elif k == "photon_mono_energy":
            # mono energy settings are in motor steps
            mydict[k] = sacla_converter.convert("energy", main_dset[v][:])
        elif k == "photon_sase_energy":
            mydict[k + "_mean"] = main_dset[v][:].mean()
        else:
            mydict[k] = main_dset[v][:]
    
    tmp_df = pd.DataFrame(data=mydict)
    tmp_df["run"] = run
    # Append the data to the dataframe
    df = df.append(tmp_df)

# round mono energy and delay
df.photon_mono_energy = np.round(df.photon_mono_energy.values, decimals=3)
df.delay = np.round(df.delay.values, decimals=2)

# create total I0 and absorption coefficients
df["I0"] = df.I0_up + df.I0_down
df["absorp"] = df.TFY / df.I0
df["is_laser"] = (df['laser_status'] == 1)

# set tag number as index
df = df.set_index("tags")
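The per-run accumulation above relies on `DataFrame.append`, which was removed in pandas 2.0. As a hedged sketch, the same pattern can be written by collecting the per-run frames in a list and concatenating once at the end. The run numbers and values below are made up for illustration, and `fake_runs` is a hypothetical stand-in for the HDF5 content:

```python
import pandas as pd

# Hypothetical stand-in for the per-run HDF5 content: two runs, three events each
fake_runs = {
    "1001": {"tags": [10, 11, 12], "I0_up": [0.5, 0.6, 0.4],
             "I0_down": [0.5, 0.5, 0.6], "TFY": [0.1, 0.2, 0.1]},
    "1002": {"tags": [20, 21, 22], "I0_up": [0.7, 0.6, 0.5],
             "I0_down": [0.3, 0.4, 0.5], "TFY": [0.2, 0.2, 0.3]},
}

frames = []
for run, data in sorted(fake_runs.items()):
    tmp_df = pd.DataFrame(data=data)
    tmp_df["run"] = run
    frames.append(tmp_df)  # a plain Python list, not a DataFrame

# a single concatenation at the end, instead of growing the DataFrame in the loop
df_demo = pd.concat(frames, ignore_index=True)

# same derived quantities as in the cell above
df_demo["I0"] = df_demo.I0_up + df_demo.I0_down
df_demo["absorp"] = df_demo.TFY / df_demo.I0
df_demo = df_demo.set_index("tags")
```

Concatenating once also avoids copying the growing table at every iteration, which matters when many runs are loaded.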

The last preliminary step is to filter out garbage data. As a bonus, you can also find out at which `tag` the mono scan setting changed:

In [4]:
# preparing the is_data mask
is_data = (df.x_shut == 1) & (df.x_status == 1) & (df.photon_mono_energy > 9.6)
is_data = is_data & (df.I0_up > 0.01) & (df.I0_down > 0.01) & (df.ND > -1)

# filtering out garbage
df = df[is_data]

# getting quantities when a variable changes
tmp = np.diff(df.photon_mono_energy.values)
mask = tmp != 0
# the first event always starts a new setting
mask = np.insert(mask, 0, True)
# this is where it changes
print(df[mask].index.tolist())

[412626300.0, 412648260.0, 412659300.0, 412670340.0, 412681380.0, 412692480.0, 412703460.0, 412717500.0, 412728540.0, 412739580.0, 412761060.0, 412772100.0, 412783200.0, 412794240.0, 412816380.0, 412827420.0, 412838460.0, 412849500.0, 412860540.0, 412871580.0, 412886402.0, 412897440.0, 412908480.0, 412919520.0, 412941480.0, 412952580.0, 412963680.0, 412974840.0]
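An alternative way to find the change points, sketched here on a toy `Series` with made-up tags and energies, is to compare each value with the previous one using `shift()`. This should be equivalent to the difference-based mask used above:

```python
import pandas as pd

# toy series: the index plays the role of the tag number, the values of the mono energy
energy = pd.Series([9.65, 9.65, 9.66, 9.66, 9.66, 9.67],
                   index=[100, 102, 104, 106, 108, 110])

# True where the value differs from the previous event;
# the first event is compared with NaN, so it is always marked as a change
changed = energy != energy.shift()

change_tags = energy[changed].index.tolist()
print(change_tags)  # [100, 104, 110]
```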


Now we can run the analysis. For each energy value and each run, a *list of tags* is created, such that the events have the same mono energy and are part of the same run (as each run is in a separate file). For each of these lists, we run the `AnalysisProcessor` and create the required spectra, for laser on and off.
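The grouping into per-run, per-energy tag lists can also be sketched with a pandas `groupby`. The small `DataFrame` below is synthetic; only the column names follow the ones used in this notebook:

```python
import pandas as pd

# synthetic events: the index is the tag number
df_demo = pd.DataFrame(
    {
        "run": ["1001", "1001", "1001", "1002", "1002"],
        "photon_mono_energy": [9.65, 9.65, 9.66, 9.66, 9.66],
        "is_laser": [True, False, True, False, True],
    },
    index=pd.Index([10, 11, 12, 20, 21], name="tags"),
)

# one list of tags for each (run, energy) pair that actually occurs
tag_lists = {
    key: group.index.tolist()
    for key, group in df_demo.groupby(["run", "photon_mono_energy"])
}

for (run, energy), tags in sorted(tag_lists.items()):
    print(run, energy, tags)
```

Only the combinations present in the data are produced, so no explicit check for empty selections is needed in this form.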

In [5]:
%%time
# the mono energies contained in the files
energies_list = sorted(df.photon_mono_energy.unique().tolist())

# The AnalysisProcessor
an = ut.analysis.AnalysisProcessor()
# if you want a flat dict as a result
an.flatten_results = True
    
# add analysis
an.add_analysis("image_get_spectra", args={'axis': 1, })  #'thr_low': thr,})
an.add_analysis("image_get_mean_std", )  #args={'thr_low': thr})
#bins = np.arange(-150, 300, 5)
#an.add_analysis("image_get_histo_adu", args={'bins': bins})

# set the dataset, and the threshold
dataset_name = "detector_2d_1"
thr = 70
an.set_sacla_dataset(dataset_name)

# add preprocess steps
an.add_preprocess("image_set_thr", args={'thr_low': thr})
        
# run the analysis
n_events = -1
spectrum_on = None
spectrum_off = None

fnames = [DIR + str(run) +"_roi.h5" for run in runs]

#reload(ut)
#reload(ut.analysis)



[INFO] Setting a new dataset, removing stored preprocess functions. To overcome this, use remove_preprocess=False
CPU times: user 7 ms, sys: 0 ns, total: 7 ms
Wall time: 6.4 ms


In [6]:
#%%time
# initialization of the RIXS maps
rixs_map_on = np.zeros((len(energies_list), 1024))
rixs_map_off = np.zeros((len(energies_list), 1024))

for run in runs:
    # mono loop
    for energy in energies_list:
        df_run = df[df.run == run]
        energy_mask = df_run[df_run.photon_mono_energy == energy]
        fname = DIR + str(run) +"_roi.h5"  # the file name
        
        # if no values, just skip
        if len(energy_mask.index.values) == 0:
            continue
        
        # run the analysis
        results = an.analyze_images(fname, n=n_events, tags=energy_mask.index.values)
        
        # create a laser on/off mask, and spectrum
        laseron_mask = energy_mask.is_laser.values[:n_events]
        print results["spectra"][laseron_mask].shape, laseron_mask.shape, df_run.shape, df_run["I0"][laseron_mask].shape
        spectrum_on = np.nansum(results["spectra"][laseron_mask] / df_run[laseron_mask]["I0"], axis=0)
        rixs_map_on[energies_list.index(energy)] += spectrum_on
        spectrum_off = np.nansum(results["spectra"][~laseron_mask] / df_run[~laseron_mask]["I0"], axis=0)
        rixs_map_off[energies_list.index(energy)] += spectrum_off


(2642, 1024) (5307,) (5308, 15) (2642,)


ValueError: Item wrong length 5307 instead of 5308.
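The `ValueError` above is consistent with the laser mask being one element shorter than `df_run`: with `n_events = -1`, the slice `[:n_events]` drops the last event instead of keeping all of them. The sketch below reproduces the pitfall on synthetic arrays and shows one possible way around it. The helper `take_events` is hypothetical, and it is an assumption that the analysis returns one spectrum per requested tag. The sketch also normalizes each spectrum by its own I0 by adding an axis, so that the per-event values broadcast over the pixel axis:

```python
import numpy as np

def take_events(values, n_events):
    """Return the first n_events entries, or all of them if n_events is negative."""
    return values if n_events < 0 else values[:n_events]

n_events = -1
is_laser = np.array([True, False, True, False, True])
i0 = np.array([1.0, 2.0, 4.0, 2.0, 1.0])
spectra = np.ones((5, 3))  # 5 events, 3 pixels (synthetic)

short_mask = is_laser[:n_events]                 # length 4: the last event is lost
laseron_mask = take_events(is_laser, n_events)   # length 5: all events kept

# (n_on,) -> (n_on, 1), so that the division broadcasts against (n_on, n_pixels)
spectrum_on = np.nansum(
    spectra[laseron_mask] / i0[laseron_mask][:, np.newaxis], axis=0)
spectrum_off = np.nansum(
    spectra[~laseron_mask] / i0[~laseron_mask][:, np.newaxis], axis=0)
```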

**BONUS:** To save some time, we can try a trivial parallelization of the analysis, using the `multiprocessing` module. On a busy multicore system, this should take about 2 minutes with 4 or more parallel jobs, compared to about 5 minutes with the simple loop. To do this, we need to restructure the loop slightly: the outer loop runs over the energies, and the parallelism is applied to the runs (i.e. to the files).

One difference with respect to the simple loop is that here we call `an`, and not `an.analyze_images`: this is due to some technicalities of the `multiprocessing` module regarding pickling and unpickling, which are of no interest here. Just keep in mind that *for an AnalysisProcessor object calling* `an()` *or* `an.analyze_images()` *is the same*.
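As a generic illustration of how an object can be made callable in this way, the sketch below defines a toy class whose `__call__` forwards to a regular method, and submits the object itself with `apply_async`. `ToyProcessor` and its `analyze` method are hypothetical and are not the real `AnalysisProcessor`. A `ThreadPool` is used so that the sketch runs without any pickling; with a process-based `Pool`, the callable and its arguments must be picklable:

```python
from multiprocessing.pool import ThreadPool

class ToyProcessor(object):
    """Hypothetical stand-in for an analysis object."""

    def analyze(self, fname, n=-1, tags=None):
        tags = [] if tags is None else list(tags)
        return {"fname": fname, "n_tags": len(tags)}

    def __call__(self, fname, n=-1, tags=None):
        # calling the object simply forwards to the method
        return self.analyze(fname, n=n, tags=tags)

proc = ToyProcessor()

pool = ThreadPool(processes=2)
# the object itself is passed as the function to run
jobs = [pool.apply_async(proc, ("run_%d.h5" % i, -1, range(i))) for i in range(3)]
pool.close()
pool.join()

results = [job.get() for job in jobs]
```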

In [None]:
%%time
from multiprocessing import Pool
from multiprocessing.pool import ApplyResult

# initialization of the RIXS maps
rixs_map_on = np.zeros((len(energies_list), 1024))
rixs_map_off = np.zeros((len(energies_list), 1024))


n_events = -1
spectrum_on = None
spectrum_off = None

for i, energy in enumerate(energies_list):
    async_results = []  # list for results

    energy_masks = []
    # creating the pool
    pool = Pool(processes=4)
    # looping on the runs
    for j, run in enumerate(runs):
        df_run = df[df.run == run]
        energy_masks.append(df_run[df_run.photon_mono_energy == energy])
        # apply the analysis 
        #print energy_masks[-1].index.values
        async_results.append(pool.apply_async(an, (fnames[j], n_events, energy_masks[j].index.values)))

    # closing the pool
    pool.close()
    
    # waiting for all results
    results = [r.get() for r in async_results]
    
    print("Got results for energy %s" % energy)
    
    # producing the laser on/off maps
    for j, run in enumerate(runs):
        
        if "spectra" not in results[j]:
            continue

        energy_mask = energy_masks[j]
        # n_events = -1 is taken to mean "all events": slicing with [:-1] would drop the last one
        laseron_mask = energy_mask.is_laser.values
        if n_events >= 0:
            laseron_mask = laseron_mask[:n_events]
        spectrum_on = np.nansum(results[j]["spectra"][laseron_mask], axis=0)
        rixs_map_on[energies_list.index(energy)] += spectrum_on
        spectrum_off = np.nansum(results[j]["spectra"][~laseron_mask], axis=0)
        rixs_map_off[energies_list.index(energy)] += spectrum_off

As a check, let's plot the last spectra, with and without smoothing:

In [None]:
from scipy.signal import savgol_filter

plt.figure()
plt.plot(spectrum_on, label="on")
plt.plot(spectrum_off, label="off")
plt.plot(spectrum_on - spectrum_off, label="on - off")
plt.plot(savgol_filter(spectrum_on - spectrum_off, 11, 3), "k", label="on - off smoothed", linewidth=2)
# show the labels defined above
plt.legend()
plt.show()
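To get a feeling for what the Savitzky-Golay smoothing does, here is a minimal sketch on a synthetic noisy peak, using the same window length (11) and polynomial order (3) as above. The peak shape and the noise level are arbitrary assumptions, chosen only for illustration:

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(0)

# synthetic spectrum: a broad Gaussian peak plus white noise
x = np.arange(1024)
clean = 100.0 * np.exp(-0.5 * ((x - 512) / 60.0) ** 2)
noisy = clean + rng.normal(scale=5.0, size=x.size)

# window of 11 points, local polynomial of order 3
smoothed = savgol_filter(noisy, 11, 3)

# scatter around the known clean signal, before and after smoothing
residual_noisy = np.std(noisy - clean)
residual_smoothed = np.std(smoothed - clean)
print(residual_noisy, residual_smoothed)
```

A wider window reduces the noise further, but it can also flatten features that are narrower than the window.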

That's basically it! Now we can plot the unsmoothed RIXS map:

In [None]:
plt.figure()
ax = plt.subplot(111)

imgplot = ax.imshow(rixs_map_on - rixs_map_off,
                    origin="lower",
                    # extent is (left, right, bottom, top): spectrum columns on x, energy in eV on y
                    extent=(0, rixs_map_on.shape[1], 1000 * energies_list[0], 1000 * energies_list[-1]),
                    aspect="auto",
                    #interpolation="bilinear",
                    cmap="bwr",
                    vmax=3000, vmin=-3000
                    )

ax.get_yaxis().set_major_formatter( mpl.ticker.FuncFormatter(lambda x, p: format(int(x), ',')))
plt.colorbar(imgplot)
plt.title("RIXS On - Off, delay=5ps")
plt.ylabel("Incoming energy (eV)")
plt.show()

and its smoothed version:

In [None]:
plt.figure()
ax = plt.subplot(111)

rixs_map = np.zeros(rixs_map_on.shape)

for e in range(rixs_map.shape[0]):
    rixs_map[e] = savgol_filter(rixs_map_on[e] - rixs_map_off[e], 11, 3)

imgplot = ax.imshow(rixs_map,
                    origin="lower",
                    # extent is (left, right, bottom, top): spectrum columns on x, energy in eV on y
                    extent=(0, rixs_map.shape[1], 1000 * energies_list[0], 1000 * energies_list[-1]),
                    aspect="auto",
                    #interpolation="bilinear",
                    cmap="bwr",
                    vmax=3000, vmin=-3000
                    )

ax.get_yaxis().set_major_formatter( mpl.ticker.FuncFormatter(lambda x, p: format(int(x), ',')))
plt.colorbar(imgplot)
plt.title("RIXS On - Off, delay=5ps (smoothed)")
plt.ylabel("Incoming energy (eV)")
plt.show()

We can also try a 3D version:

In [None]:
from mpl_toolkits.mplot3d import Axes3D

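fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# X runs along the spectrum axis (columns), Y along the energy index (rows)
cols = np.arange(rixs_map.shape[1])
rows = np.arange(rixs_map.shape[0])
X, Y = np.meshgrid(cols, rows)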
surf = ax.plot_surface(X, Y, rixs_map, rstride=1, cstride=1, 
        linewidth=0, antialiased=False, cmap="seismic")
fig.colorbar(surf, shrink=0.5, aspect=5)

plt.show()