# Sans2d data reduction

## Introduction

This notebook gives a concise overview of how to use the `esssans` package with Sciline, using the data reduction of a Sans2d experiment as an example.
We begin with relevant imports:

In [None]:
import numpy as np
import scipp as sc
import sciline
import scippneutron as scn
import plopp as pp
import esssans as sans
from esssans.types import *

## Define reduction parameters

We define a dictionary containing the reduction parameters, with keys and types given by aliases or types defined in `esssans.types`:

In [None]:
params = sans.sans2d.default_parameters.copy()

params[FileList[BackgroundRun]] = ['SANS2D00063159.hdf5']
params[FileList[TransmissionRun[BackgroundRun]]] = params[FileList[BackgroundRun]]
params[FileList[SampleRun]] = ['SANS2D00063114.hdf5']
params[FileList[TransmissionRun[SampleRun]]] = params[FileList[SampleRun]]
params[FileList[EmptyBeamRun]] = ['SANS2D00063091.hdf5']
params[DirectBeamFilename] = 'DIRECT_SANS2D_REAR_34327_4m_8mm_16Feb16.hdf5'
params[OutFilename] = 'reduced.nxs'

band = sc.linspace('wavelength', 2.0, 16.0, num=2, unit='angstrom')
params[WavelengthBands] = band
params[WavelengthBins] = sc.linspace(
    'wavelength', start=band[0], stop=band[-1], num=141
)

params[sans.sans2d.LowCountThreshold] = sc.scalar(100, unit='counts')

mask_interval = sc.array(dims=['wavelength'], values=[2.21, 2.59], unit='angstrom')
params[WavelengthMask] = sc.DataArray(
    sc.array(dims=['wavelength'], values=[True]),
    coords={'wavelength': mask_interval},
)

params[QBins] = sc.linspace(dim='Q', start=0.01, stop=0.6, num=141, unit='1/angstrom')
params[NonBackgroundWavelengthRange] = sc.array(
    dims=['wavelength'], values=[0.7, 17.1], unit='angstrom'
)
params[CorrectForGravity] = True
params[UncertaintyBroadcastMode] = UncertaintyBroadcastMode.upper_bound
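
As a quick sanity check on the binning above: a `linspace` call with `num=141` yields 141 bin *edges*, and therefore 140 bins, so the 2 to 16 angstrom range is divided into bins that are 0.1 angstrom wide. The sketch below reproduces that arithmetic in plain Python, without scipp. The helper name `linspace_edges` is made up for illustration, and it assumes both endpoints are included.

```python
def linspace_edges(start, stop, num):
    """Return `num` evenly spaced values from start to stop, both inclusive."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


edges = linspace_edges(2.0, 16.0, 141)

# N edges delimit N - 1 bins.
n_bins = len(edges) - 1
bin_width = (edges[-1] - edges[0]) / n_bins

print(n_bins, round(bin_width, 6))
```

The same reasoning applies to `QBins`, which is also defined with `num=141` and hence describes 140 bins in $Q$.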

## Create pipeline using Sciline

We use all providers available in `esssans` as well as the `sans2d`-specific providers, which include I/O and mask setup specific to the [Sans2d](https://www.isis.stfc.ac.uk/Pages/sans2d.aspx) instrument:

In [None]:
providers = sans.providers + sans.sans2d.providers
pipeline = sciline.Pipeline(providers, params=params)
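
For readers new to this style of pipeline, the following toy sketch illustrates the general idea of type-keyed providers: each provider is a function, the type it returns identifies what it produces, and the types of its arguments identify what it needs. This is only an illustration in plain Python and is not how Sciline is implemented. `ToyPipeline`, `Filename`, `RawCounts`, `TotalCounts`, `load`, and `total` are all hypothetical names.

```python
from typing import NewType, get_type_hints

# Hypothetical domain types, standing in for keys such as those in esssans.types.
Filename = NewType('Filename', str)
RawCounts = NewType('RawCounts', list)
TotalCounts = NewType('TotalCounts', int)


def load(filename: Filename) -> RawCounts:
    # Stand-in for file I/O: fabricate counts from the length of the filename.
    return RawCounts(list(range(len(filename))))


def total(raw: RawCounts) -> TotalCounts:
    return TotalCounts(sum(raw))


class ToyPipeline:
    """Toy resolver that only illustrates the concept (not Sciline's code)."""

    def __init__(self, providers, params):
        # Index each provider by the type named in its return annotation.
        self._providers = {get_type_hints(p)['return']: p for p in providers}
        self._params = dict(params)

    def compute(self, key):
        # Parameters are returned directly; everything else is built
        # by recursively computing the provider's annotated arguments.
        if key in self._params:
            return self._params[key]
        provider = self._providers[key]
        hints = get_type_hints(provider)
        args = {
            name: self.compute(tp) for name, tp in hints.items() if name != 'return'
        }
        return provider(**args)


toy = ToyPipeline([load, total], {Filename: Filename('run.hdf5')})
answer = toy.compute(TotalCounts)
print(answer)
```

Requesting `TotalCounts` makes the toy resolver call `load` first and then `total`, without the caller ever spelling out that order. The parameter dictionary plays the same role as `params` above: it supplies the values that no provider can compute.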

## Use the pipeline

### Compute final result

We can get the graph for computing the background-subtracted $I(Q)$:

In [None]:
iofq = pipeline.get(BackgroundSubtractedIofQ)

Before we compute the result, we can visualize the pipeline:

In [None]:
# left-right layout works better for this graph
iofq.visualize(graph_attr={'rankdir': 'LR'})

Now we can compute the result:

In [None]:
result = iofq.compute()
result.plot()

Above, we used an upper bound for the uncertainties of the normalization factors.
We can also compute the result with dropped normalization-factor uncertainties.
This is incorrect, but is useful for understanding whether the normalization factors significantly contribute to the uncertainty of the result:

In [None]:
params[UncertaintyBroadcastMode] = UncertaintyBroadcastMode.drop
pipeline_drop = sciline.Pipeline(providers, params=params)
result_drop = pipeline_drop.compute(BackgroundSubtractedIofQ)
sc.DataGroup(upper_bound=result, dropped=result_drop).plot(norm='log')
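
To make the difference between the two modes more concrete, here is a plain-Python sketch of first-order error propagation for a ratio `a_i / b`, where one normalization value `b` is shared by `N` data points. It rests on an assumption about what the modes mean rather than on the `esssans` source: `drop` is taken to ignore the variance of `b`, and `upper_bound` is taken to scale the variance of `b` by `N` to allow for the correlations that broadcasting introduces. The function `ratio_variance` and all numbers are hypothetical.

```python
def ratio_variance(a, var_a, b, var_b):
    """First-order variance of a / b for uncorrelated a and b."""
    return var_a / b**2 + (a**2 / b**4) * var_b


# Hypothetical data: N counts sharing a single normalization value.
counts = [100.0, 400.0, 900.0]
var_counts = counts[:]  # Poisson statistics: variance equals the count
norm, var_norm = 50.0, 50.0
n = len(counts)

# Assumed meaning of 'drop': treat the normalization as exact.
dropped = [ratio_variance(a, va, norm, 0.0) for a, va in zip(counts, var_counts)]

# Assumed meaning of 'upper_bound': inflate the variance of the shared
# normalization by the number of elements it is broadcast to.
upper = [
    ratio_variance(a, va, norm, n * var_norm) for a, va in zip(counts, var_counts)
]

for d, u in zip(dropped, upper):
    print(f"dropped={d:.4f}  upper_bound={u:.4f}")
```

If the two sets of variances are close, the normalization term contributes little to the final uncertainty, which is the question the comparison plot above is meant to answer.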

### Save reduced data to file

`esssans` provides a function for saving the reduced data as an [NXcanSAS](https://manual.nexusformat.org/classes/applications/NXcanSAS.html) file.
It could be used directly with the `result` computed above, but we would have to provide the required metadata ourselves.
Instead, we use Sciline to get all required information directly from the pipeline (see also the [File output](https://scipp.github.io/sciline/recipes/recipes.html#File-output) docs):

In [None]:
from esssans.io import save_background_subtracted_iofq

pipeline.bind_and_call(save_background_subtracted_iofq)

### Compute intermediate results

For inspection and debugging purposes we can also compute intermediate results.
To avoid repeated computation (including costly loading of files) we can request multiple results at once, including the final result, if desired.
For example:

In [None]:
monitors = (
    WavelengthMonitor[SampleRun, Incident],
    WavelengthMonitor[SampleRun, Transmission],
    WavelengthMonitor[BackgroundRun, Incident],
    WavelengthMonitor[BackgroundRun, Transmission],
)
parts = (CleanSummedQ[SampleRun, Numerator], CleanSummedQ[SampleRun, Denominator])
iofqs = (IofQ[SampleRun], IofQ[BackgroundRun], BackgroundSubtractedIofQ)
keys = monitors + (MaskedData[SampleRun],) + parts + iofqs

results = pipeline.compute(keys)

display(sc.plot({str(key): results[key] for key in monitors}, norm='log'))

display(
    scn.instrument_view(
        results[MaskedData[SampleRun]].hist(),
        pixel_size=0.0075,
        norm='log',
        camera=pp.graphics.Camera(position=(0, 0, 22)),
    )
)

parts = {str(key): results[key] for key in parts}
parts = {key: val if val.bins is None else val.hist() for key, val in parts.items()}
display(sc.plot(parts, norm='log'))

iofqs = {str(key): results[key] for key in iofqs}
iofqs = {key: val if val.bins is None else val.hist() for key, val in iofqs.items()}
display(sc.plot(iofqs, norm='log'))
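
The benefit of requesting several keys in one call can be illustrated without Sciline. In the toy sketch below, two results depend on the same expensive `load` step. Evaluating them in a single request runs `load` once, whereas two separate requests run it twice. All names are hypothetical, and this is not how Sciline schedules its work.

```python
calls = {'load': 0}


def load():
    # Stand-in for costly file loading; counts how often it runs.
    calls['load'] += 1
    return [1, 2, 3, 4]


def total(data):
    return sum(data)


def maximum(data):
    return max(data)


def compute_many(funcs):
    """Evaluate every function against a single shared load() result."""
    data = load()
    return {f.__name__: f(data) for f in funcs}


# One request for both results: load() runs once.
results_together = compute_many([total, maximum])
calls_together = calls['load']

# Two separate requests: load() runs once per request.
calls['load'] = 0
results_separate = {**compute_many([total]), **compute_many([maximum])}
calls_separately = calls['load']

print(calls_together, calls_separately)
```

Both approaches give the same values; only the number of times the shared step is executed differs.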

### Avoiding duplicate computation with parameter tables

We have seen above that we can avoid duplicate computation by requesting multiple results at once.
However, this is not always possible, for example if we want to compute the final result with different parameter values.
In that case we can use a parameter table instead.
For example, we can compute the final result with different modes for handling the uncertainties of the normalization factors.
This avoids loading the files repeatedly, as well as repeating some of the computation steps:

In [None]:
from typing import NewType

Mode = NewType('Mode', str)
param_table = sciline.ParamTable(
    Mode,
    {
        UncertaintyBroadcastMode: [
            UncertaintyBroadcastMode.upper_bound,
            UncertaintyBroadcastMode.drop,
        ]
    },
    index=[Mode('upper_bound'), Mode('drop')],
)
del params[UncertaintyBroadcastMode]
pl = sciline.Pipeline(providers, params=params)
pl.set_param_table(param_table)
results = pl.compute(sciline.Series[Mode, BackgroundSubtractedIofQ])
sc.DataGroup(results).plot(norm='log')

## Wavelength bands

We can also compute $I(Q)$ inside a set of wavelength bands, instead of using the full wavelength range in one go.
This is useful for debugging purposes.

To achieve this, we need to turn the `WavelengthBands` parameter into a two-dimensional variable,
representing the wavelength range for each band.

In [None]:
band = np.linspace(2.0, 16.0, num=11)
params[WavelengthBands] = sc.array(
    dims=['band', 'wavelength'],
    values=np.vstack([band[:-1], band[1:]]).T,
    unit='angstrom',
)
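
For readers unfamiliar with the `vstack(...).T` idiom used above: it pairs each band edge with its successor, which gives one `[low, high]` row per band. The sketch below shows the same pairing in plain Python, using 11 evenly spaced edges to obtain 10 bands. The variable names are illustrative only.

```python
num_edges = 11
start, stop = 2.0, 16.0
step = (stop - start) / (num_edges - 1)
edges = [start + i * step for i in range(num_edges)]

# Pair each edge with the next one: one (low, high) tuple per band.
bands = list(zip(edges[:-1], edges[1:]))

for low, high in bands:
    print(f"{low:.1f} to {high:.1f} angstrom")
```

Adjacent bands share an edge, so together they cover the full range from 2 to 16 angstrom without gaps or overlap.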

We then need to re-build the pipeline with the updated parameters

In [None]:
# Re-add the parameter that was deleted for the parameter-table example
params[UncertaintyBroadcastMode] = UncertaintyBroadcastMode.upper_bound

pipeline = sciline.Pipeline(providers, params=params)

and compute the result

In [None]:
result = pipeline.compute(BackgroundSubtractedIofQ)
result

The result is now two-dimensional, so we overplot all the bands on the same axes:

In [None]:
pp.plot(sc.collapse(result, keep='Q'), norm='log')