# Sans2d data reduction

## Introduction

This notebook gives a concise overview of how to use the `esssans` package with Sciline, using the data reduction of a Sans2d experiment as an example.
We begin with relevant imports:

In [None]:
import scipp as sc
import sciline
import plopp as pp
import esssans as sans
import esssans.isis as isis
from esssans.types import *

## Define reduction parameters

We define the reduction parameters, with keys and types given by aliases or types defined in `esssans.types`:

In [None]:
params = {}

params[DirectBeamFilename] = 'DIRECT_SANS2D_REAR_34327_4m_8mm_16Feb16.dat'
params[Filename[SampleRun]] = 'SANS2D00063114.nxs'
params[Filename[BackgroundRun]] = 'SANS2D00063159.nxs'
params[Filename[EmptyBeamRun]] = 'SANS2D00063091.nxs'
params[OutFilename] = 'reduced.nxs'

params[NeXusMonitorName[Incident]] = 'monitor2'
params[NeXusMonitorName[Transmission]] = 'monitor4'

params[sans.isis.SampleOffset] = sc.vector([0.0, 0.0, 0.053], unit='m')
params[sans.isis.MonitorOffset[Transmission]] = sc.vector([0.0, 0.0, -6.719], unit='m')

params[WavelengthBins] = sc.linspace(
    'wavelength', start=2.0, stop=16.0, num=141, unit='angstrom'
)

params[sans.isis.sans2d.LowCountThreshold] = sc.scalar(100, unit='counts')

mask_interval = sc.array(dims=['wavelength'], values=[2.21, 2.59], unit='angstrom')
params[WavelengthMask] = sc.DataArray(
    sc.array(dims=['wavelength'], values=[True]),
    coords={'wavelength': mask_interval},
)

params[QBins] = sc.linspace(dim='Q', start=0.01, stop=0.6, num=141, unit='1/angstrom')
params[NonBackgroundWavelengthRange] = sc.array(
    dims=['wavelength'], values=[0.7, 17.1], unit='angstrom'
)
params[CorrectForGravity] = True
params[UncertaintyBroadcastMode] = UncertaintyBroadcastMode.upper_bound
params[sans.ReturnEvents] = True
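
Both `WavelengthBins` and `QBins` are given as bin edges, so `num=141` yields 140 bins. The following pure-Python sketch makes the arithmetic explicit. The helper `linspace_edges` is hypothetical and only mirrors the values that `sc.linspace` produces; it is not part of `scipp` or `esssans`.

```python
def linspace_edges(start: float, stop: float, num: int) -> list[float]:
    """Return `num` evenly spaced values from start to stop (hypothetical helper)."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


# Same numbers as used for WavelengthBins above (in angstrom).
wavelength_edges = linspace_edges(2.0, 16.0, 141)

# N edges delimit N - 1 bins.
n_bins = len(wavelength_edges) - 1
bin_width = (16.0 - 2.0) / n_bins
print(n_bins, round(bin_width, 6))
```

With these values each wavelength bin is 0.1 angstrom wide.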

## Create pipeline using Sciline

We use all providers available in `esssans` as well as the `isis` and `sans2d`-specific providers, which include I/O and mask setup specific to the [Sans2d](https://www.isis.stfc.ac.uk/Pages/sans2d.aspx) instrument:

In [None]:
providers = (
    sans.providers + isis.providers + isis.data.providers + isis.sans2d.providers
)
providers = providers + (
    sans.transmission_from_background_run,
    sans.transmission_from_sample_run,
)

pipeline = sciline.Pipeline(providers=providers, params=params)

## Use the pipeline

### Compute final result

We can get the graph for computing the background-subtracted $I(Q)$:

In [None]:
iofq = pipeline.get(BackgroundSubtractedIofQ)

Before we compute the result, we can visualize the pipeline:

In [None]:
# left-right layout works better for this graph
iofq.visualize(graph_attr={'rankdir': 'LR'})

Now we can compute the result:

In [None]:
result = iofq.compute()
result.hist().plot(scale={'Q': 'log'}, norm='log')

As the result was computed in event-mode, we can also use a different $Q$-binning, without re-reducing the data:

In [None]:
result.hist(Q=60).plot(scale={'Q': 'log'}, norm='log')
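
The reason this works can be illustrated without `scipp`: as long as the individual events (a $Q$ value and a weight each) are kept, any binning can be applied afterwards. The sketch below uses made-up events and hypothetical helpers `linspace` and `histogram`; it is a toy model of event-mode data, not the `scipp` implementation.

```python
import bisect
import random


def linspace(start, stop, num):
    """Return `num` evenly spaced values from start to stop."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def histogram(values, weights, edges):
    """Sum event weights into the bins given by sorted `edges`."""
    counts = [0.0] * (len(edges) - 1)
    for value, weight in zip(values, weights):
        # Events outside the edge range are ignored.
        if edges[0] <= value < edges[-1]:
            counts[bisect.bisect_right(edges, value) - 1] += weight
    return counts


random.seed(0)
# Made-up events: one Q value and one weight per event.
q_events = [random.uniform(0.02, 0.5) for _ in range(1000)]
weights = [1.0] * len(q_events)

# The events are kept, so different binnings can be applied afterwards.
fine = histogram(q_events, weights, linspace(0.01, 0.6, 141))  # 140 bins
coarse = histogram(q_events, weights, linspace(0.01, 0.6, 61))  # 60 bins
print(len(fine), len(coarse), sum(fine) == sum(coarse))
```

Both histograms contain the same total weight; only the resolution in $Q$ differs.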

In the above we used an upper bound for the uncertainties of the normalization factors.
We can also compute the result with dropped normalization-factor uncertainties.
Dropping these uncertainties is incorrect, but it is useful for judging whether the normalization factors contribute significantly to the uncertainty of the result:

In [None]:
pipeline[UncertaintyBroadcastMode] = UncertaintyBroadcastMode.drop
result_drop = pipeline.compute(BackgroundSubtractedIofQ)
# Reset the UncertaintyBroadcastMode to the old value
pipeline[UncertaintyBroadcastMode] = UncertaintyBroadcastMode.upper_bound
sc.DataGroup(upper_bound=result, dropped=result_drop).hist().plot(norm='log')
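
To see why the two modes bracket the correct answer, consider a single normalization factor $b$ that is shared by several events. The sketch below uses made-up numbers and first-order error propagation for $a_i / b$. It assumes that the upper bound is obtained by inflating the variance of $b$ by the number of events it is broadcast to; this is an assumption made for illustration, not a description of the `esssans` implementation.

```python
# Made-up event weights with Poisson-like variances.
a = [1.0, 2.0, 3.0, 4.0]
var_a = [1.0, 2.0, 3.0, 4.0]
# A single normalization factor shared by all events, with its variance.
b, var_b = 10.0, 1.0
n = len(a)


def summed_variance(var_b_eff: float) -> float:
    """Variance of sum(a_i / b), treating all terms as independent."""
    return sum(va / b**2 + ai**2 * var_b_eff / b**4 for ai, va in zip(a, var_a))


dropped = summed_variance(0.0)  # ignore the uncertainty of b
upper = summed_variance(n * var_b)  # assumed upper bound: inflate var_b by n
# Reference value that accounts for b being shared (fully correlated).
exact = sum(var_a) / b**2 + sum(a) ** 2 * var_b / b**4
print(dropped <= exact <= upper)
```

In this toy model, dropping the variances underestimates the uncertainty of the sum, while the inflated variance overestimates it.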

### Save reduced data to file

`esssans` provides a function for saving the reduced data as an [NXcanSAS](https://manual.nexusformat.org/classes/applications/NXcanSAS.html) file.
It could be used directly with the `result` computed above, but we would have to provide the required metadata ourselves.
Instead, we use Sciline to get all required information directly from the pipeline (see also the [File output](https://scipp.github.io/sciline/recipes/recipes.html#File-output) docs):

In [None]:
from esssans.io import save_background_subtracted_iofq

pipeline.bind_and_call(save_background_subtracted_iofq)

### Compute intermediate results

For inspection and debugging purposes we can also compute intermediate results.
To avoid repeated computation (including costly loading of files) we can request multiple results at once, including the final result, if desired.
For example:

In [None]:
from esssans.isis import plot_flat_detector_xy

monitors = (
    WavelengthMonitor[SampleRun, Incident],
    WavelengthMonitor[SampleRun, Transmission],
    WavelengthMonitor[BackgroundRun, Incident],
    WavelengthMonitor[BackgroundRun, Transmission],
)
parts = (CleanSummedQ[SampleRun, Numerator], CleanSummedQ[SampleRun, Denominator])
iofqs = (IofQ[SampleRun], IofQ[BackgroundRun], BackgroundSubtractedIofQ)
keys = monitors + (MaskedData[SampleRun],) + parts + iofqs

results = pipeline.compute(keys)

display(sc.plot({str(key): results[key] for key in monitors}, norm='log'))

display(
    plot_flat_detector_xy(
        results[MaskedData[SampleRun]]['spectrum', :61440].hist(), norm='log'
    )
)

wavelength = pipeline.compute(WavelengthBins)
display(
    results[CleanSummedQ[SampleRun, Numerator]]
    .hist(wavelength=wavelength)
    .transpose()
    .plot(norm='log')
)
display(results[CleanSummedQ[SampleRun, Denominator]].plot(norm='log'))
parts = {str(key): results[key].sum('wavelength') for key in parts}
display(sc.plot(parts, norm='log'))

iofqs = {str(key): results[key] for key in iofqs}
iofqs = {key: val if val.bins is None else val.hist() for key, val in iofqs.items()}
display(sc.plot(iofqs, norm='log'))
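
The benefit of requesting several results in one call can be illustrated with a toy model. The sketch below is a hypothetical stand-in written in plain Python (the functions `load` and `compute` and the file name `run.nxs` are made up); it is not how Sciline is implemented, but it shows the same effect: intermediate results are shared within one request and recomputed across separate requests.

```python
calls = {'load': 0}  # counts how often the (pretend) expensive load runs


def load(filename: str) -> list[float]:
    """Hypothetical stand-in for a costly file load."""
    calls['load'] += 1
    return [1.0, 2.0, 3.0]


def compute(keys: list[str], filename: str) -> dict[str, float]:
    """Compute all requested keys, loading the file at most once per call."""
    cache: dict[str, list[float]] = {}

    def data() -> list[float]:
        if 'data' not in cache:
            cache['data'] = load(filename)
        return cache['data']

    providers = {'total': lambda: sum(data()), 'maximum': lambda: max(data())}
    return {key: providers[key]() for key in keys}


# Two separate requests load the file twice ...
compute(['total'], 'run.nxs')
compute(['maximum'], 'run.nxs')
separate_loads = calls['load']

# ... while a single request for both results loads it only once.
calls['load'] = 0
both = compute(['total', 'maximum'], 'run.nxs')
combined_loads = calls['load']
print(separate_loads, combined_loads)
```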

### Avoiding duplicate computation with parameter tables

We have seen above that Sciline avoids duplicate computation when we request multiple results at once.
However, this is not always possible, for example if we want to compute the final result with different parameters.
In this case we can use parameter tables to avoid duplicate computation.
For example, we can compute the final result with different modes for handling the uncertainties of the normalization factors.
This avoids reloading files and repeating those computation steps that do not depend on the mode:

In [None]:
from typing import NewType

Mode = NewType('Mode', str)
param_table = sciline.ParamTable(
    Mode,
    {
        UncertaintyBroadcastMode: [
            UncertaintyBroadcastMode.upper_bound,
            UncertaintyBroadcastMode.drop,
        ]
    },
    index=[Mode('upper_bound'), Mode('drop')],
)
pipeline.set_param_table(param_table)
results = pipeline.compute(sciline.Series[Mode, BackgroundSubtractedIofQ])
sc.DataGroup(results).hist().plot(norm='log')

## Wavelength bands

We can also compute $I(Q)$ inside a set of wavelength bands, instead of using the full wavelength range in one go.
This is useful for debugging purposes.

To achieve this, we need to supply the `WavelengthBands` parameter (as a two-dimensional variable),
representing the wavelength range for each band.

In [None]:
pipeline[WavelengthBands] = sc.linspace(
    'wavelength', start=2.0, stop=16.0, num=11, unit='angstrom'
)
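
The cell above passes 11 one-dimensional edges. Assuming that consecutive edges delimit the bands (an assumption about how the edges are interpreted, not something confirmed here), they describe 10 bands, each with a lower and an upper wavelength. The plain-Python sketch below shows the equivalent two-dimensional layout:

```python
# The same 11 edges as above, from 2 to 16 angstrom.
edges = [2.0 + i * (16.0 - 2.0) / 10 for i in range(11)]

# Pair consecutive edges: one (lower, upper) row per band.
bands = list(zip(edges[:-1], edges[1:]))

print(len(bands))
for lower, upper in bands[:3]:
    print(f'{lower:.1f} to {upper:.1f} angstrom')
```

Each band is 1.4 angstrom wide, and the upper bound of one band is the lower bound of the next.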

Compute the result:

In [None]:
result = pipeline.compute(BackgroundSubtractedIofQ)
result

The result is two-dimensional and we over-plot all the bands onto the same axes:

In [None]:
pp.plot(sc.collapse(result.hist(), keep='Q'), norm='log')

## Loading local files

The data files used above are hosted on an external server, and are downloaded on the fly (and cached) when computing the result.

It is also possible to load local files from your hard drive, by using the `DataFolder` parameter.
We also need to insert the `isis.io.to_path` provider which supplies the file paths to the files in the folder.

As an example, we will save our current direct beam to disk, and then re-load it using a pipeline that reads local files.

**Note** that it is not currently possible to mix local and cloud files in the same pipeline with the present setup.

In [None]:
# Direct beam computation currently uses the `get_path` provider which
# fetches files from the remote server
direct_beam = pipeline.get(DirectBeam)
direct_beam.visualize()

In [None]:
# Save the direct beam to disk
db_filename = 'my_local_direct_beam_file.h5'
direct_beam.compute().save_hdf5(db_filename)

We now modify our pipeline by setting the `DataFolder` parameter,
as well as our new direct beam filename. Finally, we insert the local file provider `to_path`.

In [None]:
pipeline[DataFolder] = '.'
pipeline[DirectBeamFilename] = db_filename

# Insert provider for local files
pipeline.insert(isis.io.to_path)

We can now see that `to_path` uses both the file name and the local folder to create a file path:

In [None]:
db_local = pipeline.get(DirectBeam)
db_local.visualize()
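
Conceptually, such a provider only needs to join the folder and the file name. The sketch below uses a hypothetical function `to_path_sketch` built on `pathlib`; the actual `isis.io.to_path` provider may differ in signature and behavior.

```python
from pathlib import Path


def to_path_sketch(filename: str, folder: str) -> Path:
    """Hypothetical stand-in: join a data folder and a file name."""
    return Path(folder) / filename


# Same folder and file name as set on the pipeline above.
path = to_path_sketch('my_local_direct_beam_file.h5', '.')
print(path)
```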

In [None]:
db_local.compute().plot()