# DREAM advanced data reduction

- Audience: Instrument (data) scientists, instrument users
- Prerequisites: Basic knowledge of [Scipp](https://scipp.github.io/), [Sciline](https://scipp.github.io/sciline/)

This notebook builds on the [basic powder workflow](./dream-data-reduction.rst) and shows how to use the workflow to compute other results and how to swap in alternative processing steps.

This notebook uses the same data as the basic notebook, a McStas + GEANT4 simulation.
The data is available through the ESSdiffraction package, but accessing it requires the `pooch` package.
If you get an error about a missing module `pooch`, you can install it with `!pip install pooch`.

In [None]:
import scipp as sc
from ess import dream, powder
import ess.dream.data  # noqa: F401
from ess.powder.types import *

## Compute intensity as a function of scattering angle

The basic notebook sums over all detector voxels and produces a 1D curve.
Here, we instead bin by scattering angle $2\theta$.

First, define the same workflow as in the [basic example](./dream-data-reduction.rst#create_and_configure_the_workfow):

In [None]:
workflow = dream.DreamGeant4Workflow(run_norm=powder.RunNormalization.monitor_histogram)

workflow[Filename[SampleRun]] = dream.data.simulated_diamond_sample()
workflow[Filename[VanadiumRun]] = dream.data.simulated_vanadium_sample()
workflow[Filename[BackgroundRun]] = dream.data.simulated_empty_can()
workflow[CalibrationFilename] = None

workflow[MonitorFilename[SampleRun]] = dream.data.simulated_monitor_diamond_sample()
workflow[MonitorFilename[VanadiumRun]] = dream.data.simulated_monitor_vanadium_sample()
workflow[MonitorFilename[BackgroundRun]] = dream.data.simulated_monitor_empty_can()
workflow[CaveMonitorPosition] = sc.vector([0.0, 0.0, -4220.0], unit="mm")

workflow[dream.InstrumentConfiguration] = dream.InstrumentConfiguration.high_flux
# Select a detector bank:
workflow[NeXusDetectorName] = "mantle"
# We drop uncertainties where they would otherwise lead to correlations:
workflow[UncertaintyBroadcastMode] = UncertaintyBroadcastMode.drop
# Edges for binning in d-spacing:
workflow[DspacingBins] = sc.linspace("dspacing", 0.3, 2.3434, 201, unit="angstrom")

# Do not mask any pixels / voxels:
workflow = powder.with_pixel_mask_filenames(workflow, [])

And then add the desired bin edges for $2\theta$:

In [None]:
workflow[TwoThetaBins] = sc.linspace(
    dim="two_theta", unit="rad", start=0.8, stop=2.4, num=201
)
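
As a quick sanity check on these edges, the following plain-Python sketch works out how many bins 201 edges produce, how wide each bin is, and which angular range they cover in degrees. It does not use Scipp, and the variable names are only illustrative.

```python
import math

# Same values as passed to sc.linspace for TwoThetaBins above.
start, stop, num_edges = 0.8, 2.4, 201

num_bins = num_edges - 1  # N edges delimit N - 1 bins
bin_width_rad = (stop - start) / num_bins
bin_width_deg = math.degrees(bin_width_rad)

print(f"{num_bins} bins, each {bin_width_rad:.4f} rad ({bin_width_deg:.3f} deg) wide")
print(f"range: {math.degrees(start):.1f} deg to {math.degrees(stop):.1f} deg")
```

Changing `num` in the `TwoThetaBins` cell trades angular resolution against counts per bin.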

Now we can compute the intensity as a function of $2\theta$ and $d$-spacing by requesting `IofDspacingTwoTheta`:

In [None]:
grouped_dspacing = workflow.compute(IofDspacingTwoTheta)
grouped_dspacing

In [None]:
grouped_dspacing.hist().plot(norm="log")

## Alternative run normalizations

The [basic example](./dream-data-reduction.rst) normalizes the detector data by a monitor that was histogrammed in wavelength.
ESSdiffraction provides alternatives, which are selected with the `run_norm` argument of the workflow constructor.

### Normalize by integrated monitor

Instead of computing a histogram of the monitor data, we can integrate over all bins to get a single intensity value for the monitor.
To do so, specify `ess.powder.RunNormalization.monitor_integrated` when constructing the workflow.
This will insert [normalize_by_monitor_integrated](../../generated/modules/ess.powder.correction.normalize_by_monitor_integrated.rst) into the workflow.

In [None]:
workflow = dream.DreamGeant4Workflow(run_norm=powder.RunNormalization.monitor_integrated)

Then set all parameters as before:

In [None]:
workflow[Filename[SampleRun]] = dream.data.simulated_diamond_sample()
workflow[Filename[VanadiumRun]] = dream.data.simulated_vanadium_sample()
workflow[Filename[BackgroundRun]] = dream.data.simulated_empty_can()
workflow[CalibrationFilename] = None

workflow[MonitorFilename[SampleRun]] = dream.data.simulated_monitor_diamond_sample()
workflow[MonitorFilename[VanadiumRun]] = dream.data.simulated_monitor_vanadium_sample()
workflow[MonitorFilename[BackgroundRun]] = dream.data.simulated_monitor_empty_can()
workflow[CaveMonitorPosition] = sc.vector([0.0, 0.0, -4220.0], unit="mm")

workflow[dream.InstrumentConfiguration] = dream.InstrumentConfiguration.high_flux
# Select a detector bank:
workflow[NeXusDetectorName] = "mantle"
# We drop uncertainties where they would otherwise lead to correlations:
workflow[UncertaintyBroadcastMode] = UncertaintyBroadcastMode.drop
# Edges for binning in d-spacing:
workflow[DspacingBins] = sc.linspace("dspacing", 0.3, 2.3434, 201, unit="angstrom")

# Do not mask any pixels / voxels:
workflow = powder.with_pixel_mask_filenames(workflow, [])

And compute the result:

In [None]:
result = workflow.compute(IofTof)
result.hist().plot()
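
To make the difference between the two monitor normalizations concrete, here is a toy sketch in plain Python with made-up counts. It is not the ESSdiffraction implementation, which also handles units, uncertainties, and binning; it only assumes that detector and monitor share the same four wavelength bins.

```python
# Toy detector and monitor counts in four wavelength bins (made-up numbers).
detector = [10.0, 40.0, 30.0, 20.0]
monitor = [100.0, 400.0, 300.0, 200.0]

# Histogram normalization: divide bin by bin, which divides out the
# wavelength dependence of the monitor spectrum.
by_histogram = [d / m for d, m in zip(detector, monitor)]

# Integrated normalization: divide every bin by the same monitor total,
# which rescales the detector data but keeps its spectral shape.
monitor_total = sum(monitor)
by_integrated = [d / monitor_total for d in detector]

print("by histogram: ", by_histogram)
print("by integrated:", by_integrated)
```

In this toy case the bin-by-bin result is flat because the detector spectrum was chosen proportional to the monitor spectrum, whereas the integrated result still follows the shape of the detector counts.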

### Normalize by proton charge

We can normalize the detector data by the accumulated proton charge.
This works similarly to normalizing by a monitor, but we pass `ess.powder.RunNormalization.proton_charge` when building the workflow.
This will insert [normalize_by_proton_charge](../../generated/modules/ess.powder.correction.normalize_by_proton_charge.rst) into the workflow.

In [None]:
workflow = dream.DreamGeant4Workflow(run_norm=powder.RunNormalization.proton_charge)

workflow[Filename[SampleRun]] = dream.data.simulated_diamond_sample()
workflow[Filename[VanadiumRun]] = dream.data.simulated_vanadium_sample()
workflow[Filename[BackgroundRun]] = dream.data.simulated_empty_can()
workflow[CalibrationFilename] = None

workflow[dream.InstrumentConfiguration] = dream.InstrumentConfiguration.high_flux
# Select a detector bank:
workflow[NeXusDetectorName] = "mantle"
# We drop uncertainties where they would otherwise lead to correlations:
workflow[UncertaintyBroadcastMode] = UncertaintyBroadcastMode.drop
# Edges for binning in d-spacing:
workflow[DspacingBins] = sc.linspace("dspacing", 0.3, 2.3434, 201, unit="angstrom")

# Do not mask any pixels / voxels:
workflow = powder.with_pixel_mask_filenames(workflow, [])

And compute the result as normal:

In [None]:
result = workflow.compute(IofTof)
result.hist().plot()

We can also inspect the workflow graph to see that the proton charge normalization has been inserted:

In [None]:
workflow.visualize(IofTof, graph_attr={"rankdir": "LR"})
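
The idea behind proton-charge normalization can be sketched with made-up numbers: dividing by the accumulated charge makes runs of different duration comparable. The values and names below are hypothetical, and the sketch ignores units and uncertainties that the real correction has to handle.

```python
# Two toy runs of the same sample; the second ran three times as long.
counts_short = [5.0, 20.0, 10.0]
counts_long = [15.0, 60.0, 30.0]
charge_short = 2.0  # accumulated proton charge, arbitrary units
charge_long = 6.0

# Divide each run by its own accumulated charge.
norm_short = [c / charge_short for c in counts_short]
norm_long = [c / charge_long for c in counts_long]

print(norm_short)
print(norm_long)
```

After normalization both runs give the same values, even though the raw counts differ by a factor of three.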

## Compute intermediate results

For inspection and debugging purposes, we can also compute intermediate results.
To avoid repeated computation (including costly loading of files), we can request multiple results at once, including the final result, if desired.
For example:

In [None]:
intermediates = workflow.compute(
    (
        DataWithScatteringCoordinates[SampleRun],
        MaskedData[SampleRun],
    )
)

intermediates[DataWithScatteringCoordinates[SampleRun]]

In [None]:
two_theta = sc.linspace("two_theta", 0.8, 2.4, 301, unit="rad")
intermediates[MaskedData[SampleRun]].hist(
    two_theta=two_theta, wavelength=300
).plot(norm="log")
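
Why requesting several results in one call saves work can be pictured with a small dependency graph in plain Python. This is a hypothetical toy, not how Sciline is implemented: the function names, the `GRAPH` table, and the `compute` helper are all invented for illustration. The point is only that a shared upstream step (here, loading) runs once per call.

```python
load_count = 0  # counts how often the "expensive" load step runs


def load_events():
    global load_count
    load_count += 1
    return [0.5, 1.2, 1.9, 2.6, 3.1]


def with_coords(events):
    # Attach a made-up second coordinate to every event.
    return [(tof, 2.0 * tof) for tof in events]


def masked(coords):
    # Drop events above a made-up threshold.
    return [pair for pair in coords if pair[0] < 3.0]


# Each node maps to (function, names of its inputs).
GRAPH = {
    "events": (load_events, ()),
    "with_coords": (with_coords, ("events",)),
    "masked": (masked, ("with_coords",)),
}


def compute(targets):
    cache = {}  # shared between all targets of this one call

    def resolve(name):
        if name not in cache:
            func, deps = GRAPH[name]
            cache[name] = func(*(resolve(dep) for dep in deps))
        return cache[name]

    return {name: resolve(name) for name in targets}


# One call for both results: the load step runs once.
results = compute(("with_coords", "masked"))
loads_joint = load_count

# Two separate calls: the load step runs once per call.
compute(("with_coords",))
compute(("masked",))
loads_separate = load_count - loads_joint

print(loads_joint, loads_separate)
```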

## Process all detector banks

The previous sections use only a single detector bank.
In practice, we want to process all banks.
This section demonstrates how to do so for all banks except the SANS detector, which requires a different workflow.

We construct the workflow as before but this time **without specifying a detector name**:

In [None]:
workflow = dream.DreamGeant4Workflow(run_norm=powder.RunNormalization.monitor_histogram)

workflow[Filename[SampleRun]] = dream.data.simulated_diamond_sample()
workflow[Filename[VanadiumRun]] = dream.data.simulated_vanadium_sample()
workflow[Filename[BackgroundRun]] = dream.data.simulated_empty_can()
workflow[CalibrationFilename] = None

workflow[MonitorFilename[SampleRun]] = dream.data.simulated_monitor_diamond_sample()
workflow[MonitorFilename[VanadiumRun]] = dream.data.simulated_monitor_vanadium_sample()
workflow[MonitorFilename[BackgroundRun]] = dream.data.simulated_monitor_empty_can()
workflow[CaveMonitorPosition] = sc.vector([0.0, 0.0, -4220.0], unit="mm")

workflow[dream.InstrumentConfiguration] = dream.InstrumentConfiguration.high_flux
# We drop uncertainties where they would otherwise lead to correlations:
workflow[UncertaintyBroadcastMode] = UncertaintyBroadcastMode.drop
# Edges for binning in d-spacing:
workflow[DspacingBins] = sc.linspace("dspacing", 0.3, 2.3434, 201, unit="angstrom")

# Do not mask any pixels / voxels:
workflow = powder.with_pixel_mask_filenames(workflow, [])

Now, we [map](https://scipp.github.io/sciline/user-guide/parameter-tables.html) the workflow over the desired detector names to apply it to each bank separately.
We could do this at some intermediate step, but it is easiest to map the final result.
Finally, we stack the data arrays for the individual detectors into a single data array.

In [None]:
detector_names = ["mantle", "endcap_forward", "endcap_backward", "high_resolution"]
mapped = workflow[IofTof].map({NeXusDetectorName: detector_names})
workflow[IofTof] = mapped.reduce(func=powder.grouping.stack_detectors)

Now compute the result:

In [None]:
result = workflow.compute(IofTof)

In [None]:
result

We can plot the detectors individually with the help of `sc.collapse`:

In [None]:
split = sc.DataGroup({
    da.coords['detector'].value: da
    for da in sc.collapse(result, keep='tof').values()
})
split.hist().plot()

Or we sum over detector banks:

In [None]:
result.hist(tof=300).plot()
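
The map, stack, and sum steps of this section can be pictured with plain Python lists. This is only a sketch of the data flow with made-up counts: `reduce_bank` is a hypothetical stand-in for running the full workflow on one bank, and the real steps operate on Sciline workflows and Scipp data arrays rather than lists.

```python
def reduce_bank(name):
    """Stand-in for reducing one detector bank (made-up counts per tof bin)."""
    toy_counts = {
        "mantle": [4.0, 8.0, 2.0],
        "endcap_forward": [1.0, 3.0, 1.0],
        "endcap_backward": [2.0, 2.0, 0.0],
        "high_resolution": [0.0, 1.0, 1.0],
    }
    return toy_counts[name]


detector_names = ["mantle", "endcap_forward", "endcap_backward", "high_resolution"]

# "Map": run the same reduction once per bank.
per_bank = {name: reduce_bank(name) for name in detector_names}

# "Reduce": stack the per-bank results into one table (bank x tof bin).
stacked = [per_bank[name] for name in detector_names]

# Summing over the bank axis leaves one value per tof bin.
summed = [sum(column) for column in zip(*stacked)]

print(summed)
```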