Save tof lookup table as a dataclass that also contains chopper info#279
Save tof lookup table as a dataclass that also contains chopper info#279
Conversation
docs/user-guide/tof/dream.ipynb
Outdated
| "table = lut_wf.compute(TimeOfFlightLookupTable)\n", | ||
| "\n", | ||
| "# Overlay mean on the figure above\n", | ||
| "table[\"distance\", 13].plot(ax=fig2.ax, color=\"C1\", ls=\"-\", marker=None)" | ||
| "table[\"data\"][\"distance\", 13].plot(ax=fig2.ax, color=\"C1\", ls=\"-\", marker=None)" |
There was a problem hiding this comment.
Before reviewing further, I really wonder if TimeOfFlightLookupTable should be a (data)class instead of a DataGroup? What do you think?
There was a problem hiding this comment.
Or maybe even pydantic could be useful here, since it is about ser/deser.
There was a problem hiding this comment.
DataGroup makes it very easy to save to hdf5.
What do we gain from using a dataclass instead?
There was a problem hiding this comment.
Seems much cleaner (and avoids the NewType stuff), since we have fixed structure with expected fields. You can write proper docstrings or field descriptions.
You can still convert to a DataGroup for ser/deser.
There was a problem hiding this comment.
You can still convert to a DataGroup for ser/deser.
So you would do something like sc.DataGroup(asdict(table)).save_hdf5('file.h5') ?
Ok, I'll have a go at that.
I feel pydantic would be a bit overkill.
There was a problem hiding this comment.
Any good suggestions for a name for the field I named "data"?
There was a problem hiding this comment.
data? Once you have it as a field name it will be very easy to rename via the IDE anyway ;)
There was a problem hiding this comment.
Or maybe it should be lut : TimeOfFlightLUT, and the dataclass should should have higher level name since it is not just the LUT itself?
There was a problem hiding this comment.
I'm going with 'array', I think it's quite common to use that for the thing that actually holds the data.
| """Save the lookup table to an HDF5 file.""" | ||
| sc.DataGroup(asdict(self)).save_hdf5(filename) | ||
|
|
||
| def plot(self, *args, **kwargs) -> Any: |
There was a problem hiding this comment.
It was a good suggestion to switch to a (data)class. I was able to add these methods so that changes are now very minimal to existing workflow.
I also made the loader compatible with the old file format so that we don't break all existing notebooks/workflows 👍
SimonHeybrock
left a comment
There was a problem hiding this comment.
👍 (see optional comment)
src/ess/reduce/time_of_flight/lut.py
Outdated
| out = { | ||
| "array": table, | ||
| "pulse_period": pulse_period, | ||
| "pulse_stride": pulse_stride, | ||
| "distance_resolution": table.coords["distance"][1] | ||
| - table.coords["distance"][0], | ||
| "time_resolution": table.coords["event_time_offset"][1] | ||
| - table.coords["event_time_offset"][0], | ||
| "error_threshold": error_threshold, | ||
| } | ||
|
|
||
| if simulation.choppers is not None: | ||
| out['choppers'] = sc.DataGroup( | ||
| {k: sc.DataGroup(ch.as_dict()) for k, ch in simulation.choppers.items()} | ||
| ) | ||
|
|
||
| return TimeOfFlightLookupTable(**out) |
There was a problem hiding this comment.
Why do we need the intermediate dict? Is there some internal default used that would not work if we set choppers=None?
We change the format of the tof lookup table from a DataArray to a dataclass, which enables us to also save the chopper info as a nested DataGroup (we convert to/from a DataGroup for serialization/deserialization).
The chopper info is crucial to have alongside the table, so we can add validation when trying to use the table, that it applies for the correct chopper settings.
Edit: found a way to make this compatible with the old format, so we don't break existing workflow and we can continue to use existing tables.
See #212 for a discussion as to why in the end I went with scipp.DataGroup/hdf5 instead of a NeXus file.
Needs scipp/scippneutron#649 & scipp/tof#105