Skip to content

Save tof lookup table as a dataclass that also contains chopper info#279

Merged
nvaytet merged 15 commits intomainfrom
datagroup-tof-lut
Nov 28, 2025
Merged

Save tof lookup table as a dataclass that also contains chopper info#279
nvaytet merged 15 commits intomainfrom
datagroup-tof-lut

Conversation

@nvaytet
Copy link
Member

@nvaytet nvaytet commented Nov 13, 2025

We change the format of the tof lookup table from a DataArray to a dataclass, which enables us to also save the chopper info as a nested DataGroup (we convert to/from a DataGroup for serialization/deserialization).

The chopper info is crucial to have alongside the table, so we can add validation when trying to use the table, that it applies for the correct chopper settings.

⚠️ Note that unfortunately, this will break all workflows that currently use a tof lookup table stored in a hdf file.

Edit: found a way to make this compatible with the old format, so we don't break existing workflow and we can continue to use existing tables.

See #212 for a discussion as to why in the end I went with scipp.DataGroup/hdf5 instead of a NeXus file.

Needs scipp/scippneutron#649 & scipp/tof#105

@nvaytet nvaytet marked this pull request as ready for review November 27, 2025 18:47
Comment on lines 414 to 417
"table = lut_wf.compute(TimeOfFlightLookupTable)\n",
"\n",
"# Overlay mean on the figure above\n",
"table[\"distance\", 13].plot(ax=fig2.ax, color=\"C1\", ls=\"-\", marker=None)"
"table[\"data\"][\"distance\", 13].plot(ax=fig2.ax, color=\"C1\", ls=\"-\", marker=None)"
Copy link
Member

@SimonHeybrock SimonHeybrock Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before reviewing further, I really wonder if TimeOfFlightLookupTable should be a (data)class instead of a DataGroup? What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe even pydantic could be useful here, since it is about ser/deser.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DataGroup makes it very easy to save to hdf5.
What do we gain from using a dataclass instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems much cleaner (and avoids the NewType stuff), since we have fixed structure with expected fields. You can write proper docstrings or field descriptions.

You can still convert to a DataGroup for ser/deser.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can still convert to a DataGroup for ser/deser.

So you would do something like sc.DataGroup(asdict(table)).save_hdf5('file.h5') ?
Ok, I'll have a go at that.
I feel pydantic would be a bit overkill.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any good suggestions for a name for the field I named "data"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

data? Once you have it as a field name it will be very easy to rename via the IDE anyway ;)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe it should be lut : TimeOfFlightLUT, and the dataclass should should have higher level name since it is not just the LUT itself?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going with 'array', I think it's quite common to use that for the thing that actually holds the data.

"""Save the lookup table to an HDF5 file."""
sc.DataGroup(asdict(self)).save_hdf5(filename)

def plot(self, *args, **kwargs) -> Any:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a good suggestion to switch to a (data)class. I was able to add these methods so that changes are now very minimal to existing workflow.

I also made the loader compatible with the old file format so that we don't break all existing notebooks/workflows 👍

@nvaytet nvaytet changed the title Save tof lookup table as a Datagroup that also contains chopper info Save tof lookup table as a dataclass that also contains chopper info Nov 28, 2025
Copy link
Member

@SimonHeybrock SimonHeybrock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 (see optional comment)

Comment on lines 375 to 391
out = {
"array": table,
"pulse_period": pulse_period,
"pulse_stride": pulse_stride,
"distance_resolution": table.coords["distance"][1]
- table.coords["distance"][0],
"time_resolution": table.coords["event_time_offset"][1]
- table.coords["event_time_offset"][0],
"error_threshold": error_threshold,
}

if simulation.choppers is not None:
out['choppers'] = sc.DataGroup(
{k: sc.DataGroup(ch.as_dict()) for k, ch in simulation.choppers.items()}
)

return TimeOfFlightLookupTable(**out)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need the intermediate dict? Is there some internal default used that would not work if we set choppers=None?

@nvaytet nvaytet enabled auto-merge November 28, 2025 11:47
@nvaytet nvaytet merged commit bced1a3 into main Nov 28, 2025
4 checks passed
@nvaytet nvaytet deleted the datagroup-tof-lut branch November 28, 2025 11:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants