Add batch_processor and batch_compute for batch reduction #177
```diff
-def scale_reflectivity_curves_to_overlap(
-    curves: Sequence[sc.DataArray],
+def scale_for_reflectivity_overlap(
+    reflectivities: sc.DataArray | Mapping[str, sc.DataArray] | sc.DataGroup,
```
Allowing DataGroup is a nice improvement of the interface.
So is only returning the scaling factors instead of both the scaled curves and the factors.
The latest update, after in-person discussion, is that we will merge the changes from #155 in this PR. The helper is seen as a function most users should call. I added a simpler notebook to the docs for Amor that uses the
```python
        Filename[SampleRun]: amor.data.amor_run(610),
    },
    '611': {
        SampleRotationOffset[SampleRun]: sc.scalar(0.05, unit='deg'),
```
It's probably good if the docs mention that you can pass a list of values to `Filename[SampleRun]`.
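For illustration, the list-of-filenames convention mentioned here can be sketched with a pure-Python stand-in (the string keys and filenames below are made up; in the real workflow the key is the `Filename[SampleRun]` type and the paths come from `amor.data.amor_run`):

```python
# Toy stand-in for the per-run parameter table: plain strings play the
# role of the real Filename[SampleRun] keys, and the filenames are
# hypothetical. A list of filenames for one entry means: concatenate
# the event lists from those runs.
runs = {
    "610": {"Filename[SampleRun]": "run610.hdf"},
    "611": {"Filename[SampleRun]": ["run611.hdf", "run612.hdf"]},
}


def filenames_for(run: str) -> list[str]:
    """Normalize a single filename or a list of filenames to a list."""
    value = runs[run]["Filename[SampleRun]"]
    return value if isinstance(value, list) else [value]
```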
src/ess/reflectometry/tools.py
```python
    return datasets
for fname in set(reference_filenames.values()):
    wf[Filename[ReferenceRun]] = fname
    reference_results[fname] = wf.compute(ReducedReference)
```
Two `ReducedReference` results are not guaranteed to be the same even if they have the same file name. Other parameters also affect the reference, for example the wavelength interval, the detector region of interest, etc.
But it's true that those usually stay constant across all runs that use the same reference file, so this will likely be correct most of the time. Since it's not guaranteed, though, it's probably better to let the user make the decision to cache or not.
> But since it's not guaranteed it's probably better to let the user make that decision.
How do we achieve that? Just let the user cache the reference on the workflow manually as is currently done in the notebook?
> How do we achieve that? Just let the user cache the reference on the workflow manually as is currently done in the notebook?
Yes that's what I had in mind.
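The manual-caching pattern agreed on here can be sketched as follows. This is a minimal stand-in, assuming a sciline-style pipeline where a value set explicitly on the workflow takes precedence over recomputing it from providers; `Workflow` and the `"ReducedReference"` key are toy names, not the real essreflectometry types:

```python
class Workflow:
    """Toy stand-in for a sciline-style pipeline: values that are set
    explicitly take precedence over (re)computing from providers."""

    def __init__(self, providers):
        self.providers = providers  # target -> zero-arg callable
        self.values = {}
        self.n_computed = 0  # counts provider invocations, to show caching

    def __setitem__(self, target, value):
        self.values[target] = value

    def compute(self, target):
        if target in self.values:
            return self.values[target]
        self.n_computed += 1
        return self.providers[target]()


wf = Workflow({"ReducedReference": lambda: "reduced-reference-data"})
# Cache the expensive reference once, then reuse it for every sample run:
wf["ReducedReference"] = wf.compute("ReducedReference")
for _run in ("608", "609", "610"):
    wf.compute("ReducedReference")  # served from the cached value
```

The point of leaving this to the user is visible in the toy: the provider runs exactly once, and the user decides when (or whether) the cached value is still valid.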
```python
for name, wf in self.workflows.items():
    try:
        out[t][name] = wf.compute(t, **kwargs)
    except sl.UnsatisfiedRequirement as e:
```
When does this exception occur?
This happens for example if you are trying to compute `DetectorData[SampleRun]` but your workflow has been mapped over multiple files. The single `DetectorData` is no longer in the workflow; only the names of the mapped nodes will be (those are obtained using `sciline.get_mapped_node_names`).
Basically, I think that in the old version of the helper (in #155), if you gave a tuple of filenames, and then computed `DetectorData[SampleRun]` (or any target before the event lists are concatenated), then it would fail with `sl.UnsatisfiedRequirement` saying it can't find the type in the pipeline.
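The guarded loop in the diff above can be mimicked with plain Python to show when the exception fires. This is a toy sketch: `UnsatisfiedRequirement` here is a stand-in for `sciline.UnsatisfiedRequirement`, and each "workflow" is just a dict of computable targets, so a target that was mapped away simply raises:

```python
class UnsatisfiedRequirement(Exception):
    """Stand-in for sciline.UnsatisfiedRequirement."""


class ToyWorkflow:
    def __init__(self, available):
        self.available = available  # target name -> value

    def compute(self, target):
        if target not in self.available:
            raise UnsatisfiedRequirement(f"cannot find {target!r} in the pipeline")
        return self.available[target]


workflows = {
    "plain": ToyWorkflow({"DetectorData[SampleRun]": "events"}),
    # Mapped over several files: the single DetectorData node no longer
    # exists, only the mapped nodes (cf. sciline.get_mapped_node_names).
    "mapped": ToyWorkflow({"ReflectivityOverQ": "curve"}),
}

out = {}
for name, wf in workflows.items():
    try:
        out[name] = wf.compute("DetectorData[SampleRun]")
    except UnsatisfiedRequirement:
        out[name] = None  # the batch processor can skip or fall back here
```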
Aha I see 👍
So if the target is upstream of the reduction operation (for example `DetectorData[SampleRun]`), does it now return a list of results, one for each file?
LGTM 👌
src/ess/reflectometry/tools.py
```python
    reflectivity curve for each of the provided runs.
scale_to_overlap: bool
| tuple[sc.Variable, sc.Variable]
| list[sc.Variable, sc.Variable] = False,
```
(minor) I don't think `list[T, T]` is a valid Python type annotation.
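To illustrate the point: `list[...]` takes exactly one element type (a homogeneous list of arbitrary length), while `tuple[...]` is the form that takes one type per position, so a fixed-length pair annotation would normally use `tuple` only. A sketch with builtin types standing in for `sc.Variable`:

```python
from typing import get_args

# Valid: a fixed-length pair, one type per position.
Pair = tuple[float, float]

# Valid: homogeneous list of arbitrary length, one element type.
Floats = list[float]

# list[float, float] is not rejected at runtime (runtime does not
# validate subscripts), but static type checkers flag it, because
# list takes a single type parameter.

assert get_args(Pair) == (float, float)
assert get_args(Floats) == (float,)
```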
This is an alternative to #166.

Instead of trying to create a master workflow with the new `groupby` mechanism from Cyclebane/Sciline, we go back to making a class that holds a collection of workflows. Reasons behind the motivation:

- We needed `iloc` in many places.
- `groupby` for the rotation angle is not really supported out of the box, because the rotation angle is loaded from file, not input by the user in the parameter table.
- `groupby` does not work out-of-the-box with Scipp variables, so we had to convert them to float in a new column before being able to group.

I still believe that `groupby` is a great and essential addition to Sciline. However, all the reasons above led to an implementation that got very messy, and still did not support all the intended behaviour for the `BatchProcessor`.

The desired behaviour is roughly:

Caching intermediate results:

Concatenating event lists from multiple runs is done by supplying a list of filenames for one entry (this is what was done by `groupby` in the aforementioned PR).

Update:

In addition to the initial changes here, we included the helper introduced in #155 to compute targets in one go, without having to handle a multi-workflow `BatchProcessor` object; the function `from_measurements` has been renamed to `batch_compute`. The two approaches complement each other nicely, and `batch_compute` is aimed at more novice users of the reduction workflow.
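The relationship between the two interfaces described above can be sketched in a few lines of plain Python. The names mirror the PR (`BatchProcessor`, `batch_compute`), but the bodies are illustrative stand-ins, not the real implementation: here a "workflow" is anything with a `compute` method, built per run from a parameter mapping.

```python
class BatchProcessor:
    """Toy sketch: holds one workflow per named run and computes a
    target on each of them, returning results keyed by run name."""

    def __init__(self, workflows):
        self.workflows = workflows  # name -> workflow-like object

    def compute(self, target):
        return {name: wf.compute(target) for name, wf in self.workflows.items()}


def batch_compute(workflow_factory, runs, target):
    """Toy one-shot helper for novice users: build a workflow per run
    from its parameters, then compute the target for every run."""
    workflows = {name: workflow_factory(params) for name, params in runs.items()}
    return BatchProcessor(workflows).compute(target)
```

In this picture, power users keep the `BatchProcessor` object around (e.g. to cache intermediates on individual workflows), while `batch_compute` hides the multi-workflow object entirely.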