Performance #71
Conversation
Converted to draft, since it looks like I messed up memory consumption (at least CI seems to be having trouble).

Looking at https://github.com/scipp/essdiffraction/actions/workflows/ci.yml, it seems the CI timings were a perfectly normal outlier. Ready for review once more!
| "intermediates[MaskedData[SampleRun]].bins.concat().hist(\n", | ||
| " two_theta=300, wavelength=300\n", | ||
| ").plot(norm=\"log\")" | ||
| "two_theta = sc.linspace(\"two_theta\", 0.8, 2.4, 301, unit=\"rad\")\n", |
Why the explicit limits? Does this really matter so much for performance?
Because we have some voxels with NaN positions, and get an exception otherwise.
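A minimal sketch of the failure mode, assuming the NaN positions propagate into the histogramming coord (synthetic data, not the workflow's):

```python
import numpy as np
import scipp as sc

# Events whose two_theta coord contains a NaN, e.g. derived from a voxel
# with a NaN position.
events = sc.DataArray(
    sc.ones(dims=["event"], shape=[4]),
    coords={
        "two_theta": sc.array(
            dims=["event"], values=[1.0, 1.5, np.nan, 2.0], unit="rad"
        )
    },
)
# Automatic binning (two_theta=300) has to derive edges from the coord's
# min/max, which NaN breaks. Explicit edges avoid that; events outside
# the edges are simply dropped.
edges = sc.linspace("two_theta", 0.8, 2.4, 301, unit="rad")
hist = events.hist(two_theta=edges)
```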
```python
    res = da.group("sector", *elements)
else:
    res = da.group(*elements)
res.coords["position"] = res.bins.coords.pop("position").bins.mean()
```
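For context, the last line above replaces the per-event `position` coord with its per-group mean. A minimal sketch of the same pattern on synthetic data (all names and values hypothetical):

```python
import scipp as sc

# Four events in two sectors, each event carrying a position coord.
da = sc.DataArray(
    sc.ones(dims=["event"], shape=[4]),
    coords={
        "sector": sc.array(dims=["event"], values=[0, 0, 1, 1]),
        "position": sc.array(dims=["event"], values=[1.0, 3.0, 5.0, 7.0], unit="m"),
    },
)
res = da.group("sector")
# Pop the event-level coord and replace it with a dense per-group mean.
res.coords["position"] = res.bins.coords.pop("position").bins.mean()
print(res.coords["position"].values)  # [2. 6.]
```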
Did you check whether this changes the final result?
I was hoping there were tests for that...?
There are no tests for the complete workflow. The problem is that the workflow is incomplete, so most updates in the near future will break regression tests.
```python
def _drop_grouping_and_bin(
    data: sc.DataArray, *, dims_to_reduce: tuple[str, ...] | None = None, edges: dict
) -> sc.DataArray:
    all_pixels = data if dims_to_reduce == () else data.bins.concat(dims_to_reduce)
```
```diff
 def _drop_grouping_and_bin(
-    data: sc.DataArray, *, dims_to_reduce: tuple[str, ...] | None = None, edges: dict
+    data: sc.DataArray, *, edges: dict
 ) -> sc.DataArray:
-    all_pixels = data if dims_to_reduce == () else data.bins.concat(dims_to_reduce)
+    all_pixels = data.bins.concat()
```

because `dims_to_reduce` is never used.
I was using it for the two-theta case, but then refactored. Since I wanted to consider moving this to ESSreduce I thought I'd keep it.
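For reference, `bins.concat()` without an argument concatenates over all dims, so dropping the unused parameter would preserve behavior for the current callers. A quick check on synthetic data (a sketch, not the PR's tests):

```python
import scipp as sc

table = sc.data.table_xyz(12)
binned = table.bin(x=2, y=3)  # 2-D binned array
a = binned.bins.concat()      # concat over all dims -> 0-D
b = binned.bins.concat(None)  # dims_to_reduce=None behaves the same
assert sc.identical(a, b)
```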
src/ess/powder/grouping.py
Outdated
```python
    # inferior performance when binning (no/bad multi-threading?).
    # We operate on the content buffer for better multi-threaded performance.
    if all_pixels.ndim == 0:
        content = all_pixels.bins.constituents["data"]
```
Is it safer to use this? Because it excludes constituents that are out of range?
```diff
-        content = all_pixels.bins.constituents["data"]
+        content = all_pixels.value
```
Hmm, good point that looks simpler, I will try.
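To illustrate the out-of-range point from the suggestion: `bins.constituents['data']` returns the whole underlying buffer, while `.value` returns only the events inside the bin. A sketch with a synthetic table (assuming a single bin that covers part of the buffer):

```python
import scipp as sc

table = sc.data.table_xyz(10)
# A single (0-D) bin covering only the first 8 rows of the buffer.
binned = sc.bins(data=table, dim="row", begin=sc.index(0), end=sc.index(8))
buffer = binned.bins.constituents["data"]  # full buffer: 10 rows
content = binned.value                     # bin content: 8 rows
print(buffer.sizes["row"], content.sizes["row"])  # 10 8
```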
src/ess/powder/grouping.py
Outdated
```python
    # inferior performance when binning (no/bad multi-threading?).
    # We operate on the content buffer for better multi-threaded performance.
    if all_pixels.ndim == 0:
        content = all_pixels.bins.constituents["data"]
```
It seems like this should be done by `bin` automatically. Is this possible?
See what I wrote in the PR:
Bypass the same single-threading problem encountered in SANS. I think we either have to put a helper into ESSreduce, or see whether the underlying problem should be fixed in Scipp.
Ok. Can this be done in Scipp now instead of adding extra code here and potentially in other projects? If not, please open an issue!
Well, the larger issue is that we have to concat first, which itself works around other performance issues. Many-to-many mapping is a long-standing issue that has seen a couple of improvements in the past but is apparently not fully solved (see scipp/scipp#1846).
What could probably be done in Scipp is to bypass the single-threading issue, but that also requires some thought (I am not sure there are no subtleties), and it is slightly odd to add an optimization for a solution that itself works around another problem.
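For concreteness, a sketch of the workaround under discussion (the helper name is hypothetical, and the single-threading claim is taken from the comments above, not verified here):

```python
import scipp as sc

def hist_via_content_buffer(all_pixels: sc.DataArray, edges: dict) -> sc.DataArray:
    # Assumes all_pixels is 0-D binned data, e.g. after data.bins.concat().
    # Histogramming the flat event table directly avoids the code path for
    # 0-D binned input that reportedly runs single-threaded.
    content = all_pixels.value
    return content.hist(**edges)

# Usage sketch:
# edges = {"two_theta": sc.linspace("two_theta", 0.8, 2.4, 301, unit="rad")}
# result = hist_via_content_buffer(data.bins.concat(), edges)
```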
Changes

- Drop the `tof` and `wavelength` event coords that are not needed any more.

Baseline

Baseline for the Geant4 workflow:

I artificially increased the data size by modifying the data after loading; the timings reported for that function are thus irrelevant in reality.
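A hypothetical way to do such an inflation with scipp (not the PR's actual snippet):

```python
import scipp as sc

def inflate_events(da: sc.DataArray, factor: int) -> sc.DataArray:
    # Stack copies of the binned data along a helper dim, then merge the
    # copies back into each bin, multiplying the event count by `factor`.
    stacked = sc.concat([da] * factor, dim="copy")
    return stacked.bins.concat("copy")
```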
This branch
`RawDetectorData[SampleRun]` is now 6 GByte instead of 10 GByte: