-
Notifications
You must be signed in to change notification settings - Fork 3
Proton current correction #115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
a112f15 to
5b15893
Compare
8446ed3 to
db56047
Compare
src/ess/reflectometry/conversions.py
Outdated
| fill_value=sc.scalar(float('nan'), unit=pc.unit), | ||
| ) | ||
| # Useful for comparing the proton current to what is typical | ||
| da.coords['median_proton_current'] = sc.median(pc).data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lower down we mask based on this. What if the beam was bad for some time, but there were few events. Shouldn't we compute based on the proton current of the events? Otherwise the median can be dragged down a lot and we end up masking the valid data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what you mean.
Shouldn't we compute based on the proton current of the events?
Compute the mask? Or something else?
median can be dragged down a lot and we end up masking the valid data?
If the median is dragged down we mask less, not more. The mask is for events when the proton current is too low. Masking less can ofc also be an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point was: We would like to have the median of when the beam is "good", not the median based on time intervals. Say we have a 1 hour run but only 5 minutes good beam, wouldn't the current approach end up masking everything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case "good beam" is defined by what is typical.
Say we have a 1 hour run but only 5 minutes good beam, wouldn't the current approach end up masking everything?
The current approach would end up masking nothing in that case.
The reason this is needed to begin with is that the Amor beam intensity periodically goes down briefly, and we want to mask regions with no/weak beam because otherwise we just amplify the background.
In normal operation the beam being 1/5 of the typical intensity is a good heuristic that the beam was not up.
We could also use some other heuristic here like a fixed cut-off that can be defined by the user.
src/ess/amor/workflow.py
Outdated
| if len(proton_current) != 0: | ||
| add_proton_current_coord(da, proton_current) | ||
| add_proton_current_mask(da) | ||
| correct_by_proton_current(da) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially thought this provider is doing an in-place modification, which would be problematic. But on closer looks it seems like add_coords make (at least) a shallow copy? The way things are written now this is a bit hard to see though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does modify the weights in-place, but that's what we do everywhere right? (Since we don't want to copy the weights)
add_coords makes a shallow copy because that is what transform_coords returns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but that's what we do everywhere right
As far as I am aware we are usually (but not always) avoiding input modifications in providers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems you are right, I'll change it to update inplace.
src/ess/amor/workflow.py
Outdated
| da = add_coords(da, graph) | ||
| da = add_masks(da, ylim, zlims, bdlim, wbins) | ||
| add_masks(da, ylim, zlims, bdlim, wbins) | ||
|
|
||
| # Copy before applying corrections | ||
| da.data = da.data.copy() | ||
| correct_by_footprint(da) | ||
| return da | ||
|
|
||
| # For some older Amor files there are no entries in the proton current log | ||
| if len(proton_current) != 0: | ||
| add_proton_current_coord(da, proton_current) | ||
| add_proton_current_mask(da) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking through this once more:
add_coordsmakes a shallow copy (including bin contents).add_masksdoes an in-place modification, despite the similar name.- We deep copy everything, including coords that do not change (this is wasteful).
- Apply corrections on data in-place.
Consider whether this is good. For example, as it stands now, one should not call the functions in another other order. For example, it looks like add_masks, despite being in a different module, is an implementation detail of this function and "risky" to use in other contexts.
Performance wise, you might be better off changing correct_by_footprint and correct_by_proton_current to not modify the input. Then you can avoid the da.data.copy(), which also copies all event coords. Optionally, you may also consider combining all corrections before multiplying them into the data, but that may or may not make any difference, depending on the dimensions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add_coords makes a shallow copy (including bin contents).
Not sure what you mean here. My understanding was that transform_coords makes a shallow copy and doesn't copy bin contents.
Then you can avoid the da.data.copy(), which also copies all event coords.
Why would da.data.copy() copy the event coords? Isn't da.data just a variable and .copy() copies that one variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, da.data is a variable. But (if) da.data is a binned variable then the event coords are part of da.data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know 👍 I'll change it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.

Fixes #78
Supersedes #84