Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hitfinder update enabling individual thresholds for each channel. #234

Merged
merged 19 commits into from Mar 2, 2020

Conversation

WenzDaniel
Copy link
Collaborator

Modified find_hits, that thresholds can be set individually for each channel. Additionally we introduced a second and optional threshold which is based on the baseline RMS. This second threshold is applied if the RMS times a to be set factor exceeds the adc-height threshold. The baseline RMS is computed in the function "baseline" of the pulse_processing file. By default the RMS based threshold of find_hits is set to zero and the adc height threshold is set to the previous default 15 adc counts.

We also adjusted the tests accordingly. A second pull-request which contains the necessary changes for straxen will follow.

Copy link
Member

@JelleAalbers JelleAalbers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for these improvements Daniel! I took the liberty of making a few changes:

  • The field rms is renamed to baseline_rms. We compute it in baseline now, not in a separate function, to avoid some duplicate logic (this is already in master so it doesn't show up in the diff of this PR). Straxen has been changed so both baseline and find_hits are used in PulseProcessing, not DAQReader.
    • Currently the rms is just computed as the .std of the baseline region. We can bring back your baseline-estimator using negative samples, but using more samples is better on average (as your study confirmed) and if there is a sub-ADC-threshold hit in the pre-selftrigger window we are in a bad situation anyway, since we just use the mean for baseline estimation.
  • Thresholds are specified by two separate arguments min_area and min_height_over_noise. If just a number is given, it is converted to an array. We could add an extra argument to avoid the records['channel'].max(), but I imagine we will always specify channel-dependent thresholds in production, at least eventually.
  • Compute threshold once per record rather than once per sample
  • Look up channel thresholds through thresholds[channel], not thresholds[(threshold['channel'] == r['channel'])[0]]. This is mostly for performance: creating a few-100 element boolean array (or even just looping over the few-100 channel thresholds) for each record we process will slow things down. In the new convention threshold arrays must now be as long the highest channel number + 1. For the MV/NV hitfinder I imagine this you will have a thousand or so zeros at the start. Since we do not loop over them anymore it does not impact performance.

@JelleAalbers JelleAalbers merged commit 9116d2b into AxFoundation:master Mar 2, 2020
@JelleAalbers
Copy link
Member

Forgot to add the notes about your in-code comments:

  • "can ignore the flat part since > and [0,1)"
    Did you refer to the fact that the hitfinder spuriously went into the zeroed-out part of records beyond their length? That shouldn't have happened (though I agree it makes no difference) and should be fixed now.
  • "No idea if we want to keep this after tuning..." Sure, hits don't go to disk anyway, and it seems a small addition that could save some time later.

@WenzDaniel
Copy link
Collaborator Author

Hej Jelle, I read through the modifications and everything looks perfectly fine. Thanks for your hint:

Look up channel thresholds through thresholds[channel], not thresholds[(threshold['channel'] == r['channel'])[0]]. This is mostly for performance: creating a few-100 element boolean array (or even just looping over the few-100 channel thresholds) for each record we process will slow things down. In the new convention threshold arrays must now be as long the highest channel number + 1. For the MV/NV hitfinder I imagine this you will have a thousand or so zeros at the start. Since we do not loop over them anymore it does not impact performance.

I was not aware of this. I found a nice stackoverflow discussion about this topic in case anyone is Interested:
https://stackoverflow.com/questions/46041811/performance-of-various-numpy-fancy-indexing-methods-also-with-numba

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants