Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing peaklet baseline bias #486

Merged
merged 50 commits into from Aug 25, 2021
Merged

Fixing peaklet baseline bias #486

merged 50 commits into from Aug 25, 2021

Conversation

WenzDaniel
Copy link
Collaborator

@WenzDaniel WenzDaniel commented Jul 14, 2021

What is the problem / what does the code in this PR do
Our current treatment of the floating point part of our baseline leads to an area (and summed waveform) bias of several 10ish % in peaklets. The main reason for the positive bias is coming form the fact, that we are also adding the floating point part of the baseline in zero-padded regions. A more detailed note can be found here.

Can you briefly describe how it works?
The solution is rather simple: We make our sum_waveform hit dependent and only build peaklets over the regions in which we found hits before.

In terms of performance is the new algorithm ~ x1.5 slower compared to the old one.

@WenzDaniel WenzDaniel marked this pull request as ready for review July 22, 2021 16:02
@WenzDaniel WenzDaniel marked this pull request as draft July 23, 2021 05:41
@WenzDaniel WenzDaniel closed this Jul 23, 2021
@WenzDaniel WenzDaniel reopened this Jul 24, 2021
@JelleAalbers
Copy link
Member

Hi Daniel, Great work spotting this! I'm amazed such a large area bias could have slipped our validation tests on 1T data. CES peak positions below saturation territory agreed between pax and strax (e.g. xenon:xenonnt:analysis:strax_clustering_classification#s1_s2_events), but maybe that's driven mostly by the S2s? The 1PE peak also seemed about right in #261, at least if the gain calibration includes underamplified / sub-threshold hits.

With the 'zero-padded part' of the waveform, you mean the parts zeroed out by cut_outside_hits, right? There is also a part of the record data field beyond record['length'] (for the final record_i in a pulse), but that shouldn't do anything; no routine should even look at it.

Still, sorry.. I should have realized we can't zero an integer-valued waveform if the true zero level is a floating value. Ultimately I guess this is payback for not copying the record data into the hit datatype, as we did in pax, or as in your hitlets..

If the hit-dependent sum waveform somehow doesn't work out, an alternative might be to let cut_outside_hits assign not zero, but a special value way outside the dynamic range of the digitizer (e.g. define TRUE_ZERO = -32768), and have sum_waveform ignore samples with that value. To avoid reprocessing records, you could run the modified cut_outside_hits again after hitfinding in peaklets (it should be idempotent).

But that's just a random idea. I think your solution is to essentially merge cut_outside_hits with sum_waveform; that should certainly work too. It would make cut_outside_hits in PulseProcessing really only a disk space saver.

@WenzDaniel
Copy link
Collaborator Author

Okay I added Tianyu's comments. I run also all tests in straxen with XENONnT/straxen#601 and all tests pass. We should not merge this PR without merging XENONnT/straxen#601. Waiting for Jorans review now.

Copy link
Member

@JoranAngevaare JoranAngevaare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Daniel, the changes look good!

Just a few minor comments from my side

strax/processing/peak_building.py Outdated Show resolved Hide resolved
strax/processing/peak_building.py Show resolved Hide resolved
strax/processing/peak_building.py Outdated Show resolved Hide resolved
strax/processing/peak_building.py Show resolved Hide resolved
strax/processing/peak_building.py Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants