New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixing peaklet baseline bias #486
Conversation
This reverts commit 81fdc11.
Hi Daniel, Great work spotting this! I'm amazed such a large area bias could have slipped our validation tests on 1T data. CES peak positions below saturation territory agreed between pax and strax (e.g. xenon:xenonnt:analysis:strax_clustering_classification#s1_s2_events), but maybe that's driven mostly by the S2s? The 1PE peak also seemed about right in #261, at least if the gain calibration includes underamplified / sub-threshold hits. With the 'zero-padded part' of the waveform, you mean the parts zeroed out by cut_outside_hits, right? There is also a part of the record data field beyond Still, sorry.. I should have realized we can't zero an integer-valued waveform if the true zero level is a floating value. Ultimately I guess this is payback for not copying the record data into the hit datatype, as we did in pax, or as in your hitlets.. If the hit-dependent sum waveform somehow doesn't work out, an alternative might be to let But that's just a random idea. I think your solution is to essentially merge |
Okay I added Tianyu's comments. I run also all tests in straxen with XENONnT/straxen#601 and all tests pass. We should not merge this PR without merging XENONnT/straxen#601. Waiting for Jorans review now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Daniel, the changes look good!
Just a few minor comments from my side
What is the problem / what does the code in this PR do
Our current treatment of the floating point part of our baseline leads to an area (and summed waveform) bias of several 10ish % in peaklets. The main reason for the positive bias is coming form the fact, that we are also adding the floating point part of the baseline in zero-padded regions. A more detailed note can be found here.
Can you briefly describe how it works?
The solution is rather simple: We make our sum_waveform hit dependent and only build peaklets over the regions in which we found hits before.
In terms of performance is the new algorithm ~ x1.5 slower compared to the old one.