Routines for lone hit integration #261

JelleAalbers · 2020-04-27T13:17:39Z

This adds strax.integrate_lone_hits, which computes the area of lone hits while taking into account (left, right) extensions from the original region above threshold. See XENONnT/straxen#103 for a description of the current problem.

Description

A sample should be integrated exactly once if it occurs within (left, right) of a sample above threshold in that channel, and not integrated otherwise.

- cut_outside_hits already takes care of the 'not integrated otherwise' part, by setting samples not near a threshold crossing to zero.
- The peaklets (left, right) extension is now forced to be equal to the hitfinder's extension. Since strax's peaks are non-overlapping, if a hit forms part of a peak, the samples around it are integrated exactly once.
- For lone hits, _find_hit_integration_bounds determines the windows. These are based on the (left, right) extension, but are shortened if that would cause overlap with another hit's window or with a peak. Finally, integrate_lone_hits just computes the integral in these windows.
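
To make the 'integrated exactly once' requirement concrete, here is a minimal NumPy sketch for a single channel. It is not the strax implementation: the helper name `near_threshold_mask`, the toy waveform, and the `>` threshold comparison are all assumptions made for illustration.

```python
import numpy as np


def near_threshold_mask(waveform, threshold, left, right):
    """Mark samples within `left` samples before, or `right` samples after,
    any sample above threshold. Hypothetical helper, not part of strax."""
    mask = np.zeros(len(waveform), dtype=bool)
    for i in np.flatnonzero(waveform > threshold):
        start = max(0, i - left)
        stop = min(len(waveform), i + right + 1)
        mask[start:stop] = True
    return mask


# Toy waveform with one sample above threshold (index 4)
waveform = np.array([0, 1, 0, 2, 9, 3, 1, 0, 0, 1, 0, 0])
mask = near_threshold_mask(waveform, threshold=5, left=2, right=1)

# Samples outside the mask are zeroed, so they can never contribute area;
# samples inside the mask are counted once, however many hits claim them.
zeroed = np.where(mask, waveform, 0)
area = zeroed.sum()
```

A boolean mask cannot double-count a sample by construction. The real difficulty, covered by the rules that follow, is deciding which hit a shared sample is attributed to.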

In detail, the 'dispute resolution' for overlapping 'claims' in find_hit_integration_bounds works by these rules:

1. Peaklets are inviolate. Lone hits never steal samples from adjacent peaklets.
2. A hit can never steal a sample inside another hit (i.e. above threshold).
   - This will never conflict with the first rule, because a peaklet is only made when there is no hit for peaklet_gap_threshold, which strax requires to be >= the left + right extensions.
3. If multiple lone hits claim a sample, and the above rules do not resolve the dispute, it goes to the earlier hit.
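
These rules can be sketched in plain Python for a single channel. This is an illustration under stated assumptions, not the code in this PR: `resolve_bounds` is a hypothetical name, hits and peaklets are given as time-sorted `(start, end)` sample ranges with exclusive ends, and lone hits are assumed not to lie inside a peaklet.

```python
def resolve_bounds(hits, peaklets, left, right):
    """Return one (lo, hi) integration window per hit, with hi exclusive.

    hits, peaklets: time-sorted lists of (start, end) sample ranges.
    Hypothetical sketch of the three dispute-resolution rules.
    """
    bounds = []
    for i, (start, end) in enumerate(hits):
        lo, hi = start - left, end + right

        # Rule 1: never take samples from a peaklet
        for p_start, p_end in peaklets:
            if p_end <= start:
                lo = max(lo, p_end)
            if p_start >= end:
                hi = min(hi, p_start)

        # Rule 2: never take samples inside a neighbouring hit
        if i > 0:
            lo = max(lo, hits[i - 1][1])
        if i + 1 < len(hits):
            hi = min(hi, hits[i + 1][0])

        # Rule 3: samples still in dispute stay with the earlier hit
        if bounds:
            lo = max(lo, bounds[-1][1])

        bounds.append((lo, hi))
    return bounds


hits = [(10, 12), (15, 17), (40, 42)]
peaklets = [(44, 60)]
bounds = resolve_bounds(hits, peaklets, left=3, right=5)
# bounds == [(7, 15), (15, 22), (37, 44)]
```

In this toy input the first hit keeps its right extension up to the start of the second hit, so the second hit loses its entire left extension, and the third hit's right extension is cut short by the peaklet.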

Tests

I compared the area of lone_hits in a one-minute, triggerless, converted XENON1T run, 180215_1029. As described in XENONnT/straxen#103, the pulse filter can mask this issue, so I tested three configurations: without the filter (red line), with the filter (blue line), and with the filter but with the rest of the pulse processing done as if the filter wasn't there (green line), i.e. only zeroing the waveform far away from the hit and integrating a large (left, right) extension away from the hit. Dashed vertical lines show the median area.

Current master:

After this change:

You can see that:

- In the current master, most of the lone hit area is missed without the filter. With the filter, we still only integrate the region above threshold, so the (left, right) extension settings do nothing.
- After this change, the lone hit area is properly reconstructed. The (left, right) extensions now have consequences for the (non)linearity of the filter.
- Irrelevant to this change, but let me mention it since it's so noticeable: using the pulse compression filter creates more noise hits (since it increases the effective noise level). Without zeroing the waveform away from hits, the effect is exacerbated.

Hopefully this keeps codefactor happy...

WenzDaniel · 2020-04-28T07:33:32Z

strax/processing/peak_building.py

+            result[last_hit_index[ch]][1] = min(result[last_hit_index[ch]][1],
+                                             h['time'])


I was just wondering whether it would be better to do it the other way around. If we always preserve the left extension, we fix the photon start time.

Tough to say. We're not changing the hit timings, just the regions in which lone hits are integrated. If two hits are near each other, both of their areas will be distorted. Maybe it's slightly more important to preserve the first hit's area in general, since the second might be an afterpulse. In most lone hit analyses (e.g. for gain monitoring) you would want to remove partially integrated hits anyway, so the choice probably doesn't matter much.

Okay, I see, thanks.

WenzDaniel · 2020-04-28T07:35:29Z

strax/processing/peak_building.py

+        # TODO: when we add amplitude multiplier, adjust this too!
+        h['area'] = (
+                r['data'][start:end].sum()
+                + (r['baseline'] % 1) * (end - start))


Maybe we should add two new fields in hits which indicate left_extended=bounds[hit_i][0] and right_extended=bounds[hit_i][1]. In this way we could identify overlapping lone hits more easily.

Good point! I added two new hit properties, left_integration and right_integration to store the integration bound indices. We can then cut on right_integration - left_integration - length to remove partially integrated hits.
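
As a rough sketch of that cut, assuming the hits are available as a NumPy structured array with the fields named above: the extension values and the toy hits are invented for illustration, and the real hit dtype in strax has more fields.

```python
import numpy as np

# Hypothetical hitfinder extension settings, in samples
LEFT_EXTENSION, RIGHT_EXTENSION = 3, 5

# Minimal stand-in for the hit dtype; only the fields needed for the cut
hit_dtype = np.dtype([
    ('length', np.int32),
    ('left_integration', np.int16),
    ('right_integration', np.int16),
])
hits = np.array([(2, 7, 15), (2, 15, 22), (2, 37, 47)], dtype=hit_dtype)

# Number of samples integrated beyond the region above threshold
extension_used = (hits['right_integration'].astype(np.int64)
                  - hits['left_integration']
                  - hits['length'])

# A hit is fully integrated only if it received both complete extensions
fully_integrated = extension_used == LEFT_EXTENSION + RIGHT_EXTENSION
clean_hits = hits[fully_integrated]
```

Here only the third hit has the full 8 extra samples, so it is the only one that survives the cut.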

JelleAalbers added 2 commits April 27, 2020 14:58

Routines for lone hit integration

3a944ef

Remove TODO and fix unrelated formatting issue

89b7beb

Hopefully this keeps codefactor happy...

JelleAalbers mentioned this pull request Apr 27, 2020

Extended integration of lone hits XENONnT/straxen#105

Merged

WenzDaniel reviewed Apr 28, 2020

Store integration bounds for lone hits

2085f81

JelleAalbers merged commit 64c9c73 into master Apr 28, 2020

JelleAalbers deleted the lone_hit_integration branch April 28, 2020 11:48

JelleAalbers mentioned this pull request Jul 25, 2021

Fixing peaklet baseline bias #486

Merged
