Refactor concat and get data #430

WenzDaniel · 2021-04-26T12:22:42Z

What is the problem / what does the code in this PR do
In this PR Tianyu and I (mostly Tianyu) refactored the get_hitlet_data function. Tianyu's proposal makes the function much more flexible, as we no longer search for the hitlet data via the record_i field but use time intervals instead. This change allows a much more general usage of the function. For this reason I added a "header" part which checks the different ways in which the function can be called. The corresponding tests have been added too.
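To illustrate the idea of locating waveform data by time interval rather than by a stored record index, here is a minimal sketch. It is not the strax implementation: the helper names `overlapping_record_indices` and `collect_samples` are hypothetical, and it assumes a single channel, fixed-length records, and a sampling step of 1 so that times and sample indices coincide.

```python
import numpy as np


def overlapping_record_indices(hit_start, hit_end, rec_start, rec_end):
    """Indices of records whose [start, end) interval overlaps [hit_start, hit_end)."""
    mask = (rec_start < hit_end) & (rec_end > hit_start)
    return np.flatnonzero(mask)


def collect_samples(hit_start, hit_end, rec_start, rec_data):
    """Copy the samples in [hit_start, hit_end) out of fixed-length records.

    Assumption: dt = 1 and one channel, so no unit conversion or channel
    matching is needed.
    """
    out = np.zeros(hit_end - hit_start, dtype=rec_data.dtype)
    rec_end = rec_start + rec_data.shape[1]
    for i in overlapping_record_indices(hit_start, hit_end, rec_start, rec_end):
        # Clip the hit interval to the part covered by record i
        lo = max(hit_start, rec_start[i])
        hi = min(hit_end, rec_end[i])
        out[lo - hit_start:hi - hit_start] = rec_data[i, lo - rec_start[i]:hi - rec_start[i]]
    return out


# Two consecutive records of four samples each
rec_start = np.array([0, 4])
rec_data = np.arange(8).reshape(2, 4)
samples = collect_samples(2, 6, rec_start, rec_data)
```

Because the lookup only depends on start and end times, the same helper works for any interval that spans one or several records, which is the flexibility the description refers to.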

Besides get_hitlet_data, I also refactored concat_overlaping_hits a bit, as it was part of the proposal too. However, I did not manage to modify the function such that it mimics the full behavior of the old method. Since the new method would also lead to a performance drop of about 30%, I decided to keep the old one.

A more thorough discussion can be found in this note.

Open points:

Updated plugins, set correct channel offset for mv and nv.
Updated hitlet splitting
Add some tests for hitlet splitting.
Test-process some data

WenzDaniel · 2021-04-28T14:19:54Z

strax/processing/hitlets.py

+    if np.all(data == 0):
+        res[:] = np.nan
+    else:
+        inter, amps = strax.highest_density_region(data,
+                                                   fractions_desired, _buffer_size=_buffer_size)


Sorry, I know one should not add more than one feature to a PR, but I am currently optimizing some splitting parameters and ran into this issue. strax.highest_density_region behaves as expected and raises a ValueError. However, it is not desirable to stop the entire processing because of that. Hence, the function now returns np.nan in case the HDR is not defined.
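The guard described above can be sketched in isolation as follows. This is only an illustration of the pattern: `toy_width` is a hypothetical stand-in for an area-based width and is not strax.highest_density_region, and treating negative samples as zero area is an assumption made for this example.

```python
import numpy as np


def toy_width(data, fractions):
    """Hypothetical area-based width: number of largest samples needed per fraction.

    Raises a ValueError when the data has no positive area, similar in spirit
    to a strict routine that refuses undefined input.
    """
    total = data.sum()
    if total <= 0:
        raise ValueError('Width is not defined for data without positive area.')
    cumulative = np.cumsum(np.sort(data)[::-1]) / total
    return np.array([np.searchsorted(cumulative, f) + 1 for f in fractions],
                    dtype=np.float64)


def width_or_nan(data, fractions):
    """Return NaN instead of raising when the width is not defined."""
    data = np.maximum(data, 0)  # assumption: negative samples carry no area
    res = np.zeros(len(fractions), dtype=np.float64)
    if np.all(data == 0):
        res[:] = np.nan
    else:
        res[:] = toy_width(data, fractions)
    return res


undefined = width_or_nan(np.array([0, -1, -2]), np.array([0.5]))
defined = width_or_nan(np.array([1, 3, 1]), np.array([0.5, 0.9]))
```

Keeping the strict routine unchanged and placing the check in the caller means the error is still available to anyone who calls the low-level function directly, while bulk processing continues with NaN entries that can be filtered later.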

WenzDaniel · 2021-04-28T14:20:35Z

tests/test_hitlet.py

+    # Check that negative data does not raise:
+    res = strax.processing.hitlets.highest_density_region_width(np.array([0, -1, -2]),
+                                                          np.array([0.5]),
+                                                          fractionl_edges=True)
+    assert np.all(np.isnan(res)), 'For empty data HDR is not defined, should return np.nan!'
+


... and added the corresponding test.

WenzDaniel · 2021-04-28T14:53:12Z

tests/test_hitlet.py

+def test_not_defined_get_fhwm():
+    # This is a specific unit test for some edge-cases in which the full
+    # width half maximum is not defined.
+    odd_hitlets = np.zeros(4, dtype=strax.hitlet_with_data_dtype(10))
+    odd_hitlets[0]['data'][:5] = [2, 2, 3, 2, 2]
+    odd_hitlets[0]['length'] = 5
+    odd_hitlets[1]['data'][:2] = [5, 5]
+    odd_hitlets[1]['length'] = 2
+    odd_hitlets[2]['length'] = 3
+    odd_hitlets[3]['data'][:3] = [-1, -2, 0]
+    odd_hitlets[3]['length'] = 3
+
+    for oh in odd_hitlets:
+        res = strax.get_fwxm(oh)
+        mes = (f'get_fwxm returned {res} for {oh["data"][:oh["length"]]}! '
+               'However, the FWHM is not defined and the return should be nan!'
+               )
+        assert np.all(np.isnan(res)), mes


No idea how this slipped through the last review. I have now removed the indentation level and also added a test for negative waveforms.

JoranAngevaare

Thanks Daniel, I think it looks very good indeed. I like these changes a lot.

Perhaps now is a good time to get rid of _split_hitlets, as it is the same as the _split_peaks function.

WenzDaniel · 2021-04-30T10:41:54Z

Okay, I addressed your comments, except for the ones that will require a larger refactoring.

WenzDaniel added 6 commits April 25, 2021 11:03

Removed record_i parameter. Reads like well written prose.

6f0bb19

Added Tianyus suggestion for get_hitlet_data

50cc27c

Removed record_i field from hitlet_with_data dtype

fd61530

Refactored test_get_hitlet_data

f874612

Removed record_i field from refresh_hit_to_hitlets

742c988

Added channel offset.

d816ce7

zhut19 reviewed Apr 26, 2021

strax/processing/hitlets.py

zhut19 reviewed Apr 26, 2021

strax/processing/hitlets.py

WenzDaniel added 4 commits April 26, 2021 19:52

Fixed is_first_record

ed4e605

Removed channel offset from function.

724b571

Added check and test for wrong to_pe shape

bae4384

Updated hitlet splitting added test for outer splitting function

55ff8b6

WenzDaniel marked this pull request as ready for review April 27, 2021 08:22

WenzDaniel requested a review from JoranAngevaare April 27, 2021 08:22

Make HDR robust against negative data

6ee1521

WenzDaniel commented Apr 28, 2021

Added same for FWHM

e3af20b

WenzDaniel commented Apr 28, 2021

Return values

3eea6ab

JoranAngevaare reviewed Apr 29, 2021

WenzDaniel and others added 6 commits April 29, 2021 17:37

Update strax/processing/hitlets.py

0acb641

Co-authored-by: Joran Angevaare <joranangevaare@gmail.com>

Made dtype more explicit.

3437297

Merge branch 'refactor_concat_and_get_data' of https://github.com/Wen…

87775d8

…zDaniel/strax into refactor_concat_and_get_data

Documentation and commenting

ba7315a

Updated test

6eaa981

fiX

c94763f

Added min_data_field kwargs

272927f

JoranAngevaare approved these changes Apr 30, 2021

WenzDaniel merged commit e5b0b42 into AxFoundation:master Apr 30, 2021

JoranAngevaare mentioned this pull request May 1, 2021

fix empty hitlets #435

Closed
