Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change R1- and DL0-waveforms shape to (n_channels, n_pixels, n_samples) #2529

Merged
merged 42 commits into from Apr 24, 2024

Conversation

Hckjs
Copy link
Contributor

@Hckjs Hckjs commented Mar 28, 2024

This PR basically adds three (somehow dependent) features:

  1. Changing the R1 and DL0 waveforms shape to be always (n_channels, n_pixels, n_samples).

  2. Implementing 1) in such a general way (and two lines of code) allows the ImageExtractors to handle not gain selected data (by passing selected_gain_channel=None)

  3. To test these i've adapted the WaveformToymodel to be able to create 2 gain waveforms. All extractors are now tested for different camera types (including different number of gain channels)

Things that probably still need to be discussed:

  • I don't think it will be needed, but for now the TwoPassWindowSum does not support for 2 gain channels
  • There was already some discussion about splitting the DL1CameraContainer considering the two different ndims of image and peak_time
  • With the current API of the GainSelector it cannot distinguish between already gain selected data and data that originally consist of just one gain channel. We may pass something like camera.readout.n_channels?

Closes #1836

Thanks already to @TjarkMiener for the discussion and input!

This comment has been minimized.

@maxnoe
Copy link
Member

maxnoe commented Mar 28, 2024

I don't think it will be needed, but for now the TwoPassWindowSum does not support for 2 gain channels

I tend agree. At least for now we can leave this unsupported. Just make sure there is a nice error message in case someone tries to apply it to two-gain data

Copy link

codecov bot commented Mar 28, 2024

Codecov Report

Attention: Patch coverage is 95.37954% with 14 lines in your changes are missing coverage. Please review.

Project coverage is 92.67%. Comparing base (e2a848e) to head (cfad23c).
Report is 20 commits behind head on main.

❗ Current head cfad23c differs from pull request most recent head e37bb41. Consider uploading reports for the commit e37bb41 to get more accurate results

Files Patch % Lines
src/ctapipe/image/extractor.py 73.80% 11 Missing ⚠️
src/ctapipe/calib/camera/calibrator.py 91.66% 1 Missing ⚠️
src/ctapipe/image/invalid_pixels.py 95.23% 1 Missing ⚠️
src/ctapipe/io/hdf5eventsource.py 66.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2529      +/-   ##
==========================================
+ Coverage   92.66%   92.67%   +0.01%     
==========================================
  Files         232      231       -1     
  Lines       20220    20292      +72     
==========================================
+ Hits        18736    18805      +69     
- Misses       1484     1487       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@maxnoe
Copy link
Member

maxnoe commented Mar 28, 2024

I guess with this we break the reading-in of R0, R1 and DL0 data written into HDF5 files? I.e. this is a data model change that requires changes in the compatibility table of the HDF5EventSource?

This comment has been minimized.

@TjarkMiener
Copy link
Member

I guess with this we break the reading-in of R0, R1 and DL0 data written into HDF5 files? I.e. this is a data model change that requires changes in the compatibility table of the HDF5EventSource?

At the moment (lst-cluster down), I can only test it with the LST-1 HDF5 files of R1 calibration data, which comes with two gains. This is working fine for me with the HDF5EventSource. I only have an (unrelated) issue with the coordinate frame when filling the array pointing in the HDF5EventSource. I will have a look at the DL0 LST-1 ACADA data once the lst-cluster is running.

File "/Users/tjarkmiener/calibration/2529/ctapipe/src/ctapipe/io/hdf5eventsource.py", line 632, in _generator
 self._fill_array_pointing(data)
File "/Users/tjarkmiener/calibration/2529/ctapipe/src/ctapipe/io/hdf5eventsource.py", line 716, in _fill_array_pointing
 raise ValueError(f"Unsupported pointing frame: {frame}")
ValueError: Unsupported pointing frame: CoordinateFrameType.UNKNOWN

@maxnoe
Copy link
Member

maxnoe commented Apr 2, 2024

At the moment (lst-cluster down), I can only test it with the LST-1 HDF5 files of R1 calibration data, which comes with two gains. This is working fine for me with the HDF5EventSource.

That case will work fine, the issue will be reading single-gain data written with older versions which will result in the 2d shape that is written into the file not the 3d shape expected now

@maxnoe
Copy link
Member

maxnoe commented Apr 2, 2024

I will have a look at the DL0 LST-1 ACADA data once the lst-cluster is running.

I don't understand the relevance of DL0 ACADA (zfits) data in the discussion about the HDF5 event source...

@TjarkMiener
Copy link
Member

That case will work fine, the issue will be reading single-gain data written with older versions which will result in the 2d shape that is written into the file not the 3d shape expected now

Yes, sorry! This is what I meant. I accidentally thought that this applies to DL0 ACADA data. However, as you said it's stored (of course) in zfits and not HDF5!

@mexanick
Copy link

mexanick commented Apr 2, 2024

That case will work fine, the issue will be reading single-gain data written with older versions which will result in the 2d shape that is written into the file not the 3d shape expected now

Is there a good reason to care about backward compatibility here? I mean, do you want a workaround to be implemented to remain compatible with files written by previous versions or just update the is_compatible method of HDF5EventSource to provide a correct compatibility info?

Edit: if we don't care about backwards compatibility, one line change here shall be enough, right?

@TjarkMiener
Copy link
Member

I have tested it with an old LST-1 sim file in hdf5, where the R1 waveform data is stored with 2 dims, i.e. shape of waveform: (1855,40). In case you would like to have backward compatibility, I did the following:
I expanded the dimension here of the waveforms:

                    data.r1.tel[tel_id] = next(waveform_readers[key])
                    if data.r1.tel[tel_id]["waveform"].ndim < 3:
                        data.r1.tel[tel_id]["waveform"] = np.expand_dims(data.r1.tel[tel_id]["waveform"], axis=0)   

Since the r1.pixel_status is None, I introduced a check here

         dl0_pixel_status = None
         if r1.pixel_status is not None:
             dl0_pixel_status = r1.pixel_status.copy()
             # set dvr pixel bit in pixel_status for pixels kept by DVR
             dl0_pixel_status[signal_pixels] |= PixelStatus.DVR_STORED_AS_SIGNAL
             # unset dvr bits for removed pixels
             dl0_pixel_status[~signal_pixels] &= ~np.uint8(PixelStatus.DVR_STATUS)

I'm having some trouble with self.reference_location (this is "None" with my file) in SubarrayDescription which I don't know how to solve. Any ideas?
I hope it can be useful!

@maxnoe
Copy link
Member

maxnoe commented Apr 2, 2024

Is there a good reason to care about backward compatibility here? I mean, do you want a workaround to be implemented to remain compatible with files written by previous versions or just update the is_compatible method of HDF5EventSource to provide a correct compatibility info?

I don't have a strong opinion here, but we need to do one of two things:

  • Raise a nice error in case we encounter old data we do not support anymore
  • Add a little bit of code to reshape the 2d data into the now expected 3d shape

@TjarkMiener
Copy link
Member

I only have an (unrelated) issue with the coordinate frame when filling the array pointing in the HDF5EventSource.

Due to some version incompatibilities, the array pointing information was not written to the HDF5 file. When reading the incomplete/corrupted file the HDF5EventSource complains, which is the correct behavior. Therefore, the quoted sentence can be ignored.

@TjarkMiener
Copy link
Member

  • Raise a nice error in case we encounter old data we do not support anymore

I'm in favor of raising a nice error. I don't think there are many (and crucial) R1 or DL0 datasets produced in HDF5 following the 2d waveform shape. Most users are only storing DL1b and/or DL1a. (I actually have a R1 production of LST-1 sims for DL studies. However, it can be easily reprocessed with this PR!)

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@maxnoe
Copy link
Member

maxnoe commented Apr 4, 2024

The docs failure is due to an unclosed tables file, I think the one that is opened on line 222 of examples/core/table_writer_reader.py.

Either you close the file, or you add an exception for that warning into the docs conf

This comment has been minimized.

This comment has been minimized.

src/ctapipe/image/extractor.py Outdated Show resolved Hide resolved
@maxnoe maxnoe added this to the v0.21.0 milestone Apr 5, 2024
Hckjs and others added 18 commits April 23, 2024 09:02
 - Adapted extractors and calibrator to handel broken pixels for multiple gain channels
 - InvalidPixelHandel cannot handle multiple gain channels for now
 - I actually cant remember why i used transpose to broadcast the calib
   coeffs to the charges and peak times. I think it was due to a usecase
   which is actually not needed.
 - I've added a test especially for 2 gain data to check that the
   calibrator also works this way
Co-Authored-By: Jonas Hackfeld <jonas.hackfeld@ruhr-uni-bochum.de>
 - _apply_correction is now a static method from ImageExtractors parent
   class
 - _calculate_correction is moved to the parent ImageExtractors class
   to avoid writing the docstrings multiple times. It needs to be
   overwritten by the child components
 - it duplicated the changes made in main, had to remove one of them

This comment has been minimized.

1 similar comment
Copy link

Failed

  • 3.58% Duplicated Lines (%) on New Code (is greater than 3.00%)

Analysis Details

7 Issues

  • Bug 0 Bugs
  • Vulnerability 0 Vulnerabilities
  • Code Smell 7 Code Smells

Coverage and Duplications

  • Coverage 97.30% Coverage (92.80% Estimated after merge)
  • Duplications 3.58% Duplicated Code (0.70% Estimated after merge)

Project ID: cta-observatory_ctapipe_AY52EYhuvuGcMFidNyUs

View in SonarQube

@maxnoe maxnoe merged commit 4164bc9 into main Apr 24, 2024
12 checks passed
@maxnoe maxnoe deleted the extractor_gain branch April 24, 2024 14:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Re-introduce possibility of running ImageExtractor on waveforms with multiple gains
6 participants