Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add missing dl1 parameters #41

Conversation

HealthyPear
Copy link
Member

Soon protopipe will use (part of) the new ctapipe-stage-1 script to produce DL1 files in a new format.

Meanwhile, in order to proceed with the DL1 comparison against CTA/MARS (see #37 for more information), some DL1 parameters are required which are not currently exported.
This PR aims at adding these parameters to the DL1 file generated by write_dl1.py.

In doing this:

  • a potential performance limitation has been fixed (wrong number of islands when using extended image cleaning)
  • leakage1 and leakage2 parameters have been also added to the exported data file.

Closes #40 .

@HealthyPear HealthyPear changed the title Feature add missing dl1 parameters Add missing dl1 parameters Jan 29, 2020
@codecov
Copy link

codecov bot commented Jan 29, 2020

Codecov Report

Merging #41 into master will decrease coverage by 0.00%.
The diff coverage is 0.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master     #41      +/-   ##
=========================================
- Coverage    0.40%   0.39%   -0.01%     
=========================================
  Files          20      20              
  Lines        2232    2267      +35     
=========================================
  Hits            9       9              
- Misses       2223    2258      +35     
Impacted Files Coverage Δ
protopipe/pipeline/event_preparer.py 0.00% <0.00%> (ø)
protopipe/scripts/write_dl1.py 0.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d928c4c...db806f0. Read the comment docs.

@HealthyPear HealthyPear added this to Reviewer approved in Pipeline features and enhancements Jan 31, 2020
@HealthyPear HealthyPear moved this from Reviewer approved to Review in progress in Pipeline features and enhancements Jan 31, 2020
@HealthyPear HealthyPear moved this from Review in progress to In progress in Pipeline features and enhancements Feb 3, 2020
@HealthyPear HealthyPear added this to the Release 0.3 milestone Feb 3, 2020
@HealthyPear HealthyPear added the enhancement New feature or request label Feb 3, 2020
kosack
kosack previously approved these changes Feb 4, 2020
Copy link
Contributor

@kosack kosack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine. One thing to keep in mind is that the names of the leakage parameters changed since v0.7 - in the next release they have more clear names, so you'll need to update this code. In fact you might want to start using the new names for your internal variables here to avoid confusion. The new version defines them as follows (and the output has the prefix attached, so the column names in the DL1 file become for example leakage_pixels_width_1:

class LeakageContainer(Container):
    """
    Fraction of signal in 1 or 2-pixel width border from the edge of the
    camera, measured in number of signal pixels or in intensity.
    """

    container_prefix = "leakage"

    pixels_width_1 = Field(
        nan, "fraction of pixels after cleaning that are in camera border of width=1"
    )
    pixels_width_2 = Field(
        nan, "fraction of pixels after cleaning that are in camera border of width=2"
    )
    intensity_width_1 = Field(
        nan,
        "Intensity in photo-electrons after cleaning"
        " that are in the camera border of width=1 pixel",
    )
    intensity_width_2 = Field(
        nan,
        "Intensity in photo-electrons after cleaning"
        " that are in the camera border of width=2 pixels",
    )


class ConcentrationContainer(Container):
    """
    Concentrations are ratios between light amount
    in certain areas of the image and the full image.
    """

    container_prefix = "concentration"
    cog = Field(
        nan, "Percentage of photo-electrons in the three pixels closest to the cog"
    )
    core = Field(nan, "Percentage of photo-electrons inside the hillas ellipse")
    pixel = Field(nan, "Percentage of photo-electrons in the brightest pixel")

Pipeline features and enhancements automation moved this from In progress to Reviewer approved Feb 4, 2020
@HealthyPear
Copy link
Member Author

Looks fine. One thing to keep in mind is that the names of the leakage parameters changed since v0.7 - in the next release they have more clear names, so you'll need to update this code. In fact you might want to start using the new names for your internal variables here to avoid confusion. The new version defines them as follows (and the output has the prefix attached, so the column names in the DL1 file become for example leakage_pixels_width_1:

Do you mean changing just the name of the variable I then export into HDF5 column?
So, e.g.

feature_events[cam_id]["leak2_reco"]
feature_events[cam_id]["leak1"]

would become respectively,

feature_events[cam_id]["leakage_pixels_width_2_reco"]
feature_events[cam_id]["leakage_pixels_width_1"]

Anyhow, wouldn't this be superseded by the new DL1 script?

@kosack
Copy link
Contributor

kosack commented Feb 5, 2020

Anyhow, wouldn't this be superseded by the new DL1 script?

Yes, it's just to make it easier in the future to compare - in the DL1 script, it names the columns that way, so there will be less confusion when we switch (especially in the benchmark plots)

@kosack
Copy link
Contributor

kosack commented Feb 5, 2020

Thought the other difference will be that there is no "reco" version of parameters → all parameters are assumed to be reco, and rather simulated ones will be either in another dataset or "mc_leakage*". But that's just a convention, not really a problem.

Pipeline features and enhancements automation moved this from Reviewer approved to Review in progress Feb 7, 2020
@kosack
Copy link
Contributor

kosack commented Jun 2, 2020

Now that ctapipe-0.8 is out, the naming of columns is (relatively) fixed. The only other change there that differes from this version is that we dropped the mc_* prefix and replaced it with true_* (so you get true_energy, true_hillas_x, etc). Generate a DL1 output file for with ctapipe-0.8 to see in more detail.

Pipeline features and enhancements automation moved this from Review in progress to Reviewer approved Jun 3, 2020
Copy link
Contributor

@kosack kosack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine, but just keep in mind many names may need to be updated as you migrate to the latest ctapipe (v0.8), e.g. leakage() returns a LeakageContainer with columns already named correctly.

@HealthyPear
Copy link
Member Author

I think this is fine, but just keep in mind many names may need to be updated as you migrate to the latest ctapipe (v0.8), e.g. leakage() returns a LeakageContainer with columns already named correctly.

Sure, this PR refers of course to protopipe 0.2.1+.

I will update this part of the code accordingly through the upgrade to 0.3.

@HealthyPear HealthyPear merged commit a328138 into cta-observatory:master Jun 4, 2020
Pipeline features and enhancements automation moved this from Reviewer approved to Done Jun 4, 2020
@HealthyPear HealthyPear deleted the feature-add_missing_DL1_parameters branch June 4, 2020 15:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

Add missing DL1 parameters
2 participants