Jet Flavour data pre-processing and inference #224

Merged: 8 commits merged into HEP-FCC:master on Oct 31, 2022


@selvaggi selvaggi commented Oct 30, 2022

This PR provides an example of how to cluster jets, compute the jet-constituent observables needed for flavour tagging, and build a jet-based tree in two steps (stage1.py and stage2.cpp). The jet-based tree is later used to train the model with @hqucms' Weaver. The model is exported to ONNX and used for inference, as shown in analysis_inference.py.
This PR builds upon and supersedes #188.

Credits: @hqucms, @forthommel, @ADV99

@selvaggi selvaggi marked this pull request as ready for review October 30, 2022 21:54
analyzers/dataframe/FCCAnalyses/JetConstituentsUtils.h (outdated, resolved)
analyzers/dataframe/FCCAnalyses/JetConstituentsUtils.h (outdated, resolved)
analyzers/dataframe/FCCAnalyses/JetConstituentsUtils.h (outdated, resolved)
analyzers/dataframe/src/JetConstituentsUtils.cc (outdated, resolved)
analyzers/dataframe/src/JetConstituentsUtils.cc (outdated, resolved)
@@ -228,7 +565,8 @@ getRP2TRK_phi0_tanlambda_cov(ROOT::VecOps::RVec<edm4hep::ReconstructedParticleDa
  for (auto & p: in) {
    if (p.tracks_begin<tracks.size())
      result.push_back(tracks.at(p.tracks_begin).covMatrix[11]);
    else result.push_back(std::nan(""));
    //else result.push_back(std::nan(""));
Suggested change
//else result.push_back(std::nan(""));

@@ -216,7 +552,8 @@ getRP2TRK_phi0_z0_cov(ROOT::VecOps::RVec<edm4hep::ReconstructedParticleData> in,
  for (auto & p: in) {
    if (p.tracks_begin<tracks.size())
      result.push_back(tracks.at(p.tracks_begin).covMatrix[7]);
    else result.push_back(std::nan(""));
    //else result.push_back(std::nan(""));
Suggested change
//else result.push_back(std::nan(""));

@@ -204,7 +539,8 @@ getRP2TRK_phi0_omega_cov(ROOT::VecOps::RVec<edm4hep::ReconstructedParticleData>
  for (auto & p: in) {
    if (p.tracks_begin<tracks.size())
      result.push_back(tracks.at(p.tracks_begin).covMatrix[4]);
    else result.push_back(std::nan(""));
    //else result.push_back(std::nan(""));
Suggested change
//else result.push_back(std::nan(""));

@@ -192,7 +526,8 @@ getRP2TRK_d0_tanlambda_cov(ROOT::VecOps::RVec<edm4hep::ReconstructedParticleData
  for (auto & p: in) {
    if (p.tracks_begin<tracks.size())
      result.push_back(tracks.at(p.tracks_begin).covMatrix[10]);
    else result.push_back(std::nan(""));
    //else result.push_back(std::nan(""));
Suggested change
//else result.push_back(std::nan(""));
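For readers following these hunks, here is a minimal, self-contained sketch of the pattern they share: one covariance-matrix element is emitted per particle, with a placeholder when the particle has no valid track, so that the output stays index-aligned with the input. The `Track` and `Particle` structs, the 15-element matrix size, and the helper name `cov_element` are assumptions made for illustration only and are not the edm4hep types or the FCCAnalyses API. The sketch uses NaN as the placeholder, as in the `std::nan("")` line shown above; which placeholder the merged code uses is not visible in this excerpt.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the edm4hep types in the hunks above; only the
// fields needed for this sketch are modelled.
struct Track {
  std::array<float, 15> covMatrix;  // assumed size: lower triangle of a 5x5 matrix
};

struct Particle {
  std::size_t tracks_begin;  // index of the particle's first associated track
};

// Returns one covariance element per particle. Particles without a valid
// track get NaN, so result[i] always corresponds to in[i].
std::vector<float> cov_element(const std::vector<Particle>& in,
                               const std::vector<Track>& tracks,
                               std::size_t cov_index) {
  assert(cov_index < 15);  // guard against indexing outside the matrix
  std::vector<float> result;
  result.reserve(in.size());
  for (const auto& p : in) {
    if (p.tracks_begin < tracks.size())
      result.push_back(tracks[p.tracks_begin].covMatrix[cov_index]);
    else
      result.push_back(std::nanf(""));  // placeholder keeps the vectors aligned
  }
  return result;
}
```

Calling `cov_element(particles, tracks, 11)` would mirror the `covMatrix[11]` lookup in the first hunk. Downstream code that consumes the result has to treat the placeholder explicitly, for example with `std::isnan`.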

examples/FCCee/weaver/stage2.cpp (outdated, resolved)
@vvolkl vvolkl enabled auto-merge (squash) October 31, 2022 11:36
@vvolkl vvolkl merged commit 91025ad into HEP-FCC:master Oct 31, 2022
@clementhelsens

Hello @selvaggi,
I did not have time to comment on this PR; nevertheless, I have a few comments:

  1. stage2.cpp: I don't think such code belongs in this repository; you should rather try to add this step as a weaverPreprocessing RDF analyser.
  2. FCCAnalyses::JetFlavourUtils: for the inference of pi0/gamma, I created a similar utility without the jet specifics, FCCAnalyses::WeaverUtils; see here for an example.
  3. all_stages.py: instead of this, you should rather extend the fccanalysis run options to allow splitting a dataset from both command-line arguments and process-list options, like here.
