Skim EXO HighMET #37749

Merged
merged 3 commits into from
May 9, 2022

Conversation

afrankenthal
Contributor

@afrankenthal afrankenthal commented Apr 29, 2022

PR description:

This PR adds a common EXOHighMET skim that addresses the Run 3 needs of several EXO analyses. Specifically, it runs on top of the RAW+RECO data tier to allow analyzers to access detailed hit information (RECO) and to use the dataset for analysis R&D (RAW), such as ML and other reconstruction improvements.

We have realized that the needs of the EXO analyses differ substantially from what the existing JME HighMET skim [1] provides. JME's goals for that skim are to study the behavior of the high-MET tail and to develop and test different noise-cleaning strategies. They have found, for example, that the RAW data tier is not very relevant for their use case. They also do not need all the events collected for this purpose.

The proposed skim selects events on or near the plateau of the MET trigger efficiency, with MET pT > 200 GeV. There is currently no additional filter on the event content of the skim. The skim rate as tested on a file in the MET 2018A dataset is 2.7%, selecting 194 out of 7219 events.
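As a quick cross-check of the numbers quoted above, the following minimal sketch recomputes the skim rate from the event counts and illustrates the threshold cut on a few toy values. The function names and the toy MET values are hypothetical and are not taken from the skim code itself; the sketch also assumes a strict `>` comparison, as written in the description.

```python
def skim_rate(n_selected, n_total):
    """Fraction of input events kept by the skim."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_selected / n_total


def passes_high_met(met_pt_gev, threshold_gev=200.0):
    """True if the MET pT is strictly above the threshold (assumed strict cut)."""
    return met_pt_gev > threshold_gev


# Toy MET pT values in GeV (made up for illustration only).
toy_met = [35.0, 120.5, 200.0, 250.3, 410.0]
kept = [met for met in toy_met if passes_high_met(met)]
print(kept)  # [250.3, 410.0]

# Counts quoted in the PR description for the MET 2018A test file.
print(f"{100 * skim_rate(194, 7219):.1f}%")  # 2.7%
```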

This skim will be useful for several analyses, for example [2]:

  • Delayed jets (EXO-19-001)
  • Inelastic dark matter (EXO-20-010)
  • Disappearing tracks (EXO-19-010)
  • LLP hadronic decays in the ECAL
  • Fractionally charged particles (EXO-19-006)
  • Heavy Stable Charged Particles (HSCP)

And we believe others will also benefit from it.

[1] https://indico.cern.ch/event/1042443/contributions/4379421/attachments/2257197/3831154/HighMETSkim_JME_3June2021.pdf

[2] https://indico.cern.ch/event/1086811/contributions/4568933/attachments/2329715/3969638/EXO_skim_plans_run3_JME_20211018.pdf

PR validation:

This has been verified on CMSSW_12_4_0_pre3 with two test configs: Configuration/Skimming/test/test_EXOHighMET_onQCD_cfg.py and Configuration/Skimming/test/test_EXOHighMET_onData_cfg.py. In terms of both number of events and file size, the skim rate on data is roughly 2.7%. For completeness, we also tested on a very high-pT QCD MC sample, where the skim rate is 12%.

@cmsbuild cmsbuild changed the base branch from CMSSW_12_4_X to master April 29, 2022 22:01
@cmsbuild
Contributor

@afrankenthal, CMSSW_12_4_X branch is closed for direct updates. cms-bot is going to move this PR to master branch.
In future, please use cmssw master branch to submit your changes.

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37749/29618

  • This PR adds an extra 20KB to repository

@cmsbuild
Contributor

A new Pull Request was created by @afrankenthal (Andre Frankenthal) for master.

It involves the following packages:

  • Configuration/Skimming (pdmv)

@cmsbuild, @bbilin, @kskovpen, @jordan-martins can you please review it and eventually sign? Thanks.
@Martin-Grunewald, @missirol, @fabiocos this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@kskovpen
Contributor

please test

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-347331/24353/summary.html
COMMIT: 2ce6eea
CMSSW: CMSSW_12_4_X_2022-04-29-2300/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/37749/24353/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 8 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3704146
  • DQMHistoTests: Total failures: 14
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3704110
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
  • Checked 205 log files, 45 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

@afrankenthal afrankenthal marked this pull request as ready for review May 2, 2022 22:40
@kskovpen
Contributor

kskovpen commented May 4, 2022

+pdmv

@cmsbuild
Contributor

cmsbuild commented May 4, 2022

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Contributor

perrotta commented May 4, 2022

The two (big) test configs differ only in the input dataset and the global tag.
To ease maintainability, wouldn't it be better to merge them into a single config file that allows switching between the two cases, real data and QCD MC?

@afrankenthal
Contributor Author

Hi @perrotta, that makes sense. I've consolidated the two configs into one with a VarParsing option 'runOnData', which defaults to True since that is the case we really care about.
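For readers unfamiliar with this pattern, here is a minimal sketch of a single boolean switch selecting between the two cases. It deliberately uses the standard-library `argparse` module as a stand-in so that it runs outside a CMSSW environment; the merged config itself uses VarParsing, whose exact registration is not shown in this thread. The `SETTINGS` table, the placeholder global-tag strings, and the helper names are all hypothetical.

```python
import argparse

# Placeholder settings: the real input files and global tags are not given
# in this thread, so these strings are hypothetical stand-ins.
SETTINGS = {
    True: {"sample": "data", "global_tag": "DATA_GLOBAL_TAG_PLACEHOLDER"},
    False: {"sample": "QCD MC", "global_tag": "MC_GLOBAL_TAG_PLACEHOLDER"},
}


def parse_bool(text):
    """Interpret common spellings of a boolean command-line value."""
    value = text.strip().lower()
    if value in ("true", "1", "yes"):
        return True
    if value in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"not a boolean: {text!r}")


def choose_settings(argv):
    """Return (run_on_data, settings) for an explicit argument list."""
    parser = argparse.ArgumentParser(description="Toy data/MC switch")
    parser.add_argument("--runOnData", type=parse_bool, default=True,
                        help="True for real data (default), False for QCD MC")
    args = parser.parse_args(argv)
    return args.runOnData, SETTINGS[args.runOnData]


# Default: no argument given, so the data settings are used.
print(choose_settings([]))
# Explicitly switch to the QCD MC settings.
print(choose_settings(["--runOnData", "False"]))
```

Defaulting the switch to True mirrors the choice described above, so running the test without arguments exercises the data case.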

@cmsbuild
Contributor

cmsbuild commented May 4, 2022

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37749/29713

  • This PR adds an extra 16KB to repository

  • Found files with invalid states:

    • Configuration/Skimming/test/test_EXOHighMET_onQCD_cfg.py:
    • Configuration/Skimming/test/test_EXOHighMET_onData_cfg.py:
  • There are other open Pull requests which might conflict with changes you have proposed:

@cmsbuild
Contributor

cmsbuild commented May 4, 2022

Pull request #37749 was updated. @cmsbuild, @bbilin, @kskovpen, @jordan-martins can you please check and sign again.

@perrotta
Contributor

perrotta commented May 4, 2022

please test

@cmsbuild
Contributor

cmsbuild commented May 4, 2022

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-347331/24451/summary.html
COMMIT: 26c94f5
CMSSW: CMSSW_12_4_X_2022-05-04-1100/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/37749/24451/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

There are some workflows for which there are errors in the baseline:
39434.501 step 3
The results for the comparisons for these workflows could be incomplete
This most likely means that the IB is having errors in the relvals. The error does NOT come from this pull request

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3700548
  • DQMHistoTests: Total failures: 7
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 3700518
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -0.004 KiB( 48 files compared)
  • DQMHistoSizes: changed ( 312.0 ): -0.004 KiB MessageLogger/Warnings
  • Checked 205 log files, 45 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

@kskovpen
Contributor

kskovpen commented May 9, 2022

+pdmv

@cmsbuild
Contributor

cmsbuild commented May 9, 2022

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Contributor

perrotta commented May 9, 2022

+1


4 participants