Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle event meta data changes in streamer files [14_0] #44978

Merged
merged 7 commits into from
Jun 10, 2024

Conversation

Dr15Jones
Copy link
Contributor

PR description:

Moved storage of meta data needed for events into the first EventMessage sent before the subsequent events which need that meta data.
This fixed a problem seen online where different HLT nodes were generating different event meta data for the same luminosity block but online the event data for one node is used.

PR validation:

Code compiles and unit tests pass.

backport of #44892 (and #44753 to make the former trivial to backport)

makortel and others added 5 commits May 15, 2024 09:41
The event meta-data (e.g. BranchLists) is now added as an
EventMsg before the following EventMsg for which it applies.
This added EventMsg contains no data products. When seen,
the added EventMsg is handled as an artificial file boundary in
order to cause the needed stream synchronization.
This change is needed to handle how HLT merges parts of streamer
files which might have different meta-data.
- event meta data is cached at begin job or when new file is opened
- refactored StreamerOutputModuleCommon to be two classes so buffer
  could be handled separately for the caching
- added many unit tests to show GlobalEvFOutputModule properly
  writes the expected events and the data products are correct
@cmsbuild
Copy link
Contributor

cmsbuild commented May 15, 2024

A new Pull Request was created by @Dr15Jones for CMSSW_14_0_X.

It involves the following packages:

  • CalibCalorimetry/EcalLaserSorting (alca)
  • DQMServices/StreamerIO (dqm)
  • DataFormats/Streamer (core)
  • DataFormats/TestObjects (core)
  • EventFilter/Utilities (daq)
  • FWCore/Framework (core)
  • FWCore/Modules (core)
  • IOPool/Input (core)
  • IOPool/Streamer (core)

@cmsbuild, @nothingface0, @smuzaffar, @rvenditti, @tjavaid, @antoniovagnerini, @saumyaphor4252, @Dr15Jones, @syuvivida, @smorovic, @emeschi, @perrotta, @makortel, @consuegs can you please review it and eventually sign? Thanks.
@mmusich, @argiro, @thomreis, @rsreds, @yuanchao, @wddgit, @tocheng, @rchatter, @barvic, @rovere, @ReyerBand, @Martin-Grunewald, @makortel, @missirol, @wang0jin this is something you requested to watch as well.
@sextonkennedy, @antoniovilela, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@cmsbuild
Copy link
Contributor

cmsbuild commented May 15, 2024

cms-bot internal usage

@Dr15Jones Dr15Jones changed the title Streamer new format 14 0 Handle event meta data changes in streamer files [14_0] May 15, 2024
@Dr15Jones
Copy link
Contributor Author

please test

@nothingface0
Copy link
Contributor

nothingface0 commented May 15, 2024

@Dr15Jones Using 14_0_6, most of the DQM clients (DQM/Integration/python/clients) seem to have an issue opening existing stream files from run 380531:

Can't deserialize registry data (in open file): An exception of category 'FatalRootError' occurred.
   Additional Info:
      [a] Fatal Root Error: @SUB=TBufferFile::ReadClassBuffer
Could not find the StreamerInfo for version 11 of the class edm::SendJobHeader, object skipped at offset 33

Is this incompatibility expected?

@Dr15Jones
Copy link
Contributor Author

Is this incompatibility expected?

Yes. The streamer format had to be changed to accommodate the needed meta data. The streamer format has no backwards compatibility ability (by its original design).

You'll have to do what was described before, convert the original streamer into a ROOT file then use the ROOT file as input to a new job and have that new job write the new streamer format.

@cmsbuild
Copy link
Contributor

-1

Failed Tests: RelVals-INPUT
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-12a253/39389/summary.html
COMMIT: f695f87
CMSSW: CMSSW_14_0_X_2024-05-15-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/44978/39389/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals-INPUT

  • 4.764.76_ZMuSkim2012D/step2_ZMuSkim2012D.log

Comparison Summary

Summary:

  • You potentially added 59 lines to the logs
  • Reco comparison results: 40 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3333953
  • DQMHistoTests: Total failures: 0
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3333933
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 47 files compared)
  • Checked 202 log files, 165 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

@smorovic
Copy link
Contributor

@cmsbuild please test

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-12a253/39404/summary.html
COMMIT: f695f87
CMSSW: CMSSW_14_0_X_2024-05-16-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/44978/39404/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 47 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3333953
  • DQMHistoTests: Total failures: 6
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3333927
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 47 files compared)
  • Checked 202 log files, 165 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

@makortel
Copy link
Contributor

makortel commented Jun 5, 2024

@makortel
Copy link
Contributor

makortel commented Jun 5, 2024

+core

(this is just to have my signature on the record, avoid the future merge to become pending on core signature) Merging should still wait for positive outcome from the ongoing test. @cms-sw/orp-l2 Should we put this PR on hold, or is the situation clear enough without?

@antoniovilela
Copy link
Contributor

+core

(this is just to have my signature on the record, avoid the future merge to become pending on core signature) Merging should still wait for positive outcome from the ongoing test. @cms-sw/orp-l2 Should we put this PR on hold, or is the situation clear enough without?

Hi Matti, this PR is on hold. From the e-mail thread, one had the impression the tests were successful, but it would be good to have confirmation.
Thanks.

@tjavaid
Copy link

tjavaid commented Jun 10, 2024

backport of #44892

@tjavaid
Copy link

tjavaid commented Jun 10, 2024

+1

@perrotta
Copy link
Contributor

+alca

@rappoccio
Copy link
Contributor

unhold

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next CMSSW_14_0_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_14_1_X is complete. This pull request will now be reviewed by the release team before it's merged. @antoniovilela, @sextonkennedy, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@rappoccio
Copy link
Contributor

+1

@cmsbuild cmsbuild merged commit 3ef7dab into cms-sw:CMSSW_14_0_X Jun 10, 2024
10 checks passed
mmusich added a commit to mmusich/hltScripts that referenced this pull request Jun 15, 2024
mmusich added a commit to mmusich/hltScripts that referenced this pull request Jun 15, 2024
@Dr15Jones Dr15Jones deleted the streamerNewFormat_14_0 branch July 17, 2024 14:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet