increase size of buffer in InitMsgBuilder #37937

Merged: 1 commit, May 13, 2022

Contributor

@missirol missirol commented May 12, 2022

PR description:

This PR fixes a problem reported by @fwyzard when running the full HLT GRun menu using GlobalEvFOutputModule as cms.OutputModule (which is how the HLT produces output files online).

The issue occurs in the beginRun stage, when serializing the content of the "INI" streamer files: the size of the buffer given by InitMsgBuilder to EventMsgBuilder can be too small if the number of L1 and HL triggers is above certain values.

The current size of 256 is insufficient, for example, in the presence of 500 L1T seeds and 500 HLT paths.

The issue leads to a crash, which can be reproduced by applying the following minimal change to the relevant DAQ unit test and then running the test script.

diff --git a/EventFilter/Utilities/test/startFU.py b/EventFilter/Utilities/test/startFU.py
index c00d612aae8..9133a1196aa 100644
--- a/EventFilter/Utilities/test/startFU.py
+++ b/EventFilter/Utilities/test/startFU.py
@@ -128,6 +128,9 @@ process.tcdsRawToDigi.InputLabel = cms.InputTag("rawDataCollector")
 process.p1 = cms.Path(process.a*process.tcdsRawToDigi*process.filter1)
 process.p2 = cms.Path(process.b*process.filter2)
 
+for pidx in range(3,1000):
+  setattr(process, f'p{pidx}', cms.Path(process.b))
+
 process.streamA = cms.OutputModule("EvFOutputModule",
     SelectEvents = cms.untracked.PSet(SelectEvents = cms.vstring( 'p1' ))
 )

./EventFilter/Utilities/test/LocalRunBUFU.sh
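
For reference, the loop in the diff adds paths p3 through p999 on top of the existing p1 and p2. The sketch below mimics it with a plain `types.SimpleNamespace` standing in for the real `cms.Process` (an assumption made only so the snippet runs without CMSSW), to make the resulting path count explicit.

```python
from types import SimpleNamespace

# Stand-in for cms.Process: NOT the real CMSSW object, just an attribute bag.
process = SimpleNamespace(p1="path-1", p2="path-2")

# Same loop shape as in the diff: range(3, 1000) yields 3, 4, ..., 999.
for pidx in range(3, 1000):
    setattr(process, f"p{pidx}", f"path-{pidx}")

# Collect every attribute named like a path ("p" followed by digits only).
path_names = [name for name in vars(process)
              if name.startswith("p") and name[1:].isdigit()]
added = len(path_names) - 2  # subtract the pre-existing p1 and p2

print(len(path_names), added)
```

With the real configuration the same loop would attach `cms.Path(process.b)` objects instead of strings; the counting is the same.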

To my knowledge, this problem affects both GlobalEvFOutputModule and EvFOutputModule.

Given the deadline for 12_4_0_pre4 (and possible need for a patch release in 12_3_X), this PR applies a minimal fix increasing the buffer size.

A buffer size of 640 should be sufficient for 512 L1T seeds and 2000 HLT paths (the current HLT menu for pp collisions has approx. 800 paths).

In the near future, the algorithm could be improved to find an optimal buffer size based on the number of L1 and HL triggers in the configuration.
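
As a rough illustration of what such a size estimate might look like, here is a minimal sketch. It assumes, purely for illustration, that the serialized header stores one bit per L1 seed and two bits per HLT path, plus a fixed number of bytes for the remaining fields. The packing densities, the function name `estimate_header_bytes`, and the `fixed_bytes` parameter are assumptions of this sketch and are not taken from the InitMsgBuilder or EventMsgBuilder code.

```python
def estimate_header_bytes(n_l1: int, n_hlt: int, fixed_bytes: int,
                          l1_bits_per_seed: int = 1,
                          hlt_bits_per_path: int = 2) -> int:
    """Estimate the bytes needed for a header holding packed trigger bits.

    All packing parameters are hypothetical defaults, not values read
    from the actual streamer code.
    """
    if min(n_l1, n_hlt, fixed_bytes, l1_bits_per_seed, hlt_bits_per_path) < 0:
        raise ValueError("all arguments must be non-negative")
    # Round each bit count up to a whole number of bytes.
    l1_bytes = (n_l1 * l1_bits_per_seed + 7) // 8
    hlt_bytes = (n_hlt * hlt_bits_per_path + 7) // 8
    return fixed_bytes + l1_bytes + hlt_bytes


# Trigger-bit part only (fixed_bytes=0) for 512 L1T seeds and 2000 HLT paths.
print(estimate_header_bytes(512, 2000, fixed_bytes=0))
```

Under these assumptions the trigger bits alone would take 564 bytes for 512 L1T seeds and 2000 HLT paths; the size of the fixed part would have to be read from the actual header layout before relying on any such estimate.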

This PR will need to be backported at least to 12_3_X.

Debugged with @fwyzard.

PR validation:

Manual tests. This fix solves the original issue found when testing the full GRun menu.

If this PR is a backport, please specify the original PR and why you need to backport that PR:

N/A

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37937/29988

  • This PR adds an extra 12KB to repository

@cmsbuild
Contributor

A new Pull Request was created by @missirol (Marino Missiroli) for master.

It involves the following packages:

  • IOPool/Streamer (core)

@cmsbuild, @smuzaffar, @Dr15Jones, @makortel can you please review it and eventually sign? Thanks.
@makortel, @wddgit this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.


@missirol
Contributor Author

Attn: @smorovic

@fwyzard
Contributor

fwyzard commented May 12, 2022

@missirol thanks for debugging the problem!

@missirol
Contributor Author

urgent

Targets 12_4_0_pre4.

@missirol
Contributor Author

type bugfix

@missirol
Contributor Author

missirol commented May 12, 2022

> thanks for debugging the problem!

(much) help from @fwyzard is acknowledged :)

@missirol
Contributor Author

please test

@smorovic
Contributor

+1

@smorovic
Contributor

Thanks for debugging and fixing it. Are you also planning to create a backport for 12_3_X?

@smorovic
Contributor

Also, it looks like 'core' needs to sign off on the Streamer directory; I don't have the rights.

@missirol
Contributor Author

> Are you also planning to create a backport for 12_3_X?

Yes, definitely. I can prepare this tomorrow.

@Dr15Jones
Contributor

+1

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs after it passes the integration tests. This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4cb6ea/24690/summary.html
COMMIT: 79ef5e2
CMSSW: CMSSW_12_4_X_2022-05-12-1100/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/37937/24690/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3697209
  • DQMHistoTests: Total failures: 2
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3697185
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
  • Checked 206 log files, 45 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

@qliphy
Contributor

qliphy commented May 13, 2022

+1
