Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

prepare DQMStore for Framework concurrent run support #39491

Merged
merged 1 commit into from Sep 27, 2022

Conversation

wddgit
Copy link
Contributor

@wddgit wddgit commented Sep 23, 2022

PR description:

Modify DQMStore so that it continues to function when the pull request adding Framework support for concurrent runs is merged. That pull request, #38801, adds additional concurrency that was always planned in the design and documented on the design TWIKI, but in fact was not implemented. After that PR is merged, the streamBeginRun and streamEndRun transitions may run concurrently with some other transitions, including globalBeginLuminosityBlock. The current implementation of DQMStore relies on globalBeginLuminosityBlock occurring before beginStreamRun to call initLumi for Monitor elements with lumi scope. DQMStore will break if beginStreamRun is called after globalBeginLuminosityBlock because new MonitorElements will be booked and then initLumi will never be called.

This is a one line fix for this problem.

Note that this fix is intended to allow DQMStore to function properly in the case where the number of concurrent runs is configured to 1, which will be the default. There might or might not be additional work necessary in DQMStore to support a configuration where the number of concurrent runs is configured to be greater than 1. That case has not been tested. The plan is to get support for concurrent runs implemented in the Framework and later improve modules and services outside the Framework to support that mode of operation as resources are available.

Note that PR #38801 cannot be merged until after this PR is merged and we would like to merge that one soon.

@makortel and @Dr15Jones, you might be interested in this also.

PR validation:

Behavior should remain exactly the same, so this relies on existing tests.

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-39491/32232

  • This PR adds an extra 12KB to repository

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @wddgit (W. David Dagenhart) for master.

It involves the following packages:

  • DQMServices/Core (dqm)

@emanueleusai, @ahmad3213, @cmsbuild, @jfernan2, @syuvivida, @pmandrik, @micsucmed, @rvenditti can you please review it and eventually sign? Thanks.
@barvic this is something you requested to watch as well.
@perrotta, @dpiparo, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@wddgit
Copy link
Contributor Author

wddgit commented Sep 23, 2022

enable threading

@wddgit
Copy link
Contributor Author

wddgit commented Sep 23, 2022

please test

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-75a9ea/27752/summary.html
COMMIT: 9af0340
CMSSW: CMSSW_12_6_X_2022-09-22-2300/el8_amd64_gcc10
Additional Tests: THREADING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/39491/27752/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 51
  • DQMHistoTests: Total histograms compared: 3624368
  • DQMHistoTests: Total failures: 2
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3624344
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 50 files compared)
  • Checked 212 log files, 49 edm output root files, 51 DQM output files
  • TriggerResults: no differences found

@wddgit
Copy link
Contributor Author

wddgit commented Sep 26, 2022

I ran runTheMatrix.py with all the tests (not just the limited ones) and threading enabled. The tests all passed (except for 3 known and unrelated failures). That test also included PR #38801 and PR #39374. The run concurrency pull request is now only waiting on approval of these 3 PR's. We are hoping to get it merged into 12_6_X soon so we have time to gain experience with it before 12_6_0 is finalized.

@emanueleusai
Copy link
Member

+1

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Copy link
Contributor

+1

@cmsbuild cmsbuild merged commit 00ec235 into cms-sw:master Sep 27, 2022
@wddgit wddgit deleted the prepareForConcurrentRuns branch January 13, 2023 17:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants