DQM: Merge MonitorElement and ConcurrentMonitorElement #28092

schneiml · 2019-09-30T12:20:07Z

PR description:

Moving forward, the separation between ConcurrentMonitorElement and MonitorElement is rather pointless and annoying, given that all DQM will need to use ConcurrentMonitorElement semantics in the future.

This PR introduces a Frankenstein-ME that uses parts of the new ME implementation [1] (namely, the locked MonitorElementData::Value type) in the current ME code. The result is that the usual interactions with the ME are now locked and thread-safe, similar to the ConcurrentMonitorElement. This means we can now drop the ConcurrentMonitorElement and instead use MonitorElement*. Of course there are still plenty of non-thread-safe interactions in the MonitorElement, but these should not be used.

[1] https://github.com/schneiml/cmssw/blob/dqm-new-dqmstore-on-CMSSW_11_0_0_pre5/DQMServices/Core/src/MonitorElement.cc
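
To illustrate the idea of a locked value type described above, here is a minimal C++ sketch. It is not the CMSSW implementation: ToyHistogram, LockedValue, ToyMonitorElement and fillFromThreads are hypothetical names made up for this example, and the real MonitorElementData::Value may be structured quite differently. The sketch only shows the general pattern, assuming the payload is reachable solely through an access object that holds a mutex, so that the usual calls on a plain pointer to the element become safe from several threads.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the histogram payload (not a CMSSW or ROOT type).
struct ToyHistogram {
  std::vector<double> bins;
  double lo;
  double hi;
  std::size_t entries = 0;
  ToyHistogram(std::size_t nbins, double lo_, double hi_) : bins(nbins, 0.0), lo(lo_), hi(hi_) {}
};

// The payload is private; the only way to reach it is via access(),
// which returns an object that keeps the mutex locked while it is alive.
class LockedValue {
public:
  class Access {
    std::unique_lock<std::mutex> lock_;  // released when Access is destroyed
  public:
    ToyHistogram& hist;
    Access(std::mutex& m, ToyHistogram& h) : lock_(m), hist(h) {}
  };

  LockedValue(std::size_t nbins, double lo, double hi) : hist_(nbins, lo, hi) {}
  Access access() { return Access(mutex_, hist_); }

private:
  std::mutex mutex_;
  ToyHistogram hist_;
};

// Toy element whose usual interactions all go through the locked value.
class ToyMonitorElement {
public:
  ToyMonitorElement(std::size_t nbins, double lo, double hi) : value_(nbins, lo, hi) {}

  void Fill(double x) {
    auto a = value_.access();  // lock is held until the end of this function
    ToyHistogram& h = a.hist;
    assert(h.hi > h.lo && !h.bins.empty());
    ++h.entries;
    if (x >= h.lo && x < h.hi) {
      auto idx = static_cast<std::size_t>((x - h.lo) / (h.hi - h.lo) * h.bins.size());
      if (idx >= h.bins.size()) idx = h.bins.size() - 1;  // guard against rounding
      h.bins[idx] += 1.0;
    }
  }

  std::size_t getEntries() {
    auto a = value_.access();
    return a.hist.entries;
  }

private:
  LockedValue value_;
};

// Fill one shared element from several threads through a plain pointer.
std::size_t fillFromThreads(unsigned nThreads, unsigned fillsPerThread) {
  ToyMonitorElement element(10, 0.0, 1.0);
  ToyMonitorElement* me = &element;
  std::vector<std::thread> workers;
  for (unsigned t = 0; t < nThreads; ++t) {
    workers.emplace_back([me, fillsPerThread] {
      for (unsigned i = 0; i < fillsPerThread; ++i) me->Fill(0.5);
    });
  }
  for (auto& w : workers) w.join();
  return me->getEntries();
}
```

In this sketch, fillFromThreads(4, 1000) returns 4000, because every increment happens while the lock is held. Anything that bypasses the access object (for example, handing out a raw pointer to the underlying histogram) would not be protected, which corresponds to the non-thread-safe interactions mentioned above.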

PR validation:

No output changes expected, and none observed.
However, some of the tests were probably incorrect -- the code was almost certainly uncompilable yet passed the tests.

cmsbuild · 2019-09-30T12:21:45Z

The code-checks are being triggered in jenkins.

cmsbuild · 2019-09-30T12:23:50Z

The code-checks are being triggered in jenkins.

cmsbuild · 2019-09-30T12:29:37Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-28092/12080

This PR adds an extra 60KB to repository
There are other open Pull requests which might conflict with changes you have proposed:
- File DQMServices/Core/interface/MonitorElement.h modified in PR(s): DQM: Remove Tags. #28042, DQM: remove support for References inside CMSSW #28038
- File DQMServices/Core/src/DQMStore.cc modified in PR(s): DQM: Remove Tags. #28042, DQM: remove support for References inside CMSSW #28038
- File DQMServices/Core/src/MonitorElement.cc modified in PR(s): DQM: remove support for References inside CMSSW #28038
- File DQMServices/Core/test/BuildFile.xml modified in PR(s): DQM: remove support for References inside CMSSW #28038

cmsbuild · 2019-09-30T12:30:48Z

A new Pull Request was created by @schneiml (Marcel Schneider) for master.

It involves the following packages:

DQMServices/Core
DataFormats/Histograms

@smuzaffar, @andrius-k, @Dr15Jones, @kmaeshima, @schneiml, @cmsbuild, @jfernan2, @fioriNTU can you please review it and eventually sign? Thanks.
@barvic, @rovere this is something you requested to watch as well.
@davidlange6, @slava77, @fabiocos you are the release manager for this.

cms-bot commands are listed here

schneiml · 2019-09-30T12:37:52Z

please test

just to see if anything blows up so far.

cmsbuild · 2019-09-30T12:38:20Z

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/2727/console Started: 2019/09/30 14:40

cmsbuild · 2019-09-30T13:13:29Z

-1

Tested at: 1125474

You can see the results of the tests here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-dcb6be/2727/summary.html

I found the following errors while testing this PR

Failed tests: Build

Build:

I found compilation warnings when building: see details on the summary page.

cmsbuild · 2019-09-30T13:13:31Z

Comparison not run due to Build errors (RelVals and Igprof tests were also skipped)

rovere · 2019-09-30T14:21:47Z

@schneiml
Ciao Marcel, at some point CMS/DQM has to take a decision related to DQMGUI. This PR is fully destructive and the DQMGUI will not even compile against it.
We can still build it against older versions, which is what we do, but a clear road-map and plan have to be put in place.

schneiml · 2019-09-30T14:40:01Z

@rovere Ciao Marco, yes, we'll need to have this discussion in a bit broader scope at some point.

However, I think it is fine to remove the DQMGUI support on DQMStore's side, since we changed/need to change the inner workings of DQM dramatically on the CMSSW side (first threaded mode, in the future decentralized DQMStore). IIRC DQMGUI currently uses a CMSSW7 version of the DQMStore, so it has missed a lot of changes already (does the version it uses even have threaded mode?).

Even if we stick with the old DQMGUI, it is probably perfectly fine to maintain its version of the DQMStore separately. After all, all that we care about is file IO (and network IO, which is arguably a special case of file IO), and the file formats need to be stable for other reasons anyway. Also, no changes to the file formats are planned at the moment (except maybe minor changes to DQMIO).

fabiocos · 2019-11-04T08:12:05Z

@gennai @Sam-Harper following the comment #28092 (comment) could you please address the question of @schneiml or point to someone able to do that? The point raised by @fwyzard is relevant for the integration of this code, and as 11_0_X needs to be used for productions we cannot afford to use it just to validate choices a posteriori.

Sam-Harper · 2019-11-04T16:51:10Z

So I'm following up with our experts and hope to have a timing recipe for you soon.

In general though @schneiml, we would prefer if it could be run on our dedicated timing machines vocms003, vocms004, as this allows us an apples-to-apples comparison. Would this be a problem for you to do so? If not, could you email me your CERN username and I'll add you to the people who can access them.

fioriNTU · 2019-11-04T17:43:43Z

@Sam-Harper thank you very much for "activating" your experts so quickly! However, since this measurement has never been done in the context of the DQM core team, and moving on with this PR is quite urgent, would it be possible to have this test run by the HLT expert himself?

I do not know how long it can take, but I am sure it would take you 1/10 of the time it would take us. Do you think it is feasible?

Sam-Harper · 2019-11-05T10:17:41Z

@fioriNTU, it is not normally our policy to do this for developers, otherwise we quickly get overwhelmed.

However, in this particular case we are also having to commission the timing in 11_0_X as a reference (still iterating with experts), and therefore it is not any significant extra work to also run the timing for this PR at the same time. So we will do so.

Sam-Harper · 2019-11-06T12:34:26Z

So in case folks are wondering what happened: the test was run, we realised it was inadequate for this type of change, and it was re-run. Those results have been looked at and further tests were needed, which are ongoing.

schneiml · 2019-11-06T13:38:18Z

@Sam-Harper thanks for the update!

It looks like this still merges cleanly after #28297, so we might get away without another rebase (unless #28247 goes first).

Sam-Harper · 2019-11-07T12:01:01Z

Okay, timing is okay.

No real difference within the jitter.

Sam-Harper · 2019-11-07T12:02:58Z

Btw, in the first results I got a 10 ms increase, but it looks like it was just a random upset: after re-running it multiple times, we converged on the above result.

fioriNTU · 2019-11-07T12:48:09Z

Thank you very much @Sam-Harper !

fabiocos · 2019-11-07T14:50:47Z

@Sam-Harper @fwyzard @Martin-Grunewald are there other concerns/comments by HLT experts?
@christopheralanwest @tlampen @rekovic could you please check and comment in case?

fwyzard · 2019-11-07T20:29:19Z

From my side, I'm happy with the result of the test Sam has done.

Martin-Grunewald · 2019-11-08T05:29:02Z

+1

schneiml · 2019-11-08T08:52:05Z

Thanks @Sam-Harper !

Btw, if you have reasonably easy instructions on how to run these things now, I am still interested (in the longer term) in doing some profiling to see how much impact DQM has at all in these configurations. In Offline, it is typically very little, which is why I think that many speed hacks in the DQM code are out of place...

fabiocos · 2019-11-08T21:26:08Z

+1

fabiocos · 2019-11-08T21:27:01Z

@christopheralanwest @tlampen @rekovic please check this and comment if needed for a possible further iteration; the changes in your area look consequential to the heart of the PR

fabiocos · 2019-11-08T21:27:12Z

merge

christopheralanwest · 2019-11-08T23:20:41Z

+1

cmsbuild added this to the CMSSW_11_0_X milestone Sep 30, 2019

cmsbuild added code-checks-pending comparison-pending core-pending dqm-pending orp-pending pending-signatures tests-pending labels Sep 30, 2019

cmsbuild added code-checks-approved and removed code-checks-pending labels Sep 30, 2019

cmsbuild added tests-started and removed tests-pending labels Sep 30, 2019

cmsbuild added comparison-notrun tests-rejected and removed comparison-pending tests-started labels Sep 30, 2019

cmsbuild added alca-pending and removed code-checks-approved comparison-notrun tests-rejected labels Sep 30, 2019

cmsbuild added the comparison-available label Nov 1, 2019

schneiml mentioned this pull request Nov 4, 2019

DQM: Reduce TH1 Usage #28342

Merged

cmsbuild added hlt-approved and removed hlt-pending labels Nov 8, 2019

cmsbuild added orp-approved and removed orp-pending labels Nov 8, 2019

cmsbuild merged commit 8f02b2b into cms-sw:master Nov 8, 2019

cmsbuild added alca-approved and removed alca-pending labels Nov 8, 2019

apsallid mentioned this pull request Nov 13, 2019

[HGCal] Updates on HGCalValidator #28393

Merged

This was referenced Jan 16, 2020

Bugfix: Crash in DaqTestHistograms #28751

Merged

Bugfix: Crash in DaqTestHistograms (11_0_X) #28752

Merged

mmusich mentioned this pull request Jun 2, 2021

fix issue with DQMGlobalEDAnalyzer skeleton (mkdqmedanalyzer command) #33957

Merged
