Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix thread-safety issues in TrackingMonitor #29143

Merged
merged 2 commits into from Mar 9, 2020

Conversation

Dr15Jones
Copy link
Contributor

PR description:

Changing TrackingMonitor into a stream module uncovered thread-safety issues. These changes now make all activities thread-safe by avoiding direct use of the underlying ROOT objects.

PR validation:

The code compiles.

These are needed by TrackAnalyzer
Using the underlying TH* object is not guaranteed to be thread-safe.
@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 7, 2020

The code-checks are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 7, 2020

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-29143/14075

  • This PR adds an extra 44KB to repository

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 7, 2020

A new Pull Request was created by @Dr15Jones (Chris Jones) for master.

It involves the following packages:

DQM/TrackingMonitor
DQMServices/Core

@andrius-k, @kmaeshima, @schneiml, @cmsbuild, @jfernan2, @fioriNTU can you please review it and eventually sign? Thanks.
@barvic, @hdelanno, @makortel, @mtosi, @fioriNTU, @jandrea, @idebruyn, @threus this is something you requested to watch as well.
@davidlange6, @silviodonato, @fabiocos you are the release manager for this.

cms-bot commands are listed here

@Dr15Jones
Copy link
Contributor Author

please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 7, 2020

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/5059/console Started: 2020/03/07 22:12

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 7, 2020

+1
Tested at: 5d45920
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-1ee9bf/5059/summary.html
CMSSW: CMSSW_11_1_X_2020-03-07-1100
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 7, 2020

Comparison job queued.

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 8, 2020

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-1ee9bf/5059/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 2 differences found in the comparisons
  • DQMHistoTests: Total files compared: 34
  • DQMHistoTests: Total histograms compared: 2680577
  • DQMHistoTests: Total failures: 39
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2680219
  • DQMHistoTests: Total skipped: 319
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 33 files compared)
  • Checked 147 log files, 16 edm output root files, 34 DQM output files

@jfernan2
Copy link
Contributor

jfernan2 commented Mar 8, 2020

+1

@cmsbuild
Copy link
Contributor

cmsbuild commented Mar 8, 2020

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @davidlange6, @silviodonato, @fabiocos (and backports should be raised in the release meeting by the corresponding L2)


//NOTE: for full reproducibility when using threads, this loop needs to be
// a critical section
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How important is "full reproducibility when using threads"? (just thinking how soon a solution for that needs to found)

(e.g. are the divisions and get/setBinContent something that needs to be done on every event, or should they move to harvesting?)

@Dr15Jones
Copy link
Contributor Author

@makortel I already have a good idea how to make it reproducible, even without harvesting. I didn't make that change in this pull request as I was just doing the minimum necessary to get rid of the segmentation faults.

@makortel
Copy link
Contributor

makortel commented Mar 9, 2020

Ok, great!

@silviodonato
Copy link
Contributor

+1

@cmsbuild cmsbuild merged commit c66210d into cms-sw:master Mar 9, 2020
@silviodonato
Copy link
Contributor

@Dr15Jones we got many seg fault in CMSSW_11_1_X_2020-03-09-2300 (including #29143) due to TrackingMonitor

@Dr15Jones
Copy link
Contributor Author

@silviodonato My code that was meant to prevent deadlocks failed so we see deadlocks in the IB. I have isolated my mistake and I have a fix coming in the next few minutes which properly handles the deadlock prevention.

@Dr15Jones
Copy link
Contributor Author

see #29164

@Dr15Jones Dr15Jones deleted the fixStreamTrackingMonitor branch March 11, 2020 14:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants