Minor Framework speed improvements #20447
During high-speed configuration tests, an unnecessary calculation of parentageID (a slow hash calculation) showed up in the performance profile.
The call to processEventAsync is made in the source's serial queue task. processEventAsync does a lot of work, all of which can be done in a different task. Now processEventAsync immediately launches a new TBB task that does the work of the original processEventAsync, but no longer within the timing of the serial queue task. This doubled the scalability of a simple hyper-fast configuration.
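The idea of moving work out of the serial queue task can be sketched as follows. This is a minimal illustration, not the framework's code: `SerialQueue`, `processEvent`, `runInline`, and `runOffloaded` are hypothetical names, and `std::async` stands in for the TBB task that the real change launches. The serial task only picks the next event and hands the heavy work to a separate task, so the queue is released quickly.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <mutex>
#include <vector>

// Hypothetical stand-in for the source's serial queue:
// at most one task runs inside it at a time.
class SerialQueue {
public:
  void run(const std::function<void()>& task) {
    std::lock_guard<std::mutex> lock(mutex_);
    task();
  }

private:
  std::mutex mutex_;
};

// Hypothetical per-event work, standing in for what processEventAsync does.
long processEvent(int eventId) {
  long sum = 0;
  for (int i = 0; i < 1000; ++i) {
    sum += (eventId + i) % 7;
  }
  return sum;
}

// Before: the heavy work runs inside the serial queue task,
// so the queue stays blocked for the whole duration.
long runInline(int nEvents) {
  SerialQueue sourceQueue;
  int nextEvent = 0;  // only touched inside the serial queue
  long total = 0;
  for (int i = 0; i < nEvents; ++i) {
    sourceQueue.run([&] {
      int eventId = nextEvent++;
      total += processEvent(eventId);
    });
  }
  return total;
}

// After: the serial task only selects the event and launches a new task;
// the heavy work runs outside the serial queue.
// Assumption: std::async is used here in place of a TBB task.
long runOffloaded(int nEvents) {
  SerialQueue sourceQueue;
  int nextEvent = 0;  // only touched inside the serial queue
  std::vector<std::future<long>> pending;
  for (int i = 0; i < nEvents; ++i) {
    sourceQueue.run([&] {
      int eventId = nextEvent++;
      pending.push_back(std::async(std::launch::async, processEvent, eventId));
    });
  }
  long total = 0;
  for (auto& result : pending) {
    total += result.get();  // wait for each offloaded task
  }
  return total;
}
```

Both functions compute the same result; the difference is only in how long each serial queue task holds the queue, which is what limits scaling when several streams compete for the source.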
The code-checks are being triggered in jenkins.
A new Pull Request was created by @Dr15Jones (Chris Jones) for master. It involves the following packages: FWCore/Framework @cmsbuild, @smuzaffar, @Dr15Jones can you please review it and eventually sign? Thanks. cms-bot commands are listed here
@fwyzard here are some further improvements to drive the hyper fast configurations.
please test
The tests are being triggered in jenkins.
-code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/PR-20447/602 Code check has found code style and quality issues which could be resolved by applying a patch in https://cmssdt.cern.ch/SDT/code-checks/PR-20447/602/git-diff.patch You can run
-1 Tested at: 45ad3bc The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic: You can see the results of the tests here: I found the following errors while testing this PR. Failed tests: UnitTests RelVals
I found errors in the following unit tests: ---> test TestFWCoreFrameworkGlobalStreamOne had ERRORS
When I ran the RelVals I found an error in the following workflows:
runTheMatrix-results/1306.0_SingleMuPt1_UP15+SingleMuPt1_UP15+DIGIUP15+RECOUP15+HARVESTUP15/step4_SingleMuPt1_UP15+SingleMuPt1_UP15+DIGIUP15+RECOUP15+HARVESTUP15.log
10024.0 step5 runTheMatrix-results/10024.0_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2017_GenSimFull+DigiFull_2017+RecoFull_2017+ALCAFull_2017+HARVESTFull_2017/step5_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2017_GenSimFull+DigiFull_2017+RecoFull_2017+ALCAFull_2017+HARVESTFull_2017.log
The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
Comparison not run due to runTheMatrix errors (RelVals and Igprof tests were also skipped)
Please test
The tests are being triggered in jenkins.
@smuzaffar is there a time limit on how long the tests can run? I cannot see why the tests for this PR fail in a step4 job running single-threaded, since the harvesting step never calls the code I changed!
+1 The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
Comparison job queued.
Comparison is ready Comparison Summary:
@Dr15Jones, we do run the full runTheMatrix with timeout 10800, but I see that this job was OOM-killed. For PR testing we run 8 workflows in parallel on an 8-core machine with 15 GB of memory. Maybe we should run cpu-1 workflows in parallel for PR testing.
please test
The tests are being triggered in jenkins.
@davidlange6 I can't think of any reason why this pull request would increase RSS in any noticeable way.
+1 The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
Comparison job queued.
Comparison is ready Comparison Summary:
+1
While testing the limits of the framework's event throughput using a 'hyper fast' configuration, a few performance bottlenecks were found. With these changes, running cmsRun on moderate hardware yields 33,000 events/sec with one thread and a maximum throughput of 180,000 events/sec using 9 threads.
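As a quick check on what these figures imply for scaling (simple arithmetic on the two quoted numbers, with S the speedup over one thread and E the per-thread efficiency):

```latex
S = \frac{180{,}000}{33{,}000} \approx 5.45,
\qquad
E = \frac{S}{9} \approx 0.61
```

That is roughly 20,000 events/sec per thread at 9 threads, compared with 33,000 events/sec for a single thread.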