Add workflows for profiling the GPU code #35540

Merged
merged 2 commits into from Oct 8, 2021

fwyzard
Contributor

@fwyzard fwyzard commented Oct 5, 2021

PR description:

Add four workflows for profiling the GPU code:

  • .504 Pixel-only local reconstruction and quadruplets
  • .508 Pixel-only local reconstruction and triplets
  • .514 ECAL-only local reconstruction
  • .524 HCAL-only local reconstruction

The workflows explicitly consume the GPU products, so they can only run on a GPU-equipped machine.

The transfer to the host and the conversion to the legacy format are not run.
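
As a rough illustration of how the four suffixes above map onto full workflow numbers, here is a minimal sketch. It assumes the `11634` base that appears in the validation table, and it assumes the common `runTheMatrix.py -w upgrade -l <list>` invocation; the helper names are hypothetical, and the flags should be checked against `runTheMatrix.py --help` in the release being used. The script only builds and prints the command; it does not run it.

```python
# Base workflow number: an assumption taken from the validation table (11634.5xx).
BASE_WORKFLOW = "11634"

# Suffixes of the four profiling workflows listed in the PR description.
PROFILING_SUFFIXES = [".504", ".508", ".514", ".524"]


def profiling_workflow_ids(base=BASE_WORKFLOW, suffixes=PROFILING_SUFFIXES):
    """Return the full workflow identifiers, e.g. '11634.504' (hypothetical helper)."""
    return [f"{base}{suffix}" for suffix in suffixes]


def run_the_matrix_command(workflow_ids):
    """Build, but do not execute, a runTheMatrix.py command line.

    Assumed usage: '-w upgrade' selects the upgrade workflow set and
    '-l' takes a comma-separated list of workflow numbers.
    """
    return ["runTheMatrix.py", "-w", "upgrade", "-l", ",".join(workflow_ids)]


if __name__ == "__main__":
    # Print the command so it can be inspected before running it on a GPU machine.
    print(" ".join(run_the_matrix_command(profiling_workflow_ids())))
```

Since the workflows consume the GPU products directly, any such command would have to be run on a GPU-equipped machine.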

PR validation:

Used to profile the various workflows on top of CMSSW_12_1_0_pre3, running over that release's TTbar relvals with pileup:

| measurement    | CMSSW_12_1_0_pre3 |
|----------------|-------------------|
| I/O throughput | ~ 2 kev/s         |
| 11634.504      | 1071 ± 3 ev/s     |
| 11634.508      | 560 ± 2 ev/s      |
| 11634.514      | 1391 ± 5 ev/s     |
| 11634.524      | 1354 ± 10 ev/s    |
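
To put these numbers in perspective, the throughputs can be converted to a mean time per event and compared with the I/O ceiling. The sketch below is only illustrative arithmetic on the values quoted above: it treats "~ 2 kev/s" as exactly 2000 ev/s (an assumption, since the figure is approximate) and uses first-order error propagation; the function names are hypothetical.

```python
# Approximate I/O ceiling: "~ 2 kev/s" taken as 2000 ev/s (an assumption).
IO_LIMIT_EV_PER_S = 2000.0

# Measured throughput and uncertainty in ev/s, copied from the table.
MEASUREMENTS = {
    "11634.504": (1071.0, 3.0),
    "11634.508": (560.0, 2.0),
    "11634.514": (1391.0, 5.0),
    "11634.524": (1354.0, 10.0),
}


def time_per_event_ms(rate, rate_err):
    """Convert a throughput in ev/s to a mean time per event in ms.

    The uncertainty is propagated to first order: dt / t = dr / r.
    """
    t = 1000.0 / rate
    return t, t * rate_err / rate


def fraction_of_io_limit(rate, io_limit=IO_LIMIT_EV_PER_S):
    """Fraction of the assumed I/O ceiling reached by a workflow."""
    return rate / io_limit


if __name__ == "__main__":
    for workflow, (rate, err) in MEASUREMENTS.items():
        t, t_err = time_per_event_ms(rate, err)
        frac = fraction_of_io_limit(rate)
        print(f"{workflow}: {t:.3f} +- {t_err:.3f} ms/event, "
              f"{100.0 * frac:.0f}% of the I/O limit")
```

All four workflows run below the assumed I/O ceiling, so under that assumption the measurements are not limited by input reading.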

@fwyzard
Contributor Author

fwyzard commented Oct 5, 2021

please test

@fwyzard
Contributor Author

fwyzard commented Oct 5, 2021

no need to test explicitly on GPU, since the new workflows are not part of any test

@cmsbuild
Contributor

cmsbuild commented Oct 5, 2021

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-35540/25762

@cmsbuild
Contributor

cmsbuild commented Oct 5, 2021

A new Pull Request was created by @fwyzard (Andrea Bocci) for master.

It involves the following packages:

  • Configuration/PyReleaseValidation (pdmv, upgrade)

@jordan-martins, @bbilin, @wajidalikhan, @AdrianoDee, @srimanob, @kskovpen can you please review it and eventually sign? Thanks.
@makortel, @kpedro88, @Martin-Grunewald, @missirol, @fabiocos, @slomeo this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@cmsbuild
Contributor

cmsbuild commented Oct 5, 2021

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-04c13f/19411/summary.html
COMMIT: b5262fc
CMSSW: CMSSW_12_1_X_2021-10-05-1100/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/35540/19411/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-04c13f/19411/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-04c13f/19411/git-merge-result

  • DAS Queries: The DAS query tests failed, see the summary page for details.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 16 differences found in the comparisons
  • DQMHistoTests: Total files compared: 40
  • DQMHistoTests: Total histograms compared: 3219394
  • DQMHistoTests: Total failures: 94
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 3219277
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -0.004 KiB( 39 files compared)
  • DQMHistoSizes: changed ( 312.0 ): -0.004 KiB MessageLogger/Warnings
  • Checked 169 log files, 37 edm output root files, 40 DQM output files
  • TriggerResults: no differences found

@fwyzard
Contributor Author

fwyzard commented Oct 8, 2021

ping @cms-sw/pdmv-l2

@fwyzard
Contributor Author

fwyzard commented Oct 8, 2021

ping @cms-sw/upgrade-l2

@srimanob
Contributor

srimanob commented Oct 8, 2021

+Upgrade

This PR is to add 4 workflows for profiling.

@kskovpen
Contributor

kskovpen commented Oct 8, 2021

+1

@cmsbuild
Contributor

cmsbuild commented Oct 8, 2021

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Contributor

perrotta commented Oct 8, 2021

+1

@cmsbuild cmsbuild merged commit 7403d38 into cms-sw:master Oct 8, 2021
@iarspider
Contributor

Hi @fwyzard,

we observe RelVal failures in GPU_X builds starting Friday: https://cmssdt.cern.ch/SDT/html/cmssdt-ib/#/relVal/CMSSW_12_1/2021-10-10-2300?selectedArchs=slc7_amd64_gcc900&selectedFlavors=GPU_X&selectedStatus=failed

Could these be related to this PR?

Thanks!

@fwyzard
Contributor Author

fwyzard commented Oct 14, 2021

Most likely it is due to #35566.
Should be fixed by #35630.

@fwyzard fwyzard deleted the add_GPU_profiling_workflows branch July 31, 2022 13:45