[14_0_X] Revert "Workaround to produce exactly same data products in Serial and CUDA backends in Alpaka modules possibly used at HLT" #45081


makortel
Contributor

PR description:

Reverts #44699, to be used in conjunction with #44978

PR validation:

None

If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:

Backport of #45080

…Serial and CUDA backends in Alpaka modules possibly used at HLT"
@cmsbuild
Contributor

A new Pull Request was created by @makortel for CMSSW_14_0_X.

It involves the following packages:

  • EventFilter/EcalRawToDigi (reconstruction)
  • HeterogeneousCore/AlpakaCore (heterogeneous)
  • RecoLocalCalo/EcalRecProducers (reconstruction)
  • RecoLocalTracker/SiPixelClusterizer (reconstruction)
  • RecoLocalTracker/SiPixelRecHits (reconstruction)
  • RecoParticleFlow/PFClusterProducer (reconstruction)
  • RecoParticleFlow/PFRecHitProducer (reconstruction)
  • RecoTracker/PixelSeeding (reconstruction)
  • RecoTracker/PixelVertexFinding (reconstruction)
  • RecoVertex/BeamSpotProducer (reconstruction, alca)

@fwyzard, @saumyaphor4252, @makortel, @jfernan2, @perrotta, @cmsbuild, @mandrenguyen, @consuegs can you please review it and eventually sign? Thanks.
@rovere, @mmusich, @missirol, @mtosi, @JanFSchulte, @tsusa, @gpetruc, @dkotlins, @GiacomoSguazzoni, @seemasharmafnal, @ferencek, @mroguljic, @threus, @tvami, @lgray, @VinInn, @youyingli, @thomreis, @hatakeyamak, @yuanchao, @felicepantaleo, @mmarionncern, @Martin-Grunewald, @apsallid, @francescobrivio, @rsreds, @argiro, @ReyerBand, @sameasy, @wang0jin, @fabiocos, @rchatter, @VourMa, @dgulhan, @tocheng this is something you requested to watch as well.
@antoniovilela, @sextonkennedy, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@cmsbuild
Contributor

cmsbuild commented May 28, 2024

cms-bot internal usage

@makortel
Contributor Author

enable gpu

@makortel
Contributor Author

@cmsbuild, please test

@antoniovilela
Contributor

@smuzaffar @makortel @cms-sw/orp-l2 @cms-sw/ppd-l2
Confirming:
Plan is to merge in 14_1_X #45080 together with #44892
Then, this PR #45081 plus #44978 will go in the special 14_0_X branch, starting from 14_0_7_patch1.

@smuzaffar
Contributor

Then, this PR #45081 plus #44978 will go in the special 14_0_X branch, starting from 14_0_7_patch1.

@antoniovilela, just to be sure, #45081 and #44978 will also go in CMSSW_14_0_X ... right? And once these PRs are merged, we want to have a CMSSW_14_0_7_HLT release (or please suggest a better name), which should have 14_0_7_patch1 + #45081 and #44978 ... right?

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-df7e0f/39584/summary.html
COMMIT: 5623257
CMSSW: CMSSW_14_0_X_2024-05-28-1100/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/45081/39584/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-df7e0f/39584/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-df7e0f/39584/git-merge-result

Comparison Summary

Summary:

  • You potentially added 108 lines to the logs
  • Reco comparison results: 81 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3440217
  • DQMHistoTests: Total failures: 2798
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3437399
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 6931.257 KiB( 48 files compared)
  • DQMHistoSizes: changed ( 12834.0,... ): 1035.999 KiB HLT/BTV
  • DQMHistoSizes: changed ( 141.042,... ): 929.087 KiB HLT/BTV
  • Checked 206 log files, 170 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

GPU Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 35 differences found in the comparisons
  • DQMHistoTests: Total files compared: 3
  • DQMHistoTests: Total histograms compared: 39740
  • DQMHistoTests: Total failures: 1072
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 38668
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 2 files compared)
  • Checked 8 log files, 10 edm output root files, 3 DQM output files
  • TriggerResults: no differences found
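
As a quick sanity check on the two DQMHistoTests summaries above, the counts can be cross-checked with a few lines of Python. This is only an illustrative sketch: the `DQMSummary` class and its method names are hypothetical (they are not part of cms-bot), and it assumes that failures, nulls, successes and skipped histograms together account for every compared histogram, which the numbers in this report happen to satisfy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DQMSummary:
    """Counts copied from one DQMHistoTests summary (hypothetical helper)."""
    total: int
    failures: int
    nulls: int
    successes: int
    skipped: int

    def is_consistent(self) -> bool:
        # Assumption: the four categories partition the compared histograms.
        return self.failures + self.nulls + self.successes + self.skipped == self.total

    def failure_fraction(self) -> float:
        # Fraction of compared histograms reported as failures.
        return self.failures / self.total


# Numbers taken verbatim from the two summaries in this comment.
CPU = DQMSummary(total=3440217, failures=2798, nulls=0, successes=3437399, skipped=20)
GPU = DQMSummary(total=39740, failures=1072, nulls=0, successes=38668, skipped=0)

if __name__ == "__main__":
    for name, summary in (("CPU", CPU), ("GPU", GPU)):
        print(f"{name}: consistent={summary.is_consistent()}, "
              f"failure fraction={summary.failure_fraction():.4%}")
```

Both summaries add up under that assumption. The GPU comparison shows a noticeably larger fraction of differing histograms than the standard one, though it also covers far fewer files (3 versus 49).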

@mmusich
Contributor

mmusich commented May 28, 2024

just to be sure, #45081 and #44978 will also go in CMSSW_14_0_X ... right?

I thought the whole point of the special branch was to NOT have #45081 and #44978 in 14_0_X until tested (in the special release). What am I missing?

@malbouis
Contributor

I thought the whole point of the special branch was to NOT have #45081 and #44978 in 14_0_X until tested (in the special release). What am I missing?

I think we do not yet want to merge 44978 into 140X. It should first be tested, and when we’re certain it is doing what’s intended, then we merge.
This is why we need the special release with 44978 (the definitive solution) in and 44968 (the workaround) out.

@smuzaffar
Contributor

just to be sure, #45081 and #44978 will also go in CMSSW_14_0_X ... right?

I thought the whole point of the special branch was to NOT have #45081 and #44978 in 14_0_X until tested (in the special release).

ok, understood now

@antoniovilela
Contributor

just to be sure, #45081 and #44978 will also go in CMSSW_14_0_X ... right?

I thought the whole point of the special branch was to NOT have #45081 and #44978 in 14_0_X until tested (in the special release).

ok, understood now

Many thanks @smuzaffar @mmusich @malbouis

smuzaffar added a commit that referenced this pull request May 30, 2024
@smuzaffar
Contributor

@makortel @Dr15Jones @antoniovilela @rappoccio, both #45080 and #44892 are merged in 14.1.X and will be part of the 11h00 IB today.
I have created the CMSSW_14_0_HLTTest branch, which has CMSSW_14_0_7_patch1 + changes from #45081 and #44978. The changes w.r.t. 14.0.7.patch1 are https://github.com/cms-sw/cmssw/compare/CMSSW_14_0_7_patch1...CMSSW_14_0_HLTTest?expand=1 . I can start the build of the CMSSW_14_0_7_HLTTest release later today (around 16h00) and can upload it once we have good results from the 14.1.X 11h IB. Let me know if all looks good for the CMSSW_14_0_HLTTest branch.

By the way, do we need all archs for this test release, or is the production arch (i.e. el8_amd64_gcc12) enough? Do we need a MULTIARCHS release too?

@francescobrivio
Contributor

By the way, do we need all archs for this test release, or is the production arch (i.e. el8_amd64_gcc12) enough? Do we need a MULTIARCHS release too?

  • I think we can stick to the production arch only
  • we do need MULTIARCHS too (for DAQ/HLT)

Thanks!
Francesco

@makortel
Contributor Author

The changes w.r.t. 14.0.7.patch1 are https://github.com/cms-sw/cmssw/compare/CMSSW_14_0_7_patch1...CMSSW_14_0_HLTTest?expand=1 . I can start the build of the CMSSW_14_0_7_HLTTest release later today (around 16h00) and can upload it once we have good results from the 14.1.X 11h IB. Let me know if all looks good for the CMSSW_14_0_HLTTest branch.

Looks good to me.

@smuzaffar
Contributor

Thanks @makortel. Note that #44978 is a backport of #44892, which was integrated in today's 11h 14.1.X IB, and one unit test failed. Is it safe to go ahead with the CMSSW_14_0_7_HLTTest release?

@makortel
Contributor Author

Note that #44978 is a backport of #44892, which was integrated in today's 11h 14.1.X IB, and one unit test failed.

Given the nature of the test, the failure is expected (#44892 (comment)). The test itself is brittle and will need an update.

Is it safe to go ahead with the CMSSW_14_0_7_HLTTest release?

Yes.

Thanks!

@fwyzard
Contributor

fwyzard commented Jun 7, 2024

+heterogeneous

@jfernan2
Contributor

+1

@perrotta
Contributor

+alca

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next CMSSW_14_0_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_14_1_X is complete. This pull request will now be reviewed by the release team before it's merged. @rappoccio, @antoniovilela, @sextonkennedy (and backports should be raised in the release meeting by the corresponding L2)

@rappoccio
Contributor

+1

@cmsbuild cmsbuild merged commit 94a8581 into cms-sw:CMSSW_14_0_X Jun 10, 2024
13 checks passed
@makortel makortel deleted the revert-44699-productIDModuleTypeResolverWorkaround_140x branch July 10, 2024 20:36