Read SoA scalars and Eigen columns from a ROOT streamer #39532

Merged
merged 8 commits into cms-sw:master on Oct 1, 2022

fwyzard
Contributor

@fwyzard fwyzard commented Sep 29, 2022

PR description:

Fix the ROOT streamers for the template-based SoA to read back the scalar variables, and to write and read back Eigen objects.

Update the TestAlpakaAnalyzer and CUDA TestPortableAnalyzer to check that scalars and matrices are read back correctly, and print a more verbose description of the SoA being read.

Update the test scripts so that the unit tests actually fail (exit with a non-zero value) if any of the cmsRun jobs fails.

PR validation:

The improved unit tests pass.

If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:

To be backported to CMSSW_12_5_X for the Alpaka migration.

@fwyzard
Contributor Author

fwyzard commented Sep 29, 2022

type bugfix

@fwyzard
Contributor Author

fwyzard commented Sep 29, 2022

enable gpu

@fwyzard
Contributor Author

fwyzard commented Sep 29, 2022

please test

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-39532/32305

@cmsbuild
Contributor

A new Pull Request was created by @fwyzard (Andrea Bocci) for master.

It involves the following packages:

  • DataFormats/SoATemplate (heterogeneous)
  • HeterogeneousCore/AlpakaCore (heterogeneous)
  • HeterogeneousCore/AlpakaTest (heterogeneous)

@makortel, @fwyzard can you please review it and eventually sign? Thanks.
@makortel, @missirol, @rovere this is something you requested to watch as well.
@perrotta, @dpiparo, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@cmsbuild
Contributor

-1

Failed Tests: UnitTests
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-474471/27840/summary.html
COMMIT: 848368c
CMSSW: CMSSW_12_6_X_2022-09-28-2300/el8_amd64_gcc10
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/39532/27840/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found errors in the following unit tests:

---> test testHeterogeneousCoreAlpakaTestWriteRead had ERRORS

Comparison Summary

@slava77 comparisons for the following workflows were not done due to missing matrix map:

  • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-474471/41834.0_TTbar_14TeV+2026D94+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14+DigiTrigger+RecoGlobal+HARVESTGlobal

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3433154
  • DQMHistoTests: Total failures: 3
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3433129
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
  • Checked 204 log files, 49 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

GPU Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • Reco comparison had 3 failed jobs
  • DQMHistoTests: Total files compared: 4
  • DQMHistoTests: Total histograms compared: 19876
  • DQMHistoTests: Total failures: 74
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 19802
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 3 files compared)
  • Checked 12 log files, 9 edm output root files, 4 DQM output files
  • TriggerResults: no differences found

Contributor

@makortel makortel left a comment

The first commit (e5736f4) is already part of the master branch. Can it be dropped?

, \
/* Column */ \
memcpy(BOOST_PP_CAT(NAME, _), onfile.BOOST_PP_CAT(NAME, _), sizeof(CPP_TYPE) * onfile.elements_); \
, \
/* Eigen column */ \
/* TODO: implement*/ \
std::cerr << "Warning: reading an Eigen column from a ROOT file is not implemented" << std::endl; \
Contributor

Would an exception make sense? (or does it really matter for now?)

Contributor Author

@ericcano should have an implementation of it later today, I'll add it here and update the PR description.
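
For readers following this thread, here is a minimal sketch of what "reading back" each kind of member could look like once the Eigen case is handled. It is not the code from this PR: `ToySoA`, `readBack`, `kRows` and `vStride_` are hypothetical names, and the component-major layout of the Eigen-like column (all first components, then all second components, separated by a stride) is an assumption made for illustration. Only the `memcpy` of a plain column mirrors the macro quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a SoA with one scalar, one column and one
// "Eigen-like" column of 3-component vectors. None of these names come
// from CMSSW; they only illustrate the three cases discussed above.
struct ToySoA {
  static constexpr std::size_t kRows = 3;  // components per Eigen-like element
  std::size_t elements_ = 0;               // number of elements in each column
  double r_ = 0.;                          // scalar: one value for the whole SoA
  std::vector<float> x_;                   // column: one value per element
  std::vector<float> v_;                   // Eigen-like column: kRows * stride values
  std::size_t vStride_ = 0;                // distance between components (assumed layout)

  explicit ToySoA(std::size_t n) : elements_(n), x_(n), v_(kRows * n), vStride_(n) {}
};

// Copy every kind of member from the object read from file into `target`.
void readBack(ToySoA& target, const ToySoA& onfile) {
  // Both objects must describe the same number of elements and the same stride.
  assert(target.elements_ == onfile.elements_);
  assert(target.vStride_ == onfile.vStride_);

  // Scalar: exactly one value, independent of the number of elements.
  std::memcpy(&target.r_, &onfile.r_, sizeof(double));

  // Column: one value per element, as in the macro quoted above.
  std::memcpy(target.x_.data(), onfile.x_.data(), sizeof(float) * onfile.elements_);

  // Eigen-like column: every component of every element, kRows * stride values.
  std::memcpy(target.v_.data(), onfile.v_.data(), sizeof(float) * ToySoA::kRows * onfile.vStride_);
}
```

Filling a two-element `ToySoA` with `r_ = 1`, calling `readBack` into a second two-element object and then checking `r_ == 1` is the toy analogue of the `view.r() == 1` assertion in TestAlpakaAnalyzer.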

@fwyzard
Contributor Author

fwyzard commented Sep 29, 2022

Unit Tests

I found errors in the following unit tests:

---> test testHeterogeneousCoreAlpakaTestWriteRead had ERRORS

The error is

cmsRun: /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_12_6_X_2022-09-28-2300/src/HeterogeneousCore/AlpakaTest/plugins/TestAlpakaAnalyzer.cc:86: virtual void TestAlpakaAnalyzer::analyze(const edm::Event&, const edm::EventSetup&): Assertion `view.r() == 1' failed.

and I've had it locally as well.

It looks like SCRAM is not checking out and recompiling all the dependencies (or we failed to declare some of them?).

@cmsbuild
Contributor

Pull request #39532 was updated. @makortel, @fwyzard can you please check and sign again.

@fwyzard fwyzard changed the title from "Implement reading back SoA scalars from a ROOT streamer" to "Implement reading SoA scalars and Eigen columns from a ROOT streamer" Sep 29, 2022
@fwyzard fwyzard changed the title from "Implement reading SoA scalars and Eigen columns from a ROOT streamer" to "Read SoA scalars and Eigen columns from a ROOT streamer" Sep 29, 2022

@fwyzard
Contributor Author

fwyzard commented Sep 30, 2022

please test with cms-sw/cmsdist#8107

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-474471/27868/summary.html
COMMIT: 954edbc
CMSSW: CMSSW_12_6_X_2022-09-29-2300/el8_amd64_gcc10
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/39532/27868/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

@slava77 comparisons for the following workflows were not done due to missing matrix map:

  • /pool/condor/dir_268586/jenkins/workspace/compare-root-files-short-matrix/data/PR-474471/41834.0_TTbar_14TeV+2026D94+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14+DigiTrigger+RecoGlobal+HARVESTGlobal

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 19 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3433154
  • DQMHistoTests: Total failures: 114
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3433018
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
  • Checked 204 log files, 49 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

GPU Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • Reco comparison had 3 failed jobs
  • DQMHistoTests: Total files compared: 4
  • DQMHistoTests: Total histograms compared: 19876
  • DQMHistoTests: Total failures: 8
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 19868
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 3 files compared)
  • Checked 12 log files, 9 edm output root files, 4 DQM output files
  • TriggerResults: no differences found

@fwyzard
Contributor Author

fwyzard commented Sep 30, 2022

@makortel do you have any other concerns, or is it ok to merge this ?

@makortel
Contributor

@makortel do you have any other concerns, or is it ok to merge this ?

Looks good, although I would still prefer to drop the first commit (e5736f4) since it is already part of the master branch (even if this time git/GitHub is able to see that it doesn't cause a conflict).

@fwyzard
Contributor Author

fwyzard commented Sep 30, 2022

That's the trick I found (by chance) so that all future PRs based on this same branch will merge without conflicts in both 12.5.x and 12.6.x.

@fwyzard
Contributor Author

fwyzard commented Sep 30, 2022

+heterogeneous

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Contributor

perrotta commented Oct 1, 2022

@fwyzard you recently tested this together with cms-sw/cmsdist#8107, but neither here nor in the description of that cmsdist PR is it written that the two have to be merged together: please confirm that this PR can be merged independently of cms-sw/cmsdist#8107, or otherwise provide different directions.

@fwyzard
Contributor Author

fwyzard commented Oct 1, 2022

@perrotta, technically this and 8107 are independent.

8107 is a fix that can go in by itself.

This PR implements changes that are independent from 8107, and (also) makes the unit test report any failures instead of ignoring them.

Some random failures are already happening, but without this PR they are ignored: 8107 fixes them.

@perrotta
Contributor

perrotta commented Oct 1, 2022

+1

@cmsbuild cmsbuild merged commit a89eb4c into cms-sw:master Oct 1, 2022
@perrotta
Contributor

perrotta commented Oct 2, 2022

@fwyzard perhaps you already noticed it, but after the merge of this PR the unit test in HeterogeneousCore/AlpakaTest/test/testHeterogeneousCoreAlpakaTestWriteRead.sh started failing, see e.g. https://cmssdt.cern.ch/SDT/cgi-bin/logreader/el8_amd64_gcc10/CMSSW_12_6_X_2022-10-01-1100/unitTestLogs/HeterogeneousCore/AlpakaTest#/

@fwyzard
Contributor Author

fwyzard commented Oct 2, 2022

Yes, the fix is cms-sw/cmsdist#8107.

@fwyzard
Contributor Author

fwyzard commented Oct 2, 2022

What I meant earlier is that the underlying test has been failing for a while (and cms-sw/cmsdist#8107 is the fix).

However, the failures were going unreported due to an annoying interplay between how scram runs the tests and how bash inherits an environment from a parent process running within a conditional statement. This PR fixes the scripts to actually report the errors, so the pre-existing failures are now apparent.
