
[NanoAOD] Don't handle column types redundantly anymore #30436

Merged
merged 8 commits into from Jul 24, 2020

Conversation

@guitargeek (Contributor) commented Jun 26, 2020

PR description:

This is just a re-opening of cms-nanoAOD#401 from September 2019, but this time to upstream CMSSW instead of cmssw-nanoAOD. Back then I was interested in storing columns of so-far-unsupported basic types in NanoAOD (like uint16_t), and made some developments which I think are the first steps towards this. Since #30273 set the precedent that technical NanoAOD developments are acceptable outside cmssw-nanoAOD, I am reopening the PR here almost a year later. The original description follows.


Hi NanoAOD devs!

I hope this is the right cmssw fork and branch for this PR.

Yesterday I wanted to introduce some new column types in my private NanoAOD productions (int16_t for example, to save a bit of space) and use them in the flat table producers. However, I realized that many parts of the NanoAOD code have to be tweaked to do this, as the way column types are handled is not completely trivial.

One source of complication is that when you add a column to a flat table with addColumn(), you have to pass the type as a template argument as well as an enum value in the function parameters. After working a bit with the code, I understood that this is redundant, because the check_type function makes sure you always use the right enum value with the right template parameter. Therefore, we could just drop the enum parameter and deduce it from the template argument, in which case check_type would no longer be needed.

The only tricky part is bool columns, which should actually be represented by a uint8_t vector. So far, the logic to take care of this had to be implemented in the plugins that use the FlatTable class, but I think I found a way to put this logic directly into the FlatTable class, so one can just use addColumn to create bool columns and they will internally be stored in the uint8_t vector.

What do you think? This already simplifies the type handling quite a bit, and I think it's a good path towards a FlatTable class that supports all the basic types you can also store in TTrees.

I tested this with the local matrix tests so far; can the nano-bot tests still be done here? That would be very cool!

Thanks for considering this and cheers,
Jonas

PR validation:

If this PR is a backport, please specify the original PR and why you need to backport that PR:

No backport intended.

@cmsbuild (Contributor)

The code-checks are being triggered in jenkins.

@cmsbuild (Contributor)

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-30436/16539

@cmsbuild (Contributor)

A new Pull Request was created by @guitargeek (Jonas Rembser) for master.

It involves the following packages:

DataFormats/NanoAOD
PhysicsTools/NanoAOD

@perrotta, @cmsbuild, @fgolf, @slava77, @santocch, @peruzzim can you please review it and eventually sign? Thanks.
@gpetruc, @rovere this is something you requested to watch as well.
@silviodonato, @dpiparo you are the release managers for this.

cms-bot commands are listed here

@slava77 (Contributor) commented Jun 26, 2020

@cmsbuild please test

@cmsbuild (Contributor) commented Jun 26, 2020

The tests are being triggered in jenkins.

@guitargeek changed the title from "Don't handle column types redundantly anymore" to "[NanoAOD] Don't handle column types redundantly anymore" on Jun 26, 2020
@cmsbuild (Contributor)

-1

Tested at: 028e589

CMSSW: CMSSW_11_2_X_2020-06-26-1100
SCRAM_ARCH: slc7_amd64_gcc820

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ce380e/7448/git-log-recent-commits
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ce380e/7448/git-merge-result

You can see the results of the tests here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ce380e/7448/summary.html

I found the following errors while testing this PR

Failed tests: Build ClangBuild

  • Build:

I found a compilation error when building:

                 from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_11_2_X_2020-06-26-1100/src/DQMOffline/Muon/src/DiMuonHistograms.cc:2:
/cvmfs/cms-ib.cern.ch/nweek-02634/slc7_amd64_gcc820/external/tbb/2020_U2/include/tbb/task.h:21:139: note: #pragma message: TBB Warning: tbb/task.h is deprecated. For details, please see Deprecated Features appendix in the TBB reference manual.
 #pragma message("TBB Warning: tbb/task.h is deprecated. For details, please see Deprecated Features appendix in the TBB reference manual.")
                                                                                                                                           ^
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_11_2_X_2020-06-26-1100/src/DQMOffline/Muon/src/GEMEfficiencyHarvester.cc: In member function 'void GEMEfficiencyHarvester::doResolution(dqm::legacy::DQMStore::IBooker&, dqm::legacy::DQMStore::IGetter&, std::__cxx11::string)':
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_11_2_X_2020-06-26-1100/src/DQMOffline/Muon/src/GEMEfficiencyHarvester.cc:281:38: error: 'maxEtaPartition_' is not a member of 'GEMeMap'
       res_data[key].reserve(GEMeMap::maxEtaPartition_);
                                      ^~~~~~~~~~~~~~~~
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_11_2_X_2020-06-26-1100/src/DQMOffline/Muon/src/GEMEfficiencyHarvester.cc:307:40: error: 'maxEtaPartition_' is not a member of 'GEMeMap'
         new TH2F(name, title, GEMeMap::maxEtaPartition_, 0.5, GEMeMap::maxEtaPartition_ + 0.5, 2, -0.5, 1.5);
                                        ^~~~~~~~~~~~~~~~

  • Clang:

I found a compilation error while trying to compile with clang. Command used:

USER_CUDA_FLAGS='--expt-relaxed-constexpr' USER_CXXFLAGS='-Wno-register -fsyntax-only' scram build -k -j 32 COMPILER='llvm compile'

In file included from /cvmfs/cms-ib.cern.ch/week0/slc7_amd64_gcc820/cms/cmssw-patch/CMSSW_11_2_X_2020-06-26-1100/src/FWCore/Framework/src/Worker.h:31:
In file included from /cvmfs/cms-ib.cern.ch/week0/slc7_amd64_gcc820/cms/cmssw-patch/CMSSW_11_2_X_2020-06-26-1100/src/FWCore/Concurrency/interface/WaitingTask.h:25:
/cvmfs/cms-ib.cern.ch/nweek-02634/slc7_amd64_gcc820/external/tbb/2020_U2/include/tbb/task.h:21:9: warning: TBB Warning: tbb/task.h is deprecated. For details, please see Deprecated Features appendix in the TBB reference manual. [-W#pragma-messages]
#pragma message("TBB Warning: tbb/task.h is deprecated. For details, please see Deprecated Features appendix in the TBB reference manual.")
        ^
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_11_2_X_2020-06-26-1100/src/DQMOffline/Muon/src/GEMEfficiencyHarvester.cc:281:38: error: no member named 'maxEtaPartition_' in 'GEMeMap'
      res_data[key].reserve(GEMeMap::maxEtaPartition_);
                            ~~~~~~~~~^
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_11_2_X_2020-06-26-1100/src/DQMOffline/Muon/src/GEMEfficiencyHarvester.cc:307:40: error: no member named 'maxEtaPartition_' in 'GEMeMap'
        new TH2F(name, title, GEMeMap::maxEtaPartition_, 0.5, GEMeMap::maxEtaPartition_ + 0.5, 2, -0.5, 1.5);
                              ~~~~~~~~~^



@cmsbuild (Contributor)

Comparison not run due to Build errors (RelVals and Igprof tests were also skipped)

@silviodonato (Contributor)

please test
#30440 should solve the error

@cmsbuild (Contributor)

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ce380e/8175/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 34
  • DQMHistoTests: Total histograms compared: 2525996
  • DQMHistoTests: Total failures: 1
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2525948
  • DQMHistoTests: Total skipped: 47
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 33 files compared)
  • Checked 144 log files, 17 edm output root files, 34 DQM output files

@slava77 (Contributor) commented Jul 21, 2020

+1

for #30436 00b26e7

@makortel (Contributor)

+1

@mariadalfonso (Contributor)

+1

@mariadalfonso (Contributor)

> In conclusion: I can't do any backports of this PR without manual changes, which would only make the situation worse. Therefore, I advise against merging my own PR before the fundamental NanoAOD backporting problem is solved. Backporting the clang-format commit might be a good start.

What are the benefits of backporting (even to 106X)?
Can Nano move to a more recent production release, e.g. 11_1_X?

For analysis or dedicated production, I think 11_1_X can indeed be used.
But when we do the UL re-miniAOD, I imagine it will be 10_6_X, and at the same time some version of Nano will be produced.

We are planning an XPOG meeting this Tuesday with a general discussion:
https://indico.cern.ch/event/939863

@guitargeek (Contributor, Author)

Thank you very much!

@slava77 (Contributor) commented Jul 22, 2020

> In conclusion: I can't do any backports of this PR without manual changes, which would only make the situation worse. Therefore, I advise against merging my own PR before the fundamental NanoAOD backporting problem is solved. Backporting the clang-format commit might be a good start.
>
> What are the benefits of backporting (even to 106X)?
> Can Nano move to a more recent production release, e.g. 11_1_X?
>
> For analysis or dedicated production, I think 11_1_X can indeed be used.
> But when we do the UL re-miniAOD, I imagine it will be 10_6_X, and at the same time some version of Nano will be produced.
>
> We are planning an XPOG meeting this Tuesday with a general discussion:
> https://indico.cern.ch/event/939863

I wish we didn't have to backport so much stuff to 10_6_X for the re-miniAOD either; since we actually support the re-mini in master, a more backport-friendly recent production release could have been used.
At least we should minimize the need to backport to 10_2_X as well.
Let's see what comes out of the meeting discussion.

@santocch

+1

@cmsbuild (Contributor)

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo (and backports should be raised in the release meeting by the corresponding L2)

@silviodonato (Contributor)

@mariadalfonso @gouskos @peruzzim Can I merge this PR? Shall I wait for cms-nanoAOD#401?

@guitargeek (Contributor, Author)

Hi @silviodonato, I think Mariarosaria was pretty clear in #30436 (comment) and especially cms-nanoAOD#401 (comment) that this PR to cms-sw is the one to merge.

@silviodonato (Contributor)

+1
