Updated root to tip of branch v6-22-00-patches #6388

smuzaffar · 2020-11-09T22:10:07Z

No description provided.

smuzaffar · 2020-11-09T22:10:18Z

please test

cmsbuild · 2020-11-09T22:10:31Z

The tests are being triggered in jenkins.

CMSSW_11_2_ROOT622_X_2020-11-09-1100/slc7_amd64_gcc820: https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/10599/console Started: 2020/11/09 23:11

cmsbuild · 2020-11-09T22:10:32Z

A new Pull Request was created by @smuzaffar (Malik Shahzad Muzaffar) for branch IB/CMSSW_11_2_X/rootnext.

@cmsbuild, @smuzaffar, @mrodozov can you please review it and eventually sign? Thanks.
cms-bot commands are listed here

smuzaffar · 2020-11-09T22:12:37Z

lets get it in for 23h00 Ib

cmsbuild · 2020-11-10T04:18:18Z

-1

Tested at: 3939110

CMSSW: CMSSW_11_2_ROOT622_X_2020-11-09-1100
SCRAM_ARCH: slc7_amd64_gcc820
You can see the results of the tests here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2fc353/10599/summary.html

I found follow errors while testing this PR

Failed tests: UnitTests

Unit Tests:

I found errors in the following unit tests:

---> test MagneticFieldEngineTestDriver had ERRORS

cmsbuild · 2020-11-10T04:18:20Z

Comparison job queued.

cmsbuild · 2020-11-10T08:16:59Z

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-2fc353/10599/summary.html

Comparison Summary:

No significant changes to the logs found
Reco comparison results: 8 differences found in the comparisons
DQMHistoTests: Total files compared: 35
DQMHistoTests: Total histograms compared: 2529296
DQMHistoTests: Total failures: 1
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 2529273
DQMHistoTests: Total skipped: 22
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 34 files compared)
Checked 148 log files, 22 edm output root files, 35 DQM output files

smuzaffar · 2020-11-11T09:38:58Z

@slava77 , can you please check these reco comparison differences ?

slava77 · 2020-11-11T14:03:35Z

@slava77 , can you please check these reco comparison differences ?

Interesting

Reco comparison results: 8 differences found in the comparisons

it looks like all of the differences are appearing in particleFlowBlock linkData_ size, but then there is nothing downstream.

IIRC there is no DQM for particleFlowBlock; so, seeing no differences in DQM would not reject some possible subtleties of the reco plotting script, which in this case would be running a more recent version of root after loading the older version root file first.

It's useful to make a manual check in "native" builds for the baseline and this PR test.
I see that the baseline is CMSSW_11_2_X_2020-11-09-1100.
Just in case, is the PR build available on CVMFS?
It looks like the PR is merged in IB/CMSSW_11_2_X/rootnext , which should be in CMSSW_11_2_ROOT622_X_2020-11-09-2300/slc7_amd64_gcc820; I guess I can check this instead.
I will do it some time later today.

smuzaffar · 2020-11-11T14:09:01Z

Yes, you can use latest CMSSW_11_2_ROOT622_X IB .
I am opening a PR to test it in CMSSW_11_2_X (master Ibs)

slava77 · 2020-11-11T19:37:59Z

I picked step3.root from 11634.0_TTbar_14TeV+2021+TTbar_14TeV_TuneCP5_GenSim+Digi+Reco+HARVEST+ALCA for the baseline and the PR outputs.
For the baseline I inspected it in its original CMSSW_11_2_X_2020-11-09-1100,
while for this PR test output I used CMSSW_11_2_ROOT622_X_2020-11-09-2300.

The content of PFBlock::linkData_ looks different based on TTree::Draw (or Scan) used in the plotting script.

This data member type is an std::map<unsigned int, Link = struct {float distance; char test;}>.
It looks to me that with this PR the results are incorrect based on reported values in e.g. the first event 2nd block
en->Scan("recoPFBlocks_particleFlowBlock__RECO.obj[2].elements_@.size():recoPFBlocks_particleFlowBlock__RECO.obj[2].linkData_@.size():recoPFBlocks_particleFlowBlock__RECO.obj[2].linkData_[].first:recoPFBlocks_particleFlowBlock__RECO.obj[2].linkData_[].second.distance", "", "", 1,0)

the size is different 208 -> 80
the .first values do not match and the max value for the case with this PR is logically not possible from the way the algorithm filling these values works
the .second.distance field with this PR looks like random numbers in all cases

I can't recall offhand if anything else of type std::map is plotted in the reco comparisons; it may be that this is incidentally the only case.
At least for the PFBlocks, we do not read this data for further processing in the tested workflows; so, I can't be sure that the map reading is limited to FWLite (while it still works in full framework) or if there is a possibility that the std::map streaming is fully broken.

smuzaffar · 2020-11-11T22:10:31Z

@slava77 , testing latest root v6.22 for native 11.2.X IBs shows no comparison differences #6404

slava77 · 2020-11-11T22:45:34Z

@slava77 , testing latest root v6.22 for native 11.2.X IBs shows no comparison differences #6404

what's the difference between the builds?
IB/CMSSW_11_2_X/master...IB/CMSSW_11_2_X/rootnext looks perhaps only different in the root debug build flag
Is there anything else different?

smuzaffar · 2020-11-11T22:48:16Z

yes only difference is root debug build flag in rootnext branch every thing else is identical

slava77 · 2020-11-12T01:53:59Z

@pcanal
I'm responding here to cms-sw/cmssw#30359 (comment) for a bit more continuity in the thread

@slava77 Does reading the failing map from a file written with an old version of ROOT & CMSSW work (or fails) when reading with the new FWLite? (i.e. is the file badly read or badly written?)

it looks like a case of both badly written and badly read: I can read the old file with the new IB in CMSSW_11_2_ROOT622_X_2020-11-09-2300, and the values in the map .second.distance are (accidentally?) reasonable, however the key values and the size is different

[1]

*    Row   * Instance * elements_@.size() * linkData_@.size() * linkData_[].first * linkData_[].second.distance *
*        0 *        0 *       132 *        23 *         1 * 0.0010000 *
*        0 *        1 *       132 *        23 *        40 * 0.0010000 *
*        0 *        2 *       132 *        23 *       257 * 0.0010000 *
*        0 *        3 *       132 *        23 *       513 * 0.0010000 *
*        0 *        4 *       132 *        23 *       769 * 0.0010000 *

[2]
vs native baseline file made and read in CMSSW_11_2_X_2020-11-09-1100

*        0 *        0 *       132 *       208 *        40 * 0.0010000 *
*        0 *        1 *       132 *       208 *        41 * 0.0010000 *
*        0 *        2 *       132 *       208 *        42 * 0.0010000 *
*        0 *        3 *       132 *       208 *        44 * 0.0010000 *
*        0 *        4 *       132 *       208 *        47 * 0.0010000 *
*        0 *        5 *       132 *       208 *       164 * 0.0010000 *
*        0 *        6 *       132 *       208 *       171 * 0.0161606 *
*        0 *        7 *       132 *       208 *       293 * 0.0010000 *
*        0 *        8 *       132 *       208 *       300 * 0.0144026 *
*        0 *        9 *       132 *       208 *       421 * 0.0010000 *

the above agrees with values printed from the job itself (running in CMSSW_11_2_X_2020-11-09-1100)

[3]
vs the file from PR read in ROOT622_X

*        0 *        0 *       132 *        80 *        10 * 2.162e+10 *
*        0 *        1 *       132 *        80 *        37 * 2.193e-08 *
*        0 *        2 *       132 *        80 *        40 *         0 *
*        0 *        3 *       132 *        80 *       164 * 9.758e-10 *
*        0 *        4 *       132 *        80 *       235 * -1.510681 *
*        0 *        5 *       132 *        80 *       253 * 1.0767518 *

[4]
I also remade the file in CMSSW_11_2_ROOT622_X_2020-11-09-2300 and get somewhat different random numbers while reading it in the same release

*        0 *        0 *       132 *       126 *         0 * -10.97802 *
*        0 *        1 *       132 *       126 *        40 * 0.1121101 *
*        0 *        2 *       132 *       126 *       256 * 1.691e+37 *
*        0 *        3 *       132 *       126 *       267 * -1.89e-30 *
*        0 *        4 *       132 *       126 *       372 * -0.216617 *
*        0 *        5 *       132 *       126 *       511 *         0 *

note that the printouts from inside the job agree with the baseline release as shown in [2] (so, at runtime the data in memory is the same in [4] and in [2])

and the same reproduced file made in CMSSW_11_2_ROOT622_X_2020-11-09-2300 now read in CMSSW_11_2_X_2020-11-09-1100

*        0 *        0 *       132 *       208 *        40 * 0.1121101 *
*        0 *        1 *       132 *       208 *        41 * -10.97802 *
*        0 *        2 *       132 *       208 *        42 * 1.238e-08 *
*        0 *        3 *       132 *       208 *        44 * -0.014939 *
*        0 *        4 *       132 *       208 *        47 * -1.45e-37 *
*        0 *        5 *       132 *       208 *       164 * 257592.01 *

note that the size and keys turns out to be right, but the last column is all wrong.

pcanal · 2020-11-12T21:48:48Z

So we need to solve this.

I am tad bit confused. "so, at runtime the data in memory is the same in [4] and in [2])" but the output for the column 5th column is different between the 2 Scan output so I am not sure whether "and keys turns out to be right" and the data is different for different run or if it is wrong.

Either way, it seems like it might be a long standing problem, so I only need to debug it in one release. Can I get the instruction to run the writing part with the text output to capture the values a printed by the job itself? I assume the reading part is simply, open file and run Scan.

slava77 · 2020-11-12T22:44:23Z

I used CMSSW_11_2_ROOT622_X_2020-11-09-2300
and reran step3*py on step2.root input from https://cmssdt.cern.ch/SDT/jenkins-artifacts/ib-baseline-tests/CMSSW_11_2_X_2020-11-09-1100/slc7_amd64_gcc820/-GenuineIntel/matrix-results/11634.0_TTbar_14TeV+2021+TTbar_14TeV_TuneCP5_GenSim+Digi+Reco+HARVEST+ALCA/
which is the jenkins output of the baseline IB CMSSW_11_2_X_2020-11-09-1100

I guess this can be remade from scratch with runTheMatrix.py -l 11634.0

For the runtime, debugs can be added in RecoParticleFlow/PFProducer/plugins/PFBlockProducer.cc
PFBlockProducer::produce method
I simply had the following

    for (auto const& block : blocks) {
      str << "Block elements size "<<block.elements().size() << endl;
      str << "Block link data: size "<<block.linkData().size()<<" :"<<endl;
      for (auto const& link : block.linkData()) {
        str << "\t key "<<link.first<<" distance "<< link.second.distance<<" test "<<link.second.test<< endl;
      }
    }

    LogWarning("PFBlockProducer") << str.str() << endl;

which is more appropriate than the original for the purpose of this debugging.
Additionally, use process.particleFlowBlock.verbose = True in the config, which can as well be hardcoded if the c++ code printouts are needed.

HTH

Updated root to tip of branch v6-22-00-patches

3939110

cmsbuild added comparison-pending externals-pending orp-pending pending-signatures tests-pending labels Nov 9, 2020

cmsbuild added tests-started and removed tests-pending labels Nov 9, 2020

smuzaffar merged commit dc5b7e3 into cms-sw:IB/CMSSW_11_2_X/rootnext Nov 9, 2020

cmsbuild added tests-rejected and removed tests-started labels Nov 10, 2020

cmsbuild added comparison-available and removed comparison-pending labels Nov 10, 2020

smuzaffar mentioned this pull request Nov 11, 2020

[ROOT] Testing latest root v6.22 for 11.2.X master IBS #6404

Merged

slava77 mentioned this pull request Nov 11, 2020

Job crash in ROOT 6.22 IBs cms-sw/cmssw#30359

Closed

slava77 mentioned this pull request Nov 16, 2020

[TF] Fix needed for tensorflow 2.3.1 cms-sw/cmssw#32131

Merged

perrotta mentioned this pull request Nov 17, 2020

CSJetProducer bug fix for flow modulation cms-sw/cmssw#32158

Merged

slava77 mentioned this pull request Nov 18, 2020

Propagate timing to PF TICL Candiates cms-sw/cmssw#32046

Merged

slava77 mentioned this pull request Dec 28, 2020

New HLT DEBUG object: EgTrigSumObj : CMSSW_11_3_X cms-sw/cmssw#32486

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Updated root to tip of branch v6-22-00-patches #6388

Updated root to tip of branch v6-22-00-patches #6388

smuzaffar commented Nov 9, 2020

smuzaffar commented Nov 9, 2020 •

edited

cmsbuild commented Nov 9, 2020 •

edited

cmsbuild commented Nov 9, 2020

smuzaffar commented Nov 9, 2020

cmsbuild commented Nov 10, 2020

cmsbuild commented Nov 10, 2020

cmsbuild commented Nov 10, 2020

smuzaffar commented Nov 11, 2020

slava77 commented Nov 11, 2020

smuzaffar commented Nov 11, 2020

slava77 commented Nov 11, 2020

smuzaffar commented Nov 11, 2020

slava77 commented Nov 11, 2020

smuzaffar commented Nov 11, 2020

slava77 commented Nov 12, 2020

pcanal commented Nov 12, 2020

slava77 commented Nov 12, 2020

Updated root to tip of branch v6-22-00-patches #6388

Updated root to tip of branch v6-22-00-patches #6388

Conversation

smuzaffar commented Nov 9, 2020

smuzaffar commented Nov 9, 2020 • edited

cmsbuild commented Nov 9, 2020 • edited

cmsbuild commented Nov 9, 2020

smuzaffar commented Nov 9, 2020

cmsbuild commented Nov 10, 2020

cmsbuild commented Nov 10, 2020

cmsbuild commented Nov 10, 2020

smuzaffar commented Nov 11, 2020

slava77 commented Nov 11, 2020

smuzaffar commented Nov 11, 2020

slava77 commented Nov 11, 2020

smuzaffar commented Nov 11, 2020

slava77 commented Nov 11, 2020

smuzaffar commented Nov 11, 2020

slava77 commented Nov 12, 2020

pcanal commented Nov 12, 2020

slava77 commented Nov 12, 2020

smuzaffar commented Nov 9, 2020 •

edited

cmsbuild commented Nov 9, 2020 •

edited