Adapt Configuration/DataProcessing to python3 #28174

Merged
merged 3 commits into from Oct 16, 2019

fabiocos
Contributor

PR description:

Minimal adjustments to Configuration/DataProcessing to pass the unit test in the PY3 IB.

PR validation:

scram b runtests passes in both the standard (python2) and python3 builds. The output comparison requires edmConfigDump to work in python3, a feature added independently by #28173.

@cmsbuild
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-28174/12248

  • This PR adds an extra 16KB to repository

@cmsbuild
Contributor

A new Pull Request was created by @fabiocos (Fabio Cossutti) for master.

It involves the following packages:

Configuration/DataProcessing

@cmsbuild, @franzoni, @fabiocos, @kpedro88, @davidlange6 can you please review it and eventually sign? Thanks.
@Martin-Grunewald this is something you requested to watch as well.
@davidlange6, @slava77, @fabiocos you are the release manager for this.

cms-bot commands are listed here

@fabiocos
Contributor Author

please test

@cmsbuild
Contributor

cmsbuild commented Oct 15, 2019

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/2946/console Started: 2019/10/15 14:36

@fabiocos
Contributor Author

This update is enough to let the unit test run smoothly, but a further update is needed to dump the output correctly. I will profit from the update to edmConfigDump by @makortel in this PR.

@cmsbuild
Contributor

Comparison job queued.

@cmsbuild
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-b629f2/2946/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 34
  • DQMHistoTests: Total histograms compared: 2961064
  • DQMHistoTests: Total failures: 1
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2960722
  • DQMHistoTests: Total skipped: 341
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 33 files compared)
  • Checked 147 log files, 16 edm output root files, 34 DQM output files

@fabiocos
Contributor Author

code-checks

@cmsbuild
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-28174/12279

  • This PR adds an extra 16KB to repository

@cmsbuild
Contributor

Pull request #28174 was updated. @cmsbuild, @franzoni, @fabiocos, @kpedro88, @davidlange6 can you please check and sign again.

@fabiocos
Contributor Author

@smuzaffar as far as I can see, so far we have been using protocol=0 in pickle.dump, which according to https://docs.python.org/2/library/pickle.html is the default in python2.7 if no other value is specified explicitly. Indeed, the .pkl files are readable as ASCII text files (which does not mean that they are meaningful to a human reader). I now set this option explicitly, and with that the pickle files produced in python2 can be processed by python3 and produce an output.
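
For illustration, a minimal sketch of what pinning the protocol looks like. The dictionary is a hypothetical stand-in for the real configuration payload, not the object Configuration/DataProcessing actually pickles; only standard `pickle` calls are used.

```python
import pickle

# Hypothetical stand-in for the pickled configuration payload.
payload = {"scenario": "pp", "globalTag": "EXAMPLE_TAG", "nThreads": 4}

# Protocol 0 is the text-based protocol. It is the Python 2 default,
# while Python 3 picks a binary protocol unless told otherwise.
blob = pickle.dumps(payload, protocol=0)

# For this payload the protocol-0 stream contains only ASCII bytes.
is_ascii = all(byte < 128 for byte in bytearray(blob))

# Loading detects the protocol from the stream, so no argument is needed.
restored = pickle.loads(blob)
print(is_ascii, restored == payload)
```

Passing the protocol explicitly keeps the on-disk format the same whichever interpreter writes the file, which is presumably why the python2-produced files become readable from python3.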

What I notice, though, is that the outputs differ in all cases: py2 vs py3, and py2 vs py3_from_py2pickle. Apart from the trivial reordering of a number of lines, I see other differences.
One is in the number of significant digits that are written out:

502134,502141c502120,502127
<             xmin = cms.double(-3.14159265359)
<             xmin = cms.double(-3.14159265359)
<             xmin = cms.double(-3.14159265359)
<             xmin = cms.double(-3.14159265359)
<             xmin = cms.double(-3.14159265359)
<         xmin = cms.double(-3.14159265359)
<         xmin = cms.double(-3.14159265359)
<         xmin = cms.double(-3.14159265359)
---
>             xmin = cms.double(-3.141592653589793)
>             xmin = cms.double(-3.141592653589793)
>             xmin = cms.double(-3.141592653589793)
>             xmin = cms.double(-3.141592653589793)
>             xmin = cms.double(-3.141592653589793)
>         xmin = cms.double(-3.141592653589793)
>         xmin = cms.double(-3.141592653589793)
>         xmin = cms.double(-3.141592653589793)

or even worse

82767,82768c82762,82763
<         0.0, 0.1, 0.2, 0.3, 0.4, 
<         0.0, 0.1, 0.2, 0.3, 0.4, 
---
>         0.0, 0.10000000000000009, 0.20000000000000018, 0.2999999999999998, 0.3999999999999999, 
>         0.0, 0.10000000000000009, 0.20000000000000018, 0.2999999999999998, 0.3999999999999999, 
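
A possible explanation for both diffs, sketched under the assumption that the dump formats doubles with the interpreter's default float-to-string conversion: Python 2's `str()` kept 12 significant digits, while Python 3 prints the shortest string that round-trips to the same double. The helper `py2_style_str` is hypothetical and only approximates the Python 2 behaviour.

```python
import math

def py2_style_str(value):
    # Approximation of Python 2's str(float): 12 significant digits.
    # (Python 2 also appended ".0" to whole numbers, which "%.12g" does not.)
    return "%.12g" % value

values = [-math.pi, 0.10000000000000009, 0.20000000000000018]
for v in values:
    # Left: what a Python 2 style dump would show. Right: Python 3 repr.
    print(py2_style_str(v), "vs", repr(v))

# The 12-digit form does not round-trip to the same double.
print(float(py2_style_str(0.10000000000000009)) == 0.10000000000000009)
```

If this assumption holds, the stored values are the same in both cases and the python2 dump was simply hiding the trailing digits.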

But what worries me more is that some pieces of configuration present in py2 seem to be missing in py3. Dumping an example express configuration, I cannot find the following in the py3 version:

process.disc = cms.PSet(
    denominator = cms.VInputTag(
        cms.InputTag("pfMassDecorrelatedDeepBoostedJetTags","probZcc"), cms.InputTag("pfMassDecorrelatedDeepBoostedJetTags","probHcc"), cms.InputTag("pfMassDecorrelatedDeepBoostedJetTags","probQCDbb"), cms.InputTag("pfMassDecorrelatedDeepBoostedJetTags","probQCDcc"), cms.InputTag("pfMassDecorrelatedDeepBoostedJetTags","probQCDb"), 
        cms.InputTag("pfMassDecorrelatedDeepBoostedJetTags","probQCDc"), cms.InputTag("pfMassDecorrelatedDeepBoostedJetTags","probQCDothers")
    ),
    name = cms.string('ccvsLight'),
    numerator = cms.VInputTag(cms.InputTag("pfMassDecorrelatedDeepBoostedJetTags","probZcc"), cms.InputTag("pfMassDecorrelatedDeepBoostedJetTags","probHcc"), cms.InputTag("pfMassDecorrelatedDeepBoostedJetTags","probQCDcc"))
)
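
One way to look for such missing pieces systematically is to compare the sets of top-level `process.<label> =` assignments in the two dumps. This is only a sketch: the two dump strings are toy, hypothetical examples, and `top_level_labels` is a helper made up for illustration, not part of edmConfigDump.

```python
import re

# Matches top-level assignments such as "process.disc = cms.PSet(".
TOP_LEVEL = re.compile(r"^process\.(\w+)\s*=", re.MULTILINE)

def top_level_labels(dump_text):
    """Return the set of labels assigned at top level in a dumped config."""
    return set(TOP_LEVEL.findall(dump_text))

# Toy stand-ins for the py2 and py3 dumps.
py2_dump = """process.disc = cms.PSet(
    name = cms.string('ccvsLight')
)
process.maxEvents = cms.untracked.PSet(
    input = cms.untracked.int32(10)
)
"""
py3_dump = """process.maxEvents = cms.untracked.PSet(
    input = cms.untracked.int32(10)
)
"""

missing_in_py3 = sorted(top_level_labels(py2_dump) - top_level_labels(py3_dump))
print(missing_in_py3)
```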

@fabiocos
Contributor Author

@smuzaffar I have tried the following test: I dump the configurations of test wf 10824.0 produced with runTheMatrix.py as adjusted by you, and I compare the output in py2 and py3. The compact configurations agree, except for the order of the input options (which we had discussed) and of some sequences, which are anyway all present. If I expand them with edmConfigDump I see non-trivial differences as above. They are clearly due not to the pickle manipulation adjustments of this PR, but likely to the way dumpPython expands the configurations.
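
To separate pure reordering from real differences, the dumps can be compared as multisets of lines. A rough sketch, with a hypothetical helper name and toy inputs; stripping indentation discards the nesting context, so this is only a coarse first pass.

```python
from collections import Counter

def order_insensitive_diff(text_a, text_b):
    """Return (lines only in a, lines only in b), ignoring order and indentation."""
    lines_a = Counter(l.strip() for l in text_a.splitlines() if l.strip())
    lines_b = Counter(l.strip() for l in text_b.splitlines() if l.strip())
    # Counter subtraction keeps only the positive counts.
    return lines_a - lines_b, lines_b - lines_a

# Toy dumps: same content in a different order, plus one changed value.
dump_a = "process.a = cms.int32(1)\nprocess.b = cms.double(0.1)\n"
dump_b = "process.b = cms.double(0.10000000000000009)\nprocess.a = cms.int32(1)\n"

only_a, only_b = order_insensitive_diff(dump_a, dump_b)
print(sorted(only_a), sorted(only_b))
```

Lines that differ only in position drop out, so whatever remains is a candidate for a real change in content or formatting.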

I will now test this PR, and if it is OK I would suggest merging it, unless you or @davidlange6 and @Dr15Jones advise otherwise. This should technically fix a unit test; the problem of the different outputs seems to me something to be dealt with independently.

@fabiocos
Contributor Author

please test

@cmsbuild
Contributor

cmsbuild commented Oct 16, 2019

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/2983/console Started: 2019/10/16 14:25

@cmsbuild
Contributor

Comparison job queued.

@cmsbuild
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-b629f2/2983/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 34
  • DQMHistoTests: Total histograms compared: 2961064
  • DQMHistoTests: Total failures: 2
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2960721
  • DQMHistoTests: Total skipped: 341
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 33 files compared)
  • Checked 147 log files, 16 edm output root files, 34 DQM output files

@fabiocos
Contributor Author

Expanding the configurations built in python2 with and without this PR shows no difference, so I think it is safe to merge this PR.

@fabiocos
Contributor Author

+operations

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @davidlange6, @slava77, @smuzaffar, @fabiocos (and backports should be raised in the release meeting by the corresponding L2)

@fabiocos
Contributor Author

+1

@cmsbuild cmsbuild merged commit 00a00fe into cms-sw:master Oct 16, 2019
@fabiocos fabiocos deleted the fc-dpfix branch November 24, 2019 17:54