Creating merger call with duplicate removal. #29550

Merged
merged 3 commits into from May 14, 2020

Conversation

laurenhay
Contributor

PR description:

Adding a merger class that checks for duplicates. This will be used by JME workflows.

PR validation:

These are new files; they affect no existing workflows. They were tested on the NanoAODJMAR workflows and work as intended.

if this PR is a backport please specify the original PR and why you need to backport that PR:

N/A

Also adding @rappoccio .

@cmsbuild
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-29550/14837

  • This PR adds an extra 16KB to repository

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-29550/14838

  • This PR adds an extra 16KB to repository

@cmsbuild
Contributor

A new Pull Request was created by @laurenhay for master.

It involves the following packages:

CommonTools/CandAlgos
CommonTools/UtilAlgos

@perrotta, @cmsbuild, @santocch, @slava77 can you please review it and eventually sign? Thanks.
@makortel this is something you requested to watch as well.
@silviodonato, @dpiparo you are the release manager for this.

cms-bot commands are listed here

@perrotta
Contributor

please test

@cmsbuild
Contributor

cmsbuild commented Apr 24, 2020

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/5847/console Started: 2020/04/24 10:44

@perrotta
Contributor

@laurenhay, please share some instructions here for testing this code.
Some more information in the PR description about possible use cases, or a link to a presentation (e.g. at a JME meeting), would also be useful.

@cmsbuild
Contributor

+1
Tested at: 8c33b88
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-79148a/5847/summary.html
CMSSW: CMSSW_11_1_X_2020-04-23-2300
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Contributor

Comparison job queued.

@cmsbuild
Contributor

Pull request #29550 was updated. @perrotta, @cmsbuild, @santocch, @slava77 can you please check and sign again.

@perrotta
Contributor

please test

@cmsbuild
Contributor

cmsbuild commented May 11, 2020

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/6234/console Started: 2020/05/11 22:47

@cmsbuild
Contributor

+1
Tested at: adfb42a
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-79148a/6234/summary.html
CMSSW: CMSSW_11_1_X_2020-05-11-1100
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Contributor

Comparison job queued.

@cmsbuild
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-79148a/6234/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 2 differences found in the comparisons
  • DQMHistoTests: Total files compared: 34
  • DQMHistoTests: Total histograms compared: 2697527
  • DQMHistoTests: Total failures: 2
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2697206
  • DQMHistoTests: Total skipped: 319
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 33 files compared)
  • Checked 147 log files, 16 edm output root files, 34 DQM output files

@perrotta
Contributor

+1

  • A new merger is cloned from the original Merger producer: it allows merging without duplicates and provides sorted collections, for future use in some JME workflows
  • Tested and working with the script provided
  • Jenkins tests pass and show no differences with the baseline, as expected
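
For illustration, the behaviour summarized above (merge several inputs, drop duplicates, return a sorted collection) can be sketched in standalone C++. This is a minimal sketch and not the code added in this PR: the `Cand` struct, the `mergeUnique` function, the use of an `id` field to detect duplicates, and the choice of descending `pt` as the sort key are all assumptions made for the example.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a candidate object; the real producer
// operates on CMSSW collections, not on this struct.
struct Cand {
  int id;     // assumed unique identifier used to detect duplicates
  double pt;  // assumed sort key
};

// Merge several input collections, skipping any entry whose id has
// already been added, then sort the result by descending pt.
std::vector<Cand> mergeUnique(const std::vector<std::vector<Cand>>& inputs) {
  std::vector<Cand> out;
  for (const auto& coll : inputs) {
    for (const auto& c : coll) {
      const bool seen =
          std::any_of(out.begin(), out.end(), [&](const Cand& o) { return o.id == c.id; });
      if (!seen)
        out.push_back(c);
    }
  }
  std::sort(out.begin(), out.end(), [](const Cand& a, const Cand& b) { return a.pt > b.pt; });
  return out;
}
```

The linear duplicate search keeps the sketch short; for large inputs a set of already-seen ids would likely be a better fit.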

@silviodonato
Contributor

merge
@slava77 @perrotta Can't we speed up https://github.com/cms-sw/cmssw/blob/master/CommonTools/UtilAlgos/interface/Merger.h#L61 by adding something like coll->reserve( sumOfSizeInputCollection )?
FYI @santocch

@cmsbuild cmsbuild merged commit 9a9818e into cms-sw:master May 14, 2020
@slava77
Contributor

slava77 commented May 14, 2020

merge
@slava77 @perrotta Can't we speed up https://github.com/cms-sw/cmssw/blob/master/CommonTools/UtilAlgos/interface/Merger.h#L61 by adding something like coll->reserve( sumOfSizeInputCollection )?
FYI @santocch

Yes, calling reserve will speed it up. I don't have a good estimate of the total speedup; fractionally it could be more than a factor of 2, but I think this is sub-ms per call even for rather large collections.
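
The reserve idea discussed above can be sketched with a plain `std::vector`. This is an illustration under stated assumptions, not the contents of Merger.h: the function name `mergeWithReserve` is hypothetical, and the output collection type used in the real producer may differ from `std::vector`.

```cpp
#include <cstddef>
#include <vector>

// Concatenate all inputs into one vector. The total size is computed
// first and reserved once, so push_back never has to reallocate.
template <typename T>
std::vector<T> mergeWithReserve(const std::vector<std::vector<T>>& inputs) {
  std::size_t total = 0;
  for (const auto& coll : inputs)
    total += coll.size();

  std::vector<T> out;
  out.reserve(total);  // single allocation for the whole merged collection
  for (const auto& coll : inputs)
    for (const auto& item : coll)
      out.push_back(item);
  return out;
}
```

Without the `reserve` call the vector would grow geometrically and copy its elements on each reallocation, which is where the suggested saving would come from.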

@santocch

+1

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will be automatically merged.


7 participants