Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Throw GetByLabelWithoutRegistration exception also when no module in a job registered consumption of a given branch #38875

Merged
merged 2 commits into from Aug 2, 2022

Conversation

makortel
Copy link
Contributor

@makortel makortel commented Jul 27, 2022

PR description:

#38864 reported a case where an EDAnalyzer using Event::getByLabel() without declaring the consumed data products got a ProductNotFound exception instead of more explanatory GetByLabelWithoutRegistration exception. I traced this down to ProductResolverIndexHelper not being populated (from ProductRegistry with non-consumed branches branches that have product type that is not consumed in the job (either via the same or different branch)

// The main type is not consumed, and either
// there is no contained type, or if there is,
// neither it nor any of its base classes are consumed.
// Don't insert anything in the lookup tables.

leading to ProductResolverIndexHelper::index() returning a ProductResolverIndexInvalid and Principal::findProductByLabel() returning nullptr, which is interpreted as ProductNotFound by calling code.

This PR proposes to change the Principal::findProductByLabel() to check the branch existence directly from ProductRegistry after getting ProductResolverIndexInvalid, and throw GetByLabelWithoutRegistration exception if the branch in question is present. This adds some cost, but only in case

  • code calls consumes but uses (deprecated) getByLabel() to get the data, and
  • product in question does not exist, and
  • code checks the Handle validity and is able to ignore the missing product

I would expect this case to be rare, and therefore the cost to be acceptable. I also think we should make an effort to address the remaining getByLabel() calls in the near-to-medium-future.

PR validation:

Checked that the test configuration in #38864 now throws GetByLabelWithoutRegistration exception.

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-38875/31299

  • This PR adds an extra 24KB to repository

  • There are other open Pull requests which might conflict with changes you have proposed:

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @makortel (Matti Kortelainen) for master.

It involves the following packages:

  • FWCore/Framework (core)

@cmsbuild, @smuzaffar, @Dr15Jones, @makortel can you please review it and eventually sign? Thanks.
@wddgit this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@Dr15Jones
Copy link
Contributor

I don't think this is the proper solution. It seems to miss the case where

  1. module A calls getByLabel without doing the consumes
  2. module B just happens to consumes that exact same data

This would cause a race condition as to whether A gets the data if A does not also happen to always be scheduled to run after B.

@Dr15Jones
Copy link
Contributor

My point being that getByLabel should throw an exception for not calling comsumes for that data product even if the data product happens to be accessible in the Event at the time of the call.

@makortel
Copy link
Contributor Author

makortel commented Jul 27, 2022

If B consumes the exact same data, this code (referring now to code in master)

ProductResolverIndex index =
productLookup().index(kindOfType, typeID, label.c_str(), instance.c_str(), process.c_str());

should return a valid index. In that case the code I added in this PR will not get run, and the existing check
if (UNLIKELY(consumer and (not consumer->registeredToConsume(index, false, branchType())))) {
failedToRegisterConsumes(kindOfType, typeID, label, instance, process);
}

should check if this module declared the consumption of the data product, and throw the GetByLabelWithoutRegistration exception if consumes() was not called.

I think this case was already covered, and I don't see how changes in this PR would impact that.

@Dr15Jones
Copy link
Contributor

What made it not possible to re-use the logic on line 814?

@Dr15Jones
Copy link
Contributor

How about adding a unit test option for this case?

@makortel
Copy link
Contributor Author

What made it not possible to re-use the logic on line 814?

The index in line 814 would be ProductResolverIndexInvalid in this case, and

bool EDConsumerBase::registeredToConsume(ProductResolverIndex iIndex,
bool skipCurrentProcess,
BranchType iBranch) const {
for (auto it = m_tokenInfo.begin<kLookupInfo>(), itEnd = m_tokenInfo.end<kLookupInfo>(); it != itEnd; ++it) {
if (it->m_index.productResolverIndex() == iIndex and it->m_index.skipCurrentProcess() == skipCurrentProcess and
it->m_branchType == iBranch) {
return true;
}
}
return false;
}

would then return false always.

@makortel
Copy link
Contributor Author

How about adding a unit test option for this case?

I was hoping to get away by getByLabel() being deprecated etc, but since you suggested it, I'll craft one.

@makortel
Copy link
Contributor Author

I added a test that (in the first commit) demonstrates what happens when a data product is obtained via getByLabel() with and without consumes() calls when

  • data product comes from the input file, no other consumers of the same product type (the problematic case of this PR)
  • data product of the same type is produced+consumed by other modules
  • data product of different type is produced+consumed by other modules

The changes in the test in the second commit demonstrate the effect of the changes.

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-38875/31317

@cmsbuild
Copy link
Contributor

Pull request #38875 was updated. @cmsbuild, @smuzaffar, @Dr15Jones, @makortel can you please check and sign again.

@makortel
Copy link
Contributor Author

@cmsbuild, please test

@cmsbuild
Copy link
Contributor

-1

Failed Tests: RelVals-INPUT
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-8f9c39/26514/summary.html
COMMIT: 7b2d9a7
CMSSW: CMSSW_12_5_X_2022-07-28-1100/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/38875/26514/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals-INPUT

The relvals timed out after 4 hours.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 2 differences found in the comparisons
  • DQMHistoTests: Total files compared: 51
  • DQMHistoTests: Total histograms compared: 3668050
  • DQMHistoTests: Total failures: 8
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3668020
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 50 files compared)
  • Checked 210 log files, 47 edm output root files, 51 DQM output files
  • TriggerResults: no differences found

@makortel
Copy link
Contributor Author

makortel commented Aug 1, 2022

@Dr15Jones Does the added test address your concern?

@Dr15Jones
Copy link
Contributor

I'm good with the change once the build system decides to finish its tests :).

@makortel
Copy link
Contributor Author

makortel commented Aug 1, 2022

@cmsbuild, please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Aug 1, 2022

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-8f9c39/26569/summary.html
COMMIT: 7b2d9a7
CMSSW: CMSSW_12_5_X_2022-08-01-1100/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/38875/26569/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 51
  • DQMHistoTests: Total histograms compared: 3669004
  • DQMHistoTests: Total failures: 7
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 3668974
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -0.004 KiB( 50 files compared)
  • DQMHistoSizes: changed ( 312.0 ): -0.004 KiB MessageLogger/Warnings
  • Checked 210 log files, 47 edm output root files, 51 DQM output files
  • TriggerResults: no differences found

@makortel
Copy link
Contributor Author

makortel commented Aug 1, 2022

+1

@cmsbuild
Copy link
Contributor

cmsbuild commented Aug 1, 2022

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@qliphy
Copy link
Contributor

qliphy commented Aug 2, 2022

+1

@cmsbuild cmsbuild merged commit 79e7839 into cms-sw:master Aug 2, 2022
@makortel makortel deleted the failedToRegisterConsumes branch August 2, 2022 13:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants