Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pixel simulated bad channel info packed into FED raw data #25748

Merged

Conversation

veszpv
Copy link
Contributor

@veszpv veszpv commented Jan 23, 2019

This is the follow-up of the PR:
#25466
in which simulation of bad components grouped by FED channels, randomly chosen from a collision data-driven set of configurations, were introduced.

Reminder: the killing of digis happens in the digitizer during simulation. The digitizer places a list of killed channels into the event in order for the tracking to be able to inactivate rechits in the empty modules. In real collisions, the FED FW generates an error code 25 associated to the channel from which no response is received upon a trigger (e.g. the module is turned off or the TBM gets stuck). The RawToDigi module unpacks these errors and places a list of bad channels into the event for tracking. In PR25466, the RawToDigi step replaced the default empty channel list output with the list from the digitizer.

In this PR, the channel list prepared by the digitizer is packed into the simulated FEDRawData, so that the unpacker do not need to handle simulation differently from data. By default, the bad channel simulation is turned off (in the digitizer config), no FED error 25 is simulated, and the unpacker produces an empty bad channel list.

Validation was done to check that the channel list produced by the unpacker agrees with the one from the digitizer, and running the same events before and after this modification to 10_4_0 with the feature turned on, the same number of clusters are created.

@mmusich @tsusa

…he PixelFEDChannelCollection created by the digitizer directly to reco
… label as for the digis, and verifying that channels exist in the cabling map. No extra output is generated, as channel info will be packed into the FEDRawData
…en), but in any case, pack FED error 25 into data
@cmsbuild
Copy link
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-25748/8146

  • This PR adds an extra 28KB to repository

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @veszpv (Viktor Veszpremi) for master.

It involves the following packages:

EventFilter/SiPixelRawToDigi

@cmsbuild, @perrotta, @slava77 can you please review it and eventually sign? Thanks.
@makortel, @Martin-Grunewald, @dkotlins, @VinInn this is something you requested to watch as well.
@davidlange6, @slava77, @fabiocos you are the release manager for this.

cms-bot commands are listed here

@slava77
Copy link
Contributor

slava77 commented Jan 23, 2019

@cmsbuild please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Jan 23, 2019

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/32806/console Started: 2019/01/24 00:28

@cmsbuild
Copy link
Contributor

-1

Tested at: 06db132

You can see the results of the tests here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-25748/32806/summary.html

I found follow errors while testing this PR

Failed tests: ClangBuild

  • Clang:

I found compilation warning while trying to compile with clang. Command used:

USER_CUDA_FLAGS='--expt-relaxed-constexpr' USER_CXXFLAGS='-Wno-register -fsyntax-only' scram build -k -j 16 COMPILER='llvm compile'

See details on the summary page.

@cmsbuild
Copy link
Contributor

Comparison not run due to Build errors/Fireworks only changes/No short matrix requested (RelVals and Igprof tests were also skipped)

@cmsbuild
Copy link
Contributor

@cmsbuild
Copy link
Contributor

Comparison job queued.

@cmsbuild
Copy link
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-25748/32840/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 32
  • DQMHistoTests: Total histograms compared: 3097440
  • DQMHistoTests: Total failures: 1
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3097242
  • DQMHistoTests: Total skipped: 197
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 31 files compared)
  • Checked 133 log files, 14 edm output root files, 32 DQM output files

@@ -130,6 +134,25 @@ void SiPixelDigiToRaw::produce( edm::StreamID, edm::Event& ev,

LogDebug("SiPixelDigiToRaw") << cache->cablingTree_->version();

PixelDataFormatter::BadChannels badChannels;
edm::Handle<PixelFEDChannelCollection> pixelFEDChannelCollectionHandle;
if (ev.getByToken(theBadPixelFEDChannelsToken, pixelFEDChannelCollectionHandle)){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if this "if" is not satisfied? Shouldn't it throw, or give some error at least?
if so, badChannels would remain uninitialized: wouldn't it give troubles to formatter.formatRawData below, as such?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the bad component simulation is turned OFF in the digitizer, currently the default setting, the digitizer does not produce any FEDChannelCollection output. Then the if is not satisfied, and badChannels remains an empty map. In this case, no FED25 error will be generated in the RAW output.

if (cache->cablingTree_->findItem(path)!=nullptr) {
detBadChannels.push_back(fedChannel);
} else {
edm::LogError("SiPixelDigiToRaw")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which is the expected frequency of this Error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically should never happen if the payload based on which the digitizer kills modules lists only existing channels. This is just a safety net here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about throwing an exception then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In regular production, we would definitely want to notice if this happens (frequent warning would probably trigger attention), but not sure if we would want to enforce only one possible usage of the code by preventing any exotic used cases.
E.g. should one be able to run DigiToRaw+RawToDigi in private jobs for single FEDs while processing a file with a digi collection produced once for the full detector?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E.g. should one be able to run DigiToRaw+RawToDigi in private jobs for single FEDs while processing a file with a digi collection produced once for the full detector?

I don't know. My experience is just that log messages are usually ignored until their amount becomes excessive. One option would be to have a boolean flag to control whether to throw or not (default being to throw).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @makortel IMHO, throwing here would be a bit of an overkill, the check that Viktor does here, if failed, doesn't lead to corrupt data or to random failures. It just checks that whatever we have put in the payload for the simulation does correspond to an actual FED channel, i.e. logs an error if we are trying to kill anything which is not physically a FED channel (e.g. single-ROCs, etc.), so it's more a question of book-keeping of inefficiency source. Also there are other failsafes in the payload producers that are meant to avoid fall in this case and indeed it should never happen.

return (int(ch.fed)==fedId && ch.link==linkId(words[fedId].back()));
});
if (badChannel!=detBadChannels->second.end()) {
LogError("FormatDataException")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which is the expected frequency of this Error?
(Just to avoid floading the output with errors...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, just a safety net, should never be needed. It just checks that the digitizer consistently killed pixels in channels that it marked bad, or that the digitizer uses the same cabling map as the DigiToRaw module that called the formatRawData.

@tsusa
Copy link
Contributor

tsusa commented Feb 4, 2019

Hi @perrotta, is there anything missing to move on with the PR? We would appreciate if it could enter in 10_5_0_pre2...

@perrotta
Copy link
Contributor

perrotta commented Feb 4, 2019

+1

  • Reading and propagation to tracking of the bad channels list is moved from the previous temporary location in SiPixelRawToDigi to SiPixelDigiToRaw
  • Default setting is that this feature stays off, and as such no effect is expected in the tests outputs, and none is seen
  • Validation of the bad channels list when the feature is switched on was done by the proponents, as reported in the PR description

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 4, 2019

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @davidlange6, @slava77, @smuzaffar, @fabiocos (and backports should be raised in the release meeting by the corresponding L2)

@fabiocos
Copy link
Contributor

fabiocos commented Feb 5, 2019

code-checks

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 5, 2019

The code-checks are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 5, 2019

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-25748/8324

  • This PR adds an extra 28KB to repository

@fabiocos
Copy link
Contributor

fabiocos commented Feb 5, 2019

+1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants