Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

L1T fix int overflow #37534

Merged
merged 2 commits into from Apr 30, 2022
Merged

Conversation

cecilecaillol
Copy link
Contributor

PR description:

Addressing issue #27554

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37534/29249

  • This PR adds an extra 16KB to repository

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @cecilecaillol for master.

It involves the following packages:

  • DQM/L1TMonitor (dqm)

@emanueleusai, @ahmad3213, @cmsbuild, @jfernan2, @pmandrik, @micsucmed, @rvenditti can you please review it and eventually sign? Thanks.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@cecilecaillol
Copy link
Contributor Author

please test

@perrotta
Copy link
Contributor

Thak you @cecilecaillol for the fix.

The error reported in #27554 says:

/build/cmsbld/jenkins/workspace/build-any-ib/w/tmp/BUILDROOT/a33d565f9262c93f1d0ea38c818d514b/opt/cmssw/slc7_amd64_gcc700/cms/cmssw/CMSSW_11_0_UBSAN_X_2019-07-12-2300/src/DQM/L1TMonitor/src/L1TGMT.cc:205:40: runtime error: signed integer overflow: 152916275 * 3564 cannot be represented in type 'int'

I am mostly worried by the orbit number 152916275 : is that a reasonable number? Shouldn't it be limited to 262144 = 2**18 orbits in 1 LS?

@@ -202,12 +202,12 @@ void L1TGMT::analyze(const Event& e, const EventSetup& c) {
// if( (Ev - evnum_old_) == 1 && bxnum_old_ > -1 ) {
// assume getting all events in a sequence (usefull only from reco data)
if (bxnum_old_ > -1) {
int dBx = Bx - bxnum_old_ + 3564 * (e.orbitNumber() - obnum_old_);
float dBx = Bx - bxnum_old_ + 3564 * (e.orbitNumber() - obnum_old_);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this also work for the first event, when bxnum_old_ and obnum_old_ are still initialized at 0?

And, I am not sure but I think that to allow the conversion to float before assigning the right hand size to dBx you should have some float quantity already in the expression, for example 3564 --> 3564.f

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This issue escaped my attention, apologies. Indeed I agree with you that for the expression to be evaluated as float it should contain a float term, before the assignment to a float variable. @cecilecaillol can we test if Andrea's suggestion solves the issue of large orbit numbers?

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-f49cd8/23831/summary.html
COMMIT: 8544467
CMSSW: CMSSW_12_4_X_2022-04-11-1100/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/37534/23831/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 6 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3593039
  • DQMHistoTests: Total failures: 13
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 3593003
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -0.004 KiB( 47 files compared)
  • DQMHistoSizes: changed ( 312.0 ): -0.004 KiB MessageLogger/Warnings
  • Checked 200 log files, 45 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

@emanueleusai
Copy link
Member

+1

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Copy link
Contributor

perrotta commented Apr 21, 2022

@emanueleusai you signed this, and therefore I assume that the three objections I wrote above are not deemed relevant by DQM. Just for the record, could you please answer them here anyhow?

1) Is an orbit number 152916275 considered acceptable in this context? If not, shouldn't somebody investigate where such a large number comes from, and try to fix over there (perhaps there is at some point a difference between two unsigned numbers?)

2) Does the fix implemented at L205 work also for the firs event, when bxnum_old_ and obnum_old_ are still initialized at 0?

3) In order to allow the conversion to float before assigning the right hand expression to dBx, did you check whether one should have included some float quantity already explicitely in that expression, for example 3564 --> 3564.f?

@perrotta
Copy link
Contributor

@cecilecaillol @emanueleusai any answer/comment about #37534 (comment)?

@cecilecaillol
Copy link
Contributor Author

@vukasinmilosevic We have this issue with the L1T DQM. It seems odd to have such large numbers in the code. Can you or someone from your team check this code makes sense and address the comments above?

@vukasinmilosevic
Copy link
Contributor

Hi @dinyar, do you maybe know more about this?

@dinyar
Copy link
Contributor

dinyar commented Apr 27, 2022

Hi @vukasinmilosevic,

I can have a look, however this is the DQM for the Run-1 GMT, so I don't have any special insight into that code..

Cheers,
Dinyar

@cmsbuild
Copy link
Contributor

Pull request #37534 was updated. @emanueleusai, @ahmad3213, @cmsbuild, @jfernan2, @pmandrik, @micsucmed, @rvenditti can you please check and sign again.

@dinyar
Copy link
Contributor

dinyar commented Apr 28, 2022

Thak you @cecilecaillol for the fix.

The error reported in #27554 says:

/build/cmsbld/jenkins/workspace/build-any-ib/w/tmp/BUILDROOT/a33d565f9262c93f1d0ea38c818d514b/opt/cmssw/slc7_amd64_gcc700/cms/cmssw/CMSSW_11_0_UBSAN_X_2019-07-12-2300/src/DQM/L1TMonitor/src/L1TGMT.cc:205:40: runtime error: signed integer overflow: 152916275 * 3564 cannot be represented in type 'int'

I am mostly worried by the orbit number 152916275 : is that a reasonable number? Shouldn't it be limited to 262144 = 2**18 orbits in 1 LS?

At least in hardware we only reset the orbit counter at run start. I'm not sure if this is then truncated somewhere in software though .

@emanueleusai
Copy link
Member

emanueleusai commented Apr 29, 2022

please test

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs after it passes the integration tests. This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-f49cd8/24337/summary.html
COMMIT: c196fa4
CMSSW: CMSSW_12_4_X_2022-04-29-1100/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/37534/24337/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3695434
  • DQMHistoTests: Total failures: 2
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3695410
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 48 files compared)
  • Checked 205 log files, 45 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

@emanueleusai
Copy link
Member

+1

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Copy link
Contributor

+1

@cmsbuild cmsbuild merged commit 2c1815a into cms-sw:master Apr 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants