Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[segfaults] SiPixelDigiValid::analyse: blade number is out-of-range of array, out-of-bounds write on stack #13858

Closed
davidlt opened this issue Mar 29, 2016 · 8 comments

Comments

@davidlt
Copy link
Contributor

davidlt commented Mar 29, 2016

I noticed that we have failing workflows on CLANG builds, e.g. 10023.0 MinBias_13: https://cmssdt.cern.ch/SDT/cgi-bin/buildlogs/slc6_amd64_gcc530/CMSSW_8_1_CLANG_X_2016-03-26-1100/pyRelValMatrixLogs/run/10023.0_MinBias_13+MinBias_13TeV_pythia8_TuneCUETP8M1_2017_GenSimFullINPUT+DigiFull_2017+RecoFull_2017+HARVESTFull_2017/step3_MinBias_13+MinBias_13TeV_pythia8_TuneCUETP8M1_2017_GenSimFullINPUT+DigiFull_2017+RecoFull_2017+HARVESTFull_2017.log

GDB session revealed that we have out-of-bounds write to an array on the stack.

diff --git a/Validation/TrackerDigis/plugins/SiPixelDigiValid.cc b/Validation/TrackerDigis/plugins/SiPixelDigiValid.cc
index 4d257dd..9c5c661 100644
--- a/Validation/TrackerDigis/plugins/SiPixelDigiValid.cc
+++ b/Validation/TrackerDigis/plugins/SiPixelDigiValid.cc
@@ -20,6 +20,9 @@
 #include "Geometry/TrackerGeometryBuilder/interface/PixelGeomDetUnit.h"
 #include "Geometry/TrackerGeometryBuilder/interface/RectangularPixelTopology.h"

+#include <iostream>
+#include <cassert>
+
 //using namespace std;
 //using namespace edm;

@@ -515,6 +518,7 @@ for ( int i =0 ; i< 44; i++) {
            unsigned int side  = tTopo->pxfSide(id);
            unsigned int disk  = tTopo->pxfDisk(id);
            unsigned int blade = tTopo->pxfBlade(id);
+           std::cerr << "blade: " << blade << std::endl;
            unsigned int panel = tTopo->pxfPanel(id);
            unsigned int mod   = tTopo->pxfModule(id);
            //LogInfo("SiPixelDigiValid")<<"EndcaP="<<side<<" Disk="<<disk<<" Blade="<<blade<<" Panel="<<panel<<" Module="<<mod;
@@ -624,6 +628,8 @@ for ( int i =0 ; i< 44; i++) {
                      }else {
                          //LogError("SiPixelDigiValid")<<" The number of module is Wrong";
                      }
+                     assert(blade - 1 >= 0);
+                     assert(blade - 1 < 24);
                      ++ndigiZpDisk1PerPanel1[blade-1];
              }

The "plane" can hold only up to 24 "blades", yet we could get numbers as high as 54. In this particular case we were damaging tTopo pointer on the stack.

[..]
627                          ++ndigiZpDisk1PerPanel1[blade-1];
(gdb) p tTopo
$17 = (const TrackerTopology * const) 0x7fff44a17000
(gdb) n
630                  if(side == 2 && disk == 1 && panel ==2 ){
(gdb) p tTopo
$18 = (const TrackerTopology * const) 0x800044a17000
(gdb) p blade
$19 = 54
[..]

This is a memory we cannot access thus segfault.

@cmsbuild
Copy link
Contributor

A new Issue was created by @davidlt .

@davidlange6, @smuzaffar, @Degano, @davidlt, @Dr15Jones can you please review it and eventually sign/assign? Thanks.

cms-bot commands are list here #13029

@davidlt
Copy link
Contributor Author

davidlt commented Mar 29, 2016

assign dqm

@cmsbuild
Copy link
Contributor

New categories assigned: dqm

@deguio,@vanbesien you have been requested to review this Pull request/Issue and eventually sign? Thanks

@davidlt
Copy link
Contributor Author

davidlt commented Apr 4, 2016

ping^1

@boudoul
Copy link
Contributor

boudoul commented Apr 6, 2016

hi @davidlt , sorry we (tracker) have just been informed now about this error - Indeed there is the harcoded number (24) which was valid with the run2 detector but not anymore with the 2017 scenario (the workflow crashing) - Thanks for spotting (and even giving hints what is the cause of the error!) We shall adress this asap.

@dmitrijus
Copy link
Contributor

There is a simple (and slow) thing we can do - just extend the arrays if we see a higher blade:
dmitrijus@35d5727

If tracker people are happy, I will make a pull request.
Ideally, the upper limit should be read from tracker topology, but I have not found a way to do so.

@boudoul
Copy link
Contributor

boudoul commented Apr 7, 2016

Hi @dmitrijus - I believe @fioriNTU is about to provide a fix - I would prefer something more clean assuming he can do - Thanks for helping

@schneiml
Copy link
Contributor

schneiml commented Apr 7, 2016

I (on behalf of @fioriNTU ) looked into a "proper" fix which retrieves the correct number of blades (currently testing).
There are a number of problems with this:

  • It introduces additional dependencies to RecoTracker
  • It only addresses the problem with the number of blades, not the general geometry depedence of the code.
  • There are more fixed-size arrays for ladders. These do not cause a crash since the upgrade detector has less ladders than the old one, but fixing them as well should make the output more meaningful.

So, for an actual proper fix this plugin has to be rewritten, like the rest of the DQM; we can only apply a fix to make it not crash now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants