Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed seeding for HLT #13819

Merged
merged 13 commits into from Apr 4, 2016
Merged

Speed seeding for HLT #13819

merged 13 commits into from Apr 4, 2016

Conversation

VinInn
Copy link
Contributor

@VinInn VinInn commented Mar 23, 2016

Speeding up seeding in LHT for electron and regional
negligible regressions expected due to numerics

fyi @mtosi @fwyzard @matteosan1

offline mtv at
http://innocent.home.cern.ch/innocent/RelVal/pu35_81_fhlt

candidate to back port if hlt asks for

@VinInn
Copy link
Contributor Author

VinInn commented Mar 23, 2016

@cmsbuild, please test

@cmsbuild
Copy link
Contributor

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/12028/console

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @VinInn (Vincenzo Innocente) for CMSSW_8_1_X.

It involves the following packages:

DataFormats/GeometrySurface
DataFormats/GeometryVector
Geometry/CommonDetUnit
Geometry/TrackerGeometryBuilder
RecoEgamma/EgammaElectronAlgos
RecoEgamma/EgammaPhotonAlgos
RecoTracker/TkTrackingRegions
TrackingTools/KalmanUpdators

@civanch, @Dr15Jones, @cvuosalo, @ianna, @mdhildreth, @cmsbuild, @slava77, @davidlange6 can you please review it and eventually sign? Thanks.
@ghellwig, @Sam-Harper, @battibass, @makortel, @abbiendi, @GiacomoSguazzoni, @jhgoh, @lgray, @bellan, @mschrode, @rovere, @istaslis, @gpetruc, @cerati, @VinInn, @trocino, @dgulhan, @venturia this is something you requested to watch as well.
@slava77, @Degano, @smuzaffar you are the release manager for this.

cms-bot commands are list here #13028

inline double halfPi() {return 0.5*3.141592653589793238;}
inline constexpr double pi() {return 3.141592653589793238;}
inline constexpr double twoPi() {return 2. *3.141592653589793238;}
inline constexpr double halfPi() {return 0.5*3.141592653589793238;}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use M_PI, 2. * M_PI, M_PI_2 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because is like this since orca times...
(somewhere else you will find CLHEP::pi())
I would not mind to go back to old good C macros
(as already done elsewhere)

@cmsbuild
Copy link
Contributor

-1

Tested at: 38fa966
I found errors in the following unit tests:

---> test runtestTqafTopEventSelection had ERRORS

you can see the results of the tests here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-13819/12028/summary.html

@cmsbuild
Copy link
Contributor

@VinInn
Copy link
Contributor Author

VinInn commented Mar 24, 2016

cannot find the failed test
is it not a das issue?

@VinInn
Copy link
Contributor Author

VinInn commented Mar 24, 2016

comparison shows no regression
a difference on the number of strip hits in HLT is expected.

@VinInn
Copy link
Contributor Author

VinInn commented Mar 24, 2016

I noticed the truntestTqafTopEventSelection fails also in #13820

@@ -129,6 +129,8 @@ void TrackerGeometry::addDetUnitId(DetId p){
}

void TrackerGeometry::addDet(GeomDet const * p) {
// set index
const_cast<GeomDet *>(p)->setGdetIndex(theDets.size());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static analyzer is not happy with (yet another) const_cast in this file.
Why are they even here, should the method signature be changed instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would require modifications in the whole TrackerGeometry building code...
let's wait it stabilize for PhaseII, then we can clean it up.

@@ -119,9 +125,10 @@ class GeomDet {

ReferenceCountingPointer<Plane> thePlane;
DetId m_detId;
int m_index;
int m_index=-1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@VinInn - it would be nice to clean up the class code a bit: sort public, protected and private members, name the latter either with leading m_* or the*, indent the code consistently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to cleanup all old code in general yes.
takes time and effort.
Let's schedule it for PhaseI (this code is candidate for backport to 800).

@cmsbuild
Copy link
Contributor

cmsbuild commented Apr 1, 2016

-1

Tested at: 426feb3
I found errors in the following unit tests:

---> test runtestTqafTopEventSelection had ERRORS

you can see the results of the tests here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-13819/12126/summary.html

@VinInn
Copy link
Contributor Author

VinInn commented Apr 1, 2016

is this runtestTqafTopEventSelection failing randomly? last time was in #13831 ... (which now passed tests with full marks!)

@cmsbuild
Copy link
Contributor

cmsbuild commented Apr 1, 2016

@slava77
Copy link
Contributor

slava77 commented Apr 1, 2016

+1

for #13819 426feb3

  • changes in the code are in line with the description
  • see performance summary as of Speed seeding for HLT #13819 38fa966 in Speed seeding for HLT #13819 (comment)
  • the latest update is fully in 426feb3 (the rest is a rebase of the old commits) with changes cleaning up counting code used for debugging and reformatting of LogDebug messages
    • log files in steps with PixelHitMatcher and ElectronSeedGenerator do not have the extra printouts at the end anymore.

@ianna
Copy link
Contributor

ianna commented Apr 1, 2016

+1

@VinInn
Copy link
Contributor Author

VinInn commented Apr 2, 2016

not sure why simulation is involved, @civanch I am sure nothing should change in sim

@civanch
Copy link
Contributor

civanch commented Apr 2, 2016

+1
There is no direct touch to Sim,
Concerning failing of the unit test: many PRs passing standard tests without a problem. Having the same problem twice may mean some hidden issue

@cmsbuild
Copy link
Contributor

cmsbuild commented Apr 2, 2016

This pull request is fully signed and it will be integrated in one of the next CMSSW_8_1_X IBs (but tests are reportedly failing). This pull request requires discussion in the ORP meeting before it's merged. @slava77, @davidlange6, @Degano, @smuzaffar

@davidlange6 davidlange6 merged commit a1a7545 into cms-sw:CMSSW_8_1_X Apr 4, 2016
: thePhiAtVertex(phiAtVertex), theCurvature(curvature),
theOriginRBound (originRBound), theTolerance(tolerance) { }
theOriginRBound (originRBound) {
assert(theCurvature.max()>0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, in our TSG tests we suddenly hit this assert:

# Conditions read from  CMS_CONDITIONS  via FrontierProd 
%MSG-i ThreadSetup:  (NoModuleName) 07-Jun-2016 19:17:49 CEST pre-events
setting # threads 4
%MSG
07-Jun-2016 19:18:00 CEST  Initiating request to open file file:RelVal_Raw_GRun_DATA.root
07-Jun-2016 19:18:01 CEST  Successfully opened file file:RelVal_Raw_GRun_DATA.root
isBH_ 18:13:18:0
Begin processing the 1st record. Run 272762, Event 41669009, LumiSection 51 at 07-Jun-2016 19:18:58.423 CEST
Begin processing the 2nd record. Run 272762, Event 41468301, LumiSection 51 at 07-Jun-2016 19:18:58.425 CEST
Begin processing the 3rd record. Run 272762, Event 41643921, LumiSection 51 at 07-Jun-2016 19:18:58.438 CEST
Begin processing the 4th record. Run 272762, Event 41555597, LumiSection 51 at 07-Jun-2016 19:18:58.459 CEST
cmsRun: /build/cmsbuild/jenkins-workarea/workspace/build-any-ib/w/tmp/BUILDROOT/6eb14a8c47f34962dae9df674c18b15e/opt/cmssw/slc6_amd64_gcc530/cms/cmssw/CMSSW_8_1_X_2016-06-05-0000/src/RecoTracker/TkTrackingRegions/src/OuterHitPhiPrediction.h:23: OuterHitPhiPrediction::OuterHitPhiPrediction(const Range&, const Range&, float): Assertion `theCurvature.max()>0' failed.

This is obviously problematic!
In particular if this is to be backported to 80X!
That this shows only now is/seems to be related that our now failing tests (real data) now use 2016 real data (before they were using 2015 real data).
At this stage I need to remove the assert; is that OK?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would like to know how to reproduce

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, using: CMSSW_8_1_X_2016-06-08-2300, add PR #14811,
and execute addOnTests.py -t hlt_data_GRun - this works.
Then edit the file RecoTracker/TkTrackingRegions/src/OuterHitPhiPrediction.h:23
and put back the assert and run the above addOn again. Then there is a crash!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, it seems related to the GT used: if I use the HLT GT 80X_dataRun2_HLT_frozen_v12
it crashes. If I use the reco GT 80X_dataRun2_v14 it does not crash. But of course we
need to use the HLT GT, so is there something missing in there?

Copy link
Contributor

@mmusich mmusich Jun 9, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Martin-Grunewald if a record would be missing it would crash before the first event.
I would argue that the difference of calibrations between 80X_dataRun2_v14 (data re-reco, no IOVs from 2016), vs 80X_dataRun2_HLT_frozen_v12 (online GT, IOVs from 2016) is causing the problem you report.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mmusich
Thanks a lot for the explanation.
Following this, I shall change from auto:run2_hlt to auto:run2_hlt_relval for the TSG/HLTrigger side (and I assume DQM should do the same given the use cases shown in your link), so that we can then get rid of auto:run2_hlt in autoCond.py.

Actually there could be a relval use case for auto:run2_hlt in the future: imagine
a frozen HLT menu used for relvals which uses some version of L1T which in future
may not be the then latest one in auto:run2_hlt_relval

Here are the PR to move to 'relval' GTs for hlt and data in the TSG tests: #14849 and #14850
(cmsDriver is not modified as there the string is just used as a radix where more gets appended)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In both case is still wrong to reconstuct data taken past the snapshot

I agree, but is there some kind of protection or check to detect when this happens ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fwyzard not that I am aware of, might be useful to throw some warning. @ggovi

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fwyzard @Martin-Grunewald
To further clarify: specify a snapshot in a GT make sense only when the underlying condition set is 'a good one' and is frozen. In practice, the GT with snapshot are targeted to specific (non-evolving) data sets. The usage for data 'in the future' wrt the snapshot is in general inappropriate.
That said, we could add a warning to remark the usage outside the snapshot boundary, although this doesn't necessarily consists in something wrong in itself...
It's more about to set up a policy and how strict we want to be ( in the usage of GT with snapshot ). We might even decide to 'close the iovs' and generate an error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Closed IOVs are the mechanism the framework expects to be used to catch such problems.

@VinInn
Copy link
Contributor Author

VinInn commented Jun 10, 2016

Reconstructing a 3.8T run with 0T conditions explain the failing assert.
Actually this points out that the whole code in RectangularEtaPhiTrackingRegion is not protected against 0T field.
In any case removing the assert is correct if we run tracking at HLT for 0T and we want to continue to reproduce the current behaviour.

@mmusich mmusich mentioned this pull request Jun 10, 2016
davidlange6 added a commit that referenced this pull request Aug 26, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet