Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CTPPS: miniAOD #17162

Merged
merged 9 commits into from Jan 20, 2017
Merged

CTPPS: miniAOD #17162

merged 9 commits into from Jan 20, 2017

Conversation

jan-kaspar
Copy link
Contributor

This PR brings CTPPS objects to the miniAOD level.

The new data format (for both tracking and timing detectors):
DataFormats/CTPPSReco/interface/CTPPSLocalTrackLite.h

The ED producer
RecoCTPPS/TotemRPLocal/plugins/CTPPSLocalTrackLiteProducer.cc
is part of the standard reconstruction sequence
RecoCTPPS/TotemRPLocal/python/totemRPLocalReconstruction_cff.py
and the produced data are also included in the AOD:
RecoCTPPS/Configuration/python/RecoCTPPS_EventContent_cff.py

The slimming config:
PhysicsTools/PatAlgos/python/slimming/MicroEventContent_cff.py
has been updated to keep the new data format.

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @jan-kaspar for CMSSW_9_0_X.

It involves the following packages:

DataFormats/CTPPSReco
PhysicsTools/PatAlgos
RecoCTPPS/Configuration
RecoCTPPS/TotemRPLocal

@cmsbuild, @cvuosalo, @slava77, @monttj, @davidlange6 can you please review it and eventually sign? Thanks.
@TaiSakuma, @gouskos, @imarches, @ahinzmann, @acaudron, @mmarionncern, @rappoccio, @jdolen, @nhanvtran, @gpetruc, @gkasieczka, @schoef, @ferencek, @mverzett, @mariadalfonso, @pvmulder, @JyothsnaKomaragiri this is something you requested to watch as well.
@davidlange6, @smuzaffar you are the release manager for this.

cms-bot commands are listed here #13028

struct CTPPSLocalTrackLite
{
public:
CTPPSLocalTrackLite(uint32_t pid=0, float px=0., float pxu=-1., float py=0., float pyu=-1., float pt=0., float ptu=-1.)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please define a proper default constructor (without arguments).
Are there use cases for constructors where the first 3-4 arguments are not known while the last few can have default values?

Switch to class from struct

{
}

uint32_t getRPId() const
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add doxygen-compatible comments with documentation of the member methods (member data has some comments already, nice).

@@ -39,4 +39,11 @@
<class name="edm::Wrapper<edm::DetSetVector<TotemRPLocalTrack::FittedRecHit>>"/>
<class name="std::vector<edm::DetSet<TotemRPLocalTrack::FittedRecHit> >"/>
<class name="std::vector<TotemRPLocalTrack::FittedRecHit>"/>

<class name="CTPPSLocalTrackLite" ClassVersion="2">
<version ClassVersion="2" checksum="3838813906"/>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

version 3 is preferred now for the starting point

@jan-kaspar
Copy link
Contributor Author

@slava77 I've implemented your suggestions.

Are there use cases for constructors where the first 3-4 arguments are not known while the last few can have default values?

Not really - the foreseen timing detectors always have some segmentation so at least coarse spatial information is always available.

@slava77
Copy link
Contributor

slava77 commented Jan 12, 2017

@cmsbuild please test

It's not quite clear to me if this Lite producer will work by running on AOD inputs
@jan-kaspar can you confirm?

@cmsbuild
Copy link
Contributor

cmsbuild commented Jan 12, 2017

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/17258/console Started: 2017/01/12 22:07

@cmsbuild
Copy link
Contributor

-1

Tested at: b94427d

You can see the results of the tests here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-17162/17258/summary.html

I found follow errors while testing this PR

Failed tests: RelVals

  • RelVals:

When I ran the RelVals I found an error in the following worklfows:
136.731 step3

runTheMatrix-results/136.731_RunSinglePh2016B+RunSinglePh2016B+HLTDR2_2016+RECODR2_2016reHLT_skimSinglePh_HIPM+HARVESTDR2/step3_RunSinglePh2016B+RunSinglePh2016B+HLTDR2_2016+RECODR2_2016reHLT_skimSinglePh_HIPM+HARVESTDR2.log

@cmsbuild
Copy link
Contributor

Comparison not run due to runTheMatrix errors (RelVals and Igprof tests were also skipped)

@slava77
Copy link
Contributor

slava77 commented Jan 12, 2017

in step3 136.731

   [1] Running path 'dqmoffline_step'
   [2] Calling method for module TotemRPDQMSource/'totemRPDQMSource'
Exception Message:
CTPPSDetId ctor: Invalid parameters: arm=19812843 station=5 rp=2

I hope the next update is first tested with the standard runTheMatrix.py -l limited -i all

@slava77
Copy link
Contributor

slava77 commented Jan 17, 2017

@jan-kaspar
is this PR compatible with 80X?
If so, you may want to make a PR for 80X, in view of possible re-miniAOD

@jan-kaspar
Copy link
Contributor Author

is this PR compatible with 80X?
If so, you may want to make a PR for 80X, in view of possible re-miniAOD

Yes to both. I typically issue back-ports only after the development PR has been approved - to avoid making the same corrections in multiple branches. Let me know if you think that I should do differently this time.

@slava77
Copy link
Contributor

slava77 commented Jan 17, 2017 via email

@slava77
Copy link
Contributor

slava77 commented Jan 19, 2017

+1

for #17162 5853bfc

  • code changes are roughly in line with the description (there are also additional technical updates in DQM to use CTPPSDetId instead of unsigned int, apparently fixing up several spots missed in CTPPS: detector id update #16010 )
  • jenkins tests pass and comparisons show no differences (existing products and plots are not affected)
  • local test with 136.725 (2015B; the same run as what's in the shortMatrix in 137.731) had CTPPS data and shows some reasonable distributions for the new products (after the validate.C script is updated)

[red: baseline; black: with this PR, using CMSSW_9_0_X_2017-01-18-1100]
all_mini_origvssign827_runmet2016bwf136p725c_ctppslocaltracklites_ctppslocaltrackliteproducer__rereco_obj_getx
all_mini_origvssign827_runmet2016bwf136p725c_ctppslocaltracklites_ctppslocaltrackliteproducer__rereco_obj_gety

The new product branch appears in RECO/AOD and in miniAOD files with a small size, as expected

27        CTPPSLocalTrackLites_ctppsLocalTrackLiteProducer__reRECO

CPU time of the ctppsLocalTrackLiteProducer is negligible.

@davidlange6 davidlange6 merged commit 4a12ea8 into cms-sw:CMSSW_9_0_X Jan 20, 2017
@slava77
Copy link
Contributor

slava77 commented Jan 20, 2017

@jan-kaspar
looks like it's time for the backport.

@jan-kaspar
Copy link
Contributor Author

@slava77 OK, I've cherry-picked the commits from this PR to our "continuous integration" branch intended for the back-port preceeding the legacy re-reco:
https://github.com/CTPPS/cmssw/tree/ctpps_cumulative_backport_8_0_X

@slava77
Copy link
Contributor

slava77 commented Jan 20, 2017 via email

@davidlange6
Copy link
Contributor

Hi @jan-kaspar @slava77 -it looks like this PR is causing data workflows to crash - eg, 136.721. Could you have a look and provide a fix. Thanks @Dr15Jones

@slava77
Copy link
Contributor

slava77 commented Jan 23, 2017 via email

@slava77
Copy link
Contributor

slava77 commented Jan 24, 2017

curiously, the crash in 136.731 happened in CMSSW_9_0_X_2017-01-20-2300
and is no longer visible in CMSSW_9_0_X_2017-01-23-1100

@davidlange6
Copy link
Contributor

davidlange6 commented Jan 24, 2017 via email

@jan-kaspar
Copy link
Contributor Author

Thanks Slava for sharing the link to the log. Reading through, I can't really understand the problem. I've run the workflow 136.731 within CMSSW_9_0_X_2017-01-20-2300 on Lxplus, but can't reproduce the problem. Would you have any hints how to debug it? Thanks!

@slava77
Copy link
Contributor

slava77 commented Jan 24, 2017 via email

@slava77
Copy link
Contributor

slava77 commented Jan 24, 2017

@jan-kaspar
Copy link
Contributor Author

Thanks Slava! Finally, with option "-j 16 -t 8" I can (sometimes) reproduce the problem. But the behaviour is quite random: sometimes the segfault occurs sometimes not, the crashing module is not always the same one...

I've tried to comment out code using the static variable in "TotemRPGeometryESModule" but it doesn't solve the problem.

Are we really sure that the problem comes from this PR? To me, it seems rather unrelated...

@davidlange6
Copy link
Contributor

we aren't - I can propose to back out this PR and see if the errors go away.

@davidlange6
Copy link
Contributor

to follow up here - yes, reverting this pull request fixes our problems.

@dan131riley
Copy link

AFAICT there's no reason for anything in MeasuredGeometryProducer to be static. It looks like some function got declared static for no good reason, and that led to having to make evRotationStoreState also static, but none of those need to be static, as they can't possibly be called without a "this".

@dan131riley
Copy link

There may be a conflict between TotemRPUVPatternFinder and RPCTrigger usage of Xerces, possibly a latent concurrency bug in RPCTrigger. I've opened issue #17288

@jan-kaspar
Copy link
Contributor Author

to follow up here - yes, reverting this pull request fixes our problems.

Thanks David for checking this.

AFAICT there's no reason for anything in MeasuredGeometryProducer to be static. It looks like some function got declared static for no good reason, and that led to having to make evRotationStoreState also static, but none of those need to be static, as they can't possibly be called without a "this".

Right. The entire code of MeasuredGeometryProducer is ugly and at some point I'd like to replace it with something more standard. In a short term - if we like - it can be completely commented out. In the current workflow (unlikely to change in short future), alignment corrections are only applied at analysis level. Therefore no alignment corrections during RECO and no meanigful use of MeasuredGeometryProducer.

There may be a conflict between TotemRPUVPatternFinder and RPCTrigger usage of Xerces, possibly a latent concurrency bug in RPCTrigger. I've opened issue #17288

Many thanks for investigating Dan! Do I interpret correctly the discussion that the problem most likely lies in RPCTrigger?

I don't quite understand the possible interference between TotemRPUVPatternFinder and RPCTrigger since I don't think that Xerces is actually run by the CTPPS code. In steps: TotemRPUVPatternFinder does not use Xerces, it only requires geometry from EventSetup:
https://github.com/cms-sw/cmssw/blob/CMSSW_9_0_X/RecoCTPPS/TotemRPLocal/plugins/TotemRPUVPatternFinder.cc#L170
The geometry builder "TotemRPGeometryESModule" would lead to Xerces use if there were alignment XML files to load. But in the current configuration there are none. So the only way how TotemRPGeometryESModule could possibly lead to use of Xerces is via requesting the geometry from the DDL files. Is Xerces used for that?

Also, I don't really understand why this PR makes this problem appear. The only thing changed in TotemRPUVPatternFinder was the interpretation of detector ids - seemingly very unrelated.

@dan131riley
Copy link

dan131riley commented Jan 26, 2017 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants