CTPPS: miniAOD #17162
A new Pull Request was created by @jan-kaspar for CMSSW_9_0_X.
It involves the following packages:
DataFormats/CTPPSReco
@cmsbuild, @cvuosalo, @slava77, @monttj, @davidlange6 can you please review it and eventually sign? Thanks.
cms-bot commands are listed here #13028
struct CTPPSLocalTrackLite
{
  public:
    CTPPSLocalTrackLite(uint32_t pid=0, float px=0., float pxu=-1., float py=0., float pyu=-1., float pt=0., float ptu=-1.)
please define a proper default constructor (without arguments).
Are there use cases for constructors where the first 3-4 arguments are not known while the last few can have default values?
Switch to class from struct
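A minimal sketch of what the review comments above ask for: a `class` with a genuine no-argument default constructor, a separate fully specified constructor, and doxygen-style comments on the accessors. This is only an illustration, not the code that was merged. The member names, the getters other than `getRPId()`, the reading of `px`/`py` as hit positions and of `pt` as a time, and the helper `checkTrackLiteDefaults()` are all assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative sketch only, not the code merged in this PR.
// Member names and all getters except getRPId() are hypothetical.
class CTPPSLocalTrackLite {
public:
  /// Default constructor: id 0, values 0, uncertainties -1 (meaning "not set").
  CTPPSLocalTrackLite() = default;

  /// Full constructor: every value must be supplied explicitly.
  CTPPSLocalTrackLite(std::uint32_t pid, float px, float pxu, float py, float pyu, float pt, float ptu)
      : rpId_(pid), x_(px), xUnc_(pxu), y_(py), yUnc_(pyu), time_(pt), timeUnc_(ptu) {}

  /// Returns the Roman Pot id.
  std::uint32_t getRPId() const { return rpId_; }
  /// Returns the x value (assumed here to be a hit position).
  float getX() const { return x_; }
  /// Returns the uncertainty on x; negative if not available.
  float getXUnc() const { return xUnc_; }
  /// Returns the y value (assumed here to be a hit position).
  float getY() const { return y_; }
  /// Returns the uncertainty on y; negative if not available.
  float getYUnc() const { return yUnc_; }
  /// Returns the time value (assumed meaning of the "pt" argument).
  float getTime() const { return time_; }
  /// Returns the uncertainty on the time; negative if not available.
  float getTimeUnc() const { return timeUnc_; }

private:
  std::uint32_t rpId_ = 0;
  float x_ = 0.f;
  float xUnc_ = -1.f;
  float y_ = 0.f;
  float yUnc_ = -1.f;
  float time_ = 0.f;
  float timeUnc_ = -1.f;
};

// Hypothetical usage check: a default-constructed track carries the "not set" markers.
inline void checkTrackLiteDefaults() {
  CTPPSLocalTrackLite empty;
  assert(empty.getRPId() == 0);
  assert(empty.getXUnc() < 0.f);
  assert(empty.getTimeUnc() < 0.f);
}
```

With default member initializers, the no-argument constructor and the full constructor cannot be confused, which avoids the ambiguity of a single constructor where every argument is defaulted.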
{
}

uint32_t getRPId() const
please add doxygen-compatible comments with documentation of the member methods (member data has some comments already, nice).
@@ -39,4 +39,11 @@
<class name="edm::Wrapper<edm::DetSetVector<TotemRPLocalTrack::FittedRecHit>>"/>
<class name="std::vector<edm::DetSet<TotemRPLocalTrack::FittedRecHit> >"/>
<class name="std::vector<TotemRPLocalTrack::FittedRecHit>"/>

<class name="CTPPSLocalTrackLite" ClassVersion="2">
<version ClassVersion="2" checksum="3838813906"/>
version 3 is preferred now for the starting point
…cumented, class version 3.
@slava77 I've implemented your suggestions.
Not really - the foreseen timing detectors always have some segmentation so at least coarse spatial information is always available. |
@cmsbuild please test
It's not quite clear to me if this Lite producer will work by running on AOD inputs
The tests are being triggered in jenkins. |
-1
Tested at: b94427d
You can see the results of the tests here:
I found the following errors while testing this PR
Failed tests: RelVals
When I ran the RelVals I found an error in the following workflows: runTheMatrix-results/136.731_RunSinglePh2016B+RunSinglePh2016B+HLTDR2_2016+RECODR2_2016reHLT_skimSinglePh_HIPM+HARVESTDR2/step3_RunSinglePh2016B+RunSinglePh2016B+HLTDR2_2016+RECODR2_2016reHLT_skimSinglePh_HIPM+HARVESTDR2.log
Comparison not run due to runTheMatrix errors (RelVals and Igprof tests were also skipped)
in step3 136.731
I hope the next update is first tested with the standard |
@jan-kaspar |
Yes to both. I typically issue back-ports only after the development PR has been approved - to avoid making the same corrections in multiple branches. Let me know if you think that I should do differently this time. |
On 1/17/17 9:04 AM, jan-kaspar wrote:
is this PR compatible with 80X?
If so, you may want to make a PR for 80X, in view of possible re-miniAOD
Yes to both. I typically issue back-ports only after the development PR
has been approved - to avoid making the same corrections in multiple
branches. Let me know if you think that I should do differently this time.
Thanks for clarifying.
I think we can proceed as you described, make a backport after this is
signed.
+1
[red: baseline; black: with this PR, using CMSSW_9_0_X_2017-01-18-1100] The new product branch appears in RECO/AOD and in miniAOD files with a small size, as expected
CPU time of the ctppsLocalTrackLiteProducer is negligible. |
@jan-kaspar |
@slava77 OK, I've cherry-picked the commits from this PR to our "continuous integration" branch intended for the back-port preceding the legacy re-reco:
On 1/20/17 10:45 AM, jan-kaspar wrote:
@slava77 <https://github.com/slava77> OK, I've cherry-picked the commits
from this PR to our "continuous integration" branch intended for the
back-port preceding the legacy re-reco:
https://github.com/CTPPS/cmssw/tree/ctpps_cumulative_backport_8_0_X
OK for the rereco plans.
Actually, in case we were going to have re-miniAOD release some time
next week,
I was going to suggest you add CTPPS to miniAOD to go with it.
If you are interested, the backport will need to be minimal (no detid
migration).
Hi @jan-kaspar @slava77 - it looks like this PR is causing data workflows to crash, e.g. 136.721. Could you have a look and provide a fix? Thanks @Dr15Jones
On 1/23/17 11:13 AM, David Lange wrote:
Hi @jan-kaspar <https://github.com/jan-kaspar> @slava77
<https://github.com/slava77> -it looks like this PR is causing data
workflows to crash - eg, 136.721. Could you have a look and provide a
fix. Thanks @Dr15Jones <https://github.com/Dr15Jones>
the crash is related to TotemRPGeometryESModule and TotemRPUVPatternFinder
https://cmssdt.cern.ch/SDT/cgi-bin/buildlogs/slc6_amd64_gcc530/CMSSW_9_0_X_2017-01-20-2300/pyRelValMatrixLogs/run/136.731_RunSinglePh2016B+RunSinglePh2016B+HLTDR2_2016+RECODR2_2016reHLT_skimSinglePh_HIPM+HARVESTDR2/step3_RunSinglePh2016B+RunSinglePh2016B+HLTDR2_2016+RECODR2_2016reHLT_skimSinglePh_HIPM+HARVESTDR2.log
curiously, the crash in 136.731 happened in CMSSW_9_0_X_2017-01-20-2300 |
but to clarify there are plenty of related ones with the same error..
… On Jan 23, 2017, at 5:04 PM, Slava Krutelyov ***@***.***> wrote:
curiously, the crash in 136.731 happened in CMSSW_9_0_X_2017-01-20-2300
and is no longer visible in CMSSW_9_0_X_2017-01-23-1100
Thanks Slava for sharing the link to the log. Reading through, I can't really understand the problem. I've run the workflow 136.731 within CMSSW_9_0_X_2017-01-20-2300 on Lxplus, but can't reproduce the problem. Would you have any hints how to debug it? Thanks! |
On 1/24/17 2:37 AM, jan-kaspar wrote:
Thanks Slava for sharing the link to the log. Reading through, I can't
really understand the problem. I've run the workflow 136.731 within
CMSSW_9_0_X_2017-01-20-2300 on Lxplus, but can't reproduce the problem.
Would you have any hints how to debug it? Thanks!
IB tests are done multithreaded
runTheMatrix.py -l 136.731 -t 4
There is a
static bool evRotationStoreState;
which may be introducing a problem, not sure if anything else similarly
non-threadsafe can be a problem.
Thanks Slava! Finally, with the options "-j 16 -t 8" I can (sometimes) reproduce the problem. But the behaviour is quite random: sometimes the segfault occurs, sometimes not, and the crashing module is not always the same one. I've tried commenting out the code using the static variable in "TotemRPGeometryESModule", but it doesn't solve the problem. Are we really sure that the problem comes from this PR? To me, it seems rather unrelated...
we aren't - I can propose to back out this PR and see if the errors go away. |
to follow up here - yes, reverting this pull request fixes our problems. |
AFAICT there's no reason for anything in MeasuredGeometryProducer to be static. It looks like some function got declared static for no good reason, and that led to having to make evRotationStoreState also static, but none of those need to be static, as they can't possibly be called without a "this". |
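The point about needless `static` can be sketched as follows. This is a hypothetical stand-in, not the actual MeasuredGeometryProducer code: the class name, the method `applyCorrections()` and the helper `checkInstancesAreIndependent()` are invented for illustration. Once the function is an ordinary member function, the flag can be an ordinary data member, and no mutable state is shared between objects or threads.

```cpp
#include <cassert>

// Hypothetical stand-in for the pattern discussed above; names are not from CMSSW.
class MeasuredGeometrySketch {
public:
  // An ordinary member function: it always has a `this`, so the flag it
  // updates does not need to be static either.
  void applyCorrections(bool storeRotation) { rotationStored_ = storeRotation; }

  bool rotationStored() const { return rotationStored_; }

private:
  bool rotationStored_ = false;  // per-instance state, nothing shared globally
};

// Hypothetical check: changing one instance leaves another untouched,
// which would not hold if the flag were a static member.
inline void checkInstancesAreIndependent() {
  MeasuredGeometrySketch a;
  MeasuredGeometrySketch b;
  a.applyCorrections(true);
  assert(a.rotationStored());
  assert(!b.rotationStored());
}
```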
There may be a conflict between TotemRPUVPatternFinder and RPCTrigger usage of Xerces, possibly a latent concurrency bug in RPCTrigger. I've opened issue #17288 |
Thanks David for checking this.
Right. The entire code of MeasuredGeometryProducer is ugly and at some point I'd like to replace it with something more standard. In the short term, if we like, it can be completely commented out. In the current workflow (unlikely to change in the near future), alignment corrections are only applied at analysis level. Therefore there are no alignment corrections during RECO and no meaningful use of MeasuredGeometryProducer.
Many thanks for investigating Dan! Do I interpret correctly the discussion that the problem most likely lies in RPCTrigger?
I don't quite understand the possible interference between TotemRPUVPatternFinder and RPCTrigger since I don't think that Xerces is actually run by the CTPPS code. In steps:
TotemRPUVPatternFinder does not use Xerces, it only requires geometry from EventSetup:
Also, I don't really understand why this PR makes this problem appear. The only thing changed in TotemRPUVPatternFinder was the interpretation of detector ids - seemingly very unrelated.
On Jan 26, 2017, at 06:05, jan-kaspar ***@***.***> wrote:
I don't quite understand the possible interference between TotemRPUVPatternFinder and RPCTrigger since I don't think that Xerces is actually run by the CTPPS code. In steps: TotemRPUVPatternFinder does not use Xerces, it only requires geometry from EventSetup:
https://github.com/cms-sw/cmssw/blob/CMSSW_9_0_X/RecoCTPPS/TotemRPLocal/plugins/TotemRPUVPatternFinder.cc#L170
The geometry builder "TotemRPGeometryESModule" would lead to Xerces use if there were alignment XML files to load. But in the current configuration there are none. So the only way how TotemRPGeometryESModule could possibly lead to use of Xerces is via requesting the geometry from the DDL files. Is Xerces used for that?
Yes, Xerces is used for the DDL files. The simplified call stack is something like
TotemRPUVPatternFinder::produce() asks for the TotemRPGeometry which calls
TotemRPGeometryESModule::produceRealTG
TotemRPGeometryESModule::produceRealGD
TotemRPGeometryESModule::produceMeasuredGD
TotemRPGeometryESModule::produceMeasuredDDCV
That finally gets the DDCompactView via XMLIdealGeometryESSource::produce which calls DDLParser::parse which then calls into Xerces to parse it.
Also, I don't really understand why this PR makes this problem appear. The only thing changed in TotemRPUVPatternFinder was the interpretation of detector ids - seemingly very unrelated.
It probably isn't due to any change in TotemRPUVPatternFinder, but rather comes from a change in when (or if) it gets called. If the problem is due to interference between TotemRPUVPatternFinder and RPCTrigger, then it will only happen when those initializations overlap. So maybe some change in that PR (like CTPPSLocalTrackLiteProducer doing a "consumes<DetSetVector<TotemRPLocalTrack>>") changes the producer schedule in a way that makes it very likely that TotemRPUVPatternFinder and RPCTrigger overlap. Looking at how that PR changed the product dependency graph might tell us. Concurrency bugs are fun that way...
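As a generic illustration of the overlap problem described above (not a claim about how issue #17288 was or should be resolved), the sketch below serializes calls into a non-thread-safe routine with a mutex, so that two modules reaching it at the same time cannot interleave. The class `SerializedParserSketch` and the function `runOverlappingModules()` are hypothetical, and the counter increment is only a placeholder for the real parsing work.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical wrapper around work that must not run concurrently.
class SerializedParserSketch {
public:
  void parse() {
    std::lock_guard<std::mutex> lock(mutex_);  // only one caller past this point
    ++active_;
    if (active_ > maxActive_) maxActive_ = active_;
    ++calls_;  // placeholder for the real, non-thread-safe work
    --active_;
  }

  int calls() const { return calls_; }
  int maxActive() const { return maxActive_; }

private:
  std::mutex mutex_;
  int active_ = 0;     // callers currently inside parse()
  int maxActive_ = 0;  // largest overlap ever observed
  int calls_ = 0;
};

// Simulates several modules whose initializations happen to overlap in time.
inline void runOverlappingModules(SerializedParserSketch& parser, int nThreads, int nCallsEach) {
  std::vector<std::thread> threads;
  for (int i = 0; i < nThreads; ++i) {
    threads.emplace_back([&parser, nCallsEach] {
      for (int c = 0; c < nCallsEach; ++c) parser.parse();
    });
  }
  for (auto& t : threads) t.join();
  // With the lock in place no two callers are ever inside parse() together.
  assert(parser.maxActive() <= 1);
}
```

Without the lock, the counters themselves would be subject to a data race, which is the same kind of schedule-dependent failure that makes such bugs appear or vanish when the module ordering changes.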
This PR brings CTPPS objects to the miniAOD level.
The new data format (for both tracking and timing detectors):
DataFormats/CTPPSReco/interface/CTPPSLocalTrackLite.h
The ED producer
RecoCTPPS/TotemRPLocal/plugins/CTPPSLocalTrackLiteProducer.cc
is part of the standard reconstruction sequence
RecoCTPPS/TotemRPLocal/python/totemRPLocalReconstruction_cff.py
and the produced data are also included in the AOD:
RecoCTPPS/Configuration/python/RecoCTPPS_EventContent_cff.py
The slimming config:
PhysicsTools/PatAlgos/python/slimming/MicroEventContent_cff.py
has been updated to keep the new data format.