Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TLS cleanup #4293

Merged
merged 5 commits into from Jun 19, 2014
Merged

TLS cleanup #4293

merged 5 commits into from Jun 19, 2014

Conversation

VinInn
Copy link
Contributor

@VinInn VinInn commented Jun 17, 2014

removed a huge number of useless TLS from ECAL CondContainer(s)
reduced by a factor 2 in TrajectoryCleaner and TrajectoryFilter

@VinInn VinInn changed the title Tl scleanup TLS cleanup Jun 17, 2014
@cmsbuild
Copy link
Contributor

A new Pull Request was created by @VinInn (Vincenzo Innocente) for CMSSW_7_2_X.

TLS cleanup

It involves the following packages:

CondFormats/EcalObjects
TrackingTools/TrajectoryCleaning
TrackingTools/TrajectoryFiltering

@apfeiffer1, @nclopezo, @cmsbuild, @diguida, @rcastello, @StoyanStoynev, @slava77, @ggovi, @Degano can you please review it and eventually sign? Thanks.
@GiacomoSguazzoni, @rovere, @argiro, @gpetruc, @cerati, @venturia this is something you requested to watch as well.
You can sign-off by replying to this message having '+1' in the first line of your reply.
You can reject by replying to this message having '-1' in the first line of your reply.

@cmsbuild
Copy link
Contributor

Pull request #4293 was updated. @apfeiffer1, @nclopezo, @cmsbuild, @diguida, @rcastello, @StoyanStoynev, @slava77, @ggovi, @Degano can you please check and sign again.

@cmsbuild
Copy link
Contributor

@cmsbuild
Copy link
Contributor

@slava77
Copy link
Contributor

slava77 commented Jun 17, 2014

25.0 from jenkins includes resim. this is occasionally not reproducible, the random sequence diverges at some point.
I've seen in the past that it could actually work if you rerun jenkins again.

@VinInn
Copy link
Contributor Author

VinInn commented Jun 19, 2014

@StoyanStoynev
thanks Stoyan for the suggestions about TrajectoryFilter.
Tracking POG will consider it in next refactoring of the code.

For what EcalCond*Container is concerned: they is no chance that anything else than a EBAR or ECAL is passed there... @argiro to comment further

@StoyanStoynev
Copy link
Contributor

All tests I made were fine with one exception - wf 19 (SingleGammaPt35). I see a difference in conversion related variables. It originates from this as far as I see:
all_extended_pull4293__vs__baserel_singlegammapt35wf19p0c_recoconversions_allconversions__reco_objat_size
In one event (out of 2000) the number of conversions changes by +one with the new code.
Can someone check what is going on, the idea is that nothing should change
(I used 33d013d and CMSSW_7_2_X_2014-06-17-1400 as a base)? From here the event content is also changed a bit (for this test):
Branch Name | Average Uncompressed Size (Bytes/Event) | Average Compressed Size
(Bytes/Event)

< recoConversions_allConversions__RECO. 4902 1128.8

recoConversions_allConversions__RECO. 4902.74 1128.95

@ktf
Copy link
Contributor

ktf commented Jun 19, 2014

What do you mean when you say "I used 33d013d"? Can you try in last night IB, with and without the PR?

@slava77
Copy link
Contributor

slava77 commented Jun 19, 2014

This looks like one of the cases I've seen in
#4008 (comment)

I think now that this code/test should be run with valgrind.
There's likely something like uninit variable or something like that.

On 6/19/14, 6:04 AM, StoyanStoynev wrote:

All tests I made were fine with one exception - wf 19 (SingleGammaPt35).
I see a difference in conversion related variables. It originates from
this as far as I see:

all_extended_pull4293__vs__baserel_singlegammapt35wf19p0c_recoconversions_allconversions__reco_objat_size
https://cloud.githubusercontent.com/assets/5399712/3326572/eb9e928e-f79f-11e3-9151-f94e39b12d86.png
In one event (out of 2000) the number of conversions changes by +one
with the new code.
Can someone check what is going on, the idea is that nothing should change
(I used 33d013d
33d013d
and CMSSW_7_2_X_2014-06-17-1400 as a base)? From here the event content
is also changed a bit (for this test):
Branch Name | Average Uncompressed Size (Bytes/Event) | Average
Compressed Size
(Bytes/Event)

< recoConversions_allConversions__RECO. 4902 1128.8

recoConversions_allConversions__RECO. 4902.74 1128.95


Reply to this email directly or view it on GitHub
#4293 (comment).


Vyacheslav (Slava) Krutelyov
TAMU: Physics Dept Texas A&M MS4242, College Station, TX 77843-4242
CERN: 42-R-027
AIM/Skype: siava16 googleTalk: slava77@gmail.com
(630) 291-5128 Cell (US) +41 76 275 7116 Cell (CERN)


@VinInn
Copy link
Contributor Author

VinInn commented Jun 19, 2014

@slava77
do you have positive evidence that
runTheMatrix.py -l 19.0 --useInput all --command="-n 2000" >& run19.0_2k.log &
does not reproduce itself?
or the only evidence is in case of recompilation?

are we sure the diff comes from RECO and not from DIGI?

@StoyanStoynev
Copy link
Contributor

"I used 33d013d" -> 33d013d is the latest commit for this PR and the latest IB (to merge it to) was the one I mentioned. Then I compared this to the same "pure" IB . In essence, I did what you said.

@davidlange6
Copy link
Contributor

previously the variable is somewhere unrelated to this PR, no? if so, we need to work in parallel.

On Jun 19, 2014, at 3:17 PM, Slava Krutelyov notifications@github.com
wrote:

This looks like one of the cases I've seen in
#4008 (comment)

I think now that this code/test should be run with valgrind.
There's likely something like uninit variable or something like that.

On 6/19/14, 6:04 AM, StoyanStoynev wrote:

All tests I made were fine with one exception - wf 19 (SingleGammaPt35).
I see a difference in conversion related variables. It originates from
this as far as I see:

all_extended_pull4293__vs__baserel_singlegammapt35wf19p0c_recoconversions_allconversions__reco_objat_size
https://cloud.githubusercontent.com/assets/5399712/3326572/eb9e928e-f79f-11e3-9151-f94e39b12d86.png
In one event (out of 2000) the number of conversions changes by +one
with the new code.
Can someone check what is going on, the idea is that nothing should change
(I used 33d013d
33d013d
and CMSSW_7_2_X_2014-06-17-1400 as a base)? From here the event content
is also changed a bit (for this test):
Branch Name | Average Uncompressed Size (Bytes/Event) | Average
Compressed Size
(Bytes/Event)

< recoConversions_allConversions__RECO. 4902 1128.8

recoConversions_allConversions__RECO. 4902.74 1128.95


Reply to this email directly or view it on GitHub
#4293 (comment).


Vyacheslav (Slava) Krutelyov
TAMU: Physics Dept Texas A&M MS4242, College Station, TX 77843-4242
CERN: 42-R-027
AIM/Skype: siava16 googleTalk: slava77@gmail.com
(630) 291-5128 Cell (US) +41 76 275 7116 Cell (CERN)



Reply to this email directly or view it on GitHub.

@StoyanStoynev
Copy link
Contributor

The PR deals with trajectory filtering and cleaning (by shared hits) code - I would suppose it is related to the conversion reconstruction.

@StoyanStoynev
Copy link
Contributor

@VinInn , I mentioned there was small change in the event size - the branch recoConversions_allConversions__RECO was affected (this is step3) and there were no difference at all at step2 (digis). It should be RECO

@VinInn
Copy link
Contributor Author

VinInn commented Jun 19, 2014

Thanks for the investigation.
I thin kit is unrelated with this PR.
Please contact conversion responsibles to ask to investigate further

@StoyanStoynev
Copy link
Contributor

+1
OK, only the difference above found, probably related to what commented in #4008. To notify relevant responsible persons.
Tested 33d013d on top of CMSSW_7_2_X_2014-06-17-1400.
No difference except the above, ran short and extended matrix tests (upto 2k events per wf).
Jenkins were also fine (except there is an issue, noticed previously, seen in all latest PR tests : see for instance
the latest plots/pages of https://cmssdt.cern.ch/SDT/jenkins-artifacts/baseLineComparisons/CMSSW_7_2_X_2014-06-18-0200+4293/2891/alternative-comparisons/4.53-result.ps.gz showing differences in pixel variables).

@ktf
Copy link
Contributor

ktf commented Jun 19, 2014

Bypassing alca and DB. Complain if not ok.

ktf added a commit that referenced this pull request Jun 19, 2014
@ktf ktf merged commit 0d8abef into cms-sw:CMSSW_7_2_X Jun 19, 2014
@ktf
Copy link
Contributor

ktf commented Jun 19, 2014

For the record, when running under valgrind 19.0, I've an error at:

https://github.com/cms-sw/cmssw/blame/743bde9939bbc4f34c2b4fd022cfb980314f912a/RecoLocalTracker/SiPixelRecHits/src/PixelCPEBase.cc#L197

==15250== Conditional jump or move depends on uninitialised value(s)
==15250==    at 0x1E5BCA54: PixelCPEBase::fillDetParams() (PixelCPEBase.cc:197)
==15250==    by 0x1E5BC63E: PixelCPEBase::PixelCPEBase(edm::ParameterSet const&, MagneticField const*, TrackerGeometry const&, SiPixelLorentzAngle const*, SiPixelGenErrorDBObject const*, SiPixelTemplateDBObject const*, SiPixelLorentzAngle 
const*, int) (PixelCPEBase.cc:134)
==15250==    by 0x1E5CC230: PixelCPEGeneric::PixelCPEGeneric(edm::ParameterSet const&, MagneticField const*, TrackerGeometry const&, SiPixelLorentzAngle const*, SiPixelGenErrorDBObject const*, SiPixelTemplateDBObject const*, SiPixelLorentz
Angle const*) (PixelCPEGeneric.cc:51)
==15250==    by 0x2B8A16A1: PixelCPEGenericESProducer::produce(TkPixelCPERecord const&) (PixelCPEGenericESProducer.cc:95)
==15250==    by 0x2B8B2F4B: edm::eventsetup::Callback<PixelCPEGenericESProducer, boost::shared_ptr<PixelClusterParameterEstimator>, TkPixelCPERecord, edm::eventsetup::CallbackSimpleDecorator<TkPixelCPERecord> >::operator()(TkPixelCPERecord
 const&) (Callback.h:71)
==15250==    by 0x2B8B2E1E: edm::eventsetup::CallbackProxy<edm::eventsetup::Callback<PixelCPEGenericESProducer, boost::shared_ptr<PixelClusterParameterEstimator>, TkPixelCPERecord, edm::eventsetup::CallbackSimpleDecorator<TkPixelCPERecord>
 >, TkPixelCPERecord, boost::shared_ptr<PixelClusterParameterEstimator> >::getImpl(edm::eventsetup::EventSetupRecord const&, edm::eventsetup::DataKey const&) (CallbackProxy.h:61)
==15250==    by 0x48D5906: edm::eventsetup::DataProxy::get(edm::eventsetup::EventSetupRecord const&, edm::eventsetup::DataKey const&, bool) const (in /afs/cern.ch/cms/sw/ReleaseCandidates/vol1/slc6_amd64_gcc481/cms/cmssw/CMSSW_7_2_X_2014-0
6-16-1400/lib/slc6_amd64_gcc481/libFWCoreFramework.so)
==15250==    by 0x4929B6F: edm::eventsetup::EventSetupRecord::doGet(edm::eventsetup::DataKey const&, bool) const (in /afs/cern.ch/cms/sw/ReleaseCandidates/vol1/slc6_amd64_gcc481/cms/cmssw/CMSSW_7_2_X_2014-06-16-1400/lib/slc6_amd64_gcc481/l
ibFWCoreFramework.so)
==15250==    by 0x2C71F74D: edm::EventSetupRecordDataGetter::doGet(edm::EventSetup const&) (in /afs/cern.ch/cms/sw/ReleaseCandidates/vol1/slc6_amd64_gcc481/cms/cmssw/CMSSW_7_2_X_2014-06-16-1400/lib/slc6_amd64_gcc481/pluginFWCoreModules.so)
==15250==    by 0x48D9DB0: edm::EDAnalyzer::doBeginRun(edm::RunPrincipal const&, edm::EventSetup const&, edm::ModuleCallingContext const*) (in /afs/cern.ch/cms/sw/ReleaseCandidates/vol1/slc6_amd64_gcc481/cms/cmssw/CMSSW_7_2_X_2014-06-16-14
00/lib/slc6_amd64_gcc481/libFWCoreFramework.so)
==15250==    by 0x49B662F: edm::WorkerT<edm::EDAnalyzer>::implDoBegin(edm::RunPrincipal&, edm::EventSetup const&, edm::ModuleCallingContext const*) (in /afs/cern.ch/cms/sw/ReleaseCandidates/vol1/slc6_amd64_gcc481/cms/cmssw/CMSSW_7_2_X_2014
-06-16-1400/lib/slc6_amd64_gcc481/libFWCoreFramework.so)
==15250==    by 0x4906E02: decltype ({parm#1}()) edm::convertException::wrap<bool edm::Worker::doWork<edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0> >(edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0>::
MyPrincipal&, edm::EventSetup const&, edm::CPUTimer*, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0>::Context const*)::{lambda()#1}>(bool edm::Worker::doWork<edm::OccurrenceTrai
ts<edm::RunPrincipal, (edm::BranchActionType)0> >(edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0>::MyPrincipal&, edm::EventSetup const&, edm::CPUTimer*, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm
::RunPrincipal, (edm::BranchActionType)0>::Context const*)::{lambda()#1}) (in /afs/cern.ch/cms/sw/ReleaseCandidates/vol1/slc6_amd64_gcc481/cms/cmssw/CMSSW_7_2_X_2014-06-16-1400/lib/slc6_amd64_gcc481/libFWCoreFramework.so)

can someone look into it?

@VinInn
Copy link
Contributor Author

VinInn commented Jun 19, 2014

most probably irrelevant
p.theTopol may not be initialized (t.b.c.)
the probablility that is == to topol by chance is about 1 in 2^64
v.

On 19 Jun, 2014, at 4:45 PM, Giulio Eulisse notifications@github.com wrote:

For the record, when running under valgrind 19.0, I've an error at:

https://github.com/cms-sw/cmssw/blame/743bde9939bbc4f34c2b4fd022cfb980314f912a/RecoLocalTracker/SiPixelRecHits/src/PixelCPEBase.cc#L197

==15250== Conditional jump or move depends on uninitialised value(s)
==15250== at 0x1E5BCA54: PixelCPEBase::fillDetParams() (PixelCPEBase.cc:197)
==15250== by 0x1E5BC63E: PixelCPEBase::PixelCPEBase(edm::ParameterSet const&, MagneticField const_, TrackerGeometry const&, SiPixelLorentzAngle const_, SiPixelGenErrorDBObject const_, SiPixelTemplateDBObject const_, SiPixelLorentzAngle
const_, int) (PixelCPEBase.cc:134)
==15250== by 0x1E5CC230: PixelCPEGeneric::PixelCPEGeneric(edm::ParameterSet const&, MagneticField const_, TrackerGeometry const&, SiPixelLorentzAngle const_, SiPixelGenErrorDBObject const_, SiPixelTemplateDBObject const_, SiPixelLorentz
Angle const_) (PixelCPEGeneric.cc:51)
==15250== by 0x2B8A16A1: PixelCPEGenericESProducer::produce(TkPixelCPERecord const&) (PixelCPEGenericESProducer.cc:95)
==15250== by 0x2B8B2F4B: edm::eventsetup::Callback<PixelCPEGenericESProducer, boost::shared_ptr, TkPixelCPERecord, edm::eventsetup::CallbackSimpleDecorator >::operator()(TkPixelCPERecord
const&) (Callback.h:71)
==15250== by 0x2B8B2E1E: edm::eventsetup::CallbackProxy<edm::eventsetup::Callback<PixelCPEGenericESProducer, boost::shared_ptr, TkPixelCPERecord, edm::eventsetup::CallbackSimpleDecorator

, TkPixelCPERecord, boost::shared_ptr >::getImpl(edm::eventsetup::EventSetupRecord const&, edm::eventsetup::DataKey const&) (CallbackProxy.h:61)
==15250== by 0x48D5906: edm::eventsetup::DataProxy::get(edm::eventsetup::EventSetupRecord const&, edm::eventsetup::DataKey const&, bool) const (in /afs/cern.ch/cms/sw/ReleaseCandidates/vol1/slc6_amd64_gcc481/cms/cmssw/CMSSW_7_2_X_2014-0
6-16-1400/lib/slc6_amd64_gcc481/libFWCoreFramework.so)
==15250== by 0x4929B6F: edm::eventsetup::EventSetupRecord::doGet(edm::eventsetup::DataKey const&, bool) const (in /afs/cern.ch/cms/sw/ReleaseCandidates/vol1/slc6_amd64_gcc481/cms/cmssw/CMSSW_7_2_X_2014-06-16-1400/lib/slc6_amd64_gcc481/l
ibFWCoreFramework.so)
==15250== by 0x2C71F74D: edm::EventSetupRecordDataGetter::doGet(edm::EventSetup const&) (in /afs/cern.ch/cms/sw/ReleaseCandidates/vol1/slc6_amd64_gcc481/cms/cmssw/CMSSW_7_2_X_2014-06-16-1400/lib/slc6_amd64_gcc481/pluginFWCoreModules.so)
==15250== by 0x48D9DB0: edm::EDAnalyzer::doBeginRun(edm::RunPrincipal const&, edm::EventSetup const&, edm::ModuleCallingContext const_) (in /afs/cern.ch/cms/sw/ReleaseCandidates/vol1/slc6_amd64_gcc481/cms/cmssw/CMSSW_7_2_X_2014-06-16-14
00/lib/slc6_amd64_gcc481/libFWCoreFramework.so)
==15250== by 0x49B662F: edm::WorkerTedm::EDAnalyzer::implDoBegin(edm::RunPrincipal&, edm::EventSetup const&, edm::ModuleCallingContext const_) (in /afs/cern.ch/cms/sw/ReleaseCandidates/vol1/slc6_amd64_gcc481/cms/cmssw/CMSSW_7_2_X_2014
-06-16-1400/lib/slc6_amd64_gcc481/libFWCoreFramework.so)
==15250== by 0x4906E02: decltype ({parm#1}()) edm::convertException::wrap<bool edm::Worker::doWork<edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0> >(edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0>::
MyPrincipal&, edm::EventSetup const&, edm::CPUTimer_, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0>::Context const_)::{lambda()#1}>(bool edm::Worker::doWork<edm::OccurrenceTrai
ts<edm::RunPrincipal, (edm::BranchActionType)0> >(edm::OccurrenceTraits<edm::RunPrincipal, (edm::BranchActionType)0>::MyPrincipal&, edm::EventSetup const&, edm::CPUTimer_, edm::StreamID, edm::ParentContext const&, edm::OccurrenceTraits<edm
::RunPrincipal, (edm::BranchActionType)0>::Context const_)::{lambda()#1}) (in /afs/cern.ch/cms/sw/ReleaseCandidates/vol1/slc6_amd64_gcc481/cms/cmssw/CMSSW_7_2_X_2014-06-16-1400/lib/slc6_amd64_gcc481/libFWCoreFramework.so)

can someone look into it?


Reply to this email directly or view it on GitHub.

@slava77
Copy link
Contributor

slava77 commented Jun 19, 2014

I opened an issue in my area, feel free to continue to comment there on the allConversions reproducibility
slava77#45

@argiro
Copy link
Contributor

argiro commented Jun 20, 2014

On 19 Jun 2014, at 11:27, Vincenzo Innocente notifications@github.com wrote:

@StoyanStoynev
thanks Stoyan for the suggestions about TrajectoryFilter.
Tracking POG will consider it in next refactoring of the code.

For what EcalCond*Container is concerned: they is no chance that anything else than a EBAR or ECAL is passed there... @argiro to comment further

sorry I was traveling . I concur with Vincenzo. It’s very unlikely that the 1 conversion difference is from this piece of code


Reply to this email directly or view it on GitHub.

ktf added a commit that referenced this pull request Jun 20, 2014
Backport TLS fixes from #4293 to CMSSW_7_1_X
This was referenced Jun 23, 2014
@VinInn VinInn deleted the TLScleanup branch April 21, 2017 11:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants