Make CkfPattern MT ready #4754

Merged
merged 6 commits into from Jul 28, 2014

Conversation

VinInn
Contributor

@VinInn VinInn commented Jul 23, 2014

Technical change to allow CkfPattern to run in parallel.
For the time being it is protected by a compiler flag.
It is not obvious how to use a single central configuration switch to control it.

@cmsbuild
Contributor

A new Pull Request was created by @VinInn (Vincenzo Innocente) for CMSSW_7_2_X.

Make CkfPattern MT ready

It involves the following packages:

RecoTracker/CkfPattern

@nclopezo, @cmsbuild, @Degano, @StoyanStoynev, @slava77 can you please review it and eventually sign? Thanks.
@ghellwig, @GiacomoSguazzoni, @rovere, @gpetruc, @cerati, @venturia this is something you requested to watch as well.
You can sign-off by replying to this message having '+1' in the first line of your reply.
You can reject by replying to this message having '-1' in the first line of your reply.

@VinInn
Contributor Author

VinInn commented Jul 23, 2014

@Dr15Jones
once merged, it can be built with:

git cms-addpkg RecoTracker/CkfPattern
scram b -j 8 USER_CXXFLAGS="-DVI_TBB"
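
For readers unfamiliar with this kind of switch, here is a minimal sketch of how a define such as `-DVI_TBB` could gate a parallel loop at compile time. The names `buildCandidate` and `processSeeds` and the loop body are hypothetical and are not taken from the PR; only the macro name `VI_TBB` comes from the command above. Without the flag the same function compiles to a plain serial loop.

```cpp
#include <cstddef>
#include <vector>
#ifdef VI_TBB
#include <tbb/parallel_for.h>
#endif

// Hypothetical stand-in for the per-seed work; not the actual CkfPattern code.
inline int buildCandidate(int seed) { return 2 * seed; }

// Fills one output slot per seed. Each iteration writes only its own slot,
// so the loop body is safe to run concurrently.
inline std::vector<int> processSeeds(const std::vector<int>& seeds) {
  std::vector<int> out(seeds.size());
#ifdef VI_TBB
  // Parallel path: only compiled when the build passes -DVI_TBB.
  tbb::parallel_for(std::size_t(0), seeds.size(),
                    [&](std::size_t i) { out[i] = buildCandidate(seeds[i]); });
#else
  // Default path: serial loop with identical results.
  for (std::size_t i = 0; i < seeds.size(); ++i)
    out[i] = buildCandidate(seeds[i]);
#endif
  return out;
}
```

Because both paths produce the same output vector, a build without the flag is expected to show no differences, which is the property a "regular mode" validation would check.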

@cmsbuild
Contributor

-1
Tested at: d090a8a
When I ran the RelVals I found an error in the following workflows:
4.53 step2

runTheMatrix-results/4.53_RunPhoton2012B+RunPhoton2012B+HLTD+RECODreHLT+HARVESTDreHLT/step2_RunPhoton2012B+RunPhoton2012B+HLTD+RECODreHLT+HARVESTDreHLT.log
----- Begin Fatal Exception 23-Jul-2014 14:05:31 CEST-----------------------
An exception of category 'PluginLibraryLoadError' occurred while
   [0] Constructing the EventProcessor
Exception Message:
unable to load /build/cmsbuild/jenkins-workarea/workspace/ib-integration-CMSSW_7_2_X-slc6_amd64_gcc481/CMSSW_7_2_X_2014-07-22-1400/lib/slc6_amd64_gcc481/pluginRecoTrackerCkfPatternPlugins.so because dlopen: cannot load any more object with static TLS
----- End Fatal Exception -------------------------------------------------

4.22 step2

runTheMatrix-results/4.22_RunCosmics2011A+RunCosmics2011A+RECOCOSD+ALCACOSD+SKIMCOSD+HARVESTDC/step2_RunCosmics2011A+RunCosmics2011A+RECOCOSD+ALCACOSD+SKIMCOSD+HARVESTDC.log
----- Begin Fatal Exception 23-Jul-2014 14:05:38 CEST-----------------------
An exception of category 'PluginLibraryLoadError' occurred while
   [0] Constructing the EventProcessor
Exception Message:
unable to load /build/cmsbuild/jenkins-workarea/workspace/ib-integration-CMSSW_7_2_X-slc6_amd64_gcc481/CMSSW_7_2_X_2014-07-22-1400/lib/slc6_amd64_gcc481/pluginRecoTrackerCkfPatternPlugins.so because dlopen: cannot load any more object with static TLS
----- End Fatal Exception -------------------------------------------------

401.0 step1

runTheMatrix-results/401.0_TTbarNewMix+TTbarFSPU2+HARVESTFS/step1_TTbarNewMix+TTbarFSPU2+HARVESTFS.log
----- Begin Fatal Exception 23-Jul-2014 14:06:17 CEST-----------------------
An exception of category 'PluginLibraryLoadError' occurred while
   [0] Constructing the EventProcessor
Exception Message:
unable to load /build/cmsbuild/jenkins-workarea/workspace/ib-integration-CMSSW_7_2_X-slc6_amd64_gcc481/CMSSW_7_2_X_2014-07-22-1400/lib/slc6_amd64_gcc481/pluginRecoEgammaEgammaElectronProducersPlugins.so because dlopen: cannot load any more object with static TLS
----- End Fatal Exception -------------------------------------------------

1000.0 step2

runTheMatrix-results/1000.0_RunMinBias2011A+RunMinBias2011A+TIER0+SKIMD+HARVESTDfst2+ALCASPLIT/step2_RunMinBias2011A+RunMinBias2011A+TIER0+SKIMD+HARVESTDfst2+ALCASPLIT.log
----- Begin Fatal Exception 23-Jul-2014 14:07:10 CEST-----------------------
An exception of category 'PluginLibraryLoadError' occurred while
   [0] Constructing the EventProcessor
Exception Message:
unable to load /build/cmsbuild/jenkins-workarea/workspace/ib-integration-CMSSW_7_2_X-slc6_amd64_gcc481/CMSSW_7_2_X_2014-07-22-1400/lib/slc6_amd64_gcc481/pluginRecoTrackerCkfPatternPlugins.so because dlopen: cannot load any more object with static TLS
----- End Fatal Exception -------------------------------------------------

1001.0 step2

runTheMatrix-results/1001.0_RunMinBias2011A+RunMinBias2011A+TIER0EXP+ALCAEXP+ALCAHARVD/step2_RunMinBias2011A+RunMinBias2011A+TIER0EXP+ALCAEXP+ALCAHARVD.log
----- Begin Fatal Exception 23-Jul-2014 14:08:13 CEST-----------------------
An exception of category 'PluginLibraryLoadError' occurred while
   [0] Constructing the EventProcessor
Exception Message:
unable to load /build/cmsbuild/jenkins-workarea/workspace/ib-integration-CMSSW_7_2_X-slc6_amd64_gcc481/CMSSW_7_2_X_2014-07-22-1400/lib/slc6_amd64_gcc481/pluginRecoTrackerCkfPatternPlugins.so because dlopen: cannot load any more object with static TLS
----- End Fatal Exception -------------------------------------------------

1003.0 step2

runTheMatrix-results/1003.0_RunMinBias2012A+RunMinBias2012A+RECODDQM+HARVESTDDQM/step2_RunMinBias2012A+RunMinBias2012A+RECODDQM+HARVESTDDQM.log
----- Begin Fatal Exception 23-Jul-2014 14:08:18 CEST-----------------------
An exception of category 'PluginLibraryLoadError' occurred while
   [0] Constructing the EventProcessor
Exception Message:
unable to load /build/cmsbuild/jenkins-workarea/workspace/ib-integration-CMSSW_7_2_X-slc6_amd64_gcc481/CMSSW_7_2_X_2014-07-22-1400/lib/slc6_amd64_gcc481/pluginRecoTrackerCkfPatternPlugins.so because dlopen: cannot load any more object with static TLS
----- End Fatal Exception -------------------------------------------------

50101.0 step2

runTheMatrix-results/50101.0_SingleMuPt10+SingleMuPt10FSIdINPUT+SingleMuPt10FS_ID/step2_SingleMuPt10+SingleMuPt10FSIdINPUT+SingleMuPt10FS_ID.log
----- Begin Fatal Exception 23-Jul-2014 14:09:02 CEST-----------------------
An exception of category 'PluginLibraryLoadError' occurred while
   [0] Constructing the EventProcessor
Exception Message:
unable to load /build/cmsbuild/jenkins-workarea/workspace/ib-integration-CMSSW_7_2_X-slc6_amd64_gcc481/CMSSW_7_2_X_2014-07-22-1400/lib/slc6_amd64_gcc481/pluginRecoEgammaEgammaElectronProducersPlugins.so because dlopen: cannot load any more object with static TLS
----- End Fatal Exception -------------------------------------------------

25202.0 step2

runTheMatrix-results/25202.0_TTbar_13+TTbar_13+DIGIUP15_PU25+RECOUP15_PU25+HARVEST+MINIAODMC/step2_TTbar_13+TTbar_13+DIGIUP15_PU25+RECOUP15_PU25+HARVEST+MINIAODMC.log
----- Begin Fatal Exception 23-Jul-2014 14:17:11 CEST-----------------------
An exception of category 'PluginLibraryLoadError' occurred while
   [0] Constructing the EventProcessor
Exception Message:
unable to load /build/cmsbuild/jenkins-workarea/workspace/ib-integration-CMSSW_7_2_X-slc6_amd64_gcc481/CMSSW_7_2_X_2014-07-22-1400/lib/slc6_amd64_gcc481/pluginRecoTrackerCkfPatternPlugins.so because dlopen: cannot load any more object with static TLS
----- End Fatal Exception -------------------------------------------------

you can see the results of the tests here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4754/430/summary.html

@VinInn
Contributor Author

VinInn commented Jul 23, 2014

@Dr15Jones
we need a centralized solution to the TLS problem

@Dr15Jones
Contributor

We were hoping to avoid it given we have no way of controlling the use of TLS in externals. The plan was to get the 'dynamic' increase in TLS space working in the system libraries. However, that only works for shared libraries and not static libraries.

@VinInn
Contributor Author

VinInn commented Jul 23, 2014

Is the problem here that "work_" is file-local and hidden?
Should I make it global?
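
For context on the question above, this is a minimal sketch of what a file-local, thread-local workspace looks like in C++. Only the name `work_` comes from the comment; its element type and the helper `useWorkspace` are assumptions for illustration, since the PR's source is not shown in this thread. An object declared this way lives in the shared library's TLS segment; whether that is related to the dlopen failure is what the discussion is trying to establish.

```cpp
#include <cstddef>
#include <vector>

namespace {
  // File-local (internal linkage) and one instance per thread.
  // The element type is an assumption made for this sketch.
  thread_local std::vector<float> work_;
}

// Hypothetical helper: grows the calling thread's buffer on demand and
// returns its current size in elements. The buffer is never shrunk, so
// repeated calls on the same thread reuse the allocation.
std::size_t useWorkspace(std::size_t n) {
  if (work_.size() < n) work_.resize(n);
  return work_.size();
}
```

Making the variable global instead of file-local would change its linkage and visibility, but it would still be a `thread_local` object.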

@Dr15Jones
Contributor

I doubt the problem originates from file local vs global.
@ktf Does David A's change print the message from the failure? It seems odd that the message says 'static'. With David's change, do you know what the new limiting factors are?

@Dr15Jones
Contributor

Doing a google search on the phrase from the error message gave this response which looks useful

http://stackoverflow.com/questions/19268293/matlab-error-cannot-open-with-static-tls

@Dr15Jones
Contributor

Following more of the google search I think this one looks useful
http://stackoverflow.com/questions/14892101/cannot-load-any-more-object-with-static-tls

basically it seems to imply the problem is if one compiles some code without -fPIC.

@VinInn
Contributor Author

VinInn commented Jul 23, 2014

On 23 Jul, 2014, at 4:32 PM, Chris Jones notifications@github.com wrote:

> Following more of the google search I think this one looks useful
> http://stackoverflow.com/questions/14892101/cannot-load-any-more-object-with-static-tls
>
> basically it seems to imply the problem is if one compiles some code without -fPIC.

we compile with -fPIC

v.

@cmsbuild
Contributor

Pull request #4754 was updated. @nclopezo, @cmsbuild, @Degano, @StoyanStoynev, @slava77 can you please check and sign again.

@Dr15Jones
Contributor

@VinInn you can get rid of the headers altogether by moving the declaration of the class into the appropriate .cc file and moving the edm plugin macro call to the same .cc file.

@ktf
Contributor

ktf commented Jul 23, 2014

For the record, we already have the TLS fix from David in 72X, since pre2. This is something else. We also have -fPIC everywhere. This needs further understanding.

@cmsbuild
Contributor

-1
Tested at: 8fbbfcf
When I ran the RelVals I found an error in the following workflows:
4.53 step2

runTheMatrix-results/4.53_RunPhoton2012B+RunPhoton2012B+HLTD+RECODreHLT+HARVESTDreHLT/step2_RunPhoton2012B+RunPhoton2012B+HLTD+RECODreHLT+HARVESTDreHLT.log
----- Begin Fatal Exception 23-Jul-2014 19:19:32 CEST-----------------------
An exception of category 'PluginLibraryLoadError' occurred while
   [0] Constructing the EventProcessor
Exception Message:
unable to load /build/cmsbuild/jenkins-workarea/workspace/ib-integration-CMSSW_7_2_X-slc6_amd64_gcc481/CMSSW_7_2_X_2014-07-22-1400/lib/slc6_amd64_gcc481/pluginRecoTrackerCkfPatternPlugins.so because dlopen: cannot load any more object with static TLS
----- End Fatal Exception -------------------------------------------------

4.22 step2

runTheMatrix-results/4.22_RunCosmics2011A+RunCosmics2011A+RECOCOSD+ALCACOSD+SKIMCOSD+HARVESTDC/step2_RunCosmics2011A+RunCosmics2011A+RECOCOSD+ALCACOSD+SKIMCOSD+HARVESTDC.log
----- Begin Fatal Exception 23-Jul-2014 19:19:42 CEST-----------------------
An exception of category 'PluginLibraryLoadError' occurred while
   [0] Constructing the EventProcessor
Exception Message:
unable to load /build/cmsbuild/jenkins-workarea/workspace/ib-integration-CMSSW_7_2_X-slc6_amd64_gcc481/CMSSW_7_2_X_2014-07-22-1400/lib/slc6_amd64_gcc481/pluginRecoTrackerCkfPatternPlugins.so because dlopen: cannot load any more object with static TLS
----- End Fatal Exception -------------------------------------------------

401.0 step1

runTheMatrix-results/401.0_TTbarNewMix+TTbarFSPU2+HARVESTFS/step1_TTbarNewMix+TTbarFSPU2+HARVESTFS.log
----- Begin Fatal Exception 23-Jul-2014 19:20:28 CEST-----------------------
An exception of category 'PluginLibraryLoadError' occurred while
   [0] Constructing the EventProcessor
Exception Message:
unable to load /build/cmsbuild/jenkins-workarea/workspace/ib-integration-CMSSW_7_2_X-slc6_amd64_gcc481/CMSSW_7_2_X_2014-07-22-1400/lib/slc6_amd64_gcc481/pluginRecoEgammaEgammaElectronProducersPlugins.so because dlopen: cannot load any more object with static TLS
----- End Fatal Exception -------------------------------------------------

1000.0 step2

runTheMatrix-results/1000.0_RunMinBias2011A+RunMinBias2011A+TIER0+SKIMD+HARVESTDfst2+ALCASPLIT/step2_RunMinBias2011A+RunMinBias2011A+TIER0+SKIMD+HARVESTDfst2+ALCASPLIT.log
----- Begin Fatal Exception 23-Jul-2014 19:21:34 CEST-----------------------
An exception of category 'PluginLibraryLoadError' occurred while
   [0] Constructing the EventProcessor
Exception Message:
unable to load /build/cmsbuild/jenkins-workarea/workspace/ib-integration-CMSSW_7_2_X-slc6_amd64_gcc481/CMSSW_7_2_X_2014-07-22-1400/lib/slc6_amd64_gcc481/pluginRecoTrackerCkfPatternPlugins.so because dlopen: cannot load any more object with static TLS
----- End Fatal Exception -------------------------------------------------

1001.0 step2

runTheMatrix-results/1001.0_RunMinBias2011A+RunMinBias2011A+TIER0EXP+ALCAEXP+ALCAHARVD/step2_RunMinBias2011A+RunMinBias2011A+TIER0EXP+ALCAEXP+ALCAHARVD.log
----- Begin Fatal Exception 23-Jul-2014 19:22:23 CEST-----------------------
An exception of category 'PluginLibraryLoadError' occurred while
   [0] Constructing the EventProcessor
Exception Message:
unable to load /build/cmsbuild/jenkins-workarea/workspace/ib-integration-CMSSW_7_2_X-slc6_amd64_gcc481/CMSSW_7_2_X_2014-07-22-1400/lib/slc6_amd64_gcc481/pluginRecoTrackerCkfPatternPlugins.so because dlopen: cannot load any more object with static TLS
----- End Fatal Exception -------------------------------------------------

1003.0 step2

runTheMatrix-results/1003.0_RunMinBias2012A+RunMinBias2012A+RECODDQM+HARVESTDDQM/step2_RunMinBias2012A+RunMinBias2012A+RECODDQM+HARVESTDDQM.log
----- Begin Fatal Exception 23-Jul-2014 19:22:28 CEST-----------------------
An exception of category 'PluginLibraryLoadError' occurred while
   [0] Constructing the EventProcessor
Exception Message:
unable to load /build/cmsbuild/jenkins-workarea/workspace/ib-integration-CMSSW_7_2_X-slc6_amd64_gcc481/CMSSW_7_2_X_2014-07-22-1400/lib/slc6_amd64_gcc481/pluginRecoTrackerCkfPatternPlugins.so because dlopen: cannot load any more object with static TLS
----- End Fatal Exception -------------------------------------------------

50101.0 step2

runTheMatrix-results/50101.0_SingleMuPt10+SingleMuPt10FSIdINPUT+SingleMuPt10FS_ID/step2_SingleMuPt10+SingleMuPt10FSIdINPUT+SingleMuPt10FS_ID.log
----- Begin Fatal Exception 23-Jul-2014 19:23:34 CEST-----------------------
An exception of category 'PluginLibraryLoadError' occurred while
   [0] Constructing the EventProcessor
Exception Message:
unable to load /build/cmsbuild/jenkins-workarea/workspace/ib-integration-CMSSW_7_2_X-slc6_amd64_gcc481/CMSSW_7_2_X_2014-07-22-1400/lib/slc6_amd64_gcc481/pluginRecoEgammaEgammaElectronProducersPlugins.so because dlopen: cannot load any more object with static TLS
----- End Fatal Exception -------------------------------------------------

25202.0 step2

runTheMatrix-results/25202.0_TTbar_13+TTbar_13+DIGIUP15_PU25+RECOUP15_PU25+HARVEST+MINIAODMC/step2_TTbar_13+TTbar_13+DIGIUP15_PU25+RECOUP15_PU25+HARVEST+MINIAODMC.log
----- Begin Fatal Exception 23-Jul-2014 19:32:00 CEST-----------------------
An exception of category 'PluginLibraryLoadError' occurred while
   [0] Constructing the EventProcessor
Exception Message:
unable to load /build/cmsbuild/jenkins-workarea/workspace/ib-integration-CMSSW_7_2_X-slc6_amd64_gcc481/CMSSW_7_2_X_2014-07-22-1400/lib/slc6_amd64_gcc481/pluginRecoTrackerCkfPatternPlugins.so because dlopen: cannot load any more object with static TLS
----- End Fatal Exception -------------------------------------------------

you can see the results of the tests here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4754/436/summary.html

@VinInn
Contributor Author

VinInn commented Jul 24, 2014

xrootd
/home/vin/smallMatrix/4.22_RunCosmics2011A+RunCosmics2011A+RECOCOSD+ALCACOSD+SKIMCOSD+HARVESTDC /home/vin/TrackerMT/CMSSW_7_2_X_2014-07-24-0200 
[innocent@vinavx2 4.22_RunCosmics2011A+RunCosmics2011A+RECOCOSD+ALCACOSD+SKIMCOSD+HARVESTDC]$ cmsRun step2_RAW2DIGI_L1Reco_RECO_DQM_ALCA.py
24-Jul-2014 18:24:52 CEST  Initiating request to open file file:049F6443-8E53-E011-A943-003048F117EA.root
24-Jul-2014 18:24:52 CEST  Successfully opened file file:049F6443-8E53-E011-A943-003048F117EA.root
HLTConfigData::dump: ProcessName = HLT
HLTConfigData::dump: GlobalTag = GR_H_V15::All
HLTConfigData::dump: TableName = /cdaq/special/Interfill/v2.1/HLT/V2
Begin processing the 1st record. Run 160960, Event 10001082, LumiSection 277 at 24-Jul-2014 18:25:08.310 CEST
%MSG-w EcalDQM:  EcalDQMonitorTask:ecalMonitorTask 24-Jul-2014 18:25:13 CEST  Run: 160960 Event: 10001082
Ecal Monitor Source::runOnCollection: EBReducedRecHit does not exist
%MSG

works...

@VinInn
Contributor Author

VinInn commented Jul 24, 2014

Reading from a local file works; after a file is opened from EOS, dlopen is unhappy.

@VinInn
Contributor Author

VinInn commented Jul 24, 2014

with setenv LD_PRELOAD /home/vin/TrackerMT/CMSSW_7_2_X_2014-07-24-0200/lib/slc6_amd64_gcc481/pluginRecoTrackerCkfPatternPlugins.so

the matrix succeeds..

@Dr15Jones
Contributor

Reading from EOS is done via xrootd, and xrootd starts and stops threads, which causes the TLS problem.

@VinInn
Contributor Author

VinInn commented Jul 25, 2014

It is a bad idea to add -fopenmp only to a plugin;
it should be added to the main...

@cmsbuild
Contributor

Pull request #4754 was updated. @nclopezo, @cmsbuild, @Degano, @StoyanStoynev, @slava77 can you please check and sign again.

@slava77
Contributor

slava77 commented Jul 28, 2014

+1

for #4754 b44cb33
checked in regular mode (no MT) in CMSSW_7_2_X_2014-07-25-0200 (are sign397)
no differences as expected

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next CMSSW_7_2_X IBs unless changes (tests are also fine).

nclopezo added a commit that referenced this pull request Jul 28, 2014
RecoTracker/CkfPattern -- Make CkfPattern MT ready
@nclopezo nclopezo merged commit 1b6bfb5 into cms-sw:CMSSW_7_2_X Jul 28, 2014
@VinInn VinInn deleted the TrackerOMP branch July 13, 2016 13:47