Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slowness in cmsRun starts #34633

Closed
makortel opened this issue Jul 26, 2021 · 64 comments · Fixed by #34735
Closed

Slowness in cmsRun starts #34633

makortel opened this issue Jul 26, 2021 · 64 comments · Fixed by #34735

Comments

@makortel
Copy link
Contributor

Last week @Dr15Jones and I noticed some slowness in cmsRun job startups (like up to 20 min). I noticed today

26-Jul-2021 16:08:26 CEST  Initiating request to open file file:step2.root
26-Jul-2021 16:08:28 CEST  Successfully opened file file:step2.root
%MSG-e HLTConfigProvider:  METAnalyzer:pfMetDQMAnalyzerMiniAOD@beginRun  26-Jul-2021 16:13:59 CEST Run: 274199
Begin processing the 1st record. Run 274199, Event 39389360, LumiSection 21 on stream 0 at 26-Jul-2021 16:15:00.232 CEST

in #34621 (comment).

I then tested locally workflow 10024.0 in CMSSW_12_0_X_2021-07-26-1100, and noticed the step2 to take 2 minutes from Successfully opened file and Begin processing the 1st record. The job really should not take 2 minutes to get there.

@makortel
Copy link
Contributor Author

assign core

@cmsbuild
Copy link
Contributor

New categories assigned: core

@Dr15Jones,@smuzaffar,@makortel you have been requested to review this Pull request/Issue and eventually sign? Thanks

@cmsbuild
Copy link
Contributor

A new Issue was created by @makortel Matti Kortelainen.

@Dr15Jones, @perrotta, @dpiparo, @silviodonato, @smuzaffar, @makortel, @qliphy can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@makortel
Copy link
Contributor Author

I ran the step2 of 10024.0 with strace, and noticed it to spend almost 2.5 min in accessing some CSC txt files. There were total of 12420 calls to open(), but only 23 unique file paths

L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePatternConversionLUT_pat0_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePatternConversionLUT_pat1_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePatternConversionLUT_pat2_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePatternConversionLUT_pat3_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePatternConversionLUT_pat4_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePosOffsetLUT_pat0_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePosOffsetLUT_pat1_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePosOffsetLUT_pat2_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePosOffsetLUT_pat3_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePosOffsetLUT_pat4_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodeSlopeLUT_pat0_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodeSlopeLUT_pat1_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodeSlopeLUT_pat2_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodeSlopeLUT_pat3_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodeSlopeLUT_pat4_v1.txt
L1Trigger/CSCTriggerPrimitives/data/LCTCode/CSCLUT_code_to_bestLCT.txt
L1Trigger/CSCTriggerPrimitives/data/LCTCode/CSCLUT_code_to_secondLCT.txt
L1Trigger/CSCTriggerPrimitives/data/ME11/CSCLUT_wg_max_hs_ME1a.txt
L1Trigger/CSCTriggerPrimitives/data/ME11/CSCLUT_wg_max_hs_ME1a_ganged.txt
L1Trigger/CSCTriggerPrimitives/data/ME11/CSCLUT_wg_max_hs_ME1b.txt
L1Trigger/CSCTriggerPrimitives/data/ME11/CSCLUT_wg_min_hs_ME1a.txt
L1Trigger/CSCTriggerPrimitives/data/ME11/CSCLUT_wg_min_hs_ME1a_ganged.txt
L1Trigger/CSCTriggerPrimitives/data/ME11/CSCLUT_wg_min_hs_ME1b.txt

@cms-sw/l1-l2 @cms-sw/csc-dpg-l2 Would you have any thoughts why these files are opened repeatedly (540 times each)?

@makortel
Copy link
Contributor Author

makortel commented Jul 26, 2021

Testing in an earlier IB CMSSW_12_0_X_2021-07-18-2300 I get time stamp difference between the first CSC txt file open and the last open to be 45 s (the number of file open calls is the same). Hmm.

@makortel
Copy link
Contributor Author

makortel commented Jul 26, 2021

I ran the 10024.0 step2 in IgProf in CMSSW_12_0_X_2021-07-26-1100 for 1 event, the profile is here
https://mkortela.web.cern.ch/mkortela/cgi-bin/navigator/CMSSW_12_0_X_2021-07-26-1100/10024_step2_perf

The construction of EventProcessor takes 115.80 s, compared to 63.02 s for the actual processing of 1 event
https://mkortela.web.cern.ch/mkortela/cgi-bin/navigator/CMSSW_12_0_X_2021-07-26-1100/10024_step2_perf/7

98.51 s (i.e. 85 % of the job startup) is spent in the constructor of CSCTriggerPrimitivesProducer
https://mkortela.web.cern.ch/mkortela/cgi-bin/navigator/CMSSW_12_0_X_2021-07-26-1100/10024_step2_perf/18
and 85.92 s of that is spent in reading files
https://mkortela.web.cern.ch/mkortela/cgi-bin/navigator/CMSSW_12_0_X_2021-07-26-1100/10024_step2_perf/27

@kpedro88
Copy link
Contributor

@dildick please take a look at this (yours are the most recent PRs merged in https://github.com/cms-sw/cmssw/commits/master/L1Trigger/CSCTriggerPrimitives)

@dildick
Copy link
Contributor

dildick commented Jul 26, 2021

"Would you have any thoughts why these files are opened repeatedly (540 times each)?" There are 540 CSC chambers, several LUT readers per chamber. I suppose this could be sped up by making a LUT managers class, instantiating them once, and then pass a shared pointer to each chamber.

@dildick
Copy link
Contributor

dildick commented Jul 26, 2021

These LUTs are not actually being used by any WF

L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePatternConversionLUT_pat0_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePatternConversionLUT_pat1_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePatternConversionLUT_pat2_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePatternConversionLUT_pat3_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePatternConversionLUT_pat4_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePosOffsetLUT_pat0_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePosOffsetLUT_pat1_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePosOffsetLUT_pat2_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePosOffsetLUT_pat3_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePosOffsetLUT_pat4_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodeSlopeLUT_pat0_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodeSlopeLUT_pat1_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodeSlopeLUT_pat2_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodeSlopeLUT_pat3_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodeSlopeLUT_pat4_v1.txt
L1Trigger/CSCTriggerPrimitives/data/LCTCode/CSCLUT_code_to_bestLCT.txt
L1Trigger/CSCTriggerPrimitives/data/LCTCode/CSCLUT_code_to_secondLCT.txt
L1Trigger/CSCTriggerPrimitives/data/ME11/CSCLUT_wg_max_hs_ME1a.txt
L1Trigger/CSCTriggerPrimitives/data/ME11/CSCLUT_wg_max_hs_ME1a_ganged.txt
L1Trigger/CSCTriggerPrimitives/data/ME11/CSCLUT_wg_max_hs_ME1b.txt
L1Trigger/CSCTriggerPrimitives/data/ME11/CSCLUT_wg_min_hs_ME1a.txt
L1Trigger/CSCTriggerPrimitives/data/ME11/CSCLUT_wg_min_hs_ME1a_ganged.txt
L1Trigger/CSCTriggerPrimitives/data/ME11/CSCLUT_wg_min_hs_ME1b.txt

These should be only if a certain Run-3 flag is set

L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePatternConversionLUT_pat0_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePatternConversionLUT_pat1_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePatternConversionLUT_pat2_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePatternConversionLUT_pat3_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePatternConversionLUT_pat4_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePosOffsetLUT_pat0_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePosOffsetLUT_pat1_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePosOffsetLUT_pat2_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePosOffsetLUT_pat3_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePosOffsetLUT_pat4_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodeSlopeLUT_pat0_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodeSlopeLUT_pat1_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodeSlopeLUT_pat2_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodeSlopeLUT_pat3_v1.txt
L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodeSlopeLUT_pat4_v1.txt

but that flag is still False.

@kpedro88
Copy link
Contributor

@dildick yes, that sounds like a better implementation - in particular, it seems like this info could be obtained via EventSetup. (Then you don't have to pass a shared pointer manually; the framework will give read-only access to whatever needs it.)

Do you know why this behavior would have changed with the recent PRs?

@smuzaffar
Copy link
Contributor

smuzaffar commented Jul 26, 2021

then something is wrong @dildick . I also ran 10024.0 step2 ( runTheMatrix.py -i all -l 11602.0 -t 4 --ibeos ) and following files were opened 2160 times each (may be 540*4 threads)

   2160 L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePatternConversionLUT_pat0_v1.txt
   2161 L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePatternConversionLUT_pat1_v1.txt
   2160 L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePatternConversionLUT_pat2_v1.txt
   2160 L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePatternConversionLUT_pat3_v1.txt
   2160 L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePatternConversionLUT_pat4_v1.txt
   2160 L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePosOffsetLUT_pat0_v1.txt
   2161 L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePosOffsetLUT_pat1_v1.txt
   2160 L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePosOffsetLUT_pat2_v1.txt
   2160 L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePosOffsetLUT_pat3_v1.txt
   2160 L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodePosOffsetLUT_pat4_v1.txt
   2161 L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodeSlopeLUT_pat0_v1.txt
   2161 L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodeSlopeLUT_pat1_v1.txt
   2161 L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodeSlopeLUT_pat2_v1.txt
   2160 L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodeSlopeLUT_pat3_v1.txt
   2162 L1Trigger/CSCTriggerPrimitives/data/CCLUT/CSCComparatorCodeSlopeLUT_pat4_v1.txt
   2160 L1Trigger/CSCTriggerPrimitives/data/LCTCode/CSCLUT_code_to_bestLCT.txt
   2161 L1Trigger/CSCTriggerPrimitives/data/LCTCode/CSCLUT_code_to_secondLCT.txt
   2160 L1Trigger/CSCTriggerPrimitives/data/ME11/CSCLUT_wg_max_hs_ME1a.txt
   2161 L1Trigger/CSCTriggerPrimitives/data/ME11/CSCLUT_wg_max_hs_ME1a_ganged.txt
   2160 L1Trigger/CSCTriggerPrimitives/data/ME11/CSCLUT_wg_max_hs_ME1b.txt
   2160 L1Trigger/CSCTriggerPrimitives/data/ME11/CSCLUT_wg_min_hs_ME1a.txt
   2160 L1Trigger/CSCTriggerPrimitives/data/ME11/CSCLUT_wg_min_hs_ME1a_ganged.txt
   2162 L1Trigger/CSCTriggerPrimitives/data/ME11/CSCLUT_wg_min_hs_ME1b.txt

@dildick
Copy link
Contributor

dildick commented Jul 26, 2021

I recently modified the GEM-CSC trigger for data vs emulation studies. It relies heavily on LUTs, but they are all in L1Trigger/CSCTriggerPrimitives/data/GEMCSC

This line needs to be in a if (runCCLUT_) statement
https://github.com/cms-sw/cmssw/blob/master/L1Trigger/CSCTriggerPrimitives/src/CSCCathodeLCTProcessor.cc#L107

How would such thing work via EventSetup? Is there an example somewhere I can look into?

@dildick
Copy link
Contributor

dildick commented Jul 26, 2021

2160 L1Trigger/CSCTriggerPrimitives/data/LCTCode/CSCLUT_code_to_bestLCT.txt
   2161 L1Trigger/CSCTriggerPrimitives/data/LCTCode/CSCLUT_code_to_secondLCT.txt
   2160 L1Trigger/CSCTriggerPrimitives/data/ME11/CSCLUT_wg_max_hs_ME1a.txt
   2161 L1Trigger/CSCTriggerPrimitives/data/ME11/CSCLUT_wg_max_hs_ME1a_ganged.txt
   2160 L1Trigger/CSCTriggerPrimitives/data/ME11/CSCLUT_wg_max_hs_ME1b.txt
   2160 L1Trigger/CSCTriggerPrimitives/data/ME11/CSCLUT_wg_min_hs_ME1a.txt
   2160 L1Trigger/CSCTriggerPrimitives/data/ME11/CSCLUT_wg_min_hs_ME1a_ganged.txt
   2162 L1Trigger/CSCTriggerPrimitives/data/ME11/CSCLUT_wg_min_hs_ME1b.txt

should only be loaded if a flag is set in the CSCALCTCrossCLCT class.

@dildick
Copy link
Contributor

dildick commented Jul 26, 2021

A PR I can suggest in the mean time to speed up the code (until I have a proper solution in place):

  • turn off GEM-CSC trigger in ME1/1 and ME2/1 for now
  • hide specific CSC LUTs behind an if statement

@kpedro88
Copy link
Contributor

@dildick here's one of my old PRs introducing an ESProducer and ESProduct for HCAL radiation damage, which also reads constants from txt files that are then accessed by different parts of the code: #18118

In particular, look at:

  • CalibCalorimetry/HcalPlugins/python/HBHEDarkening_cff.py
  • CalibCalorimetry/HcalPlugins/src/HBHEDarkeningEP.h
  • CalibCalorimetry/HcalPlugins/src/HBHEDarkeningEP.cc
  • CondFormats/DataRecord/interface/HBHEDarkeningRecord.h
  • CondFormats/DataRecord/src/HBHEDarkeningRecord.cc
  • CondFormats/HcalObjects/interface/HBHEDarkening.h
  • CondFormats/HcalObjects/src/HBHEDarkening.cc
  • CondFormats/HcalObjects/src/T_EventSetup_HBHEDarkening.cc

And for a simple usage example:

  • CalibCalorimetry/HcalPlugins/test/HBHEDarkeningAnalyzer.cc
  • CalibCalorimetry/HcalPlugins/test/hbhedarkeninganalyzer_cfg.py

(though modern code should use ESTokens, see the master version of the analyzer)

@dildick
Copy link
Contributor

dildick commented Jul 26, 2021

#34638 would temporarily fix some of the problems.

@dildick
Copy link
Contributor

dildick commented Jul 26, 2021

There is a record in CalibMuon/CSCCalibration already, e.g. CSCL1TPParametersConditions but that one is related to individual parameters, not entire lookup tables. We haven't used that one for the CSC trigger emulator since Run-1.

Basically, I would need to define something like CSCL1TPLookupTables with a corresponding class in DataRecord and CSCObjects, an ESProducer and perhaps an analyzer

@makortel
Copy link
Contributor Author

Thanks @dildick for the quick (partial) fix!

There is a record in CalibMuon/CSCCalibration already, e.g. CSCL1TPParametersConditions but that one is related to individual parameters, not entire lookup tables. We haven't used that one for the CSC trigger emulator since Run-1.

From the framework point of view it is perfectly fine to put many kinds of ES products into the same Record, as long as all those products have the same IOV (technically it is the Record's job denote the IOV). The ESSource for CSCL1TPParametersRcd already sets the IOV to be infinity

void CSCL1TPParametersConditions::setIntervalFor(const edm::eventsetup::EventSetupRecordKey &,
const edm::IOVSyncValue &,
edm::ValidityInterval &oValidity) {
oValidity = edm::ValidityInterval(edm::IOVSyncValue::beginOfTime(), edm::IOVSyncValue::endOfTime());
}

so in principle it would be possible to put both ESProducts into the same Record (assuming none of this stuff has ever been put into the CondDB, that imposes different constraints).

Written that, there could be other reasons to prefer a separate Record class for the new ESProduct, and that would be fine as well.

@makortel
Copy link
Contributor Author

I could add that I'm not convinced that these CSC txt files would be the only cause for the overall slowness (e.g. in the RelVal-INPUT tests). We can always speculate with EOS, CVMFS, Frontier etc being slow, but some further investigation into other potential issues could be useful.

@dildick
Copy link
Contributor

dildick commented Jul 27, 2021

If I understand correctly I would have to add files like

 CondFormats/CSCObjects/src/CSCL1TPLookupTable.cc
 CondFormats/CSCObjects/interface/CSCL1TPLookupTable.h
 CondFormats/CSCObjects/src/T_EventSetup_CSCL1TPLookupTable.cc

 CondFormats/DataRecord/src/CSCL1TPLookupTableRecord.cc
 CondFormats/DataRecord/interface/CSCL1TPLookupTableRecord.h

 CalibMuon/CSCCalibration/src/CSCL1TPLookupTableEP.h
 CalibMuon/CSCCalibration/src/CSCL1TPLookupTableEP.cc
 CalibMuon/CSCCalibration/python/CSCL1TPLookupTable_cff.py

for a new record called CSCL1TPLookupTable that contains all the lookup tables for the CSC local trigger.

At the moment, access to lookup tables is provided through L1Trigger/CSCTriggerPrimitives/interface/CSCLUTReader.h.

So that needs to be replaced by the ESProduct CSCL1TPLookupTable. Rather, CSCLUTReader needs to be integrated into CSCL1TPLookupTable. CSCL1TPLookupTable would necessarily have functions to access specific lookup tables.

It looks like HBHEDarkening::readDoseMap reads one LUT for the HCal darkening parameters which is called for each dosemap and that the results are stored in std::map<int, std::vector<std::vector<float>>> dosemaps_; Something similar might work for me, with the data in either std::vector<int> or std::map<unsigned, std::vector<int> >

The locations of the lookup tables go into CSCL1TPLookupTable_cff and most other files are copy-paste from Kevin's work.

@makortel
Copy link
Contributor Author

@dildick Your overall plan looks reasonable.

Assuming the CSCL1TPLookupTableEP is the ESProducer (or ESSource if it also sets the IOV for the Record), the separate header CSCL1TPLookupTableEP.h should be avoided (contents should be placed in the source file instead).

@ramonbrugman
Copy link

ramonbrugman commented Jul 30, 2021

True i also considered it and ive seen a lot of people had jammed repo's

@bainbrid
Copy link
Contributor

LowPtGsfElectronSeedProducer 35 s https://mkortela.web.cern.ch/mkortela/cgi-bin/navigator/CMSSW_12_0_X_2021-07-28-2300/test_11725.0_step3_06_perf/127

@bainbrid
RecoEgamma/ElectronIdentification/data/LowPtElectrons/
RunII_Autumn18_LowPtElectrons_unbiased.xml.gz and RunII_Autumn18_LowPtElectrons_displaced_pt_eta_biased.xml.gz

For a start these are candidates to be converted to root files.
Are you available to make a conversion some time soon?
Further on, is there an opportunity to reduce the size of the training?

Hi @slava77, yes, I can attempt to convert - should be successful and should speed things up here. As for reducing the model size, possibly yes, I'll enquire. However, I am about to take vacations in August. What is the timeline?

@slava77
Copy link
Contributor

slava77 commented Jul 30, 2021

Hi @slava77, yes, I can attempt to convert - should be successful and should speed things up here. As for reducing the model size, possibly yes, I'll enquire. However, I am about to take vacations in August. What is the timeline?

@bainbrid
thank you for confirming that the conversion to a .root file is possible. Are you still available before the vacation to do this?
I understand that a change in the model itself to make it smaller can take much longer; this can follow much later.

@Dr15Jones
Copy link
Contributor

@dildick thanks for the quick turn around! I have a few comments

const edm::ParameterSet& pset_;

needs to be

const edm::ParameterSet pset_;

as the lifetime of the object passed to the module's constructor is not guaranteed to be longer than the module.

In your set functions, e.g.

void set_cclutPosition(const t_lut& lut) { cclutPosition_ = lut; }

pass the argument by value and then do a std::move

void set_cclutPosition(t_lut lut) { cclutPosition_ = std::move(lut); }

Then when calling the set_ function also do a std::move

lut-> set_cclutPosition(std::move(cclutPosition));

This will avoid a copy of the entire container.
[NOTE: your code does not actually call set_cclutPosition, it actually calls set_cclutSlope twice!]

You can get rid of the statics

static const std::vector<int> wg_min_hs_ME1a_;

and, instead, in the .cc file just use an anonyous namespace with constexpr std::arrays. It will give you the same interface but will be built at compile time, not shared library load time.

namespace {
constexpr  std::array<int, 48> CSCALCTCrossCLCT::wg_min_hs_ME1a_ = {...};
...
}

@bainbrid
Copy link
Contributor

Hi @slava77, yes, I can attempt to convert - should be successful and should speed things up here. As for reducing the model size, possibly yes, I'll enquire. However, I am about to take vacations in August. What is the timeline?

@bainbrid
thank you for confirming that the conversion to a .root file is possible. Are you still available before the vacation to do this?
I understand that a change in the model itself to make it smaller can take much longer; this can follow much later.

Unfortunately not... My vacation starts tomorrow, and the conversion will likely require some small dev and some careful validation. I'll get to it as soon as I can.

@slava77
Copy link
Contributor

slava77 commented Jul 30, 2021

Unfortunately not... My vacation starts tomorrow, and the conversion will likely require some small dev and some careful validation. I'll get to it as soon as I can.

I created #34707.

Have a nice vacation.

@dildick
Copy link
Contributor

dildick commented Aug 2, 2021

@Dr15Jones I implemented your suggestions. Testing the code now. Getting this in WF 11634.0.

----- Begin Fatal Exception 30-Jul-2021 14:20:46 CDT-----------------------
An exception of category 'NoRecord' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 1 stream: 0
   [1] Running path 'HLTAnalyzerEndpath'
   [2] Prefetching for module L1TRawToDigi/'hltGtStage2Digis'
   [3] Prefetching for module RawDataCollectorByLabel/'rawDataCollector'
   [4] Prefetching for module CSCDigiToRawModule/'cscpacker'
   [5] Calling method for module CSCTriggerPrimitivesProducer/'simCscTriggerPrimitiveDigis'
Exception Message:
No "CSCL1TPLookupTableME11ILTRcd" record found in the EventSetup.n
 Please add an ESSource or ESProducer that delivers such a record.
----- End Fatal Exception -------------------------------------------------

Where/how should I insert the ES producer CSCL1TPLookupTableEP that delivers CSCL1TPLookupTableME11ILTRcd?

@dildick
Copy link
Contributor

dildick commented Aug 2, 2021

I think I need to inject it here: https://github.com/cms-sw/cmssw/blob/master/L1Trigger/L1TMuon/python/simDigis_cff.py#L16 and other trigger cff files that load the muon trigger.

@Dr15Jones
Copy link
Contributor

I think I need to inject it here: https://github.com/cms-sw/cmssw/blob/master/L1Trigger/L1TMuon/python/simDigis_cff.py#L16 and other trigger cff files that load the muon trigger.

Seems reasonable. In principal, any configuration that directly names the module of type 'CSCTriggerPrimitivesProducer' should have the ESProducer added. I do such checks by doing

> git grep CSCTriggerPrimitivesProducer | grep '\.py'

In that case only one hit

L1Trigger/CSCTriggerPrimitives/python/cscTriggerPrimitiveDigis_cfi.py:    "CSCTriggerPrimitivesProducer",

so it looks like this is more useful

> git grep cscTriggerPrimitiveDigis_cfi | grep '\.py'

which yields

DQM/L1TMonitor/python/L1TStage2Emulator_cff.py:from L1Trigger.CSCTriggerPrimitives.cscTriggerPrimitiveDigis_cfi import *
L1Trigger/CSCTriggerPrimitives/README.md:The configuration for the CSC local trigger is `cscTriggerPrimitiveDigis_cfi`. By default it is integrated into the standard L1 sequence which can be found [here](https://github.com/cms-sw/cmssw/blob/master/L1Trigger/L1TMuon/python/simDigis_cff.py#L15-L20). When emulating ALCTs/CLCTs/LCTs on simulated wire and comparator digis, the inputs are `"simMuonCSCDigis:MuonCSCWireDigi"` and `"simMuonCSCDigis:MuonCSCComparatorDigi"` respectively. When re-emulating ALCTs/CLCTs/LCTs from unpacked wire and comparator digis - from real LHC data, or GIF++ test-beam data - the inputs are `"muonCSCDigis:MuonCSCWireDigi"` and `"muonCSCDigis:MuonCSCComparatorDigi"` respectively. The configuration of the CSC local trigger as part of the standard L1 sequence on data can be found [here](https://github.com/cms-sw/cmssw/blob/master/L1Trigger/Configuration/python/ValL1Emulator_cff.py#L88-L96).
L1Trigger/CSCTriggerPrimitives/test/deprecated/CSCTriggerPrimitivesReader_cfi.py:from L1Trigger.CSCTriggerPrimitives.cscTriggerPrimitiveDigis_cfi import cscTriggerPrimitiveDigis
L1Trigger/CSCTriggerPrimitives/test/runCSCTriggerPrimitiveProducer_cfg.py:process.load("L1Trigger.CSCTriggerPrimitives.cscTriggerPrimitiveDigis_cfi")
L1Trigger/CSCTriggerPrimitives/test/runL1CSCTPEmulatorConfigAnalyzer_cfg.py:process.load('L1Trigger.CSCTriggerPrimitives.cscTriggerPrimitiveDigis_cfi')
L1Trigger/Configuration/python/L1MuonEmulator_cff.py:from L1Trigger.CSCTriggerPrimitives.cscTriggerPrimitiveDigis_cfi import *
L1Trigger/Configuration/python/ValL1Emulator_cff.py:from L1Trigger.CSCTriggerPrimitives.cscTriggerPrimitiveDigis_cfi import *
L1Trigger/Configuration/python/ValL1Emulator_cff.py:valCscTriggerPrimitiveDigis = L1Trigger.CSCTriggerPrimitives.cscTriggerPrimitiveDigis_cfi.cscTriggerPrimitiveDigis.clone()
L1Trigger/Configuration/python/customiseReEmul.py:    from L1Trigger.CSCTriggerPrimitives.cscTriggerPrimitiveDigis_cfi import cscTriggerPrimitiveDigis
L1Trigger/L1TMuon/python/simDigis_cff.py:import L1Trigger.CSCTriggerPrimitives.cscTriggerPrimitiveDigis_cfi
L1Trigger/L1TMuon/python/simDigis_cff.py:simCscTriggerPrimitiveDigis = L1Trigger.CSCTriggerPrimitives.cscTriggerPrimitiveDigis_cfi.cscTriggerPrimitiveDigis.clone(
L1Trigger/L1TMuonEndCap/test/RunTrackFinder_MC.py:process.load('L1Trigger.CSCTriggerPrimitives.cscTriggerPrimitiveDigis_cfi')

@dildick
Copy link
Contributor

dildick commented Aug 3, 2021

@Dr15Jones I have a running version of the CSC/GEM-CSC trigger with the lookup tables in eventsetup objects. Going to test it on a single muon relval and submit the PR later this week. I should check though if the HLT step is not crashing... I switched this off yesterday in my test.

@francescobrivio
Copy link
Contributor

@makortel is there a specific way to check the timing of loading the LUTs for CSC?
In order to test it with #34779 and see if it actually improves the timing?

@Dr15Jones
Copy link
Contributor

@francescobrivio I'm afraid @makortel is on vacation right now. It seems like if you did

> runTheMatrix.py -l 10024.0 -t 8 --maxSteps=2

Then step2 would be the one to look at. I'd suggest running valgrind on the step2 process both before and after applying the pull request and then look at the relative timing for the constructor of the module in question.

@francescobrivio
Copy link
Contributor

@francescobrivio I'm afraid @makortel is on vacation right now. It seems like if you did

> runTheMatrix.py -l 10024.0 -t 8 --maxSteps=2

Then step2 would be the one to look at. I'd suggest running valgrind on the step2 process both before and after applying the pull request and then look at the relative timing for the constructor of the module in question.

Thanks @Dr15Jones ! We'll try that then! (FYI @dildick)

@qliphy qliphy reopened this Aug 11, 2021
@qliphy
Copy link
Contributor

qliphy commented Aug 11, 2021

#34735 is only partial fix.

@bainbrid
Copy link
Contributor

LowPtGsfElectronSeedProducer 35 s https://mkortela.web.cern.ch/mkortela/cgi-bin/navigator/CMSSW_12_0_X_2021-07-28-2300/test_11725.0_step3_06_perf/127

@bainbrid
RecoEgamma/ElectronIdentification/data/LowPtElectrons/
RunII_Autumn18_LowPtElectrons_unbiased.xml.gz and RunII_Autumn18_LowPtElectrons_displaced_pt_eta_biased.xml.gz
For a start these are candidates to be converted to root files.
Are you available to make a conversion some time soon?
Further on, is there an opportunity to reduce the size of the training?

Hi @slava77, yes, I can attempt to convert - should be successful and should speed things up here. As for reducing the model size, possibly yes, I'll enquire. However, I am about to take vacations in August. What is the timeline?

Hi all, I will try to tackle this today. Can somebody please give me the commands used to generate the (IgProf?) profiling logs above so that I can test myself? Otherwise, I can just create the PR and let somebody else measure the effect?

@slava77
Copy link
Contributor

slava77 commented Aug 17, 2021

Otherwise, I can just create the PR and let somebody else measure the effect?

we can test it in the PR with "enable profiling"

@bainbrid
Copy link
Contributor

Otherwise, I can just create the PR and let somebody else measure the effect?

we can test it in the PR with "enable profiling"

Here is the PR: #34908

@perrotta
Copy link
Contributor

perrotta commented Nov 4, 2021

@makortel can this issue be considered closed by #34908?

@makortel
Copy link
Contributor Author

makortel commented Nov 4, 2021

Yes

@makortel makortel closed this as completed Nov 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.