Prod-v18.0.X #66

Closed
bortigno opened this issue Apr 4, 2019 · 4 comments

bortigno commented Apr 4, 2019

Working on it on branch pre-prod-v18.0.0.

@bortigno bortigno added the 2018 label Apr 4, 2019
@bortigno bortigno added this to the 18.0.0 milestone Apr 4, 2019
@bortigno bortigno self-assigned this Apr 4, 2019

bortigno commented Apr 4, 2019

There is an interface problem in python/Samples.py.

The error message below suggests that inputDBS is no longer part of the Sample class interface:

Creating analyzer and crab config for SingleMu_2018A
Traceback (most recent call last):
  File "./crab/make_crab_script.py", line 81, in <module>
    line = line.replace('samp.inputDBS','%s' %samp.inputDBS)
AttributeError: sample instance has no attribute 'inputDBS'

Checking the Samples.py file in master_2017_94X, I found that inputDBS is an attribute I added recently in order to run on privately produced samples. I have now added it back.
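For illustration, a minimal sketch of how a default inputDBS attribute avoids the AttributeError in the traceback above. The class layout, the constructor arguments, the dataset placeholders and the template line are assumptions, not the actual contents of python/Samples.py or of the crab template; only the replace call is taken from the traceback.

```python
class Sample(object):
    """Hypothetical stand-in for the sample class in python/Samples.py."""

    def __init__(self, name, dataset, inputDBS='global'):
        self.name = name
        self.dataset = dataset
        # Defaulting to 'global' keeps existing samples working; privately
        # produced samples can pass 'phys03' instead.
        self.inputDBS = inputDBS


def fill_template_line(line, samp):
    # Same substitution as the one shown in the traceback.
    return line.replace('samp.inputDBS', '%s' % samp.inputDBS)


# Dataset strings are placeholders, not real dataset names.
official = Sample('SingleMu_2018A', '/Primary/Processed/TIER')
private = Sample('MyPrivateSample', '/Primary/Processed/USER', inputDBS='phys03')

template = "config.Data.inputDBS = 'samp.inputDBS'"  # assumed template line
print(fill_template_line(template, official))
print(fill_template_line(template, private))
```

With a default in the constructor, samples that never set inputDBS explicitly still expose the attribute to make_crab_script.py.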

I also added some more options for test_run = True in python/Samples.py: when test_run = True (for a pre-production, for example), the code by default prepares config files for only one era and one signal.
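A sketch of what such a test_run selection could look like. The dictionary layout, the 'kind' field and the extra sample names are hypothetical; the real logic lives in python/Samples.py and may be organised differently.

```python
def select_samples(samples, test_run):
    """Return all samples, or one data era plus one signal for a test run."""
    if not test_run:
        return list(samples)
    picked, seen = [], set()
    for samp in samples:
        kind = samp['kind']
        # Keep only the first data era and the first signal encountered.
        if kind in ('data', 'signal') and kind not in seen:
            picked.append(samp)
            seen.add(kind)
    return picked


samples = [
    {'name': 'SingleMu_2018A', 'kind': 'data'},
    {'name': 'SingleMu_2018B', 'kind': 'data'},        # hypothetical entry
    {'name': 'H2Mu_gg', 'kind': 'signal'},
    {'name': 'SomeBackground', 'kind': 'background'},  # hypothetical entry
]
print([s['name'] for s in select_samples(samples, test_run=True)])
```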

For a test run, NUM=1 [1], which for the moment does not make much sense to me.

For now I have committed everything, created an annotated tag pre-prod-v18.0.0, and launched the production:

Task name example: 190404_113938:bortigno_crab_SingleMu_2018A_2019_04_04_13_38_prod2018_pre-prod-v18p0p0
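The request-name part of this task name can be reproduced with a few lines of Python. This is only a sketch inferred from the example above: the function name, the argument names and the 'prod2018' default are assumptions, and the replacement of dots by 'p' in the tag is read off the example rather than taken from make_crab_script.py.

```python
from datetime import datetime


def request_name(sample, when, tag, campaign='prod2018'):
    """Build '<sample>_<timestamp>_<campaign>_<tag with dots replaced>'."""
    stamp = when.strftime('%Y_%m_%d_%H_%M')
    safe_tag = tag.replace('.', 'p')  # 'pre-prod-v18.0.0' -> 'pre-prod-v18p0p0'
    return '%s_%s_%s_%s' % (sample, stamp, campaign, safe_tag)


name = request_name('SingleMu_2018A', datetime(2019, 4, 4, 13, 38),
                    'pre-prod-v18.0.0')
print(name)
```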

[1] from https://twiki.cern.ch/twiki/bin/view/CMSPublic/CRAB3ConfigurationFile
"suggests (but not impose) how many units (i.e. files, luminosity sections or events - depending on the splitting mode - see the note about Data.splitting below) to include in each job."
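To see what NUM=1 implies, here is a rough estimate of the number of jobs, assuming NUM ends up as the units-per-job value described in the quote. The unit counts are made up, and since the value is only a suggestion to CRAB, the real job count may differ.

```python
import math


def estimated_jobs(total_units, units_per_job):
    """Approximate number of jobs for a given units-per-job setting."""
    if units_per_job < 1:
        raise ValueError('units_per_job must be at least 1')
    return int(math.ceil(total_units / float(units_per_job)))


# Hypothetical dataset with 500 units (files, lumi sections or events).
print(estimated_jobs(500, 1))
print(estimated_jobs(500, 50))
```

With one unit per job, the number of jobs is as large as the number of units, which may be why NUM=1 looks odd outside of a quick test.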


bortigno commented Apr 5, 2019

Jobs failed because of a missing file:

== CMSSW: edm::FileInPath unable to find file 
MuonReferenceEfficiencies/EfficienciesStudies/2018_trigger/EfficienciesAndSF_2018Data_AfterMuonHLTUpdate.root 
anywhere in the search path.

I checked logs/crab_SingleMu_2018A_2019_04_04_13_38_prod2018_pre-prod-v18p0p0/inputs/90cb3f69-331d-48eb-baf2-391a689f8fc4default.tgz and indeed this file is not included in the tarball, although other non-CMSSW files are included, for example the KaMuCa files. The reason is explained in [1]: "CRAB adds to the user input sandbox any data and interface directory recursively found in $CMSSW_BASE/src."
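The quoted rule can be turned into a quick check on a path relative to $CMSSW_BASE/src. This is a simplified reading of the FAQ sentence, not CRAB's actual implementation, and the second path is a hypothetical example.

```python
def shipped_by_default(path_in_src):
    """True if the file sits somewhere inside a 'data' or 'interface' directory."""
    directories = path_in_src.split('/')[:-1]  # drop the file name
    return 'data' in directories or 'interface' in directories


missing = ('MuonReferenceEfficiencies/EfficienciesStudies/2018_trigger/'
           'EfficienciesAndSF_2018Data_AfterMuonHLTUpdate.root')
print(shipped_by_default(missing))
print(shipped_by_default('Package/SubPackage/data/file.root'))  # hypothetical path
```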

In my case, EfficienciesAndSF_2018Data_AfterMuonHLTUpdate.root is not under a data or interface directory, so I need to add it manually via the CRAB configuration parameter JobType.inputFiles. Some information on how the input files are seen by the job is given in [2]: "if your application expects to find mydir/file1 you should put in crab configuration config.JobType.inputFiles='mydir'. While if you put config.Data.inputFiles='mydir/file1' your application needs to open file1."
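A small model of the behaviour described in [2], to make the mismatch concrete. It only encodes the two cases from the quote; treating a nested directory entry as shipped under its last path component is an assumption, and the function name is hypothetical.

```python
def path_in_job(entry, file_in_src):
    """Path under which a job would see file_in_src for one inputFiles entry."""
    entry = entry.rstrip('/')
    top = entry.rsplit('/', 1)[-1]  # last component of the shipped item
    if file_in_src == entry:
        # A single file arrives flat: only its file name survives.
        return top
    if file_in_src.startswith(entry + '/'):
        # A directory keeps its internal structure under its own name.
        return top + file_in_src[len(entry):]
    return None  # the entry does not ship this file


print(path_in_job('mydir', 'mydir/file1'))
print(path_in_job('mydir/file1', 'mydir/file1'))
```

Listing the efficiency file itself would therefore make it visible only under its bare file name, not under the MuonReferenceEfficiencies/... path that the application asks for.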

My application looks for the full path relative to src, but I don't want to add the full package, nor do I want a different application for CRAB submission and for local running.

I am running some tests to see if I can find a small hack.

Use of wildcard in JobType.inputFiles

  • JobType.inputFiles = ['MuonReferenceEfficiencies/EfficienciesStudies/*/EfficienciesAndSF_2018Data_AfterMuonHLTUpdate.root']

Now I am editing crab/template/crab_config.py, running crab/make_crab_script.py with test_run=True, and submitting the result as a dry run:

python crab/make_crab_script.py 

Getting annotated tag...
Production using code version pre-prod-v18.0.0 starting
crab production directory = crab_2019_04_05_12_05-pre-prod-v18.0.0

Creating analyzer and crab config for SingleMu_2018A
  * Wrote crab_2019_04_05_12_05-pre-prod-v18.0.0/analyzers/SingleMu_2018A.py
  * Wrote crab_2019_04_05_12_05-pre-prod-v18.0.0/configs/SingleMu_2018A.py

Creating analyzer and crab config for H2Mu_gg
  * Wrote crab_2019_04_05_12_05-pre-prod-v18.0.0/analyzers/H2Mu_gg.py
  * Wrote crab_2019_04_05_12_05-pre-prod-v18.0.0/configs/H2Mu_gg.py

Creating submit_all.sh and check_all.sh scripts

crab submit -c crab_2019_04_05_12_05-pre-prod-v18.0.0/configs/H2Mu_gg.py --dryrun
[...]
The input file 'MuonReferenceEfficiencies/EfficienciesStudies/*/EfficienciesAndSF_2018Data_AfterMuonHLTUpdate.root' taken from parameter config.JobType.inputFiles cannot be found.

So a wildcard in JobType.inputFiles does not work.
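Since the CRAB configuration is itself a Python file, one possible workaround for the "cannot be found" message is to expand the wildcard with glob before filling the parameter. This is an untested sketch with a hypothetical helper name; it does not address the path layout inside the job, where the file would presumably still arrive under its bare name.

```python
import glob
import os


def expand_input_files(patterns, base_dir='.'):
    """Expand shell-style wildcards into an explicit, sorted list of files."""
    expanded = []
    for pattern in patterns:
        matches = sorted(glob.glob(os.path.join(base_dir, pattern)))
        if not matches:
            # Fail early instead of letting the submission fail later.
            raise IOError('no file matches %r' % pattern)
        expanded.extend(matches)
    return expanded


# Possible use inside the CRAB config (assumes CMSSW_BASE is set):
# config.JobType.inputFiles = expand_input_files(
#     ['MuonReferenceEfficiencies/EfficienciesStudies/*/'
#      'EfficienciesAndSF_2018Data_AfterMuonHLTUpdate.root'],
#     base_dir=os.path.join(os.environ['CMSSW_BASE'], 'src'))
```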

Adding the full package

du -hs ../../MuonReferenceEfficiencies/
3.8G	../../MuonReferenceEfficiencies/

Not feasible.

Changing the EDAnalyzer.py for crab

The full path is set in the _cff, so I need to strip the directory part from the cms.string parameters process.dimuons.Trig_eff_3_file, MuID_eff_3_file, and MuIso_eff_3_file in crab/template/EDAnalyzer.py.
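The string manipulation involved can be sketched with plain Python strings. Pairing the trigger efficiency file from the error message with Trig_eff_3_file is an assumption, and the dictionary stands in for the real cms.string parameters.

```python
import os


def strip_directories(file_params):
    """Map each parameter name to the bare file name of its path."""
    return dict((name, os.path.basename(path))
                for name, path in file_params.items())


params = {
    # Assumed pairing of parameter and file; path taken from the error message.
    'Trig_eff_3_file': ('MuonReferenceEfficiencies/EfficienciesStudies/'
                        '2018_trigger/'
                        'EfficienciesAndSF_2018Data_AfterMuonHLTUpdate.root'),
}
print(strip_directories(params))

# In crab/template/EDAnalyzer.py this might look like (untested):
# process.dimuons.Trig_eff_3_file = cms.string(
#     os.path.basename(process.dimuons.Trig_eff_3_file.value()))
```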

Testing now using

crab submit -c <crab_dir>/configs/SingleMu_2018A.py --dryrun

The processed dataset name for 2018A was wrong (it needs v2 instead of v1), so I had to fix this in python/Samples.py.

Change of package and file strategy

None of this works: I keep running into similar errors, so I decided to add the files to the repository in the Ntupliser/DiMuon/data folder, as done in the past.
All looks good now.

I created an annotated tag pre-prod-v18.0.2 and am now launching the crab_2019_04_05_18_01-pre-prod-v18.0.2 production.

[1] https://twiki.cern.ch/twiki/bin/view/CMSPublic/CRAB3FAQ#What_are_the_files_CRAB_adds_to
[2] https://twiki.cern.ch/twiki/bin/view/CMSPublic/CRAB3FAQ#How_are_the_inputFiles_handled_i

@bortigno bortigno changed the title from Prod-v18.0.0 to Prod-v18.0.X Apr 5, 2019
@bortigno

The first production, "crab_2019_04_05_18_01-pre-prod-v18.0.2", failed because I had not recompiled the python files.

After compiling I launched "crab_2019_04_26_14_46-pre-prod-v18.0.2", for which most of the samples are now running and seem to be fine. Some of the samples had an outdated dataset name, though; I fixed them in python/Samples.py.

There was also a problem with the submission of ZJets_hiM_MG, because the sample is still in "PRODUCTION" and not in "VALID" status in DBS. To tell CRAB to allow running on samples that are not "VALID", I have to add Data.allowNonValidInputDataset = True to the config.

After doing that, and with the right names for the missing samples, I relaunched as "crab_2019_04_26_16_53-pre-prod-v18.0.2-3-g3985ed2", which was submitted fine.
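A sketch of how the flag could be switched on only for samples that need it, rather than for every config. The helper names and the idea of passing the DBS status in by hand are assumptions; only the parameter name Data.allowNonValidInputDataset comes from the text above.

```python
def data_overrides(dbs_status):
    """Extra Data.* settings needed for a dataset with the given DBS status."""
    overrides = {}
    if dbs_status != 'VALID':
        # Needed for datasets still in e.g. 'PRODUCTION' status.
        overrides['allowNonValidInputDataset'] = True
    return overrides


def render(overrides):
    """Turn the overrides into lines that can be appended to a CRAB config."""
    return ['config.Data.%s = %r' % (key, value)
            for key, value in sorted(overrides.items())]


print(render(data_overrides('PRODUCTION')))
print(render(data_overrides('VALID')))
```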
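The suffix -3-g3985ed2 in the last directory name has the shape of `git describe` output (tag, number of commits since the tag, abbreviated commit hash), so this submission presumably ran on code three commits after the tag. A sketch of a parser for that format, assuming the production script derives the name this way:

```python
import re

# '<tag>-<commits since tag>-g<abbreviated hash>', as printed by git describe.
DESCRIBE = re.compile(r'^(?P<tag>.+)-(?P<ahead>\d+)-g(?P<sha>[0-9a-f]+)$')


def parse_describe(text):
    """Split a git describe string into (tag, commits ahead, short hash)."""
    match = DESCRIBE.match(text)
    if match is None:
        # Exactly on a tag: git describe prints the bare tag name.
        return text, 0, None
    return match.group('tag'), int(match.group('ahead')), match.group('sha')


print(parse_describe('pre-prod-v18.0.2-3-g3985ed2'))
print(parse_describe('pre-prod-v18.0.0'))
```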


bortigno commented May 15, 2019

Data/MC comparison plots available in
http://bortigno.web.cern.ch/bortigno/xmm/hmm/2018/mc_data_stack_pre-prod-v18p0p2/
Merged into master and moved into Prod.v18.1.X #68
