CMS: provenance metadata for 2015 MC #68

Closed
katilp opened this issue Jan 1, 2020 · 7 comments

@katilp
Member

katilp commented Jan 1, 2020

(from #65)

Run2 legacy metadata of 2016-2018 will be extracted from the ultra-legacy production, see https://github.com/cernopendata/data-curation/tree/master/cms-run2-ultra-legacy-production.

Run1 metadata has been extracted with

2015 is a separate production and the script may need to be modified.

Run2 Fall15 MiniAOD v2 campaign (76X version 2)
dataset=/*/*RunIIFall15MiniAODv2-PU25nsData2015v1*/*
Size: 227.4 TB
7114 datasets
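
As a rough illustration of what the dataset pattern above selects, the following sketch filters a list of dataset names with Python's `fnmatch`. The helper `matches_campaign` is hypothetical, the second sample name is made up for contrast, and `fnmatch` wildcards only approximate DAS wildcard semantics (here `*` can also match `/`).

```python
from fnmatch import fnmatchcase

# Campaign pattern quoted in the issue text.
CAMPAIGN_PATTERN = "/*/*RunIIFall15MiniAODv2-PU25nsData2015v1*/*"


def matches_campaign(dataset, pattern=CAMPAIGN_PATTERN):
    """Return True if a dataset name matches the campaign pattern."""
    return fnmatchcase(dataset, pattern)


datasets = [
    # Real name, taken from a later comment in this thread.
    "/GG_M-2000To4000_Pt-70_13TeV-sherpa/"
    "RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM",
    # Made-up name from a different campaign, for contrast only.
    "/SomeSample_13TeV/RunIISummer16MiniAODv2-Example_v1/MINIAODSIM",
]

selected = [d for d in datasets if matches_campaign(d)]
print(len(selected), "of", len(datasets), "names match the campaign pattern")
```

Running the same filter over the full dataset list would be one way to cross-check the dataset count quoted above.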

Check if the 2015 MC extraction works with the ultra-legacy script, or if the previous Run1 scripts should be used.

Run the script and prepare the records.

@katilp katilp changed the title from "CMS: provenac" to "CMS: provenance metadata for 2015 MC" Jan 1, 2020
@katilp katilp added this to the CMS-Spring20 milestone Jan 1, 2020
@katilp
Member Author

katilp commented Apr 4, 2020

Note that the Run1 datasets are in AODSIM format, whereas the Run2 datasets will be in MiniAODSIM, an additional step in the production chain that did not exist in Run1.
See an example of a MiniAODSIM dataset that was released as part of the ML sample release in July 2019. The output of the script should look like that record.

@mokotus mokotus self-assigned this Apr 8, 2020
@katilp
Member Author

katilp commented Apr 30, 2020

Because of peculiarities of the 2015 MC production, some McM entries have an empty input data field even though the information is available in DAS.

The script in https://github.com/cernopendata/data-curation/tree/master/cms-YYYY-simulated-datasets can be used for all further provenance extractions. The query logic (parent relationship from DAS) should not be modified for 2015; that can be done for UL afterwards.
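
To make the "parent relationship" idea concrete, here is a minimal sketch of walking a provenance chain from a MiniAODSIM dataset back to its earliest ancestor. It is not the logic of the curation script: `provenance_chain` and `data_tier` are hypothetical helpers, the lookup is a plain callable (in practice it would presumably wrap a DAS parent query), and the dataset names and tier sequence in `fake_parents` are invented for illustration.

```python
def data_tier(dataset):
    """Return the data tier, the last component of /primary/processed/TIER."""
    return dataset.strip("/").split("/")[2]


def provenance_chain(dataset, get_parent, max_depth=10):
    """Follow parent links from a dataset back to its earliest ancestor.

    get_parent is any callable that returns the parent dataset name,
    or None when no parent is known.
    """
    chain = [dataset]
    while len(chain) <= max_depth:
        parent = get_parent(chain[-1])
        # Stop when there is no parent, or when a loop is detected.
        if not parent or parent in chain:
            break
        chain.append(parent)
    return chain


# Hypothetical parent table, used only to exercise the function.
fake_parents = {
    "/Sample/Campaign-v1/MINIAODSIM": "/Sample/Campaign-v1/AODSIM",
    "/Sample/Campaign-v1/AODSIM": "/Sample/Campaign-v1/GEN-SIM-RAW",
    "/Sample/Campaign-v1/GEN-SIM-RAW": "/Sample/Campaign-v1/GEN-SIM",
}

chain = provenance_chain("/Sample/Campaign-v1/MINIAODSIM", fake_parents.get)
print([data_tier(d) for d in chain])
```

Passing the lookup in as a callable keeps the traversal independent of where the parent information comes from, which matters here because McM and DAS do not always agree.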

To test:

  • Create a short CMS-0000-mc-datasets.txt in input
  • Switch to bash
  • Get a grid proxy: voms-proxy-init --voms cms --rfc --valid 190:00
  • Edit the script to pass no-create-eos-indexes and the input file to the first step
  • Activate a python3 environment, e.g. with source /afs/cern.ch/user/r/reana/public/reana/bin/activate, and check the Python version with python -V
  • Run YYYY=0000 ./make_local_cache.sh
To do:

  • Comment out the config extraction and check, when running over all files, in how many cases the input data is not available in McM.
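
The count in the to-do item could be obtained along the following lines once the McM responses are cached locally. This is a sketch under assumptions: `count_missing_input` is a hypothetical helper, the field name `input_dataset` and the layout of the entries are guesses at what the cached McM records look like, and the sample entries are invented.

```python
def count_missing_input(entries, field="input_dataset"):
    """Return (missing, total): how many entries lack a non-empty input field."""
    missing = [e for e in entries if not (e.get(field) or "").strip()]
    return len(missing), len(entries)


# Invented entries, standing in for cached McM records.
entries = [
    {"prepid": "EXAMPLE-001", "input_dataset": "/Sample/Campaign-v1/AODSIM"},
    {"prepid": "EXAMPLE-002", "input_dataset": ""},
    {"prepid": "EXAMPLE-003"},
]

n_missing, n_total = count_missing_input(entries)
print(f"{n_missing} of {n_total} entries have no input dataset")
```

An entry counts as missing whether the field is absent, empty, or only whitespace, since all three cases leave the provenance chain without a parent from McM.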

@katilp katilp assigned katilp and unassigned mokotus Feb 16, 2021
@katilp katilp removed this from the CMS-Spring20 milestone Oct 18, 2021
@katilp katilp added this to In progress in CMS-2015-Open-Data-Release Oct 18, 2021
@katilp
Member Author

katilp commented Oct 18, 2021

Extraction done by @OsamaMomani, with some modifications to the script; all of it is to be added in a new cms-2015-simulated-datasets folder.

OK, but provenance was not found for three datasets. A manual search through DAS and McM gives the following:

➡️ Leave out the invalid one, and build the provenance of the two others by hand.

@OsamaMomani
Member

These three datasets affect three steps in the following records:

  1. For the record /AToZhToLLTauTau_M-260_13TeV_madgraph_4f_LO/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM
  2. For the record /GG_M-2000To4000_Pt-70_13TeV-sherpa/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v1/MINIAODSIM
  3. For the record /ZprimeToA0hToA0chichihZZTo4l_2HDM_MZp-1400_MA0-300_13TeV-madgraph-pythia8/RunIIFall15MiniAODv2-PU25nsData2015v1_76X_mcRun2_asymptotic_v12-v2/MINIAODSIM

@katilp
Member Author

katilp commented Dec 17, 2021

McM queries are empty for these output datasets.

@katilp
Copy link
Member Author

katilp commented Dec 18, 2021

The 5 missing configs are in /eos/opendata/cms/upload/kati/addedconfigs, ready to be moved under /eos/opendata/cms/configuration-files/MonteCarlo2015

The records are in

cernopendata/modules/fixtures/data/records/cms-simulated-datasets-2015-part_01.json:    "recid": "15333",
cernopendata/modules/fixtures/data/records/cms-simulated-datasets-2015-part_04.json:    "recid": "16770",
cernopendata/modules/fixtures/data/records/cms-simulated-datasets-2015-part_14.json:    "recid": "21714",

@tiborsimko
Member

The 5 files are already in the target location, for example:

/eos/opendata/cms/configuration-files/MonteCarlo2015/1afa7fc4132ca1afecdba4b11c0d3b18.configFile.py
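
A check of this kind can be scripted by comparing file names between the upload folder and the target folder. The sketch below assumes that the config files are identified by name alone and all end in `.configFile.py`, as in the example path above; `missing_in_target` is a hypothetical helper, and the comparison does not look at file contents.

```python
from pathlib import Path


def missing_in_target(source_dir, target_dir, suffix=".configFile.py"):
    """Return sorted names of config files in source_dir that target_dir lacks."""
    source = {p.name for p in Path(source_dir).glob(f"*{suffix}")}
    target = {p.name for p in Path(target_dir).glob(f"*{suffix}")}
    return sorted(source - target)


if __name__ == "__main__":
    # Paths as quoted in this thread; they are only reachable where EOS is mounted.
    src = "/eos/opendata/cms/upload/kati/addedconfigs"
    dst = "/eos/opendata/cms/configuration-files/MonteCarlo2015"
    print(missing_in_target(src, dst))
```

An empty list would mean every uploaded config already has a same-named file in the target location.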

@katilp katilp closed this as completed Dec 20, 2021
CMS-2015-Open-Data-Release automation moved this from In progress to Done Dec 20, 2021