Geonmo Ryu edited this page Dec 1, 2016 · 26 revisions

CATTuple production

Please follow the instructions below.

1. Prepare for job submission

  1. Create a new tab for this production in the Google spreadsheet.
  • Recommended: clone the previous version's tab.
  • Set the cell colour to red to prevent use of a bad dataset.
  2. Modify catGetDatasetInfo to add the new tab's gid.
    • In the Google spreadsheet, the gid appears in the address bar, e.g. gid=12345678.
    • Add this gid to the dict in the catGetDatasetInfo script (see catGetDatasetInfo#L16).
  3. Run the catGetDatasetInfo script to download the "dataset json" from the Google spreadsheet.
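
As a minimal sketch of the gid step above, the snippet below pulls the gid out of a spreadsheet URL copied from the address bar and stores it in a dict. The function name `extract_gid`, the dict name `gids`, its keys, and the placeholder URL are all hypothetical; the real dict in catGetDatasetInfo may be laid out differently.

```python
import re


def extract_gid(url):
    """Return the gid value found in a Google Sheets URL, or None if absent."""
    match = re.search(r"[#?&]gid=(\d+)", url)
    return match.group(1) if match else None


# Hypothetical mapping from production name to tab gid (not the real
# catGetDatasetInfo dict, whose keys and layout may differ).
gids = {"previous-production": "0"}

# Placeholder URL: SHEET_ID stands in for the real spreadsheet id.
url = "https://docs.google.com/spreadsheets/d/SHEET_ID/edit#gid=12345678"
gids["new-production"] = extract_gid(url)
print(gids["new-production"])
```

Keeping the gid as a string avoids any reformatting when it is pasted back into a URL.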

2. Submit crab jobs

  1. Submit the crab jobs with the "submitCrab3.py" script in the prod directory.
  2. Manage the crab jobs: check them with "crab status" and resubmit failed jobs.
  • Jobs take between 2 days and 1 week to finish, depending on site transfer quality.
  • Notice: real data must reach 100% success.
  • Info: for MC datasets, a few failed tasks are acceptable for analysis.
  3. For real data, make a "processedLumis json" file with "crab report".
  4. The "processedLumis json" is made with the silver JSON, so a golden JSON version is also needed. joinLumiJsonByAnd is useful for generating the golden version:
  • joinLumiJsonByAnd $CMSSW_BASE/src/CATTools/CatProducer/data/LumiMask/ Cert_13TeV_16Dec2015ReReco_Collisions15_25ns_JSON.txt
  5. Copy these "processedLumis json" files and the additional information files to the /xrootd/store/group/CAT directory.
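
To illustrate what taking the "AND" of two lumi JSON files means, here is a minimal sketch. It is not the joinLumiJsonByAnd implementation. It assumes the usual lumi-mask layout, a dict mapping run number strings to lists of inclusive [first, last] lumi-section ranges, with no overlapping ranges inside one list. The function names and the run numbers are made up for the example.

```python
import json


def intersect_ranges(a, b):
    """Intersect two lists of inclusive [first, last] lumi ranges."""
    a, b = sorted(a), sorted(b)
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo <= hi:
            out.append([lo, hi])
        # Advance whichever range ends first; the other may still overlap
        # with the next range on this side.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def and_lumi_masks(processed, golden):
    """Keep only runs and lumi sections present in both masks."""
    result = {}
    for run, ranges in processed.items():
        if run in golden:
            common = intersect_ranges(ranges, golden[run])
            if common:
                result[run] = common
    return result


# Made-up run numbers and ranges, for illustration only.
processed = {"100001": [[1, 50], [60, 80]], "100002": [[1, 10]]}
golden = {"100001": [[5, 65]]}
print(json.dumps(and_lumi_masks(processed, golden), sort_keys=True))
```

In this toy input, run 100002 is dropped because it is absent from the golden mask, and run 100001 keeps only the lumi sections covered by both files.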

3. Validate and update the Google spreadsheet

  1. Update the LFN paths in the Google spreadsheet. They start with "/store/group/CAT"; see the previous version's tab for examples.
    • These LFN paths can be found in the output of the "catGetDatasetInfo" script.
  2. Run catGetDatasetInfo again to check the updated paths and create the file lists.
  3. Calculate the total number of visited events.
  • for i in dataset_*.txt; do cat $i | grep -v '^#' | xargs -P20 -n1 edmFileUtil | grep events | awk '{x+=$6}END{print "'$i' " x}'; done
  • Scanning the files takes a very long time.
  • If you find broken files, edit the dataset_*.txt file to remove them or comment them out.
  • Update the dataset's total event count in the spreadsheet and restore the cell colour.
  4. Calculate the dataset's luminosity.
  • brilcalc lumi --normtag /afs/cern.ch/user/l/lumipro/public/normtag_file/OfflineNormtagV2.json -i processedLumis.json -b stable -u /pb

  • Use the latest normtag from the PdmV homepage or the lumi group's HyperNews; a wrong normtag changes the calculated luminosity.
  • Enter the luminosity value in the lumi column cell for the real data.
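
The event-counting step in this section can also be sketched in Python. The snippet below comments out broken files in a file list and sums event counts from per-file summary lines. It assumes each summary line carries the event count in its 6th whitespace-separated field, as the awk command above does. The function names, file paths, and sample lines are hypothetical.

```python
def comment_out(lines, broken):
    """Prefix every listed file that appears in `broken` with '#'."""
    out = []
    for line in lines:
        name = line.strip()
        if name and not name.startswith("#") and name in broken:
            out.append("#" + name)
        else:
            out.append(name)
    return out


def total_events(summary_lines):
    """Sum the 6th field of every line that mentions 'events'."""
    return sum(int(line.split()[5]) for line in summary_lines if "events" in line)


# Hypothetical file list and summary output, for illustration only.
file_list = ["/store/group/CAT/a.root", "/store/group/CAT/b.root"]
summaries = [
    "/store/group/CAT/a.root (1 runs, 2 lumis, 100 events, 2048 bytes)",
    "/store/group/CAT/b.root (1 runs, 3 lumis, 250 events, 4096 bytes)",
]

print(comment_out(file_list, {"/store/group/CAT/b.root"}))
print(total_events(summaries))
```

Commenting a file out rather than deleting it keeps a record of which files were excluded, and the shell loop already skips lines starting with '#'.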