Turn-off GEM-CSC integrated local trigger in valCscStage2Digis [11_3_X] #33679

Merged
cmsbuild merged 1 commit into cms-sw:CMSSW_11_3_X on May 10, 2021


dildick
Contributor

@dildick dildick commented May 10, 2021

PR description:

Turn off the GEM-CSC integrated local trigger in valCscStage2Digis. The GEM-CSC trigger is not yet deployed at P5. It was previously turned off in 11_2_X.
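
As a rough illustration of what "turning off" a feature in one validation module means, here is a framework-free sketch using plain dictionaries rather than the real CMSSW configuration objects. The parameter names (`comparator_digi_input`, `run_gem_csc_ilt`) and the helper `clone_with_overrides` are hypothetical and are not the actual names touched by this PR; the point is only that the validation copy overrides one flag while the base settings stay unchanged.

```python
import copy

# Hypothetical stand-in for the parameter set of the base CSC trigger emulator.
# The keys are illustrative only; they are NOT the actual CMSSW parameter names.
base_csc_trigger_params = {
    "comparator_digi_input": "simulated",
    "common": {"run_gem_csc_ilt": True},
}


def clone_with_overrides(base, **overrides):
    """Return a deep copy of `base` with overrides merged in; `base` is untouched."""
    cloned = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(cloned.get(key), dict):
            # Merge nested settings instead of replacing the whole sub-dictionary.
            cloned[key].update(value)
        else:
            cloned[key] = value
    return cloned


# Validation copy: read unpacked data and switch the (hypothetical) GEM-CSC flag off.
val_csc_stage2_params = clone_with_overrides(
    base_csc_trigger_params,
    comparator_digi_input="unpacked data",
    common={"run_gem_csc_ilt": False},
)

print(val_csc_stage2_params["common"]["run_gem_csc_ilt"])
```

Because the override lives only in the validation copy, the base emulator settings are left as they were.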

PR validation:

Tested with WF 11634.0.

Also tested this on a data event (341169:517:9310078) that was crashing the T0-replay. This recipe, which crashed before, should now run:

cmsrel CMSSW_11_3_0
cd CMSSW_11_3_0/src/
cmsenv
git cms-init
git cms-rebase-topic -u 33679
scram b -j 4
edmPickEvents.py "/Cosmics/Tier0_REPLAY_2021-v2105100737/RAW" 341169:517:9310078
edmCopyPickMerge outputFile=pickevents.root   eventsToProcess=341169:9310078   inputFiles=/store/backfill/1/data/Tier0_REPLAY_2021/Cosmics/RAW/v2105100737/000/341/169/00000/5467c67e-1dad-4e71-b9ca-a266809f2e01.root
cmsDriver.py test --conditions 112X_dataRun3_Prompt_v5 -s RAW2DIGI,L1Reco,RECO,DQM --process reRECO --data --era Run3 --eventcontent RECO,DQM --scenario cosmics --datatier RECO,DQMIO --customise Configuration/DataProcessing/RecoTLR.customiseCosmicData -n 100 --filein=file:pickevents.root
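
Note that the recipe passes the event as run:lumi:event to edmPickEvents.py but as run:event to edmCopyPickMerge. A minimal sketch of that conversion, assuming the identifier always has exactly three colon-separated integer fields (the helper name `to_events_to_process` is made up for illustration):

```python
def to_events_to_process(event_id: str) -> str:
    """Convert 'run:lumi:event' into the 'run:event' form used for eventsToProcess."""
    # Unpacking raises ValueError if there are not exactly three integer fields.
    run, _lumi, event = (int(field) for field in event_id.split(":"))
    return f"{run}:{event}"


print(to_events_to_process("341169:517:9310078"))  # 341169:9310078
```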

if this PR is a backport please specify the original PR and why you need to backport that PR:

Backport of #33678


@cmsbuild cmsbuild added this to the CMSSW_11_3_X milestone May 10, 2021
@cmsbuild
Contributor

A new Pull Request was created by @dildick (Sven Dildick) for CMSSW_11_3_X.

It involves the following packages:

DQM/L1TMonitor

@andrius-k, @kmaeshima, @ErnestaP, @ahmad3213, @cmsbuild, @jfernan2, @rvenditti can you please review it and eventually sign? Thanks.
@silviodonato, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@jfernan2
Contributor

please test

@silviodonato
Contributor

urgent
to fix 11_3_0 for the next MWGR (if I understand correctly)

@dildick
Contributor Author

dildick commented May 10, 2021

I can confirm that this PR solves the problem. Here's the print-out from the test:

-bash-4.2$ edmPickEvents.py "/Cosmics/Tier0_REPLAY_2021-v2105100737/RAW" 341169:517:9310078

edmCopyPickMerge outputFile=pickevents.root \
  eventsToProcess=341169:9310078 \
  inputFiles=/store/backfill/1/data/Tier0_REPLAY_2021/Cosmics/RAW/v2105100737/000/341/169/00000/5467c67e-1dad-4e71-b9ca-a266809f2e01.root

-bash-4.2$ edmCopyPickMerge outputFile=pickevents.root eventsToProcess=341169:9310078 inputFiles=/store/backfill/1/data/Tier0_REPLAY_2021/Cosmics/RAW/v2105100737/000/341/169/00000/5467c67e-1dad-4e71-b9ca-a266809f2e01.root
10-May-2021 11:42:19 CDT  Initiating request to open file root://cmsxrootd-site.fnal.gov//store/backfill/1/data/Tier0_REPLAY_2021/Cosmics/RAW/v2105100737/000/341/169/00000/5467c67e-1dad-4e71-b9ca-a266809f2e01.root
%MSG-w XrdAdaptorInternal:  file_open 10-May-2021 11:45:06 CDT pre-events
Failed to open file at URL root://cmsxrootd-site.fnal.gov:1094//store/backfill/1/data/Tier0_REPLAY_2021/Cosmics/RAW/v2105100737/000/341/169/00000/5467c67e-1dad-4e71-b9ca-a266809f2e01.root?xrdcl.requuid=81dd00cc-6bd7-4728-a66f-969642f48647.
%MSG
%MSG-w XrdAdaptorInternal:  file_open 10-May-2021 11:45:22 CDT pre-events
Failed to open file at URL root://cmsxrootd-site.fnal.gov:1094//store/backfill/1/data/Tier0_REPLAY_2021/Cosmics/RAW/v2105100737/000/341/169/00000/5467c67e-1dad-4e71-b9ca-a266809f2e01.root?tried=&xrdcl.requuid=14cbc1c8-2b17-4d35-a498-60702047c1bd.
%MSG
10-May-2021 11:45:22 CDT  Initiating request to open file root://cmsxrootd.fnal.gov//store/backfill/1/data/Tier0_REPLAY_2021/Cosmics/RAW/v2105100737/000/341/169/00000/5467c67e-1dad-4e71-b9ca-a266809f2e01.root
%MSG-w XrdAdaptor:  file_open 10-May-2021 11:46:04 CDT pre-events
Data is served from cern.ch instead of original site US-Regional-Nebraska
%MSG
10-May-2021 11:46:05 CDT  Successfully opened file root://cmsxrootd.fnal.gov//store/backfill/1/data/Tier0_REPLAY_2021/Cosmics/RAW/v2105100737/000/341/169/00000/5467c67e-1dad-4e71-b9ca-a266809f2e01.root
Begin processing the 1st record. Run 341169, Event 9310078, LumiSection 517 on stream 0 at 10-May-2021 11:46:12.708 CDT
10-May-2021 11:46:13 CDT  Closed file root://cmsxrootd.fnal.gov//store/backfill/1/data/Tier0_REPLAY_2021/Cosmics/RAW/v2105100737/000/341/169/00000/5467c67e-1dad-4e71-b9ca-a266809f2e01.root

=============================================

MessageLogger Summary

 type     category        sev    module        subroutine        count    total
 ---- -------------------- -- ---------------- ----------------  -----    -----
    1 FastCloningDisabled  -i AfterModBeginStr                       1*       1
    2 XrdAdaptorInternal   -i (NoModuleName)                        24*      24
    3 XrdAdaptorInternal   -i After BeginProce                       2*       2
    4 XrdAdaptorInternal   -i AfterFile                             38*      38
    5 XrdAdaptorInternal   -i PostStreamBeginR                       3*       3
    6 XrdAdaptorInternal   -i file_open                              8*       8
    7 XrdAdaptorInternal   -i source                                 3*       3
    8 XrdFileInfo          -i file_close                             1*       1
    9 XrdFileInfo          -i file_open                              2*       2
   10 FwkReport            -f AfterSource                            1        1
   11 XrdAdaptor           -w file_open                              1        1
   12 XrdAdaptorInternal   -w file_open                              2        2
   13 fileAction           -s file_close                             1        1
   14 fileAction           -s file_open                              3        3

* Some occurrences of this message were suppressed in all logs, due to limits.

 type    category    Examples: run/evt        run/evt          run/evt
 ---- -------------------- ---------------- ---------------- ----------------
    1 FastCloningDisabled  BeforeEvents                      
    2 XrdAdaptorInternal   pre-events       pre-events       pre-events
    3 XrdAdaptorInternal   pre-events       pre-events       
    4 XrdAdaptorInternal   pre-events       pre-events       pre-events
    5 XrdAdaptorInternal   PostStreamBeginR PostStreamBeginR PostStreamBeginR
    6 XrdAdaptorInternal   pre-events       pre-events       pre-events
    7 XrdAdaptorInternal   PreSource        PreSource        PreSource
    8 XrdFileInfo          PostGlobalEndRun                  
    9 XrdFileInfo          pre-events       pre-events       
   10 FwkReport            AfterSource                       
   11 XrdAdaptor           pre-events                        
   12 XrdAdaptorInternal   pre-events       pre-events       
   13 fileAction           PostGlobalEndRun                  
   14 fileAction           pre-events       pre-events       pre-events

Severity    # Occurrences   Total Occurrences
--------    -------------   -----------------
Info                   82                  82
FwkInfo                 1                   1
Warning                 3                   3
System                  4                   4
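
The severity totals in the last table can be cross-checked against the per-category rows. The sketch below sums the "total" column by severity flag; the mapping from the one-letter flags to the labels (`-i` to Info, `-f` to FwkInfo, `-w` to Warning, `-s` to System) is inferred from this print-out and should be treated as an assumption.

```python
import re
from collections import Counter

# Rows copied from the MessageLogger summary above.
rows = """\
    1 FastCloningDisabled  -i AfterModBeginStr                       1*       1
    2 XrdAdaptorInternal   -i (NoModuleName)                        24*      24
    3 XrdAdaptorInternal   -i After BeginProce                       2*       2
    4 XrdAdaptorInternal   -i AfterFile                             38*      38
    5 XrdAdaptorInternal   -i PostStreamBeginR                       3*       3
    6 XrdAdaptorInternal   -i file_open                              8*       8
    7 XrdAdaptorInternal   -i source                                 3*       3
    8 XrdFileInfo          -i file_close                             1*       1
    9 XrdFileInfo          -i file_open                              2*       2
   10 FwkReport            -f AfterSource                            1        1
   11 XrdAdaptor           -w file_open                              1        1
   12 XrdAdaptorInternal   -w file_open                              2        2
   13 fileAction           -s file_close                             1        1
   14 fileAction           -s file_open                              3        3
"""

# Assumed mapping from severity flag to the label used in the final table.
SEVERITY = {"i": "Info", "f": "FwkInfo", "w": "Warning", "s": "System"}

# index, category, -flag, module (may contain spaces), count (optional '*'), total
row_re = re.compile(r"^\s*\d+\s+\S+\s+-(\w)\s+.*?(\d+)\*?\s+(\d+)\s*$")

totals = Counter()
for line in rows.splitlines():
    match = row_re.match(line)
    if match:
        flag, _count, total = match.groups()
        totals[SEVERITY[flag]] += int(total)

print(dict(totals))
```

The sums come out to 82 Info, 1 FwkInfo, 3 Warning and 4 System, which agrees with the severity table in the print-out.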

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-7081cf/14967/summary.html
COMMIT: 65cfc33
CMSSW: CMSSW_11_3_X_2021-05-10-1100/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/33679/14967/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 7 differences found in the comparisons
  • DQMHistoTests: Total files compared: 38
  • DQMHistoTests: Total histograms compared: 2877046
  • DQMHistoTests: Total failures: 56
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 2876967
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.004 KiB( 37 files compared)
  • DQMHistoSizes: changed ( 312.0 ): 0.004 KiB MessageLogger/Warnings
  • Checked 160 log files, 37 edm output root files, 38 DQM output files
  • TriggerResults: no differences found
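
A quick sanity check on the DQMHistoTests numbers above: failures, nulls, successes and skipped add up to the total number of histograms compared. Whether cms-bot defines these four categories as an exact partition of the total is an assumption; the sketch only checks the arithmetic for this report.

```python
import re

# Lines copied from the comparison summary above.
summary = """\
DQMHistoTests: Total files compared: 38
DQMHistoTests: Total histograms compared: 2877046
DQMHistoTests: Total failures: 56
DQMHistoTests: Total nulls: 1
DQMHistoTests: Total successes: 2876967
DQMHistoTests: Total skipped: 22
DQMHistoTests: Total Missing objects: 0
"""

# Map each label (lower-cased) to its integer count.
counts = {
    match.group(1).lower(): int(match.group(2))
    for match in re.finditer(r"DQMHistoTests: Total ([\w ]+): (\d+)", summary)
}

# Assumed to be the mutually exclusive outcomes of a histogram comparison.
outcomes = ("failures", "nulls", "successes", "skipped")
accounted = sum(counts[name] for name in outcomes)

print(accounted, counts["histograms compared"])
```

Both numbers are 2877046, so every compared histogram is accounted for by one of the four outcomes.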

@jfernan2
Contributor

+1

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next CMSSW_11_3_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_12_0_X is complete. This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@silviodonato
Contributor

+1

@cmsbuild cmsbuild merged commit edcacec into cms-sw:CMSSW_11_3_X May 10, 2021
@dildick dildick deleted the from-CMSSW_11_3_X_2021-05-10-1100-disable-GEM-CSC-integrated-local-trigger branch May 11, 2021 02:01
@davidlange6
Contributor

hi- does anyone understand why this issue did not come up during relvals?

@dildick
Contributor Author

dildick commented May 11, 2021

When we tested the 11_2_X and 11_3_X versions, both passed all the normal integration tests. However, the 11_2_X version crashed at P5, and 11_3_X was not tested at P5 at that time. I turned off the GEM-CSC trigger in 11_2_X but did not do so in 11_3_X, where it should have been turned off as well.

@davidlange6
Contributor

davidlange6 commented May 11, 2021 via email

@dildick
Contributor Author

dildick commented May 11, 2021

The code should be run on data to check data-emulator agreement for the CSC local trigger. A few weeks ago they must have switched from running 11_2_X to 11_3_X at P5.

@davidlange6
Contributor

davidlange6 commented May 11, 2021 via email

@silviodonato
Contributor

> hi- does anyone understand why this issue did not come up during relvals?

@davidlange6 @amassiro @boudoul you can easily reproduce the error by cherry-picking the crashing event and using step1 of 138.1

edmPickEvents.py "/Cosmics/Tier0_REPLAY_2021-v2105100737/RAW" 341169:517:9310078

edmCopyPickMerge outputFile=pickevents.root   eventsToProcess=341169:9310078   inputFiles=/store/backfill/1/data/Tier0_REPLAY_2021/Cosmics/RAW/v2105100737/000/341/169/00000/5467c67e-1dad-4e71-b9ca-a266809f2e01.root

cmsDriver.py test1  --conditions auto:run3_data_express -s RAW2DIGI,L1Reco,DQM --process reRECO --data  --era Run3 --eventcontent RECO,DQM --scenario cosmics --datatier RECO,DQMIO --customise Configuration/DataProcessing/RecoTLR.customiseCosmicData -n 100 --filein=file:pickevents.root 

I tried running on /store/backfill/1/data/Tier0_REPLAY_2021/Cosmics/RAW/v2105100737/000/341/169/00000/5467c67e-1dad-4e71-b9ca-a266809f2e01.root and did not get any error after 3000 events (please note that I selected a file that I know contains a crashing event, so there is some bias).

The command used in RelVals is runTheMatrix.py -w 'standard' -b 'PDMVRELVALS-109_CosmicRun' -t 8 -m 1500 -l 138.1,138.2 --noCaf --wm force (https://twiki.cern.ch/twiki/bin/viewauth/CMS/PdmVRelVals2021#11_3_0).
I see they ran on 7M events.

@germanfgv do you know how many events ran successfully in the replay test dmwm/T0#4572 before the crash?

@silviodonato
Contributor


2021-05-19 17:40:16,623:CRITICAL:CMSSW:Error message: An exception of category 'InvalidDetId' occurred while
   [0] Processing  Event run: 341754 lumi: 229 event: 2906869 stream: 3
   [1] Running path 'dqmoffline_step'
   [2] Prefetching for module L1TStage2RegionalMuonCandComp/'l1tdeStage2EmtfComp'
   [3] Calling method for module L1TRawToDigi/'emtfStage2Digis'
Exception Message:
GEMDetId ctor: Invalid parameters:  region 1 ring 1 station -99 layer 0 chamber 32 roll 6
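
To see at a glance which field of the exception message looks wrong, the message can be split into name/value pairs. This is only a sketch: treating -99 as an "unset" sentinel is an assumption made here, not something the log states, and the helper name `parse_detid_fields` is made up.

```python
import re

message = (
    "GEMDetId ctor: Invalid parameters:  "
    "region 1 ring 1 station -99 layer 0 chamber 32 roll 6"
)


def parse_detid_fields(msg: str) -> dict:
    """Extract 'name value' pairs (value may be negative) from the message."""
    return {name: int(value) for name, value in re.findall(r"\b([a-z]+) (-?\d+)", msg)}


fields = parse_detid_fields(message)

# Assumption: -99 marks a value that was never filled in.
suspicious = [name for name, value in fields.items() if value == -99]

print(fields)
print(suspicious)
```

Under that assumption, only `station` is flagged.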

@dildick there is an error in the Express (it should be 11_2_4)
https://hypernews.cern.ch/HyperNews/CMS/get/tier0-Ops/2219.html

Could it be related to this PR? (I mean, would this PR solve this crash?)

cc @cms-sw/l1-l2 @cms-sw/dqm-l2

@dildick
Contributor Author

dildick commented May 20, 2021

@eyigitba This is a crash in emtfStage2Digis, which is the EMTF unpacker?

@germanfgv
Contributor

@silviodonato honestly, I'm not sure where to look for that information, but we still have the logs and PSet for those jobs. So maybe we can find out if you point me in the right direction.

@eyigitba
Contributor

@dildick @silviodonato I am aware of that crash. It's in a part of the EMTF unpacker that was not being used until now. I can submit a PR tomorrow to fix it.
