Issues with SoA persistency in HLT output files post-alpaka migration #44700

Open
mmusich opened this issue Apr 10, 2024 · 6 comments

mmusich commented Apr 10, 2024

This issue is mostly for documentation purposes, in order to expose what has been discussed in the internal HLT JIRA project ticket CMSHLT-3147.

Quoting from JIRA ticket description:

After the migration of the pixel local+track+vertex reconstruction to Alpaka in the HLT menu (CMSHLT-3125), it was thought that the related SoA collections could be persisted in the output files.

In CMSHLT-3125, the keep statements of the OutputModule of the DQMGPUvsCPU stream have already been updated to include the relevant products (DigiErrors, Clusters, RecHits, Tracks, Vertices).

This means that the DQM plugins which are currently in the Path DQM_PixelReconstruction_v (Sequence: HLTDQMPixelReconstruction) can be removed from the menu, as long as the corresponding monitoring sequence is moved into an appropriate online-DQM client (which reads the DQMGPUvsCPU streamer files).

When trying to implement the above, it was noticed that while the various monitoring plugins in the DQM/SiPixelHeterogeneous package use SoA inputs, the current development HLT menu in CMSSW_14_0_X persists legacy objects in the DQMGPUvsCPU stream.
The change from legacy to SoA objects in the hltOutputDQMGPUvsCPU was implemented via:

OutputModules (1):
  -> hltOutputDQMGPUvsCPU [GlobalEvFOutputModule] CHANGED
       untracked vstring outputCommands [CHANGED]
        =  drop *
        =  keep *Cluster*_hltSiPixelClustersSerialSync_*_*
        =  keep *Cluster*_hltSiPixelClusters_*_*
        +  keep *RecHit*_hltSiPixelRecHitsSoASerialSync_*_*  [-  keep *RecHit*_hltSiPixelRecHitsSerialSync_*_*]
        +  keep *RecHit*_hltSiPixelRecHitsSoA_*_*  [-  keep *RecHit*_hltSiPixelRecHits_*_*]
        =  keep *_hltEcalDigisSerialSync_*_*
        =  keep *_hltEcalDigis_*_*
        =  keep *_hltEcalUncalibRecHitSerialSync_*_*
        =  keep *_hltEcalUncalibRecHit_*_*
        =  keep *_hltHbherecoFromGPU_*_*
        =  keep *_hltHbherecoLegacy_*_*
        +  keep *_hltOnlineBeamSpot_*_*
        \/ keep *_hltParticleFlowClusterHBHESoASerialSync_*_*
        \/ keep *_hltParticleFlowClusterHBHESoA_*_*  [-  keep *_hltPixelTracksSerialSync_*_*]
        +  keep *_hltPixelTracksSoASerialSync_*_*  [-  keep *_hltPixelTracks_*_*]
        +  keep *_hltPixelTracksSoA_*_*  [-  keep *_hltPixelVerticesSerialSync_*_*]
        +  keep *_hltPixelVerticesSoASerialSync_*_*  [-  keep *_hltPixelVertices_*_*]

but tests at runtime (on CPU-only) using the following script:

#!/bin/bash -ex

# CMSSW_14_0_4

hltGetConfiguration /users/musich/tests/dev/CMSSW_14_0_0/CMSHLT-3147/GRun/V2 \
  --globaltag 140X_dataRun3_HLT_v3 \
  --data \
  --no-prescale \
  --output all \
  --eras Run3 --l1-emulator uGT --l1 L1Menu_Collisions2024_v1_1_0_xml \
  --max-events 100 \
  --input /store/data/Run2024B/EphemeralHLTPhysics0/RAW/v1/000/379/075/00000/44f5f661-b536-49d9-b455-8e31371b2d86.root \
  > hlt_mod.py

cmsRun hlt_mod.py &> hlt.log&

resulted in a runtime error:

----- Begin Fatal Exception 10-Apr-2024 13:41:43 CEST-----------------------
An exception of category 'FatalRootError' occurred while
   [0] Calling EventProcessor::runToCompletion (which does almost everything after beginJob and before endJob)
   Additional Info:
      [a] Fatal Root Error: @SUB=TStreamerInfo::Build
SiPixelHitStatusAndCharge: SiPixelHitStatus has no streamer or dictionary, data member "status" will not be saved

----- End Fatal Exception -------------------------------------------------

This seems to come from these keep statements:

'keep *RecHit*_hltSiPixelRecHitsSoASerialSync_*_*',
'keep *RecHit*_hltSiPixelRecHitsSoA_*_*',

Reverting those two statements, one encounters a different error, related to hltPixelTracksSoA*:

----- Begin Fatal Exception 10-Apr-2024 14:01:59 CEST-----------------------
An exception of category 'FatalRootError' occurred while
   [0] Calling EventProcessor::runToCompletion (which does almost everything after beginJob and before endJob)
   Additional Info:
      [a] Fatal Root Error: @SUB=TStreamerInfo::Build
reco::TrackSoA<pixelTopology::Phase1>::Layout<128,false>, unknown type: cms::alpakatools::OneToManyAssocSequential<unsigned int,32769,163840>* hitIndices_
----- End Fatal Exception -------------------------------------------------
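
To see which products a given set of keep statements would pull into the output, the selection logic can be emulated outside of CMSSW. The following is a minimal sketch, assuming that branch names follow the usual `type_label_instance_process` layout and that commands are applied in order with the last match winning. The branch names are hypothetical placeholders (the real friendly class names of the SoA products are not given in this issue), and `selected_branches` is an illustrative helper, not a framework function. Python's `fnmatchcase` only approximates the framework's wildcard handling.

```python
from fnmatch import fnmatchcase


def selected_branches(branches, output_commands):
    """Apply keep/drop commands in order; the last matching command wins."""
    selected = []
    for branch in branches:
        keep = False
        for command in output_commands:
            # Each command looks like "keep <pattern>" or "drop <pattern>".
            action, pattern = command.split(None, 1)
            if fnmatchcase(branch, pattern.strip()):
                keep = action == "keep"
        if keep:
            selected.append(branch)
    return selected


# Hypothetical branch names, for illustration only.
branches = [
    "LegacyRecHitCollection_hltSiPixelRecHits__HLT",
    "SoARecHitCollection_hltSiPixelRecHitsSoA__HLT",
    "SoARecHitCollection_hltSiPixelRecHitsSoASerialSync__HLT",
    "SomeTrackCollection_hltPixelTracksSoA__HLT",
]

# The two RecHit keep statements implicated in the first error.
commands = [
    "drop *",
    "keep *RecHit*_hltSiPixelRecHitsSoASerialSync_*_*",
    "keep *RecHit*_hltSiPixelRecHitsSoA_*_*",
]

for name in selected_branches(branches, commands):
    print(name)
```

With these placeholder names only the two SoA RecHit branches are selected, which is consistent with the first error disappearing once those two keep statements are reverted.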

cmsbuild commented Apr 10, 2024

cms-bot internal usage

@cmsbuild

A new Issue was created by @mmusich.

@sextonkennedy, @makortel, @rappoccio, @antoniovilela, @smuzaffar, @Dr15Jones can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@makortel

There was some discussion in Core Software mattermost on this topic, https://mattermost.web.cern.ch/cms-o-and-c/pl/zgnaaebkq3f7bf3hf9xfaa3zoy onwards

@makortel

assign hlt, heterogeneous

FYI @AdrianoDee

@cmsbuild

New categories assigned: hlt,heterogeneous

@Martin-Grunewald,@mmusich,@fwyzard,@makortel you have been requested to review this Pull request/Issue and eventually sign? Thanks


fwyzard commented Apr 11, 2024

Following an exchange with @AdrianoDee, I understand that the reason for the failure is that the pixel rechit SoA

  GENERATE_SOA_LAYOUT(Layout,
                      SOA_COLUMN(float, xLocal),
                      SOA_COLUMN(float, yLocal),
                      SOA_COLUMN(float, xerrLocal),
                      SOA_COLUMN(float, yerrLocal),
                      SOA_COLUMN(float, xGlobal),
                      SOA_COLUMN(float, yGlobal),
                      SOA_COLUMN(float, zGlobal),
                      SOA_COLUMN(float, rGlobal),
                      SOA_COLUMN(int16_t, iphi),
                      SOA_COLUMN(SiPixelHitStatusAndCharge, chargeAndStatus),
                      SOA_COLUMN(int16_t, clusterSizeX),
                      SOA_COLUMN(int16_t, clusterSizeY),
                      SOA_COLUMN(uint16_t, detectorIndex),
                      SOA_SCALAR(int32_t, offsetBPIX2),
                      SOA_COLUMN(PhiBinnerStorageType, phiBinnerStorage),
                      SOA_SCALAR(HitModuleStartArray, hitsModuleStart),
                      SOA_SCALAR(HitLayerStartArray, hitsLayerStart),
                      SOA_SCALAR(AverageGeometry, averageGeometry),
                      SOA_SCALAR(PhiBinner, phiBinner));

contains as scalar data members some classes that do not have a dictionary, like the PhiBinner.
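
As a rough way to enumerate which members of the layout above are worth checking for a missing dictionary, one can list every column or scalar whose type is not a plain arithmetic type. This is only a sketch: `dictionary_candidates` and `PLAIN_TYPES` are illustrative names, the regular expression handles only the simple `SOA_COLUMN(type, name)` / `SOA_SCALAR(type, name)` forms shown here, and some of the flagged names (e.g. `PhiBinnerStorageType`) are aliases that may well resolve to plain types. The output is a list of candidates to inspect, not a statement about which dictionaries are actually missing.

```python
import re

# Layout text copied from the comment above.
LAYOUT = """
GENERATE_SOA_LAYOUT(Layout,
                    SOA_COLUMN(float, xLocal),
                    SOA_COLUMN(float, yLocal),
                    SOA_COLUMN(float, xerrLocal),
                    SOA_COLUMN(float, yerrLocal),
                    SOA_COLUMN(float, xGlobal),
                    SOA_COLUMN(float, yGlobal),
                    SOA_COLUMN(float, zGlobal),
                    SOA_COLUMN(float, rGlobal),
                    SOA_COLUMN(int16_t, iphi),
                    SOA_COLUMN(SiPixelHitStatusAndCharge, chargeAndStatus),
                    SOA_COLUMN(int16_t, clusterSizeX),
                    SOA_COLUMN(int16_t, clusterSizeY),
                    SOA_COLUMN(uint16_t, detectorIndex),
                    SOA_SCALAR(int32_t, offsetBPIX2),
                    SOA_COLUMN(PhiBinnerStorageType, phiBinnerStorage),
                    SOA_SCALAR(HitModuleStartArray, hitsModuleStart),
                    SOA_SCALAR(HitLayerStartArray, hitsLayerStart),
                    SOA_SCALAR(AverageGeometry, averageGeometry),
                    SOA_SCALAR(PhiBinner, phiBinner));
"""

# Assumption: these type names need no user-provided dictionary.
PLAIN_TYPES = {
    "bool", "char", "float", "double",
    "int8_t", "uint8_t", "int16_t", "uint16_t",
    "int32_t", "uint32_t", "int64_t", "uint64_t",
}

MEMBER = re.compile(r"SOA_(COLUMN|SCALAR)\(\s*([\w:]+)\s*,\s*(\w+)\s*\)")


def dictionary_candidates(layout_text):
    """Return (kind, type, member) for members whose type is not a plain type."""
    return [
        (kind.lower(), type_name, member)
        for kind, type_name, member in MEMBER.findall(layout_text)
        if type_name not in PLAIN_TYPES
    ]


for kind, type_name, member in dictionary_candidates(LAYOUT):
    print(f"{kind:6s} {type_name} {member}")
```

The list includes the scalar `PhiBinner` mentioned above, as well as the column type `SiPixelHitStatusAndCharge` named in the first error message.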
