Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pr 112 x Displaced muon in BMTF and uGMT ( unpacker/emulator/packer) #31608

Conversation

rekovic
Copy link
Contributor

@rekovic rekovic commented Sep 28, 2020

PR description:

11_2_X: L1T Displaced muons: Pt from unconstrainedVtx

  • Unpacker/Emulator/Packer for KBMTF and uGMT (unpacker same for uGT)
  • Changes in DataFormat to accessors of the displaced-Pt and impact parameter

PR validation:

if this PR is a backport please specify the original PR and why you need to backport that PR:

Before submitting your pull requests, make sure you followed this checklist:

dinyar and others added 30 commits September 28, 2020 22:29
The BMTF with the Kalman filter sends inverted track addresses,
the uGMT cancel-out unit has been adapted for this. In this
commit it has also been configured to use the new mode and
pick barrel muons from the KBMTF collection.
The original field with track addresses was filled with incorrect or incomplete values for most TFs.
Using static function from RegionalMuonRawDigiTranslator that I factored out in order to generate the raw track address.
May make the interface to get them more clear.
This makes us a bit more consistent with the other fields.
Co-Authored-By: panoskatsoulis <panos.katsoulis@cern.ch>
This should be configured based on eras in the future.
In the MuonUnpacker the wrong overloaded function was being called, this is fixed now.
In the process I also rewrote my earlier attempt, it should be more readable now.
Also add displacement information to uGMT intermediate muons and light
refactoring for clarity.
Packs correctly for Run-2 2016, Run-2 from 2017 (w/ extrapolated coordiantes),
and Run-3
The original field with track addresses was filled with incorrect or incomplete values for most TFs.
Using static function from RegionalMuonRawDigiTranslator that I factored out in order to generate the raw track address.
Includes configuration of packer by Era.
Modifications still needed for the BMTF setup to make use of this.
Not done nicely, should be cleaned up.
@jfernan2
Copy link
Contributor

@DinayR explained them somehow in #31608 (comment)

@makortel
Copy link
Contributor

Apologies for the delay. I added the versioning in [1], I think this is what is required, am I right that L1MuBMTrack doesn't need an iorule for itself because it inherits it from RegionalMuonCand?

From the framework experts comment #31608 (comment), that seems to be the case.

Right, that should be sufficient. An explicit test of reading an old file could be good though, did you try to do that yet?

@dinyar
Copy link
Contributor

dinyar commented Oct 14, 2020

Apologies for the delay. I added the versioning in [1], I think this is what is required, am I right that L1MuBMTrack doesn't need an iorule for itself because it inherits it from RegionalMuonCand?

From the framework experts comment #31608 (comment), that seems to be the case.

Right, that should be sufficient. An explicit test of reading an old file could be good though, did you try to do that yet?

I'm currently running that test. I need to re-emulate the trigger as in previous runs the simKBmtfDigis with that field were dropped.

@dinyar
Copy link
Contributor

dinyar commented Oct 14, 2020

Hi @makortel,

I confirm this works now. Reran the kBMTF emulation to create simKBmtfDigis in vanilla CMSSW_11_0_2 and checked in CMSSW_11_2_X_2020-10-11-2300 with this patch applied. Could see the same value for m_hwPt2/m_hwPtUnconstrained in both.

Cheers,
Dinyar

@makortel
Copy link
Contributor

Thanks @dinyar!

@silviodonato
Copy link
Contributor

+1

@cmsbuild cmsbuild merged commit 7f62a84 into cms-sw:master Oct 14, 2020
@silviodonato
Copy link
Contributor

image
https://cmssdt.cern.ch/SDT/jenkins-artifacts/baseLineComparisons/CMSSW_11_2_X_2020-10-12-2300+4ea00f/39285/10224.0_TTbar_13+2017PU+TTbar_13TeV_TuneCUETP8M1_GenSim+DigiPU+RecoFakeHLTPU+HARVESTFakeHLTPU+Nano/L1T__L1TStage2uGT_calol2ouput_vs_uGTinput_errorSummaryDen.png

I wanted to ask you about this plot, but I see it was already discussed. I repeat your comment here just to be 100% sure that it has been understood.

wf1306.0 and others show mismatches in between Layer-2 and uGT in the sum collection size in the L1T folder as well as the already mentioned difference in the electron trigger plot. I believe this might be because the uGT packer is now aware of eras (see [1]). I'm not an expert for the calo unpacker, so maybe @bundocka can comment on whether they are expecting a different uGT version to unpack the sums correctly.

@dinyar and @rekovic investigated and we confirm that the L1T RelVal plots are improved with this PR, i.e. the ratio plot of uGT objects (unpacked versus unpacked CaloLayer2). Since the HLT uses upacked uGT objects, the change in number of unpacked uGT can describe the change in HLT triggers. @bundocka

@silviodonato
Copy link
Contributor

@dinyar @rekovic we are getting segmentation fault in 136.8391. Could you have a look?

Current Modules:

Module: none (crashed)
Module: none
Module: L1TMuonEndCapTrackProducer:simEmtfDigis
Module: none

A fatal system signal has occurred: segmentation violation

https://cmssdt.cern.ch/SDT/cgi-bin/logreader/slc7_amd64_gcc820/CMSSW_11_2_X_2020-10-14-2300/pyRelValMatrixLogs/run/136.8391_RunExpressPhy2017F+RunExpressPhy2017F+HLTDR2_2017+RECODR2_2017reHLTSiPixelCalZeroBias_Prompt+HARVEST2017/step2_RunExpressPhy2017F+RunExpressPhy2017F+HLTDR2_2017+RECODR2_2017reHLTSiPixelCalZeroBias_Prompt+HARVEST2017.log#/37-37

@dinyar
Copy link
Contributor

dinyar commented Oct 15, 2020

Hi @silviodonato,

ok, I'll have a look and will get back to you.

Cheers,
Dinyar

@dinyar
Copy link
Contributor

dinyar commented Oct 15, 2020

Hi @silviodonato, @Dr15Jones,

I was trying to reproduce this on lxplus, but am currently struggling to do so. Is there a recipe? Just running runTheMatrix.py -l 136.8391 -i all --ibeos leads to different segfaults that make me think I'm doing things wrong..

Cheers,
Dinyar

@makortel
Copy link
Contributor

runTheMatrix.py -l 136.8391 -i all --ibeos

That is the recipe. In IB the stack trace is

#3  0x00002afc046a3359 in sig_dostack_then_abort () from /cvmfs/cms-ib.cern.ch/nweek-02650/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-10-14-2300/lib/slc7_amd64_gcc820/pluginFWCoreServicesPlugins.so
#4  <signal handler called>
#5  0x00002afbfce36027 in TClass::GetRealData(char const*) const () from /cvmfs/cms-ib.cern.ch/nweek-02650/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-10-14-2300/external/slc7_amd64_gcc820/lib/libCore.so
#6  0x00002afbfce37311 in TClass::GetDataMemberOffset(char const*) const () from /cvmfs/cms-ib.cern.ch/nweek-02650/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-10-14-2300/external/slc7_amd64_gcc820/lib/libCore.so
#7  0x00002afc29c1cdee in ROOT::read_l1tcLcLRegionalMuonCand_0(char*, TVirtualObject*) () from /cvmfs/cms-ib.cern.ch/nweek-02650/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-10-14-2300/lib/slc7_amd64_gcc820/libDataFormatsL1TMuon.so
#8  0x00002afbfc8fd0c8 in int TStreamerInfo::ReadBufferArtificial<char**>(TBuffer&, char** const&, TStreamerElement*, int, int) () from /cvmfs/cms-ib.cern.ch/nweek-02650/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-10-14-2300/external/slc7_amd64_gcc820/lib/libRIO.so
#9  0x00002afbfc9c56a7 in int TStreamerInfo::ReadBuffer<char**>(TBuffer&, char** const&, TStreamerInfo::TCompInfo* const*, int, int, int, int, int) () from /cvmfs/cms-ib.cern.ch/nweek-02650/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-10-14-2300/external/slc7_amd64_gcc820/lib/libRIO.so
#10 0x00002afbfc8775d4 in TStreamerInfoActions::VectorLooper::GenericRead(TBuffer&, void*, void const*, TStreamerInfoActions::TLoopConfiguration const*, TStreamerInfoActions::TConfiguration const*) () from /cvmfs/cms-ib.cern.ch/nweek-02650/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-10-14-2300/external/slc7_amd64_gcc820/lib/libRIO.so
#11 0x00002afbfc783bac in TBufferFile::ApplySequence(TStreamerInfoActions::TActionSequence const&, void*, void*) () from /cvmfs/cms-ib.cern.ch/nweek-02650/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-10-14-2300/external/slc7_amd64_gcc820/lib/libRIO.so
#12 0x00002afbfc8c1d84 in int TStreamerInfoActions::ReadSTL<&TStreamerInfoActions::ReadSTLMemberWiseSameClass, &TStreamerInfoActions::ReadSTLObjectWiseFastArray>(TBuffer&, void*, TStreamerInfoActions::TConfiguration const*) () from /cvmfs/cms-ib.cern.ch/nweek-02650/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-10-14-2300/external/slc7_amd64_gcc820/lib/libRIO.so
#13 0x00002afbfc78bb9a in TBufferFile::ReadClassBuffer(TClass const*, void*, TClass const*) () from /cvmfs/cms-ib.cern.ch/nweek-02650/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-10-14-2300/external/slc7_amd64_gcc820/lib/libRIO.so
#14 0x00002afbfc9c55a4 in int TStreamerInfo::ReadBuffer<char**>(TBuffer&, char** const&, TStreamerInfo::TCompInfo* const*, int, int, int, int, int) () from /cvmfs/cms-ib.cern.ch/nweek-02650/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-10-14-2300/external/slc7_amd64_gcc820/lib/libRIO.so
#15 0x00002afbfc84631d in TStreamerInfoActions::GenericReadAction(TBuffer&, void*, TStreamerInfoActions::TConfiguration const*) () from /cvmfs/cms-ib.cern.ch/nweek-02650/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-10-14-2300/external/slc7_amd64_gcc820/lib/libRIO.so
#16 0x00002afbfc784125 in TBufferFile::ApplySequence(TStreamerInfoActions::TActionSequence const&, void*) () from /cvmfs/cms-ib.cern.ch/nweek-02650/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-10-14-2300/external/slc7_amd64_gcc820/lib/libRIO.so
#17 0x00002afbfc29133d in TBranchElement::ReadLeavesMember(TBuffer&) () from /cvmfs/cms-ib.cern.ch/nweek-02650/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-10-14-2300/external/slc7_amd64_gcc820/lib/libTree.so
#18 0x00002afbfc289566 in TBranch::GetEntry(long long, int) () from /cvmfs/cms-ib.cern.ch/nweek-02650/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-10-14-2300/external/slc7_amd64_gcc820/lib/libTree.so
#19 0x00002afbfc2999f9 in TBranchElement::GetEntry(long long, int) () from /cvmfs/cms-ib.cern.ch/nweek-02650/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-10-14-2300/external/slc7_amd64_gcc820/lib/libTree.so
#20 0x00002afbfc2998c6 in TBranchElement::GetEntry(long long, int) () from /cvmfs/cms-ib.cern.ch/nweek-02650/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-10-14-2300/external/slc7_amd64_gcc820/lib/libTree.so
#21 0x00002afc27d9bb26 in edm::RootTree::getEntry(TBranch*, long long) const () from /cvmfs/cms-ib.cern.ch/nweek-02650/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-10-14-2300/lib/slc7_amd64_gcc820/pluginIOPoolInput.so
#22 0x00002afc27d74f72 in edm::RootDelayedReader::getProduct_(edm::BranchID const&, edm::EDProductGetter const*) () from /cvmfs/cms-ib.cern.ch/nweek-02650/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-10-14-2300/lib/slc7_amd64_gcc820/pluginIOPoolInput.so
#23 0x00002afbfbd6787f in edm::DelayedReader::getProduct(edm::BranchID const&, edm::EDProductGetter const*, edm::ModuleCallingContext const*) () from /cvmfs/cms-ib.cern.ch/nweek-02650/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-10-14-2300/lib/slc7_amd64_gcc820/libFWCoreFramework.so
#24 0x00002afbfbe213fd in edm::InputProductResolver::prefetchAsync_(edm::WaitingTask*, edm::Principal const&, bool, edm::ServiceToken const&, edm::SharedResourcesAcquirer*, edm::ModuleCallingContext const*) const::{lambda()#1}::operator()() const () from /cvmfs/cms-ib.cern.ch/nweek-02650/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-10-14-2300/lib/slc7_amd64_gcc820/libFWCoreFramework.so
#25 0x00002afbfbe21514 in edm::SerialTaskQueue::QueuedTask<edm::SerialTaskQueueChain::push<edm::InputProductResolver::prefetchAsync_(edm::WaitingTask*, edm::Principal const&, bool, edm::ServiceToken const&, edm::SharedResourcesAcquirer*, edm::ModuleCallingContext const*) const::{lambda()#1}&>(edm::InputProductResolver::prefetchAsync_(edm::WaitingTask*, edm::Principal const&, bool, edm::ServiceToken const&, edm::SharedResourcesAcquirer*, edm::ModuleCallingContext const*) const::{lambda()#1}&)::{lambda()#1}>::execute() () from /cvmfs/cms-ib.cern.ch/nweek-02650/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-10-14-2300/lib/slc7_amd64_gcc820/libFWCoreFramework.so
#26 0x00002afbfd58fbfd in tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::process_bypass_loop (this=this@entry=0x2afc00f37e00, context_guard=..., t=0x2afc4202cc40, t@entry=0x2afc00f49440, isolation=isolation@entry=0) at ../../src/tbb/custom_scheduler.h:393

From which the line

#7  0x00002afc29c1cdee in ROOT::read_l1tcLcLRegionalMuonCand_0(char*, TVirtualObject*) () from /cvmfs/cms-ib.cern.ch/nweek-02650/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-10-14-2300/lib/slc7_amd64_gcc820/libDataFormatsL1TMuon.so

hints that it is the reading of old l1t::RegionalMuonCand that crashes. So I'd guess something went wrong with the iorule.

@makortel
Copy link
Contributor

Compiling DataFormats/L1Muon with -g -O0 shows that the crashing line (in the dictionary definition) is

static Long_t offset_Onfile_l1tcLcLRegionalMuonCand_m_hwPt2 = oldObj->GetClass()->GetDataMemberOffset("m_hwPt2");

and oldObj->GetClass() returns nullptr.

@makortel
Copy link
Contributor

makortel commented Oct 15, 2020

The function ROOT::read_l1tcLcLRegionalMuonCand_0 looks like

static void read_l1tcLcLRegionalMuonCand_0( char* target, TVirtualObject *oldObj )
{
   static Long_t offset_Onfile_l1tcLcLRegionalMuonCand_m_hwPt2 = oldObj->GetClass()->GetDataMemberOffset("m_hwPt2");

The target points to a C-string of "\300q]\307\377\177", which looks a bit suspicious to me. If I interpret the contents of TVirtualObject correctly, the class name there would be "".

@dinyar
Copy link
Contributor

dinyar commented Oct 16, 2020

Hi @makortel,

Thanks a lot for that, it gave me the hint that of course the RegionalMuonCand dataformat had changed previously (the hwPt2 and hwDXY fields were added sometime in 2018). So I added another version and an iorule that just keeps hwPtUnconstrained 0 for that version in [1]. Could you have a look if that looks reasonable? If it is, should I make a new PR?

Related to this: Should I add a similar rule as [2] for hwDXY to guard against the same potential in the future?

Cheers,
Dinyar

[1] dinyar@9a307c6
[2] dinyar@9a307c6#diff-52d8e15a05ea83d31ae81653b0333f6aecaebd41b2757a4e2079289c340ec1feR9-R11

@silviodonato
Copy link
Contributor

@dinyar I tried to play a bit with the iorule. I found that we get segfaul using version="[0-10]" and version="[1-10]". It works with version="[2-10]" and above. I'm going to open a PR. I think this is related to

Version numbers 0, 1, and 2 have special meaning to ROOT and should not be used as version numbers.

(from https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideCreatingNewProducts#Class_versioning)

@Dr15Jones
Copy link
Contributor

@pcanal see earlier comments in this pull request about crashes in IO rules.

@mrodozov
Copy link
Contributor

mrodozov commented Oct 19, 2020

@dinyar I tried to play a bit with the iorule. I found that we get segfaul using version="[0-10]" and version="[1-10]". It works with version="[2-10]" and above. I'm going to open a PR. I think this is related to

Version numbers 0, 1, and 2 have special meaning to ROOT and should not be used as version numbers.

(from https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideCreatingNewProducts#Class_versioning)

@silviodonato please mention me in te consequent PR I was after this for tomorrow's report on IBs. If this is still present I'd like to mention what was done to be fixed . I'm leaving this for reference https://tinyurl.com/y2a6t6b8

@silviodonato
Copy link
Contributor

See #31848

@makortel
Copy link
Contributor

Thanks @silviodonato. Re-reading @dinyar's #31608 (comment) this makes sense, and I think #31848 is the correct fix. There is no need to set m_hwPtUnconstrained explicitly to zero (as in dinyar@9a307c6) because new fields are set to zero automatically by ROOT.

@Dr15Jones
Copy link
Contributor

There is no need to set m_hwPtUnconstrained explicitly to zero (as in dinyar/cmssw@9a307c6) because new fields are set to zero automatically by ROOT.

More precisely, the value will be what ever it was set to in the default constructor (ROOT doesn't itself zero any fields).

@makortel
Copy link
Contributor

There is no need to set m_hwPtUnconstrained explicitly to zero (as in dinyar/cmssw@9a307c6) because new fields are set to zero automatically by ROOT.

More precisely, the value will be what ever it was set to in the default constructor (ROOT doesn't itself zero any fields).

Thanks for the correction. In this case the default constructor sets the m_hwPtUnconstrained to zero.

@dinyar
Copy link
Contributor

dinyar commented Oct 19, 2020

Hi @silviodonato, @makortel, @Dr15Jones,

Thanks a lot for the explanations! This makes sense to me now.

Cheers,
Dinyar

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants