Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make a more robust fix to deal with possible multiple creations of the MF geometries #38669

Closed
perrotta opened this issue Jul 9, 2022 · 10 comments

Comments

@perrotta
Copy link
Contributor

perrotta commented Jul 9, 2022

I'm copying here the comment of @namapane related to the PR #38640 and its backports #38641 and #38653, intended to urgently fix a crash reported when changng runs with dd4hep (see #38624)

The fix was merged, but for a more permanent and robust solution the following points should be addressed (quoting Nicola):

  1. The fix [implementes in Added protection from multiple creation of geometry #38640] consists in caching the DDDetector as a pointer in DD4hep_VolumeBasedMagneticFieldESProducerFromDB::produce. I am not sure if I understand correctly, that the issue is the fact that the MF geometry is re-created among different runs, and you want to prevent this, right?
    One thing that is possibly not self-evident, but that should be kept in mind, is that if a job spans different runs where the magnet current changes, then the MF geometry must be re-made, since a different one is used for different nominal values (becasue data tables for eg 2T maps exist only for a simplified MF geometry). The current fix, as far as I can tell, will prevent this and I expect it to crash in such a case. It is indeed a very exotic case, since we likely do not use anything other than the 3.8T nominal current. Still I think it would be a good idea to figure out a proper solution. That should be: check that the geometry is still valid for the new configuration (based on its geometryVersion); if so keep the existing one; otherwise... we are in trouble, since the existing one must be discarded and a new one must be created.
    In any case I would recommend to add a protection at: https://github.com/cms-sw/cmssw/pull/38653/files#diff-693b9a5f442f0cc09b587f18ad986f1d2f216002d1b44548532d07cf6a222e75R195
    to check that geometryVersion matches the previous' geometry version (which should be also cached) and throw otherwise.
    At least if an exotic case like the one I'm discussing happens we will have to debug a throw instead of some mysterious crash elsewhere due to a mismatch of MF tables and geometry.

  2. Unrelated to the crash, but pixel people beware: using magField->nominalValue() as a physical value as in https://github.com/cms-sw/cmssw/pull/38653/files#diff-cd4b91b1150cb8ea3ac6cdb24d49a6bd99f77a8320a5d4a60eeab3000ff42eb1R141 is not a good idea. nominalValue() is intended to be used as a label, and is truncated to int, with the corresponding loss of accuracy.
    The method inverseBzAtOriginInGeV() was added to the interface for the benefit of pixel code (Move "inverse of Bz at origin in GeV" from PixelRecoUtilities to MagneticField #35081), for example.

  3. MF in SiPixelLorentzAnglePCLHarvester.cc is just used to get Bz at (0,0,0). Building the full MF geometry is not necessary for this purpose. As discussed for example here, [RFC] Optimize MagGeoBuilder with TBB #37936 (comment), I think we should figure out a way to provide Bz at (0,0,0) without building the full map for these kind of purposes. It does not make a difference for processes where the full map is required anyhow, but it will make a lot in cases like the one mentioned above.
    I am not sure how this can be technically achieved; if there is interest in this we can maybe open a discussion in an appropriate issue.

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 9, 2022

A new Issue was created by @perrotta Andrea Perrotta.

@Dr15Jones, @perrotta, @dpiparo, @rappoccio, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@tvami
Copy link
Contributor

tvami commented Jul 9, 2022

assign geometry

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 9, 2022

New categories assigned: geometry

@mdhildreth,@ianna,@Dr15Jones,@makortel,@bsunanda,@civanch you have been requested to review this Pull request/Issue and eventually sign? Thanks

@mmusich
Copy link
Contributor

mmusich commented Jul 9, 2022

assign alca

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 9, 2022

New categories assigned: alca

@yuanchao,@francescobrivio,@malbouis,@tvami,@ChrisMisan you have been requested to review this Pull request/Issue and eventually sign? Thanks

@qliphy
Copy link
Contributor

qliphy commented Jul 11, 2022

Also pointing to the comment from @Dr15Jones here to be addressed together, as another future todo.

@namapane
Copy link
Contributor

Also pointing to the comment from @Dr15Jones here to be addressed together, as another future todo.

I think that @Dr15Jones' comment and point 2) above substantially overlap. I can prepare a PR for this one.
Some notes:
-As mentioned, the solution is to cache geometryVersion and check it is still the same.
-I would also skip the blob retrieving/manipulation, in case of a cache hit.
-In principle we could also think of a more aggressive caching of the full map in order to avoid re-reading data tables unnecessarily at each run boundary That would however require more checks (essentially, that the entire conf is identical), and would not repace the geometry caching as a protection for the observed issue.
So I would consider this for a later time, if there is interest in it.

@tvami
Copy link
Contributor

tvami commented Jul 27, 2022

+alca

@civanch
Copy link
Contributor

civanch commented Aug 9, 2022

+1

@cmsbuild
Copy link
Contributor

cmsbuild commented Aug 9, 2022

This issue is fully signed and ready to be closed.

@qliphy qliphy closed this as completed Aug 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants