Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Produce a fake BeamSpot object in BeamSpotOnlineProducer when using transient record logic and OnlineBeamSpotESProducer returned a fake #41597

Merged
merged 7 commits into from
May 15, 2023

Conversation

mmusich
Copy link
Contributor

@mmusich mmusich commented May 9, 2023

PR description:

While working on #41193, I have noticed that despite OnlineBeamSpotESProducer has a refined logic in order to put into the event setup a fake BeamSpot when certain conditions are not met:

if (best) {
return std::shared_ptr<const BeamSpotObjects>(best, edm::do_nothing_deleter());
} else {
return std::shared_ptr<const BeamSpotObjects>(&fakeBS_, edm::do_nothing_deleter());
edm::LogInfo("OnlineBeamSpotESProducer")
<< "None of the Online BeamSpots in the ES is suitable, \n returning a fake one. ";
}

the current BeamSpotOnlineProducer logic, if useTransientRecord_ is true and the OnlineBeamSpotESProducer returned a fake beamspot, just falls back to DB (reading the content of BeamSpotObjectsRcd, populated by the PCL in realtime workflows) - which I have understood from the Beam Spot experts (@gennai) is not the right / expected behaviour.

if (useTransientRecord_) {
auto const& spotDB = iSetup.getData(beamTransientToken_);
if (spotDB.beamType() != 2) {
if (shoutMODE && beamTransientRcdESWatcher_.check(iSetup)) {
edm::LogWarning("BeamSpotFromDB")
<< "Online Beam Spot producer falls back to DB value because the ESProducer returned a fake beamspot ";
}
fallBackToDB = true;

In this PR (in commit ec786fe) I change the current behaviour, such that instead now it will indeed produce a fake BS for consumption online (as it was originally devised).
This has a consequence, that in the case in which we want indeed to fall-back to DB (e.g. for the Phase-2 HLT testing case, after #41193), the change will alter physics in an undesired way. For this reason, it has been necessary to supply to the auto:phase2_realistic GlobalTag key the right BeamSpotOnline*ObjectsRcd payloads in order for the OnlineBeamSpotESProducer to not pick up a fake BeamSpot. This is done in commit 5fbea8a, while commit 30121dd sets the the time threshold to 1e6 seconds (as it's done for the Run-3 menus, see https://github.com/cms-sw/cmssw/pull/41193/files#r1152358315) to the same effect.
The difference in Global Tags is:

To supply such payloads, I have found convenient to create a new plugin BeamSpotOnlineFromOfflineConverter that taking a BeamSpotObject in input creates an sqlite file in output containing a payload of the type BeamSpotOnlineObjects. This has been added in commit e870b17, while commits a03a99d and a03a99d add a utility accessor to the BeamSpotOnlineObjects condtion format to be able to readily copy from a BeamSpotObjects payload (friend class) all the common parameters. Finally commit ecd817c adds this converter to the battery of unit tests of the CondTools/BeamSpot package.

PR validation:

cmssw complies. Passes back unit tests.

If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:

To be backported down to 13.0.X for 2023 data-taking purposes

@francescobrivio @dzuolo

@cmsbuild
Copy link
Contributor

cmsbuild commented May 9, 2023

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-41597/35488

  • This PR adds an extra 32KB to repository

@cmsbuild
Copy link
Contributor

cmsbuild commented May 9, 2023

A new Pull Request was created by @mmusich (Marco Musich) for master.

It involves the following packages:

  • CondFormats/BeamSpotObjects (db, alca)
  • CondTools/BeamSpot (db, alca)
  • Configuration/AlCa (alca)
  • HLTrigger/Configuration (hlt)
  • RecoVertex/BeamSpotProducer (reconstruction, alca)

@Martin-Grunewald, @clacaputo, @cmsbuild, @missirol, @saumyaphor4252, @tvami, @mandrenguyen, @francescobrivio can you please review it and eventually sign? Thanks.
@beaucero, @fabiocos, @VourMa, @silviodonato, @SohamBhattacharya, @GiacomoSguazzoni, @JanFSchulte, @tocheng, @VinInn, @Martin-Grunewald, @missirol, @rovere, @mmusich, @mtosi, @dgulhan, @seemasharmafnal, @francescobrivio this is something you requested to watch as well.
@perrotta, @dpiparo, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@mmusich
Copy link
Contributor Author

mmusich commented May 9, 2023

@cmsbuild, please test

@cmsbuild
Copy link
Contributor

cmsbuild commented May 9, 2023

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-1c9357/32504/summary.html
COMMIT: 63e821e
CMSSW: CMSSW_13_2_X_2023-05-08-2300/el8_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/41597/32504/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially removed 19 lines from the logs
  • Reco comparison results: 10 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3460836
  • DQMHistoTests: Total failures: 68
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3460746
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 47 files compared)
  • Checked 207 log files, 159 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

@francescobrivio
Copy link
Contributor

+1

@missirol
Copy link
Contributor

the current BeamSpotOnlineProducer logic, if useTransientRecord_ is true and the OnlineBeamSpotESProducer returned a fake beamspot, just falls back to DB (reading the content of BeamSpotObjectsRcd, populated by the PCL in realtime workflows) - which I have understood from the Beam Spot experts (@gennai) is not the right / expected behaviour.

I'm afraid I need a clarification on the rationale behind this, and how it can impact HLT online in Run 3.

I asked @francescobrivio offline, and I understood that the main use case to favour the fake BS is to use that at HLT (instead of the BS from PCL) after a technical stop, in case the BS position has changed. This is basically the case 'both online payloads are too old', as far as I understand.

In the ESProducer, the requirement for using one of the online tags is that the payload is recent enough, and also that the corresponding BS is "good" (isGoodBS).

I'm wondering what happens if, in the middle of a run, both DQM clients return a "bad" BS for a given LS: in that case, with this PR, would HLT switch to a fake BS until one of the clients returns a "good" BS again ?

@mmusich
Copy link
Contributor Author

mmusich commented May 11, 2023

in that case, with this PR, would HLT switch to a fake BS until one of the clients returns a "good" BS again ?

yes, excepted that I am not sure this has ever happened in practice (I should add that depending on how old is the last PCL update, there is no guarantee that using that as opposed to the fake one is favored).

@francescobrivio
Copy link
Contributor

in that case, with this PR, would HLT switch to a fake BS until one of the clients returns a "good" BS again ?

yes, excepted that I am not sure this has ever happened in practice

I also don't rember ever seeing this happen so far.

(I should add that depending on how old is the last PCL update, there is no guarantee that using that as opposed to the fake one is favored).

I agree.

@missirol I guess your point here (IIUC) is if we should change the fallBackToDB logic to fall back NOT on the on BeamSpotObjectsRcd written by PCL (which can be as late as 48 hours), but on the most recent BeamSpotOnlineObjects produced by the DQM clients during the same run (and only eventually fall on the PCL BS).

I am not sure if this is possible/advisable.... @mmusich @gennai what do you think?

@mmusich
Copy link
Contributor Author

mmusich commented May 11, 2023

but on the most recent BeamSpotOnlineObjects produced by the DQM clients during the same run

Case in point I am not sure how you could instruct the framework to read an IoV different than the one used for processing the data. I guess you could add a 3rd online tag that stores the values of the already arbitrated beamspot with a latency with respect to the LS processed, but this looks a complication over an already fragile mechanism.
This PR just fullfills what the initial implementation promised, i.e. deliver a fake when you don't have better options.
Anything else requires going back to the drawing board.

@francescobrivio
Copy link
Contributor

Point in case I am not sure how you could instruct the framework to read an IoV different than the one used for processing the data. I guess you could add a 3rd online tag that stores the values of the already arbitrated beamspot with a latency with respect to the LS processed, but this looks a complication over an already fragile mechanism.

The fwk implementation was indeed my main concern. I agree with you Marco!

@mmusich
Copy link
Contributor Author

mmusich commented May 11, 2023

The fwk implementation was indeed my main concern. I agree with you Marco!

IF we want to differentiate the cases in which the transient beamspot is too old from the one in which it is not good (and fall back to DB in that case), I guess we could propagate forward the info from the ESProducer by means of the beamspot type,e.g. use unknown (-1) instead of fake (0). Let me know if you prefer that.

@francescobrivio
Copy link
Contributor

IF we want to differentiate the cases in which the transient beamspot is too old from the one in which it is not good (and fall back to DB in that case), I guess we could propagate forward the info from the ESProducer by means of the beamspot type,e.g. use unknown (-1) instead of fake (0). Let me know if you prefer that.

But no actual differentiation of treatment is possible right? So the propagating forward of the BS type would be only for the logging. In that case I think it can be postponed to a subsequent PR if deemed necessary in the future.

@mmusich
Copy link
Contributor Author

mmusich commented May 11, 2023

But no actual differentiation of treatment is possible right?

It is possible, just as we check here for beam types !=2

we can treat the case 0 and the case -1 differently.
Right now the ESProducer puts -1 for fakes indiscriminately

but we can speciate (of course the logic becomes more complicated)

@francescobrivio
Copy link
Contributor

After private discussion with Marco we concluded that, for the moment, this PR is the best way to proceed forward. Further discussion is needed in case we want to modify the logic and can be left to a following development.

@cms-sw/reconstruction-l2 a kind ping for your review and signature when possible

@clacaputo
Copy link
Contributor

+reconstruction

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Copy link
Contributor

+1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants