Profiling at T0 AlCa and DQM workflows #38198

Open
tvami opened this issue Jun 2, 2022 · 25 comments

Comments

@tvami
Contributor

tvami commented Jun 2, 2022

Follow-up to issue #36282 and the cmsTalk thread:
https://cms-talk.web.cern.ch/t/high-memory-usage-in-promptreco-jobs-for-run-352516/11040

The issue is that the workflow chosen in GitHub issue #36282 is based on the MET dataset, so AlCaHcalHBHEMuonProducer does not run in it. (It is attached to the MinBias and SingleMuon PDs.)

This is a general problem for testing: certain ALCARECOs belong to certain PDs (as defined in the AlCaRECO matrix). So we either test several workflows, or pick the one PD that has the most ALCARECOs connected to it. That would be SingleMuon.

If that's the preferred solution, we can set up a new workflow after the Run 3 SingleMuon PD is done (next week?).
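The selection described above (pick the PD with the most ALCARECOs attached) can be sketched in a few lines of Python. This is only an illustration: the `SingleMuon` entry is the list quoted later in this thread, while the `MinBias` and `MET` entries are placeholders and do not reflect the real AlCaRECO matrix. The names `ALCARECO_MATRIX` and `best_covering_pd` are hypothetical.

```python
# Illustrative PD -> ALCARECO mapping. Only the SingleMuon entry comes from
# this thread; the MinBias and MET entries are placeholders, NOT the real matrix.
ALCARECO_MATRIX = {
    "SingleMuon": [
        "TkAlMuonIsolated", "HcalCalIterativePhiSym", "MuAlCalIsolatedMu",
        "HcalCalHO", "HcalCalHBHEMuonProducerFilter",
        "SiPixelCalSingleMuonLoose", "SiPixelCalSingleMuonTight",
    ],
    "MinBias": ["PlaceholderMinBiasAlCaA", "PlaceholderMinBiasAlCaB"],
    "MET": ["PlaceholderMETAlCa"],
}


def best_covering_pd(matrix):
    """Return (pd_name, count) for the PD with the most distinct ALCARECOs."""
    pd_name = max(matrix, key=lambda pd: len(set(matrix[pd])))
    return pd_name, len(set(matrix[pd_name]))


print(best_covering_pd(ALCARECO_MATRIX))
```

With the real matrix loaded in place of the placeholders, the same function would show whether a single workflow covers enough ALCARECOs or whether several are needed.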

@cmsbuild
Contributor

cmsbuild commented Jun 2, 2022

A new Issue was created by @tvami Tamas Vami.

@Dr15Jones, @perrotta, @dpiparo, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@tvami
Contributor Author

tvami commented Jun 2, 2022

assign alca,reconstruction

@cmsbuild
Contributor

cmsbuild commented Jun 2, 2022

New categories assigned: dqm,alca

@jfernan2,@ahmad3213,@yuanchao,@micsucmed,@rvenditti,@emanueleusai,@francescobrivio,@malbouis,@tvami,@pmandrik you have been requested to review this Pull request/Issue and eventually sign? Thanks

@cmsbuild
Contributor

cmsbuild commented Jun 2, 2022

New categories assigned: reconstruction

@jpata,@slava77,@clacaputo you have been requested to review this Pull request/Issue and eventually sign? Thanks

@tvami changed the title from "Profiling a T0 prompt AlCa and DQM workflows" to "Profiling at T0 AlCa and DQM workflows" on Jun 2, 2022
@makortel
Contributor

makortel commented Jun 2, 2022

How many events are processed for the plots in http://cms-reco-profiling.web.cern.ch/cms-reco-profiling/results/summary_plot_html/CMSSW_12_4_step3_136.889.html ? (I'm confused by the x axis.)

@jpata
Contributor

jpata commented Jun 2, 2022

It's 5k events: https://github.com/cms-sw/cms-bot/blob/master/reco_profiling/profileRunner.py#L95.

The plot has event IDs on the x axis; we need to change that to plain numbers (cc @xoqhdgh1002).
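One possible reading of the x-axis change is to replace each event ID by its position in processing order. A minimal sketch, assuming the profile is available as `(event_id, value)` pairs; the function name and the sample values are hypothetical and not taken from the cms-bot plotting code:

```python
def to_sequential_axis(samples):
    """Map (event_id, value) samples to (n, value) with n = 1, 2, 3, ...

    The samples stay in processing order; only the x coordinate changes.
    """
    return [(n, value) for n, (_event_id, value) in enumerate(samples, start=1)]


# Hypothetical (event ID, RSS in MB) samples, in the order they were processed.
samples = [(48211377, 2100.0), (48211902, 2104.5), (48213050, 2109.1)]
print(to_sequential_axis(samples))
```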

@makortel
Contributor

makortel commented Jun 2, 2022

> It's 5k events:

Thanks, good. That should be sufficient for this particular leak (or even a smaller one) to be visible (this one would have reached ~1.2 GB after 5k events).
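The figure quoted above corresponds to an average growth of roughly 240 kB per event (1.2 GB spread over 5000 events). A short sketch of that arithmetic, plus a least-squares slope that could be applied to a memory-versus-event-count profile. The helper names and the synthetic RSS series are assumptions for illustration, not output of the profiling job:

```python
def leak_per_event_kb(total_leak_gb, n_events):
    """Average memory growth per event in kB (taking 1 GB = 1e6 kB)."""
    return total_leak_gb * 1e6 / n_events


def slope(xs, ys):
    """Ordinary least-squares slope of ys versus xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# The number from this thread: ~1.2 GB after 5k events.
print(leak_per_event_kb(1.2, 5000))  # roughly 240 kB per event

# Synthetic RSS profile (MB) growing by 0.24 MB per event, for illustration only.
events = [0, 1000, 2000, 3000, 4000, 5000]
rss_mb = [2000 + 0.24 * n for n in events]
print(slope(events, rss_mb))  # MB per event
```

A steady positive slope over the whole job is the signature to look for; a profile that rises and then flattens points to caches filling up instead of a leak.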

@jpata
Contributor

jpata commented Jun 16, 2022

@tvami, should we take any action here? Is there a new workflow we should switch to that would be more representative?

@tvami
Contributor Author

tvami commented Jun 16, 2022

Hi @jpata, we could use a run from this Monday. It will likely have limited stats, but if the tests only go up to 5000 events, that can probably be reached.

@jpata
Contributor

jpata commented Jun 16, 2022

We test about 5k now, so it wouldn't be a big change. Is there a workflow we can try, so that you can see whether the results are useful for ALCA?

@tvami
Contributor Author

tvami commented Jun 16, 2022

@tocheng is going to create a new wf that you can use. He promised to look at this tomorrow.

@tvami
Contributor Author

tvami commented Jun 21, 2022

> He promised to look at this tomorrow.

@tocheng do you have any updates?

@tocheng
Contributor

tocheng commented Jun 23, 2022

@tvami
Contributor Author

tvami commented Jun 23, 2022

Hi @tocheng
these are the ALCARECOs connected to the SingleMuon PD in the AlCaRECO matrix:

SingleMuon: TkAlMuonIsolated, HcalCalIterativePhiSym, MuAlCalIsolatedMu, HcalCalHO, HcalCalHBHEMuonProducerFilter, SiPixelCalSingleMuonLoose, SiPixelCalSingleMuonTight

I think you missed some of them, please add those! Thanks!
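A check like the one requested here can be automated by comparing a workflow's configured ALCARECOs against the SingleMuon list above. In the sketch below, `configured` is a made-up example and does not reflect the content of any actual workflow or PR; `missing_alcarecos` is a hypothetical helper.

```python
# The SingleMuon ALCARECOs as listed in this thread.
EXPECTED_SINGLE_MUON = [
    "TkAlMuonIsolated", "HcalCalIterativePhiSym", "MuAlCalIsolatedMu",
    "HcalCalHO", "HcalCalHBHEMuonProducerFilter",
    "SiPixelCalSingleMuonLoose", "SiPixelCalSingleMuonTight",
]


def missing_alcarecos(expected, configured):
    """Return the expected ALCARECOs absent from `configured`, in matrix order."""
    configured_set = set(configured)
    return [name for name in expected if name not in configured_set]


# Hypothetical workflow configuration, for illustration only.
configured = ["TkAlMuonIsolated", "MuAlCalIsolatedMu", "HcalCalHO"]
print(missing_alcarecos(EXPECTED_SINGLE_MUON, configured))
```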

@tvami
Contributor Author

tvami commented Jun 23, 2022

And maybe we could add another one, a purely technical workflow that adds all the ALCARECOs to the MinBias PD. This would of course be physically incorrect, but it would test everything in one workflow.
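For such a technical workflow, the combined list would be the union of every PD's ALCARECOs. A minimal sketch, assuming the `ALCA:` prefix with `+`-joined names as used in cmsDriver-style step strings (treat the exact string format as an assumption). The small matrix below is a placeholder, and the overlap between the two PDs is invented to show de-duplication:

```python
def all_alcarecos(matrix):
    """Union of every PD's ALCARECOs, de-duplicated, first-seen order kept."""
    seen = {}
    for names in matrix.values():
        for name in names:
            seen.setdefault(name, None)
    return list(seen)


def alca_step(names):
    """Join ALCARECO names into an 'ALCA:A+B+C' style step string."""
    return "ALCA:" + "+".join(names)


# Placeholder matrix for illustration, NOT the real AlCaRECO matrix.
matrix = {
    "SingleMuon": ["TkAlMuonIsolated", "HcalCalHO"],
    "MinBias": ["PlaceholderMinBiasAlCa", "HcalCalHO"],
}
print(alca_step(all_alcarecos(matrix)))
```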

@tvami
Contributor Author

tvami commented Jul 8, 2022

@tocheng please submit the PR; at this point we have good Run 3, 13.6 TeV input data.

@tvami
Contributor Author

tvami commented Jul 11, 2022

@tocheng ?

@francescobrivio
Contributor

Being addressed in #38681

@tvami
Contributor Author

tvami commented Jul 21, 2022

+alca

@tvami
Contributor Author

tvami commented Jul 21, 2022

@jpata can you please take over from here? Thanks!

@jpata
Contributor

jpata commented Aug 2, 2022

Thanks! Which of the two new workflows should we use instead of 136.889? From the reco point of view they are all equivalent, so the question is which one has the most representative ALCA configuration.

Note that on the reco side, we basically submit and analyze this 8-threaded profiling job "by hand" for each prerelease - so we don't have the personpower to study a large number of workflows per release at this time.

@tvami
Contributor Author

tvami commented Aug 2, 2022

I think you can go ahead with 1001.3, thanks!

@tvami
Contributor Author

tvami commented Aug 9, 2022

hi @jpata do you have any update on this? thanks!

@tvami
Contributor Author

tvami commented Oct 3, 2022

hi @cms-sw/reconstruction-l2 did this happen in the end?

@tvami
Contributor Author

tvami commented Dec 15, 2022

@clacaputo @mandrenguyen hi guys, do you know whether 1001.3 is being profiled after all?

6 participants