Fast Calorimeter Learning (FACILE) algorithm for HCAL reconstruction #32272
Conversation
…ted on GPU with TritonClient, with example config for running in HLT
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-32272/20056
|
A new Pull Request was created by @jeffkrupa (Jeff Krupa) for master. It involves the following packages: Configuration/ProcessModifiers @perrotta, @silviodonato, @cmsbuild, @franzoni, @slava77, @jpata, @qliphy, @fabiocos, @davidlange6 can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
please test |
The tests are being triggered in jenkins.
|
+1 |
Comparison job queued. |
@igv4321 please, take a look. |
Comparison is ready Comparison Summary:
|
From what I can surmise of the discussion so far, the use case of FACILE, currently, is:
The integration of FACILE in CMSSW is not objected to by the HCAL DPG. Currently, the only use case of FACILE is the further development of FACILE itself, and as the first use case for Sonic + Triton. And, currently, Sonic + Triton is only useful for testing FACILE. So the full request is
with the stated goal of furthering the development of Sonic and FACILE. Is the summary correct? |
Do we need to get to this point? Since 2016 I have had the responsibility to maintain the HCAL local reconstruction (indeed I was HCAL DPG only in 2018-2019), so I think I have the scientific responsibility to review at least what HCAL will use for the initial part of Run 3. In 2021 we will come back after a very long shutdown, and we should not repeat the errors of 2015! I'm trying to help you guys here. I do not want to enter the details of the algorithm now; I think it has conceptual problems, and the presentation at the CMS week showed them. I interpret your last message as meaning that you are not satisfied with merging code for study (as Igor and Markus mentioned). Sneaking a change into production is not an option. |
Objectively false.
These elements are already in CMSSW. As discussed at the core software meeting today, some further automation will be coming soon.
I'm not sure I understand this point. To clarify, the availability of a process modifier is intended to facilitate further testing, not immediately for production. |
As the person who actually led and coordinated the HCAL Phase 1 software development, I'm well aware of the importance of having things ready in advance. This PR in no way interferes with other Run 3 preparations.
I'm not satisfied with specious arguments and special pleading targeted at this PR because some people happen not to like it for subjective reasons. There have been probably more than a dozen PRs this year alone introducing new, often ML-based reconstruction alternatives with associated modifiers. I'm honestly taken aback by the amount of spurious objections being thrown at this PR. Throughout my tenure as a CMSSW L2, I've actively encouraged iterative development practices. If CMS no longer wants to support iterative development in CMSSW, perhaps many of us should reconsider how actively we want to contribute to the experiment software. |
OK, then I misunderstood the results I have seen
They are in CMSSW, but the question as to "what are they currently useful for" remains.
While I don't disagree, from the ongoing discussion it looks like not including the modifiers and the matrix workflow would simplify the integration? |
We have preliminary numbers and are working (right now) on making sure they're valid. (But we definitely see an improvement.) We know that FACILE inference, even on CPU, is significantly faster than MAHI (~10ms vs. ~60ms).
Let me lay out in more detail what I see as the conflict in the discussion here. Stated and at least semi-official CMSSW policies include:
We have explicitly followed 1. in this PR, and work is ongoing to make sure we can follow 2. Some of the suggestions in this PR discussion actively violate one or both policies for reasons that are definitely not being applied uniformly across all CMSSW PRs. |
@@ -0,0 +1,3 @@
+import FWCore.ParameterSet.Config as cms
+
+enableSonicTriton = cms.Modifier()
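For context, a process modifier like the one added above takes effect when the cms.Process is constructed. A minimal usage sketch follows; the import path is an assumption based on the Configuration/ProcessModifiers package named earlier, not code quoted from this PR:

```python
# Sketch only: how a process modifier is typically activated in a CMSSW config.
# The cff import path below is assumed (the PR adds the modifier under
# Configuration/ProcessModifiers); passing "--procModifier enableSonicTriton"
# to cmsDriver.py has the same effect.
import FWCore.ParameterSet.Config as cms
from Configuration.ProcessModifiers.enableSonicTriton_cff import enableSonicTriton

process = cms.Process("RECO", enableSonicTriton)
```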
Is this meant to enable only the FACILE variant, or really everything that can potentially use SonicTriton? I'm trying to understand the expected design of the modifier mix. IIUC, there will/can be many inference variants/models using SonicTriton, and many of them will be for tests only. There should be a way to distinguish them.
I guess one option is to assume that this specific modifier will flip from one "default" variant of the algorithms to another equivalent variant that is remote-offloaded via SONIC+Triton. This option would apply to #32048: in a mix with the mlpf modifier enabled, adding enableSonicTriton will switch how an equivalent algorithm is executed (sort of a switch-producer).
That is clearly not the case in this PR: a completely different algorithm is substituted. It seems to me that we need a more targeted, named modifier, something like run3_HB_FACILE or similarly compact.
Indeed, enableSonicTriton is intended to enable all available producers that use Sonic+Triton. Introducing a separate modifier specific to this PR seems reasonable to me. (I don't think we necessarily need to have non-offloaded versions of every ML producer, but again, I think this can be left to developers' discretion right now.)
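To make the "switch one variant for another" semantics discussed above concrete: a modifier typically rewires an existing module with toModify or toReplaceWith. A hedged sketch, in which the module label and both producer types are illustrative rather than taken from this PR:

```python
# Sketch only: how a process modifier can substitute one producer for another.
# The label ("hbheprereco") and the producer names here are illustrative;
# real producers carry many more parameters.
import FWCore.ParameterSet.Config as cms
from Configuration.ProcessModifiers.enableSonicTriton_cff import enableSonicTriton

hbheprereco = cms.EDProducer("HBHEPhase1Reconstructor")  # default algorithm

# When enableSonicTriton is active, run the SONIC-based producer instead,
# under the same module label, so downstream consumers are unaffected:
enableSonicTriton.toReplaceWith(
    hbheprereco,
    cms.EDProducer("FacileHcalReconstructor"),
)
```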
Kevin, we have been porting MAHI to GPU; we get identical results, and the HCAL speed is now negligible in the Run 3 HLT. We are good for HLT, so using ML is not helping to save time. So what is the real problem with the MAHI OOT algorithm that you are trying to overcome? It is not clear to me. I think MAHI is simple and gives you a good benchmark to develop your ML and test your Triton setup.
|
The question of what constitutes duplicating effort is much larger than this particular PR. However, since you raised the topic: this PR achieves a significant speedup wrt MAHI (even on CPU) with O(100) lines of C++ code. (If you want to count the training code, which currently isn't included in CMSSW for most ML algorithms, this number grows somewhat, but stays at a similar order of magnitude.) The resulting algorithm can be ported to essentially any coprocessor with little more than the press of a button. It does not require anyone to write or maintain any coprocessor-specific code whatsoever. In the long run, I (and many others) expect that this approach will result in significantly less duplication of effort.

Also, FACILE is explicitly trained not just to be aware of PU, but to remove it, which we suspect results in the improvement in the MET resolution that we have observed and presented in the HCAL DPG. Conditions are also included in the training. Parameterizing the DNN in terms of the detector conditions, to reduce the need for retraining, is an interesting future R&D project that would be facilitated by having a workflow in CMSSW to run the inference and get high-level physics results. |
Just being curious, I think it would be interesting to compare the performance of the inference server also to within-CMSSW inference with TensorFlow/ONNX to understand the magnitude of the overhead. |
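One way to frame that comparison: the server route computes the same inference but adds request/response (de)serialization and a network round trip on top of it. A toy, self-contained sketch of the two paths (plain numpy, no Triton or TensorFlow involved; the "model" is a stand-in dense layer):

```python
# Toy illustration (no Triton/TF involved): the remote path returns the same
# result as in-process inference but pays serialization + transport costs,
# which is the overhead the comparison above would measure.
import pickle
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((47, 1))       # stand-in "model": one dense layer

def model(x):
    return x @ W                       # stand-in for TF/ONNX inference

def infer_local(x):
    return model(x)                    # in-process inference

def infer_remote(x):
    wire_in = pickle.dumps(x)          # client-side request serialization
    y = model(pickle.loads(wire_in))   # "server-side" inference
    wire_out = pickle.dumps(y)         # server-side response serialization
    return pickle.loads(wire_out)      # client-side deserialization

x = rng.standard_normal((1000, 47))
assert np.allclose(infer_local(x), infer_remote(x))  # same answer either way
```

Timing both paths with timeit over many calls would expose the per-call overhead; against a real server, network latency would dominate the pickling cost simulated here.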
RecHits at high energy should get the same estimated energy from MAHI or any other method. Subtracting in-time PU from one detector is bad.
Technology-wise I'm open; that's why we are experimenting. But you probably need to discuss this somewhere else. |
Hi Kevin, we were not expecting any PR before the issues brought up in the DPG meetings had been addressed. And it was even more surprising to hear from a third party that this was sold as an HCAL request in the ORP meeting.
We are happy to have this in as a technology demonstrator for SonicTriton and to ease future R&D. But note that currently there is no clear path to production (and maintenance by the HCAL DPG), as our default algorithm is fast enough to be a negligible portion of the whole reco step (28 ms on CPU for Phase 2) and respects all relevant input conditions. If the reco conveners prefer to have modifiers added for every experimental algorithm, that is fine for us. On the physics content:
|
From the reco point of view:
Next, we expect 1-2 to be addressed by the authors. Thanks! |
@intrepid42 just to make sure we're all on the same page:
|
Ok, thank you for clarifying this misunderstanding! |
Here are some personal considerations about R&D projects in CMSSW.
About this specific PR, given that HCAL DPG is ok with merging this PR to facilitate the R&D, I'm ok to have this code in CMSSW. |
I agree that putting it in test is one way to isolate the dependencies (for now). Coupled with the expressed preference for a customization function (below), this also implies removing the cfi file and putting the module configuration inside the customization function. This might be okay for now.
If this is the preference of the release management team, we can look into this alternative. I think it would be useful to have a clearer delineation regarding when customization functions are preferable vs. modifiers. My working understanding, since modifiers were introduced, was that they were always preferred whenever feasible.
We currently allow non-functional workflows in the upgrade matrix to facilitate development. They are simply not forwarded to the regular matrix, so not automatically tested in IBs. This has been an accepted practice for upgrade workflows for many years now. Maybe it's worth adding a separate upgrade-workflow test command to the bot (for which it would explicitly use |
Ok
I think the modifiers are more useful when you have something "official" that you want to run in many conditions/eras and then you need to play with the other modifiers. For R&D and testing, I think it is better to use a customization function without touching the already existing sequences.
Ok, no problem from my side with using the upgrade matrix. |
Does it mean that any other non-working stuff can go in, too ? |
On 12/11/20 12:28 AM, Silvio Donato wrote:
Do you see any bad side effects @cms-sw/reconstruction-l2 @cms-sw/core-l2 ?
No particularly strong objections to customization functions. I think that using modifiers for a test-like case signals a stronger support commitment and more clearly indicates that this is on its way to becoming a production setup soon.
|
I agree. |
Currently, there are about 10k workflows in the upgrade matrix. I'm not too worried about it, as long as the change does not affect the regular matrix and the other workflows.
ok, so let's keep the customize function for the time being. |
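For the record, the customization-function route agreed on here typically has the following shape; the function name, module label, and lack of parameters are assumptions for illustration, not code from this PR:

```python
# Sketch only: a customization function as applied via
#   cmsDriver.py ... --customise <module>.customiseHcalFacile
# Function name and module label are illustrative; a real producer would
# carry the full set of client and input parameters.
import FWCore.ParameterSet.Config as cms

def customiseHcalFacile(process):
    # Replace the default HBHE producer with the SONIC/Triton-based one,
    # leaving the already existing sequences otherwise untouched.
    process.hbheprereco = cms.EDProducer("FacileHcalReconstructor")
    return process
```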
kind ping, given that the TritonService PR #32576 was merged. |
kind ping - let us know how you want to proceed with this (it has had conflicts for some time) |
We have decided to close this pull request pending additional timing studies. Creating the |
PR description:

- The algorithm is implemented in RecoLocalCalo/HcalRecProducers/src/FacileHcalReconstructor.cc. It loads the HBHEChannelInfo collection and employs a client with Services for Optimized Network Inference in Coprocessors (SONIC) to communicate with a server in as-a-service operation using Triton Inference Server.
- RecoLocalCalo/HcalRecProducers/src/HBHEPhase1Reconstructor.cc is replaced, and the HBHERecHit collection is made with the new producer.
- The model, facile_all_v5, is in tensorflow SavedModel format (https://github.com/fastmachinelearning/sonic-models). We would like to request a space cms-data:RecoLocalCalo/HcalRecProducers to store the model.
- An example HLT config (RecoLocalCalo/HcalRecProducers/test/sonic_hlt_test.py) and offline reco modifiers (RecoLocalCalo/Configuration/python/hcalGlobalReco_cff.py) are included for testing.

PR validation:

- runTheMatrix.py -l 11634.0 --command="--procModifier enableSonicTriton", and
- cmsRun sonic_hlt_test.py.

Note: to start a local instance of the Triton server for running the commands above, do:
@violatingcp