Commit f0101a2

New blog post by Galin Bistrev (#333)
* New blog post by Galin Bistrev
* Fixed spelling
* Fixed some things
* Some more fixes
* Fixed punctuation errors in blog post and added presentation
* Spelling errors
1 parent 9db5c45 commit f0101a2

3 files changed

+148
-0
lines changed


.github/actions/spelling/allow/terms.txt

Lines changed: 4 additions & 0 deletions
@@ -5,6 +5,7 @@ Backpropagation
5 5
CINT
6 6
CMSSW
7 7
Caa
8+
Codegen
8 9
Cppyy
9 10
Debian
10 11
EPC
@@ -30,6 +31,7 @@ Ohridski
30 31
OMP
31 32
OpenMP
32 33
PTX
34+
RAII
33 35
Resugaring
34 36
SBO
35 37
Slib
@@ -44,10 +46,12 @@ biodynamo
44 46
bioinformatics
45 47
blogs
46 48
cms
49+
codegen
47 50
consteval
48 51
cppyy
49 52
cytokine
50 53
cytokines
54+
doxygen
51 55
gitlab
52 56
gpu
53 57
gridlay
Lines changed: 144 additions & 0 deletions
@@ -0,0 +1,144 @@
1+
title: "Results from CERN Summer School 2025: Supporting Automatic
2+
Differentiation in CMS Combine profile likelihood scans"
3+
layout: post
4+
excerpt: "A CERN Summer Student 2025 project aiming at the support of
5+
automatic differentiation (AD) for likelihood scans in the CMS Combine
6+
tool to accelerate statistical inference by leveraging RooFit's AD
7+
support and LLVM-based gradient generation."
8+
sitemap: false
9+
author: Galin Bistrev
10+
permalink: blogs/2025_galin_bistrev_results_blog/
11+
banner_image: /images/blog/banner-cern.jpg
12+
date: 2025-09-25
13+
tags: cern cms root combine c++ RooFit automatic-differentiation
14+
---
15+
16+
### **Introduction**
17+
Greetings! I’m Galin Bistrev, a fourth-year student specializing in
18+
Nuclear and Particle Physics at the University of Sofia "St. Kliment Ohridski."
19+
As part of the CERN Summer Student Programme 2025, I worked on a
20+
project that aimed to add support for Automatic Differentiation
21+
(AD) to CMS Combine profile likelihood scans.
22+
23+
Mentors: Jonas Rembser, Vassil Vassilev, David Lange
24+
25+
### **Description of the Project**
26+
27+
This project aims to enhance support for Automatic Differentiation (AD)
28+
in likelihood scans within the CMS Combine framework, the primary
29+
statistical analysis tool of the CMS experiment at CERN. Combine is
30+
built on top of RooFit, which has recently introduced AD to improve
31+
minimization techniques. By providing computationally efficient
32+
gradients through AD, RooFit achieves substantial performance
33+
improvements. In RooFit, internal likelihood representations are
34+
converted into standalone C++ code, from which Clad generates the
35+
gradient routines for AD. This strategy not only speeds up the
36+
fitting process but also increases the portability and shareability
37+
of likelihood models, making them usable even by those without
38+
detailed knowledge of RooFit or Combine internals.
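
As a rough illustration of what AD provides, the following Python sketch computes a negative log-likelihood and its exact derivative in one pass using forward-mode dual numbers. This is not how Clad works internally (Clad generates derivative code by transforming C++ source); the `Dual` class, the one-bin counting model, and all numbers are assumptions made up for this example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value together with its derivative with respect to one input."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.val - other.val, self.der - other.der)

    def __rsub__(self, other):
        return _lift(other) - self

    def __mul__(self, other):
        other = _lift(other)
        # Product rule.
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def _lift(x):
    """Promote a plain number to a Dual with zero derivative."""
    return x if isinstance(x, Dual) else Dual(float(x))


def log(x):
    """Natural logarithm of a Dual (chain rule)."""
    return Dual(math.log(x.val), x.der / x.val)


def poisson_nll(mu, n_obs=12, s=5.0, b=8.0):
    """NLL of a hypothetical one-bin counting model, up to a constant."""
    expected = mu * s + b
    return expected - n_obs * log(expected)


def nll_and_gradient(mu):
    """Evaluate the NLL and d(NLL)/d(mu) together."""
    out = poisson_nll(Dual(mu, 1.0))  # seed: d(mu)/d(mu) = 1
    return out.val, out.der
```

The derivative comes out of the same evaluation as the value, with no finite-difference step size to tune. In this toy model the gradient vanishes at `mu = 0.8`, where the expected count equals the observed one.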
39+
40+
### **Brief overview of the CMS Combine engine**
41+
Combine is a statistical analysis framework that compares models of
42+
expected observations with real data. It is widely used for tasks such
43+
as searching for new particles or processes, setting limits on
44+
potential new physics, and measuring physical quantities like cross-sections.
45+
Although developed with High Energy Physics (HEP)
46+
applications in mind, Combine contains no intrinsic physics assumptions,
47+
making it fully general and independent of any specific analysis.
48+
This flexibility allows it to be applied across a broad range of
49+
statistical problems.
50+
51+
Roughly, Combine performs three main functions:
52+
53+
- Builds a statistical model of expected observations.
54+
- Runs statistical tests comparing the model with observed data.
55+
- Provides tools for validating, inspecting, and understanding both the
56+
model and the results of the statistical tests.
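
To make the idea of a profile likelihood scan concrete, here is a minimal pure-Python sketch. It does not use Combine or RooFit: the counting model, the Gaussian constraint, the helper names, and the golden-section minimizer are all assumptions chosen for illustration. For each fixed value of the parameter of interest `mu`, the nuisance parameter `b` is re-minimized, and the scan reports twice the NLL difference to the best point.

```python
import math


def nll(mu, b, n_obs=12, s=5.0, b0=8.0, sigma_b=1.0):
    """Poisson counting term plus a Gaussian constraint on the background b."""
    expected = mu * s + b
    return expected - n_obs * math.log(expected) + 0.5 * ((b - b0) / sigma_b) ** 2


def minimize_1d(f, lo, hi, tol=1e-10):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):
            hi, d = d, c
            c = hi - inv_phi * (hi - lo)
        else:
            lo, c = c, d
            d = lo + inv_phi * (hi - lo)
    x = 0.5 * (lo + hi)
    return x, f(x)


def profile_scan(mu_values, b_range=(1.0, 20.0)):
    """For each fixed mu, minimize the NLL over the nuisance parameter b."""
    profiled = [
        minimize_1d(lambda b, mu=mu: nll(mu, b), *b_range)[1]
        for mu in mu_values
    ]
    best = min(profiled)
    # Report 2 * delta(NLL) relative to the best scan point.
    return [2.0 * (value - best) for value in profiled]


mu_grid = [0.1 * i for i in range(21)]  # 0.0, 0.1, ..., 2.0
scan = profile_scan(mu_grid)
```

Every scan point requires its own minimization, which is why cheap and accurate gradients matter for this kind of workload.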
57+
58+
### **Project goals**
59+
60+
In order for AD to be supported in Combine likelihood scans, a number of goals needed to be achieved:
61+
62+
- Refactor some of Combine's logic into RooFit, so that Combine can
63+
reuse the AD-enabled minimization algorithm already present there.
64+
- Integrate gradient computation into likelihood scans, ensuring that
65+
derivatives are correctly propagated for efficient and accurate minimization.
66+
- Validate correctness and performance, confirming that the AD-based
67+
scans produce results consistent with traditional methods while
68+
offering improved performance.
69+
70+
## **Overview of Completed Work**
71+
Over the course of the project, several major tasks were completed to achieve the stated objectives:
72+
73+
- Imported the `RooMultiPdf` class into RooFit from Combine, enabling
74+
switching between multiple PDFs, applying statistical penalties,
75+
and supporting code generation for AD.
76+
77+
- Made the new class compatible with `codegen` in RooFit by adding
78+
a new function in `MathFunc.h` and extending `CodegenImpl.cxx` to
79+
generate code for models that use it.
80+
81+
- Imported into RooFit's `RooMinimizer.cxx` three pieces of code from
82+
Combine that handle the minimization procedures within the framework.
83+
The first is a class imported by Jonas Rembser
84+
called `FreezeDisconnectedParametersRAII`, which automatically
85+
freezes and unfreezes parameters disconnected from the likelihood graph.
86+
The second is the function `generateOrthogonalCombinations`, which
87+
generates a list of index combinations by initializing a base
88+
configuration with all indices set to zero and then varying one category at a time.
89+
The third and final piece of code is a function called `reorderCombinations`,
90+
which takes the set of indices produced by `generateOrthogonalCombinations`
91+
and adjusts each combination by adding the corresponding base values
92+
modulo the maximum allowed index, effectively shifting the combinations
93+
relative to the current best indices.
94+
95+
- Using the above-stated functions, the discrete profiling algorithm,
96+
which is the main minimization algorithm in Combine, was imported
97+
into `RooMinimizer.cxx`.
98+
- A [tutorial](https://root.cern/doc/master/rf619__discrete__profiling_8py.html)
99+
was created along with a [benchmark](https://github.com/vgvassilev/clad/issues/1521),
100+
made by Jonas Rembser, demonstrating discrete profiling with RooMultiPdf objects
101+
and evaluating the performance of AD in the likelihood scans.
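
The two combination helpers described above can be sketched in a few lines. This is a Python paraphrase of the behaviour as described in this post, not the C++ code in `RooMinimizer.cxx`; the function signatures and argument names are assumptions.

```python
def generate_orthogonal_combinations(max_indices):
    """Start from the all-zero configuration and vary one category at a time.

    max_indices[i] is the number of states of discrete category i.
    """
    base = [0] * len(max_indices)
    combinations = [list(base)]
    for i, n_states in enumerate(max_indices):
        for state in range(1, n_states):
            combo = list(base)
            combo[i] = state  # only category i departs from the base
            combinations.append(combo)
    return combinations


def reorder_combinations(combinations, max_indices, best_indices):
    """Shift every combination by the current best indices, wrapping around."""
    return [
        [
            (value + best) % n_states
            for value, best, n_states in zip(combo, best_indices, max_indices)
        ]
        for combo in combinations
    ]
```

For two categories with 3 and 2 states, `generate_orthogonal_combinations([3, 2])` returns `[[0, 0], [1, 0], [2, 0], [0, 1]]`. Shifting by the best indices `[2, 1]` gives `[[2, 1], [0, 1], [1, 1], [2, 0]]`: the first entry is the current best configuration and every other entry differs from it in exactly one category. The number of combinations grows with the sum of the category sizes rather than their product.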
102+
103+
## **Results**
104+
With those objectives accomplished, RooFit now provides AD support for
105+
discrete profiling. However, the developed benchmark indicates that AD
106+
does not currently improve efficiency, as the gradient code generated by
107+
Clad introduces overhead. Further optimization in Clad is needed to achieve
108+
the potential performance gains for RooFit likelihood scans. More information
109+
regarding the issue can be found at [#1521](https://github.com/vgvassilev/clad/issues/1521).
110+
111+
## **Conclusions**
112+
Thanks to this project, RooFit now enables AD support for discrete profiling in Combine,
113+
which, once the current overhead in Clad is addressed, would allow for
114+
significantly faster and more efficient likelihood scans while maintaining
115+
accurate optimization of both discrete and continuous parameters.
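
The interplay of discrete and continuous parameters can be illustrated with a toy example. The sketch below is not the RooFit implementation: the data, the two candidate background shapes, and the penalty of 0.5 per free parameter are assumptions made for this example. For a fixed `mu`, each candidate shape is fitted by minimizing over its continuous parameter `theta`, a penalty is added, and the candidate with the lowest penalized NLL is selected.

```python
import math

N_OBS = [22, 14, 10]        # toy observed counts in three bins
SIGNAL = [1.0, 3.0, 1.0]    # toy signal template, scaled by mu
BACKGROUND_SHAPES = [       # candidate background shapes, scaled by theta
    [1.0, 1.0, 1.0],        # index 0: flat
    [2.0, 1.0, 0.5],        # index 1: falling
]
N_PARAMS = [1, 1]           # free parameters per candidate (only theta here)
PENALTY_PER_PARAM = 0.5     # assumed penalty added to the NLL per parameter


def nll(mu, theta, index):
    """Binned Poisson NLL (up to a constant) for one candidate shape."""
    total = 0.0
    for n, s, shape in zip(N_OBS, SIGNAL, BACKGROUND_SHAPES[index]):
        expected = mu * s + theta * shape
        total += expected - n * math.log(expected)
    return total


def minimize_1d(f, lo, hi, tol=1e-9):
    """Golden-section search for the minimum of a unimodal function."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):
            hi, d = d, c
            c = hi - inv_phi * (hi - lo)
        else:
            lo, c = c, d
            d = lo + inv_phi * (hi - lo)
    x = 0.5 * (lo + hi)
    return x, f(x)


def discrete_profile(mu, theta_range=(0.5, 40.0)):
    """Return (best_index, penalized NLL) after profiling theta per candidate."""
    results = []
    for index in range(len(BACKGROUND_SHAPES)):
        _, value = minimize_1d(lambda t: nll(mu, t, index), *theta_range)
        results.append((value + PENALTY_PER_PARAM * N_PARAMS[index], index))
    best_value, best_index = min(results)
    return best_index, best_value


# Envelope over the discrete choices at a few values of mu.
envelope = [discrete_profile(mu) for mu in (0.0, 0.5, 1.0, 1.5, 2.0)]
```

Both toy candidates have a single free parameter, so the penalty does not change the ranking here; it is kept only to show where it enters the comparison. With many discrete categories, looping over every configuration becomes expensive, which is where the orthogonal combinations come in.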
116+
117+
## **Future work**
118+
- Further benchmarking is required to quantify the potential performance
119+
gains from automatic differentiation.
120+
- Additional optimization of Clad is needed to eliminate unnecessary
121+
overhead in gradient generation.
122+
- The discrete profiling logic implemented in RooMinimizer should be
123+
tested across different models to evaluate the minimizer’s behavior and
124+
robustness.
125+
- Extend the Doxygen documentation of RooMinimizer to describe the treatment of discrete
126+
parameters.
127+
- Test whether the implementation of discrete profiling also works inside CMS Combine,
128+
replacing the existing implementation in `CascadeMinimizer.cxx`.
129+
130+
## **Acknowledgements**
131+
I would like to express my sincere gratitude to the CERN Summer School
132+
for the opportunity to participate in such an inspiring project.
133+
I extend special thanks to Jonas Rembser, Vassil Vassilev, and David Lange for
134+
their invaluable guidance and for providing continuous learning opportunities throughout this journey.
135+
I am also grateful to the ROOT team for welcoming me and supporting me throughout my stay at CERN.
136+
137+
## **Related Links**
138+
- [CMS Combine documentation](https://cms-analysis.github.io/HiggsAnalysis-CombinedLimit/latest/)
139+
- [ROOT official repository](https://github.com/root-project/root)
140+
- [My GitHub profile](https://github.com/GalinBistrev2)
141+
- [Presentation](/assets/presentations/CaaS_Weekly_25_09_2025_Galin_Bistrev_AD_in_CMS_Combine.pdf)
142+
143+
144+
786 KB
Binary file not shown.
