`pyhf`: pure-Python implementation of HistFactory with tensors and automatic differentiation

.huge.blue[Matthew Feickert]
.huge[(University of Wisconsin-Madison)]

matthew.feickert@cern.ch

International Conference on High Energy Physics (ICHEP) 2022

July 8th, 2022

`pyhf` team

Lukas Heinrich

Technical University of Munich ] .kol-1-3.center[ .circle.width-80[]

Matthew Feickert

University of Wisconsin-Madison
(Illinois for work presented today) ] .kol-1-3.center[ .circle.width-75[]

Giordon Stark

University of California Santa Cruz SCIPP ] ]

.center.large[plus more than 20 contributors]

Goals of physics analysis at the LHC

Make precision measurements ] .kol-1-3.center[ .width-110[[![SUSY-2018-31_limit](figures/SUSY-2018-31_limit.png)](https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PAPERS/SUSY-2018-31/)]

Provide constraints on models through setting best limits ] ]

All require .bold[building statistical models] and .bold[fitting models] to data to perform statistical inference
Model complexity can be huge for complicated searches
Problem: Time to fit can be .bold[many hours]
.blue[Goal:] Empower analysts with fast fits and expressive models

HistFactory Model

A flexible probability density function (p.d.f.) template to build statistical models in high energy physics
Developed in 2011 during work that led to the Higgs discovery [CERN-OPEN-2012-016]
Widely used by ATLAS for .bold[measurements of known physics] and .bold[searches for new physics]

.kol-2-5.center[ .width-90[] .bold[Standard Model] ] .kol-3-5.center[ .width-100[]
.bold[Beyond the Standard Model] ]

HistFactory Template: at a glance

$$ f\left(\mathrm{data}\middle|\mathrm{parameters}\right) = f\left(\textcolor{#00a620}{\vec{n}}, \textcolor{#a3130f}{\vec{a}}\middle|\textcolor{#0495fc}{\vec{\eta}}, \textcolor{#9c2cfc}{\vec{\chi}}\right) = \textcolor{blue}{\prod_{c \,\in\, \textrm{channels}} \prod_{b \,\in\, \textrm{bins}_c} \textrm{Pois} \left(n_{cb} \middle| \nu_{cb}\left(\vec{\eta}, \vec{\chi}\right)\right)} \,\textcolor{red}{\prod_{\chi \,\in\, \vec{\chi}} c_{\chi} \left(a_{\chi}\middle|\chi\right)} $$

.center[$\textcolor{#00a620}{\vec{n}}$: .obsdata[events], $\textcolor{#a3130f}{\vec{a}}$: .auxdata[auxiliary data], $\textcolor{#0495fc}{\vec{\eta}}$: .freepars[unconstrained pars], $\textcolor{#9c2cfc}{\vec{\chi}}$: .conpars[constrained pars]]

$$ \nu_{cb}(\textcolor{#0495fc}{\vec{\eta}}, \textcolor{#9c2cfc}{\vec{\chi}}) = \sum_{s \,\in\, \textrm{samples}} \underbrace{\left(\prod_{\kappa \,\in\, \vec{\kappa}} \kappa_{scb}(\textcolor{#0495fc}{\vec{\eta}}, \textcolor{#9c2cfc}{\vec{\chi}})\right)}_{\textrm{multiplicative}} \Bigg(\nu_{scb}^{0}(\textcolor{#0495fc}{\vec{\eta}}, \textcolor{#9c2cfc}{\vec{\chi}}) + \underbrace{\sum_{\Delta \,\in\, \vec{\Delta}} \Delta_{scb}(\textcolor{#0495fc}{\vec{\eta}}, \textcolor{#9c2cfc}{\vec{\chi}})}_{\textrm{additive}}\Bigg) $$

.bold[Use:] Multiple disjoint channels (or regions) of binned distributions with multiple samples contributing to each with additional (possibly shared) systematics between sample estimates

.blue[Main Poisson p.d.f. for simultaneous measurement of multiple channels]
.katex[Event rates] $\nu_{cb}(\textcolor{#0495fc}{\vec{\eta}}, \textcolor{#9c2cfc}{\vec{\chi}})$ (nominal rate $\nu_{scb}^{0}$ with rate modifiers)
- encode systematic uncertainties (e.g. normalization, shape)
.red[Constraint p.d.f. (+ data) for "auxiliary measurements"]
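As a rough illustration of the template above, the following sketch evaluates the log-likelihood for a single channel with two samples in plain Python. The yields, the single multiplicative modifier `mu` on the signal, the additive background shift scaled by `theta`, and the unit-width Normal constraint are all made-up assumptions for illustration; this is not how pyhf is implemented internally.

```python
import math


def poisson_logpmf(n, rate):
    """log Pois(n | rate) for a non-negative integer count n."""
    return n * math.log(rate) - rate - math.lgamma(n + 1)


def normal_logpdf(x, mean, sigma=1.0):
    """log Normal(x | mean, sigma)."""
    z = (x - mean) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


# Hypothetical nominal yields per bin (one channel, two bins).
SIGNAL_NOMINAL = [5.0, 10.0]
BACKGROUND_NOMINAL = [50.0, 60.0]
# Hypothetical additive shift of the background for theta = +1.
BACKGROUND_SHIFT = [5.0, -3.0]


def expected_rates(mu, theta):
    """nu_b = mu * signal_b + (background_b + theta * shift_b)."""
    return [
        mu * s + (b + theta * d)
        for s, b, d in zip(SIGNAL_NOMINAL, BACKGROUND_NOMINAL, BACKGROUND_SHIFT)
    ]


def log_likelihood(mu, theta, observed, aux=0.0):
    """Main Poisson term over bins plus one Normal constraint term."""
    main = sum(
        poisson_logpmf(n, nu) for n, nu in zip(observed, expected_rates(mu, theta))
    )
    constraint = normal_logpdf(aux, theta)
    return main + constraint


observed = [57, 68]
print(expected_rates(1.0, 0.0))
print(log_likelihood(1.0, 0.0, observed))
```

Pulling `theta` away from the auxiliary measurement at 0 lowers the constraint term, so a fit only moves it when the main Poisson term gains more than the constraint loses.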

HistFactory Template: at a second glance

$$ f\left(\mathrm{data}\middle|\mathrm{parameters}\right) = f\left(\textcolor{#00a620}{\vec{n}}, \textcolor{#a3130f}{\vec{a}}\middle|\textcolor{#0495fc}{\vec{\eta}}, \textcolor{#9c2cfc}{\vec{\chi}}\right) = \prod_{c \,\in\, \textrm{channels}} \prod_{b \,\in\, \textrm{bins}_c} \textrm{Pois} \left(\textcolor{#00a620}{n_{cb}} \middle| \nu_{cb}\left(\textcolor{#0495fc}{\vec{\eta}}, \textcolor{#9c2cfc}{\vec{\chi}}\right)\right) \,\prod_{\chi \,\in\, \vec{\chi}} c_{\chi} \left(\textcolor{#a3130f}{a_{\chi}}\middle|\textcolor{#9c2cfc}{\chi}\right) $$

.center[$\textcolor{#00a620}{\vec{n}}$: .obsdata[events], $\textcolor{#a3130f}{\vec{a}}$: .auxdata[auxiliary data], $\textcolor{#0495fc}{\vec{\eta}}$: .freepars[unconstrained pars], $\textcolor{#9c2cfc}{\vec{\chi}}$: .conpars[constrained pars]]

$$ \nu_{cb}(\textcolor{#0495fc}{\vec{\eta}}, \textcolor{#9c2cfc}{\vec{\chi}}) = \sum_{s \,\in\, \textrm{samples}} \underbrace{\left(\prod_{\kappa \,\in\, \vec{\kappa}} \kappa_{scb}(\textcolor{#0495fc}{\vec{\eta}}, \textcolor{#9c2cfc}{\vec{\chi}})\right)}_{\textrm{multiplicative}} \Bigg(\nu_{scb}^{0}(\textcolor{#0495fc}{\vec{\eta}}, \textcolor{#9c2cfc}{\vec{\chi}}) + \underbrace{\sum_{\Delta \,\in\, \vec{\Delta}} \Delta_{scb}(\textcolor{#0495fc}{\vec{\eta}}, \textcolor{#9c2cfc}{\vec{\chi}})}_{\textrm{additive}}\Bigg) $$

.bold[Use:] Multiple disjoint channels (or regions) of binned distributions with multiple samples contributing to each with additional (possibly shared) systematics between sample estimates

.blue[Main Poisson p.d.f. for simultaneous measurement of multiple channels]
.katex[Event rates] $\nu_{cb}(\textcolor{#0495fc}{\vec{\eta}}, \textcolor{#9c2cfc}{\vec{\chi}})$ (nominal rate $\nu_{scb}^{0}$ with rate modifiers)
- encode systematic uncertainties (e.g. normalization, shape)
.red[Constraint p.d.f. (+ data) for "auxiliary measurements"]

HistFactory Template: grammar

$$ f\left(\mathrm{data}\middle|\mathrm{parameters}\right) = f\left(\textcolor{#00a620}{\vec{n}}, \textcolor{#a3130f}{\vec{a}}\middle|\textcolor{#0495fc}{\vec{\eta}}, \textcolor{#9c2cfc}{\vec{\chi}}\right) = \textcolor{blue}{\prod_{c \,\in\, \textrm{channels}} \prod_{b \,\in\, \textrm{bins}_c} \textrm{Pois} \left(n_{cb} \middle| \nu_{cb}\left(\vec{\eta}, \vec{\chi}\right)\right)} \,\textcolor{red}{\prod_{\chi \,\in\, \vec{\chi}} c_{\chi} \left(a_{\chi}\middle|\chi\right)} $$

Mathematical grammar for a simultaneous fit with:

.blue[multiple "channels"] (analysis regions, (stacks of) histograms) that can have multiple bins
with systematic uncertainties that modify the event rate $\nu_{cb}(\textcolor{#0495fc}{\vec{\eta}}, \textcolor{#9c2cfc}{\vec{\chi}})$
coupled to a set of .red[constraint terms]

.center.width-40[] .center[Example: .bold[Each bin] is a separate (1-bin) channel, each .bold[histogram] (color)
is a sample, and the samples share a .bold[normalization systematic] uncertainty]

HistFactory Template: implementation

$$ f\left(\mathrm{data}\middle|\mathrm{parameters}\right) = f\left(\textcolor{#00a620}{\vec{n}}, \textcolor{#a3130f}{\vec{a}}\middle|\textcolor{#0495fc}{\vec{\eta}}, \textcolor{#9c2cfc}{\vec{\chi}}\right) = \prod_{c \,\in\, \textrm{channels}} \prod_{b \,\in\, \textrm{bins}_c} \textrm{Pois} \left(\textcolor{#00a620}{n_{cb}} \middle| \nu_{cb}\left(\textcolor{#0495fc}{\vec{\eta}}, \textcolor{#9c2cfc}{\vec{\chi}}\right)\right) \,\prod_{\chi \,\in\, \vec{\chi}} c_{\chi} \left(\textcolor{#a3130f}{a_{\chi}}\middle|\textcolor{#9c2cfc}{\chi}\right) $$

.center[$\textcolor{#00a620}{\vec{n}}$: .obsdata[events], $\textcolor{#a3130f}{\vec{a}}$: .auxdata[auxiliary data], $\textcolor{#0495fc}{\vec{\eta}}$: .freepars[unconstrained pars], $\textcolor{#9c2cfc}{\vec{\chi}}$: .conpars[constrained pars]]

$$ \nu_{cb}(\textcolor{#0495fc}{\vec{\eta}}, \textcolor{#9c2cfc}{\vec{\chi}}) = \sum_{s \,\in\, \textrm{samples}} \underbrace{\left(\prod_{\kappa \,\in\, \vec{\kappa}} \kappa_{scb}(\textcolor{#0495fc}{\vec{\eta}}, \textcolor{#9c2cfc}{\vec{\chi}})\right)}_{\textrm{multiplicative}} \Bigg(\nu_{scb}^{0}(\textcolor{#0495fc}{\vec{\eta}}, \textcolor{#9c2cfc}{\vec{\chi}}) + \underbrace{\sum_{\Delta \,\in\, \vec{\Delta}} \Delta_{scb}(\textcolor{#0495fc}{\vec{\eta}}, \textcolor{#9c2cfc}{\vec{\chi}})}_{\textrm{additive}}\Bigg) $$

.center[.bold[This is a mathematical representation!] Nowhere is any software spec defined] .center[.bold[Until 2018] the only implementation of HistFactory was in ROOT]

.center.width-70[]

`pyhf`: HistFactory in pure Python

.kol-1-2.large[

First non-ROOT implementation of the HistFactory p.d.f. template
- .width-40[]
pure-Python library as second implementation of HistFactory
- $ python -m pip install pyhf
- No dependence on ROOT!

.center.width-100[] ] .kol-1-2.large[

Open source tool for all of HEP
- IRIS-HEP supported Scikit-HEP project
- Used in ATLAS SUSY, Exotics, and Top groups in 22 published analyses (inference and published models)
- Used by Belle II
  (DOI: 10.1103/PhysRevLett.127.181802)
- Used in analyses and for reinterpretation by phenomenology community, SModelS
  (DOI: 10.1016/j.cpc.2021.107909), and MadAnalysis 5 (arXiv:2206.14870)
- Ongoing IRIS-HEP supported Fellow work to provide conversion support to CMS Combine as of Summer 2022! ]

`pyhf`: HistFactory in pure Python

.kol-1-2.large[

First non-ROOT implementation of the HistFactory p.d.f. template
- .width-40[]
pure-Python library as second implementation of HistFactory
- $ python -m pip install --pre pyhf
- No dependence on ROOT!

.center.width-100[] ] .kol-1-2.large[

Open source tool for all of HEP
- IRIS-HEP supported Scikit-HEP project
- Used in ATLAS SUSY, Exotics, and Top groups in 22 published analyses (inference and published models)
- Used by Belle II
  (DOI: 10.1103/PhysRevLett.127.181802)
- Used in analyses and for reinterpretation by phenomenology community, SModelS
  (DOI: 10.1016/j.cpc.2021.107909), and MadAnalysis 5 (arXiv:2206.14870)
- Ongoing IRIS-HEP supported Fellow work to provide conversion support to CMS Combine as of Summer 2022! ]

Machine Learning Frameworks for Computation

All numerical operations implemented in .bold[tensor backends] through an API of $n$-dimensional array operations
Using deep learning frameworks as computational backends allows for .bold[exploitation of auto differentiation (autograd) and GPU acceleration]
With huge buy-in from industry, we benefit for free as these frameworks are .bold[continually improved] by professional software engineers (physicists are not)

.kol-1-2.center[ .width-80[] ] .kol-1-2[

Hardware acceleration giving .bold[order of magnitude speedup] in interpolation for systematics!
- does suffer some overhead
Noticeable impact for large and complex models
- hours to minutes for fits ] ] .kol-1-4.center[ .width-85[] .width-85[] .width-85[]

Automatic differentiation

With tensor library backends we gain access to exact (higher order) derivatives; accuracy is limited only by floating point precision

$$ \frac{\partial L}{\partial \mu}, \frac{\partial L}{\partial \theta_{i}} $$

.grid[ .kol-1-2[ .large[Exploit .bold[full gradient of the likelihood] with .bold[modern optimizers] to help speed up the fit!]

.large[Gain this through the frameworks creating computational directed acyclic graphs and then applying the chain rule (to the operations)] ] .kol-1-2[ .center.width-80[] ] ]
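To make the chain-rule idea concrete, here is a minimal sketch using forward-mode dual numbers in plain Python. Note the swap: the frameworks mentioned above build a computational graph and typically traverse it in reverse mode, while this toy `Dual` class (a hypothetical helper, not part of any library) propagates derivatives forward. The one-bin Poisson likelihood and its yields are made up for illustration.

```python
import math


class Dual:
    """Number carrying a value and its derivative with respect to one input."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def lift(x):
        """Wrap plain numbers as constants (zero derivative)."""
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.value - other.value, self.deriv - other.deriv)

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule.
        return Dual(
            self.value * other.value,
            self.deriv * other.value + self.value * other.deriv,
        )

    __rmul__ = __mul__


def log(x):
    """Natural log with the chain rule applied: d log(x) = dx / x."""
    x = Dual.lift(x)
    return Dual(math.log(x.value), x.deriv / x.value)


def poisson_nll(mu, n=57, signal=5.0, background=50.0):
    """Negative log-likelihood of one bin, dropping the constant log(n!)."""
    rate = mu * signal + background
    return rate - n * log(rate)


# Seed the derivative of the input with 1 to get dNLL/dmu.
result = poisson_nll(Dual(1.0, 1.0))
analytic = 5.0 - 57 * 5.0 / (1.0 * 5.0 + 50.0)
print(result.deriv, analytic)
```

The propagated derivative agrees with the hand-derived expression up to floating point rounding, with no finite-difference step size to tune.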

JSON spec fully describes the HistFactory model

.kol-1-4.width-100[

Human & machine readable .bold[declarative] statistical models
Industry standard
- Will be with us forever
Parsable by every language
- Highly portable
- Bidirectional translation
  with ROOT
Versionable and easily preserved
- JSON Schema describing
  HistFactory specification
- Attractive for analysis preservation
- Highly compressible ] .kol-3-4.center[ .width-105[]
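For orientation, a small workspace can be written down as a Python dictionary and round-tripped through JSON. This is a sketch that follows the top-level layout of the pyhf workspace schema as I understand it (`channels`, `observations`, `measurements`, `version`); the channel name, sample names, modifier names, and yields are all made up.

```python
import json

# Hypothetical single-channel, two-bin workspace; all names and yields are made up.
workspace_spec = {
    "channels": [
        {
            "name": "singlechannel",
            "samples": [
                {
                    "name": "signal",
                    "data": [5.0, 10.0],
                    "modifiers": [
                        {"name": "mu", "type": "normfactor", "data": None}
                    ],
                },
                {
                    "name": "background",
                    "data": [50.0, 60.0],
                    "modifiers": [
                        {
                            "name": "uncorr_bkguncrt",
                            "type": "shapesys",
                            "data": [5.0, 12.0],
                        }
                    ],
                },
            ],
        }
    ],
    "observations": [{"name": "singlechannel", "data": [57.0, 68.0]}],
    "measurements": [
        {"name": "Measurement", "config": {"poi": "mu", "parameters": []}}
    ],
    "version": "1.0.0",
}

# Serialize to text and read it back: nothing is lost in the round trip.
serialized = json.dumps(workspace_spec, indent=2, sort_keys=True)
restored = json.loads(serialized)

channel_names = [channel["name"] for channel in restored["channels"]]
observation_names = [obs["name"] for obs in restored["observations"]]
print(channel_names, observation_names)
```

Because the model is plain data, it can be diffed, versioned, and validated against a schema with standard tooling before it is handed to `pyhf.Workspace`.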

ATLAS validation and publication of models

.center.width-90[]

.center[(ATLAS, 2019)] ] .kol-1-2[ .center.width-100[[![CERN_news_story](figures/CERN_news_story.png)](https://home.cern/news/news/knowledge-sharing/new-open-release-allows-theorists-explore-lhc-data-new-way)] .center[(CERN, 2020)] ]

Large community adoption followed (2020 on)

Extending and visualization: cabinetry

.bold[pyhf] focuses on the modeling (library not a framework)
Leverage the design of the .bold[Scikit-HEP ecosystem] and close communication between pyhf dev team and cabinetry lead dev Alexander Held
.bold[cabinetry] designs & steers template profile likelihood fits
Uses pyhf as the inference engine
Provides common visualization for inference validation ] .kol-2-3[ .center.width-50[] .center.width-100[]

Core part of IRIS-HEP Analysis Systems pipeline

.large[Analysis Systems pipeline: deployable stack of experiment agnostic infrastructure]
- c.f. demonstration at IRIS-HEP Analysis Grand Challenge Tools Workshop 2022
.large[Accelerating fitting (reducing time to .bold[insight] (statistical inference)!)] (pyhf + cabinetry)
.large[An enabling technology for .bold[reinterpretation]] (pyhf + RECAST)

Browser native ecosystem as of April 2022

.center.width-100[

Enabling full web apps with PyScript

.center.width-55[]

.center[Future software/statistics training, web applications, schema validation enabled with Pyodide and PyScript]

HEPData support for HistFactory JSON and more

Summary

.large[Library for modeling and .bold[accelerated] fitting]
- reducing time to insight/inference!
- Hardware acceleration on GPUs and vectorized operations
- Backend agnostic Python API and CLI
.large[Flexible .bold[declarative] schema]
- JSON: ubiquitous, universal support, versionable
.large[Enabling technology for .bold[reinterpretation]]
- JSON Patch files for efficient computation of new signal models
- Unifying tool for theoretical and experimental physicists
.large[Growing user community across .bold[all of HEP]]
- Theory and experiment
.large[Project in growing .bold[Pythonic HEP ecosystem]]
- Openly developed on GitHub, with contributions welcome
- Comprehensive open tutorials ] .kol-1-3[

.center.width-100[[![pyhf_logo](https://iris-hep.org/assets/logos/pyhf-logo.png)](https://github.com/scikit-hep/pyhf)] ]

Thanks for listening!

Come talk with us!

.large[www.scikit-hep.org/pyhf] ] .grid[ .kol-1-3.center[ .width-90[] ] .kol-1-3.center[
.width-90[] ] .kol-1-3.center[

.width-100[] ] ]

Backup

Why is the likelihood important?

.kol-1-2.width-90[

High information-density summary of analysis
Almost everything we do in the analysis ultimately affects the likelihood and is encapsulated in it
- Trigger
- Detector
- Combined Performance / Physics Object Groups
- Systematic Uncertainties
- Event Selection
Unique representation of the analysis to reuse and preserve ] .kol-1-2.width-100[

]

HistFactory Template: systematic uncertainties

In HEP common for systematic uncertainties to be specified with two template histograms: "up" and "down" variation for parameter $\theta \in \{\textcolor{#0495fc}{\vec{\eta}}, \textcolor{#9c2cfc}{\vec{\chi}} \}$
- "up" variation: model prediction for $\theta = +1$
- "down" variation: model prediction for $\theta = -1$
- Interpolation and extrapolation choices provide .bold[model predictions $\nu(\vec{\theta})$ for any $\vec{\theta}$]

Constraint terms $c_{j} \left(\textcolor{#a3130f}{a_{j}}\middle|\textcolor{#9c2cfc}{\theta_{j}}\right)$ used to model auxiliary measurements. Example for Normal (most common case):
- Mean of nuisance parameter $\textcolor{#9c2cfc}{\theta_{j}}$ with normalized width ($\sigma=1$)
- Normal: auxiliary data $\textcolor{#a3130f}{a_{j} = 0}$ (aux data function of modifier type)
- Constraint term produces penalty in likelihood for pulling $\textcolor{#9c2cfc}{\theta_{j}}$ away from auxiliary measurement value
- Through $\nu(\vec{\theta})$, constraint terms inform the rate modifiers (.bold[systematic uncertainties]) during the simultaneous fit ] .kol-3-7[ .center.width-70[] .center[Image credit: Alexander Held] ]
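The "up"/"down" template picture can be sketched in a few lines. The example below assumes piecewise-linear interpolation and extrapolation, which is only one of the possible choices, together with a unit-width Normal constraint and auxiliary data at 0; the template histograms and helper names are hypothetical.

```python
import math


def interpolate_piecewise_linear(theta, down, nominal, up):
    """Per-bin prediction: linear towards `up` for theta >= 0, `down` otherwise."""
    if theta >= 0:
        return [n + theta * (u - n) for n, u in zip(nominal, up)]
    return [n + theta * (n - d) for n, d in zip(nominal, down)]


def normal_constraint_penalty(theta, aux=0.0, sigma=1.0):
    """-log Normal(aux | theta, sigma): grows as theta is pulled from aux."""
    z = (aux - theta) / sigma
    return 0.5 * z * z + math.log(sigma * math.sqrt(2.0 * math.pi))


# Hypothetical template histograms for one sample with two bins.
down = [45.0, 62.0]
nominal = [50.0, 60.0]
up = [56.0, 57.0]

for theta in (-1.0, 0.0, 0.5, 1.0, 2.0):
    prediction = interpolate_piecewise_linear(theta, down, nominal, up)
    print(theta, prediction, normal_constraint_penalty(theta))
```

At `theta = +1` and `theta = -1` the prediction reproduces the "up" and "down" templates, and values beyond that range are extrapolated along the same line while the constraint penalty grows quadratically.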

Full likelihood serialization...

.center.width-90[ ]

In an "open world" of statistics this is a difficult problem to solve
What to preserve and how? All of ROOT?
Idea: Focus on a single more tractable binned model first

JSON Patch for signal model (reinterpretation)

.center[JSON Patch gives ability to .bold[easily mutate model]] .center[Think: test a .bold[new theory] with a .bold[new patch]!] .center[(c.f. Lukas Heinrich's RECAST talk from Snowmass 2021 Computational Frontier Workshop)]
.center[Combined with RECAST gives powerful tool for .bold[reinterpretation studies]]

.center.width-100[] ] .kol-1-5[

.center.width-100[] .center[Signal model B] ]
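The mutation step can be sketched with a toy patch applier. The helper below handles only the `add` operation of JSON Patch (RFC 6902) and ignores JSON Pointer escaping, so it is an illustration rather than a substitute for a complete JSON Patch implementation; the function name, the background-only fragment, and the signal sample are all hypothetical.

```python
import copy


def apply_add_operations(document, patch):
    """Apply the "add" operations of a JSON Patch to a copy of `document`.

    Only the small subset of RFC 6902 needed for this example is handled.
    """
    patched = copy.deepcopy(document)
    for operation in patch:
        if operation["op"] != "add":
            raise NotImplementedError(operation["op"])
        *parents, last = operation["path"].lstrip("/").split("/")
        # Walk down to the container that receives the new value.
        target = patched
        for key in parents:
            target = target[int(key)] if isinstance(target, list) else target[key]
        if isinstance(target, list):
            index = len(target) if last == "-" else int(last)
            target.insert(index, operation["value"])
        else:
            target[last] = operation["value"]
    return patched


# Hypothetical background-only model fragment and signal patch.
background_only = {
    "channels": [
        {
            "name": "SR",
            "samples": [{"name": "background", "data": [50.0, 60.0], "modifiers": []}],
        }
    ]
}
signal_patch = [
    {
        "op": "add",
        "path": "/channels/0/samples/0",
        "value": {
            "name": "signal_A",
            "data": [5.0, 10.0],
            "modifiers": [{"name": "mu", "type": "normfactor", "data": None}],
        },
    }
]

signal_model = apply_add_operations(background_only, signal_patch)
sample_names = [s["name"] for s in signal_model["channels"][0]["samples"]]
print(sample_names)
```

Testing a different signal hypothesis then means swapping in a different patch while the background-only model stays untouched.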

Probability models preserved on HEPData

pyhf pallet:
- Background-only model JSON stored
- Hundreds of signal model JSON Patches stored together as a pyhf "patch set" file
Fully preserve and publish the full statistical model and observations to give likelihood
- with own DOI! .width-20[]

...can be used from HEPData

.center.width-90[]

API Example: Hypothesis test

$ python -m pip install pyhf[jax,contrib]
$ pyhf contrib download https://doi.org/10.17182/hepdata.90607.v3/r3 1Lbb-pallet

import json
import pyhf

pyhf.set_backend("jax")  # Optional for speed
spec = json.load(open("1Lbb-pallet/BkgOnly.json"))
patchset = pyhf.PatchSet(json.load(open("1Lbb-pallet/patchset.json")))

workspace = pyhf.Workspace(spec)
model = workspace.model(patches=[patchset["C1N2_Wh_hbb_900_250"]])

test_poi = 1.0
data = workspace.data(model)
cls_obs, cls_exp_band = pyhf.infer.hypotest(
    test_poi, data, model, test_stat="qtilde", return_expected_set=True
)
print(f"Observed CLs: {cls_obs}")
# Observed CLs: 0.4573416902360917
print(f"Expected CLs band: {[exp.tolist() for exp in cls_exp_band]}")
# Expected CLs band: [0.014838293214187472, 0.05174259485911152,
# 0.16166970886709053, 0.4097850957724176, 0.7428200727035176]

]

Python API Example: Upper limit

$ python -m pip install pyhf[jax,contrib]
$ pyhf contrib download https://doi.org/10.17182/hepdata.90607.v3/r3 1Lbb-pallet

import json
import matplotlib.pyplot as plt
import numpy as np
import pyhf
from pyhf.contrib.viz.brazil import plot_results

pyhf.set_backend("jax")  # Optional for speed

spec = json.load(open("1Lbb-pallet/BkgOnly.json"))
patchset = pyhf.PatchSet(json.load(open("1Lbb-pallet/patchset.json")))

workspace = pyhf.Workspace(spec)
model = workspace.model(patches=[patchset["C1N2_Wh_hbb_900_250"]])

test_pois = np.linspace(0, 5, 41)  # POI step of 0.125
data = workspace.data(model)
obs_limit, exp_limits, (test_pois, results) = pyhf.infer.intervals.upperlimit(
    data, model, test_pois, return_results=True
)

print(f"Observed limit: {obs_limit}")
# Observed limit: 2.547958147632675
print(f"Expected limits: {[limit.tolist() for limit in exp_limits]}")
# Expected limits: [0.7065311975182036, 1.0136453820160332,
# 1.5766626372587724, 2.558234487679955, 4.105381941514062]

fig, ax = plt.subplots()
artists = plot_results(test_pois, results, ax=ax)
fig.savefig("upper_limit.pdf")

] ]

.kol-2-5[ .center.width-100[] ]
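The scan above yields a CLs value at each tested POI, and a 95% CL upper limit is the POI at which CLs crosses 0.05. A minimal sketch of that last step follows, assuming linear interpolation between neighbouring scan points. The helper name and the scan values are hypothetical, and this is not necessarily how `pyhf.infer.intervals.upperlimit` is implemented.

```python
def crossing_by_linear_interpolation(pois, cls_values, level=0.05):
    """Return the POI at which the scanned CLs first falls below `level`.

    Assumes `pois` is increasing and interpolates linearly between
    the two scan points that bracket the crossing.
    """
    points = list(zip(pois, cls_values))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 >= level > y1:
            return x0 + (y0 - level) * (x1 - x0) / (y0 - y1)
    raise ValueError("CLs does not cross the requested level within the scan")


# Made-up scan values for illustration; not results from any analysis.
scan_pois = [0.0, 1.0, 2.0, 3.0, 4.0]
scan_cls = [1.0, 0.45, 0.15, 0.03, 0.005]

limit = crossing_by_linear_interpolation(scan_pois, scan_cls)
print(f"Interpolated upper limit: {limit:.3f}")
```

A finer scan grid, such as the 41 points used in the example above, reduces the error introduced by the linear approximation between points.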

API Example: Extend with cabinetry

import json
import cabinetry
import pyhf
from cabinetry.model_utils import prediction
from pyhf.contrib.utils import download

# download the ATLAS bottom-squarks analysis probability models from HEPData
download("https://www.hepdata.net/record/resource/1935437?view=true", "bottom-squarks")

# construct a workspace from a background-only model and a signal hypothesis
bkg_only_workspace = pyhf.Workspace(
    json.load(open("bottom-squarks/RegionC/BkgOnly.json"))
)
patchset = pyhf.PatchSet(json.load(open("bottom-squarks/RegionC/patchset.json")))
workspace = patchset.apply(bkg_only_workspace, "sbottom_600_280_150")

# construct the probability model and observations
model, data = cabinetry.model_utils.model_and_data(workspace)

# produce visualizations of the pre-fit model and observed data
prefit_model = prediction(model)
cabinetry.visualize.data_mc(prefit_model, data)

# fit the model to the observed data
fit_results = cabinetry.fit.fit(model, data)

# produce visualizations of the post-fit model and observed data
postfit_model = prediction(model, fit_results=fit_results)
cabinetry.visualize.data_mc(postfit_model, data)

] ] .kol-2-7.center[ .center.width-90[] .center.width-90[] ]

Rapid adoption in ATLAS...

22 ATLAS SUSY, Exotics, Top analyses with full probability models published to HEPData
ATLAS SUSY will be continuing to publish full Run 2 likelihoods ] .kol-2-3[
direct staus, doi:10.17182/hepdata.89408 (2019)
sbottom multi-b, doi:10.17182/hepdata.91127 (2019)
1Lbb, doi:10.17182/hepdata.92006 (2019)
3L eRJR, doi:10.17182/hepdata.90607 (2020)
ss3L search, doi:10.17182/hepdata.91214 (2020) ] .kol-1-1[ .kol-1-1[ .kol-1-2[ .center.width-70[] ] .kol-1-2[ .center.width-70[] ] ] .center.smaller[SUSY EWK 3L RPV analysis (ATLAS-CONF-2020-009): Exclusion curves as a function of mass and branching fraction to $Z$ bosons] ]

...and by theory

pyhf likelihoods discussed in
- Les Houches 2019 Physics at TeV Colliders: New Physics Working Group Report
- Higgs boson potential at colliders: status and perspectives
SModelS team has implemented a SModelS/pyhf interface [arXiv:2009.01809]
- tool for interpreting simplified-model results from the LHC
- designed to be used by theorists
- SModelS authors giving tutorial later today! ] .kol-2-3[ .center.width-100[] .center.smaller[Feedback on use of public probability models, Sabine Kraml
  (ATLAS Exotics + SUSY Reinterpretations Workshop)]

]

Have produced three comparisons to published ATLAS likelihoods: ATLAS-SUSY-2018-04, ATLAS-SUSY-2018-31, ATLAS-SUSY-2019-08
- Compare simplified likelihood (bestSR) to full likelihood (pyhf) using SModelS

Ongoing work to interface CMS Combine

.kol-1-2.large[

pyhf users in 2022: ATLAS, Belle II, phenomenology community, IRIS-HEP
Working to create a bridge for CMS to use and validate with a converter to CMS Combine
- Difficult as HistFactory is "closed world" of models and CMS Combine is RooFit "open world"
IRIS-HEP Fellow Summer 2022 project is ongoing with some promising preliminary results ] .kol-1-2[

.center.width-100[] .center.smaller[.bold[A pyhf converter for binned likelihood models in CMS Combine]] ]

References

F. James, Y. Perrin, L. Lyons, .italic[Workshop on confidence limits: Proceedings], 2000.
ROOT collaboration, K. Cranmer, G. Lewis, L. Moneta, A. Shibata and W. Verkerke, .italic[HistFactory: A tool for creating statistical models for use with RooFit and RooStats], 2012.
L. Heinrich, H. Schulz, J. Turner and Y. Zhou, .italic[Constraining $A_{4}$ Leptonic Flavour Model Parameters at Colliders and Beyond], 2018.
A. Read, .italic[Modified frequentist analysis of search results (the $\mathrm{CL}_{s}$ method)], 2000.
K. Cranmer, .italic[CERN Latin-American School of High-Energy Physics: Statistics for Particle Physicists], 2013.
ATLAS collaboration, .italic[Search for bottom-squark pair production with the ATLAS detector in final states containing Higgs bosons, b-jets and missing transverse momentum], 2019.
ATLAS collaboration, .italic[Reproducing searches for new physics with the ATLAS experiment through publication of full statistical likelihoods], 2019.
ATLAS collaboration, .italic[Search for bottom-squark pair production with the ATLAS detector in final states containing Higgs bosons, b-jets and missing transverse momentum: HEPData entry], 2019.

The end.
