# Index
 
 *Author: Jose A. Hernando*, January 2020

*Instituto Galego de Altas Enerxías. Universidade de Santiago de Compostela, Spain.*


```python
import time
print(' Last version ', time.asctime() )
```

```
 Last version  Mon Feb 27 16:24:45 2023
```


**About**

These lectures are about statistical methods for rare event searches in Particle Physics using Python.

They cover Hypothesis Testing and Confidence Intervals. They are based on the excellent lectures on statistics by Prosper ([lectures](https://indico.cern.ch/event/358542), [pdf](https://arxiv.org/pdf/1504.00945.pdf)), Cowan ([lectures](http://indico.cern.ch/event/173726/)) and Cranmer ([lectures](https://indico.cern.ch/event/48425/)) given at CERN Academic Training.

We will use the Python scientific toolkits Matplotlib, NumPy and SciPy, which are distributed with Anaconda Python.

A disclaimer: *I am neither a statistician nor a programmer!* 

## Introduction to these lectures

Sometimes, we do an experiment to discover a new particle.

If the particle exists in Nature, we may find only a few events. This is a **rare search**.

Rare events usually follow Poisson distributions. But statistics is nice and friendly mostly in the "Gaussian domain", where the number of events is large.
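As a small illustration of why rare events fall outside the "Gaussian domain", the sketch below compares the exact Poisson probability of an upward fluctuation with its Gaussian approximation. The means (2 and 200 events) and the helper name `tail_probabilities` are assumptions chosen only for this example.

```python
import numpy as np
from scipy import stats


def tail_probabilities(mean, n_obs):
    """Return P(n >= n_obs) for a Poisson and for its Gaussian approximation."""
    # Exact Poisson tail: sf(k) = P(n > k), so P(n >= n_obs) = sf(n_obs - 1)
    p_poisson = stats.poisson.sf(n_obs - 1, mean)
    # Gaussian with the same mean and variance (no continuity correction)
    p_gauss = stats.norm.sf(n_obs, loc=mean, scale=np.sqrt(mean))
    return p_poisson, p_gauss


for mean in (2.0, 200.0):  # a "rare" case and a "Gaussian domain" case
    # Observation roughly two standard deviations above the mean
    n_obs = int(round(mean + 2.0 * np.sqrt(mean)))
    p_poisson, p_gauss = tail_probabilities(mean, n_obs)
    print(f"mean = {mean:6.1f}, n_obs = {n_obs:4d}: "
          f"Poisson = {p_poisson:.4f}, Gaussian = {p_gauss:.4f}, "
          f"ratio = {p_poisson / p_gauss:.2f}")
```

For the small mean, the two tail probabilities differ by a large factor, while for the large mean they are much closer. This is the reason rare searches need a dedicated treatment.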


If we 'observe' some rare events, then:

When can we claim an **observation** or a **discovery** of the new particle?

And if not, what is the **limit** we can set on a given observable (e.g. the half-life)?

In fact, *what do discovery, observation, limit and confidence interval mean?*

And, *how do we compute them from data?*

These are the questions we try to answer in these lectures.

We will discover that we need to cover three issues:

- Hypothesis Testing

- Confidence Intervals

- Regression 


They seem different but they are not!

We start with a **simple hypothesis**, $H_1$ (e.g. BSM), to be confronted with the currently accepted hypothesis, $H_0$ (called the null hypothesis, e.g. the SM).

Soon we realize that, in general, the alternative hypothesis depends on a **strength parameter** $\mu$, that is, $H_1(\mu)$. An example is the half-life of the $\beta\beta0\nu$ decay. This is called a **composite hypothesis**.

And then we are back to estimating the region of the parameter $\mu$ that is "compatible" with the data, $x$; in other words, to defining a **confidence interval**, which in most cases requires estimating the parameter, $\hat{\mu}$, that is, **regression**.
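To see how the three issues connect, here is a minimal sketch for a counting experiment, assuming the number of events follows a Poisson distribution with mean $b + \mu s$. The values of `b`, `s` and `n_obs`, and all function names, are hypothetical and chosen only for illustration; the limit shown is the simple classical one, not necessarily the construction used later in the lectures.

```python
from scipy import optimize, stats

b = 3.0      # expected background events (hypothetical)
s = 5.0      # expected signal events for mu = 1 (hypothetical)
n_obs = 8    # observed number of events (hypothetical)


def p_value_null(n_obs, b):
    """Hypothesis testing: P(n >= n_obs | H0), background only."""
    return stats.poisson.sf(n_obs - 1, b)


def mu_hat(n_obs, b, s):
    """Regression: the value of mu that best matches the observed count."""
    return (n_obs - b) / s


def upper_limit(n_obs, b, s, cl=0.90):
    """Confidence interval: mu such that P(n <= n_obs | mu) = 1 - cl.

    Assumes n_obs is not far below b, so that a root exists for mu >= 0.
    """
    def f(mu):
        return stats.poisson.cdf(n_obs, b + mu * s) - (1.0 - cl)
    return optimize.brentq(f, 0.0, 100.0)


p0 = p_value_null(n_obs, b)
print(f"p-value of H0      : {p0:.4f}")
print(f"significance (Z)   : {stats.norm.isf(p0):.2f}")
print(f"best estimate of mu: {mu_hat(n_obs, b, s):.2f}")
print(f"90% CL upper limit : {upper_limit(n_obs, b, s):.2f}")
```

The same ingredient, the probability of the data given $\mu$, is used to test $H_0$, to estimate $\hat{\mu}$ and to bound $\mu$.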

The starting point is a bifurcation: either we follow a **Bayes** or a **Frequentist** path.

Being a Bayesian usually implies doing **integration** (sometimes complicated integrals!).

Being a frequentist usually implies doing either regression (fits!) or **simulations**.
But thanks to current computing power, we can play the **frequentist game**!
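The contrast between the two paths can be sketched with the same toy counting experiment. The frequentist part estimates a p-value from simulated experiments, and the Bayesian part integrates a posterior numerically. The background `b`, the observed count `n_obs`, the flat prior on the signal and the integration range `s_max` are all assumptions made only for this example.

```python
import numpy as np
from scipy import integrate, optimize, stats

b, n_obs = 3.0, 8   # hypothetical background expectation and observed count

# Frequentist path: simulate many background-only experiments
rng = np.random.default_rng(12345)
toys = rng.poisson(b, size=200_000)
p_toys = np.mean(toys >= n_obs)               # fraction of toys with n >= n_obs
p_exact = stats.poisson.sf(n_obs - 1, b)      # analytic value for comparison

# Bayesian path: posterior of the signal s >= 0 with a flat prior (an assumption)
s_max = 100.0  # the likelihood is negligible beyond this value


def likelihood(sig):
    """Poisson probability of n_obs given a mean of b + sig."""
    return stats.poisson.pmf(n_obs, b + sig)


norm, _ = integrate.quad(likelihood, 0.0, s_max)


def posterior_cdf(s_up):
    """Posterior probability that the signal is below s_up."""
    area, _ = integrate.quad(likelihood, 0.0, s_up)
    return area / norm


# 90% credible upper limit on the signal
s_up = optimize.brentq(lambda sig: posterior_cdf(sig) - 0.90, 0.0, s_max)

print(f"p-value from toys: {p_toys:.4f} (exact: {p_exact:.4f})")
print(f"90% credible upper limit on the signal: {s_up:.2f}")
```

In this one-dimensional case both paths are cheap. With several parameters, the integral and the simulations each become the costly step.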

In these lectures we will use Python (in fact, SciPy, NumPy and Matplotlib) for our (mostly) frequentist journey!

But let's start by refreshing some basic ideas about probability density functions, likelihoods and posterior probabilities.

*** 
## Index

* [Basic concepts](./ta_basic_concepts.ipynb)
    
* [Simple hypothesis testing](./ta_hypothesis_test.ipynb)

* [Confidence Intervals](./ta_confidence_intervals.ipynb)

* [Composite hypothesis testing](./ta_hypothesis_test_composite.ipynb)

* [An example of composite hypothesis testing](./ta_hypothesis_test_composite_example.ipynb)

***

## Bibliography

[1] "Practical Statistics for LHC Physicists," H. B. Prosper, CERN Academic Training Lectures (2015). https://indico.cern.ch/event/358542/ https://arxiv.org/pdf/1504.00945.pdf

[2] "Statistics for HEP," G. Cowan, CERN Academic Training Lectures (2012). http://indico.cern.ch/event/173726/

[3] "Statistics for Particle Physics," K. Cranmer, CERN Academic Training Lectures (2009). https://indico.cern.ch/event/48425/

[4] "Unified approach to the classical statistical analysis of small signals," G. J. Feldman and R. D. Cousins, Phys. Rev. D57 (1998) 3873. http://journals.aps.org/prd/abstract/10.1103/PhysRevD.57.3873

[5] "Asymptotic formulae for likelihood-based tests of new physics," G. Cowan, K. Cranmer, E. Gross and O. Vitells, Eur. Phys. J. C71 (2011) 1554. https://arxiv.org/abs/1007.1727

[6] "Incorporating systematic uncertainties into an upper limit," R.D. Cousins and V.L. Highland. Nucl. Instrum. Meth. A320, 331 (1992). http://www.sciencedirect.com/science/article/pii/0168900292907945

[7] "Confidence Level Computation for Combining Searches with Small Statistics," T. Junk, Nucl. Instrum. Meth. A434, 435 (1999). https://arxiv.org/abs/hep-ex/9902006

[8] "How good are your fits? Unbinned multivariate goodness-of-fit tests in high energy physics," M. Williams, https://arxiv.org/abs/1006.3019

[9] ROOT https://root.cern.ch, TMVA http://tmva.sourceforge.net, RooFit https://root.cern.ch/roofit

[10] Anaconda https://anaconda.org, SciPy https://www.scipy.org, NumPy http://www.numpy.org, Scikit-learn http://scikit-learn.org/stable/, Matplotlib http://matplotlib.org

[11] "Lectures on Statistics in Theory: Prelude to Statistics in Practice," R. D. Cousins, https://arxiv.org/abs/1807.05996

[12] "Statistics for physics," D. Tonelli, Invisible School 2019, Canfranc.