# 1. Searching for $H\rightarrow b\bar{b}$

One of the primary physics objectives of the Large Hadron Collider is to study the Higgs Boson and understand the mechanism by which particles acquire mass. The Higgs Boson's decay channels, along with their respective branching ratios, are shown in the figure below. 

<img src="https://physicsforme.files.wordpress.com/2012/07/higgs_decay.jpg" width="400" />



In 2012 the Higgs Boson was discovered in the $H\rightarrow\gamma\gamma$ and $H\rightarrow ZZ \rightarrow 4l$ channels. The plot below shows a clear excess of signal events over background around 125 GeV in the $H\rightarrow ZZ \rightarrow 4l$ 'golden channel'. 

<img src="https://3c1703fe8d.site.internapcdn.net/newman/gfx/news/hires/2017/newatlasprec.png" width="400" />


But if the Higgs Boson predominantly decays to a pair of $b$-quarks, why wasn't it discovered in the $H\rightarrow b\bar{b}$ channel? 

The discovery channels leave a clean signature in the detector that is easy to isolate, which makes searches in the $H\rightarrow\gamma\gamma$ or $H\rightarrow ZZ \rightarrow 4l$ channels favourable. However, $H\rightarrow b\bar{b}$ searches aren't so straightforward!

Here's why:

## Large background 

Firstly, the $H\rightarrow b\bar{b}$ process at the LHC has an enormous amount of background with the same final state as the signal, i.e. two $b$-quarks. The figure below compares the cross section (a measure of how likely a process is) for producing a Higgs Boson with that for producing a $b\bar{b}$ pair at the LHC. At the LHC's current operating energy, only around 1 in a billion proton-proton collisions creates a Higgs Boson. 


<img src="../docs/images/lhc_cross_sections.png" width="400" />

Hence, the signal can get drowned out by the signal-like background. 
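
To make "drowned out" concrete, the sketch below evaluates one widely used figure of merit, the Asimov approximation to the discovery significance, for a fixed signal yield on top of increasingly large backgrounds. The yields are hypothetical, and the notebooks used later in this exercise may define signal sensitivity differently.

```python
import math


def asimov_significance(s, b):
    """Approximate median discovery significance for s signal and b background events."""
    if b <= 0:
        raise ValueError("background yield must be positive")
    return math.sqrt(2.0 * ((s + b) * math.log(1.0 + s / b) - s))


# Hypothetical yields: the same signal sitting on ever larger backgrounds.
signal = 50.0
for background in (100.0, 10_000.0, 1_000_000.0):
    z = asimov_significance(signal, background)
    simple = signal / math.sqrt(background)  # large-background limit of the formula
    print(f"s = {signal:.0f}, b = {background:.0f}: Z = {z:.2f}, s/sqrt(b) = {simple:.2f}")
```

For $b \gg s$ the expression tends to $s/\sqrt{b}$, so multiplying the background by 100 costs roughly a factor of 10 in significance.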

## Jet Production

Secondly, precisely measuring $b$-quarks and distinguishing them from, say, $c$-quarks is also challenging. When a pair of $b$-quarks is created in the ATLAS detector, they undergo a process known as hadronisation and form [jets](https://www.youtube.com/watch?v=FMH3T05G\_to). The jets must be identified as originating from $b$-quarks ($b$-jets) by a process known as $b$-tagging. This extra level of complexity, along with the large background, makes searches for $H\rightarrow b\bar{b}$ challenging. 

### 1.2 The WH 1-Lepton Channel

The process we will be searching for is shown in the Feynman Diagram below. A Higgs Boson is radiated off a $W^\pm$ boson and subsequently decays to a pair of _b_-quarks. The $W^\pm$ boson then decays into a charged lepton and the corresponding neutrino. 

<img src="../docs/images/one-lepton.png" width="350" />


The final state products of a 1-lepton channel $H\rightarrow b\bar{b}$ process are:
   * A Neutrino [characterised as missing transverse energy in the detector].
   * A charged Lepton ($e$ or $\mu$) [characterised by its transverse momentum and direction].
   * 2 _b_-jets [characterised by their transverse momentum, direction, distance between them and their reconstructed mass].
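
The quantities named in the list above can be sketched with standard collider-kinematics formulas. This is an illustration only: the function names, the `(pt, eta, phi, mass)` tuple layout and the units (GeV) are assumptions, and the exercise notebooks supply these variables already computed.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal angle difference wrapped into [-pi, pi]."""
    dphi = phi1 - phi2
    while dphi > math.pi:
        dphi -= 2.0 * math.pi
    while dphi < -math.pi:
        dphi += 2.0 * math.pi
    return dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance between two objects in the (eta, phi) plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def four_vector(pt, eta, phi, mass):
    """Convert (pt, eta, phi, mass) to (E, px, py, pz)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz


def invariant_mass(jet1, jet2):
    """Invariant mass of two jets given as (pt, eta, phi, mass) tuples."""
    e, px, py, pz = (a + b for a, b in zip(four_vector(*jet1), four_vector(*jet2)))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def transverse_mass(lep_pt, lep_phi, met, met_phi):
    """Transverse mass of the lepton plus missing-transverse-energy system."""
    return math.sqrt(2.0 * lep_pt * met * (1.0 - math.cos(delta_phi(lep_phi, met_phi))))


# Hypothetical event: two massless back-to-back jets and a lepton opposite the MET.
jet_a = (62.5, 0.0, 0.0, 0.0)
jet_b = (62.5, 0.0, math.pi, 0.0)
print(invariant_mass(jet_a, jet_b))
print(delta_r(jet_a[1], jet_a[2], jet_b[1], jet_b[2]))
print(transverse_mass(40.0, 0.0, 40.0, math.pi))
```

Only transverse quantities enter the $W$ reconstruction because the neutrino's longitudinal momentum is not measured.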


### 1.3 Separating Signal from Background


We separate ATLAS events using kinematic and topological parameters. A list of the variables that can be used in this exercise is shown below: 



| Variable        | Description           | Label  |
| ------------- |:-------------:| -----:|
|$n_J$                   | Number of jets in the event (this is always 2 in this exercise) | nJ |
|$n_{\text{Tags}}$       | Number of b-tagged jets in the event (this is always 2 in this exercise) | nTags |
|$\Delta R(b_1b_2)$      | Angular distance between the two *b*-tagged jets | dRBB |
| $p_T^{B1}$                | Reconstructed transverse momentum of the b-tagged jet with the highest $p_{T}$                      | pTB1 |
| $p_T^{B2}$                | Reconstructed transverse momentum of the b-tagged jet with the 2nd highest $p_{T}$                      | pTB2 |
| $p_T^V$                | Reconstructed transverse momentum of the vector boson                      | pTV |
| $m_{BB}$               | Reconstructed mass of the Higgs boson from the two b-tagged jets                     | mBB |
| $m_{top}$              | Reconstructed top quark mass                     | Mtop |
| $m_{T}^{W}$              | Reconstructed transverse mass of the W boson                     | mTW |
| $E^{Miss}_{T}$         | Missing transverse energy                        | MET |
| $dY(W, H)$             | Rapidity separation between the W boson and the Higgs candidate                        | dYWH |
| $d\phi(W, H)$          | Angular separation in the transverse plane between the W boson and the Higgs candidate                        | dPhiVBB |
| $MV1^{B1}_{\text{cont}}$        | Output of the b-tagging classifier for the leading jet (the higher the value, the more likely it is a b-jet) | MV1cB_cont |
| $MV1^{B2}_{\text{cont}}$        | Output of the b-tagging classifier for the sub-leading jet (the higher the value, the more likely it is a b-jet) | MV1cB2_cont |
| $n^{\text{Jets}}_{\text{cont}}$        | Number of additional jets found in the event | nTrackJetsOR |


Table 1: Kinematic and topological parameters used to identify events.
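
As an illustration of how the labels in Table 1 might be used, the sketch below applies rectangular cuts to a handful of toy events and sums the weighted signal and background yields. The event values, the `is_signal` and `weight` keys, the cut thresholds and the assumption that masses and momenta are in GeV are all hypothetical; the provided notebooks define the real data format.

```python
# Toy events keyed by Table 1 labels; "is_signal" and "weight" are hypothetical keys.
events = [
    {"mBB": 124.0, "dRBB": 0.9, "pTV": 210.0, "is_signal": True, "weight": 0.02},
    {"mBB": 118.0, "dRBB": 1.1, "pTV": 180.0, "is_signal": True, "weight": 0.02},
    {"mBB": 60.0, "dRBB": 2.8, "pTV": 160.0, "is_signal": False, "weight": 1.5},
    {"mBB": 130.0, "dRBB": 1.0, "pTV": 250.0, "is_signal": False, "weight": 0.8},
    {"mBB": 210.0, "dRBB": 3.1, "pTV": 155.0, "is_signal": False, "weight": 1.2},
]

# Hypothetical cut windows as (lower, upper); None means no bound on that side.
cuts = {"mBB": (100.0, 140.0), "dRBB": (None, 1.8), "pTV": (150.0, None)}


def passes(event, cuts):
    """Return True if the event lies inside every cut window."""
    for name, (low, high) in cuts.items():
        value = event[name]
        if low is not None and value < low:
            return False
        if high is not None and value > high:
            return False
    return True


def yields(events, cuts):
    """Weighted signal and background yields after the selection."""
    s = sum(e["weight"] for e in events if e["is_signal"] and passes(e, cuts))
    b = sum(e["weight"] for e in events if not e["is_signal"] and passes(e, cuts))
    return s, b


signal_yield, background_yield = yields(events, cuts)
print(signal_yield, background_yield)
```

Optimising a cut-based analysis then amounts to varying the windows in `cuts` and keeping the set that maximises the chosen sensitivity measure.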



### 1.4 Tasks

Baseline:
- Produce an optimised cut-based analysis using the di-jet mass as a discriminant, to use as a baseline for comparison (you shouldn't spend more than 1-2 days on this). A notebook that reads in the data, visualises the various distributions and calculates the signal sensitivity is provided as a starting point here:
	+ https://github.com/samvanstroud/in2HEP/blob/practicalMLproject/practicalMLproject/ATLAS_Cut_Based.ipynb
- Produce a simple optimised NN-based supervised classifier to separate signal from background. A notebook that reads in the data, draws the classifier output and calculates the _signal sensitivity_ is provided as a starting point here: 
	+ https://github.com/samvanstroud/in2HEP/blob/practicalMLproject/practicalMLproject/ATLAS_NN.ipynb
- Aspects of the classifier to optimise include:
	+ Number of nodes in each layer
	+ Number of hidden layers
	+ Training parameters
	+ Activation functions
	+ Optimisation algorithms
- Determine:
	+ Improvement over a cut-based approach
	+ Whether we have enough training statistics (vary the fraction of training events used, separately for signal and background, from 0 to 100%; does the sensitivity plateau?)
	+ The importance of the input variables (remove one variable at a time and retrain, how much does the sensitivity degrade by?)
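
The last two studies in the list above can be organised as simple loops around a "train and evaluate" step. The sketch below shows one possible structure; `toy_train_and_score` and its made-up numbers are hypothetical stand-ins for retraining the classifier and computing its sensitivity, and a single training fraction is used for brevity rather than separate signal and background fractions.

```python
import math

# Made-up per-variable contributions; a real study would retrain the NN instead.
TOY_GAIN = {"mBB": 1.0, "dRBB": 0.5, "pTV": 0.4, "MET": 0.1}


def toy_train_and_score(variables, train_fraction=1.0):
    """Hypothetical stand-in: 'train on these inputs, return the sensitivity'."""
    saturation = 1.0 - math.exp(-5.0 * train_fraction)
    return saturation * sum(TOY_GAIN[name] for name in variables)


def leave_one_out_ranking(variables, train_and_score):
    """Rank variables by the sensitivity lost when each one is removed."""
    baseline = train_and_score(variables)
    loss = {}
    for name in variables:
        reduced = [v for v in variables if v != name]
        loss[name] = baseline - train_and_score(reduced)
    return sorted(loss.items(), key=lambda item: item[1], reverse=True)


def training_fraction_scan(variables, train_and_score, fractions):
    """Sensitivity as a function of the fraction of training events used."""
    return [(f, train_and_score(variables, train_fraction=f)) for f in fractions]


inputs = ["mBB", "dRBB", "pTV", "MET"]
for name, lost in leave_one_out_ranking(inputs, toy_train_and_score):
    print(f"{name}: sensitivity lost = {lost:.3f}")
for fraction, score in training_fraction_scan(inputs, toy_train_and_score, [0.1, 0.25, 0.5, 1.0]):
    print(f"fraction = {fraction:.2f}: sensitivity = {score:.3f}")
```

With a real classifier each call is a full retraining, so the results should be read alongside the run-to-run training variation.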

Extensions:
- Investigate which events are selected in the most sensitive region (the high NN output region). Are these similar to those selected in the cut-based approach?
- Determine training uncertainty (how much does performance vary when re-training an identical configuration)
- Try automated hyper-parameter optimisation (Bayesian, stochastic) and compare it to a grid search
- Train a multi-class classifier or MVA cascade for the different backgrounds (V+bb, $t\bar{t}$, diboson). How does this compare?
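
For the hyper-parameter comparison, the sketch below contrasts an exhaustive grid search with a simple random (stochastic) search over the same discrete space. The objective `toy_sensitivity`, the parameter names and the search space are hypothetical; in practice the objective would train the network and return its sensitivity, and Bayesian optimisation would need a dedicated library that is not shown here.

```python
import itertools
import math
import random


def toy_sensitivity(n_layers, n_nodes, learning_rate):
    """Hypothetical smooth objective standing in for 'train the NN, return sensitivity'."""
    return (
        2.0
        - 0.1 * (n_layers - 3) ** 2
        - 1e-4 * (n_nodes - 64) ** 2
        - 0.5 * (math.log10(learning_rate) + 3.0) ** 2
    )


search_space = {
    "n_layers": [1, 2, 3, 4, 5],
    "n_nodes": [16, 32, 64, 128],
    "learning_rate": [1e-4, 1e-3, 1e-2],
}


def grid_search(objective, space):
    """Evaluate every combination and return the best parameters and score."""
    names = list(space)
    best_params, best_score = None, float("-inf")
    for combo in itertools.product(*(space[name] for name in names)):
        params = dict(zip(names, combo))
        score = objective(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def random_search(objective, space, n_trials, seed=0):
    """Evaluate n_trials randomly drawn combinations and return the best one found."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.choice(values) for name, values in space.items()}
        score = objective(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


print(grid_search(toy_sensitivity, search_space))
print(random_search(toy_sensitivity, search_space, n_trials=15))
```

The grid search here costs 60 evaluations while the random search costs 15, so the comparison of interest is how close the cheaper search gets for a given training budget.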





**Based upon material originally produced by hackingEducation for use in outreach**  
<img src="../docs/images/logo-black.png" width="50" align = 'left'/>