## 1. $H\rightarrow b\bar{b}$ via Machine Learning 

Machine learning is one of the main ways in which we build artificial intelligence. The distinguishing feature of machine learning compared with conventional programming is training. Instead of programming the steps that lead to the solution of a problem, we program a machine learning algorithm that is trained to learn those steps itself.

In the ATLAS $H\rightarrow b\bar{b}$ analysis, machine learning lets us exploit the correlations between the different kinematic variables to improve the discrimination between signal and background.

In this exercise you will build and train a neural network to classify $H\rightarrow b\bar{b}$ events instead of using sequential cuts.

### 1.1 What is a Neural Network?

The neural network model is loosely based on the human brain. A neuron receives a number of input signals and processes them. It passes a new signal on to the next neuron (it fires) only if a certain condition is met.

A neural network is a collection of artificial neurons organised in layers (as shown in the image below) that connect together, allowing the network to learn complex patterns and make sophisticated decisions. Each artificial neuron multiplies the input signals (variables) coming into the node by a set of weights and adds up the results. If this weighted sum is above a certain threshold, the neuron fires!
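
As a rough illustration of a single artificial neuron, here is a minimal sketch in plain Python. The function name `neuron`, the input values, the weights and the thresholds are all made up for this example and are not part of the exercise code.

```python
def neuron(inputs, weights, threshold):
    """Toy artificial neuron: fires (returns 1) if the weighted sum exceeds the threshold."""
    # Multiply each input by its weight and add up the results
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum > threshold else 0


# Illustrative values only: the weighted sum is 0.2 + 0.3 - 0.1 = 0.4
inputs = [0.5, 1.0, 0.2]
weights = [0.4, 0.3, -0.5]

print(neuron(inputs, weights, threshold=0.3))  # weighted sum is above 0.3, so the neuron fires
print(neuron(inputs, weights, threshold=0.5))  # weighted sum is below 0.5, so it does not fire
```

Real networks usually replace the hard threshold with a smooth *activation function*, but the idea of a weighted sum followed by a decision is the same.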


![Neural Network](https://i.stack.imgur.com/OH3gI.png)

The neurons must be trained to process the signal appropriately so as to maximise the accuracy in decision making. To understand this, let's go back to the human brain analogy. Say you were teaching your younger sibling how to recognise numbers. You would repeatedly show them pictures of numbers and tell them which one is which. You might test their knowledge by giving them a new image and then seeing whether they recognised what digit it was. You might then praise them for getting it right and correct them if they got it wrong. 


Neurons are trained in a similar way. To train a neural network to recognise handwritten digits, we would give it thousands of labelled images of handwritten digits. Each image is passed through the network as a set of numbers between 0 and 1, and every neuron computes the product of its inputs and its weights. The output at the end of the neural network is used to reach a decision. The training process updates the weights so that the accuracy is maximised.

This process is similar to the maximisation (or minimisation) problems you may have done in A-Level Maths, where you used differentiation to find the minimum or maximum value of a function.
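
To make the link with differentiation concrete, here is a minimal sketch of *gradient descent* on a toy function. The function $(w-3)^2$, the starting point and the learning rate are assumptions chosen for illustration. A real neural network does the same thing with many weights at once.

```python
def loss(w):
    """Toy function to minimise: it has its minimum at w = 3."""
    return (w - 3.0) ** 2


def gradient(w):
    """Derivative of the toy function with respect to w."""
    return 2.0 * (w - 3.0)


w = 0.0              # starting guess for the weight
learning_rate = 0.1  # size of each step

for step in range(50):
    # Move the weight a small step in the direction that decreases the loss
    w = w - learning_rate * gradient(w)

print("final w:", w, "final loss:", loss(w))
```

After enough steps `w` ends up very close to 3, the value at which the derivative is zero.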

All this might sound a bit confusing! Don't worry, it is a lot to take in within a short space of time. Ask your mentors about anything you're unsure about.

We're now going to learn how to build and apply a neural network in Python using the Keras library. 


### 1.2 Neural Networks in Keras

The $H\rightarrow b\bar{b}$ classification can be summarised in the following 5 steps:

* Import libraries and load data.
* Build the neural network.
* Train the neural network.
* Make decisions about events using the neural network.
* Calculate sensitivity.


### Importing libraries and loading data

The Keras library is a free, open-source tool for developing neural networks quickly and easily. It does a lot of the heavy lifting for us, making it an ideal tool for prototyping. Like any popular Python library, it is well documented, with guides explaining how to use it. These are available here: https://keras.io/

In this exercise, your mentors will guide you through a lot of the code that has already been written for you so that you can focus specifically on building the neural network. It is still valuable to understand what is going on, so make sure you ask questions!

In [None]:
import pandas as pd
from copy import deepcopy
from ucl_masterclass import *  # masterclass helper functions
from sklearn.preprocessing import scale
from keras.models import Sequential
from keras.layers import Input, Dense, Dropout, Flatten
from keras import backend as K
from time import time


# Load the simulated events, already split into an even and an odd data set
df_even = pd.read_csv('../data-v1/VHbb_data_2jet_even.csv')
df_odd = pd.read_csv('../data-v1/VHbb_data_2jet_odd.csv')

### Training 

You may have wondered why the above data is loaded into two data frames labelled odd and even. We do this to avoid *overtraining*. But before we go into that, let's cover the basics of training. 

The neural network is given a set of training events that have a Class label of 1 for signal and 0 for background. During the training phase, the neural network's goal is to *learn* how to map the input variables given to it (kinematic and topological quantities of the event) to 0 for background and 1 for signal.

To ensure that the neural network learns general features of the signal and background processes, and not just **artifacts** unique to the training data set, we split the data into a training set and a validation set. Two neural networks are trained: one is trained on *even* events and tested on *odd* events, and the other is trained on *odd* events and tested on *even* events.
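
The even/odd scheme can be sketched in plain Python as follows. This is only an illustration: the `event_number` field and the toy events are hypothetical, and it assumes the split is made on the parity of an event number. In the exercise the data has already been split into two files for you.

```python
def split_even_odd(events):
    """Split events into two sets using the parity of a (hypothetical) event number."""
    even = [e for e in events if e["event_number"] % 2 == 0]
    odd = [e for e in events if e["event_number"] % 2 == 1]
    return even, odd


# Toy events for illustration only
events = [{"event_number": n, "mBB": 100.0 + n} for n in range(10)]
even, odd = split_even_odd(events)

# Each network is trained on one half and tested on the other half,
# so every event is only ever scored by a network that never saw it in training
folds = [
    {"name": "model_even", "train": even, "test": odd},
    {"name": "model_odd", "train": odd, "test": even},
]

for fold in folds:
    print(fold["name"], "trains on", len(fold["train"]), "events and is tested on", len(fold["test"]))
```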

**In the section below we extract the input (x) and target values (y) used to train the neural network.**

In [None]:
# List of variables used in training
variables = ['dRBB', 'mBB', 'MET', 'Mtop', 'pTV']

# Even events
x_even = scale(df_even[variables].values)
y_even = df_even['Class'].values
w_even = df_even['training_weight'].values

# Odd events
x_odd = scale(df_odd[variables].values)
y_odd = df_odd['Class'].values
w_odd = df_odd['training_weight'].values

### Building a Neural Network

To build the neural network we will be using the Keras *Sequential* model. It lets us add layers to our model one at a time. For example, the following line of code adds a layer with 10 neurons.
```python
model.add(Dense(units=10,activation='relu')) # adds a layer with 10 neurons. 
```

By sequentially adding layers you can create a fully connected deep neural network within minutes! 


**Ask your mentor to explain what the following terms are:**
* Activation
* Loss
* Gradient Descent 
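
As a starting point for that discussion, here is a minimal sketch in plain Python of the two activation functions and the loss named in the code cells (`relu`, `sigmoid` and `binary_crossentropy`). These are simplified single-number versions written for illustration, not the Keras implementations. The small `eps` clipping value is an assumption used to avoid taking the logarithm of zero.

```python
import math


def relu(z):
    """ReLU activation: passes positive values through and sets negative values to 0."""
    return max(0.0, z)


def sigmoid(z):
    """Sigmoid activation: squashes any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Loss for one event: small when the prediction is close to the true label (0 or 1)."""
    y_pred = min(max(y_pred, eps), 1.0 - eps)  # keep the prediction away from exactly 0 or 1
    return -(y_true * math.log(y_pred) + (1 - y_true) * math.log(1.0 - y_pred))


print(relu(-2.0), relu(3.0))
print(sigmoid(0.0))
# A signal event (label 1) scored 0.9 is penalised much less than one scored 0.1
print(binary_crossentropy(1, 0.9), binary_crossentropy(1, 0.1))
```

Gradient descent is then the procedure that adjusts the weights step by step to make the total loss over the training events as small as possible.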



### Task
Earlier you developed an analysis for $H\rightarrow b\bar{b}$ using sequential cuts on 'dRBB', 'mBB', 'MET', 'Mtop', and 'pTV'. You will now build a neural network to classify the simulated events as signal or background.

After discussing neural network architectures with your mentor you will: 

* In the code cell below, define the architecture for your neural network. 
* Set the batch size used in training and the number of epochs in training. 

Once you have done this run the code to start training your neural network. The time taken to train the network will be printed at the end. 

In [None]:
start = time()
num_variables = len(variables)

# Define architecture 
def classifier():
    """
    Creates a model for higgs to bb classification
    
    returns: Keras model
    """
    
    model = Sequential()
    
    # The input layer
    model.add(Dense(units=num_variables,input_dim = num_variables,activation='relu'))
    
    # Add hidden layers here
    # ======================
    
    
    

    # Output layer
    model.add(Dense(1,activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='SGD', metrics=['accuracy'])
    return model


# Create and compile models
model_even = classifier()
model_odd = classifier()

# Set these parameters
# ====================
epochs = 1
batchSize = 5

# train
model_even.fit(x_even,y_even,sample_weight = w_even, epochs=epochs, batch_size=batchSize,verbose = 1)
model_odd.fit(x_odd,y_odd,sample_weight = w_odd, epochs=epochs, batch_size=batchSize,verbose = 1)

print("model trained in " + str(round(time()-start,2))+"s")

### Evaluating our Neural Network

The code below will now test your neural networks on unseen events. Remember that a neural network trained on the odd data set is tested on the even data set, and vice versa. The neural networks are used to classify the events, and the sensitivity is calculated from the neural network scores plotted in the histogram below.



In [None]:
# predict() returns the network score (the sigmoid output, between 0 and 1) for each event
df_odd['decision_value'] = model_even.predict(x_odd).ravel()
df_even['decision_value'] = model_odd.predict(x_even).ravel()
df = pd.concat([df_odd,df_even])
print("A sensitivity of", round(sensitivity_NN(df)[0],2),'was achieved')
nn_output_plot(df)

### Exercises:

* Build and evaluate a neural network with 3 'hidden' layers.
* Vary the number of epochs and the batch_size. 

Note down the sensitivity achieved and the time taken in training.


### Challenge
In your groups, you will now prototype your own neural network architecture. At the end of the event, the judges will award a prize to the group with the best neural network, based on the following criteria:
* Sensitivity achieved.
* Time taken in training. 
* Elegance of architecture (used in tie-break situations).

You can change any of the parameters in the function above. 

In [None]:
start = time()
num_variables = len(variables)

# Define architecture 
def classifier():
    """
    Creates a model for higgs to bb classification
    
    returns: Keras model
    """
    
    model = Sequential()
    
    # The input layer
    model.add(Dense(units=num_variables,input_dim = num_variables,activation='relu'))
    
    # Add hidden layers here
    # ======================
    
    
    
    
    # Output layer
    model.add(Dense(1,activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='SGD', metrics=['accuracy'])
    return model


# Create and compile models
model_even = classifier()
model_odd = classifier()


# Set these parameters
# ====================
epochs = 
batchSize = 


# train
model_even.fit(x_even,y_even,sample_weight = w_even, epochs=epochs, batch_size=batchSize,verbose = 1)
model_odd.fit(x_odd,y_odd,sample_weight = w_odd, epochs=epochs, batch_size=batchSize,verbose = 1)

print("model trained in " + str(round(time()-start,2))+"s")

In [None]:
## Evaluation Code
# predict() returns the network score (the sigmoid output, between 0 and 1) for each event
df_odd['decision_value'] = model_even.predict(x_odd).ravel()
df_even['decision_value'] = model_odd.predict(x_even).ravel()
df = pd.concat([df_odd,df_even])
print("A sensitivity of", round(sensitivity_NN(df)[0],2),'was achieved')
nn_output_plot(df)

## Summary 

Neural networks can learn correlations between variables, which improves the separation of signal from background and hence leads to better signal sensitivity. They are very popular in industry and regularly win competitions on the Kaggle machine learning platform. If you are interested in doing more machine learning, this is a free 'nanodegree' on deep learning: https://www.youtube.com/watch?v=vOppzHpvTiQ&list=PL2-dafEMk2A7YdKv4XfKpfbTH5z6rEEj3

**This material was produced by hackingEducation**  
<img src="../images/logo-black.png" width="50" align = 'left'/>
