# Finding the Mass of a W-boson from Four-momentum

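Given the four-momentum of the W boson, I am hoping the neural network will learn the formula $$M^2 = E^2 - p^2$$ where $p^2 = p_x^2 + p_y^2 + p_z^2$ and energy, momentum, and mass are all in the same (natural) units.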
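For reference, this is the formula the network is supposed to learn, written out directly in NumPy. It is only a minimal sketch: the function name `invariant_mass` and the example numbers are made up for illustration and are not taken from the data file.

```python
import numpy as np

def invariant_mass(px, py, pz, energy):
    """Return M = sqrt(E^2 - |p|^2) in natural units (c = 1)."""
    m_squared = energy**2 - (px**2 + py**2 + pz**2)
    # Rounding can push m_squared slightly below zero for near-massless
    # particles, so clip at zero before taking the square root.
    return np.sqrt(np.maximum(m_squared, 0.0))

# Made-up four-momentum chosen so the arithmetic is easy to follow:
# E = 5, p = (3, 0, 0) gives M^2 = 25 - 9 = 16.
print(invariant_mass(3.0, 0.0, 0.0, 5.0))  # prints 4.0
```

Because the function only uses NumPy arithmetic, it should also accept whole arrays or DataFrame columns, which makes it a handy reference to compare the network's predictions against.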

This time, instead of making it from scratch, I will try doing it using TensorFlow, which I've installed on my computer.

By the way, I've figured out what was wrong with my from-scratch neural network code. I used the sigmoid activation function, which guarantees an output between 0 and 1. This is why the output kept getting stuck at 1 no matter how many nodes we added. As such, this time I'll be using the ReLU activation function, whose output has no upper limit.
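The difference between the two activation functions is easy to see numerically. The sketch below uses plain NumPy rather than TensorFlow, and the input values are arbitrary test points chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid: squashes any real number into the range 0 to 1."""
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    """ReLU: zero for negative inputs, the input itself otherwise."""
    return np.maximum(0.0, z)

z = np.array([-100.0, -1.0, 0.0, 1.0, 100.0])  # arbitrary test inputs
print(sigmoid(z))  # never exceeds 1, however large the input
print(relu(z))     # grows without limit for positive inputs
```

A sigmoid output node can therefore never produce a mass of roughly 80 (in GeV) unless the targets are rescaled first, whereas a ReLU or linear output node can.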

Also FYI, while I did look at Adam's notebook on this project as well as the [TensorFlow tutorial](https://www.tensorflow.org/tutorials/keras/classification) on the official site, I primarily used this [YouTube video](https://www.youtube.com/watch?v=Edhv7-4t0lc&list=PLqnslRFeH2Uqfv1Vz3DqeQfy0w20ldbaV&index=3) to guide me on constructing the neural network.

PID of W-boson = -24

Inputting the data should be relatively similar (using the 100TeVDecay.csv file, as read in the cell below).

I assume this meets [high-level guiding principle](https://github.com/agree019/Machine-Learning-MT2/blob/working/Nathan%20Meetings/9.28.22.pdf) #2, because if you wanted the squared four-momentum components or the squared mass, you could always wrap px/py/pz/e/m in np.square (although I suppose that means changing 5 lines instead of 1).
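As a possible way around the "5 lines instead of 1" issue, `np.square` can be applied to a whole DataFrame at once. This is only a sketch: the two rows below are made up for illustration and the variable names `sample` and `squared` are hypothetical; only the column names follow the ones used in this notebook.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration only; not taken from the CSV file.
sample = pd.DataFrame({
    "Px": [3.0, -1.0],
    "Py": [0.0, 2.0],
    "Pz": [4.0, 2.0],
    "E": [13.0, 5.0],
    "M": [12.0, 4.0],
})

# One call squares every selected column element by element.
squared = np.square(sample[["Px", "Py", "Pz", "E", "M"]])
print(squared)
```

The result is still a DataFrame with the same column names, so it could be split into inputs and outputs the same way as the unsquared data.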

In [1]:
import pandas as pd
import math
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
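TeVData = pd.read_csv("100TeVDecay.csv")
# NOTE: this filter selects PID == -11, not the W-boson PID (-24) stated above;
# the printed output below corresponds to PID == -11.
selected = TeVData[TeVData.PID == -11]
dataInput = selected[['Px', 'Py', 'Pz', 'E']]  # network inputs: four-momentum components
print(dataInput)
dataOutput = selected[['M']]  # regression target: mass
print(len(dataInput))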

# px = np.asarray(TeVData[TeVData.PID == -24]['Px'])
# py = np.asarray(TeVData[TeVData.PID == -24]['Py'])
# pz = np.asarray(TeVData[TeVData.PID == -24]['Pz'])
# e = np.asarray(TeVData[TeVData.PID == -24]['E'])
# m = np.asarray(TeVData[TeVData.PID == -24]['M'])
# hold out 25% of the rows for testing and train on the remaining 75%
x_train, x_test, y_train, y_test = train_test_split(dataInput, dataOutput, test_size=0.25, train_size=0.75)

              Px         Py         Pz          E
1      34.082959   0.447484  21.329689  40.209501
16     18.124688 -34.795598   8.807158  40.209501
34    -30.133966 -11.240563 -24.132920  40.209501
52    -11.321488 -32.140145  21.345702  40.209501
55    -24.034663 -31.920715  -4.495212  40.209501
...          ...        ...        ...        ...
29932 -11.878805  15.684833 -35.066850  40.209501
29935 -22.737105 -24.291624  22.577534  40.209501
29959  30.230409  -1.883528  26.445769  40.209501
29962  25.366209   4.016472 -30.939091  40.209501
29992 -18.683008 -20.249168 -29.286864  40.209501

[2468 rows x 4 columns]
2468
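The printed rows can be plugged into $M^2 = E^2 - p^2$ as a quick sanity check on what the network is being asked to learn. The sketch below copies the first two rows of the table above by hand; the variable names are made up for illustration.

```python
import numpy as np

# First two rows of the printed table above: Px, Py, Pz, E
rows = np.array([
    [34.082959, 0.447484, 21.329689, 40.209501],
    [18.124688, -34.795598, 8.807158, 40.209501],
])

p_squared = np.sum(rows[:, :3] ** 2, axis=1)  # Px^2 + Py^2 + Pz^2 per row
m_squared = rows[:, 3] ** 2 - p_squared       # E^2 - p^2 per row
print(m_squared)
```

For these two rows the result is very close to zero, which is what one would expect for a nearly massless particle rather than for a W boson. This only checks the rows shown here, not the whole file, but it may be worth comparing against the `M` column before training.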


The next part imports TensorFlow and sets up the network topology (input nodes, number of hidden layers, number of nodes per hidden layer, output nodes), the loss function, the optimizer (this is where the learning rate goes; Adam has no separate momentum argument, but its `beta_1` parameter plays a similar role), and our metrics (what we want the program to print while the epochs are running, such as the fraction of samples it got correct during training/testing; the full list is on the [TensorFlow website](https://www.tensorflow.org/api_docs/python/tf/keras/metrics)).
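On the question of momentum in Adam: the optimizer keeps a running average of past gradients, controlled by `beta_1`, which acts much like a momentum term. The sketch below is the textbook form of a single Adam update written in NumPy, not Keras's actual source code; the function name `adam_step` is made up for illustration, and the default values are meant to mirror the documented Keras defaults.

```python
import numpy as np

def adam_step(param, grad, m, v, t,
              learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-7):
    """One textbook Adam update for step number t (starting at 1)."""
    m = beta_1 * m + (1.0 - beta_1) * grad       # running mean of gradients (momentum-like)
    v = beta_2 * v + (1.0 - beta_2) * grad**2    # running mean of squared gradients
    m_hat = m / (1.0 - beta_1**t)                # bias correction for the early steps
    v_hat = v / (1.0 - beta_2**t)
    param = param - learning_rate * m_hat / (np.sqrt(v_hat) + epsilon)
    return param, m, v

# Made-up numbers: one parameter at 1.0 with gradient 2.0, first step.
new_param, m, v = adam_step(1.0, 2.0, m=0.0, v=0.0, t=1, learning_rate=0.1)
print(new_param)
```

On the first step the update size is roughly the learning rate itself regardless of the gradient's magnitude, so a learning rate of 0.1 moves every weight by about 0.1 per step at the start of training.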

Hopefully this part covers [high-level guiding principle](https://github.com/agree019/Machine-Learning-MT2/blob/working/Nathan%20Meetings/9.28.22.pdf) #4 (changing the network "topology" easily). Principle #1 is also covered because we are using TensorFlow.

In [2]:
import tensorflow as tf
from tensorflow import keras
import numpy as np

#model
layers = []
model = tf.keras.Sequential()
layers.append(keras.layers.Dense(6, activation = 'relu', input_dim = 4)) #first hidden layer with 6 nodes; input_dim = 4 is the number of input nodes (Px, Py, Pz, E)
model.add(layers[0])
for i in range(1): #number of hidden layers - 1, can be changed
    layers.append(keras.layers.Dense(6, activation = 'relu'))
    model.add(layers[i + 1])
    #Note about hidden layers: this loop gives every hidden layer the same number of nodes (6 here).
    #If we want a different number of nodes in each hidden layer (for example hidden layer 1 has 5 nodes,
    #hidden layer 2 has 12 nodes, hidden layer 3 has 15 nodes), changing this one line is not enough;
    #the loop would have to run over a list of layer sizes instead of range(...).
layers.append(keras.layers.Dense(6, activation = 'relu'))
model.add(layers[-1])
#output layer: as written it has 6 ReLU nodes (see the summary below). Since we are just predicting
#the mass of the W-boson, a single output node (Dense(1)) would match the intent; the size can be changed here
print(model.summary()) #check if topology working correctly

#loss and optimizer
loss = keras.losses.MeanSquaredError() #using MSE for error calculation
optimizer = keras.optimizers.Adam(learning_rate = 0.1) #put in learning rate, using 0.1
metrics = ["accuracy"] #accuracy is a classification metric; for a regression like this one, a metric such as mean absolute error may be more informative
model.compile(loss = loss, optimizer = optimizer, metrics = metrics) #configure model for training

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 6)                 30        
                                                                 
 dense_1 (Dense)             (None, 6)                 42        
                                                                 
 dense_2 (Dense)             (None, 6)                 42        
                                                                 
Total params: 114
Trainable params: 114
Non-trainable params: 0
_________________________________________________________________
None
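Regarding the note in the code about giving each hidden layer a different number of nodes: one option is to loop over a list of layer sizes, so the whole topology is set by editing a single list. This is only a sketch under a few assumptions: `build_model`, `hidden_sizes`, `n_inputs`, and `n_outputs` are names made up for illustration, the sizes 5, 12, 15 are the example from the code comment, and the single linear output node is my assumption about what a mass regression would want rather than what the cell above does.

```python
from tensorflow import keras

def build_model(hidden_sizes, n_inputs=4, n_outputs=1):
    """Build a fully connected network with one Dense layer per entry in hidden_sizes."""
    model = keras.Sequential()
    model.add(keras.Input(shape=(n_inputs,)))  # 4 inputs: Px, Py, Pz, E
    for size in hidden_sizes:
        model.add(keras.layers.Dense(size, activation="relu"))
    # Assumption: no activation on the output, so any real value can be predicted.
    model.add(keras.layers.Dense(n_outputs))
    return model

# Changing the topology now means changing only this list.
sketch_model = build_model([5, 12, 15])
sketch_model.summary()
```

The summary should list three hidden layers of 5, 12, and 15 nodes followed by one output node, which makes it easy to confirm the topology matches the list.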


After that, we will be training the model.

In [6]:
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib
#training data
batch_size = 32 #batch size
epochs = 36 #epochs, can be adjusted
model.fit(x_train, y_train, batch_size = batch_size, epochs = epochs, shuffle = True, verbose = 2)
#verbose controls how much progress is printed: 0 is silent, 1 shows a progress bar, 2 prints one line per epoch

# print out weights/layers
# for i in layers:
#     print(i.get_weights())
print(layers[len(layers) - 1].output)

Epoch 1/36
58/58 - 0s - loss: 0.0000e+00 - accuracy: 1.0000 - 139ms/epoch - 2ms/step
Epoch 2/36
58/58 - 0s - loss: 0.0000e+00 - accuracy: 1.0000 - 140ms/epoch - 2ms/step
Epoch 3/36
58/58 - 0s - loss: 0.0000e+00 - accuracy: 1.0000 - 154ms/epoch - 3ms/step
Epoch 4/36
58/58 - 0s - loss: 0.0000e+00 - accuracy: 1.0000 - 152ms/epoch - 3ms/step
Epoch 5/36
58/58 - 0s - loss: 0.0000e+00 - accuracy: 1.0000 - 144ms/epoch - 2ms/step
Epoch 6/36
58/58 - 0s - loss: 0.0000e+00 - accuracy: 1.0000 - 150ms/epoch - 3ms/step
Epoch 7/36
58/58 - 0s - loss: 0.0000e+00 - accuracy: 1.0000 - 140ms/epoch - 2ms/step
Epoch 8/36
58/58 - 0s - loss: 0.0000e+00 - accuracy: 1.0000 - 150ms/epoch - 3ms/step
Epoch 9/36
58/58 - 0s - loss: 0.0000e+00 - accuracy: 1.0000 - 171ms/epoch - 3ms/step
Epoch 10/36
58/58 - 0s - loss: 0.0000e+00 - accuracy: 1.0000 - 156ms/epoch - 3ms/step
Epoch 11/36
58/58 - 0s - loss: 0.0000e+00 - accuracy: 1.0000 - 192ms/epoch - 3ms/step
Epoch 12/36
58/58 - 0s - loss: 0.0000e+00 - accuracy: 1.0000 - 

This part is testing the neural network.

In [4]:
model.evaluate(x_test, y_test, batch_size = batch_size, verbose = 2)

20/20 - 0s - loss: 0.0000e+00 - accuracy: 1.0000 - 312ms/epoch - 16ms/step


[0.0, 1.0]
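The two numbers returned by `model.evaluate` are the loss (MSE) and the accuracy metric. One way to cross-check a reported loss independently of Keras is to compute the MSE by hand and compare it with a trivial baseline that always predicts the mean target. The sketch below uses placeholder numbers that are not results from this notebook, and the function name `mean_squared_error` is my own.

```python
import numpy as np

def mean_squared_error(targets, predictions):
    """Average of the squared differences between targets and predictions."""
    targets = np.asarray(targets, dtype=float)
    predictions = np.asarray(predictions, dtype=float)
    return float(np.mean((targets - predictions) ** 2))

# Placeholder numbers for illustration only; they are not results from this notebook.
targets = np.array([80.0, 81.0, 79.0, 80.0])
predictions = np.array([80.5, 80.0, 79.5, 80.0])

model_mse = mean_squared_error(targets, predictions)
# Baseline: always predict the mean of the targets, ignoring the inputs.
baseline_mse = mean_squared_error(targets, np.full_like(targets, targets.mean()))
print(model_mse, baseline_mse)
```

If the network's MSE is not clearly below the baseline, or if both are exactly zero because every target is the same value, the loss alone says little about whether the formula was learned.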