Neural Network Model

Import the libraries and modules needed for data manipulation, neural network modeling, and visualization.

In [2]:
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
import time
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import classification_report
from sklearn.metrics import ConfusionMatrixDisplay
from sklearn import preprocessing



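Load the dataset from the specified file path, then one-hot encode any categorical columns with pd.get_dummies. Columns that are already numeric pass through unchanged.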

In [3]:
data = pd.read_csv("../Dataset/higgs10k.csv", index_col=False)
data_encoded = pd.get_dummies(data, drop_first=False)
print(data_encoded)

      process type  lepton  pT  lepton  eta  lepton  phi  \
0              1.0    0.748873    -0.874678    -1.194860   
1              0.0    1.630977     0.634973    -0.305970   
2              1.0    0.299586     1.489144     0.709529   
3              1.0    0.455144    -0.117904     1.524621   
4              1.0    0.540243    -0.597097    -0.997884   
...            ...         ...          ...          ...   
9995           1.0    0.774678     1.220329     1.540712   
9996           1.0    1.155702     0.395377    -0.800909   
9997           0.0    1.578819     1.777439    -0.199436   
9998           0.0    0.348449     2.227412    -0.784263   
9999           0.0    0.503824    -0.240624     0.396588   

      missing energy magnitude  missing energy phi  jet 1 pt  jet 1 eta  \
0                     0.766419           -0.553490  0.394003   1.605137   
1                     0.215144           -0.452114  0.550285   0.697096   
2                     1.582114           -1.352004  0.

Extract the target variable 'process type' and store it as 'y'. Store the remaining columns as 'x'. Then split the data into training and testing sets using an 80-20 split with a fixed random seed.


In [4]:
y = (data_encoded['process type']).values
x = (data_encoded.drop(columns=['process type'])).values
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)
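
As a rough illustration of what the 80-20 split does, the sketch below shuffles row indices with NumPy and cuts them at the 20% mark. The arrays x_demo and y_demo and the helper simple_split are hypothetical stand-ins, and the shuffle order will not match the one train_test_split produces for random_state=0.

```python
import numpy as np

def simple_split(x, y, test_size=0.2, seed=0):
    """Shuffle rows and split them into train and test parts (illustrative only)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))            # random ordering of row indices
    n_test = int(round(len(x) * test_size))  # number of rows held out for testing
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return x[train_idx], x[test_idx], y[train_idx], y[test_idx]

# Hypothetical data with 10000 rows, the same row count as the printed dataset
x_demo = np.arange(10000 * 3, dtype=float).reshape(10000, 3)
y_demo = np.arange(10000) % 2

x_tr, x_te, y_tr, y_te = simple_split(x_demo, y_demo)
print(x_tr.shape, x_te.shape)  # (8000, 3) (2000, 3)
```

The 2000 test rows agree with the total support reported for the test set later in the notebook.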

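Use the StandardScaler to standardize each feature to mean=0 and variance=1 for better performance during modeling. The scaler is fit on the training set only and then applied to both sets, so no information from the test set leaks into preprocessing.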

In [5]:
scaler = preprocessing.StandardScaler().fit(x_train)
x_train = scaler.transform(x_train)
x_test = scaler.transform(x_test)
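
To make the scaling step concrete, the following sketch reproduces the standardization by hand. The per-column mean and standard deviation are estimated on the training rows only and then reused for the test rows. The small arrays here are hypothetical examples, not values from the dataset.

```python
import numpy as np

# Hypothetical training and test features (rows = events, columns = features)
train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
test = np.array([[2.5, 25.0], [5.0, 50.0]])

# Statistics come from the training set only
mean = train.mean(axis=0)
std = train.std(axis=0)  # population standard deviation (ddof=0), as StandardScaler uses

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std

print(train_scaled.mean(axis=0))  # approximately 0 for each column
print(train_scaled.std(axis=0))   # approximately 1 for each column
print(test_scaled[0])             # a test row equal to the training mean maps to 0
```

The scaled test set will generally not have exactly zero mean or unit variance, because it is transformed with the training statistics.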

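Define the neural network model and the hyperparameter grid to search. The base model is an MLPClassifier with a maximum of 1000 iterations and early stopping enabled. The grid varies the hidden layer sizes, activation function, solver, L2 regularization strength (alpha), and learning rate schedule.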


In [6]:
nn = MLPClassifier(max_iter=1000, random_state=1, early_stopping=True)
parameter_space = {
    'hidden_layer_sizes': [(100, 50, 30), (20,), (5, 2), (100, 50), (50, 30, 20)],
    'activation': ['tanh', 'relu', 'logistic'],
    'solver': ['sgd', 'adam'],
    'alpha': [0.0001, 0.001, 0.005, 0.01, 0.05],
    'learning_rate': ['constant', 'adaptive'],
}
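
With early_stopping=True, scikit-learn holds out part of the training data as a validation set (10% by default) and stops training once the validation score has failed to improve by at least tol for n_iter_no_change consecutive epochs. The patience rule can be sketched in plain Python as follows. The function stopping_epoch and the score list are hypothetical, and the sketch simplifies the library's logic rather than copying it.

```python
def stopping_epoch(val_scores, tol=1e-4, patience=10):
    """Return the 1-based epoch at which training would stop, or None if it never stops."""
    best = float("-inf")
    stale = 0  # consecutive epochs without sufficient improvement
    for epoch, score in enumerate(val_scores, start=1):
        if score > best + tol:
            stale = 0          # meaningful improvement resets the counter
        else:
            stale += 1
        best = max(best, score)
        if stale > patience:
            return epoch
    return None

# Hypothetical validation accuracies, one per epoch
scores = [0.60, 0.62, 0.63, 0.63, 0.62, 0.63]
print(stopping_epoch(scores, patience=2))  # 6
print(stopping_epoch(scores))              # None (default patience of 10 is never exceeded)
```

Because of this rule, max_iter=1000 acts as an upper bound, and most fits are likely to stop well before it.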

Use GridSearchCV to search for the best hyperparameters by performing 10-fold cross-validation on the training set.

In [7]:
# Measure time for training
start_time = time.time()

clf = GridSearchCV(nn, parameter_space, n_jobs=-1, cv=10)
clf.fit(x_train, y_train)

# Training time
elapsed_time = time.time() - start_time
print(f"Training completed in {elapsed_time:.2f} seconds")


Training completed in 2244.89 seconds
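
The long run time follows from the size of the search. As a quick check, the sketch below counts the hyperparameter combinations in the grid and multiplies by the number of cross-validation folds. The names n_candidates and n_fits are introduced here for illustration.

```python
from math import prod

# Same grid as defined for the search
parameter_space = {
    'hidden_layer_sizes': [(100, 50, 30), (20,), (5, 2), (100, 50), (50, 30, 20)],
    'activation': ['tanh', 'relu', 'logistic'],
    'solver': ['sgd', 'adam'],
    'alpha': [0.0001, 0.001, 0.005, 0.01, 0.05],
    'learning_rate': ['constant', 'adaptive'],
}

# Every combination of values is one candidate model
n_candidates = prod(len(values) for values in parameter_space.values())
cv_folds = 10
n_fits = n_candidates * cv_folds  # each candidate is trained once per fold

print(n_candidates)  # 300
print(n_fits)        # 3000
```

GridSearchCV also refits the best candidate on the full training set by default, so the total is 3001 fits. That works out to roughly 0.75 seconds of wall-clock time per fit on average, spread across the cores made available by n_jobs=-1.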


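Display the best hyperparameters found by GridSearchCV, followed by the mean cross-validation accuracy (plus or minus two standard deviations) for every combination tried.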

In [8]:
print('Best parameters found:\n', clf.best_params_)
means = clf.cv_results_['mean_test_score']
stds = clf.cv_results_['std_test_score']
for mean, std, params in zip(means, stds, clf.cv_results_['params']):
    print("%0.3f (+/-%0.03f) for %r" % (mean, std * 2, params))

Best parameters found:
 {'activation': 'tanh', 'alpha': 0.005, 'hidden_layer_sizes': (100, 50, 30), 'learning_rate': 'constant', 'solver': 'adam'}
0.623 (+/-0.043) for {'activation': 'tanh', 'alpha': 0.0001, 'hidden_layer_sizes': (100, 50, 30), 'learning_rate': 'constant', 'solver': 'sgd'}
0.667 (+/-0.051) for {'activation': 'tanh', 'alpha': 0.0001, 'hidden_layer_sizes': (100, 50, 30), 'learning_rate': 'constant', 'solver': 'adam'}
0.623 (+/-0.044) for {'activation': 'tanh', 'alpha': 0.0001, 'hidden_layer_sizes': (100, 50, 30), 'learning_rate': 'adaptive', 'solver': 'sgd'}
0.667 (+/-0.051) for {'activation': 'tanh', 'alpha': 0.0001, 'hidden_layer_sizes': (100, 50, 30), 'learning_rate': 'adaptive', 'solver': 'adam'}
0.599 (+/-0.032) for {'activation': 'tanh', 'alpha': 0.0001, 'hidden_layer_sizes': (20,), 'learning_rate': 'constant', 'solver': 'sgd'}
0.636 (+/-0.039) for {'activation': 'tanh', 'alpha': 0.0001, 'hidden_layer_sizes': (20,), 'learning_rate': 'constant', 'solver': 'adam'}
0.

Use the trained model to make predictions on the test set and print a classification report and accuracy score.

In [9]:
y_true, y_pred = y_test, clf.predict(x_test)
print('Results on the test set:')
print(classification_report(y_true, y_pred))
print('Accuracy: ', clf.score(x_test, y_test))

Results on the test set:
              precision    recall  f1-score   support

         0.0       0.65      0.62      0.64       950
         1.0       0.67      0.69      0.68      1050

    accuracy                           0.66      2000
   macro avg       0.66      0.66      0.66      2000
weighted avg       0.66      0.66      0.66      2000

Accuracy:  0.6605
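
To show how the columns of the classification report relate to each other, the sketch below computes precision, recall, F1 and accuracy from a confusion matrix. The counts are hypothetical and are not the model's actual confusion matrix. The helpers per_class_metrics and accuracy are illustrative names, and rows are taken to be true classes and columns predicted classes, matching scikit-learn's convention.

```python
def per_class_metrics(cm):
    """Compute (precision, recall, f1) per class from a square confusion matrix.

    cm[i][j] is the number of samples with true class i predicted as class j.
    """
    n = len(cm)
    metrics = []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))  # column sum
        actual_k = sum(cm[k])                          # row sum (the "support")
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f1))
    return metrics

def accuracy(cm):
    """Fraction of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in cm)
    return sum(cm[k][k] for k in range(len(cm))) / total

# Hypothetical counts: 100 true samples of each class
cm_demo = [[60, 40],
           [30, 70]]

for label, (p, r, f) in enumerate(per_class_metrics(cm_demo)):
    print(f"class {label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
print(f"accuracy={accuracy(cm_demo):.2f}")
```

In the report above, the macro average is the unweighted mean of the per-class values, while the weighted average weights each class by its support.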


Plot a confusion matrix to visualize the performance of the model on different classes in the test set.

In [None]:
# Draw the confusion matrix for the test set from the fitted estimator
disp = ConfusionMatrixDisplay.from_estimator(clf, x_test, y_test, display_labels=clf.classes_)
disp.figure_.suptitle("Confusion Matrix")
plt.show()