<a href="https://colab.research.google.com/github/ppajewski/ANN-Students-Performance-Prediction/blob/master/Student_Performance_Prediction_with_ANN_with_Classification.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Artificial Neural Network

In this notebook we are going to train ANN model to predict students grades. Description of the data may be found at:
https://archive.ics.uci.edu/ml/datasets/student+performance

### Importing the libraries

In [2]:
import numpy as np
import pandas as pd
import tensorflow as tf

## Part 1 - Data Preprocessing

### Importing the dataset

In [3]:
dataset = pd.read_csv('student-mat.csv', sep=';')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values

Let's say that we want to predict the grade that student gets. We divide points into 5 categories:

Grade | Points needed for the grade 
------ | -------
0 | $0 \leq x < 4$
1 | $4 \leq x < 8$
2 | $8 \leq x < 12$
3 | $12 \leq x < 16$
4 | $16 \leq x $

In [4]:
for i in range(y.shape[0]):
  if y[i] < 4:
    y[i] = 0
  if 4 <= y[i] < 8:
    y[i] = 1
  if 8 <= y[i] < 12:
    y[i] = 2
  if 12 <= y[i] <= 16:
    y[i] = 3
  if 16 <= y[i]:
    y[i] = 4

Checking for missing values

In [5]:
print(dataset.isna().sum())

school        0
sex           0
age           0
address       0
famsize       0
Pstatus       0
Medu          0
Fedu          0
Mjob          0
Fjob          0
reason        0
guardian      0
traveltime    0
studytime     0
failures      0
schoolsup     0
famsup        0
paid          0
activities    0
nursery       0
higher        0
internet      0
romantic      0
famrel        0
freetime      0
goout         0
Dalc          0
Walc          0
health        0
absences      0
G1            0
G2            0
G3            0
dtype: int64


No values missing

### Encoding categorical data

Label Encoding the binaray categories

In [6]:
from sklearn.preprocessing import LabelEncoder
r = [0, 1, 3, 4, 5]
le = LabelEncoder()
for i in r:  
  X[:, i] = le.fit_transform(X[:, i]) 
for i in range(15, 23, 1):
  X[:, i] = le.fit_transform(X[:, i]) 

In [7]:
from keras.utils import to_categorical
y = to_categorical(y, num_classes=5)

One Hot Encoding the text columns

In [8]:
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [8, 9, 10, 11])], remainder='passthrough')
X = np.array(ct.fit_transform(X))
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [0])], remainder='passthrough')

### Splitting the dataset into the Training set and Test set

In [9]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2)

### Feature Scaling

In [10]:
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

## Part 2 - Building the ANN

### Initializing the ANN

In [11]:
ann = tf.keras.models.Sequential()

### Adding the input layer and the first hidden layer

In [12]:
from keras.layers import Dropout
ann.add(tf.keras.layers.Dense(units=40, activation='relu'))
ann.add(Dropout(0.2, input_shape=(40,)))

  super().__init__(**kwargs)


### Adding the second hidden layer

In [13]:
ann.add(tf.keras.layers.Dense(units=100, activation='relu'))
ann.add(Dropout(0.2, input_shape=(100,)))
ann.add(tf.keras.layers.Dense(units=60, activation='relu'))
ann.add(Dropout(0.2, input_shape=(60,)))

### Adding the output layer

In [14]:
ann.add(tf.keras.layers.Dense(units=5, activation='softmax'))

Number of units in the output layer corresponds to number of categories to be predicted - here we have grades from 0 to 4 so we need 5 units.

## Part 3 - Training the ANN

### Compiling the ANN

In [15]:
ann.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['categorical_accuracy'])

For binary outcome we must use 'binary_crossentropy' for the loss function. \
For non-binary outcome we use 'categorical_crossentropy'

### Training the ANN on the Training set

Batch learning is alwys more efficent - it takes batches of predictions and compares them to batches of results rather than comparing one by one. 32 is recomended, default value (we can fine-tune it to the problem as well). 

In [25]:
ann.fit(X_train, y_train, batch_size = 32, epochs = 2000,verbose=0)

<keras.src.callbacks.history.History at 0x1e1ac189bb0>

## Part 4 - Making the predictions and evaluating the model

### Making the Confusion Matrix

In [26]:
from sklearn.metrics import confusion_matrix, accuracy_score
from sklearn.metrics import multilabel_confusion_matrix
y_pred = ann.predict(X_test)
y_pred = (y_pred > 0.5)
cm = multilabel_confusion_matrix(y_pred, y_test)
print(cm)
print(f"Accuracy score: {accuracy_score(y_test, y_pred)*100}%")

[1m3/3[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 3ms/step 
[[[68  8]
  [ 0  3]]

 [[70  2]
  [ 5  2]]

 [[39  2]
  [ 9 29]]

 [[45  3]
  [ 5 26]]

 [[75  4]
  [ 0  0]]]
Accuracy score: 75.9493670886076%


# MLP

In [18]:
from sklearn.neural_network import MLPRegressor

regr = MLPRegressor(random_state=1, max_iter=500).fit(X_train, y_train)


In [19]:
regr.predict(X_test[:2])


array([[-0.05169965,  0.5942192 ,  0.00747026,  0.27865202, -0.10819261],
       [ 0.04216736,  0.45824339,  0.86807028,  0.08590825, -0.06359042]])

In [20]:
regr.score(X_test, y_test)

-0.5181777190999677