### Problem Statement
You are a data scientist working for a school.

You are asked to predict the GPA of the current students based on the following provided data:

 0   StudentID  int64  
 1   Age    int64  
 2   Gender int64  
 3   Ethnicity  int64  
 4   ParentalEducation  int64  
 5   StudyTimeWeekly    float64
 6   Absences   int64
 7   Tutoring   int64  
 8   ParentalSupport    int64  
 9   Extracurricular    int64  
 10  Sports int64  
 11  Music  int64  
 12  Volunteering   int64  
 13  GPA    float64
 14  GradeClass float64

The GPA is the Grade Point Average, which typically ranges from 0.0 to 4.0 in most educational systems, with 4.0 representing an 'A' or excellent performance.

The minimum passing GPA can vary by institution, but it's often around 2.0. This usually corresponds to a 'C' grade, which is considered satisfactory.

You need to create a Deep Learning model capable of predicting a student's GPA from a set of provided features.
The provided data covers 2,392 students.

In this exercise you will be asked to create a total of three models and select the best-performing one.


In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Flatten
from tensorflow.keras.regularizers import l2
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from google.colab import drive
from sklearn.metrics import mean_squared_error, r2_score

In [2]:
drive.mount('/content/drive')
data = pd.read_csv('/content/drive/MyDrive/Student_performance_data _.csv')
data

Mounted at /content/drive


Unnamed: 0,StudentID,Age,Gender,Ethnicity,ParentalEducation,StudyTimeWeekly,Absences,Tutoring,ParentalSupport,Extracurricular,Sports,Music,Volunteering,GPA,GradeClass
0,1001,17,1,0,2,19.833723,7,1,2,0,0,1,0,2.929196,2.0
1,1002,18,0,0,1,15.408756,0,0,1,0,0,0,0,3.042915,1.0
2,1003,15,0,2,3,4.210570,26,0,2,0,0,0,0,0.112602,4.0
3,1004,17,1,0,3,10.028829,14,0,3,1,0,0,0,2.054218,3.0
4,1005,17,1,0,2,4.672495,17,1,3,0,0,0,0,1.288061,4.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2387,3388,18,1,0,3,10.680555,2,0,4,1,0,0,0,3.455509,0.0
2388,3389,17,0,0,1,7.583217,4,1,4,0,1,0,0,3.279150,4.0
2389,3390,16,1,0,2,6.805500,20,0,2,0,0,0,1,1.142333,2.0
2390,3391,16,1,1,0,12.416653,17,0,2,0,1,1,0,1.803297,1.0


In [3]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2392 entries, 0 to 2391
Data columns (total 15 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   StudentID          2392 non-null   int64  
 1   Age                2392 non-null   int64  
 2   Gender             2392 non-null   int64  
 3   Ethnicity          2392 non-null   int64  
 4   ParentalEducation  2392 non-null   int64  
 5   StudyTimeWeekly    2392 non-null   float64
 6   Absences           2392 non-null   int64  
 7   Tutoring           2392 non-null   int64  
 8   ParentalSupport    2392 non-null   int64  
 9   Extracurricular    2392 non-null   int64  
 10  Sports             2392 non-null   int64  
 11  Music              2392 non-null   int64  
 12  Volunteering       2392 non-null   int64  
 13  GPA                2392 non-null   float64
 14  GradeClass         2392 non-null   float64
dtypes: float64(3), int64(12)
memory usage: 280.4 KB


In [4]:
dataset = data.drop(columns = ['Gender', 'Ethnicity', 'StudentID'])
dataset

Unnamed: 0,Age,ParentalEducation,StudyTimeWeekly,Absences,Tutoring,ParentalSupport,Extracurricular,Sports,Music,Volunteering,GPA,GradeClass
0,17,2,19.833723,7,1,2,0,0,1,0,2.929196,2.0
1,18,1,15.408756,0,0,1,0,0,0,0,3.042915,1.0
2,15,3,4.210570,26,0,2,0,0,0,0,0.112602,4.0
3,17,3,10.028829,14,0,3,1,0,0,0,2.054218,3.0
4,17,2,4.672495,17,1,3,0,0,0,0,1.288061,4.0
...,...,...,...,...,...,...,...,...,...,...,...,...
2387,18,3,10.680555,2,0,4,1,0,0,0,3.455509,0.0
2388,17,1,7.583217,4,1,4,0,1,0,0,3.279150,4.0
2389,16,2,6.805500,20,0,2,0,0,0,1,1.142333,2.0
2390,16,0,12.416653,17,0,2,0,1,1,0,1.803297,1.0
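
Before deciding which of the remaining columns to feed the model, it can be worth checking how strongly each candidate feature tracks the target. The sketch below is illustrative only: it uses just the nine rows visible in the preview above, not the full dataset, and the helper name `pearson_r` is our own. A strong correlation between `GradeClass` and `GPA` on the full data would suggest that `GradeClass` is derived from the grade itself and may leak the target.

```python
import numpy as np

# GPA and GradeClass values copied from the nine rows shown in the preview above.
gpa = np.array([2.929196, 3.042915, 0.112602, 2.054218, 1.288061,
                3.455509, 3.279150, 1.142333, 1.803297])
grade_class = np.array([2.0, 1.0, 4.0, 3.0, 4.0, 0.0, 4.0, 2.0, 1.0])


def pearson_r(a, b):
    """Pearson correlation coefficient between two 1-D arrays (hypothetical helper)."""
    a_centered = a - a.mean()
    b_centered = b - b.mean()
    denom = np.sqrt((a_centered ** 2).sum() * (b_centered ** 2).sum())
    return float((a_centered * b_centered).sum() / denom)


r = pearson_r(gpa, grade_class)
print(f"Pearson r between GPA and GradeClass (9 preview rows): {r:.3f}")
```

On the full DataFrame the equivalent check is `dataset.corr()['GPA']`, which lists the correlation of every column with the target at once.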


In [11]:
# Only the imports that were not already loaded in cell 1
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras import regularizers

# Separate features and target.
# Note: GradeClass stays among the features here. It looks like a letter-grade
# category; if it is derived from GPA (an assumption to verify), it leaks the target.
X = dataset.drop(columns='GPA')
y = dataset['GPA']

# Split first, then fit the scaler on the training portion only, so that
# test-set statistics do not influence the preprocessing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)



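def build_model(layers, dropout_rate=0, batch_norm=False, learning_rate=0.0001, l2_lambda=0.01):
    """Build and compile a fully connected regression network.

    layers: hidden-layer sizes, e.g. [64, 32, 16]; every Dense layer gets an L2 penalty.
    """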
    model = Sequential()
    for i, units in enumerate(layers):
        if i == 0:
            model.add(Dense(units, activation='relu', input_shape=(X_train.shape[1],),
                            kernel_regularizer=regularizers.l2(l2_lambda)))
        else:
            model.add(Dense(units, activation='relu',
                            kernel_regularizer=regularizers.l2(l2_lambda)))
        if batch_norm:
            model.add(BatchNormalization())
        if dropout_rate > 0:
            model.add(Dropout(dropout_rate))
    model.add(Dense(1, kernel_regularizer=regularizers.l2(l2_lambda)))  # Output layer with L2
    optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)
    model.compile(optimizer=optimizer, loss='mean_squared_error', metrics=['mae', 'mse', 'mape'])
    return model

# Experiment 1: Single Dense Hidden Layer
model_1 = build_model([64])
history_1 = model_1.fit(X_train, y_train, epochs=500, validation_split=0.2, verbose=1)

# Experiment 2: Three Dense Hidden Layers
model_2 = build_model([64, 32, 16])
history_2 = model_2.fit(X_train, y_train, epochs=500, validation_split=0.2, verbose=1)

# Experiment 3: Three Dense Layers with Dropout
model_3 = build_model([64, 32, 16], dropout_rate=0.2)
history_3 = model_3.fit(X_train, y_train, epochs=500, validation_split=0.2, verbose=1)

# Experiment 4: Three Dense Layers with Dropout and Batch Normalization
model_4 = build_model([64, 32, 16], dropout_rate=0.2, batch_norm=True)
history_4 = model_4.fit(X_train, y_train, epochs=500, validation_split=0.2, verbose=1)

# Evaluate each model once per split; evaluate() returns [loss, mae, mse, mape]
models = {
    'Single Dense Layer': model_1,
    'Three Dense Layers': model_2,
    'Three Layers + Dropout': model_3,
    'Three Layers + Dropout + BatchNorm': model_4,
}
results = []
for name, model in models.items():
    _, mae_train, mse_train, mape_train = model.evaluate(X_train, y_train, verbose=0)
    _, mae_test, mse_test, mape_test = model.evaluate(X_test, y_test, verbose=0)
    results.append({
        'Experiment': name,
        'MAE (Train)': mae_train, 'MAE (Test)': mae_test,
        'MSE (Train)': mse_train, 'MSE (Test)': mse_test,
        'MAPE (Train)': mape_train, 'MAPE (Test)': mape_test,
    })

# Create comparative table
results_df = pd.DataFrame(results)

# Save table as CSV
results_df.to_csv('comparative_table.csv', index=False)

# Display table
print(results_df)


Epoch 1/500

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)

48/48 ━━━━━━━━━━━━━━━━━━━━ 3s 29ms/step - loss: 2.8074 - mae: 1.4279 - mape: 4338015.5000 - mse: 2.5938 - val_loss: 2.6733 - val_mae: 1.3855 - val_mape: 3690253.2500 - val_mse: 2.4593
Epoch 2/500
48/48 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - loss: 2.5145 - mae: 1.3448 - mape: 3025514.0000 - mse: 2.3005 - val_loss: 2.3047 - val_mae: 1.2724 - val_mape: 4289132.0000 - val_mse: 2.0901
Epoch 3/500
48/48 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 2.1272 - mae: 1.2126 - mape: 4275583.0000 - mse: 1.9124 - val_loss: 1.9876 - val_mae: 1.1674 - val_mape: 4846182.5000 - val_mse: 1.7722
Epoch 4/500
48/48 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 1.8078 - mae: 1.1033 - mape: 5273133.5000 - mse: 1.5922 - val_loss: 1.7178 - val_mae: 1.0719 - val_mape: 5343704.0000 - val_mse: 1.5015
Epoch 5/500
48/48 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 1.

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)

48/48 ━━━━━━━━━━━━━━━━━━━━ 2s 9ms/step - loss: 4.1920 - mae: 1.5801 - mape: 2708411.7500 - mse: 3.3278 - val_loss: 3.7012 - val_mae: 1.4310 - val_mape: 1758731.7500 - val_mse: 2.8383
Epoch 2/500
48/48 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 3.3087 - mae: 1.3098 - mape: 3952216.0000 - mse: 2.4461 - val_loss: 2.8906 - val_mae: 1.1803 - val_mape: 3668889.7500 - val_mse: 2.0287
Epoch 3/500
48/48 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 2.5304 - mae: 1.0581 - mape: 2723914.2500 - mse: 1.6688 - val_loss: 2.1842 - val_mae: 0.9515 - val_mape: 5354301.5000 - val_mse: 1.3234
Epoch 4/500
48/48 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 1.9543 - mae: 0.8525 - mape: 6727165.5000 - mse: 1.0939 - val_loss: 1.7090 - val_mae: 0.7734 - val_mape: 6540108.0000 - val_mse: 0.8502
Epoch 5/500
48/48 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 1.5

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)

48/48 ━━━━━━━━━━━━━━━━━━━━ 3s 10ms/step - loss: 5.0934 - mae: 1.8088 - mape: 1463899.0000 - mse: 4.2454 - val_loss: 4.7379 - val_mae: 1.7341 - val_mape: 869311.4375 - val_mse: 3.8911
Epoch 2/500
48/48 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 4.5589 - mae: 1.6866 - mape: 1108372.2500 - mse: 3.7124 - val_loss: 4.0553 - val_mae: 1.5511 - val_mape: 2491454.7500 - val_mse: 3.2096
Epoch 3/500
48/48 ━━━━━━━━━━━━━━━━━━━━ 1s 7ms/step - loss: 3.7238 - mae: 1.4499 - mape: 2706802.2500 - mse: 2.8783 - val_loss: 3.4171 - val_mae: 1.3692 - val_mape: 4292183.5000 - val_mse: 2.5720
Epoch 4/500
48/48 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - loss: 3.4362 - mae: 1.3609 - mape: 3783178.5000 - mse: 2.5913 - val_loss: 2.8479 - val_mae: 1.1970 - val_mape: 5860305.0000 - val_mse: 2.0037
Epoch 5/500
48/48 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step - loss: 2.9

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)

Epoch 1/500
48/48 ━━━━━━━━━━━━━━━━━━━━ 6s 17ms/step - loss: 9.4997 - mae: 2.3396 - mape: 10940247.0000 - mse: 8.6539 - val_loss: 6.1699 - val_mae: 2.0298 - val_mape: 1838247.6250 - val_mse: 5.3247
Epoch 2/500
48/48 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 8.3235 - mae: 2.1607 - mape: 7004120.5000 - mse: 7.4784 - val_loss: 6.1711 - val_mae: 2.0436 - val_mape: 574682.0625 - val_mse: 5.3266
Epoch 3/500
48/48 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step - loss: 7.6726 - mae: 2.1131 - mape: 10924151.0000 - mse: 6.8282 - val_loss: 5.9992 - val_mae: 2.0260 - val_mape: 1542745.1250 - val_mse: 5.1553
Epoch 4/500
48/48 ━━━━━━━━━━━━━━━━━━━━ 1s 9ms/step - loss: 7.0514 - mae: 2.0078 - mape: 3954158.5000 - mse: 6.2077 - val_loss: 5.6667 - val_mae: 1.9745 - val_mape: 3415551.2500 - val_mse: 4.8235
Epoch 5/500
48/48 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/st
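
The `mape` values in the logs above run into the millions even while `mae` stays below 3. A plausible explanation, sketched here with made-up numbers, is that MAPE divides each error by the true value, so any student whose GPA is at or near 0 dominates the average. The arrays and the helper names `mae` and `mape` are hypothetical, and the `1e-7` floor is meant to mirror the small epsilon Keras uses to avoid dividing by zero, which is an assumption worth checking against your Keras version.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))


def mape(y_true, y_pred, eps=1e-7):
    """Mean absolute percentage error with a small floor on the denominator."""
    denom = np.maximum(np.abs(y_true), eps)
    return float(100.0 * np.mean(np.abs(y_true - y_pred) / denom))


# Hypothetical GPAs: the last student has a true GPA of exactly 0.
y_true = np.array([3.0, 2.0, 1.0, 0.0])
y_pred = np.array([2.8, 2.1, 1.2, 0.3])

print("MAE                    :", mae(y_true, y_pred))
print("MAPE (all students)    :", mape(y_true, y_pred))
print("MAPE (zero GPA dropped):", mape(y_true[:-1], y_pred[:-1]))
```

A single zero-GPA row is enough to push MAPE into the millions while MAE stays small, so MAE and MSE are the more trustworthy columns when comparing these models.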

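As we can see, our MSE value is low relative to the 0.0 to 4.0 GPA scale, which suggests the models predict the actual values with a small error. Note that the reported `loss` is higher than `mse` because it also includes the L2 penalty, so the models should be compared on `mse` and `mae` rather than on `loss`.

All four models performed adequately; the one with the lowest test MSE in the comparative table should be selected as the best-performing model.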
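
Whether an MSE counts as low depends on the spread of the target, so a useful sanity check is to compare it with a trivial baseline that always predicts the training-set mean GPA, and to report R² alongside it. The sketch below uses synthetic stand-in arrays, not the notebook's data or results; in the notebook, `y_test` and `model.predict(X_test)` would take their place, and the already-imported `r2_score` from scikit-learn computes the same R² quantity. The helper names `mse` and `r2` are our own.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumptions, not the notebook's data).
y_train = rng.uniform(0.0, 4.0, size=200)
y_test = rng.uniform(0.0, 4.0, size=50)
y_pred = y_test + rng.normal(0.0, 0.2, size=50)  # pretend model predictions


def mse(y_true, y_hat):
    """Mean squared error."""
    return float(np.mean((y_true - y_hat) ** 2))


def r2(y_true, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_hat) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Baseline: always predict the mean GPA seen during training.
baseline_pred = np.full_like(y_test, y_train.mean())

model_mse = mse(y_test, y_pred)
baseline_mse = mse(y_test, baseline_pred)
model_r2 = r2(y_test, y_pred)

print(f"model MSE    : {model_mse:.4f}")
print(f"baseline MSE : {baseline_mse:.4f}")
print(f"model R^2    : {model_r2:.4f}")
```

A model is only worth keeping if its test MSE is clearly below the baseline MSE; an R² close to 1 says the same thing on a scale that does not depend on the units of the target.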