<img src="https://news.illinois.edu/files/6367/543635/116641.jpg" alt="University of Illinois" width="250"/>

# Project 22: UIUC GPA

## Team Members
* Yiping Li - [yipingl4@illinois.edu](mailto:yipingl4@illinois.edu)
* Leo Yang - [junjiey3@illinois.edu](mailto:junjiey3@illinois.edu)
* Shijie Sun - [shijies5@illinois.edu](mailto:shijies5@illinois.edu)
* Richwell Perez - [richwell@illinois.edu](mailto:richwell@illinois.edu)

## Problem Summary
The purpose of this project is to apply deep learning concepts and techniques to a real dataset: UIUC GPA. The central question that deep learning is applied to is predicting the GPA/grade distribution of future UIUC courses. The project will provide some visualization of the data and descriptive statistics, and will implement linear or logistic regression and recurrent neural networks.

## License
Dataset is obtained from Professor Ulmschneider's uiuc-gpa-dataset. Project 
curated by Jared Canty (Summer 2022 Blackwell Program). All rights are reserved.


Dataset on UIUC GPA is available at
https://github.com/wadefagen/datasets/tree/master/gpa (“uiuc-gpa-dataset.csv”)



### **Dataset to pandas**

In [1]:
import numpy as np
import pandas as pd
import time
import random
import matplotlib
#%matplotlib notebook
import matplotlib.pyplot as plt
import scipy.stats
import matplotlib.offsetbox as offsetbox
from matplotlib.ticker import StrMethodFormatter
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, mean_squared_error
from sklearn.model_selection import train_test_split
import copy

In [2]:
#for some reason, this needs to be in a separate cell
params={
    "font.size":15,
    "lines.linewidth":5,
}
plt.rcParams.update(params)

In [3]:
file_url = "https://raw.githubusercontent.com/wadefagen/datasets/master/gpa/uiuc-gpa-dataset.csv"

In [4]:
gpa_data = pd.read_csv(file_url, header=0)
gpa_data

Unnamed: 0,Year,Term,YearTerm,Subject,Number,Course Title,Sched Type,A+,A,A-,...,B-,C+,C,C-,D+,D,D-,F,W,Primary Instructor
0,2022,Spring,2022-sp,AAS,100,Intro Asian American Studies,LCD,6,13,0,...,1,0,3,0,1,1,0,0,0,"Lee, Sang S"
1,2022,Spring,2022-sp,AAS,100,Intro Asian American Studies,DIS,0,11,5,...,2,1,0,1,1,0,0,0,0,"Zheng, Reanne"
2,2022,Spring,2022-sp,AAS,100,Intro Asian American Studies,DIS,0,10,7,...,1,0,0,0,0,0,0,2,0,"Zheng, Reanne"
3,2022,Spring,2022-sp,AAS,100,Intro Asian American Studies,DIS,17,8,1,...,0,0,0,0,0,0,0,0,0,"Rosado-Torres, Alexander"
4,2022,Spring,2022-sp,AAS,100,Intro Asian American Studies,OD,0,8,4,...,2,1,0,0,0,0,1,3,1,"Wang, Yu"
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
64043,2010,Summer,2010-su,STAT,410,Statistics and Probability II,LEC,5,10,2,...,1,0,1,3,0,0,0,2,1,"Stepanov, Alexei G"
64044,2010,Summer,2010-su,STAT,440,Statistical Data Management,LEC,4,12,8,...,0,0,0,0,0,0,0,0,0,"Unger, David"
64045,2010,Summer,2010-su,TAM,212,Introductory Dynamics,LEC,0,1,3,...,7,5,1,1,0,2,0,1,0,"Morgan, William T"
64046,2010,Summer,2010-su,TAM,251,Introductory Solid Mechanics,LCD,1,2,2,...,0,3,3,2,0,0,1,1,0,"Ott-Monsivais, Stephanie"


In [5]:
gpa_scale = {
  'A+' : 4.0,
  'A' : 4.0,
  'A-' : 3.67,
  'B+' : 3.33,
  'B' : 3.0,
  'B-' : 2.67,
  'C+' : 2.33,
  'C' : 2.0,
  'C-' : 1.67,
  'D+' : 1.33,
  'D' : 1.0,
  'D-' : 0.67,
  'F' : 0.0,
} # defined from https://registrar.illinois.edu/courses-grades/explanation-of-grades/

letterGrades = list(gpa_scale.keys())
gpa_data['Students_Completed'] = gpa_data[letterGrades].sum(axis=1) # Student pop. per class without W

for l in gpa_scale:
  gpa_data[l + 'asNum'] = gpa_data[l] * gpa_scale[l]

newLetterGrades = [l + 'asNum' for l in letterGrades]
gpa_data['GPA'] = gpa_data[newLetterGrades].sum(axis=1) / gpa_data['Students_Completed'] # Label

letterGrades.append('W')
gpa_data['Students'] = gpa_data[letterGrades].sum(axis=1) # Student pop. per class including with W

gpa_data

Unnamed: 0,Year,Term,YearTerm,Subject,Number,Course Title,Sched Type,A+,A,A-,...,B-asNum,C+asNum,CasNum,C-asNum,D+asNum,DasNum,D-asNum,FasNum,GPA,Students
0,2022,Spring,2022-sp,AAS,100,Intro Asian American Studies,LCD,6,13,0,...,2.67,0.00,6.0,0.00,1.33,1.0,0.00,0.0,3.413793,29
1,2022,Spring,2022-sp,AAS,100,Intro Asian American Studies,DIS,0,11,5,...,5.34,2.33,0.0,1.67,1.33,0.0,0.00,0.0,3.440400,25
2,2022,Spring,2022-sp,AAS,100,Intro Asian American Studies,DIS,0,10,7,...,2.67,0.00,0.0,0.00,0.00,0.0,0.00,0.0,3.358519,27
3,2022,Spring,2022-sp,AAS,100,Intro Asian American Studies,DIS,17,8,1,...,0.00,0.00,0.0,0.00,0.00,0.0,0.00,0.0,3.928571,28
4,2022,Spring,2022-sp,AAS,100,Intro Asian American Studies,OD,0,8,4,...,5.34,2.33,0.0,0.00,0.00,0.0,0.67,0.0,2.921429,22
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
64043,2010,Summer,2010-su,STAT,410,Statistics and Probability II,LEC,5,10,2,...,2.67,0.00,2.0,5.01,0.00,0.0,0.00,0.0,3.183226,32
64044,2010,Summer,2010-su,STAT,440,Statistical Data Management,LEC,4,12,8,...,0.00,0.00,0.0,0.00,0.00,0.0,0.00,0.0,3.774643,28
64045,2010,Summer,2010-su,TAM,212,Introductory Dynamics,LEC,0,1,3,...,18.69,11.65,2.0,1.67,0.00,2.0,0.00,0.0,2.595714,28
64046,2010,Summer,2010-su,TAM,251,Introductory Solid Mechanics,LCD,1,2,2,...,0.00,6.99,6.0,3.34,0.00,0.0,0.67,0.0,2.603333,21
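
As a sanity check on the GPA column, the value for row 0 can be recomputed by hand. The sketch below is a minimal standalone illustration: the grade counts are transcribed from row 0 of the tables in this notebook, and the helper name `section_gpa` is hypothetical (it is not part of the notebook).

```python
# Grade-point scale used in the notebook (A+ and A both count as 4.0).
gpa_scale = {
    'A+': 4.0, 'A': 4.0, 'A-': 3.67,
    'B+': 3.33, 'B': 3.0, 'B-': 2.67,
    'C+': 2.33, 'C': 2.0, 'C-': 1.67,
    'D+': 1.33, 'D': 1.0, 'D-': 0.67,
    'F': 0.0,
}

def section_gpa(counts):
    """Weighted mean grade point over students with a letter grade (W is excluded)."""
    completed = sum(counts.get(g, 0) for g in gpa_scale)
    if completed == 0:
        return float('nan')  # no letter grades, so the GPA is undefined
    points = sum(counts.get(g, 0) * p for g, p in gpa_scale.items())
    return points / completed

# Grade counts for row 0 (AAS 100, Spring 2022, "Lee, Sang S").
row0 = {'A+': 6, 'A': 13, 'A-': 0, 'B+': 0, 'B': 4, 'B-': 1, 'C+': 0,
        'C': 3, 'C-': 0, 'D+': 1, 'D': 1, 'D-': 0, 'F': 0, 'W': 0}

print(round(section_gpa(row0), 6))  # 3.413793
```

The result matches the `GPA` value shown for row 0: 99 grade points over 29 students who completed the section.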


In [6]:
gpa_data['GPA']

0        3.413793
1        3.440400
2        3.358519
3        3.928571
4        2.921429
           ...   
64043    3.183226
64044    3.774643
64045    2.595714
64046    2.603333
64047    3.205641
Name: GPA, Length: 64048, dtype: float64

In [7]:
gpa_data.columns

Index(['Year', 'Term', 'YearTerm', 'Subject', 'Number', 'Course Title',
       'Sched Type', 'A+', 'A', 'A-', 'B+', 'B', 'B-', 'C+', 'C', 'C-', 'D+',
       'D', 'D-', 'F', 'W', 'Primary Instructor', 'Students_Completed',
       'A+asNum', 'AasNum', 'A-asNum', 'B+asNum', 'BasNum', 'B-asNum',
       'C+asNum', 'CasNum', 'C-asNum', 'D+asNum', 'DasNum', 'D-asNum',
       'FasNum', 'GPA', 'Students'],
      dtype='object')

In [8]:
# drop unused columns
gpa_data = gpa_data.drop(columns=newLetterGrades)
gpa_data = gpa_data.drop(columns=['YearTerm', 'Sched Type', 'Students_Completed']) # keeping number of W's for reversible computation

In [9]:
gpa_data.columns

Index(['Year', 'Term', 'Subject', 'Number', 'Course Title', 'A+', 'A', 'A-',
       'B+', 'B', 'B-', 'C+', 'C', 'C-', 'D+', 'D', 'D-', 'F', 'W',
       'Primary Instructor', 'GPA', 'Students'],
      dtype='object')

In [10]:
gpa_data = gpa_data.reindex(columns=['Term', 'Year', 'Students', 'Subject', 'Number', 'A+', 'A', 'A-',
       'B+', 'B', 'B-', 'C+', 'C', 'C-', 'D+', 'D', 'D-', 'F', 'W', 'Course Title',
       'Primary Instructor', 'GPA'])

In [11]:
gpa_data

Unnamed: 0,Term,Year,Students,Subject,Number,A+,A,A-,B+,B,...,C,C-,D+,D,D-,F,W,Course Title,Primary Instructor,GPA
0,Spring,2022,29,AAS,100,6,13,0,0,4,...,3,0,1,1,0,0,0,Intro Asian American Studies,"Lee, Sang S",3.413793
1,Spring,2022,25,AAS,100,0,11,5,3,1,...,0,1,1,0,0,0,0,Intro Asian American Studies,"Zheng, Reanne",3.440400
2,Spring,2022,27,AAS,100,0,10,7,4,3,...,0,0,0,0,0,2,0,Intro Asian American Studies,"Zheng, Reanne",3.358519
3,Spring,2022,28,AAS,100,17,8,1,1,1,...,0,0,0,0,0,0,0,Intro Asian American Studies,"Rosado-Torres, Alexander",3.928571
4,Spring,2022,22,AAS,100,0,8,4,1,1,...,0,0,0,0,1,3,1,Intro Asian American Studies,"Wang, Yu",2.921429
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
64043,Summer,2010,32,STAT,410,5,10,2,2,5,...,1,3,0,0,0,2,1,Statistics and Probability II,"Stepanov, Alexei G",3.183226
64044,Summer,2010,28,STAT,440,4,12,8,1,3,...,0,0,0,0,0,0,0,Statistical Data Management,"Unger, David",3.774643
64045,Summer,2010,28,TAM,212,0,1,3,2,5,...,1,1,0,2,0,1,0,Introductory Dynamics,"Morgan, William T",2.595714
64046,Summer,2010,21,TAM,251,1,2,2,1,5,...,3,2,0,0,1,1,0,Introductory Solid Mechanics,"Ott-Monsivais, Stephanie",2.603333


In [12]:
subject = gpa_data['Subject'].unique()
len(subject)

170

In [13]:
course = gpa_data['Course Title'].unique()
len(course)

5574

In [14]:
instructor = gpa_data['Primary Instructor'].unique()
len(instructor)

8867

In [15]:
term = gpa_data['Term'].unique()
print(len(term))
subject = gpa_data['Subject'].unique()
print(len(subject))
course = gpa_data['Course Title'].unique()
print(len(course))
instructor = gpa_data['Primary Instructor'].unique()
print(len(instructor))

4
170
5574
8867


In [16]:
# drop rows containing NaN
letterGrades.append('GPA')
print(letterGrades)
gpa_data = gpa_data.dropna().reset_index(drop=True)
display(gpa_data)

['A+', 'A', 'A-', 'B+', 'B', 'B-', 'C+', 'C', 'C-', 'D+', 'D', 'D-', 'F', 'W', 'GPA']


Unnamed: 0,Term,Year,Students,Subject,Number,A+,A,A-,B+,B,...,C,C-,D+,D,D-,F,W,Course Title,Primary Instructor,GPA
0,Spring,2022,29,AAS,100,6,13,0,0,4,...,3,0,1,1,0,0,0,Intro Asian American Studies,"Lee, Sang S",3.413793
1,Spring,2022,25,AAS,100,0,11,5,3,1,...,0,1,1,0,0,0,0,Intro Asian American Studies,"Zheng, Reanne",3.440400
2,Spring,2022,27,AAS,100,0,10,7,4,3,...,0,0,0,0,0,2,0,Intro Asian American Studies,"Zheng, Reanne",3.358519
3,Spring,2022,28,AAS,100,17,8,1,1,1,...,0,0,0,0,0,0,0,Intro Asian American Studies,"Rosado-Torres, Alexander",3.928571
4,Spring,2022,22,AAS,100,0,8,4,1,1,...,0,0,0,0,1,3,1,Intro Asian American Studies,"Wang, Yu",2.921429
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
63867,Summer,2010,32,STAT,410,5,10,2,2,5,...,1,3,0,0,0,2,1,Statistics and Probability II,"Stepanov, Alexei G",3.183226
63868,Summer,2010,28,STAT,440,4,12,8,1,3,...,0,0,0,0,0,0,0,Statistical Data Management,"Unger, David",3.774643
63869,Summer,2010,28,TAM,212,0,1,3,2,5,...,1,1,0,2,0,1,0,Introductory Dynamics,"Morgan, William T",2.595714
63870,Summer,2010,21,TAM,251,1,2,2,1,5,...,3,2,0,0,1,1,0,Introductory Solid Mechanics,"Ott-Monsivais, Stephanie",2.603333
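
The notebook does not say which columns contain NaN. Two plausible sources, both assumptions here, are sections where nobody received a letter grade (so the GPA is 0/0) and rows with a missing instructor. The toy frame below is a sketch of how `dropna()` treats both cases; the column subset and instructor names are made up for illustration.

```python
import pandas as pd

# Toy frame: the second section has only withdrawals, the third has no instructor.
toy = pd.DataFrame({
    'A': [10, 0, 5],
    'F': [2, 0, 0],
    'W': [1, 3, 0],
    'Primary Instructor': ['Doe, J', 'Roe, R', None],  # hypothetical names
})

completed = toy[['A', 'F']].sum(axis=1)
toy['GPA'] = (toy['A'] * 4.0 + toy['F'] * 0.0) / completed  # 0/0 becomes NaN

# dropna() removes rows (not columns) that contain any missing value.
clean = toy.dropna().reset_index(drop=True)
print(len(toy), '->', len(clean))
```

Only the first toy row survives, which mirrors how the row count shrinks in the output above.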


In [17]:
debugging_data = gpa_data.sample(6000) # ~10% of data
working_data = gpa_data.sample(30000) # ~50% of data
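
Because `sample` is called without a seed, each run of the cell above draws different rows, so later results will vary from run to run. If reproducibility matters, one option is to pass `random_state`. The sketch below uses a stand-in frame rather than `gpa_data`, and the seed value is an arbitrary assumption.

```python
import pandas as pd

# Stand-in for gpa_data; only the row count matters for this illustration.
frame = pd.DataFrame({'GPA': [i / 100 for i in range(1000)]})

SEED = 42  # hypothetical choice; any fixed integer works

# Two draws with the same seed select exactly the same rows.
debug_a = frame.sample(100, random_state=SEED)
debug_b = frame.sample(100, random_state=SEED)
assert debug_a.index.equals(debug_b.index)

# frac= expresses the sample size as a fraction instead of a row count.
working = frame.sample(frac=0.5, random_state=SEED)
print(len(working))
```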

In [18]:
from google.colab import drive
from matplotlib import pyplot as plt
from google.colab import auth

import os
import numpy as np  
import re, time
import tensorflow as tf
import random
import math
from tensorflow import keras

import argparse
import pandas as pd

from glob import glob
from tqdm import tqdm
from keras import backend as K

from keras import models
from keras.layers import Conv3D, MaxPool3D, Dense, Flatten, UpSampling3D, BatchNormalization
from keras.models import *
from keras.layers import *
from keras.optimizers import *
from keras.utils import to_categorical
from tensorflow.keras.callbacks import ModelCheckpoint

%pip install tensorflow-addons
from tensorflow_addons.metrics import RSquare

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


## **11/29 Milestone:**


### Enable GPU

In [20]:
enable_gpu = True

%tensorflow_version 2.x
import tensorflow as tf

if enable_gpu:
  device_name = tf.test.gpu_device_name()
  if device_name != '/device:GPU:0':
    raise SystemError('GPU device not found')
  print('Found GPU at: {}'.format(device_name))
  gpu_info = !nvidia-smi
  gpu_info = '\n'.join(gpu_info)
  if gpu_info.find('failed') >= 0:
    print('Select the Runtime > "Change runtime type" menu to enable a GPU accelerator, ')
    print('and then re-execute this cell.')
  else:
    print(gpu_info)

Colab only includes TensorFlow 2.x; %tensorflow_version has no effect.
Found GPU at: /device:GPU:0
Wed Nov 30 05:54:35 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   70C    P0    32W /  70W |    312MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                     

### **Deep Learning Model**

Label: 
* Average GPA, or any single letter grade count (the number of students who get A+, A, A-, B+, etc.)

Features:
* Term (one-hot)
* Year
* Students (Student pop. per class)
* Subject / Department (one-hot)
* (Course) Number
* Course Title (one-hot)
* Primary Instructor (one-hot)

Run with GPU:
* change Runtime Type to 'Standard GPU + High RAM'

Loss function:
* Mean squared error (the loss passed to model.compile in the code below; mean absolute error performed similarly)

Metric:
* R2 Score
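
To make the one-hot features concrete, the sketch below applies `pd.get_dummies` to a toy frame. The rows are made up for illustration, and only two of the four categorical columns are encoded to keep the output small.

```python
import pandas as pd

# Toy rows with the same kind of columns as gpa_data (values are illustrative).
toy = pd.DataFrame({
    'Term': ['Spring', 'Fall', 'Spring'],
    'Year': [2022, 2021, 2021],
    'Students': [29, 25, 27],
    'Subject': ['AAS', 'STAT', 'TAM'],
    'GPA': [3.41, 3.44, 3.36],
})

# Each listed column is replaced by one indicator column per distinct value.
encoded = pd.get_dummies(toy, columns=['Term', 'Subject'])
print(encoded.shape)
print(list(encoded.columns))
```

On the real data the feature matrix is much wider. Using the unique counts printed earlier (4 terms, 170 subjects, 5574 course titles, 8867 instructors), the full dataset would yield up to 4 + 170 + 5574 + 8867 = 14615 indicator columns; a sample such as `working_data` will typically have fewer.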

In [21]:
def create_model():
  model = models.Sequential()

  model.add(Dense(256, activation='relu'))
  model.add(Dense(256, activation='relu'))

  model.add(Dense(1))

  model.compile(optimizer=Adam(learning_rate=0.001), loss='mean_squared_error', metrics=[RSquare()])

  return model

In [22]:
def run_regression(model, label, gpa_data, letterGrades, epochs=50, batch_size=64): # set any element in letterGrades as the label we want to predict
  K.clear_session()
  
  label_name = label
  if label == 'GPA':
    label_name = 'average GPA'
  print('Running Regression on {}'.format(label_name))

  l = copy.deepcopy(letterGrades)
  l.remove(label)
  # convert to one-hot coding
  new_gpa_data = pd.get_dummies(gpa_data, columns=['Term', 'Subject', 'Course Title', 'Primary Instructor']).drop(columns=l)
  # display(new_gpa_data)

  sample_data = new_gpa_data[:]

  Y = sample_data[label]
  X = sample_data.drop(columns=[label])

  # train : valid : test = 0.7 : 0.15 : 0.15
  x_train, X1, y_train, Y1 = train_test_split(X, Y, test_size = 0.3, random_state = 42)
  x_valid, x_test, y_valid, y_test = train_test_split(X1, Y1, test_size = 0.5, random_state = 42)

  checkpoint_filepath = '/tmp/checkpoint.h5'

  model_checkpoint_callback = ModelCheckpoint(
    filepath=checkpoint_filepath,
    monitor='val_r_square', 
    mode='max',
    save_best_only=True,
    save_weights_only=True)

  history = model.fit(x_train, y_train,
                  validation_data=(x_valid, y_valid),
                  epochs=epochs,
                  batch_size=batch_size,
                  callbacks=[model_checkpoint_callback])
  
  model.load_weights(checkpoint_filepath)

  print('\nTesting set:')
  test_loss, test_metric = model.evaluate(x_test, y_test)

  sample_test = x_test[:5]
  y_pred = model.predict(sample_test)
  print('\nFirst five testing data points:')
  print('labels: ', np.asarray(y_test[:5]))
  print('predictions: ', y_pred[:5])

  print('\nmodel summary: ')
  model.summary()
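
The two-step split used in `run_regression` can be checked in isolation. The sketch below repeats the same two `train_test_split` calls on a 100-row dummy array (an assumption made so the sizes are easy to read) and confirms the 0.7 : 0.15 : 0.15 ratio.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Dummy data: 100 rows, 2 features.
X = np.arange(200).reshape(100, 2)
y = np.arange(100)

# Step 1: hold out 30% of the rows.
x_train, x_rest, y_train, y_rest = train_test_split(X, y, test_size=0.3, random_state=42)
# Step 2: split the held-out 30% in half, giving 15% validation and 15% test.
x_valid, x_test, y_valid, y_test = train_test_split(x_rest, y_rest, test_size=0.5, random_state=42)

print(len(x_train), len(x_valid), len(x_test))  # 70 15 15
```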
  

#### Adam Optimizer

In [23]:
with tf.device('/device:GPU:0'):
  model = create_model()
run_regression(model, 'GPA', working_data, letterGrades)

Running Regression on average GPA
Epoch 1/50
...
Epoch 50/50

Testing set:

First five testing data points:
labels:  [3.90939394 3.01621359 3.73956522 3.82       2.92307692]
predictions:  [[3.7737682]
 [3.029811 ]
 [3.6328871]
 [3.8328736]
 [2.8227477]]

model summary: 
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense

### **Mini-batch Learning**

Larger batch sizes make each epoch run faster, but smaller batch sizes update the parameters more often per epoch, so the model tends to learn in fewer epochs. Larger batch sizes also use more memory.

We found that training with a batch size of 64 for 50 epochs produces satisfactory results. Smaller batch sizes (e.g. 32) take much longer to train but do not yield better results.
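
The trade-off can be quantified by counting parameter updates per epoch. The sketch below assumes the 30000-row `working_data` sample and the 70% training share used in `run_regression`, which leaves 21000 training rows.

```python
import math

n_rows = 30000                 # size of working_data
n_train = n_rows * 7 // 10     # 70% of the rows are used for training

for batch_size in (32, 64):
    # One parameter update per mini-batch; the last batch may be partial.
    steps_per_epoch = math.ceil(n_train / batch_size)
    print(batch_size, steps_per_epoch)
# prints:
# 32 657
# 64 329
```

Halving the batch size roughly doubles the number of updates per epoch, which is consistent with the longer training time observed for a batch size of 32.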

In [24]:
#run_regression(model, 'GPA', working_data, letterGrades, batch_size=32)
#del model

Running Regression on average GPA
Epoch 1/50
...
Epoch 50/50

Testing set:

First five testing data points:
labels:  [3.90939394 3.01621359 3.73956522 3.82       2.92307692]
predictions:  [[3.8199139]
 [3.0217454]
 [3.7054737]
 [3.8208406]
 [2.8051555]]

model summary: 
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense

### **Optimizers**
We tried several optimizers: SGD, Adam, Adagrad, Nadam, and RMSprop.

Among them, Adam yielded the best results.
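
For intuition about how two of these optimizers differ, the sketch below implements the textbook update rules for plain SGD and Adam on a one-dimensional toy loss. It is not the Keras implementation, and the toy loss, step counts, and hyperparameter values are assumptions chosen for illustration.

```python
import math

def grad(w):
    # Gradient of the toy loss f(w) = (w - 3)^2, minimized at w = 3.
    return 2.0 * (w - 3.0)

def run_sgd(w=0.0, lr=0.1, steps=500):
    # Plain SGD: step against the gradient with a fixed learning rate.
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def run_adam(w=0.0, lr=0.1, beta_1=0.9, beta_2=0.999, eps=1e-7, steps=500):
    # Adam: keep running averages of the gradient (m) and squared gradient (v),
    # correct their start-up bias, and scale each step by 1 / sqrt(v_hat).
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta_1 * m + (1 - beta_1) * g
        v = beta_2 * v + (1 - beta_2) * g * g
        m_hat = m / (1 - beta_1 ** t)
        v_hat = v / (1 - beta_2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

print(run_sgd(), run_adam())
```

Both runs approach the minimum at 3 on this simple problem; the differences between optimizers show up mainly on high-dimensional, poorly scaled losses such as the one-hot regression here.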

In [23]:
def create_model_SGD():
  model = models.Sequential()

  model.add(Dense(256, activation='relu'))
  model.add(Dense(256, activation='relu'))
  model.add(Dense(2, activation = 'relu')) # 2-unit ReLU bottleneck layer; note this is not L2 regularization

  model.add(Dense(1))

  model.compile(optimizer=SGD(learning_rate=0.01, momentum=0.0, nesterov=False), loss='mean_squared_error', metrics=[RSquare()])

  return model

In [24]:
def create_model_Adagrad():
  model = models.Sequential()

  model.add(Dense(256, activation='relu'))
  model.add(Dense(256, activation='relu'))
  
 
  model.add(Dense(1))

  model.compile(optimizer=Adagrad(learning_rate=0.01), loss='mean_squared_error', metrics=[RSquare()])

  return model

In [25]:
def create_model_Nadam():
  model = models.Sequential()

  model.add(Dense(256, activation='relu'))
  model.add(Dense(256, activation='relu'))

  model.add(Dense(1))

  model.compile(optimizer=Nadam(learning_rate=0.002, beta_1=0.9, beta_2=0.999), loss='mean_squared_error', metrics=[RSquare()])

  return model

In [26]:
def create_model_RMSprop():
  model = models.Sequential()

  model.add(Dense(256, activation='relu'))
  model.add(Dense(256, activation='relu'))

  model.add(Dense(1))

  model.compile(optimizer=RMSprop(learning_rate=0.001, rho=0.9), loss='mean_squared_error', metrics=[RSquare()])

  return model

In [27]:
with tf.device('/device:GPU:0'):
  model1 = create_model_SGD()
  model2 = create_model_Adagrad()
  model3 = create_model_Nadam()
  model4 = create_model_RMSprop()

#### SGD

In [30]:
#run_regression(model1, 'GPA', working_data, letterGrades) 
#del model1

Running Regression on average GPA
Epoch 1/50
...
Epoch 50/50

Testing set:

First five testing data points:
labels:  [3.90939394 3.01621359 3.73956522 3.82       2.92307692]
predictions:  [[3.3574088]
 [3.3574088]
 [3.3574088]
 [3.3574088]
 [3.3574088]]

model summary: 
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense

#### Adagrad

In [28]:
#run_regression(model2, 'GPA', working_data, letterGrades)
#del model2

Running Regression on average GPA
Epoch 1/50
...
Epoch 50/50

Testing set:

First five testing data points:
labels:  [2.89134146 3.4725     3.6        3.7608     3.42282051]
predictions:  [[3.0486987]
 [3.5815594]
 [3.3927758]
 [3.57574  ]
 [3.3431113]]

model summary: 
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_4 (D

#### Nadam

In [29]:
#run_regression(model3, 'GPA', working_data, letterGrades)
#del model3

Running Regression on average GPA
Epoch 1/50
...
Epoch 50/50

Testing set:

First five testing data points:
labels:  [2.89134146 3.4725     3.6        3.7608     3.42282051]
predictions:  [[2.9482253]
 [3.620613 ]
 [3.507869 ]
 [3.6148648]
 [3.5587149]]

model summary: 
Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_7 (D

#### RMSprop

In [30]:
#run_regression(model4, 'GPA', working_data, letterGrades)
#del model4

Running Regression on average GPA
Epoch 1/50
...
Epoch 50/50

Testing set:

First five testing data points:
labels:  [2.89134146 3.4725     3.6        3.7608     3.42282051]
predictions:  [[3.0347123]
 [3.6743803]
 [3.4486861]
 [3.66894  ]
 [3.3406034]]

model summary: 
Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_10 (

### **Hyperparameters**

We trained the model to predict the average GPA of each class from its features and settled on the following hyperparameters after tuning:

* Train-valid-test split ratio: 0.7 : 0.15 : 0.15

* Optimizer: Adam

* Initial learning rate: 0.001

* Layers: input(input_size)->Dense(256)->Dense(256)->output(1)

* Activation function: ReLU

* Loss function: Mean absolute error (MAE) and mean squared error (MSE) performed similarly

* Epochs: 50

* Batch size: 64

Comparison with benchmark (on average GPA prediction):

* Our Model: MSE = 0.0640, R2 Score = 0.5895
* Benchmark (Linear Regression): MSE = 0.1127, R2 Score = 0.3108
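
The notebook does not show how the linear regression benchmark was computed. The sketch below is one way such a baseline and its two metrics could be obtained with scikit-learn. It runs on synthetic data (an assumption; the real features are the one-hot encoded columns), so its numbers are not comparable to the results above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic regression problem standing in for the GPA data.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 3.0 + X @ np.array([0.4, -0.2, 0.1]) + rng.normal(scale=0.3, size=500)

x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.15, random_state=42)

baseline = LinearRegression().fit(x_train, y_train)
y_pred = baseline.predict(x_test)

mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

# R2 = 1 - SS_res / SS_tot, recomputed by hand as a cross-check.
ss_res = np.sum((y_test - y_pred) ** 2)
ss_tot = np.sum((y_test - y_test.mean()) ** 2)
assert np.isclose(r2, 1 - ss_res / ss_tot)

print(f"MSE = {mse:.4f}, R2 Score = {r2:.4f}")
```

An R2 score of 0 corresponds to always predicting the mean of the test labels, so both reported models improve on that trivial predictor.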