Goal: Apply sentiment analysis to a dataset from RateMyProfessors. The project takes in student comments and predicts each student's star and difficulty ratings for the professor.

Note: I use the term "star" instead of "quality" because Dr. Hibo Je's dataset uses "star". Hence, "student star rating" means the student quality rating.

Import necessary libraries

In [None]:
!pip install nlpaug
%load_ext tensorboard
from google.colab import drive
from nltk.corpus import stopwords
from sklearn.metrics import mean_squared_error, mean_absolute_error
from sklearn.model_selection import train_test_split
import datetime
import string
import matplotlib.pyplot as plt
import nlpaug.augmenter.word as naw
import nltk
import numpy as np
import pandas as pd
import re
import tensorflow as tf

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
The tensorboard extension is already loaded. To reload it, use:
  %reload_ext tensorboard


In [None]:
# Enable GPU
# Confirm connection to GPU with TensorFlow
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
  raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))

Found GPU at: /device:GPU:0


Load and view data

Data from https://data.mendeley.com/datasets/fvtfjyvw7d/2

In [None]:
# Load data
drive.mount('/content/drive')
df = pd.read_csv("./drive/MyDrive/Datasets/RateMyProfessor_Sample_data.csv")
print(df.head(3))

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
   professor_name                                 school_name  \
0  Leslie  Looney  University Of Illinois at Urbana-Champaign   
1  Leslie  Looney  University Of Illinois at Urbana-Champaign   
2  Leslie  Looney  University Of Illinois at Urbana-Champaign   

        department_name                    local_name state_name  \
0  Astronomy department   Champaign\xe2\x80\x93Urbana         IL   
1  Astronomy department   Champaign\xe2\x80\x93Urbana         IL   
2  Astronomy department   Champaign\xe2\x80\x93Urbana         IL   

   year_since_first_review  star_rating take_again  diff_index  \
0                     11.0          4.7        NaN         2.0   
1                     11.0          4.7        NaN         2.0   
2                     11.0          4.7        NaN         2.0   

                                       tag_professor  ...  lots_of_homew

In [None]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20000 entries, 0 to 19999
Data columns (total 51 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   professor_name            20000 non-null  object 
 1   school_name               20000 non-null  object 
 2   department_name           20000 non-null  object 
 3   local_name                20000 non-null  object 
 4   state_name                20000 non-null  object 
 5   year_since_first_review   20000 non-null  float64
 6   star_rating               20000 non-null  float64
 7   take_again                2998 non-null   object 
 8   diff_index                20000 non-null  float64
 9   tag_professor             11093 non-null  object 
 10  num_student               20000 non-null  float64
 11  post_date                 19995 non-null  object 
 12  name_onlines              20000 non-null  object 
 13  name_not_onlines          19995 non-null  object 
 14  studen

In [None]:
# Check types of the features
df.dtypes

professor_name               object
school_name                  object
department_name              object
local_name                   object
state_name                   object
year_since_first_review     float64
star_rating                 float64
take_again                   object
diff_index                  float64
tag_professor                object
num_student                 float64
post_date                    object
name_onlines                 object
name_not_onlines             object
student_star                float64
student_difficult           float64
attence                      object
for_credits                  object
would_take_agains            object
grades                       object
help_useful                 float64
help_not_useful             float64
comments                     object
word_comment                float64
gender                       object
race                         object
asian                       float64
hispanic                    

Extract data and pre-process

In [None]:
# Extract comments and the 2 labels
df = df.iloc[:, [22, 14, 15]]
df.head(3)

                                            comments  student_star  student_difficult
0  This class is hard, but its a two-in-one gen-e...           5.0                3.0
1  Definitely going to choose Prof. Looney\'s cla...           5.0                2.0
2  I overall enjoyed this class because the assig...           4.0                3.0


In [None]:
# Find any NaN in the two labels
print(np.where(np.isnan(df['student_star'])))
print(np.where(np.isnan(df['student_difficult'])))

# Drop the observations that contain a NaN in either label
df = df.dropna(subset=["student_star", "student_difficult"])

(array([693, 694, 695, 696, 697]),)
(array([693, 694, 695, 696, 697]),)


In [None]:
# Check if NaN were successfully removed
print(np.where(np.isnan(df['student_star'])))
print(np.where(np.isnan(df['student_difficult'])))

print("New shape", df.shape)

(array([], dtype=int64),)
(array([], dtype=int64),)
New shape (19995, 3)


In [None]:
# Extract comments, student star rating, and student difficulty rating
x = df["comments"].astype(str).tolist()
y_star = df.iloc[:, 1].values
y_diff = df.iloc[:, 2].values

print(type(x), type(y_star), type(y_diff))
print(len(x), y_star.size, y_diff.size)

<class 'list'> <class 'numpy.ndarray'> <class 'numpy.ndarray'>
19995 19995 19995


In [None]:
# Replace every character that is not a word character or whitespace
# with a space, and lowercase the text
cleaning_x = []
for i in x:
  i = re.sub(r'[^\w\s]', ' ', i)
  i = i.lower()
  cleaning_x.append(i)

# Remove \
cleaned_x = []
for j in cleaning_x:
  j = j.replace("\\", " ")
  cleaned_x.append(j)

# Remove stopwords
# Get the NLTK stopwords once, as a set, instead of rebuilding the
# set for every word
nltk.download('stopwords')
stop_words = set(stopwords.words('english'))
cleanest_text = []
for q in cleaned_x:
  words = q.split()
  cleaned = []
  for w in words:
    if w not in stop_words:
      cleaned.append(w)
  cleanest_text.append(' '.join(cleaned))

#cleanest_text

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
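
The three cleaning passes above can be folded into a single function. The following is a minimal sketch, not the notebook's code: `clean_comment` and `TOY_STOPWORDS` are hypothetical names, and the small hard-coded stopword set stands in for NLTK's English list so that the example runs without a download.

```python
import re

# Hypothetical stand-in for set(stopwords.words('english')).
TOY_STOPWORDS = {"this", "is", "but", "its", "a", "the", "and", "in"}

def clean_comment(text, stop_words=TOY_STOPWORDS):
    """Lowercase, replace punctuation with spaces, and drop stopwords."""
    text = re.sub(r"[^\w\s]", " ", text).lower()
    return " ".join(w for w in text.split() if w not in stop_words)

print(clean_comment("This class is hard, but its a two-in-one gen-ed!"))
```

Note that a comment made up only of stopwords comes back as an empty string, which matters for the augmentation step later on.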


In [None]:
print(len(cleanest_text))
print(cleanest_text[0])
print(type(cleanest_text), type(y_star), type(y_diff))

19995
class hard two one gen ed knockout content stimulating unlike classes actually participate pass sections easy offer extra credit every week funny dude much say
<class 'list'> <class 'numpy.ndarray'> <class 'numpy.ndarray'>


Gather more data using techniques from Wei and Zou's EDA

Resources:

https://blog.paperspace.com/data-augmentation-for-nlp/

https://arxiv.org/pdf/1901.11196.pdf

In [None]:
# Combine the x, ys, and yd to a single df
aug_df = pd.DataFrame({"Comments":cleanest_text,
                       "Star":y_star,
                       "Difficulty":y_diff})
aug_df.head(3)

Unnamed: 0,Comments,Star,Difficulty
0,class hard two one gen ed knockout content sti...,5.0,3.0
1,definitely going choose prof looney class inte...,5.0,2.0
2,overall enjoyed class assignments straightforw...,4.0,3.0


In [None]:
# Extract comments for preparation of data augmentation
comments = aug_df["Comments"].astype(str).tolist()
comments[0:3]

['class hard two one gen ed knockout content stimulating unlike classes actually participate pass sections easy offer extra credit every week funny dude much say',
 'definitely going choose prof looney class interesting class easy bring notes exams need remember lot lots bonus points available observatory sessions awesome',
 'overall enjoyed class assignments straightforward interesting enjoy video project felt like one group cared enough help']

In [None]:
# Data augmentation of the comments with synonym replacement, word swap,
# and word deletion, each at 20%. I chose a relatively low percentage
# because the new data should still keep most of its meaning.
# Note: If time permits, maybe also try antonyms to diversify the data?
data_augments = [
  naw.SynonymAug(aug_p=0.2),
  naw.RandomWordAug(action='swap', aug_p=0.2),
  naw.RandomWordAug(action='delete', aug_p=0.2)
]

# Perform the 3 augmentations for each comment
new_augmented_texts = []
for c in comments:
  for a in data_augments:
    new_augmented_texts.append(a.augment(c))
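
To make the swap and delete operations concrete, here is a plain-Python sketch of random swap and random deletion as I understand them from Wei and Zou's EDA description. It is not nlpaug's implementation (which may pick words differently), and the function names `random_swap` and `random_delete` are my own.

```python
import random

def random_swap(words, n_swaps, rng):
    """Return a copy of words with n_swaps random pairs of positions exchanged."""
    out = list(words)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out

def random_delete(words, p, rng):
    """Drop each word with probability p, but always keep at least one word."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]

rng = random.Random(0)
words = "class hard two one gen ed knockout".split()
print(random_swap(words, n_swaps=1, rng=rng))
print(random_delete(words, p=0.2, rng=rng))
```

Keeping at least one word in `random_delete` is a design choice of this sketch: it prevents a one-word comment such as "awesome" from turning into an empty sample.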

In [None]:
# View the new augmented data
print(new_augmented_texts)

# Note: Printing all 59985 augmented comments exceeds the notebook's
#       output data rate limit. The data itself is unaffected.

IOPub data rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
`--NotebookApp.iopub_data_rate_limit`.

Current values:
NotebookApp.iopub_data_rate_limit=1000000.0 (bytes/sec)
NotebookApp.rate_limit_window=3.0 (secs)



In [None]:
# View the new augmented data
# Note: 59985 = 19995 comments x 3 augmenters, so every augmentation is
#       present; the rate limit above only affected printing.
print(len(new_augmented_texts))
new_augmented_texts[0:3]
# Note: The first 3 are augmentations of the first original comment
# Funny note: Since there is a term "gen ed", the synonym data
#             augmentation replaced "ed" with "erectile dysfunction"

59985


[['family hard two one gen ed knockout content stimulating unlike classes actually participate pass sections easy bid extra cite every week funny fellow much say'],
 ['class hard two one gen ed knockout content stimulating classes unlike actually participate pass sections offer easy extra credit every funny week dude much say'],
 ['class hard two one gen ed knockout content stimulating unlike participate pass sections easy credit every week funny dude say']]

In [None]:
# View the original non-augmented comment for comparison
comments[0]

'class hard two one gen ed knockout content stimulating unlike classes actually participate pass sections easy offer extra credit every week funny dude much say'

In [None]:
# Since I augmented the data with 3 augmenters, each label must be
# repeated the same number of times.
# The nested loop keeps the augmented comments in order (all 3
# augmentations of comment 0, then of comment 1, ...), so np.repeat
# produces labels in the matching order.
star = aug_df["Star"].values
aug_star = np.repeat(star, len(data_augments))

diff = aug_df["Difficulty"].values
aug_diff = np.repeat(diff, len(data_augments))

In [None]:
# Double check that the labels correspond to the correct samples
print(diff[0:10])
print(aug_diff[0:30])
print()
print(star[0:10])
print(aug_star[0:30])
# Seems like it

[3. 2. 3. 3. 1. 2. 2. 2. 1. 2.]
[3. 3. 3. 2. 2. 2. 3. 3. 3. 3. 3. 3. 1. 1. 1. 2. 2. 2. 2. 2. 2. 2. 2. 2.
 1. 1. 1. 2. 2. 2.]

[5. 5. 4. 5. 5. 5. 5. 5. 5. 5.]
[5. 5. 5. 5. 5. 5. 4. 4. 4. 5. 5. 5. 5. 5. 5. 5. 5. 5. 5. 5. 5. 5. 5. 5.
 5. 5. 5. 5. 5. 5.]
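
The alignment argument can be checked on a toy example. This sketch uses made-up texts and string stand-ins for the three nlpaug augmenters; it shows why `np.repeat` matches the nested loop order while `np.tile` would not.

```python
import numpy as np

texts = ["a", "b"]
labels = np.array([5.0, 2.0])
augmenters = ["syn", "swap", "delete"]  # stand-ins for the nlpaug objects

# Same nesting as the augmentation loop: outer = comment, inner = augmenter.
augmented = [f"{t}-{a}" for t in texts for a in augmenters]

repeated = np.repeat(labels, len(augmenters))  # each label 3 times in a row
tiled = np.tile(labels, len(augmenters))       # whole label array 3 times

for text, label in zip(augmented, repeated):
    print(text, label)
```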


In [None]:
# Double check types again
print(type(y_star), type(aug_star))
print(type(y_diff), type(aug_diff))
print(type(cleanest_text), type(new_augmented_texts))
print()

# Lengths
print(y_star.size, aug_star.size)
print(y_diff.size, aug_diff.size)
print(len(cleanest_text), len(new_augmented_texts))
print()

# Length when combining
print(len(cleanest_text) + len(new_augmented_texts))

<class 'numpy.ndarray'> <class 'numpy.ndarray'>
<class 'numpy.ndarray'> <class 'numpy.ndarray'>
<class 'list'> <class 'list'>

19995 59985
19995 59985
19995 59985

79980


In [None]:
# Combine original data with augmented data
# The order of the samples does not matter, since they will be
# shuffled during the splitting phase
# Note: The data augmentation was performed on the pre-processed
#       data, so there is no need to pre-process again.

In [None]:
# However, before combining make sure they are in the same format
print(cleanest_text[0])
print(new_augmented_texts[0])

class hard two one gen ed knockout content stimulating unlike classes actually participate pass sections easy offer extra credit every week funny dude much say
['family hard two one gen ed knockout content stimulating unlike classes actually participate pass sections easy bid extra cite every week funny fellow much say']


In [None]:
# They are not the same format, so must convert
# the augmented comments into just a list of strings
cleanest_augmented_texts = []
for k in new_augmented_texts:
  for j in k:
    cleanest_augmented_texts.append(j)

# Make sure it is in the right format. Yes it is.
print(cleanest_augmented_texts[0])

family hard two one gen ed knockout content stimulating unlike classes actually participate pass sections easy bid extra cite every week funny fellow much say


In [None]:
# Okay. Now I merge the original and augmented dataset
merged_texts = cleanest_text + cleanest_augmented_texts

# Remember from before that the size should be 79980, so
# double check
print(len(merged_texts))

79920


In [None]:
# There are 60 missing samples?
# Let's investigate by looking at the lengths.
print(len(merged_texts))
print(len(cleanest_text) + len(new_augmented_texts))
print(len(cleanest_text))
print(len(cleanest_augmented_texts))
print(len(new_augmented_texts))

79920
79980
19995
59925
59985


In [None]:
# Some RateMyProfessors comments are a single word, such as "Awesome".
# Since one of the augmenters deletes words, the augmentation might
# leave no comment at all!
# Check for empty lists
emptiness = []
for m in new_augmented_texts:
  if not m:
    emptiness.append(m)
print(len(emptiness))

60


In [None]:
# There are 60 empty lists. This is a good thing, as now I can
# find the indices of these empty lists and remove them.
empty_indices = []
for i in range(len(new_augmented_texts)):
  if not new_augmented_texts[i]:
    empty_indices.append(i)
print(empty_indices)

[1365, 1366, 1367, 1422, 1423, 1424, 9519, 9520, 9521, 11280, 11281, 11282, 11439, 11440, 11441, 12051, 12052, 12053, 12792, 12793, 12794, 13101, 13102, 13103, 14814, 14815, 14816, 17835, 17836, 17837, 22173, 22174, 22175, 26166, 26167, 26168, 26226, 26227, 26228, 27615, 27616, 27617, 34158, 34159, 34160, 34164, 34165, 34166, 34170, 34171, 34172, 41454, 41455, 41456, 46662, 46663, 46664, 53346, 53347, 53348]


In [None]:
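# Double check that these are actually empty lists
# Note: The indices come in consecutive triples starting at multiples
#       of 3 (one entry per augmenter), which suggests the cleaned source
#       comments were themselves empty (for example, only stopwords)
#       rather than emptied by the delete augmenter alone.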
print(new_augmented_texts[empty_indices[42]])

[]


In [None]:
# After retrieving the empty lists' indices, remove
def remove_emptiness(comments):
  """
  This function will take in the list of comments and will
  remove the empty lists.
  Returns a new list of non-empty lists.
  """
  fulfilled = []
  for c in comments:
    if c:
      fulfilled.append(c)
  return fulfilled
# I ran into a small error when trying to remove the empty lists with
# empty_indices, so I just used this simpler method because I am not
# too familiar with this kind of list manipulation

In [None]:
# Note: I had to split the data due to "data rate exceeded" again
split1 = remove_emptiness(new_augmented_texts[0:10000])
split2 = remove_emptiness(new_augmented_texts[10000:20000])
split3 = remove_emptiness(new_augmented_texts[20000:30000])
split4 = remove_emptiness(new_augmented_texts[30000:40000])
split5 = remove_emptiness(new_augmented_texts[40000:50000])
split6 = remove_emptiness(new_augmented_texts[50000:len(new_augmented_texts)])

# Combine the splits
cleaned_aug_comments = split1 + split2 + split3 + split4 + split5 + split6

# Double check the length
print(len(cleaned_aug_comments))

59925


In [None]:
# Need to also drop the corresponding labels
new_aug_star = np.delete(aug_star, empty_indices)
new_aug_diff = np.delete(aug_diff, empty_indices)

print(new_aug_star.size, new_aug_diff.size)

59925 59925
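
Removing the empty texts in one place and the labels in another works here, but the two can drift apart if one step is forgotten. A minimal sketch of doing both at once, assuming a hypothetical helper `drop_empty` and toy data:

```python
def drop_empty(texts, *label_lists):
    """Remove empty entries from texts and the same positions from every label list."""
    keep = [i for i, t in enumerate(texts) if t]
    kept_texts = [texts[i] for i in keep]
    kept_labels = [[labels[i] for i in keep] for labels in label_lists]
    return (kept_texts, *kept_labels)

texts = [["good class"], [], ["hard exams"], []]
star = [5.0, 4.0, 2.0, 3.0]
diff = [1.0, 2.0, 4.0, 3.0]

texts, star, diff = drop_empty(texts, star, diff)
print(texts, star, diff)
```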


Okay. Now I can finally merge the datasets properly.

Caution: Merge the texts and the labels in the same order!

In [None]:
# Merge the labels
combined_star = np.concatenate((y_star, new_aug_star))
combined_diff = np.concatenate((y_diff, new_aug_diff))

# Sanity check again
print(combined_star.size, combined_diff.size)

79920 79920


In [None]:
# Merge the cleaned comments
combined_texts = cleanest_text + cleanest_augmented_texts

# Sanity check again
print(len(combined_texts))

79920


In [None]:
# Tokenization
tokens = tf.keras.preprocessing.text.Tokenizer(num_words = 20000)

# Index vocabulary, so can make sequence later
tokens.fit_on_texts(combined_texts)

# Convert the student comments to a sequence of integers
integer_sequences = tokens.texts_to_sequences(combined_texts)

# Take a look at one student comment in the form of an integer sequence
print(integer_sequences[69])

# Sanity check again.
print(len(integer_sequences))

[90, 1167, 50, 3332, 1387, 154, 39, 16, 281, 44, 43, 146, 777, 411, 4057, 1, 1498, 25, 5331, 624, 1103, 168, 1117, 388, 2007, 2432, 1]
79920


In [None]:
# Find and set max length from sequences in preparation to pad
max_length = 0
for k in integer_sequences:
  if len(k) > max_length:
    max_length = len(k)
print(max_length)

72


In [None]:
# Padding the sequence to make them all the same lengths
# Note: Sequences must have same length to feed into NN
# Note: padding = "post" or "pre" produced similar performance results
padded_sequences = tf.keras.utils.pad_sequences(integer_sequences,
                                                maxlen = max_length,
                                                padding = "post")
# Sanity check
print(padded_sequences.shape)
print(padded_sequences[69])
print(integer_sequences[69])

(79920, 72)
[  90 1167   50 3332 1387  154   39   16  281   44   43  146  777  411
 4057    1 1498   25 5331  624 1103  168 1117  388 2007 2432    1    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0]
[90, 1167, 50, 3332, 1387, 154, 39, 16, 281, 44, 43, 146, 777, 411, 4057, 1, 1498, 25, 5331, 624, 1103, 168, 1117, 388, 2007, 2432, 1]
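
For intuition, tokenizing and post-padding can be reproduced in a few lines of plain Python. This is only a sketch of the idea, not the Keras implementation: the helper names are hypothetical, and it assumes (as in the output above) that index 0 is reserved for padding and that more frequent words get smaller indices.

```python
from collections import Counter

def fit_word_index(texts):
    """Map each word to an integer, most frequent first, starting at 1 (0 is padding)."""
    counts = Counter(w for t in texts for w in t.split())
    return {w: i for i, (w, _) in enumerate(counts.most_common(), start=1)}

def to_padded(texts, word_index):
    """Convert texts to integer sequences and right-pad them with zeros."""
    seqs = [[word_index[w] for w in t.split() if w in word_index] for t in texts]
    max_len = max(len(s) for s in seqs)
    return [s + [0] * (max_len - len(s)) for s in seqs]

toy_texts = ["class hard class", "easy class"]
toy_index = fit_word_index(toy_texts)
print(toy_index)
print(to_padded(toy_texts, toy_index))
```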


Load Stanford's GloVe, a set of pre-trained word embeddings

Note: From Chollet's Deep Learning textbook

In [None]:
embeddings_index = {}
with open("./drive/MyDrive/Datasets/glove.6B.100d.txt", encoding = "utf8") as f:
  for line in f:
    # Each line is a word followed by its 100 coefficients
    word, coefs = line.split(maxsplit = 1)
    coefs = np.fromstring(coefs, "f", sep = " ")
    embeddings_index[word] = coefs

print("Found %s word vectors." % len(embeddings_index))
#embeddings_index.keys()
#embeddings_index['is']

Found 400000 word vectors.


In [None]:
# Preparing/Creating the GloVe word-embeddings matrix
embedding_dimensions = 100
word_index = tokens.word_index

embedding_matrix = np.zeros((len(word_index) + 1, embedding_dimensions))
for word, i in word_index.items():
  embedding_vector = embeddings_index.get(word)
  if embedding_vector is not None:
    embedding_matrix[i] = embedding_vector

In [None]:
print(embedding_matrix.shape)

(20276, 100)
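
The matrix has one row per vocabulary word plus row 0 for padding, and words missing from GloVe keep an all-zero row. The sketch below rebuilds the same loop on a toy vocabulary and also reports coverage, i.e. the fraction of words that received a vector. The vectors and names (`toy_vectors`, `toy_word_index`) are made up for illustration.

```python
import numpy as np

# Hypothetical 2-dimensional "GloVe" vectors and tokenizer vocabulary.
toy_vectors = {"class": np.array([0.1, 0.2]), "hard": np.array([0.3, 0.4])}
toy_word_index = {"class": 1, "hard": 2, "looney": 3}

dim = 2
matrix = np.zeros((len(toy_word_index) + 1, dim))  # row 0 stays zero for padding
found = 0
for word, i in toy_word_index.items():
    vec = toy_vectors.get(word)
    if vec is not None:
        matrix[i] = vec
        found += 1

coverage = found / len(toy_word_index)
print(matrix.shape, f"coverage={coverage:.2f}")
```

A low coverage value on the real vocabulary would indicate that many tokens, such as names or misspellings, are fed to the model as zero vectors.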


In [None]:
# Define the Embedding layer
embedding_layer = tf.keras.layers.Embedding(input_dim = len(word_index) + 1,
                                            output_dim = embedding_dimensions,
                                            input_length = max_length,
                                            weights = [embedding_matrix],
                                            trainable = False,
                                            mask_zero=True)

In [None]:
print(padded_sequences.shape)
print(padded_sequences[0].shape[0])

(79920, 72)
72


In [None]:
# I made a mistake of embedding it twice. This will yield some rather
# interesting findings... (such as the extra dimension and with reshaping
# yielded extremely poor performance)

Splitting the data (augmented plus original) into 60% training, 20% validation, 20% testing while shuffling

In [None]:
# Prepare data for splitting
x_np = padded_sequences
print(x_np.shape, type(x_np))
print(combined_star.shape, combined_diff.shape)

(79920, 72) <class 'numpy.ndarray'>
(79920,) (79920,)


In [None]:
x_tv, x_test, ys_tv, ys_test, yd_tv, yd_test = train_test_split(
    x_np, combined_star, combined_diff, test_size = 0.2, shuffle = True)

x_train, x_val, ys_train, ys_val, yd_train, yd_val = train_test_split(
    x_tv, ys_tv, yd_tv, test_size = 0.25, shuffle = True)

# Sanity check
print(x_train.shape, x_val.shape, x_test.shape)
print(ys_train.shape, ys_val.shape, ys_test.shape)
print(yd_train.shape, yd_val.shape, yd_test.shape)

(47952, 72) (15984, 72) (15984, 72)
(47952,) (15984,) (15984,)
(47952,) (15984,) (15984,)
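
The 60/20/20 proportions follow from the two calls: 20% is held out for testing, then 25% of the remaining 80% (that is, 20% of the total) goes to validation. A quick arithmetic check of the shapes printed above:

```python
n_total = 79920

n_test = round(n_total * 0.2)   # first split: 20% for testing
n_tv = n_total - n_test         # remaining 80% for train + validation
n_val = round(n_tv * 0.25)      # second split: 25% of 80% = 20% of the total
n_train = n_tv - n_val

print(n_train, n_val, n_test)
print(n_train / n_total, n_val / n_total, n_test / n_total)
```

One thing this check does not cover: because the split happens after merging, augmented variants of a comment can land in a different split than the original comment.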


In [None]:
# One-hot-encoding the labels
ys_train = tf.keras.utils.to_categorical(ys_train, num_classes = 9)
ys_val = tf.keras.utils.to_categorical(ys_val, num_classes = 9)
ys_test = tf.keras.utils.to_categorical(ys_test, num_classes = 9)
yd_train = tf.keras.utils.to_categorical(yd_train, num_classes = 9)
yd_val = tf.keras.utils.to_categorical(yd_val, num_classes = 9)
yd_test = tf.keras.utils.to_categorical(yd_test, num_classes = 9)
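
As far as I can tell, `to_categorical` casts its input to integers, so a rating is used directly as the class index (rating 5.0 becomes index 5) and any half-step rating such as 4.5 would be truncated to 4. If the dataset does contain half-step ratings from 1.0 to 5.0 (an assumption; the rows shown above only contain whole numbers), an explicit mapping to the 9 classes could look like this sketch. The helper names are hypothetical.

```python
def rating_to_class(rating):
    """Map a rating in {1.0, 1.5, ..., 5.0} to a class index in 0..8."""
    return int(round((rating - 1.0) * 2))

def class_to_rating(index):
    """Inverse mapping: class index 0..8 back to a rating 1.0..5.0."""
    return 1.0 + index / 2

def one_hot(index, num_classes=9):
    """Plain-Python one-hot vector for a single class index."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]

for r in (1.0, 4.5, 5.0):
    c = rating_to_class(r)
    print(r, c, class_to_rating(c), one_hot(c))
```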

In [None]:
# Apply early stopping regularisation technique
early_stopping = tf.keras.callbacks.EarlyStopping(monitor = "val_loss",
                                                  patience = 3)

The following steps model the original data only, without any augmented data.

This comparison shows whether data augmentation truly boosts performance.


In [None]:
# Original data with no augmented data
# Code comments are omitted here; scroll up to the augmented pipeline for explanations
og_tokens = tf.keras.preprocessing.text.Tokenizer(num_words = 20000)
og_tokens.fit_on_texts(cleanest_text)
og_integer_sequences = og_tokens.texts_to_sequences(cleanest_text)
print(og_integer_sequences[69])
print(len(og_integer_sequences))

og_max_length = 0
for k in og_integer_sequences:
  if len(k) > og_max_length:
    og_max_length = len(k)
print(og_max_length)

[105, 1206, 51, 2985, 1279, 155, 32, 16, 274, 45, 43, 138, 766, 392, 3531, 1, 1456, 24, 5255, 698, 1052, 164, 1069, 380, 1859, 2208, 1]
19995
72


In [None]:
og_padded_sequences = tf.keras.utils.pad_sequences(og_integer_sequences,
                                                maxlen = og_max_length,
                                                padding = "post")
print(og_padded_sequences.shape)
print(og_padded_sequences[69])
print(og_integer_sequences[69])

(19995, 72)
[ 105 1206   51 2985 1279  155   32   16  274   45   43  138  766  392
 3531    1 1456   24 5255  698 1052  164 1069  380 1859 2208    1    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0    0    0    0    0    0    0
    0    0]
[105, 1206, 51, 2985, 1279, 155, 32, 16, 274, 45, 43, 138, 766, 392, 3531, 1, 1456, 24, 5255, 698, 1052, 164, 1069, 380, 1859, 2208, 1]


In [None]:
og_word_index = og_tokens.word_index
og_embedding_matrix = np.zeros((len(og_word_index) + 1, embedding_dimensions))
# Fill the matrix for the original data: iterate over og_word_index and
# write into og_embedding_matrix (not the augmented-data versions)
for word, i in og_word_index.items():
  embedding_vector = embeddings_index.get(word)
  if embedding_vector is not None:
    og_embedding_matrix[i] = embedding_vector
print(og_embedding_matrix.shape)

(16228, 100)


In [None]:
og_embedding_layer = tf.keras.layers.Embedding(input_dim = len(og_word_index) + 1,
                                            output_dim = embedding_dimensions,
                                            input_length = og_max_length,
                                            weights = [og_embedding_matrix],
                                            trainable = False,
                                            mask_zero = True)
print(og_padded_sequences.shape)
print(og_padded_sequences[0].shape[0])

(19995, 72)
72


In [None]:
og_x_np = og_padded_sequences
print(og_x_np.shape, type(og_x_np))
print(y_star.shape, y_diff.shape)

(19995, 72) <class 'numpy.ndarray'>
(19995,) (19995,)


In [None]:
og_x_tv, og_x_test, og_ys_tv, og_ys_test, og_yd_tv, og_yd_test = train_test_split(
    og_x_np, y_star, y_diff, test_size = 0.2, shuffle = True)
og_x_train, og_x_val, og_ys_train, og_ys_val, og_yd_train, og_yd_val = train_test_split(
    og_x_tv, og_ys_tv, og_yd_tv, test_size = 0.25, shuffle = True)
print(og_x_train.shape, og_x_val.shape, og_x_test.shape)
print(og_ys_train.shape, og_ys_val.shape, og_ys_test.shape)
print(og_yd_train.shape, og_yd_val.shape, og_yd_test.shape)
og_ys_train = tf.keras.utils.to_categorical(og_ys_train, num_classes = 9)
og_ys_val = tf.keras.utils.to_categorical(og_ys_val, num_classes = 9)
og_ys_test = tf.keras.utils.to_categorical(og_ys_test, num_classes = 9)
og_yd_train = tf.keras.utils.to_categorical(og_yd_train, num_classes = 9)
og_yd_val = tf.keras.utils.to_categorical(og_yd_val, num_classes = 9)
og_yd_test = tf.keras.utils.to_categorical(og_yd_test, num_classes = 9)

(11997, 72) (3999, 72) (3999, 72)
(11997,) (3999,) (3999,)
(11997,) (3999,) (3999,)


In [None]:
# Define the log directory for the TensorBoard
log_bo_star_dir = "logs_baselineo_star/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_bo_callback_star = tf.keras.callbacks.TensorBoard(log_dir = log_bo_star_dir,
                                                      histogram_freq = 1)
obaseline_model_star = tf.keras.Sequential()
obaseline_model_star.add(og_embedding_layer)
obaseline_model_star.add(tf.keras.layers.Bidirectional(tf.keras.layers.GRU(64)))
obaseline_model_star.add(tf.keras.layers.Dropout(0.5))
obaseline_model_star.add(tf.keras.layers.Dense(9, activation='softmax'))
obaseline_model_star.summary()

obaseline_model_star.compile(loss='categorical_crossentropy', optimizer="adam",
              metrics=['accuracy', 'mse', 'mae'])

obaseline_model_star.fit(og_x_train, og_ys_train, validation_data=(og_x_val, og_ys_val),
          epochs = 8, batch_size = 32,
          callbacks = [early_stopping, tensorboard_bo_callback_star])

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_1 (Embedding)     (None, 72, 100)           1622800   
                                                                 
 bidirectional (Bidirectiona  (None, 128)              63744     
 l)                                                              
                                                                 
 dropout (Dropout)           (None, 128)               0         
                                                                 
 dense (Dense)               (None, 9)                 1161      
                                                                 
Total params: 1,687,705
Trainable params: 64,905
Non-trainable params: 1,622,800
_________________________________________________________________
Epoch 1/8
Epoch 2/8
Epoch 3/8
Epoch 4/8
Epoch 5/8
Epoch 6/8
Epoch 7/8
Epoch 8/8


<keras.callbacks.History at 0x7f979c8c0df0>

In [None]:
%tensorboard --logdir logs_baselineo_star/fit --port 6007

In [None]:
# Evaluate the model_star on test set
obaseline_model_star_evaluation = obaseline_model_star.evaluate(og_x_test, og_ys_test)
print("Original Model (star) test loss:", obaseline_model_star_evaluation[0])
print("Original Model (star) test accuracy:", obaseline_model_star_evaluation[1])

# Make predictions
obaseline_ys_predictions = obaseline_model_star.predict(og_x_test)

# See MAE and MSE
print("Original Baseline Model (star) test MAE: ", mean_absolute_error(og_ys_test, obaseline_ys_predictions))
print("Original Baseline Model (star) test MSE: ", mean_squared_error(og_ys_test, obaseline_ys_predictions))

Original Model (star) test loss: 1.5264798402786255
Original Model (star) test accuracy: 0.34533634781837463
Original Baseline Model (star) test MAE:  0.16904278
Original Baseline Model (star) test MSE:  0.08486905
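
The MAE and MSE above compare one-hot vectors with softmax probabilities element by element, so they are averaged over the 9 outputs and are not in rating units. To get an error in stars, one option is to take the argmax of each row first. This is a sketch with hypothetical helper names and toy data; it relies on the notebook's encoding, where `to_categorical` is applied directly to the ratings so the class index equals the integer rating.

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])

def rating_mae(one_hot_true, predicted_probs):
    """MAE between true and predicted class indices (equal to ratings here)."""
    errors = [abs(argmax(t) - argmax(p)) for t, p in zip(one_hot_true, predicted_probs)]
    return sum(errors) / len(errors)

# Toy data: true ratings 5 and 3, predicted ratings 4 and 3.
y_true = [[0, 0, 0, 0, 0, 1, 0, 0, 0],
          [0, 0, 0, 1, 0, 0, 0, 0, 0]]
y_prob = [[0.0, 0.0, 0.0, 0.1, 0.6, 0.3, 0.0, 0.0, 0.0],
          [0.0, 0.1, 0.2, 0.7, 0.0, 0.0, 0.0, 0.0, 0.0]]

print(rating_mae(y_true, y_prob))
```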


In [None]:
log_bo_diff_dir = "logs_baselineo_diff/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_bo_callback_diff = tf.keras.callbacks.TensorBoard(log_dir = log_bo_diff_dir,
                                                      histogram_freq = 1)
obaseline_model_diff = tf.keras.Sequential()
obaseline_model_diff.add(og_embedding_layer)
obaseline_model_diff.add(tf.keras.layers.Bidirectional(tf.keras.layers.GRU(64)))
obaseline_model_diff.add(tf.keras.layers.Dropout(0.5))
obaseline_model_diff.add(tf.keras.layers.Dense(9, activation='softmax'))
obaseline_model_diff.summary()

obaseline_model_diff.compile(loss='categorical_crossentropy', optimizer = "adam",
              metrics=['accuracy', 'mse', 'mae'])

obaseline_model_diff.fit(og_x_train, og_yd_train, validation_data = (og_x_val, og_yd_val),
          epochs = 8, batch_size = 32,
          callbacks = [early_stopping, tensorboard_bo_callback_diff])

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_1 (Embedding)     (None, 72, 100)           1622800   
                                                                 
 bidirectional_1 (Bidirectio  (None, 128)              63744     
 nal)                                                            
                                                                 
 dropout_1 (Dropout)         (None, 128)               0         
                                                                 
 dense_1 (Dense)             (None, 9)                 1161      
                                                                 
Total params: 1,687,705
Trainable params: 64,905
Non-trainable params: 1,622,800
_________________________________________________________________
Epoch 1/8
Epoch 2/8
Epoch 3/8
Epoch 4/8
Epoch 5/8
Epoch 6/8
Epoch 7/8
Epoch 8/8


<keras.callbacks.History at 0x7f979d2ae080>

In [None]:
%tensorboard --logdir logs_baselineo_diff/fit --port 6008

In [None]:
# Evaluate the model_diff on test set
obaseline_model_diff_evaluation = obaseline_model_diff.evaluate(og_x_test, og_yd_test)
print("Original Model (diff) test loss:", obaseline_model_diff_evaluation[0])
print("Original Model (diff) test accuracy:", obaseline_model_diff_evaluation[1])

# Make predictions
obaseline_yd_predictions = obaseline_model_diff.predict(og_x_test)

# See MAE and MSE
print("Original Baseline Model (diff) test MAE: ", mean_absolute_error(og_yd_test, obaseline_yd_predictions))
print("Original Baseline Model (diff) test MSE: ", mean_squared_error(og_yd_test, obaseline_yd_predictions))

Original Model (diff) test loss: 1.5869672298431396
Original Model (diff) test accuracy: 0.2723180651664734
Original Baseline Model (diff) test MAE:  0.17599066
Original Baseline Model (diff) test MSE:  0.08782036


Note: As one can see, the model ended up stagnating, which I attribute to the small dataset. In other words, the model "ran out of" data to learn from. This, along with the instructor's suggestions in the project description, prompted the idea of gathering more data.

In addition to the classification model, a regression model was also implemented. Prior to data augmentation, its metrics showed that it was overfitting extremely, much more than the classification version. Therefore, the classification model was chosen going forward.

L1, L2, and L1_L2 regularization, additional layers, combinations of GRU and LSTM layers, different numbers of nodes, dropout techniques, different learning rates, MinMaxScaling, and different optimizers were considered in constructing the final model.

# Models using augmented data

Here I use two models: model_star predicts the star rating and model_diff predicts the difficulty rating.

Note: The RMSprop and Adam optimizers performed similarly.

The following is the baseline model with data augmentation, prior to hyperparameter tuning.

The baseline model consists of a simple architecture.

In [None]:
# Define the log directory for the TensorBoard
log_b_star_dir = "logs_baseline_star/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_b_callback_star = tf.keras.callbacks.TensorBoard(log_dir = log_b_star_dir,
                                                      histogram_freq = 1)
baseline_model_star = tf.keras.Sequential()
baseline_model_star.add(embedding_layer)
baseline_model_star.add(tf.keras.layers.Bidirectional(tf.keras.layers.GRU(64)))
baseline_model_star.add(tf.keras.layers.Dropout(0.5))
baseline_model_star.add(tf.keras.layers.Dense(9, activation='softmax'))
baseline_model_star.summary()

baseline_model_star.compile(loss='categorical_crossentropy', optimizer="adam",
              metrics=['accuracy', 'mse', 'mae'])

baseline_model_star.fit(x_train, ys_train, validation_data=(x_val, ys_val),
          epochs = 8, batch_size = 32,
          callbacks = [early_stopping, tensorboard_b_callback_star])

Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 74, 100)           2024000   
                                                                 
 bidirectional_2 (Bidirectio  (None, 128)              63744     
 nal)                                                            
                                                                 
 dropout_2 (Dropout)         (None, 128)               0         
                                                                 
 dense_2 (Dense)             (None, 9)                 1161      
                                                                 
Total params: 2,088,905
Trainable params: 64,905
Non-trainable params: 2,024,000
_________________________________________________________________
Epoch 1/8
Epoch 2/8
Epoch 3/8
Epoch 4/8
Epoch 5/8
Epoch 6/8
Epoch 7/8
Epoch 8/8


<keras.callbacks.History at 0x7f9797ab9750>

In [None]:
%tensorboard --logdir logs_baseline_star/fit --port 6009

In [None]:
# Evaluate the model_star on test set
baseline_model_star_evaluation = baseline_model_star.evaluate(x_test, ys_test)
print("Baseline Model (star) test loss:", baseline_model_star_evaluation[0])
print("Baseline Model (star) test accuracy:", baseline_model_star_evaluation[1])

# Make predictions
baseline_ys_predictions = baseline_model_star.predict(x_test)

# MAE and MSE between the one-hot labels and the predicted class probabilities
print("Baseline Model (star) test MAE: ", mean_absolute_error(ys_test, baseline_ys_predictions))
print("Baseline Model (star) test MSE: ", mean_squared_error(ys_test, baseline_ys_predictions))

Baseline Model (star) test loss: 0.9880515933036804
Baseline Model (star) test accuracy: 0.5828328132629395
Baseline Model (star) test MAE:  0.116656095
Baseline Model (star) test MSE:  0.059395954
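
Note that the MAE and MSE printed above compare the one-hot labels with the softmax probabilities, so they measure per-class probability error rather than error in rating points. The sketch below shows one way to express the error on the rating scale instead. It assumes the 9 output classes correspond to ratings 1.0 through 5.0 in steps of 0.5, in that order; that mapping and the names `RATING_VALUES` and `to_ratings` are illustrative assumptions, not taken from the notebook, and the arrays are toy stand-ins for `ys_test` and the model predictions.

```python
import numpy as np

# Assumed mapping: class index 0..8 -> rating 1.0, 1.5, ..., 5.0.
RATING_VALUES = np.linspace(1.0, 5.0, num=9)

def to_ratings(class_scores):
    """Map rows of one-hot labels or softmax probabilities to rating values."""
    class_indices = np.argmax(class_scores, axis=1)
    return RATING_VALUES[class_indices]

# Toy stand-ins for ys_test (one-hot) and model.predict(x_test) (probabilities).
toy_labels = np.eye(9)[[8, 0, 4]]   # true ratings 5.0, 1.0, 3.0
toy_probs = np.full((3, 9), 0.05)
toy_probs[0, 8] = 0.60              # predicts 5.0 (correct)
toy_probs[1, 2] = 0.60              # predicts 2.0 (off by 1.0)
toy_probs[2, 5] = 0.60              # predicts 3.5 (off by 0.5)

true_ratings = to_ratings(toy_labels)
pred_ratings = to_ratings(toy_probs)
rating_mae = np.mean(np.abs(true_ratings - pred_ratings))
print("MAE in rating points:", rating_mae)
```

With real data, `toy_labels` and `toy_probs` would be replaced by `ys_test` and `baseline_ys_predictions`, provided the class ordering assumption holds for how the labels were encoded.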


Train the same baseline architecture on the difficulty labels (yd_train).

In [None]:
log_b_diff_dir = "logs_baseline_diff/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_b_callback_diff = tf.keras.callbacks.TensorBoard(log_dir = log_b_diff_dir,
                                                      histogram_freq = 1)
baseline_model_diff = tf.keras.Sequential()
baseline_model_diff.add(embedding_layer)
baseline_model_diff.add(tf.keras.layers.Bidirectional(tf.keras.layers.GRU(64)))
baseline_model_diff.add(tf.keras.layers.Dropout(0.5))
baseline_model_diff.add(tf.keras.layers.Dense(9, activation='softmax'))
baseline_model_diff.summary()

baseline_model_diff.compile(loss='categorical_crossentropy', optimizer = "adam",
              metrics=['accuracy', 'mse', 'mae'])

baseline_model_diff.fit(x_train, yd_train, validation_data = (x_val, yd_val),
          epochs = 8, batch_size = 32,
          callbacks = [early_stopping, tensorboard_b_callback_diff])

Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 74, 100)           2024000   
                                                                 
 bidirectional_3 (Bidirectio  (None, 128)              63744     
 nal)                                                            
                                                                 
 dropout_3 (Dropout)         (None, 128)               0         
                                                                 
 dense_3 (Dense)             (None, 9)                 1161      
                                                                 
Total params: 2,088,905
Trainable params: 64,905
Non-trainable params: 2,024,000
_________________________________________________________________
Epoch 1/8
Epoch 2/8
Epoch 3/8
Epoch 4/8
Epoch 5/8
Epoch 6/8
Epoch 7/8
Epoch 8/8


<keras.callbacks.History at 0x7f9769249ba0>

In [None]:
%tensorboard --logdir logs_baseline_diff/fit --port 6010

In [None]:
# Evaluate the model_diff on test set
baseline_model_diff_evaluation = baseline_model_diff.evaluate(x_test, yd_test)
print("Baseline Model (diff) test loss:", baseline_model_diff_evaluation[0])
print("Baseline Model (diff) test accuracy:", baseline_model_diff_evaluation[1])

# Make predictions
baseline_yd_predictions = baseline_model_diff.predict(x_test)

# MAE and MSE between the one-hot labels and the predicted class probabilities
print("Baseline Model (diff) test MAE: ", mean_absolute_error(yd_test, baseline_yd_predictions))
print("Baseline Model (diff) test MSE: ", mean_squared_error(yd_test, baseline_yd_predictions))

Baseline Model (diff) test loss: 1.2666878700256348
Baseline Model (diff) test accuracy: 0.4746621549129486
Baseline Model (diff) test MAE:  0.14045009
Baseline Model (diff) test MSE:  0.07239273


Hypertuning the two models with L2 regularization and an additional bidirectional GRU layer with fewer units (32).

In [None]:
log_h1_star_dir = "logs_h1_star/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_h1_callback_star = tf.keras.callbacks.TensorBoard(log_dir = log_h1_star_dir,
                                                      histogram_freq = 1)
h1_model_star = tf.keras.Sequential()
h1_model_star.add(embedding_layer)
h1_model_star.add(tf.keras.layers.Bidirectional(tf.keras.layers.GRU(64, return_sequences = True,
                                                                 kernel_regularizer = tf.keras.regularizers.l2(0.000001),
                                                                 bias_regularizer = tf.keras.regularizers.l2(0.000001))))
h1_model_star.add(tf.keras.layers.Bidirectional(tf.keras.layers.GRU(32,
                                                                 kernel_regularizer = tf.keras.regularizers.l2(0.000001),
                                                                 bias_regularizer = tf.keras.regularizers.l2(0.000001))))
h1_model_star.add(tf.keras.layers.Dropout(0.5))
h1_model_star.add(tf.keras.layers.Dense(9, activation='softmax'))

h1_model_star.summary()

h1_model_star.compile(loss='categorical_crossentropy', optimizer="adam",
              metrics=['accuracy', 'mse', 'mae'])

h1_model_star.fit(x_train, ys_train, validation_data=(x_val, ys_val),
          epochs = 8, batch_size = 32,
          callbacks = [early_stopping, tensorboard_h1_callback_star])

Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 74, 100)           2024000   
                                                                 
 bidirectional_4 (Bidirectio  (None, 74, 128)          63744     
 nal)                                                            
                                                                 
 bidirectional_5 (Bidirectio  (None, 64)               31104     
 nal)                                                            
                                                                 
 dropout_4 (Dropout)         (None, 64)                0         
                                                                 
 dense_4 (Dense)             (None, 9)                 585       
                                                                 
Total params: 2,119,433
Trainable params: 95,433
Non-t

<keras.callbacks.History at 0x7f97601c7850>

In [None]:
%tensorboard --logdir logs_h1_star/fit --port 6011

In [None]:
h1_model_star_evaluation = h1_model_star.evaluate(x_test, ys_test)
print("h1 Model (star) test loss:", h1_model_star_evaluation[0])
print("h1 Model (star) test accuracy:", h1_model_star_evaluation[1])

# Make predictions
h1_ys_predictions = h1_model_star.predict(x_test)

# MAE and MSE between the one-hot labels and the predicted class probabilities
print("h1 Model (star) test MAE: ", mean_absolute_error(ys_test, h1_ys_predictions))
print("h1 Model (star) test MSE: ", mean_squared_error(ys_test, h1_ys_predictions))

h1 Model (star) test loss: 0.9555679559707642
h1 Model (star) test accuracy: 0.6123623847961426
h1 Model (star) test MAE:  0.10414232
h1 Model (star) test MSE:  0.05606264
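
The trainable parameter counts reported in the model summaries above (64,905 for the baseline and 95,433 for h1) can be reproduced by hand, which is a quick check that the layers are wired as intended. The sketch below assumes the Keras GRU default `reset_after=True`, which gives each gate two bias vectors; the helper names are hypothetical.

```python
def gru_params(input_dim, units):
    """Parameters of one GRU direction with reset_after=True (assumed default).

    Each of the 3 gates has an input kernel, a recurrent kernel,
    and two bias vectors.
    """
    return 3 * (units * (input_dim + units) + 2 * units)

def bidirectional_gru_params(input_dim, units):
    """A Bidirectional wrapper holds a forward and a backward copy."""
    return 2 * gru_params(input_dim, units)

def dense_params(input_dim, units):
    """Weight matrix plus one bias per output unit."""
    return input_dim * units + units

embedding_dim = 100
first_gru = bidirectional_gru_params(embedding_dim, 64)   # 63744
second_gru = bidirectional_gru_params(2 * 64, 32)         # 31104
baseline_head = dense_params(2 * 64, 9)                   # 1161
h1_head = dense_params(2 * 32, 9)                         # 585

baseline_trainable = first_gru + baseline_head
h1_trainable = first_gru + second_gru + h1_head
print("baseline trainable:", baseline_trainable)
print("h1 trainable:", h1_trainable)
```

The second bidirectional layer receives 128 features per time step (64 from each direction of the first layer), which is why its input dimension is `2 * 64`.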


In [None]:
log_h1_diff_dir = "logs_h1_diff/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_h1_callback_diff = tf.keras.callbacks.TensorBoard(log_dir = log_h1_diff_dir,
                                                      histogram_freq = 1)
h1_model_diff = tf.keras.Sequential()
h1_model_diff.add(embedding_layer)
h1_model_diff.add(tf.keras.layers.Bidirectional(tf.keras.layers.GRU(64, return_sequences = True,
                                                                 kernel_regularizer = tf.keras.regularizers.l2(0.000001),
                                                                 bias_regularizer = tf.keras.regularizers.l2(0.000001))))
h1_model_diff.add(tf.keras.layers.Bidirectional(tf.keras.layers.GRU(32,
                                                                 kernel_regularizer = tf.keras.regularizers.l2(0.000001),
                                                                 bias_regularizer = tf.keras.regularizers.l2(0.000001))))
h1_model_diff.add(tf.keras.layers.Dropout(0.5))
h1_model_diff.add(tf.keras.layers.Dense(9, activation='softmax'))
h1_model_diff.summary()

h1_model_diff.compile(loss='categorical_crossentropy', optimizer="adam",
              metrics=['accuracy', 'mse', 'mae'])

h1_model_diff.fit(x_train, yd_train, validation_data=(x_val, yd_val),
          epochs = 8, batch_size = 32,
          callbacks = [early_stopping, tensorboard_h1_callback_diff])

Model: "sequential_5"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 74, 100)           2024000   
                                                                 
 bidirectional_6 (Bidirectio  (None, 74, 128)          63744     
 nal)                                                            
                                                                 
 bidirectional_7 (Bidirectio  (None, 64)               31104     
 nal)                                                            
                                                                 
 dropout_5 (Dropout)         (None, 64)                0         
                                                                 
 dense_5 (Dense)             (None, 9)                 585       
                                                                 
Total params: 2,119,433
Trainable params: 95,433
Non-t

<keras.callbacks.History at 0x7f97603cbd00>

In [None]:
%tensorboard --logdir logs_h1_diff/fit --port 6012

In [None]:
# Evaluate the model_diff on test set
h1_model_diff_evaluation = h1_model_diff.evaluate(x_test, yd_test)
print("h1 Model (diff) test loss:", h1_model_diff_evaluation[0])
print("h1 Model (diff) test accuracy:", h1_model_diff_evaluation[1])

# Make predictions
h1_yd_predictions = h1_model_diff.predict(x_test)

# MAE and MSE between the one-hot labels and the predicted class probabilities
print("h1 Model (diff) test MAE: ", mean_absolute_error(yd_test, h1_yd_predictions))
print("h1 Model (diff) test MSE: ", mean_squared_error(yd_test, h1_yd_predictions))

h1 Model (diff) test loss: 1.2698380947113037
h1 Model (diff) test accuracy: 0.4916166067123413
h1 Model (diff) test MAE:  0.13090636
h1 Model (diff) test MSE:  0.071683496


As one may see, adding a second bidirectional GRU layer and L2 regularizers improved test accuracy for both targets (star: 0.583 to 0.612; difficulty: 0.475 to 0.492), although the difficulty test loss stayed essentially unchanged (1.267 vs. 1.270).

The next step was to see whether applying dropout and recurrent_dropout to the GRU layers would improve performance as well.

A model with dropout and recurrent dropout on each of the two bidirectional GRU layers was attempted, but Google Colab estimated about 30 minutes per epoch and TensorFlow emitted the warning below. That model is therefore on hold, and work continues with the h1 model.

Note: WARNING:tensorflow:Layer gru_8 will not use cuDNN kernels since it doesn't meet the criteria. It will use a generic GPU kernel as fallback when running on GPU.
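
The warning above is expected: per the Keras GRU documentation, the fast cuDNN kernel is only used when the layer keeps `activation='tanh'`, `recurrent_activation='sigmoid'`, `recurrent_dropout=0`, `unroll=False`, `use_bias=True`, and `reset_after=True` (and any masked inputs are right-padded). Setting `recurrent_dropout` to a non-zero value therefore forces the slower generic kernel, which likely explains the long epoch times. The helper below is a hypothetical, plain-Python illustration of that checklist; it is not part of TensorFlow, does not inspect a real layer, and the dropout rate of 0.2 is only an example value.

```python
# Layer arguments that must keep these values for the cuDNN GRU kernel,
# as listed in the Keras GRU documentation.
CUDNN_GRU_REQUIREMENTS = {
    "activation": "tanh",
    "recurrent_activation": "sigmoid",
    "recurrent_dropout": 0.0,
    "unroll": False,
    "use_bias": True,
    "reset_after": True,
}

def cudnn_blockers(gru_kwargs):
    """Return the argument names that would disable the cuDNN kernel.

    Arguments missing from gru_kwargs are assumed to keep their defaults,
    which all satisfy the requirements.
    """
    return [
        name
        for name, required in CUDNN_GRU_REQUIREMENTS.items()
        if gru_kwargs.get(name, required) != required
    ]

# Input dropout alone is not on the checklist; recurrent dropout is.
h1_style = {"units": 64, "return_sequences": True}
with_input_dropout = {"units": 64, "dropout": 0.2}
with_recurrent_dropout = {"units": 64, "dropout": 0.2, "recurrent_dropout": 0.2}

for kwargs in (h1_style, with_input_dropout, with_recurrent_dropout):
    print(kwargs, "->", cudnn_blockers(kwargs) or "cuDNN-compatible")
```

One design option that follows from this checklist is to try the `dropout` argument on its own first, since it is not among the conditions that disable the cuDNN kernel.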

Train the best-performing model, h1, for more epochs (up to 15). 20 would have been preferred, but 15 was chosen due to time and GPU constraints.

In [None]:
log_hh1_star_dir = "logs_hh1_star/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_hh1_callback_star = tf.keras.callbacks.TensorBoard(log_dir = log_hh1_star_dir,
                                                      histogram_freq = 1)
hh1_model_star = tf.keras.Sequential()
hh1_model_star.add(embedding_layer)
hh1_model_star.add(tf.keras.layers.Bidirectional(tf.keras.layers.GRU(64, return_sequences = True,
                                                                 kernel_regularizer = tf.keras.regularizers.l2(0.000001),
                                                                 bias_regularizer = tf.keras.regularizers.l2(0.000001))))
hh1_model_star.add(tf.keras.layers.Bidirectional(tf.keras.layers.GRU(32,
                                                                 kernel_regularizer = tf.keras.regularizers.l2(0.000001),
                                                                 bias_regularizer = tf.keras.regularizers.l2(0.000001))))
hh1_model_star.add(tf.keras.layers.Dropout(0.5))
hh1_model_star.add(tf.keras.layers.Dense(9, activation='softmax'))

hh1_model_star.summary()

hh1_model_star.compile(loss='categorical_crossentropy', optimizer="adam",
              metrics=['accuracy', 'mse', 'mae'])

hh1_model_star.fit(x_train, ys_train, validation_data=(x_val, ys_val),
          epochs = 15, batch_size = 32,
          callbacks = [early_stopping, tensorboard_hh1_callback_star])

Model: "sequential_10"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_2 (Embedding)     (None, 72, 100)           2027600   
                                                                 
 bidirectional_16 (Bidirecti  (None, 72, 128)          63744     
 onal)                                                           
                                                                 
 bidirectional_17 (Bidirecti  (None, 64)               31104     
 onal)                                                           
                                                                 
 dropout_10 (Dropout)        (None, 64)                0         
                                                                 
 dense_10 (Dense)            (None, 9)                 585       
                                                                 
Total params: 2,123,033
Trainable params: 95,433
Non-

<keras.callbacks.History at 0x7f97a1d41660>

In [None]:
%tensorboard --logdir logs_hh1_star/fit --port 6013

In [None]:
hh1_model_star_evaluation = hh1_model_star.evaluate(x_test, ys_test)
print("hh1 Model (star) test loss:", hh1_model_star_evaluation[0])
print("hh1 Model (star) test accuracy:", hh1_model_star_evaluation[1])

# Make predictions
hh1_ys_predictions = hh1_model_star.predict(x_test)

# MAE and MSE between the one-hot labels and the predicted class probabilities
print("hh1 Model (star) test MAE: ", mean_absolute_error(ys_test, hh1_ys_predictions))
print("hh1 Model (star) test MSE: ", mean_squared_error(ys_test, hh1_ys_predictions))


hh1 Model (star) test loss: 0.9485245943069458
hh1 Model (star) test accuracy: 0.6744869947433472
hh1 Model (star) test MAE:  0.084676474
hh1 Model (star) test MSE:  0.05126253


In [None]:
log_hh1_diff_dir = "logs_hh1_diff/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_hh1_callback_diff = tf.keras.callbacks.TensorBoard(log_dir = log_hh1_diff_dir,
                                                      histogram_freq = 1)
hh1_model_diff = tf.keras.Sequential()
hh1_model_diff.add(embedding_layer)
hh1_model_diff.add(tf.keras.layers.Bidirectional(tf.keras.layers.GRU(64, return_sequences = True,
                                                                 kernel_regularizer = tf.keras.regularizers.l2(0.000001),
                                                                 bias_regularizer = tf.keras.regularizers.l2(0.000001))))
hh1_model_diff.add(tf.keras.layers.Bidirectional(tf.keras.layers.GRU(32,
                                                                 kernel_regularizer = tf.keras.regularizers.l2(0.000001),
                                                                 bias_regularizer = tf.keras.regularizers.l2(0.000001))))
hh1_model_diff.add(tf.keras.layers.Dropout(0.5))
hh1_model_diff.add(tf.keras.layers.Dense(9, activation='softmax'))
hh1_model_diff.summary()

hh1_model_diff.compile(loss='categorical_crossentropy', optimizer="adam",
              metrics=['accuracy', 'mse', 'mae'])

hh1_model_diff.fit(x_train, yd_train, validation_data=(x_val, yd_val),
          epochs = 15, batch_size = 32,
          callbacks = [early_stopping, tensorboard_hh1_callback_diff])

Model: "sequential_11"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_2 (Embedding)     (None, 72, 100)           2027600   
                                                                 
 bidirectional_18 (Bidirecti  (None, 72, 128)          63744     
 onal)                                                           
                                                                 
 bidirectional_19 (Bidirecti  (None, 64)               31104     
 onal)                                                           
                                                                 
 dropout_11 (Dropout)        (None, 64)                0         
                                                                 
 dense_11 (Dense)            (None, 9)                 585       
                                                                 
Total params: 2,123,033
Trainable params: 95,433
Non-

<keras.callbacks.History at 0x7f96e96d16f0>

In [None]:
%tensorboard --logdir logs_hh1_diff/fit --port 6014

In [None]:
# Evaluate the model_diff on test set
hh1_model_diff_evaluation = hh1_model_diff.evaluate(x_test, yd_test)
print("hh1 Model (diff) test loss:", hh1_model_diff_evaluation[0])
print("hh1 Model (diff) test accuracy:", hh1_model_diff_evaluation[1])

# Make predictions
hh1_yd_predictions = hh1_model_diff.predict(x_test)

# MAE and MSE between the one-hot labels and the predicted class probabilities
print("hh1 Model (diff) test MAE: ", mean_absolute_error(yd_test, hh1_yd_predictions))
print("hh1 Model (diff) test MSE: ", mean_squared_error(yd_test, hh1_yd_predictions))


hh1 Model (diff) test loss: 1.2498631477355957
hh1 Model (diff) test accuracy: 0.5589964985847473
hh1 Model (diff) test MAE:  0.11251475
hh1 Model (diff) test MSE:  0.06696774


In conclusion, the results above show that adding a second bidirectional GRU layer, applying L2 regularization to both bidirectional GRU layers, and training for up to 15 epochs increased test accuracy substantially over the augmented-data baseline (star: 0.583 to 0.674; difficulty: 0.475 to 0.559).

For future work, incorporating recurrent dropouts and dropouts for each bidirectional GRU layer may yield increased performance.

Summary of results from the models using only the original dataset, without augmented data:

Original Model (star) test accuracy: 0.34533634781837463

Original Baseline Model (star) test MAE:  0.16904278

Original Baseline Model (star) test MSE:  0.08486905

Original Model (diff) test loss: 1.5869672298431396

Original Model (diff) test accuracy: 0.2723180651664734

Original Baseline Model (diff) test MAE:  0.17599066

Original Baseline Model (diff) test MSE:  0.08782036

Summary of results with models using augmented data in addition to the original dataset:

Baseline Model (star) test loss: 0.9880515933036804

Baseline Model (star) test accuracy: 0.5828328132629395

Baseline Model (star) test MAE:  0.116656095

Baseline Model (star) test MSE:  0.059395954

Baseline Model (diff) test loss: 1.2666878700256348

Baseline Model (diff) test accuracy: 0.4746621549129486

Baseline Model (diff) test MAE:  0.14045009

Baseline Model (diff) test MSE:  0.07239273

Hypertuned Model (star) test loss: 0.9485245943069458

Hypertuned Model (star) test accuracy: 0.6744869947433472

Hypertuned Model (star) test MAE:  0.084676474

Hypertuned Model (star) test MSE:  0.05126253

Hypertuned Model (diff) test loss: 1.2498631477355957

Hypertuned Model (diff) test accuracy: 0.5589964985847473

Hypertuned Model (diff) test MAE:  0.11251475

Hypertuned Model (diff) test MSE:  0.06696774
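
To make the comparison in the summary easier to read, the short sketch below computes the absolute test-accuracy gain at each stage from the figures listed above. The dictionary layout and the `accuracy_gains` helper are illustrative. Note that the original-data models were evaluated on `og_x_test` while the later models were evaluated on `x_test`, so the first gain is indicative only.

```python
# Test accuracies copied from the summary above.
accuracy = {
    "star": {"original": 0.34533634781837463,
             "baseline": 0.5828328132629395,
             "hypertuned": 0.6744869947433472},
    "diff": {"original": 0.2723180651664734,
             "baseline": 0.4746621549129486,
             "hypertuned": 0.5589964985847473},
}

def accuracy_gains(stages):
    """Absolute accuracy gain of each stage over the previous one."""
    names = list(stages)
    return {
        f"{prev} -> {curr}": stages[curr] - stages[prev]
        for prev, curr in zip(names, names[1:])
    }

gains = {target: accuracy_gains(stages) for target, stages in accuracy.items()}
for target, steps in gains.items():
    for step, gain in steps.items():
        print(f"{target}: {step}: {gain:+.3f}")
```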