# Question 1 - TensorFlow and Keras [50 points]
In this question you will use TensorFlow and Keras to train deep neural network models using the 80% training sample from question 0. You will then compare these models using the 10% validation sample.

Use the same data partition as for question 0. As stated above, if you didn’t previously divide data into test, validation, and 10 training folds based on unique materials (see unique_m.csv), redo the data splitting so that materials (rather than rows) are randomized among training folds and the validation and test sets.

## part a [40 points]
Train a series of deep neural network regression models to predict the critical temperature from the superconductivity dataset. Train a minimum of 5 models meeting the following conditions:

- at least one model has 3 hidden layers,
- at least one model has 1-2 hidden layers,
- at least one model uses L1 or L2 regularization,
- at least one model uses dropout,
- all models should use MSE or its equivalent as the loss function.

All other details about model architecture are up to you. You may consider more than 5 models, but 5 is the minimum requirement.

Compare your models using the 10% validation set and select the best performing model.

In [13]:
# modules: --------------------------------------------------------------------
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras import regularizers  # used for kernel_regularizer below

In [10]:
filepath = "/Users/ShuyanLi/Desktop/Umich_lsy/STATS507/HW8/"
df_data = pd.read_csv(filepath+"train.csv")
# split the rows, in file order, into three parts:
# 80% for training, 10% for validation, 10% for testing
n_rows = len(df_data)
train_end, validate_end = int(.8 * n_rows), int(.9 * n_rows)
train = df_data.iloc[:train_end]
validate = df_data.iloc[train_end:validate_end]
test = df_data.iloc[validate_end:]
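
The split above partitions rows in file order, whereas the problem statement asks for materials (rather than rows) to be randomized across the sets. A minimal sketch of a material-level 80/10/10 split follows. The `material` array is a made-up stand-in for the material identifiers in unique_m.csv (the actual column name is not given in this section), and the helper name `split_by_group` and the seed are illustrative choices.

```python
import numpy as np

def split_by_group(groups, fractions=(0.8, 0.1), seed=0):
    """Return boolean row masks (train, validate, test) such that all rows
    sharing a group label land in the same partition."""
    groups = np.asarray(groups)
    rng = np.random.default_rng(seed)
    unique = rng.permutation(np.unique(groups))  # shuffle the distinct labels
    n_train = int(round(fractions[0] * len(unique)))
    n_val = int(round((fractions[0] + fractions[1]) * len(unique)))
    train_mask = np.isin(groups, unique[:n_train])
    val_mask = np.isin(groups, unique[n_train:n_val])
    test_mask = np.isin(groups, unique[n_val:])  # remaining labels
    return train_mask, val_mask, test_mask

# Hypothetical labels: 20 materials with 3 rows each.
material = np.repeat(np.arange(20), 3)
train_mask, val_mask, test_mask = split_by_group(material)
print(train_mask.sum(), val_mask.sum(), test_mask.sum())
```

With a real data frame, the masks could be applied as `df_data[train_mask]`, `df_data[val_mask]`, and `df_data[test_mask]`.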

In [5]:
# 2 hidden layers (81, 128), dropout, relu activation
model_1 = tf.keras.models.Sequential([
  tf.keras.layers.Dense(81, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(1)
])
# 1 hidden layer (81), relu activation
model_2 = tf.keras.models.Sequential([
  tf.keras.layers.Dense(81, activation='relu'),
  tf.keras.layers.Dense(1)
])
# 1 hidden layer (81), dropout, sigmoid activation
model_3 = tf.keras.models.Sequential([
  tf.keras.layers.Dense(81, activation='sigmoid'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(1)
])
# 4 hidden layers (81, 128, 256, 128), dropout, L1 + L2 regularization, relu activation
model_4 = tf.keras.models.Sequential([
  tf.keras.layers.Dense(81, activation='relu', kernel_regularizer=regularizers.l1_l2(l1=0.01, l2=0.01)),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(128, activation='relu', kernel_regularizer=regularizers.l1_l2(l1=0.01, l2=0.01)),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(256, activation='relu', kernel_regularizer=regularizers.l1_l2(l1=0.01, l2=0.01)),
  tf.keras.layers.Dropout(0.3),
  tf.keras.layers.Dense(128, activation='relu', kernel_regularizer=regularizers.l1_l2(l1=0.01, l2=0.01)),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(1)
])
# 3 hidden layers (81, 128, 128), dropout, L1 + L2 regularization, relu activation
model_5 = tf.keras.models.Sequential([
  tf.keras.layers.Dense(81, activation='relu', kernel_regularizer=regularizers.l1_l2(l1=0.01, l2=0.01)),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(128, activation='relu', kernel_regularizer=regularizers.l1_l2(l1=0.01, l2=0.01)),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(128, activation='relu', kernel_regularizer=regularizers.l1_l2(l1=0.01, l2=0.01)),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(1)
])
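
To make the layer counts concrete: in a Keras `Sequential` model, every `Dense` layer before the output layer is a hidden layer, including the first `Dense(81)`. The sketch below tallies hidden layers and trainable parameters for the five stacks defined above. It assumes 81 input features (every column of train.csv except the last, which the later cells treat as the target). The helper name `dense_param_count` is illustrative and not part of Keras. Dropout layers add no parameters.

```python
def dense_param_count(n_inputs, layer_units):
    """Weights plus biases for consecutive fully connected layers."""
    total = 0
    for units in layer_units:
        total += n_inputs * units + units  # kernel entries + bias entries
        n_inputs = units
    return total

N_FEATURES = 81  # assumption: 81 predictor columns in train.csv

# Units of each Dense layer, in order; the final 1 is the output layer.
architectures = {
    "model_1": [81, 128, 1],
    "model_2": [81, 1],
    "model_3": [81, 1],
    "model_4": [81, 128, 256, 128, 1],
    "model_5": [81, 128, 128, 1],
}

for name, units in architectures.items():
    n_hidden = len(units) - 1  # all Dense layers except the output
    n_params = dense_param_count(N_FEATURES, units)
    print(f"{name}: {n_hidden} hidden layer(s), {n_params} parameters")
```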

In [23]:
# Compile models 1-5.
# All models should use MSE or its equivalent as the loss function.
for model in (model_1, model_2, model_3, model_4, model_5):
    model.compile(
        optimizer='adam',
        loss=tf.keras.losses.MeanSquaredError(),
        metrics=['mse']
    )
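
Because `model_4` and `model_5` attach `kernel_regularizer` terms, the loss Keras reports for them is the MSE plus the regularization penalties, while the `mse` metric is the plain MSE. The NumPy sketch below illustrates that relationship with made-up weights and predictions. The penalty form `l1 * sum(|w|) + l2 * sum(w**2)` is the one `regularizers.l1_l2` applies to each regularized kernel; the function names and toy values are assumptions for illustration only.

```python
import numpy as np

def l1_l2_penalty(weights, l1=0.01, l2=0.01):
    """Combined L1 + L2 penalty for one weight matrix."""
    return l1 * np.sum(np.abs(weights)) + l2 * np.sum(np.square(weights))

def mse(y_true, y_pred):
    """Plain mean squared error."""
    return np.mean((y_true - y_pred) ** 2)

# Toy kernels standing in for two regularized Dense layers.
kernels = [
    np.array([[0.5, -1.0], [2.0, 0.0]]),
    np.array([[1.0], [-3.0]]),
]
y_true = np.array([10.0, 20.0, 30.0])
y_pred = np.array([12.0, 18.0, 33.0])

plain_mse = mse(y_true, y_pred)
total_loss = plain_mse + sum(l1_l2_penalty(k) for k in kernels)
print("mse metric:", plain_mse)
print("reported loss (mse + penalties):", total_loss)
```

For the unregularized models the penalty sum is zero, so the loss and the `mse` metric should agree up to floating-point precision.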

In [26]:
# Use data in train
train_data = train.values
X_train, y_train = train_data[:, :-1], train_data[:, -1]
# Use data in validation
validate_data = validate.values
X_validate, y_validate = validate_data[:, :-1], validate_data[:, -1]

# Fit models
h1 = model_1.fit(X_train, y_train, epochs=10)
m1_loss, m1_Mean_Squared_Errors = model_1.evaluate(X_validate, y_validate, verbose=1)
print('Model 1 Loss {}, Model 1 Mean_Squared_Errors {}'.format(m1_loss, m1_Mean_Squared_Errors))

h2 = model_2.fit(X_train, y_train, epochs=10)
m2_loss, m2_Mean_Squared_Errors = model_2.evaluate(X_validate, y_validate, verbose=1)
print('Model 2 Loss {}, Model 2 Mean_Squared_Errors {}'.format(m2_loss, m2_Mean_Squared_Errors))

h3 = model_3.fit(X_train, y_train, epochs=10)
m3_loss, m3_Mean_Squared_Errors = model_3.evaluate(X_validate, y_validate, verbose=1)
print('Model 3 Loss {}, Model 3 Mean_Squared_Errors {}'.format(m3_loss, m3_Mean_Squared_Errors))

h4 = model_4.fit(X_train, y_train, epochs=10)
m4_loss, m4_Mean_Squared_Errors = model_4.evaluate(X_validate, y_validate, verbose=1)
print('Model 4 Loss {}, Model 4 Mean_Squared_Errors {}'.format(m4_loss, m4_Mean_Squared_Errors))

h5 = model_5.fit(X_train, y_train, epochs=10)
m5_loss, m5_Mean_Squared_Errors = model_5.evaluate(X_validate, y_validate, verbose=1)
print('Model 5 Loss {}, Model 5 Mean_Squared_Errors {}'.format(m5_loss, m5_Mean_Squared_Errors))

Model 1 Loss 266.6283222693715, Model 1 Mean_Squared_Errors 266.6283264160156
Model 2 Loss 244.56510474383552, Model 2 Mean_Squared_Errors 244.56509399414062
Model 3 Loss 555.0392446150112, Model 3 Mean_Squared_Errors 555.0392456054688
Model 4 Loss 422.63677768581675, Model 4 Mean_Squared_Errors 406.0910339355469
Model 5 Loss 327.7869746332895, Model 5 Mean_Squared_Errors 318.8554992675781


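Model 2 gives the smallest MSE on the validation set (about 244.57), so it is selected as the best-performing model. For models 4 and 5 the reported loss is larger than the MSE metric because the loss also includes the L1/L2 regularization penalty.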
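
The comparison can be sketched in a few lines. The values below are the validation MSE metrics copied from the output above, rounded to two decimals. The plain MSE metric is used rather than the loss so that regularized and unregularized models are ranked on the same quantity.

```python
# Validation MSE per model, copied from the printed results (rounded).
validation_mse = {
    "model_1": 266.63,
    "model_2": 244.57,
    "model_3": 555.04,
    "model_4": 406.09,
    "model_5": 318.86,
}

# Smallest validation MSE wins.
best_name = min(validation_mse, key=validation_mse.get)
ranking = sorted(validation_mse, key=validation_mse.get)

print("Best model:", best_name)
print("Ranking (best to worst):", ranking)
```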

## part b [10 points]
Compute and report the MSE on the test dataset for the best-performing model (model 2) from part a.

In [27]:
# Use data in test
test_data = test.values
X_test, y_test = test_data[:, :-1], test_data[:, -1]

m2_loss, m2_Mean_Squared_Errors = model_2.evaluate(X_test, y_test, verbose=1)
print('Test data: Model 2 Loss {}, Model 2 Mean_Squared_Errors {}'.format(m2_loss, m2_Mean_Squared_Errors))

Test data: Model 2 Loss 292.1271716011596, Model 2 Mean_Squared_Errors 292.127197265625


The MSE on the test dataset for the best-performing model (model 2) is approximately 292.13.
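
MSE is expressed in squared units of the target, so taking the square root gives an error on the same scale as the critical temperature. A small sketch using the model 2 values printed above (the variable names are illustrative):

```python
import math

# Model 2 MSE values copied from the printed results.
validation_mse = 244.56509399414062
test_mse = 292.127197265625

validation_rmse = math.sqrt(validation_mse)
test_rmse = math.sqrt(test_mse)

print(f"Validation RMSE: {validation_rmse:.2f}")
print(f"Test RMSE: {test_rmse:.2f}")
```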