Problem 2A: Implement the two layer neural network in ‘simple neural network.ipynb’ using Keras. Modify the loss function so it is designed for binary classification.

Response/Comments for problem 2a

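In the code block below, I changed the loss function from mean squared error to cross entropy. The cross-entropy loss was already present in the notebook, so I commented out the original mean squared error line and uncommented the cross-entropy line. This changes only the loss reported on the validation set; the backward pass in the training loop is unchanged and still includes the `output * (1 - output)` factor that comes from differentiating squared error through the logistic output.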
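For reference, the two losses computed in the validation step of the code can be written as follows. The symbols are notation introduced here rather than names from the notebook: $N$ is the number of validation samples, $y_i \in \{0, 1\}$ is the true label, and $\hat{y}_i$ is the network output for sample $i$.

$$
L_{\text{MSE}} = \frac{1}{N} \sum_{i=1}^{N} \left(\hat{y}_i - y_i\right)^2,
\qquad
L_{\text{CE}} = -\frac{1}{N} \sum_{i=1}^{N} \left[\, y_i \log \hat{y}_i + (1 - y_i) \log\left(1 - \hat{y}_i\right) \right]
$$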

When using the mean squared error loss, the validation loss at the last reported epoch (900) was about 0.0661 and the validation accuracy was 1.0. With the cross entropy loss, the validation loss was about 0.2761 and the validation accuracy was again 1.0. The two loss values are not directly comparable because they are measured on different scales: for the same predicted probabilities, cross entropy penalises the negative log of the probability assigned to the correct class, while MSE penalises the squared difference between prediction and label. MSE is also less suited to classification than to regression because of the nature of the output (i.e. classification targets are 1 or 0, whereas regression targets are continuous scalar values).
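To make the difference in scale concrete, here is a minimal sketch that evaluates both losses on the same set of predictions. The labels and probabilities are made-up illustrative values, not outputs of the network, and the helper names `mse` and `binary_cross_entropy` are hypothetical. The clipping with `eps` is an added safeguard against `log(0)` and is not part of the notebook code.

```python
import numpy as np

def mse(y_true, y_prob):
    # Mean squared difference between predicted probability and label
    return float(np.mean((y_prob - y_true) ** 2))

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    # Clip probabilities away from 0 and 1 so the logarithm stays finite
    p = np.clip(y_prob, eps, 1 - eps)
    return float(np.mean(-(y_true * np.log(p)) - (1 - y_true) * np.log(1 - p)))

# Made-up labels and predicted probabilities (assumption, for illustration only)
y_true = np.array([1, 0, 1, 0])
y_prob = np.array([0.8, 0.2, 0.7, 0.3])

mse_value = mse(y_true, y_prob)
bce_value = binary_cross_entropy(y_true, y_prob)
accuracy = float(np.mean((y_prob > 0.5) == y_true))

print("MSE:", round(mse_value, 4))
print("Cross entropy:", round(bce_value, 4))
print("Accuracy:", accuracy)
```

Every prediction falls on the correct side of the 0.5 threshold, so the accuracy is identical under both losses, yet the cross-entropy value is several times larger than the MSE value for the same predictions.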

Mean Squared Error Loss Table

| Epoch | Validation loss | Validation accuracy |
|------:|----------------:|--------------------:|
| 0     | 0.2674          | 0.625               |
| 100   | 0.1989          | 0.7375              |
| 200   | 0.1629          | 0.8125              |
| 300   | 0.1361          | 0.85                |
| 400   | 0.1161          | 0.8625              |
| 500   | 0.101           | 0.9125              |
| 600   | 0.0893          | 0.9375              |
| 700   | 0.0799          | 0.95                |
| 800   | 0.0723          | 0.9875              |
| 900   | 0.0661          | 1.0                 |

Cross Entropy Loss Table

| Epoch | Validation loss | Validation accuracy |
|------:|----------------:|--------------------:|
| 0     | 0.8878          | 0.6125              |
| 100   | 0.6305          | 0.6875              |
| 200   | 0.5569          | 0.825               |
| 300   | 0.4888          | 0.8875              |
| 400   | 0.4306          | 0.9125              |
| 500   | 0.3836          | 0.925               |
| 600   | 0.3462          | 0.9375              |
| 700   | 0.3171          | 0.9875              |
| 800   | 0.2944          | 0.9875              |
| 900   | 0.2761          | 1.0                 |

In [1]:
import numpy as np 
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
# We will be using make_circles from scikit-learn
from sklearn.datasets import make_circles

SEED = 2017

# We create an inner and outer circle
X, y = make_circles(n_samples=400, factor=.3, noise=.05, random_state=SEED)
outer = y == 0
inner = y == 1

# plt.title("Two Circles")
# plt.plot(X[outer, 0], X[outer, 1], "ro")
# plt.plot(X[inner, 0], X[inner, 1], "bo")
# plt.show()

#print(X)
X = X+1
#print(X)

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=SEED)

def logistic(x):
    return 1 / (1 + np.exp(-x))

n_hidden = 50 # number of hidden units
n_epochs = 1000
learning_rate = 1 

# Initialise weights (NumPy's random generator is not seeded here, so results vary between runs)
weights_hidden = np.random.normal(0.0, size=(X_train.shape[1], n_hidden))
weights_output = np.random.normal(0.0, size=(n_hidden))

hist_loss = []
hist_accuracy = []

for e in range(n_epochs):
    del_w_hidden = np.zeros(weights_hidden.shape)
    del_w_output = np.zeros(weights_output.shape)

    # Loop through training data in batches of 1
    for x_, y_ in zip(X_train, y_train):
        # Forward computations
        hidden_input = np.dot(x_, weights_hidden)
        hidden_output = logistic(hidden_input)
        output = logistic(np.dot(hidden_output, weights_output)) #p( y = 1 | x)

        # Backward computations

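        error = y_ - output  # residual; the target y_ is either 0 or 1
        # Squared-error delta: the residual times the logistic derivative output * (1 - output).
        # For training on cross entropy, the delta would be just `error`.
        output_error = error * output * (1 - output)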
        
        hidden_error = np.dot(output_error, weights_output) * hidden_output * (1 - hidden_output)

        del_w_output += output_error * hidden_output
        del_w_hidden += hidden_error * x_[:, None]

    # Update weights
    weights_hidden += learning_rate * del_w_hidden / X_train.shape[0]
    weights_output += learning_rate * del_w_output / X_train.shape[0]

    # Print stats (validation loss and accuracy)
    if e % 100 == 0:
        hidden_output = logistic(np.dot(X_val, weights_hidden))
        out = logistic(np.dot(hidden_output, weights_output))

        # Mean squared error
        # loss = np.mean((out - y_val) ** 2)

        # Cross-entropy error
        loss = np.mean(-(y_val * np.log(out)) - ((1 - y_val) * np.log(1 - out)))

        # Final prediction is based on a threshold of 0.5
        predictions = out > 0.5
        accuracy = np.mean(predictions == y_val)
        print("Epoch: ", '{:>4}'.format(e), 
            "; Validation loss: ", '{:>6}'.format(loss.round(4)), 
            "; Validation accuracy: ", '{:>6}'.format(accuracy.round(4)))

Epoch:     0 ; Validation loss:  0.7055 ; Validation accuracy:  0.6375
Epoch:   100 ; Validation loss:  0.6059 ; Validation accuracy:   0.725
Epoch:   200 ; Validation loss:  0.5306 ; Validation accuracy:     0.8
Epoch:   300 ; Validation loss:  0.4689 ; Validation accuracy:   0.825
Epoch:   400 ; Validation loss:  0.4177 ; Validation accuracy:  0.8875
Epoch:   500 ; Validation loss:  0.3763 ; Validation accuracy:   0.925
Epoch:   600 ; Validation loss:  0.3432 ; Validation accuracy:  0.9375
Epoch:   700 ; Validation loss:  0.3165 ; Validation accuracy:  0.9875
Epoch:   800 ; Validation loss:  0.2947 ; Validation accuracy:  0.9875
Epoch:   900 ; Validation loss:  0.2764 ; Validation accuracy:     1.0
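The training loop above updates the weights with the squared-error delta. If the network were trained on cross entropy instead, the derivative of the loss with respect to the output pre-activation would simplify to `output - y`, so the delta would be `y - output` with no `output * (1 - output)` factor. The sketch below checks both deltas against central finite differences. It assumes the squared error is written as `0.5 * (output - y) ** 2`, which is the form whose derivative matches the notebook's `output_error`; the helper names and the test values of `z` and `y` are hypothetical.

```python
import numpy as np

def logistic(x):
    return 1 / (1 + np.exp(-x))

def cross_entropy(y, p):
    # Binary cross entropy for a single sample
    return -(y * np.log(p)) - (1 - y) * np.log(1 - p)

def half_squared_error(y, p):
    # Squared error with a 1/2 factor (assumed form, see lead-in)
    return 0.5 * (p - y) ** 2

def numeric_grad(loss_fn, y, z, h=1e-6):
    # Central finite difference of the loss with respect to the pre-activation z
    return (loss_fn(y, logistic(z + h)) - loss_fn(y, logistic(z - h))) / (2 * h)

z, y = 0.3, 1.0          # arbitrary pre-activation and label (assumption)
out = logistic(z)
error = y - out

delta_ce = error                      # delta for cross entropy
delta_se = error * out * (1 - out)    # delta used in the notebook's training loop

grad_ce = numeric_grad(cross_entropy, y, z)
grad_se = numeric_grad(half_squared_error, y, z)

# Each delta should equal the negative gradient of its loss
print("cross entropy:", delta_ce, -grad_ce)
print("squared error:", delta_se, -grad_se)
```

Under these assumptions, replacing the `output_error` line in the training loop with `output_error = error` would make the weight updates consistent with the cross-entropy loss reported during validation. The printed losses and accuracies would then differ from those shown above.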


# Problem 2B: Modify the three layer neural network in ‘wine-classify.ipynb’ so it is solving a multi-class classification problem instead of a regression problem.

---

# Response/Comments for problem 2b

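- Final epoch (2000/2000): training accuracy 0.9953, training loss 0.0175, validation accuracy 0.6562, validation loss 4.1651
- Evaluation progress output: accuracy 0.6478, loss 4.5418
- Test accuracy: 65.62%

These figures come from a 2000-epoch run, whereas the code below is set to `n_epochs = 20`. The gap between the training accuracy (0.9953) and the validation accuracy (0.6562), together with a validation loss of 4.1651 against a training loss of 0.0175, indicates that the model overfits heavily when trained that long.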
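Turning the regression network into a multi-class classifier rests on three pieces: one-hot encoded labels, a softmax output layer, and the categorical cross-entropy loss. The NumPy sketch below illustrates what each piece computes. It is not the Keras implementation; the helper names are hypothetical, and the labels and logits are made-up values for three classes rather than wine data.

```python
import numpy as np

def one_hot(labels, n_classes):
    # Turn integer class labels into rows with a single 1 per sample
    encoded = np.zeros((labels.size, n_classes))
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

def softmax(logits):
    # Subtract the row maximum for numerical stability before exponentiating
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_cross_entropy(y_onehot, probs, eps=1e-12):
    # Mean negative log-probability assigned to the true class
    log_probs = np.log(np.clip(probs, eps, 1.0))
    return float(np.mean(-np.sum(y_onehot * log_probs, axis=1)))

# Made-up integer labels and raw network outputs (assumption, for illustration)
labels = np.array([0, 2, 1])
logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 0.3, 1.5],
                   [0.0, 0.0, 0.0]])

y_onehot = one_hot(labels, n_classes=3)
probs = softmax(logits)
loss = categorical_cross_entropy(y_onehot, probs)
predicted = probs.argmax(axis=1)
accuracy = float(np.mean(predicted == labels))

print("probabilities:\n", probs.round(3))
print("loss:", round(loss, 4), "accuracy:", round(accuracy, 4))
```

Each row of `probs` sums to 1, and the predicted class is the index of the largest probability, which is how the accuracy metric is obtained from a softmax output.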

In [1]:
import numpy as np 
import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras.optimizers import Adam
from keras.layers import Input
from keras.backend import floatx

from keras.utils import to_categorical

from sklearn.preprocessing import StandardScaler

SEED = 2017

pathname = '/Users/spencerkerkau/Desktop/Lec7code/Data/winequality-red.csv'
data = pd.read_csv(pathname, sep=';')

y = data['quality']
X = data.drop(['quality'], axis=1)


# encode class values as integers
encoder = LabelEncoder()
encoder.fit(y)
encoded_Y = encoder.transform(y)
# convert integers to dummy variables (i.e. one hot encoded)
y = to_categorical(encoded_Y)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=SEED)

scaler = StandardScaler().fit(X_train)
X_train = pd.DataFrame(scaler.transform(X_train))
X_test = pd.DataFrame(scaler.transform(X_test))

model = Sequential()

# Use Input(shape) as the first layer
model.add(Input(shape=(X_train.shape[1],)))

# First hidden layer with 200 hidden units
model.add(Dense(200, activation='relu'))

# Second hidden layer with 25 hidden units
model.add(Dense(25, activation='relu'))

# Output layer: one softmax unit per class (6 one-hot columns in y)
model.add(Dense(6, activation='softmax'))

# Set optimizer
opt = Adam()

#Compile model
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])

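# Note: these callbacks are defined but never passed to model.fit, so early
# stopping and checkpointing are not active in this run.
callbacks = [
    EarlyStopping(monitor='val_loss', patience=20, verbose=2),
    ModelCheckpoint('checkpoints/multi_layer_best_model.keras', monitor='val_loss', save_best_only=True, verbose=0)
]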

n_batch_size = 64
n_epochs = 20 #5000

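# Fit the model. In Keras, validation_data overrides validation_split, so the
# redundant validation_split argument is dropped. Note that the test set is
# used as the validation set here.
model.fit(X_train, y_train, batch_size=n_batch_size, epochs=n_epochs, verbose=2, validation_data=(X_test, y_test))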

# No checkpoint is reloaded, so the model evaluated below is the final-epoch model.
# It is already compiled and does not need to be compiled again.
best_model = model

# Evaluate on test set
score = best_model.evaluate(X_test.values, y_test, verbose=1)
print('Test accuracy: %.2f%%' % (score[1]*100))

# Test accuracy: 65.62% 
# Benchmark accuracy on dataset 62.4%

Epoch 1/20
20/20 - 0s - 19ms/step - accuracy: 0.3862 - loss: 1.5650 - val_accuracy: 0.5750 - val_loss: 1.3345
Epoch 2/20
20/20 - 0s - 1ms/step - accuracy: 0.5668 - loss: 1.2154 - val_accuracy: 0.5688 - val_loss: 1.0883
Epoch 3/20
20/20 - 0s - 1ms/step - accuracy: 0.5747 - loss: 1.0650 - val_accuracy: 0.6062 - val_loss: 0.9936
Epoch 4/20
20/20 - 0s - 1ms/step - accuracy: 0.5966 - loss: 1.0095 - val_accuracy: 0.6000 - val_loss: 0.9715
Epoch 5/20
20/20 - 0s - 1ms/step - accuracy: 0.6083 - loss: 0.9835 - val_accuracy: 0.5906 - val_loss: 0.9561
Epoch 6/20
20/20 - 0s - 1ms/step - accuracy: 0.6091 - loss: 0.9613 - val_accuracy: 0.6031 - val_loss: 0.9450
Epoch 7/20
20/20 - 0s - 1ms/step - accuracy: 0.6153 - loss: 0.9472 - val_accuracy: 0.6094 - val_loss: 0.9342
Epoch 8/20
20/20 - 0s - 1ms/step - accuracy: 0.6200 - loss: 0.9328 - val_accuracy: 0.6187 - val_loss: 0.9249
Epoch 9/20
20/20 - 0s - 1ms/step - accuracy: 0.6271 - loss: 0.9225 - val_accuracy: 0.6062 - val_loss: 0.9300
Epoch 10/20
20/20 