# Artificial Intelligence
# 464/664
# Assignment #7

## General Directions for this Assignment

00. We're using a Jupyter Notebook environment (tutorial available here: https://jupyter-notebook-beginner-guide.readthedocs.io/en/latest/what_is_jupyter.html),
01. Output format should be exactly as requested (it is your responsibility to make sure the notebook looks as expected on Gradescope),
02. Check submission deadline on Gradescope, 
03. Rename the file to Last_First_assignment_7, 
04. Submit your notebook (as .ipynb, not PDF) using Gradescope, and
05. Do not submit any other files.

## Before You Submit...

1. Re-read the general instructions provided above, and
2. Hit "Kernel"->"Restart & Run All".

## Neural Networks

For this assignment we will explore Neural Networks; in particular, we are going to explore model complexity. We will use the same dataset from Assignment #6 to classify a mushroom as either edible ('e') or poisonous ('p'). You are free to use PyTorch, TensorFlow, scikit-learn -- to name a few resources. The goal is to explore different model complexities (architectures) before declaring a winner. Either start with a simple network and make it more complex; or start with a complex model and pare it down. Either way, your submission should clearly demonstrate your exploration. 


Your output for each model should look like the output of `cross_validate` from Assignment #6:

```
Fold: 0	Train Error: 15.38%	Validation Error: 0.00%
Fold: 1
...

Mean(Std. Dev.) over all folds:
-------------------------------
Train Error: 100.00%(0.00%) Validation Error: 100.00%(0.00%)
```

Notice that "Test Error" has been replaced by "Validation Error." Split your dataset into train, test, and validation sets. 
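The three-way split described above can be sketched as follows. This is a minimal illustration, not the fold-based method used later in this notebook; the function name `three_way_split`, the 80/10/10 proportions, and the fixed seed are assumptions chosen for the example.

```python
import random

def three_way_split(records, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle a copy of `records` and cut it into train, validation, and test lists.

    Hypothetical helper: the name, default fractions, and seed are
    illustrative assumptions, not part of the assignment.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    validation = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, validation, test

train, validation, test = three_way_split(range(100))
print(len(train), len(validation), len(test))  # 80 10 10
```

Because the three slices come from one shuffled list, no record can land in more than one set.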


Start with a simple network. Train using the train set. Observe the model's performance using the validation set. 


Increase the complexity of your network. Train using the train set. Observe the model's performance using the validation set. 


Model complexity in Assignment #6 was depth limit. You can think of it here as the architecture of the network (number of layers and units per layer). Try at least three different network architectures. 


We're trying to find a model complexity that generalizes well. (Recall the high-bias vs. high-variance discussion in class.) 


Pick the network architecture that you deem best. Use the test set to report your winning model's performance. This is the ONLY time you use the test set.
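Choosing the winner from validation results alone can be sketched as below. The dictionary values are made-up placeholders, not results from this notebook, and `pick_best` is a hypothetical helper; the point is only that the test set plays no role in the selection.

```python
def pick_best(validation_errors):
    """Return the architecture label with the lowest mean validation error.

    `validation_errors` maps an architecture label to its per-fold
    validation error rates. Hypothetical helper for illustration.
    """
    def mean(values):
        return sum(values) / len(values)
    return min(validation_errors, key=lambda name: mean(validation_errors[name]))

# Placeholder numbers for illustration only (NOT measured results).
placeholder_results = {
    "small": [0.10, 0.12, 0.11],
    "medium": [0.04, 0.05, 0.03],
    "large": [0.06, 0.07, 0.05],
}
best = pick_best(placeholder_results)
print(best)  # medium
```

Only after this choice is fixed would the held-out test set be scored, once, for the selected architecture.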


No other directions for this assignment, other than what's here and in the "General Directions" section. You have a lot of freedom with this assignment. Don't get carried away. Try at least three different models; more importantly, document your process. Graders are not going to run your notebooks. The notebook will be read as a report on how different models were explored: what the results were, how the winning model was determined, what was the winning model's performance on the test data. Clearly highlight these items to receive full credit. Since you'll be using libraries, the emphasis will be on your ability to communicate your findings.

In [1]:
import random
import tensorflow as tf
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
import math
from copy import deepcopy

In [2]:
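def parse_data(file_name: str):
    """Read a comma-separated file into a shuffled list of records."""
    data = []
    # The context manager closes the file even if parsing fails.
    with open(file_name, "r") as file:
        for line in file:
            datum = line.rstrip().split(",")
            data.append(datum)
    random.shuffle(data)
    return data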

In [3]:
data_mushroom = parse_data("agaricus-lepiota.data")
# Move the class label from the first column to the last column of each record.
data_mushroom = [record[1:]+[record[0]] for record in data_mushroom]

In [4]:
def create_folds(data, n):
    k, m = divmod(len(data), n)
    return list(data[i * k + min(i, m):(i + 1) * k + min(i + 1, m)] for i in range(n))
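The slicing in `create_folds` hands one extra record to each of the first `m` folds. As a sanity check on the batch counts Keras reports, here is a small sketch of the arithmetic. It assumes the file holds the 8,124 records of the UCI mushroom dataset, a figure not stated in this notebook, and a batch size of 32 (the value passed to `fit` and the default for `predict`).

```python
import math

n_records = 8124   # assumption: size of the UCI agaricus-lepiota file
n_folds = 10
batch_size = 32

k, m = divmod(n_records, n_folds)
# The first m folds get k + 1 records, the rest get k.
fold_sizes = [k + 1 if i < m else k for i in range(n_folds)]
print(fold_sizes)

# With fold 0 as validation and fold 1 as test, folds 2..9 form the training set.
train_size = sum(fold_sizes[2:])
train_batches = math.ceil(train_size / batch_size)
validation_batches = math.ceil(fold_sizes[0] / batch_size)
print(train_batches, validation_batches)
```

Under these assumptions the counts agree with the `204/204` and `26/26` progress bars in the training and prediction output.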

In [5]:
def create_train_validate_test(folds, fold_index):
    training = []
    validate = []
    test = []
    for i, fold in enumerate(folds):
        if i == fold_index % len(folds):
            validate = fold
        elif i == (fold_index + 1) % len(folds):
            test = fold
        else:
            training = training + fold
    return training, validate, test
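The modulo arithmetic in `create_train_validate_test` rotates the validation and test roles across the folds. A short sketch of the resulting assignment for 10 folds (the helper name `fold_roles` is hypothetical):

```python
def fold_roles(n_folds):
    """List (validation_fold, test_fold) for each fold_index, mirroring the modulo rule."""
    return [(i % n_folds, (i + 1) % n_folds) for i in range(n_folds)]

roles = fold_roles(10)
print(roles[0])   # (0, 1)
print(roles[-1])  # (9, 0)
```

The last rotation wraps around, so fold 0 serves as the test fold when fold 9 is the validation fold.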

In [6]:
def find_error_rate(predicted, actual, num_classes):
    total_errors = 0
    for index in range(len(predicted)):
        if num_classes > 1:
            if actual[index][0] != predicted[index][0]:
                total_errors = total_errors + 1
        else:
            if actual[index] != predicted[index][0]:
                total_errors = total_errors + 1
    error_rate = total_errors / len(predicted)
    return error_rate

In [7]:
def get_stats(observations):
    mean = sum(observations) / len(observations)
    variance = sum([(elem - mean)**2 for elem in observations]) / len(observations)
    std_dev = math.sqrt(variance)
    return mean, std_dev

In [8]:
def convert_string_to_binary(train, validate, number_classes):
    X_train = [record[:-1] for record in train]
    X_val = [record[:-1] for record in validate]
    y_train = [record[-1] for record in train]
    y_val = [record[-1] for record in validate]

    encoder = OneHotEncoder()
    X_combined = np.vstack((X_train, X_val))
    categorical_indices = list(range(22))  # all 22 feature columns are categorical
    X_encoded = encoder.fit_transform(X_combined[:, categorical_indices])
    X_train_encoded = X_encoded[:len(X_train)]
    X_val_encoded = X_encoded[len(X_train):]

    label_encoder = LabelEncoder()
    y_train_num = label_encoder.fit_transform(y_train)
    y_val_num = label_encoder.transform(y_val)

    if number_classes > 1:
        y_train_num = tf.keras.utils.to_categorical(y_train_num, num_classes=number_classes) 
        y_val_num = tf.keras.utils.to_categorical(y_val_num, num_classes=number_classes)

    return X_train_encoded, X_val_encoded, y_train_num, y_val_num

In [9]:
def convert_string_to_binary_with_test_data(train, validate, test, number_classes):
    X_train = [record[:-1] for record in train]
    X_val = [record[:-1] for record in validate]
    X_test = [record[:-1] for record in test]
    y_train = [record[-1] for record in train]
    y_val = [record[-1] for record in validate]
    y_test = [record[-1] for record in test]

    encoder = OneHotEncoder()
    X_combined = np.vstack((X_train, X_val, X_test))
    categorical_indices = list(range(22))  # all 22 feature columns are categorical
    X_encoded = encoder.fit_transform(X_combined[:, categorical_indices])
    X_train_encoded = X_encoded[:len(X_train)]
    X_val_encoded = X_encoded[len(X_train):len(X_train) + len(X_val)]
    X_test_encoded = X_encoded[len(X_train) + len(X_val):]

    label_encoder = LabelEncoder()
    y_train_num = label_encoder.fit_transform(y_train)
    y_val_num = label_encoder.transform(y_val)
    y_test_num = label_encoder.transform(y_test)

    if number_classes > 1:
        y_train_num = tf.keras.utils.to_categorical(y_train_num, num_classes=number_classes) 
        y_val_num = tf.keras.utils.to_categorical(y_val_num, num_classes=number_classes)
        y_test_num = tf.keras.utils.to_categorical(y_test_num, num_classes=number_classes)

    return X_train_encoded, X_val_encoded, X_test_encoded, y_train_num, y_val_num, y_test_num

In [10]:
def cross_validate(raw_data, model, number_classes):
    """Print per-fold train/validation error for an already-fitted model.

    The model is not re-fitted inside the loop; each fold only re-evaluates
    the weights learned before this call.
    """
    folds = create_folds(data=raw_data, n=10)
    error_list_train, error_list_validate = [], []
    for fold_index in range(len(folds)):
        training_data, validation_data, test_data = create_train_validate_test(folds, fold_index)
        X_train_encoded, X_val_encoded, y_train_num, y_val_num = convert_string_to_binary(training_data, validation_data, number_classes)
        train_predictions = model.predict(X_train_encoded)
        validate_predictions = model.predict(X_val_encoded)
        # Round the sigmoid outputs to hard 0/1 predictions.
        rounded_train_predictions = np.round(train_predictions)
        rounded_validate_predictions = np.round(validate_predictions)
        error_rate_train = find_error_rate(rounded_train_predictions, y_train_num, number_classes)
        error_rate_validate = find_error_rate(rounded_validate_predictions, y_val_num, number_classes)
        error_list_train.append(error_rate_train)
        error_list_validate.append(error_rate_validate)
    for index in range(len(error_list_train)):
        print(f"Fold: {index}\tTrain Error: {error_list_train[index]*100:.2f}%\tValidation Error: {error_list_validate[index]*100:.2f}%")
    print("***")
    # Compute each mean and standard deviation once before printing.
    train_mean, train_std = get_stats(error_list_train)
    validate_mean, validate_std = get_stats(error_list_validate)
    print("\nMean(Std. Dev.) over all folds:\n-------------------------------")
    print(f"Train Error: {train_mean*100:.2f}%({train_std*100:.2f}%) Validation Error: {validate_mean*100:.2f}%({validate_std*100:.2f}%)")
    print("\n")
    # The splits from the last fold are returned for the final test evaluation.
    return training_data, validation_data, test_data
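`cross_validate` converts sigmoid outputs to class labels with `np.round`, which rounds exact halves to the nearest even integer. The sketch below uses Python's built-in `round`, which follows the same half-to-even rule, to show the effect, with an explicit 0.5 threshold as one possible alternative. The sample probabilities and the `error_rate` helper are invented for illustration.

```python
probabilities = [0.02, 0.49, 0.5, 0.51, 0.98]  # invented values, not model output
labels = [0, 0, 1, 1, 1]

# Half-to-even rounding: 0.5 becomes 0, so the third sample is misclassified.
rounded = [round(p) for p in probabilities]
# An explicit threshold treats 0.5 as the positive class instead.
thresholded = [1 if p >= 0.5 else 0 for p in probabilities]

def error_rate(predicted, actual):
    """Fraction of positions where the prediction differs from the label."""
    return sum(p != a for p, a in zip(predicted, actual)) / len(actual)

print(rounded, error_rate(rounded, labels))
print(thresholded, error_rate(thresholded, labels))
```

In practice a sigmoid output of exactly 0.5 is rare, so the two rules usually agree; the difference matters only at the boundary.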

In [11]:
def cross_validate_test_data(training_data, validation_data, test_data, model, number_classes):
    """Print the test error of the fitted model on `test_data`.

    The loop length comes from the notebook-level `folds` list, and every
    iteration evaluates the same `test_data`, so all rows report the same error.
    """
    error_list_test = []
    for fold_index in range(len(folds)):
        X_train_encoded, X_val_encoded, X_test_encoded, y_train_num, y_val_num, y_test_num = convert_string_to_binary_with_test_data(training_data, validation_data, test_data, number_classes)
        test_predictions = model.predict(X_test_encoded)
        # Round the sigmoid outputs to hard 0/1 predictions.
        rounded_test_predictions = np.round(test_predictions)
        error_rate_test = find_error_rate(rounded_test_predictions, y_test_num, number_classes)
        error_list_test.append(error_rate_test)
    for index in range(len(error_list_test)):
        print(f"Fold: {index}\tTest Error: {error_list_test[index]*100:.2f}%")
    print("***")
    test_mean, test_std = get_stats(error_list_test)
    print("\nMean(Std. Dev.) over all folds:\n-------------------------------")
    print(f"Test Error: {test_mean*100:.2f}%({test_std*100:.2f}%)")
    print("\n")

<span style="font-size:200%; font-weight:bold;">Network Architecture: [1]</span><br>
<span style="font-size:100%; font-weight:bold;">1 layer with 1 unit</span>

In [12]:
folds = create_folds(data_mushroom, 10)
train, validate, test = create_train_validate_test(folds, 0)

X_train_encoded, X_val_encoded, y_train_num, y_val_num = convert_string_to_binary(train, validate, 1)

num_features = X_train_encoded.shape[1]

model_1 = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(num_features,)),
    tf.keras.layers.Dense(units=1, activation='sigmoid')
])

model_1.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model_1.summary()

<span style="font-size:150%; font-weight:bold;">Model Accuracy with [1] Architecture</span><br>

In [13]:
losses = model_1.fit(X_train_encoded, y_train_num,
                   validation_data=(X_val_encoded, y_val_num), 
                   batch_size=32,
                   epochs=3)

Epoch 1/3
204/204 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.5860 - loss: 0.6612 - val_accuracy: 0.9127 - val_loss: 0.3692
Epoch 2/3
204/204 ━━━━━━━━━━━━━━━━━━━━ 0s 479us/step - accuracy: 0.9039 - loss: 0.3442 - val_accuracy: 0.9311 - val_loss: 0.2414
Epoch 3/3
204/204 ━━━━━━━━━━━━━━━━━━━━ 0s 507us/step - accuracy: 0.9318 - loss: 0.2347 - val_accuracy: 0.9569 - val_loss: 0.1791


<span style="font-size:150%; font-weight:bold;">Train and Validation Error Rates for [1]</span>

In [14]:
data_mushroom_1 = deepcopy(data_mushroom)
training_data, validation_data, test_data = cross_validate(data_mushroom_1, model_1, 1)

[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 465us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 489us/step
[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 377us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 441us/step
[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 368us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 458us/step
[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 380us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 452us/step
[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 376us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 465us/step
[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 375us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 460us/step
[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 366us/step
[1m26/26[

<span style="font-size:200%; font-weight:bold;">Network Architecture: [1, 8, 1]</span><br>
<span style="font-size:100%; font-weight:bold;">Two hidden layers (1 unit, then 8 units) and a 1-unit output layer</span>

In [15]:
folds = create_folds(data_mushroom, 10)
train, validate, test = create_train_validate_test(folds, 0)

X_train_encoded, X_val_encoded, y_train_num, y_val_num = convert_string_to_binary(train, validate, 1)

num_features = X_train_encoded.shape[1]
print(num_features)

model_3 = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(num_features,)),
    tf.keras.layers.Dense(units=1, activation='relu'),
    tf.keras.layers.Dense(units=8, activation='relu'),
    tf.keras.layers.Dense(units=1, activation='sigmoid')
])

model_3.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model_3.summary()

117


<span style="font-size:150%; font-weight:bold;">Model Accuracy with [1, 8, 1] Architecture</span><br>

In [16]:
losses = model_3.fit(X_train_encoded, y_train_num,
                   validation_data=(X_val_encoded, y_val_num),
                   batch_size=32, 
                   epochs=3)

Epoch 1/3
204/204 ━━━━━━━━━━━━━━━━━━━━ 1s 981us/step - accuracy: 0.5798 - loss: 0.6478 - val_accuracy: 0.8905 - val_loss: 0.4707
Epoch 2/3
204/204 ━━━━━━━━━━━━━━━━━━━━ 0s 527us/step - accuracy: 0.8888 - loss: 0.4446 - val_accuracy: 0.9483 - val_loss: 0.3248
Epoch 3/3
204/204 ━━━━━━━━━━━━━━━━━━━━ 0s 539us/step - accuracy: 0.9519 - loss: 0.3116 - val_accuracy: 0.9815 - val_loss: 0.2446


<span style="font-size:150%; font-weight:bold;">Train and Validation Error Rates for [1, 8, 1]</span><br>
<span style="font-size:110%; font-weight:500;">The train error and validation error decreased as the model's complexity increased from [1] to [1, 8, 1], achieved by adding more layers and increasing the number of units in the hidden layer.</span>

In [17]:
data_mushroom_3 = deepcopy(data_mushroom)
training_data, validation_data, test_data = cross_validate(data_mushroom_3, model_3, 1)

[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 503us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 465us/step
[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 373us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 468us/step
[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 370us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 438us/step
[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 378us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 433us/step
[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 373us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 423us/step
[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 408us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 521us/step
[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 388us/step
[1m26/26[

<span style="font-size:200%; font-weight:bold;">Network Architecture: [8, 8, 8]</span>

In [18]:
folds = create_folds(data_mushroom, 10)
train, validate, test = create_train_validate_test(folds, 0)

X_train_encoded, X_val_encoded, y_train_num, y_val_num = convert_string_to_binary(train, validate, 8)

num_features = X_train_encoded.shape[1]

model_6 = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(num_features,)),
    tf.keras.layers.Dense(units=8, activation='relu'),
    tf.keras.layers.Dense(units=8, activation='relu'),
    tf.keras.layers.Dense(units=8, activation='sigmoid')
])

model_6.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model_6.summary()

<span style="font-size:150%; font-weight:bold;">Model Accuracy with [8, 8, 8] Architecture</span><br>

In [19]:
losses = model_6.fit(X_train_encoded, y_train_num,
                   validation_data=(X_val_encoded, y_val_num), 
                   batch_size=32,
                   epochs=3)

Epoch 1/3
204/204 ━━━━━━━━━━━━━━━━━━━━ 1s 969us/step - accuracy: 0.1990 - loss: 0.5265 - val_accuracy: 0.7626 - val_loss: 0.1711
Epoch 2/3
204/204 ━━━━━━━━━━━━━━━━━━━━ 0s 534us/step - accuracy: 0.8357 - loss: 0.1474 - val_accuracy: 0.9533 - val_loss: 0.0543
Epoch 3/3
204/204 ━━━━━━━━━━━━━━━━━━━━ 0s 543us/step - accuracy: 0.9595 - loss: 0.0424 - val_accuracy: 0.9889 - val_loss: 0.0168


<span style="font-size:150%; font-weight:bold;">Train and Validation Error Rates for [8, 8, 8]</span><br>
<span style="font-size:110%; font-weight:500;">The train error and validation error decreased as the model's complexity increased from [1, 8, 1] to [8, 8, 8], achieved by widening the first and last layers from 1 unit to 8 units.</span>

In [20]:
data_mushroom_6 = deepcopy(data_mushroom)
training_data, validation_data, test_data = cross_validate(data_mushroom_6, model_6, 8)

[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 515us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 511us/step
[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 412us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 457us/step
[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 396us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 483us/step
[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 420us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 503us/step
[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 402us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 507us/step
[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 432us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 520us/step
[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 427us/step
[1m26/26[

<span style="font-size:200%; font-weight:bold;">(Best Model) Network Architecture: [32, 32, 32]</span><br>
<span style="font-size:110%; font-weight:500;">This model is considered the best because it has the lowest train and validation errors of all the models explored. Although it is the most complex of the four, it shows no sign of overfitting to the training data: both the validation and test errors remain small.</span>
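As a rough measure of the complexity being compared, the number of trainable parameters in a stack of dense layers is the sum of (inputs + 1) × units over the layers. The sketch below applies that formula to the four architectures in this notebook, assuming the 117 one-hot input features printed earlier; `dense_param_count` is a hypothetical helper, not a Keras function.

```python
def dense_param_count(num_inputs, layer_units):
    """Count weights and biases for fully connected layers applied in sequence."""
    total = 0
    fan_in = num_inputs
    for units in layer_units:
        total += (fan_in + 1) * units  # one weight per input plus one bias, per unit
        fan_in = units
    return total

num_features = 117  # one-hot encoded feature count printed by the notebook
counts = {}
for architecture in ([1], [1, 8, 1], [8, 8, 8], [32, 32, 32]):
    counts[tuple(architecture)] = dense_param_count(num_features, architecture)
    print(architecture, counts[tuple(architecture)])
```

Under that assumption the counts are 118, 143, 1088, and 5888, so the largest network has roughly fifty times as many parameters as the single-unit model.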

In [21]:
folds = create_folds(data_mushroom, 10)
train, validate, test = create_train_validate_test(folds, 0)
X_train_encoded, X_val_encoded, X_test_encoded, y_train_num, y_val_num, y_test_num = convert_string_to_binary_with_test_data(train, validate, test, 32)

num_features = X_train_encoded.shape[1]

model_4 = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(num_features,)),
    tf.keras.layers.Dense(units=32, activation='relu'),
    tf.keras.layers.Dense(units=32, activation='relu'),
    tf.keras.layers.Dense(units=32, activation='sigmoid')
])

model_4.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model_4.summary()

<span style="font-size:150%; font-weight:bold;">Model Accuracy with [32, 32, 32] Architecture</span><br>

In [22]:
losses = model_4.fit(X_train_encoded, y_train_num,
                   validation_data=(X_val_encoded, y_val_num), 
                   batch_size=32,
                   epochs=3)

Epoch 1/3
204/204 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step - accuracy: 0.5134 - loss: 0.3237 - val_accuracy: 0.9582 - val_loss: 0.0102
Epoch 2/3
204/204 ━━━━━━━━━━━━━━━━━━━━ 0s 622us/step - accuracy: 0.9735 - loss: 0.0071 - val_accuracy: 1.0000 - val_loss: 0.0011
Epoch 3/3
204/204 ━━━━━━━━━━━━━━━━━━━━ 0s 590us/step - accuracy: 0.9992 - loss: 9.4883e-04 - val_accuracy: 1.0000 - val_loss: 3.3385e-04


<span style="font-size:150%; font-weight:bold;">Train and Validation Error Rates for [32, 32, 32]</span><br>
<span style="font-size:110%; font-weight:500;">The train error and validation error decreased as the model's complexity increased from [8, 8, 8] to [32, 32, 32], achieved by increasing the number of units in each layer.</span>

In [23]:
data_mushroom_4 = deepcopy(data_mushroom)
training_data, validation_data, test_data = cross_validate(data_mushroom_4, model_4, 32)

[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 551us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 519us/step
[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 450us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 539us/step
[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 479us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 558us/step
[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 451us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 494us/step
[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 455us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 554us/step
[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 453us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 548us/step
[1m204/204[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 446us/step
[1m26/26[

<span style="font-size:150%; font-weight:bold;">Test Error Rates for [32, 32, 32]</span><br>
<span style="font-size:110%; font-weight:500;">The test error is similar to the train and validation errors. All three errors are small compared with those of the other models considered.</span>

In [24]:
cross_validate_test_data(training_data, validation_data, test_data, model_4, 32)

[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 533us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 503us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 540us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 542us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 534us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 482us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 599us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 526us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 528us/step
[1m26/26[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 4ms/step
Fold: 0	Test Error: 0.00%
Fold: 1	Test Error: 0.00%
Fold: 2	Test Error: 0.00%
Fold: 3	Test Error: 0.00%
Fold: 4	Test Error: 0.00%
Fold: 5	Test Error: 0.00%
Fold: 6	Test Error: 0.00%
Fold: 7	Test Error: 0.00%
Fold: 8	Test Error: 0.00%
Fold: 9	Test Error
