
# Problem 1 -- Generate the model

Write a function:

    build_network(nslayers, n_neurons_per_layer, activation_fn)

The function should return a compiled model with the following structure:
* An Input node accepting an image of dimensions $28\times28$
* A Flatten node
* `nslayers` hidden layers (Dense), each containing `n_neurons_per_layer` neurons and using the activation function `activation_fn`.
* An output layer (Dense layer) of 10 neurons that uses the softmax activation function.


The model should be compiled as follows:
* Optimizer: `sgd`
* Metrics: `["accuracy"]`
* Loss: `sparse_categorical_crossentropy` (since the target variable is represented as a single integer label rather than being one-hot encoded)
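
To illustrate why the sparse loss fits integer labels, here is a minimal pure-Python sketch (not the Keras implementation) comparing the two forms of cross-entropy on a single example. The helper names and the probability vector are made up for illustration.

```python
import math

def categorical_crossentropy(one_hot, probs):
    """Cross-entropy against a one-hot target vector."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

def sparse_categorical_crossentropy(label, probs):
    """Cross-entropy against an integer class label."""
    return -math.log(probs[label])

# Hypothetical softmax output over the 10 digit classes (sums to 1).
probs = [0.01, 0.02, 0.02, 0.80, 0.03, 0.02, 0.03, 0.03, 0.02, 0.02]
label = 3
one_hot = [1.0 if i == label else 0.0 for i in range(10)]

sparse_loss = sparse_categorical_crossentropy(label, probs)
dense_loss = categorical_crossentropy(one_hot, probs)
print(sparse_loss, dense_loss)  # the two values agree
```

Both forms compute the same quantity; the sparse version simply skips building the one-hot vector, which is why it suits MNIST labels stored as single digits.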



In [None]:
!pip install scikit-optimize
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Dense, Input, Flatten
from tensorflow.keras.models import Sequential
from tensorflow.keras.datasets import mnist
from tensorflow.keras.callbacks import EarlyStopping
import skopt
import skopt.space as sp
import matplotlib.pyplot as plt
import numpy as np



In [None]:
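def build_network(nslayers, n_neurons_per_layer, activation_fn):
    """Build and compile a fully connected classifier for 28x28 images."""
    model = Sequential()
    model.add(Input(shape=(28, 28)))
    model.add(Flatten())  # 28x28 image -> 784-element vector
    # Hidden layers: nslayers identical Dense layers
    for _ in range(nslayers):
        model.add(Dense(n_neurons_per_layer, activation=activation_fn))
    model.add(Dense(10, activation="softmax"))  # one output per digit class
    # Keyword arguments avoid relying on the positional order of compile()
    model.compile(optimizer="sgd",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model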

# Problem 2 -- Load the Keras MNIST dataset.

Call `keras.datasets.mnist.load_data("mnist.npz")`, which returns
`(X_train, y_train), (X_test, y_test)`.  Split the training dataset into a training and validation set.

In [None]:
(X_train, y_train), (X_test, y_test) = keras.datasets.mnist.load_data("mnist.npz")

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


In [None]:
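from sklearn.model_selection import train_test_split

# Scale pixel values from [0, 255] to [0, 1]
X_train = X_train / 255.0
X_test = X_test / 255.0

# Hold out 5,000 of the 60,000 training images for validation
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=5000)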
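
As a rough picture of what an integer `test_size` does, the sketch below shuffles sample indices and holds out a fixed number of them. It is a simplified stand-in, not scikit-learn's implementation; `holdout_split` and its `seed` parameter are hypothetical names.

```python
import random

def holdout_split(n_samples, n_holdout, seed=0):
    """Shuffle indices and reserve n_holdout of them for validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return indices[n_holdout:], indices[:n_holdout]

# MNIST ships 60,000 training images; holding out 5,000 leaves 55,000.
train_idx, val_idx = holdout_split(60000, 5000)
print(len(train_idx), len(val_idx))
```

Fixing the seed makes the split repeatable, which helps when comparing models trained in separate runs.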

# Problem 3 -- Train the model.

Call `build_network` with parameters of your choice; 4-8 layers, 50-150 neurons per layer, and ReLU activation (`relu`) are a reasonable starting point. Train the model on the training dataset. To reduce training time, an early stopping callback is advised. Evaluate the model on the validation dataset. What is the prediction accuracy of the neural net?

In [None]:
model = build_network(4, 50, "relu")
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 flatten (Flatten)           (None, 784)               0         
                                                                 
 dense (Dense)               (None, 50)                39250     
                                                                 
 dense_1 (Dense)             (None, 50)                2550      
                                                                 
 dense_2 (Dense)             (None, 50)                2550      
                                                                 
 dense_3 (Dense)             (None, 50)                2550      
                                                                 
 dense_4 (Dense)             (None, 10)                510       
                                                                 
Total params: 47,410
Trainable params: 47,410
Non-trainable params: 0
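
The parameter counts in the summary can be checked by hand: each Dense layer has one weight per input-output pair plus one bias per output. A small sketch (the helper name `dense_param_count` is hypothetical):

```python
def dense_param_count(n_inputs, hidden_layers, n_neurons, n_outputs=10):
    """Total weights and biases in a stack of fully connected layers."""
    sizes = [n_inputs] + [n_neurons] * hidden_layers + [n_outputs]
    return sum(fan_in * fan_out + fan_out
               for fan_in, fan_out in zip(sizes, sizes[1:]))

# Flattened 28x28 input, 4 hidden layers of 50 neurons, 10 outputs.
print(dense_param_count(28 * 28, 4, 50))  # 47410
```

The first hidden layer dominates the total (784 * 50 + 50 = 39,250) because it connects to every input pixel.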

In [None]:
model.fit(X_train, y_train)



<keras.callbacks.History at 0x7fb3f73374c0>

In [None]:
early_stopping_cb = EarlyStopping(monitor='val_loss',
                                  min_delta = 0.001,
                                  patience=3)
history = model.fit(X_train, y_train,
                    epochs=15,
                    batch_size=32,
                    callbacks=[early_stopping_cb],
                    validation_data=(X_val, y_val))

Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
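
Training halted well before epoch 15 because of the early stopping callback. The rule behind `patience` and `min_delta` can be sketched roughly as below. This is a simplified model of the idea, not the Keras source, and the loss values are invented for illustration (the real per-epoch losses are not shown above).

```python
def early_stopping_epoch(val_losses, min_delta=0.001, patience=3):
    """Return the zero-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            # Improvement of more than min_delta: remember it, reset the counter.
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Made-up validation losses: the last real improvement is at epoch index 1.
made_up_losses = [0.30, 0.20, 0.1995, 0.2010, 0.2003, 0.1990]
print(early_stopping_epoch(made_up_losses))  # 4, i.e. the fifth epoch
```

With `patience=3`, three consecutive epochs without an improvement larger than `min_delta` end the run, so small fluctuations around a plateau do not keep training alive.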


In [None]:
loss, accuracy = model.evaluate(X_val, y_val)
print(f"validation accuracy: {accuracy:0.4f}")

validation accuracy: 0.9718


The prediction accuracy of the neural network on the validation set is 0.9718 (about 97.2%).

# Problem 4 -- Optimize the model.

Use one of the hyperparameter optimization frameworks discussed in class, such as scikit-optimize, to find optimal values for the number of layers, activation function, and neurons per layer of this neural network. Use a budget of about 20 runs. Use the table below as a rough guideline for the parameter space.

|Parameter|Space ($\Lambda_n$)|
|---------|----|
|Activation function|`relu`, `sigmoid`|
|Number of layers|~2-20 (integer, uniform)|
|Number of neurons per layer|10-300 (integer, log distributed)|

What combination of parameters ($\lambda$) produces the highest accuracy, and what is that accuracy?
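
To make "log distributed" concrete: the sketch below draws integers whose logarithm is uniform on the range, so small layer widths are sampled far more often than under a plain uniform draw. It is a library-independent illustration; `sample_log_uniform_int` is a hypothetical helper.

```python
import math
import random

def sample_log_uniform_int(low, high, rng):
    """Draw an integer in [low, high] whose logarithm is uniformly distributed."""
    value = math.exp(rng.uniform(math.log(low), math.log(high)))
    return min(high, max(low, round(value)))

rng = random.Random(0)
samples = [sample_log_uniform_int(10, 300, rng) for _ in range(1000)]

# The geometric midpoint of 10 and 300 is about 55, so roughly half the
# samples fall below it; a uniform draw would put only ~15% there.
below_55 = sum(s < 55 for s in samples) / len(samples)
print(below_55)
```

This weighting spends more of a small trial budget on narrow networks, where a change of a few neurons matters proportionally more.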





In [None]:
from hyperopt import fmin, hp, tpe, Trials, STATUS_OK, STATUS_FAIL
import time
import numpy as np

def hyperopt_objective(args):
  act_fn = args['activation']
  # hp.quniform yields floats, so cast the layer and neuron counts to int
  n_layer = int(args['n_layers'])
  n_neurons = int(args['neurons_per_layer'])
  model = build_network(n_layer, n_neurons, act_fn)
  # Early stopping monitors a 20% split of the training data,
  # leaving X_val untouched for scoring the trial
  model.fit(X_train,
            y_train,
            epochs=20,
            callbacks=[early_stopping_cb],
            validation_split=0.2,
            verbose=0)
  _, accuracy = model.evaluate(X_val, y_val, verbose=0)
  # fmin minimizes, so the negated accuracy is returned as the loss
  return {'loss': -accuracy,
          'time': time.time(),
          'status': STATUS_OK,
          'model': model}

In [None]:
trials = Trials()
fmin(
    hyperopt_objective,
    {
        'activation': hp.choice('activation', ["relu", "sigmoid"]),
        'n_layers': hp.quniform("n_layers", 2, 20, 1),
        # Note: quniform samples uniformly; hp.qloguniform would give the
        # log-distributed neuron counts suggested in the problem statement
        'neurons_per_layer': hp.quniform("neurons_per_layer", 10, 300, 1)
    },
    algo=tpe.suggest,
    max_evals=20,
    verbose=0,
    trials=trials,
    show_progressbar=True
)
print(trials)
print(trials.best_trial)

<hyperopt.base.Trials object at 0x7fb3c0684fa0>
{'state': 2, 'tid': 5, 'spec': None, 'result': {'loss': -0.9714000225067139, 'time': 1683730766.3877177, 'status': 'ok', 'model': <keras.engine.sequential.Sequential object at 0x7fb2e3b2bac0>}, 'misc': {'tid': 5, 'cmd': ('domain_attachment', 'FMinIter_Domain'), 'workdir': None, 'idxs': {'activation': [5], 'n_layers': [5], 'neurons_per_layer': [5]}, 'vals': {'activation': [0], 'n_layers': [18.0], 'neurons_per_layer': [218.0]}}, 'exp_key': None, 'owner': None, 'version': 0, 'book_time': datetime.datetime(2023, 5, 10, 14, 57, 2, 422000), 'refresh_time': datetime.datetime(2023, 5, 10, 14, 59, 26, 387000)}
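
The `vals` entry in the trial record stores `hp.choice` parameters as an index into the list of options and the quantized parameters as floats. A minimal sketch of decoding it by hand follows (`decode_trial_vals` is a hypothetical helper; hyperopt's own `space_eval` is the library route for this):

```python
def decode_trial_vals(vals, activation_choices):
    """Turn a hyperopt trial's raw 'vals' dict into readable parameters."""
    return {
        "activation": activation_choices[vals["activation"][0]],
        "n_layers": int(vals["n_layers"][0]),
        "neurons_per_layer": int(vals["neurons_per_layer"][0]),
    }

# Values copied from the best trial printed above.
vals = {"activation": [0], "n_layers": [18.0], "neurons_per_layer": [218.0]}
best_params = decode_trial_vals(vals, ["relu", "sigmoid"])
print(best_params)
```

Index 0 maps to `relu` because it is the first entry in the list passed to `hp.choice`.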


In [None]:
best_model = trials.best_trial['result']['model']
loss, accuracy = best_model.evaluate(X_val, y_val)
print(f"validation accuracy: {accuracy:0.4f}")

validation accuracy: 0.9714


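The best of the 20 trials used 18 layers, 218 neurons per layer, and the `relu` activation function, reaching a validation accuracy of 0.9714. This is essentially on par with the 0.9718 obtained in Problem 3.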