# Deeper into Keras with TensorFlow

TensorFlow is Google's machine learning library. It is a very general tool for constructing artificial neural networks, both as operation trees and as more topologically complicated networks. TensorFlow's nodes incorporate all of the structure we discussed in the lecture, most importantly the partial derivative information needed for reverse-mode autodiff.

It's important to remember that using TensorFlow for computations proceeds in three parts:

1. __Construct a graph:__ In this step, create a pattern for the network by defining all of its nodes (input, output, constant, multipliers, LSUs, etc.) and connecting them. You also specify their initial values. At this stage no computation has been done.

2. __Compile the model:__ To actually put the model in memory you must compile it.

3. __Fit or run the graph:__ Fit the graph, and when you are satisfied save out the weights.
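
The three steps can be sketched with Keras on a tiny problem. This is only an illustration: the data is made up, the names `X_demo`, `y_demo` and `demo_model` are hypothetical, and pulling the weights out with `get_weights()` is just one simple way of "saving" them.

```python
import numpy as np
from tensorflow import keras

# Made-up regression data, for illustration only.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(64, 3)).astype("float32")
y_demo = X_demo.sum(axis=1, keepdims=True)

# 1. Construct the graph: declare the layers and how they connect.
demo_model = keras.models.Sequential([
    keras.layers.Dense(4, activation="relu"),
    keras.layers.Dense(1),
])

# 2. Compile the model: attach a loss function and an optimizer.
demo_model.compile(loss="mse", optimizer="sgd")

# 3. Fit the model, then pull out the trained weights.
history = demo_model.fit(X_demo, y_demo, epochs=2, verbose=0)
weights = demo_model.get_weights()  # kernel and bias for each Dense layer
```

Nothing is trained until step 3; steps 1 and 2 only describe the network and how it should learn.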

<img src = "https://www.tensorflow.org/images/tensors_flowing.gif">

Higher-level tools like Keras hide the details of initialization and sessions, and streamline this process by including pre-made versions of common architectures. This lab will focus on using TensorFlow's low-level tools to extend Keras. This lab is meant to work with TensorFlow 2.1.


#### Getting TensorFlow and Keras

If you are on Google Colab, TensorFlow and Keras are already installed. If you are running Jupyter locally, use the Anaconda prompt to install them with

    $ pip install tensorflow
    $ pip install keras

* Windows: Open "Anaconda Prompt" from the Start menu or from Cortana's search.
* OSX: Open a terminal; you should be able to use pip directly from there.
* Linux: Open a terminal; you should be able to use pip directly from there.

For more information, see https://www.tensorflow.org/install and https://keras.io/#installation.

This lab follows Chapters 9 and 10 from *Hands-On Machine Learning with Scikit-Learn & TensorFlow*.

## Loading TensorFlow

After TensorFlow is installed, restart your kernel. The code below loads TensorFlow and Keras and then sets the random seed for numpy. This makes numpy's random draws consistent across runs.

In [None]:
import numpy as np
import tensorflow as tf
from tensorflow import keras

np.random.seed(42)

### Basic Operations with TensorFlow

For an excellent post about the anatomy of a tensor see https://pgaleone.eu/tensorflow/2018/07/28/understanding-tensorflow-tensors-shape-static-dynamic/

We will start by making the simple operation tree below. Our first step is to set up the graph. We will define TensorFlow constants using

* `tf.constant(input, name="VarName")` Creates a TensorFlow tensor from the input data `input`. The name is optional, but makes the tensor easier to identify within TensorFlow.

TensorFlow's variables and constants act much like `numpy` objects. They have a value, a shape and a datatype. For example, let's define
$$
X^2Y + Y -4\,,
$$
To evaluate it with $X=3$ and $Y=2$ we can use:

In [None]:
X = tf.constant(3, name="X")
Y = tf.constant(2, name="Y")
f = X*X*Y + Y - 4
f


The output for `f` is a `tf.Tensor` object. Let's take a look at it:

* `id=` - The tensor's unique ID.
* `shape=()` - The shape is the number of elements in each dimension. A scalar has rank 0 and an empty shape, a vector has rank 1 and shape (N), where N is the number of elements in the vector. A matrix has rank 2 and shape (N_1, N_2).
* `numpy` - The content of the tensor as a numpy array.

Let's look at an example with higher-order arrays. We can multiply matrices just like in numpy, with `tf.matmul`:

In [None]:
# Define two 2x2 matrices and display their matrix product.
c = tf.constant([[1.0, 2.0], [3.0, 4.0]])
d = tf.constant([[1.0, 1.0], [0.0, 1.0]])
display(tf.matmul(c, d))


Here, we see that the shape is (2,2), the data type is `float32`, and the content of the tensor as a numpy object would be an array of floats

    array([[1., 3.],[3., 7.]], dtype=float32)
    
We should leave TensorFlow objects as TensorFlow objects, but we can query a `tf.Tensor` object to return any of these parameters:

* `tf.Tensor.numpy()` - Return the contents as a numpy array.
* `tf.Tensor.shape` - View the tensor's shape parameter.
* `tf.Tensor.dtype` - View the tensor's datatype parameter.
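
As a quick check, the three queries can be tried on a small tensor. The tensor `t` below is a made-up example rather than one of the lab's variables.

```python
import tensorflow as tf

t = tf.constant([[1., 3.], [3., 7.]])

as_array = t.numpy()   # the contents as a numpy.ndarray
shape = t.shape        # a TensorShape with two entries, (2, 2)
dtype = t.dtype        # tf.float32
```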

#### Indexing for tf.Tensor Objects

Indexing works much the same as for numpy arrays.
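
A few slices on a small made-up tensor show the numpy-style syntax at work. The tensor `t` and the result names are chosen only for this sketch.

```python
import tensorflow as tf

t = tf.constant([[1., 2., 3.], [4., 5., 6.]])

first_row = t[0]                      # [1., 2., 3.]
last_two_cols = t[:, 1:]              # [[2., 3.], [5., 6.]]
middle_col = t[..., 1, tf.newaxis]    # [[2.], [5.]], keeps the column axis
single = t[1, 2]                      # scalar tensor holding 6.0
```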

Tensors can be manipulated like numpy objects, provided you use the proper TensorFlow commands to perform things like transposing:

In [None]:
display(c+10)
display(tf.transpose(c))

Here is a partial list of tensorflow operations:

* `tf.add()`
* `tf.multiply()`
* `tf.square()`
* `tf.exp()`
* `tf.sqrt()`
* `tf.reshape()`
* `tf.squeeze()`
* `tf.tile()`

Some functions have a different name than their numpy equivalent. For example, 

* `tf.reduce_mean()` - Return the mean of the array.
* `tf.reduce_sum()` - Return the summation of elements in the array. 
* `tf.reduce_max()` - Return the max of the elements in the array.
* `tf.math.log()` - Return log of the array. 
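
Here is a small sketch of these functions on a made-up 2x2 tensor. The `axis` argument works as it does in numpy.

```python
import tensorflow as tf

a = tf.constant([[1., 2.], [3., 4.]])

total = tf.reduce_sum(a)                # 10.0, the sum over every element
col_means = tf.reduce_mean(a, axis=0)   # [2., 3.], the mean down each column
row_max = tf.reduce_max(a, axis=1)      # [2., 4.], the max along each row
logs = tf.math.log(a)                   # elementwise natural logarithm
```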

Tensorflow uses _reduce_ here because the algorithm used at the GPU level is a _reduce_ algorithm that does not guarantee the order in which elements are added. Finally, matrix operations can be performed with

* `tf.transpose()` - Returns a new tensor that is a transpose of the old tensor.
* `tf.matmul()` - Returns a new tensor that is a matrix multiplication of two tensors. 
* `tf.linalg.inv()` - Returns the inverted matrix. 
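
The matrix operations can be tried on a small invertible matrix. The matrix `m` is a made-up example.

```python
import tensorflow as tf

m = tf.constant([[2., 0.], [1., 1.]])

m_t = tf.transpose(m)            # [[2., 1.], [0., 1.]]
m_inv = tf.linalg.inv(m)         # [[0.5, 0.], [-0.5, 1.]]
identity = tf.matmul(m, m_inv)   # approximately the 2x2 identity matrix
```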

For a more complete list of TensorFlow's mathematics operations see the documentation for `tf.math`:

https://www.tensorflow.org/api_docs/python/tf/math

and `tf.linalg`:

https://www.tensorflow.org/api_docs/python/tf/linalg


__Note:__ In TensorFlow `M @ N` is __matrix multiplication__ while `M*N` is componentwise multiplication.
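
The difference is easy to see on two small matrices. `M` and `N` below are made-up examples.

```python
import tensorflow as tf

M = tf.constant([[1., 2.], [3., 4.]])
N = tf.constant([[1., 1.], [0., 1.]])

matrix_product = M @ N   # [[1., 3.], [3., 7.]], same as tf.matmul(M, N)
elementwise = M * N      # [[1., 2.], [0., 4.]], same as tf.multiply(M, N)
```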

Finally, TensorFlow plays nicely with numpy and will allow many numpy functions to be applied to `tf.Tensor` objects:

In [None]:
np.square(c)

## Strings:

TensorFlow can also handle strings and lists of strings.
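
A string or a list of strings goes into `tf.constant` like any other data. The contents below are made up, and the name `p` is chosen only to match the decoding cell further down.

```python
import tensorflow as tf

s = tf.constant("hello world")         # a scalar string tensor
p = tf.constant(["Café", "Coffee"])    # a vector holding two strings
```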

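Strings are stored as byte strings, so Unicode text is held in its UTF-8 encoding and can be decoded into code points:

In [None]:
p = tf.constant(["Café", "Coffee"])  # example strings, chosen for illustration
tf.strings.unicode_decode(p, "UTF8")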

The tf.strings library has a large list of text processing tools that can be found at 

https://www.tensorflow.org/api_docs/python/tf/strings

For example, we can get the string lengths, or join the list into a single string.
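
Both operations are sketched below on a made-up list of words. Note that the default length is counted in bytes, so an accented character counts twice unless we ask for characters.

```python
import tensorflow as tf

words = tf.constant(["café", "coffee"])

byte_lengths = tf.strings.length(words)                     # [5, 6]: é takes two bytes in UTF-8
char_lengths = tf.strings.length(words, unit="UTF8_CHAR")   # [4, 6]
joined = tf.strings.reduce_join(words, separator=" ")       # one string: "café coffee"
```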

## Variables

The constants we've been working with cannot be changed and any operations done to them are returned as something new. However, for a neural network we want both trainable and untrainable parameters so we will need `tf.Variable` objects. 

A `tf.Variable` object `v` can be updated in place using

* `v.assign(new)` - Update the value of the variable `v` to the value `new`.
* `v.assign_add(N)` - Increment the variable by `N`.
* `v.assign_sub(N)` - Decrement the variable by `N`.

In [None]:
v = tf.Variable([[1., 2., 3.], [4., 5., 6.]])
v.assign(2*v)
display(v)
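
The increment and decrement forms work the same way. Here is a small sketch on a made-up variable `w`, including an update to a single entry.

```python
import tensorflow as tf

w = tf.Variable([1., 2., 3.])

w.assign_add([10., 10., 10.])   # w is now [11., 12., 13.]
w.assign_sub([1., 2., 3.])      # w is now [10., 10., 10.]
w[0].assign(0.)                 # update one entry: [0., 10., 10.]
```

In every case the variable itself changes; no new tensor has to be stored.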



### Exercise:

Let's use TensorFlow to compute our linear regression solution $\beta = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$. Try to construct the matrix beta below for the breast cancer data set:

In [None]:
from sklearn.datasets import load_breast_cancer
cancer = load_breast_cancer()
m,n = cancer.data.shape

cancer_bias = np.c_[np.ones((m,1)),cancer.data]
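
If you get stuck, here is a sketch of the formula on a small made-up data set where the answer is known in advance. The names `A`, `b` and `features` are hypothetical stand-ins; adapting the idea to `cancer_bias` and the cancer targets is left as the exercise.

```python
import numpy as np
import tensorflow as tf

# Made-up data: b = 1 + 2*x1 - 3*x2 exactly, so beta should come out as [1, 2, -3].
rng = np.random.default_rng(0)
features = rng.normal(size=(50, 2))
A = tf.constant(np.c_[np.ones((50, 1)), features], dtype=tf.float64)
b = tf.matmul(A, tf.constant([[1.], [2.], [-3.]], dtype=tf.float64))

# The normal equation, written with TensorFlow matrix operations.
At = tf.transpose(A)
beta = tf.linalg.inv(At @ A) @ At @ b   # approximately [[1.], [2.], [-3.]]
```

Using `float64` here is a deliberate choice for the sketch, since matrix inversion is sensitive to rounding.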


### Exercise

Use Tensorflow to implement Elastic Net Regression.

## Customizing Models with Tensorflow:

TensorFlow is a library under development, and as a result not every function is implemented yet. In addition, for a specific application you may come up with your own loss function, or may want to run tests comparing different kinds of homebrewed activation functions. In these cases, we will use lower-level TensorFlow to construct new functions.

Let's implement the [Huber Loss function](https://en.wikipedia.org/wiki/Huber_loss)

$$
L_\delta (a) = \begin{cases}
 \frac{1}{2}{a^2}                   & \text{for } |a| \le \delta, \\
 \delta (|a| - \frac{1}{2}\delta), & \text{otherwise.}
\end{cases}
$$

In the graph below, the Huber loss is in green and the squared error loss is in blue. 

<img width = 400 src="https://upload.wikimedia.org/wikipedia/commons/thumb/c/cc/Huber_loss.svg/600px-Huber_loss.svg.png">

The Huber loss is used in robust regression and is less sensitive to outliers and noisy data than the squared error loss. 

In this case, we'll use $\delta=1$. Our program flow should be roughly something like

__(Pseudocode)__:

    ERROR = y_true - y_pred
    if (abs(ERROR) > 1) then
        return abs(ERROR) - .5
    else
        return (ERROR^2)/2

However, for speed we want to do everything inside TensorFlow, constructing each step as a node in a graph, and that includes the if statement. TensorFlow has an if node in the form of `tf.where`:

* `tf.where(is_true, V1, V2)` - Works elementwise: wherever the boolean tensor `is_true` contains `True`, return the entry from `V1`; otherwise return the entry from `V2`.
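
To see the elementwise behaviour, here is a small sketch with made-up values that keeps small entries as they are and replaces large ones by their sign.

```python
import tensorflow as tf

values = tf.constant([-2., -0.5, 0.5, 3.])

is_small = tf.abs(values) < 1.0                         # [False, True, True, False]
clipped = tf.where(is_small, values, tf.sign(values))   # [-1., -0.5, 0.5, 1.]
```

Both branches are computed for every entry; `tf.where` only selects which result to keep.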

Our code will then look something like this:

In [None]:
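def huber_fn(y_true, y_pred):
    error = y_true - y_pred
    # delta = 1: quadratic for small errors, linear otherwise
    return tf.where(tf.abs(error) < 1, tf.square(error) / 2, tf.abs(error) - 0.5)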

Let's try out our loss function on a simple network with a single dense hidden layer. We'll try to fit the cancer dataset:

In [None]:
from sklearn.datasets import load_breast_cancer
cancer = load_breast_cancer()
m,n = cancer.data.shape

print(cancer.data.shape)

X = tf.constant( cancer.data, dtype=tf.float32, name="X" )
y = tf.constant( cancer.target.reshape(-1,1), dtype=tf.float32, name="y" )

In [None]:
model = keras.models.Sequential()
model.add(keras.layers.Dense(300, input_dim = n))
model.add(keras.layers.Dense(2, activation="softmax"))

model.summary()

We can view the structure of the network with TensorBoard if we include a callback:

In [None]:
import os
from datetime import datetime

# os.path.join picks the right path separator on both Windows and *nix
logdir = os.path.join("logs", "fit", datetime.now().strftime("%Y%m%d-%H%M%S"))

tensorboard_callback = keras.callbacks.TensorBoard(log_dir=logdir)

model.compile(loss=huber_fn, optimizer="nadam", metrics=["accuracy"])
model.fit(X, y, epochs=10, callbacks=[tensorboard_callback])

%load_ext tensorboard
%tensorboard --logdir logs

### Saving Custom Models:

Keras will save a model with a custom component (like a custom loss function) as normal, recording the names of the custom components we've added. The only trick is that when we reload the model we have to provide Keras with a dictionary mapping each custom object it expects (in this case a function named "huber_fn") to the actual object.

In [None]:
model.save("my_model_with_a_custom_loss.h5")

In [None]:
model2 = keras.models.load_model("my_model_with_a_custom_loss.h5",
                                custom_objects={"huber_fn": huber_fn})

What if we wanted to extend the Huber loss function to include the $\delta$ hyperparameter that switches between the linear and quadratic parts? Before digging in too deep, it turns out Keras will have a problem: if we set a custom $\delta$, Keras won't know to save that value. We could save it in an auxiliary file, but that's not scalable.

The solution is to define a new __subclass__ of the Keras loss __class__. A __class__ is a template for objects that have some data and some functions associated with them, like a pandas DataFrame or a numpy array. Both the dataframe and the array aren't just holding data, they have a whole array of functions you can call to manipulate that data. An object built from a __class__ is kind of like a machine (like a car):

* It has some internal states. For a machine these would be the amount of fuel, the orientation of the second cog, or the velocity of a piston in the engine. For us they will be the numbers, arrays, text, etc. that we want to store.

* It has an interface. For the machine this might be dials to read the internal states and pedals to change them.
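
In plain Python, the car picture might look like the sketch below. The `Car` class and its methods are invented purely for illustration.

```python
class Car:
    def __init__(self, fuel=10.0):
        # Internal state: stored on the object itself.
        self.fuel = fuel

    def drive(self, distance):
        # Interface that changes the state (a "pedal").
        self.fuel -= distance / 10

    def fuel_gauge(self):
        # Interface that reads the state (a "dial").
        return self.fuel


car = Car(fuel=10.0)
car.drive(20)
remaining = car.fuel_gauge()   # 8.0
```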

A subclass is a class that inherits from another class and conforms to the standards that class lays out. In this case, Keras has laid out a specific series of methods that it wants a _Loss Function_ class to have. If the class has these methods implemented, Keras will use them during training, saving, and any other time the loss function is needed.

In the case of the HuberLoss, we need to define three functions:

* `__init__()` - Tells Python what to do with the class each time we create a new instance of it.
* `call` - Keras expects that the actual computation of the loss function will be performed if it calls `HuberLoss.call`. 
* `get_config` - When Keras saves the model, it calls this function to get the configuration of the loss function.

The code for the HuberLoss class is below. We'll go through it:

In [None]:
class HuberLoss(keras.losses.Loss):
    def __init__(self, threshold=1.0, **kwargs):
        self.threshold = threshold
        super().__init__(**kwargs)
        
    def call(self, y_true, y_pred):
        error = y_true - y_pred
        is_small_error = tf.abs(error) < self.threshold
        squared_loss = tf.square(error) / 2
        linear_loss = self.threshold * tf.abs(error) - self.threshold**2 / 2
        return tf.where(is_small_error, squared_loss, linear_loss)
    
    def get_config(self):
        base_config = super().get_config()
        return {**base_config, "threshold": self.threshold}

First, when we define the class we put `keras.losses.Loss` in the parentheses to tell Python to inherit all of the machinery the `keras.losses.Loss` class has. This is what it means to __subclass__ a class. In the initialization step, we define a new function

    def __init__(self, threshold=1.0, **kwargs):
        self.threshold = threshold
        super().__init__(**kwargs)
        
Here, `self` references the object being created. The variable `**kwargs` is a generic name for a dictionary of keyword arguments; you'll see it a lot in subclasses as a catchall for any arguments that need to be passed to the parent class. Loss functions typically take extra settings, such as a `sum_over_batch_size` reduction to deal with batch sizes larger than 1, and we're going to let the class `keras.losses.Loss` deal with those since it knows how to. Finally, `threshold=1.0` will of course be our threshold.

In the next step, we define a new variable `self.threshold`. Each time we create a new instance of the loss function we set its threshold, and we can access it through the instance's `threshold` attribute. Finally, we call the initialization function of the `keras.losses.Loss` class, and pass any extra parameters (like the `sum_over_batch_size` reduction) up to it.

In [None]:
l = HuberLoss(threshold=1.0)
l.threshold

The call function is the most straightforward: it's just a direct extension of the `huber_fn` function to include the threshold $\delta$. Note that we always need to pass `self` to any internal functions so that they can access the class's variables:

    def call(self, y_true, y_pred):
        error = y_true - y_pred
        is_small_error = tf.abs(error) < self.threshold
        squared_loss = tf.square(error) / 2
        linear_loss = self.threshold * tf.abs(error) - self.threshold**2 / 2
        return tf.where(is_small_error, squared_loss, linear_loss)
        
Finally, the `get_config` function gets the configuration from the larger `keras.losses.Loss` class, and then appends the `self.threshold` parameter with the dictionary key `"threshold"` to the configuration. It is this configuration that will be saved when Keras saves out the model:

    
    def get_config(self):
        base_config = super().get_config()
        return {**base_config, "threshold": self.threshold}
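
To see why `get_config` matters, here is a sketch that rebuilds a loss object from its own configuration, which is roughly what has to happen when a saved model is reloaded. The class is repeated so the block runs on its own, and the `from_config` round trip is our own illustration, not a cell from the lab.

```python
import tensorflow as tf
from tensorflow import keras

class HuberLoss(keras.losses.Loss):
    def __init__(self, threshold=1.0, **kwargs):
        self.threshold = threshold
        super().__init__(**kwargs)

    def call(self, y_true, y_pred):
        error = y_true - y_pred
        is_small_error = tf.abs(error) < self.threshold
        squared_loss = tf.square(error) / 2
        linear_loss = self.threshold * tf.abs(error) - self.threshold**2 / 2
        return tf.where(is_small_error, squared_loss, linear_loss)

    def get_config(self):
        base_config = super().get_config()
        return {**base_config, "threshold": self.threshold}

original = HuberLoss(threshold=2.0)
config = original.get_config()             # a dictionary that includes "threshold": 2.0
restored = HuberLoss.from_config(config)   # rebuilds the loss as HuberLoss(**config)

# An error of 3 is larger than the threshold, so the linear branch applies: 2*3 - 2 = 4.
value = restored(tf.constant([[0.]]), tf.constant([[3.]]))
```

Without the `"threshold"` entry in the configuration, the rebuilt loss would silently fall back to the default threshold of 1.0.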
        
There's a lot here, and object-oriented programming is a large subject, but in general if you want to correctly implement something in Keras, TensorFlow, or any other library you should do it by subclassing a standard object instead of creating your own objects. The class information for Keras loss functions can be found here:

https://www.tensorflow.org/api_docs/python/tf/keras/losses/Loss

If you look at the right hand side, you can see the methods included in the class. 

Finally, using the class is straightforward:

In [None]:
model.compile(loss=HuberLoss(2.), optimizer="nadam")
model.fit(X, y, epochs=1, callbacks=[tensorboard_callback])

model.save("my_model_with_a_custom_loss.h5")
model2 = keras.models.load_model("my_model_with_a_custom_loss.h5", 
                                 custom_objects={"HuberLoss": HuberLoss},
                                 compile=False)
# NOTE: Reloading the compiled loss is currently broken. Recover by loading with compile=False and recompiling the model.

model2.compile(loss=HuberLoss(2.), optimizer="nadam")

## Custom Activation, Regularization, and Constraints

Defining custom functions and classes for activation functions, regularization and constraints is just as easy. For example,

In [None]:
def my_softplus(z): # return value is just tf.nn.softplus(z)
    return tf.math.log(tf.math.exp(z) + 1.0)

def my_glorot_initializer(shape, dtype=tf.float32):
    stddev = tf.sqrt(2. / (shape[0] + shape[1]))
    return tf.random.normal(shape, stddev=stddev, dtype=dtype)

def my_l1_regularizer(weights):
    return tf.reduce_sum(tf.abs(0.1 * weights))

def my_positive_weights(weights): # return value is just tf.nn.relu(weights)
    return tf.where(weights < 0., tf.zeros_like(weights), weights)

We then can create custom layers using these functions:

In [None]:
model = keras.models.Sequential([
    keras.layers.Dense(30, activation="elu", kernel_initializer="lecun_normal",
                       input_dim=n),
    keras.layers.Dense(2, activation=my_softplus,
                       kernel_regularizer=my_l1_regularizer,
                       kernel_constraint=my_positive_weights,
                       kernel_initializer=my_glorot_initializer),
])
model.summary()

In [None]:
model.compile(loss="sparse_categorical_crossentropy", optimizer="nadam", metrics=["mae"])
model.fit(X, y, epochs=1)

Of course, if you want your model to save any hyperparameters these new functions depend on, you will need to properly define them as subclasses of the appropriate Keras class.
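
For example, the L1 regularizer could be given a tunable factor by subclassing `keras.regularizers.Regularizer`. This is a sketch in the same spirit as the `HuberLoss` class; the class name `MyL1Regularizer` and the `factor` argument are our own choices.

```python
import tensorflow as tf
from tensorflow import keras

class MyL1Regularizer(keras.regularizers.Regularizer):
    def __init__(self, factor=0.1):
        self.factor = factor

    def __call__(self, weights):
        # Same computation as my_l1_regularizer, with a configurable factor.
        return tf.reduce_sum(tf.abs(self.factor * weights))

    def get_config(self):
        # Saved with the model so the factor survives a reload.
        return {"factor": self.factor}

reg = MyL1Regularizer(factor=0.5)
penalty = reg(tf.constant([[1., -2.]]))   # 0.5 * (1 + 2) = 1.5
```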

## Custom Layer and Models

If you want to delve into RNNs, or into training deep custom models, there will come a point where you need to start constructing your own custom layers. The simplest kinds of layers are layers with no trainable parameters, also known as functions. If you just want a layer to apply a function, the simplest option is to write the function and then wrap it in a lambda layer. For example

    exponential_layer = keras.layers.Lambda(lambda x: tf.exp(x))
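
Once created, the lambda layer can be called on a tensor like any other layer. The input below is a made-up example.

```python
import tensorflow as tf
from tensorflow import keras

exponential_layer = keras.layers.Lambda(lambda x: tf.exp(x))

out = exponential_layer(tf.constant([[0., 1.]]))   # [[1., e]], with e about 2.718
```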
    
If you want to build a more complicated custom layer, you should subclass the `keras.layers.Layer` class. A custom layer needs the following methods:

* `__init__` - The layer initializer, to be run when a new layer is created. 
* `build` - The code to initialize the weights and construct the new layer.
* `call` - The code to run the layer on input. 
* `compute_output_shape` - This code will be called by any subsequent layers to determine what their input shape will be. 
* `get_config` - Returns the layers internal parameters. Called when Keras saves the layer. 

For example, we'll reinvent the wheel a bit and build a custom dense layer. 

In [None]:
class MyDense(keras.layers.Layer):
    def __init__(self, units, activation=None, **kwargs):
        super().__init__(**kwargs)
        # TODO: store units and the activation function
        
    def build(self, batch_input_shape):
        # TODO: create the kernel and bias weights
        super().build(batch_input_shape) # must be at the end

    def call(self, X):
        pass  # TODO: compute the layer output
    
    def compute_output_shape(self, batch_input_shape):
        pass  # TODO: return the output shape

    def get_config(self):
        base_config = super().get_config()
        return {**base_config, "units": self.units,"activation": keras.activations.serialize(self.activation)}

In [None]:
class MyDense(keras.layers.Layer):
    def __init__(self, units, activation=None, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.activation = keras.activations.get(activation)
        
    def build(self, batch_input_shape):
        self.kernel = self.add_weight(
            name="kernel", shape=[batch_input_shape[-1], self.units],
            initializer="glorot_normal")
        self.bias = self.add_weight(
            name="bias", shape=[self.units], initializer="zeros")
        super().build(batch_input_shape) # must be at the end

    def call(self, X):
        return self.activation(X @ self.kernel + self.bias)
    
    def compute_output_shape(self, batch_input_shape):
        return tf.TensorShape(batch_input_shape.as_list()[:-1] + [self.units])

    def get_config(self):
        base_config = super().get_config()
        return {**base_config, "units": self.units,"activation": keras.activations.serialize(self.activation)}

#### __init__

We need to pass the standard initialization parameters to the `keras.layers.Layer` class. Then, we save the number of layer nodes in the internal variable `units`. Finally, we look up the activation function that was passed (if any) and save it in the internal variable `activation`.

#### build

Here, we use the `add_weight` function that comes from `keras.layers.Layer` to add an array of weights of shape `batch_input_shape[-1]` by `self.units` to the layer. In this case, `batch_input_shape` will be determined automatically from the previous layer's output. It holds the full shape of the incoming batch, so we only take the last entry, the number of inputs per sample. This is one of the advantages of subclassing: `keras.layers.Layer` already has functions built in to figure out the number of inputs from the previous layer. We initialize these weights using Glorot initialization and save the matrix of weights as `self.kernel`. We then add more weights for the bias.


#### call

Remember that a dense layer is just a matrix multiplication:

$$
N^{i+1} = \sigma(N^{i}W + b)\,.
$$

In the call method, we compute this function. Remember that for `tf.Tensor` objects, `@` is matrix multiplication. 

#### compute_output_shape

Return the input shape with its last dimension replaced by the number of units in this layer. This will be called when the next layer is initialized.

#### get_config

Return the number of units and the activation function used. 



We can now use `MyDense` as a standard dense layer, although for simplicity we didn't implement the `input_dim` argument.

In [None]:
model = keras.models.Sequential()
model.add(keras.layers.Dense(300, input_dim = n))
model.add(MyDense(2, activation="softmax"))

model.summary()

model.compile(loss="sparse_categorical_crossentropy", optimizer="nadam", metrics=["mae"])
model.fit(X, y, epochs=1)

## Exercise:

Implement the `input_dim` parameter. You will need to add an argument to the `__init__` method that looks like `input_dim=None`, so that by default it is set to `None`. Then, in the `__init__` method, save `input_dim` to a new variable called `self.input_dim`. Finally, in the `build` method, use `self.input_dim` if it is not `None`, and the last entry of `batch_input_shape` otherwise. An ordinary Python `if` is enough here, since `input_dim` is a plain Python value.

## Exercise:

The code below constructs a noising layer that adds Gaussian noise to the input. Use this code, and the code for `MyDense` above, to construct a fuzzy dense layer that adds noise to the input and then computes

$$
N^{i+1} = \sigma(N^{i}W + b)\,.
$$
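
The noising layer itself does not appear in this text, so here is a minimal sketch of what such a layer might look like. The class name `MyGaussianNoise`, the `stddev` argument, and the choice to add noise only during training are assumptions made for this sketch.

```python
import tensorflow as tf
from tensorflow import keras

class MyGaussianNoise(keras.layers.Layer):
    def __init__(self, stddev, **kwargs):
        super().__init__(**kwargs)
        self.stddev = stddev

    def call(self, X, training=None):
        # Only perturb the input while training; pass it through otherwise.
        if training:
            noise = tf.random.normal(tf.shape(X), stddev=self.stddev)
            return X + noise
        return X

    def compute_output_shape(self, batch_input_shape):
        # Adding noise does not change the shape.
        return batch_input_shape

    def get_config(self):
        base_config = super().get_config()
        return {**base_config, "stddev": self.stddev}

noise_layer = MyGaussianNoise(stddev=1.0)
x = tf.ones((4, 3))
clean = noise_layer(x, training=False)   # identical to x
noisy = noise_layer(x, training=True)    # x plus Gaussian noise, same shape
```

This layer has no weights, so it needs no `build` method; the fuzzy dense layer will need both the noise step and the weights from `MyDense`.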

# Subclassing Models

Note: Part of the explanation for this code is in Chapter 10 under "Using the Subclassing API to Build Dynamic Models."


Assume that you want to try 10 different DNNs with varying parameters on a specific dataset. You could code each of them out by hand, but if they're structurally similar enough it's better to create a custom model subclass that can instantiate multiple models with your desired parameters.

For example, the code below will build a 2 layer DNN with our specified activation function and layer node count. 

In [None]:
class TwoLayerDNN(keras.Model):
    def __init__(self, units=[30,30], activation="relu", **kwargs):
        super(TwoLayerDNN, self).__init__(**kwargs) # handles standard args (e.g., name)


    def call(self, inputs):

        return self.out(hidden2)                    


model = TwoLayerDNN(units=[40,50])
model.build(input_shape=(None, 20))

model.summary()

In [None]:
class TwoLayerDNN(keras.Model):
    def __init__(self, units=[30,30], activation="relu", **kwargs):
        super(TwoLayerDNN, self).__init__(**kwargs) # handles standard args (e.g., name)
        self.units = units
        self.hidden1 = keras.layers.Dense(units[0], activation=activation)
        self.hidden2 = keras.layers.Dense(units[1], activation=activation)
        self.out = keras.layers.Dense(2, activation="softmax")  # probabilities, as the crossentropy loss used below expects

    def call(self, inputs):
        hidden1 = self.hidden1(inputs)
        hidden2 = self.hidden2(hidden1)
        return self.out(hidden2)                    


model = TwoLayerDNN(units=[40,50])
model.build(input_shape=(None, 20))

model.summary()

Let's use this code to run over a few possible models with a few possible activation functions:

In [None]:
import tensorflow as tf
from tensorflow import keras

from sklearn.datasets import load_breast_cancer
cancer = load_breast_cancer()
m,n = cancer.data.shape

print(cancer.data.shape)

X = tf.constant( cancer.data, dtype=tf.float32, name="X" )
y = tf.constant( cancer.target.reshape(-1,1), dtype=tf.float32, name="y" )

model = TwoLayerDNN(units=[40,50])
model.compile(loss="sparse_categorical_crossentropy", optimizer="nadam", metrics=["accuracy"])
history = model.fit(X, y, epochs=10)

In [None]:
unit1 = [10,20,30,40]
unit2 = [10,20,30,40]
act = ["relu","elu","selu"]

results = []

for i in unit1:          # number of units in the first hidden layer
    for j in unit2:      # number of units in the second hidden layer
        for k in act:    # activation function name
            model = TwoLayerDNN(units=[i,j],activation=k)
            model.compile(loss="sparse_categorical_crossentropy", optimizer="nadam", metrics=["accuracy"])
            history = model.fit(X, y, epochs=10,verbose=2)
            d = history.history
            d["parameters"] = [i,j,k]
            results = results + [d]

In [None]:
from matplotlib import pyplot as plt

plt.figure(figsize = (15,15))

endacc = []

for mod in results:
    par = [str(i) for i in mod["parameters"]]
    lab = " ".join(par)
    plt.plot(mod["accuracy"],label=lab)
    endacc = endacc + [mod["accuracy"][-1]]
    
plt.legend()


We can find the optimal parameters as well:

In [None]:
endacc = [mod["accuracy"][-1] for mod in results ]
best = np.argmax(endacc)  # index of the run with the highest final accuracy
best

In [None]:
results[best]["parameters"]

Of course, we don't need to do an exhaustive search either. There are quite a few toolkits available for hyperparameter tuning, including Hyperopt, Keras Tuner, Scikit-Optimize and GridSearchCV in scikit-learn.

## Extra: 

Let's try to make our chart more useful by using coloring that actually responds to the different parameters we're using:

In [None]:
unit1 = [10,20,30,40]
unit2 = [10,20,30,40]
act = ["relu","elu","selu"]

results = []

for i in range(len(unit1)):
    for j in range(len(unit2)):
        for k in range(len(act)):
            model = TwoLayerDNN(units=[unit1[i],unit2[j]],activation=act[k])
            model.compile(loss="sparse_categorical_crossentropy", optimizer="nadam", metrics=["accuracy"])
            history = model.fit(X, y, epochs=10,verbose=2)
            d = history.history
            d["parameters"] = [i,j,k]
            results = results + [d]

In [None]:
from matplotlib import pyplot as plt

plt.figure(figsize = (15,15))

endacc = []

al = [.3,.5,.7,.9]
cs = ["C0", "C1", "C2", "C3"]
st = [':', '-', '--']

for mod in results:
    p = mod["parameters"]
    par = [str(unit1[p[0]]), str(unit2[p[1]]), str(act[p[2]])]
    lab = " ".join(par)
    plt.plot(mod["accuracy"],
             linestyle=st[p[2]],
             color=cs[p[1]],
             alpha = al[p[0]],
             label=lab)
    endacc = endacc + [mod["accuracy"][-1]]
    
plt.legend()


## Exercise:

Create a subclass that also allows you to select the number of layers. Since you need to have the layers saved in a fixed variable, instead of using

`self.hidden1 = keras.layers.Dense(units[0], activation=activation)`

in the initialization, create a `list` object containing all of your layers.