# Week 7: Neural Networks with TensorFlow

This week we will go over how to use TensorFlow to implement a neural network.


In [153]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

For now, we will be working with the Sequential neural network class in keras. A sequential neural net is one where each layer takes in a single set of inputs and produces a single set of outputs. When our input is a vector (as has been the case so far), we can think of each layer as having a single input weight matrix and a single output weight matrix, where the input matrix has shape
$$\text{(number of neurons in past layer)} \times \text{(number of neurons in this layer)}$$
and the output matrix has shape
$$\text{(number of neurons in this layer)} \times \text{(number of neurons in next layer)}$$
Nonzero entries in these matrices indicate that the output from a neuron in one layer is being passed to a neuron in the next layer.
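
As a rough illustration of these shapes, the NumPy sketch below pushes one 4-feature input through two fully connected layers. The layer sizes (4, 2, 10) and the random weights are assumptions chosen for the example, not values from a trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

n_prev, n_this, n_next = 4, 2, 10  # hypothetical layer sizes

# Matrix feeding this layer: (neurons in past layer) x (neurons in this layer)
W_in = rng.normal(size=(n_prev, n_this))
# Matrix leaving this layer: (neurons in this layer) x (neurons in next layer)
W_out = rng.normal(size=(n_this, n_next))

x = rng.normal(size=(1, n_prev))  # one sample with 4 features
h = x @ W_in                      # values at this layer, shape (1, 2)
out = h @ W_out                   # values at the next layer, shape (1, 10)

print(W_in.shape, W_out.shape, h.shape, out.shape)
```

Biases and activation functions are left out here so that only the shapes are on display.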

We can initialize this neural network using `model = keras.Sequential()`.


In [178]:
model = keras.Sequential()

Now that we have initialized our model as a sequential neural network, we want to start adding layers to it. To initialize a dense layer we can use

``layers.Dense(num_neurons)``

A Dense layer means each neuron will take in all the inputs from the past layer (input matrix is fully non-zero). There are a few other options we can use when initializing a layer:

`` use_bias = True/False``

This specifies whether the neurons in this layer will use a bias term. By default `use_bias` is `True`.

`` activation = "relu"/"sigmoid"/etc.``

This specifies what activation function is used in the layer. If this is left unspecified, then we will use a linear activation function $\pi(x) = x$.
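
To make these options concrete, here is a minimal NumPy sketch of the computation a dense layer performs, `activation(x @ W + b)`. The helper `dense_forward` and the example values are hypothetical and only mirror the `use_bias` and `activation` options; this is not how keras implements the layer internally.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(z, 0.0)

def dense_forward(x, W, b=None, activation=None):
    """Hypothetical helper: one dense layer applied to a batch x of shape (n, d)."""
    z = x @ W
    if b is not None:       # corresponds to use_bias = True
        z = z + b
    if activation is None:  # linear activation: pi(x) = x
        return z
    return activation(z)

x = np.array([[1.0, -2.0]])        # one sample, two features
W = np.array([[0.5, -1.0, 0.0],
              [0.25, 0.5, 1.0]])   # 2 inputs -> 3 neurons
b = np.array([1.0, 3.0, 0.0])

linear_out = dense_forward(x, W, b)                       # [[ 1.,  1., -2.]]
relu_out = dense_forward(x, W, b, activation=relu)        # [[1., 1., 0.]]
sigmoid_out = dense_forward(x, W, b, activation=sigmoid)  # every entry in (0, 1)
```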

We also may want to specify the shape of the initial input. We can do this using 

``keras.Input(shape = ())``

In this particular case we will be using 4 features to classify penguins, so we specify our input shape as `(4,)`.

In [179]:
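# Each sample has 4 features; the batch dimension is not part of `shape`
input = keras.Input(shape = (4,))
# Hidden layers with 2, 10 and 20 neurons
layer1 = layers.Dense(2, use_bias = True, activation = "sigmoid")
layer2 = layers.Dense(10, use_bias = True, activation = "sigmoid")
# No activation given, so this layer is linear; it also has no bias term
layer3 = layers.Dense(20, use_bias = False)
# Single sigmoid output: a value between 0 and 1
output = layers.Dense(1, use_bias = True, activation = "sigmoid")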

Once we have set up our layers, we can add them to our neural network using `model.add()`.


In [180]:
model.add(input)
model.add(layer1)
model.add(layer2)
model.add(layer3)

We can remove the last layer by using `model.pop()`.

In [181]:
print(len(model.layers))
model.pop()
print(len(model.layers))



3
2


At this point we can add the output layer.

In [182]:
model.add(output)

Our model is now initialized. We can use the `model.summary()` method to take a look at the layers in our model.

In [183]:
model.summary()

Model: "sequential_10"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_62 (Dense)             (None, 2)                 10        
_________________________________________________________________
dense_63 (Dense)             (None, 10)                30        
_________________________________________________________________
dense_65 (Dense)             (None, 1)                 11        
Total params: 51
Trainable params: 51
Non-trainable params: 0
_________________________________________________________________
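
The `Param #` column can be checked by hand: a dense layer with a bias has (inputs × neurons) weights plus one bias per neuron. The helper `dense_params` below is a hypothetical name used only for this check.

```python
def dense_params(n_inputs, n_neurons, use_bias=True):
    """Number of trainable parameters in one dense layer."""
    return n_inputs * n_neurons + (n_neurons if use_bias else 0)

# (inputs, neurons) for the three layers left in the model after model.pop()
layer_sizes = [(4, 2), (2, 10), (10, 1)]
counts = [dense_params(n_in, n_out) for n_in, n_out in layer_sizes]

print(counts)       # [10, 30, 11]
print(sum(counts))  # 51
```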


We can also use the `model.weights` attribute to see what (random) weights the model has been initialized with.

In [184]:
model.weights

[<tf.Variable 'dense_62/kernel:0' shape=(4, 2) dtype=float32, numpy=
 array([[ 0.5941806 , -0.3471973 ],
        [ 0.07243395, -0.09919524],
        [-0.58444524, -0.05843472],
        [ 0.00209427, -0.34695196]], dtype=float32)>,
 <tf.Variable 'dense_62/bias:0' shape=(2,) dtype=float32, numpy=array([0., 0.], dtype=float32)>,
 <tf.Variable 'dense_63/kernel:0' shape=(2, 10) dtype=float32, numpy=
 array([[-0.66048074, -0.62314504, -0.23081267, -0.5326394 , -0.07129222,
          0.6264935 , -0.35728034, -0.39815134,  0.1333983 ,  0.49193138],
        [ 0.7062308 , -0.08162123,  0.4638719 ,  0.6584328 ,  0.58804685,
          0.34665996, -0.53175616,  0.20638853, -0.60973036, -0.3571642 ]],
       dtype=float32)>,
 <tf.Variable 'dense_63/bias:0' shape=(10,) dtype=float32, numpy=array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)>,
 <tf.Variable 'dense_65/kernel:0' shape=(10, 1) dtype=float32, numpy=
 array([[ 0.6257991 ],
        [-0.34274462],
        [ 0.6847772 ],
        [-

We can see what predictions our model would make given these weights by passing it a single 4-element vector.

In [185]:
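import seaborn as sns
import numpy as np
# Load the penguins data and drop rows with missing values
penguins = sns.load_dataset('penguins').dropna()
penguins.head()
# Label: 1 if the penguin is an Adelie, 0 otherwise
Y = 1*(penguins['species']=="Adelie")
# Features: the four numeric measurements
X = penguins[['bill_length_mm','bill_depth_mm','flipper_length_mm','body_mass_g']]
# Keep the first row as a 1 x 4 array so that it has a batch dimension
test = np.array(X[0:1])
y = model(test)
print(y)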


tf.Tensor([[0.3925631]], shape=(1, 1), dtype=float32)
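
The slice `X[0:1]` is used because the model expects a leading batch dimension, which is also why the result has shape `(1, 1)`. A small NumPy sketch of the difference, using a made-up 3 × 4 array in place of the penguin features:

```python
import numpy as np

# Made-up stand-in for the feature table: 3 samples, 4 features each
X_demo = np.arange(12, dtype=float).reshape(3, 4)

single_row = X_demo[0]      # shape (4,): a bare vector, no batch dimension
batch_of_one = X_demo[0:1]  # shape (1, 4): a batch containing one sample

print(single_row.shape, batch_of_one.shape)
```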


Now, we want to move on to training our neural network using the data. We first split our data into training and testing sets.

In [162]:
from sklearn.model_selection import train_test_split
Xtrain, Xtest, Ytrain, Ytest = train_test_split(X,Y)
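
By default `train_test_split` shuffles the rows and holds out 25% of them for testing. The NumPy sketch below mimics that idea on made-up data; the function name `simple_split` and the fixed seed are assumptions for illustration, and the rows it selects will differ from scikit-learn's.

```python
import numpy as np

def simple_split(X, Y, test_fraction=0.25, seed=0):
    """Hypothetical helper: shuffle row indices, then cut off a test set."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(np.ceil(test_fraction * len(X)))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return X[train_idx], X[test_idx], Y[train_idx], Y[test_idx]

X_demo = np.arange(40, dtype=float).reshape(10, 4)  # 10 made-up samples
Y_demo = np.arange(10) % 2                          # made-up 0/1 labels

Xtr, Xte, Ytr, Yte = simple_split(X_demo, Y_demo)
print(Xtr.shape, Xte.shape)
```

Passing `random_state` to `train_test_split` makes the real split reproducible from run to run.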

Next, we specify a training configuration using

``model.compile(optimizer, loss, metrics)``

The optimizer tells keras how to update the weights using the gradients computed by back-propagation. Examples include ``keras.optimizers.SGD()``, ``keras.optimizers.Adam()``, or ``keras.optimizers.RMSprop()``. For now we use Adam, a variant of stochastic gradient descent that adapts its step size using running estimates of the first and second moments of the gradient.
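
To give a feel for how optimizers differ, the sketch below minimizes a toy one-parameter loss with a plain gradient step and with the Adam update rule, which keeps running averages of the gradient and of the squared gradient. The toy loss, the learning rate of 0.1, and the step count are assumptions for illustration; keras's own implementations and defaults differ in detail.

```python
import numpy as np

def grad(w):
    # Gradient of the toy loss f(w) = (w - 3)^2
    return 2.0 * (w - 3.0)

# Plain gradient descent: step directly along the negative gradient
w_sgd = 0.0
for _ in range(500):
    w_sgd -= 0.1 * grad(w_sgd)

# Adam: running averages of the gradient (m) and of the squared gradient (v)
w_adam, m, v = 0.0, 0.0, 0.0
lr, beta1, beta2, eps = 0.1, 0.9, 0.999, 1e-8
for t in range(1, 501):
    g = grad(w_adam)
    m = beta1 * m + (1 - beta1) * g      # first-moment estimate
    v = beta2 * v + (1 - beta2) * g * g  # second-moment estimate
    m_hat = m / (1 - beta1 ** t)         # bias correction for the zero start
    v_hat = v / (1 - beta2 ** t)
    w_adam -= lr * m_hat / (np.sqrt(v_hat) + eps)

print(w_sgd, w_adam)  # both should end up close to the minimizer w = 3
```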

The loss argument tells keras what loss function we want to minimize during training. Examples include ``keras.losses.MeanSquaredError()``, ``keras.losses.KLDivergence()``, etc. For this example we will use the MSE.
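
As a quick reminder of what the MSE measures, here it is computed by hand in NumPy on made-up labels and predictions:

```python
import numpy as np

y_true = np.array([1.0, 0.0, 1.0])  # made-up 0/1 labels
y_pred = np.array([0.9, 0.2, 0.6])  # made-up model outputs

# Mean of the squared differences: (0.01 + 0.04 + 0.16) / 3
mse = np.mean((y_true - y_pred) ** 2)
print(mse)  # approximately 0.07
```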

Finally, the metrics argument tells keras what metrics to use to describe our model's performance on the training data. Examples include ``keras.metrics.Accuracy()``, ``keras.metrics.BinaryCrossentropy()``, etc. We can pass multiple metrics to this argument. For this example we will use the KL Divergence.


In [186]:
model.compile(optimizer = keras.optimizers.Adam(), loss=keras.losses.MeanSquaredError(), metrics=[keras.metrics.KLDivergence()])

Finally, we fit the model using

``model.fit(Xtrain, Ytrain, epochs)``

When training the model, keras will (randomly) split the data into batches and use back-propagation to compute the gradient on each batch. The `epochs` argument tells keras how many passes to make over the whole dataset. For now we try 50 epochs.
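
The batching described above can be sketched as a pair of nested loops. The sample count and epoch count here are made up; 32 is the default `batch_size` in `model.fit`. No actual training happens in this sketch, it only shows how the data is carved up.

```python
import numpy as np

n_samples, batch_size, epochs = 100, 32, 2  # made-up sizes for illustration
rng = np.random.default_rng(0)

batch_sizes_seen = []
for epoch in range(epochs):
    order = rng.permutation(n_samples)  # reshuffle the rows every epoch
    for start in range(0, n_samples, batch_size):
        batch_idx = order[start:start + batch_size]
        # A real training loop would compute the loss gradient on this
        # batch and let the optimizer update the weights here.
        batch_sizes_seen.append(len(batch_idx))

print(batch_sizes_seen)  # [32, 32, 32, 4, 32, 32, 32, 4]
```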

In [187]:
model.fit(Xtrain, Ytrain, epochs = 50)

Epoch 1/50
...
Epoch 50/50


<tensorflow.python.keras.callbacks.History at 0x7f86f16048d0>

Using the `predict` method, we can make predictions and validate this neural net using our testing data.

In [188]:
Yhat = 1*(model.predict(Xtest).flatten() >= 0.5)
print(np.mean(Yhat == Ytest))


0.6666666666666666
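
The number above is the classification accuracy: the fraction of test penguins whose thresholded prediction matches the true label. The same calculation on a few made-up values:

```python
import numpy as np

probs = np.array([0.2, 0.7, 0.5, 0.4])  # made-up model outputs
y_true = np.array([0, 1, 0, 0])         # made-up true labels

y_hat = 1 * (probs >= 0.5)              # thresholded labels: [0, 1, 1, 0]
accuracy = np.mean(y_hat == y_true)
print(accuracy)  # 0.75
```

A useful reference point when reading an accuracy is the score obtained by always predicting the most common class, which for 0/1 labels is `max(np.mean(Ytest), 1 - np.mean(Ytest))`.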


Our model performs just OK, but we did get it to work.