# Lab 2: Deep Learning with TensorFlow and Keras

In today's class we'll look at how easy it is to implement a neural net using the framework TensorFlow and the library Keras in Python. We'll also explain the basic building blocks of a neural net, such as neuron, layer, activation function, loss function, and how we train neural nets.

Sources:
* [TensorFlow documentation](https://www.tensorflow.org/api_docs/python/)
* [Keras documentation](https://keras.io)

## 0. Importing TensorFlow and Keras

In order to implement and train our first neural nets, we first need to have TensorFlow and Keras installed and imported. A manual on installing these tools can be found in **Lab 0**.

If you already have the tools installed, we can import them as follows:

In [None]:
import tensorflow as tf

If some warnings pop up, don't be alarmed: they are usually caused by libraries being updated at different paces, so that one library may still call methods that another has already deprecated.

If TensorFlow was imported successfully, we can continue with importing Keras and other useful libraries:

In [None]:
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split

from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

`pandas` is a useful library for working with tabular datasets. It lets us select rows and columns either by position or by label.

The library `sklearn` (scikit-learn) offers tools related to machine learning. From these, we will use the class `LabelEncoder`, which converts non-numerical values (such as class labels) into integers, and the function `train_test_split`, which randomly splits a dataset into a training and a testing set.

Finally, we import all the classes that we will use from `Keras`:
* `Sequential` is a model type that represents a plain stack of layers, i.e. a simple feed-forward neural net
* `Dense` is a fully connected layer where each neuron in the layer is connected to all neurons from the previous layer
* `Adam` is a frequently used optimizer, i.e. the algorithm that updates the weights during training; it adapts the step size as training progresses, which often speeds up convergence

## 1. Building a Perceptron in TensorFlow

The perceptron is an algorithm for supervised learning of binary classifiers and can be represented as a single neuron. This provides us with the perfect opportunity to look at how neurons work and how they are implemented in TensorFlow at a low level. Don't worry, Keras adds some extra layers of abstraction that will make it easy to build neural nets.

Let's first look at the structure of a neuron:
![Structure of a neuron](figures/lab1-neuron.png)

A neuron is a computational unit that is built up from a number of simple operations. First, it calculates a weighted sum of all inputs, then it adds a bias, and finally applies an activation function. Some definitions add an output function that usually returns the same value as its input and, therefore, we will not consider it here.
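
Written as a formula, the three steps look as follows. The symbols are our own notation and are not taken from the figure: $x_1, \dots, x_n$ are the inputs, $w_1, \dots, w_n$ the weights, $b$ the bias and $f$ the activation function.

```latex
z = \sum_{i=1}^{n} w_i x_i + b, \qquad \hat{y} = f(z)
```

For the sigmoid activation, $f(z) = 1 / (1 + e^{-z})$.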

We will now build a simple neuron and test its functionality by predefining all its weights to 1:

In [None]:
# simple neuron with two input nodes
def my_neuron(input_vals):
    # define some arbitrary weights for the two input values
    W = tf.Variable(tf.ones((2, 1)))

    # define the bias of the neuron
    b = tf.Variable(1, dtype=tf.float32)

    # compute weighted sum (hint: check out tf.matmul)
    z = tf.matmul(input_vals, W) + b
    print(z)

    # apply the sigmoid activation function (hint: use tf.sigmoid)
    output = tf.sigmoid(z)

    return output

First, we initialize the weights to 1 by creating a `Variable` - in TensorFlow, values can be saved into variables or constants. We call a method similar to `numpy`'s `ones` method: it creates an array of ones with the given shape (2 inputs, 1 output). We then initialize the bias to 1 as well.

The function `matmul` is used to calculate the weighted sum (or dot product), to which we then add the bias. So that we can check the result of the computation, we print `z`. Since in TensorFlow all computation is done through tensors, the result of the print statement will be a string representation of the particular tensor.

Finally, we call `tf.sigmoid` to apply the sigmoid activation function, which squashes any real number into the range (0, 1) and is a popular choice for binary classification tasks.

We can test our neuron by giving it some sample inputs:

In [None]:
sample_input = tf.constant([[2, 3]], shape=(1, 2), dtype=tf.float32)

# if you've done everything correctly, this should give you a tensor with value 0.9975274
result = my_neuron(sample_input)
print(result)

As mentioned before, the result of the print statement will be a string representation of the tensor and should look similar to this:

```
tf.Tensor([[6.]], shape=(1, 1), dtype=float32)
tf.Tensor([[0.9975274]], shape=(1, 1), dtype=float32)
```
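
As a sanity check, the same computation can be reproduced without TensorFlow. The following is a minimal NumPy sketch under the same assumptions (all weights and the bias equal to 1); the helper name `numpy_neuron` is ours and is not part of any library.

```python
import numpy as np

def numpy_neuron(inputs, weights, bias):
    """Weighted sum plus bias, followed by the sigmoid activation."""
    z = inputs @ weights + bias          # weighted sum, shape (1, 1)
    return z, 1.0 / (1.0 + np.exp(-z))   # sigmoid(z)

x = np.array([[2.0, 3.0]])   # same sample input as above
W = np.ones((2, 1))          # weights fixed to 1
b = 1.0                      # bias fixed to 1

z, output = numpy_neuron(x, W, b)
print(z)        # weighted sum: 2*1 + 3*1 + 1 = 6
print(output)   # sigmoid(6), approximately 0.99753
```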

## 2. Building a Neural Net with Keras

Now that we've taken a look under the hood of TensorFlow, we'll look at how easy it is to build a neural net from scratch with Keras.

Our first task is to load the dataset using `pandas`. For the purposes of this lab, we'll be working with the Iris dataset which you can download [here](https://archive.ics.uci.edu/ml/datasets/Iris).

In [None]:
dataset = pd.read_csv("iris.csv")

X = dataset.iloc[:, :4].values
y = dataset.iloc[:, -1].values

We split the dataset into the inputs (first four columns) and the outputs (last column). The outputs now contain class labels as strings, which are unsuitable for use with neural nets. Therefore, we one-hot encode them, meaning that we change each label into a vector with a length equal to the number of classes. Each vector contains zeroes and a single 1 whose position represents the class (1 0 0 for class one, 0 1 0 for class two and 0 0 1 for class three).

In [None]:
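# map each class name to an integer (0, 1, 2)
encoder = LabelEncoder()
y1 = encoder.fit_transform(y)

# one-hot encode the integers; dtype=float yields 0.0/1.0 rather than booleans
Y = pd.get_dummies(y1, dtype=float).values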
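
To make the two encoding steps concrete, here is a small NumPy-only sketch of what they produce on a toy label list. It is an illustration with made-up labels, not the lab's code: `np.unique` stands in for `LabelEncoder`, and indexing into an identity matrix stands in for `get_dummies`.

```python
import numpy as np

labels = np.array(["setosa", "virginica", "versicolor", "setosa"])  # toy labels

# step 1: map each distinct string to an integer (classes are sorted alphabetically)
classes, int_labels = np.unique(labels, return_inverse=True)

# step 2: turn each integer into a one-hot row
one_hot = np.eye(len(classes))[int_labels]

print(classes)      # ['setosa' 'versicolor' 'virginica']
print(int_labels)   # [0 2 1 0]
print(one_hot)
```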

Now that we have our one-hot encoded outputs, we can split our dataset into training and testing sets. To this end, we will use the function `train_test_split` from `sklearn` which has the following main parameters:
* list of inputs
* list of outputs
* test_size - the size of the testing set, given as a fraction between 0 and 1 (0.2 means 20% of the samples)

In [None]:
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2)
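
Because the split is random, results will differ from run to run. A hedged variant, assuming you want reproducible and class-balanced splits, passes `random_state` and `stratify` (both are existing `train_test_split` parameters; the seed value 42 and the synthetic data below are placeholders of ours):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# synthetic stand-in for Iris: 150 samples, 4 features, 3 balanced classes
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(150, 4))
y_demo = np.repeat([0, 1, 2], 50)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo,
    test_size=0.2,       # fraction of samples held out for testing
    random_state=42,     # fixed seed so the split is repeatable
    stratify=y_demo)     # keep class proportions equal in both sets

print(X_tr.shape, X_te.shape)   # (120, 4) (30, 4)
print(np.bincount(y_te))        # [10 10 10]
```

In the lab's code, the corresponding argument would presumably be `stratify=y1`, since `y1` holds the integer class labels.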

We can now build our model. We first create an empty `Sequential` model (for model functionality check the [documentation](https://keras.io/models/model/)) and then we add fully connected layers to it:

In [None]:
model = Sequential()

model.add(Dense(10, input_shape=(4,), activation='tanh'))
model.add(Dense(8, activation='tanh'))
model.add(Dense(6, activation='tanh'))
model.add(Dense(3, activation='softmax'))

As you can see, the constructor of layers has a number of parameters. For the first added layer, you must specify the input size (the number of inputs from the dataset). For all layers, you must specify the number of neurons (first parameter) and you should also specify the activation function (the list of all available activation functions can be found [here](https://keras.io/activations/) but you can also create your own).

If you want to check out the structure of the model, you can do so by calling the method `summary`:

In [None]:
model.summary()
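
The parameter counts reported by `summary` can be checked by hand: a `Dense` layer with `n_in` inputs and `n_out` neurons has `n_in * n_out` weights plus `n_out` biases. A short sketch for the architecture above (the helper `dense_params` is our own):

```python
def dense_params(n_in, n_out):
    """Number of trainable parameters in a fully connected layer."""
    return n_in * n_out + n_out   # weights + biases

layer_sizes = [4, 10, 8, 6, 3]    # input size followed by the four Dense layers
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(counts)        # [50, 88, 54, 21]
print(sum(counts))   # 213
```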

Before a Keras model can be trained, it has to be configured by calling `compile`, which attaches the optimizer, the loss function and the metrics to it. This differs from PyTorch, where you typically write the training loop yourself and handle the loss and the optimizer explicitly. We'll compile the model in the next step:

In [None]:
# `learning_rate` replaces the deprecated `lr` argument
model.compile(optimizer=Adam(learning_rate=0.04),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

The main parameters are the following:
* optimizer - the algorithm that updates the weights during training; its choice and its learning rate affect the rate of convergence ([list of optimizers](https://keras.io/optimizers/))
* loss - the loss function used; `categorical_crossentropy` is usually used for multi-class classification ([list of loss functions in Keras](https://keras.io/losses/))
* metrics - the metrics shown during training to evaluate the performance of the model ([list of supported metrics](https://keras.io/metrics/))
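
To see what `categorical_crossentropy` measures, here is a minimal NumPy sketch of softmax followed by the cross-entropy for a single sample. The function names and the example scores are ours; Keras' own implementation also deals with batching and numerical edge cases, which this sketch ignores.

```python
import numpy as np

def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    e = np.exp(z - np.max(z))   # subtract the max for numerical stability
    return e / e.sum()

def cross_entropy(y_true, y_prob):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return -np.sum(y_true * np.log(y_prob))

scores = np.array([2.0, 1.0, 0.1])   # made-up outputs of the last layer
target = np.array([1.0, 0.0, 0.0])   # one-hot label: class one

probs = softmax(scores)
print(probs)                          # roughly [0.659 0.242 0.099]
print(cross_entropy(target, probs))   # roughly 0.417
```

The loss is small when the probability assigned to the correct class is close to 1 and grows as that probability shrinks.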

To train the model, you must call the method `fit`, providing the training inputs and outputs and the number of epochs. For further parameters, please check out the documentation.

In [None]:
model.fit(X_train, Y_train, epochs=100)

While `fit` is used for training the model, `predict` can be used to get the predicted output for a given input or list of inputs like so:

In [None]:
Y_pred = model.predict(X_test)

If you want to evaluate the performance of your model using further metrics, you can do so using `numpy` and `sklearn`:

In [None]:
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# convert one-hot targets and predicted probabilities to class indices
Y_test_class = np.argmax(Y_test, axis=1)
Y_pred_class = np.argmax(Y_pred, axis=1)

print(classification_report(Y_test_class, Y_pred_class))
print(confusion_matrix(Y_test_class, Y_pred_class))
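
For intuition, the confusion matrix and the accuracy can also be computed by hand. This is a minimal sketch on made-up class indices (not actual model output), assuming rows are true classes and columns are predicted classes, which is the same convention `confusion_matrix` uses:

```python
import numpy as np

true_cls = np.array([0, 0, 1, 1, 2, 2])   # made-up ground truth
pred_cls = np.array([0, 0, 1, 2, 2, 2])   # made-up predictions

n_classes = 3
cm = np.zeros((n_classes, n_classes), dtype=int)
for t, p in zip(true_cls, pred_cls):
    cm[t, p] += 1                          # row = true class, column = predicted class

accuracy = np.trace(cm) / cm.sum()         # correct predictions lie on the diagonal
print(cm)
print(accuracy)
```

Off-diagonal entries show which classes get confused with each other; here one sample of class 1 was predicted as class 2.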