## Deep Learning with TensorFlow

### Deep Learning 

Deep learning is a specific subfield of machine learning that learns to transform input data into increasingly
meaningful representations in successive layers, in order to successfully perform a task. 

These layered representations are learned via models called neural
networks.

How Deep Learning Works:

![](./im/how_dl_works.png)

### TensorFlow

[TensorFlow](https://www.tensorflow.org/) is an open source deep learning platform developed by Google.

Installing TensorFlow:

`pip install tensorflow`

It is recommended to install TensorFlow in a virtual environment.

You can create a virtual environment with the following command:

`python -m venv \path\to\.myenv` 


where `.myenv` is the name of the environment folder.

Activating the environment on Linux:
`$ source /path/to/.myenv/bin/activate`

Activating the environment in PowerShell:
`PS> path\to\.myenv\Scripts\Activate.ps1`

On Microsoft Windows, you may need to enable the `Activate.ps1` script by setting the execution policy for the user. You can do this with the following PowerShell command:
`PS C:\> Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser`

If you want to switch projects or otherwise leave your virtual environment, simply run:
`deactivate`

In [None]:
import tensorflow as tf
tf.__version__

### Keras
Keras is a deep learning API for Python that provides a convenient
way to define and train deep learning models.

TensorFlow itself now comes bundled with its own Keras implementation,
`tf.keras`.

#### A Simple Neural Network with 5 inputs and 2 outputs
![ann](./im/ann2.jpg)

### A Neuron
![neuron](./im/neuron.jpeg)

We can use the `Sequential` API in `tf.keras` when our network is composed of a single stack of layers connected sequentially.

In [None]:
model=tf.keras.Sequential()

We will add a fully connected layer of 4 neurons to our model. We also specify the activation function and the input shape (5 features per sample).

In [None]:
model.add(tf.keras.layers.Dense(4, activation='relu', input_shape=(5,)))
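
To make the idea of a fully connected layer concrete, here is a minimal plain-Python sketch of what a layer of 4 neurons with ReLU activation computes for one input sample. The function names and all weight, bias, and input values are hypothetical, chosen only for illustration. Keras initializes the weights randomly and stores them as a single matrix of shape (inputs, units), so this is not how `Dense` is implemented internally.

```python
def relu(z):
    """ReLU activation: keep positive values, replace negatives with 0."""
    return max(0.0, z)


def dense_forward(inputs, weights, biases, activation=relu):
    """Compute the outputs of one fully connected layer for one sample.

    weights[j] holds the input weights of neuron j; biases[j] is its bias.
    """
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        # Weighted sum of the inputs plus the bias, then the activation.
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(z))
    return outputs


# Hypothetical example values: 5 inputs feeding 4 neurons.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
W = [
    [0.1, 0.1, 0.1, 0.1, 0.1],
    [-0.2, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.0, -0.5, 0.0],
    [1.0, 0.0, 0.0, 0.0, -1.0],
]
b = [0.0, 0.1, 2.0, 0.0]

layer_output = dense_forward(x, W, b)
print(layer_output)  # one value per neuron
```

Neurons whose weighted sum is negative output exactly 0 because of the ReLU.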

Add another layer of 2 neurons to the model.

In [None]:
# No activation; the input shape is inferred from the previous layer,
# so each of the 2 neurons will have 4 inputs.
model.add(tf.keras.layers.Dense(2))

The model’s `summary()` method displays all the model’s layers, including each layer’s
name (which is automatically generated unless you set it when creating the layer), its output shape, and its number of parameters.

In [None]:
model.summary()

`None` in output shape means batch size can be anything.
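
The parameter counts reported by `summary()` can be checked by hand: each neuron in a fully connected layer has one weight per input plus one bias. The helper below is a small plain-Python sketch (the name `dense_param_count` is made up for this example and is not part of Keras).

```python
def dense_param_count(n_inputs, layer_units):
    """Count trainable parameters in a stack of fully connected layers.

    n_inputs: number of input features.
    layer_units: number of neurons in each layer, in order.
    """
    total = 0
    fan_in = n_inputs
    for units in layer_units:
        total += fan_in * units + units  # weights + biases
        fan_in = units                   # this layer feeds the next one
    return total


# The network above: 5 inputs -> 4 neurons -> 2 neurons.
total_params = dense_param_count(5, [4, 2])  # (5*4 + 4) + (4*2 + 2)
print(total_params)
```

For the 5-4-2 network this gives 24 parameters in the first layer and 10 in the second.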

Instead of adding the layers one by one as we just did, you can pass a list of layers when creating the Sequential model:

In [None]:
model=tf.keras.Sequential([tf.keras.layers.Dense(4, activation='relu', input_shape=(5,)),
                           tf.keras.layers.Dense(2)])
model.summary()

You can fetch a layer by its index, and access its parameters by the `get_weights()` method:

In [None]:
model.layers[0].get_weights()

**Exercise** Using the Sequential API, create the network shown below. Use relu activation for layer 1 and layer 2 and sigmoid activation for the output layer. How many parameters will the network have?

![ann](./im/annq.jpg)

In [None]:
#[] your code here


### Regression using ANN

There are two major types of supervised machine learning problems: classification
and regression.
In classification, the goal is to predict a class label, which is a choice from a predefined
list of possibilities.

For regression tasks, the goal is to predict a continuous number.

There are two basic types of regression models: simple regression and multiple regression.
Simple regression uses one independent variable to estimate a dependent variable.
It can be either linear or nonlinear, depending on the nature of the relationship between the independent and dependent variables.
When more than one independent variable is present,
the process is called multiple regression.
Again, depending on the relationship between the dependent and independent variables,
it can be either linear or nonlinear.


### Simple Polynomial Regression Example

We’re going to train a network to model data generated by a quadratic function $y=0.5x^2 +x + 2$.
This will result in a model that can take a value, x, and predict its y.

To generate training data, start by drawing uniformly distributed x values from -3 to 3. Then use the quadratic function to compute the corresponding y values, and add Gaussian noise (standard deviation 0.2) to simulate real data.

In [None]:
x_values=tf.random.uniform(shape=[100], minval=-3, maxval=3) 
y_values = 0.5*x_values**2 + x_values + 2 + 0.2* tf.random.normal(x_values.shape) 

In [None]:
# Plot out the data 
import matplotlib.pyplot as plt
plt.plot(x_values, y_values, 'ro')
plt.xlabel('x_values')
plt.ylabel('y_values')

We'll use Keras to create a simple model architecture

In [None]:
model_1 = tf.keras.Sequential()
model_1.add(tf.keras.layers.Dense(16, activation='relu', input_shape=(1,)))  # hidden layer with 16 neurons
model_1.add(tf.keras.layers.Dense(1))  # no activation, so the output can take any real value
model_1.summary()

After a model is created, you must call its `compile()` method to specify the `loss` function
and the `optimizer` to use.

In [None]:

# Compile the model using a standard optimizer and loss function for regression
model_1.compile(optimizer='sgd', loss='mse')
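
The `'mse'` loss is the mean squared error: the average of the squared differences between the targets and the predictions. As a rough plain-Python sketch (the function names and the three sample points are chosen only for illustration), here is the loss of a model that always predicts 3 on a few noise-free points of the quadratic:

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


def quadratic(x):
    """The function the tutorial's data is generated from (without noise)."""
    return 0.5 * x**2 + x + 2


xs = [-2.0, 0.0, 2.0]
targets = [quadratic(x) for x in xs]   # [2.0, 2.0, 6.0]
constant_guess = [3.0, 3.0, 3.0]       # a deliberately poor "model"

loss = mean_squared_error(targets, constant_guess)
print(loss)  # (1 + 1 + 9) / 3
```

Training tries to adjust the weights so that this number becomes as small as possible on the training data.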

Training then starts when you call the model's `fit()` method:

In [None]:
model_1.fit(x_values, y_values, epochs=100)

Once you’ve trained your model, you can use it to make predictions
on new data. This is called *inference*.

In [None]:
# x = 1, y = ?
# Pass a batch of shape (1, 1): one sample with one feature.
model_1.predict(tf.constant([[1.0]]))

The predict() method can accept multiple input samples. Let us generate some test data and make predictions:

In [None]:
x_test=tf.random.uniform(shape=[100], minval=-3, maxval=3) 
predictions = model_1.predict(x_test)

In [None]:
plt.plot(x_test, predictions, 'r.', label='predictions')
plt.plot(x_test, 0.5*x_test**2+x_test+2, 'b.', label='true values')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()

### How to improve our model?
More training data? More epochs? More neurons in a layer? More layers? Different activation function?

In [None]:
x_train=tf.random.uniform(shape=[1000], minval=-3, maxval=3) #more training data
y_train = 0.5*x_train**2 + x_train + 2 + 0.2* tf.random.normal(x_train.shape) 


In [None]:
model_1 = tf.keras.Sequential()
model_1.add(tf.keras.layers.Dense(16, activation='relu', input_shape=(1,))) 
model_1.add(tf.keras.layers.Dense(1)) 
model_1.compile(optimizer='sgd', loss='mse')

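Validation is used to determine whether the model is overfitting to the training data: a validation loss that stops improving while the training loss keeps falling is a sign of overfitting. With `validation_split=0.2`, Keras holds out 20% of the training data for validation. Train the model with validation added: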

In [None]:
history_1 = model_1.fit(x_train, y_train, epochs=500, batch_size=16, validation_split=0.2)

The `batch_size` argument specifies how many pieces of training data to feed into
the network before measuring the loss and updating the weights and biases.
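
To see what `batch_size` means in practice, the sketch below counts how many weight updates happen in one epoch. The helper name is hypothetical, and the rounding is an assumption that matches the numbers used here; Keras's exact handling may differ slightly when the sizes do not divide evenly.

```python
import math


def updates_per_epoch(n_samples, batch_size, validation_split=0.0):
    """Estimate the number of weight updates (batches) in one epoch."""
    n_val = int(n_samples * validation_split)  # samples held out for validation
    n_train = n_samples - n_val                # samples actually trained on
    return math.ceil(n_train / batch_size)     # the last batch may be smaller


# The call above: 1000 samples, 20% held out, batches of 16.
steps = updates_per_epoch(1000, 16, validation_split=0.2)
print(steps)
```

With 800 training samples and a batch size of 16, each epoch performs 50 updates, so 500 epochs perform 25,000 updates in total.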

The `fit()` method returns a `History` object containing the training parameters
(`history.params`), the list of epochs it went through (`history.epoch`), and most
importantly a dictionary (`history.history`) containing the loss and extra metrics it
measured at the end of each epoch on the training set and on the validation set (if
any). Let us graph the history.

In [None]:
import matplotlib.pyplot as plt
loss = history_1.history['loss']
val_loss = history_1.history['val_loss']
plt.plot(loss, label='train loss')
plt.plot(val_loss, label='val loss')
plt.legend()
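
Besides plotting, the per-epoch lists in `history.history` can be inspected directly, for example to find the epoch with the lowest validation loss. The sketch below uses a made-up dictionary with the same structure as `history.history`; the numbers are invented for illustration and are not results from this model.

```python
def best_epoch(val_losses):
    """Return (1-based epoch number, value) of the lowest validation loss."""
    best_index = min(range(len(val_losses)), key=lambda i: val_losses[i])
    return best_index + 1, val_losses[best_index]


# Made-up numbers, for illustration only (same layout as history.history).
example_history = {
    "loss": [0.90, 0.50, 0.30, 0.20, 0.15],
    "val_loss": [0.95, 0.60, 0.45, 0.50, 0.58],
}

epoch, value = best_epoch(example_history["val_loss"])
print(epoch, value)
```

In this invented example the training loss keeps falling after epoch 3 while the validation loss rises, which is the typical signature of overfitting.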

In [None]:
# Make predictions based on our test dataset
predictions = model_1.predict(x_test)

# Graph the predictions against the actual values
plt.plot(x_test, predictions, 'r.', label='predictions')
plt.plot(x_test, 0.5*x_test**2+x_test+2, 'b.', label='actual')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()

### Image classification using a fully connected network (FCN)
We will use the Fashion MNIST dataset, which is a drop-in replacement for MNIST. It has exactly the same
format as MNIST (70,000 grayscale images of 28 × 28 pixels each, with 10 classes),
but the images represent fashion items rather than handwritten digits.

In [None]:
fashion_mnist = tf.keras.datasets.fashion_mnist
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()

In [None]:
x_train.shape, x_train.dtype 

In [None]:
y_train[:10]

With MNIST, when the label is equal to 5, it means that the image represents the
handwritten digit 5. Easy. For Fashion MNIST, however, we need the list of class
names to know what we are dealing with:

In [None]:
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
"Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

Let us look at the first image in the dataset and its label

In [None]:
import matplotlib.pyplot as plt
plt.imshow(x_train[0], cmap='gray') 
plt.axis("off")
plt.title(class_names[y_train[0]]);

Since we are going to train the
neural network using gradient descent, we must scale the input features. Dividing by 255.0 maps the pixel values from the 0-255 range to the 0-1 range (and also converts them to floats):

In [None]:
x_train = x_train / 255.0
x_test = x_test/ 255.0

Next we define the neural network that makes up our model. 
Our network will use 100 neurons in the hidden layer:

![A graph showing the network](./im/dense-multilayer-network-fashionmnist-small.png)

In [None]:
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),    # 28 x 28 image -> 784 values
    tf.keras.layers.Dense(100, activation='relu'),    # hidden layer
    tf.keras.layers.Dense(10, activation='softmax')   # one output per class
])

The activation on the output layer is Softmax.
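
Softmax turns the 10 raw output values (often called logits) into probabilities that are positive and sum to 1. As a minimal plain-Python sketch of the formula (using 3 arbitrary example values instead of 10, and not Keras's own implementation):

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])  # arbitrary example scores
print([round(p, 3) for p in probs])
```

The largest score receives the largest probability (about 0.66 here), and the class with the highest probability is taken as the prediction.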

Our network layer structure will look like this:

![An image showing the network layer structure as it's broken down into layers.](./im/multilayer-network-layers-fashionmnist-small.png)

Call the model's `compile()` method to specify the loss function
and the optimizer to use. Optionally, you can specify a list of extra metrics to
compute during training and evaluation:

In [None]:
model.compile(loss="sparse_categorical_crossentropy",
              optimizer="adam",
              metrics=["accuracy"])

The loss function in this case is sparse categorical [cross-entropy](https://en.wikipedia.org/wiki/Cross_entropy). The "sparse" variant is used because the labels are integer class indices (0 to 9) rather than one-hot vectors.
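
For a single sample, sparse categorical cross-entropy reduces to the negative logarithm of the probability that the model assigned to the true class. The sketch below illustrates this in plain Python; the probability values are invented, and details of the Keras implementation (such as clipping probabilities away from 0) are ignored.

```python
import math


def sparse_categorical_crossentropy(label, probs):
    """Loss for one sample: -log of the probability given to the true class.

    label: integer class index; probs: predicted probabilities for all classes.
    """
    return -math.log(probs[label])


# Invented prediction over 10 classes; most of the mass is on class 2.
probs = [0.05, 0.05, 0.70, 0.05, 0.05, 0.02, 0.02, 0.02, 0.02, 0.02]

loss_if_correct = sparse_categorical_crossentropy(2, probs)  # true class is 2
loss_if_wrong = sparse_categorical_crossentropy(0, probs)    # true class is 0
print(round(loss_if_correct, 3), round(loss_if_wrong, 3))
```

A confident correct prediction gives a small loss, while assigning low probability to the true class gives a large one.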

The optimizer is `adam`, an adaptive-learning-rate variant of the
stochastic gradient descent (SGD) optimizer that often converges faster than plain SGD.

To train the network, we call its fit() method over 10 epochs:


In [None]:
history = model.fit(x_train, y_train, epochs=10, batch_size=16)

#### Exploring the Model Output

In [None]:
y_prob=model.predict(x_test[:3])
y_prob.round(2)
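
Each row of `y_prob` holds 10 probabilities, one per class. To turn a row into a readable prediction, take the index of the largest probability and look it up in `class_names`. The sketch below does this in plain Python for one invented row; the helper name `predicted_class` is hypothetical.

```python
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]


def predicted_class(prob_row):
    """Return (class index, class name) of the most probable class."""
    index = max(range(len(prob_row)), key=lambda i: prob_row[i])
    return index, class_names[index]


# Invented probabilities for one image, for illustration only.
example_row = [0.0, 0.0, 0.0, 0.0, 0.0, 0.02, 0.0, 0.03, 0.0, 0.95]
index, name = predicted_class(example_row)
print(index, name)
```

With the NumPy array returned by `predict()`, `y_prob.argmax(axis=-1)` computes the same index for every row at once.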

Finally, we can evaluate the model with a single line of code.
We have a set of 10,000 images and labels for testing, and we can pass them to the
trained model to have it predict what it thinks each image is, compare that to its
actual label, and summarize the results as the loss and accuracy on the test set:



In [None]:
model.evaluate(x_test, y_test)

#### Saving and Restoring a Model

In [None]:
model.save("my_keras_model.h5")

Keras will use the HDF5 format to save both the model’s architecture (including every
layer’s hyperparameters) and the values of all the model parameters for every layer
(e.g., connection weights and biases). It also saves the optimizer (including its hyperparameters
and any state it may have).

In [None]:
model = tf.keras.models.load_model("my_keras_model.h5")

#### Takeaway

Adding more layers to our network may improve accuracy, but fully connected networks are not ideal for computer vision tasks. Images contain structural patterns that can help us classify an object regardless of its position in the image, but fully connected networks are not *translation invariant*.

Convolutional Neural Networks (CNNs) are more effective for computer vision tasks.

### Implementing a Convolutional Neural Network to recognize fashion images


To implement a convolutional layer, we’ll use the `tf.keras.layers.Conv2D` class.
It accepts as parameters the number of filters (convolution kernels) to learn in the layer, the size of
each kernel, the stride, the padding, the activation function, etc.
For example, here’s a convolutional layer that could be used as the input layer of a neural network:


In [None]:
# No trailing comma after the closing parenthesis: it would turn `conv` into a tuple.
conv = tf.keras.layers.Conv2D(filters=32, kernel_size=3, strides=1,
                              padding='same', activation='relu', input_shape=(28, 28, 1))

This code creates a `Conv2D` layer with 32 filters, each 3 × 3, using a stride of 1 (both
horizontally and vertically).

`padding` can be `same` or `valid`. "valid" means no padding. "same" results in padding with zeros evenly to the left/right or up/down of the input. When padding="same" and strides=1, the output has the same size as the input.
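
The effect of `padding` and `strides` on the output size can be worked out with a short formula. The helper below is a plain-Python sketch for one spatial dimension (the function name is made up for this example): with "valid" padding the kernel must fit entirely inside the input, while with "same" padding the output size depends only on the input size and the stride.

```python
import math


def conv_output_size(input_size, kernel_size, stride=1, padding="valid"):
    """Output length along one spatial dimension of a convolution."""
    if padding == "same":
        return math.ceil(input_size / stride)
    if padding == "valid":
        return (input_size - kernel_size) // stride + 1
    raise ValueError("padding must be 'same' or 'valid'")


same_size = conv_output_size(28, 3, stride=1, padding="same")    # unchanged
valid_size = conv_output_size(28, 3, stride=1, padding="valid")  # shrinks by 2
print(same_size, valid_size)
```

For a 28 × 28 image and a 3 × 3 kernel with stride 1, "same" padding keeps the output at 28 × 28, while "valid" padding gives 26 × 26.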

![](./im/conv.gif)

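`Conv2D` layers expect each input image to have a channel dimension (height, width, channels). Fashion MNIST images are grayscale, so we specify the third
dimension of `input_shape` as 1; a color (RGB) image would use 3.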

#### Pooling Layer

The second common building block of CNNs is the pooling layer. Its goal is to subsample (i.e., shrink) the input image in order to reduce the computational load, the memory usage, and the number of parameters
(thereby limiting the risk of overfitting).

A pooling layer has no weights; all it does is aggregate the
inputs using an aggregation function such as the max or mean.
The figure below shows a
max pooling layer, which is the most common type of pooling layer.


<center><img src="./im/max_pooling.png"/></center>


In this example,
we use a 2 × 2 pooling kernel, with a stride of 2 and no padding. Only the max input
value in each receptive field makes it to the next layer, while the other inputs are
dropped.

The following code
creates a max pooling layer using a 2 × 2 kernel. The strides default to the kernel size,
so this layer will use a stride of 2 (both horizontally and vertically). By default, it uses
"valid" padding (i.e., no padding at all):

In [None]:
max_pool = tf.keras.layers.MaxPool2D(pool_size=2)
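
To illustrate what such a layer computes, here is a plain-Python sketch of 2 × 2 max pooling with a stride of 2 and no padding on a single-channel image. The function name and the 4 × 4 input values are made up for this example; the Keras layer does the same thing per channel on whole batches.

```python
def max_pool_2x2(image):
    """2x2 max pooling with stride 2 and no padding on a 2D list."""
    rows, cols = len(image), len(image[0])
    pooled = []
    # Step through the image two rows and two columns at a time;
    # an incomplete window at the edge is dropped ("valid" padding).
    for r in range(0, rows - 1, 2):
        pooled_row = []
        for c in range(0, cols - 1, 2):
            window = [image[r][c], image[r][c + 1],
                      image[r + 1][c], image[r + 1][c + 1]]
            pooled_row.append(max(window))  # keep only the largest value
        pooled.append(pooled_row)
    return pooled


image = [[1, 3, 2, 0],
         [4, 2, 1, 5],
         [7, 0, 3, 2],
         [1, 6, 4, 8]]

pooled = max_pool_2x2(image)
print(pooled)  # [[4, 5], [7, 8]]
```

The 4 × 4 input shrinks to 2 × 2, so each pooling layer of this kind halves the height and width.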

A typical CNN architecture

<center><img src="./im/typical_cnn.png"/></center>

A simple CNN to tackle the Fashion MNIST dataset:


In [None]:
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(64, 3, strides=1, padding='same', activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(128, 3, strides=1, padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

In [None]:
model.summary()
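
The numbers in this summary can be reproduced by hand. A convolutional layer has one kernel of size 3 × 3 × (input channels) per filter plus one bias per filter, and each 2 × 2 pooling layer halves the height and width. The plain-Python sketch below (with made-up helper names) walks through the layers of the CNN defined above.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights of all kernels plus one bias per filter."""
    return kernel_size * kernel_size * in_channels * filters + filters


def dense_params(n_inputs, units):
    """One weight per input per neuron plus one bias per neuron."""
    return n_inputs * units + units


conv1 = conv2d_params(3, 1, 64)     # input 28x28x1 -> output 28x28x64 ('same' padding)
# MaxPooling2D(2, 2): 28x28x64 -> 14x14x64, no parameters
conv2 = conv2d_params(3, 64, 128)   # 14x14x64 -> 14x14x128
# MaxPooling2D(2, 2): 14x14x128 -> 7x7x128, no parameters
flat_size = 7 * 7 * 128             # Flatten: 6272 values
dense1 = dense_params(flat_size, 64)
dense2 = dense_params(64, 10)

total = conv1 + conv2 + dense1 + dense2
print(conv1, conv2, dense1, dense2, total)
```

Most of the parameters sit in the first fully connected layer after `Flatten`, not in the convolutional layers.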

In [None]:
model.compile(optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])

In [None]:
history = model.fit(x_train, y_train, epochs=5, batch_size=16)

In [None]:
model.evaluate(x_test, y_test)

#### Further Learning

1. Books

![](./im/books.png)

2. Tutorials and guides on the TensorFlow website
https://www.tensorflow.org/overview