<a href="https://colab.research.google.com/github/RohanBh/machine-learning-algorithms/blob/master/explore_tf_2_0.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
!pip install -q tensorflow-gpu==2.0.0-alpha0

In [0]:
import tensorflow as tf

In [3]:
print(tf.__version__)
print(tf.test.is_gpu_available())

2.0.0-alpha0
True


## TensorFlow hello world

[`keras`][2] is a high-level deep learning API that can run on different backends, such as **TensorFlow** or **Theano**. In TensorFlow, it is accessible through the [`tf.keras`][1] module.

Let's take a look at the [`tf.keras.datasets`][3] module, which contains popular datasets such as *mnist*, *fashion_mnist* and *cifar10*. Let's load the MNIST dataset.

From the docs, [`mnist.load_data(path)`][4] returns a tuple of numpy arrays: `(x_train, y_train), (x_test, y_test)`.

[1]: https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras
[2]: https://keras.io/
[3]: https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/datasets
[4]: https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/datasets/mnist/load_data

In [0]:
mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

In [5]:
x_train.shape, y_train.shape, x_test.shape, y_test.shape

((60000, 28, 28), (60000,), (10000, 28, 28), (10000,))

In [0]:
# Normalize the data: scale pixel values from the integer range [0, 255] to floats in [0, 1]
x_train, x_test = x_train / 255.0, x_test / 255.0
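
To see what this scaling does, here is a minimal sketch on a tiny made-up array standing in for an image. The values are hypothetical; the only thing carried over from the real data is the `uint8` dtype of the raw MNIST pixels, which is why dividing by `255.0` yields floats.

```python
import numpy as np

# Hypothetical 2x2 "image" using the same dtype as the raw MNIST pixels.
pixels = np.array([[0, 51], [102, 255]], dtype=np.uint8)

# Dividing by a Python float promotes the array to float64
# and maps the range [0, 255] onto [0, 1].
scaled = pixels / 255.0

print(scaled.dtype)
print(scaled.min(), scaled.max())
```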

There are two ways to build a model in Keras. The first is the **Sequential model API**, in which we stack layers one after another; the optimizer and loss function are provided later, when compiling the model. The second is the Keras **Functional API**, which is used to build more complex models, such as models with multiple inputs or outputs or with shared layers. In the Functional API, each layer is called like a function. We define the input as a [tensor][1]; the first hidden layer takes that tensor as input, and its output can be fed into another layer, and so on.

Let's build a simple MNIST classifier using the **Sequential model** API. The [Sequential model class][2] is available under the alias `tf.keras.Sequential`. It is actually part of the `models` module, so `tf.keras.models.Sequential` (the path often seen in TF 1.x code) refers to the same class.
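
For comparison, here is a rough sketch of what a small classifier could look like with the Functional API. The layer sizes are chosen for illustration (a 28×28 input, one hidden layer of 128 units, 10 output classes), and the names `inputs`, `outputs` and `functional_model` are placeholders introduced here.

```python
import tensorflow as tf

# The input is declared as a symbolic tensor; the batch dimension is implicit.
inputs = tf.keras.Input(shape=(28, 28))

# Each layer is called like a function on the output of the previous one.
x = tf.keras.layers.Flatten()(inputs)
x = tf.keras.layers.Dense(128, activation="relu")(x)
x = tf.keras.layers.Dropout(0.2)(x)
outputs = tf.keras.layers.Dense(10, activation="softmax")(x)

# The model is defined by its input and output tensors.
functional_model = tf.keras.Model(inputs=inputs, outputs=outputs)
print(functional_model.output_shape)
```

Because every intermediate tensor is an ordinary Python variable, it can be reused or branched, which is what makes non-linear topologies possible.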

[1]: https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/Tensor
[2]: https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/Sequential

In [7]:
tf.keras.models.Sequential == tf.keras.Sequential

True

In [0]:
# Let's create our neural network
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation="softmax")
])

The `Sequential` class accepts a list of layers. We can also create an empty model first and add layers to it later using the [`Sequential.add(layer)`][1] method.

[1]: https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/Sequential#add
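
As a sketch of the incremental style, the same four layers can be added one at a time. The variable name `model_incremental` is introduced here only to avoid clashing with `model`.

```python
import tensorflow as tf

# Start from an empty model and append layers in order.
model_incremental = tf.keras.Sequential()
model_incremental.add(tf.keras.layers.Flatten(input_shape=(28, 28)))
model_incremental.add(tf.keras.layers.Dense(128, activation="relu"))
model_incremental.add(tf.keras.layers.Dropout(0.2))
model_incremental.add(tf.keras.layers.Dense(10, activation="softmax"))

print(len(model_incremental.layers))
```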

[Keras layers][1] are the building blocks of a neural network; each layer consists of one or more neurons. In TensorFlow, they can be accessed through the [`tf.keras.layers`][2] module. This module offers different kinds of layers (Dense, max pooling, etc.) and activations (ReLU, sigmoid, etc.).

[Flatten][3] takes an input tensor and flattens it into an output tensor while keeping the batch dimension. Here, each 28×28 image becomes a vector of 784 values.

[1]: https://keras.io/layers/about-keras-layers/
[2]: https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/layers
[3]: https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/layers/Flatten
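
The effect of flattening can be imitated with a plain NumPy reshape. This is only an illustration of the shape change on a hypothetical batch of three blank images, not how the layer is implemented.

```python
import numpy as np

# Hypothetical batch of 3 blank images, each 28x28.
batch = np.zeros((3, 28, 28))

# Keep the batch dimension and collapse everything else into one axis.
flat = batch.reshape(batch.shape[0], -1)

print(flat.shape)  # (3, 784)
```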

In [9]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
flatten (Flatten)            (None, 784)               0         
_________________________________________________________________
dense (Dense)                (None, 128)               100480    
_________________________________________________________________
dropout (Dropout)            (None, 128)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 10)                1290      
Total params: 101,770
Trainable params: 101,770
Non-trainable params: 0
_________________________________________________________________
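
The parameter counts in the summary can be checked by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The helper `dense_params` below is a hypothetical name used only for this check.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden = dense_params(28 * 28, 128)  # flattened image -> 128 units
output = dense_params(128, 10)       # 128 units -> 10 classes
total = hidden + output

print(hidden, output, total)  # 100480 1290 101770
```

Flatten and Dropout contribute no parameters, which matches the zeros in the summary.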


Compiling the model means configuring it for training. We provide the `optimizer`, the `loss` and the `metrics` (e.g. `mse`, `accuracy`). `metrics` is the list of metrics to be evaluated during training and testing.

We use the [`Categorical cross entropy`][1] loss when there are two or more label classes and the labels are provided in `one_hot` representation. Since each `y` here is an integer from 0 to 9 rather than a one-hot vector, we use [`Sparse categorical cross entropy`][2] instead.

[1]: https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/losses/CategoricalCrossentropy
[2]: https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/losses/SparseCategoricalCrossentropy
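
The two losses compute the same quantity and differ only in how the labels are encoded. A small NumPy sketch with made-up probabilities for a three-class problem (not the Keras implementation) shows this:

```python
import numpy as np

# Hypothetical predicted probabilities for 2 samples and 3 classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
labels = np.array([0, 1])          # integer ("sparse") labels
one_hot = np.eye(3)[labels]        # the same labels, one-hot encoded

# Categorical cross entropy: sum over classes of -y * log(p).
cce = -np.sum(one_hot * np.log(probs), axis=1)

# Sparse variant: pick the log-probability of the true class directly.
scce = -np.log(probs[np.arange(len(labels)), labels])

print(np.allclose(cce, scce))  # True
```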

In [0]:
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

Let the training begin! [`Model.fit()`][1] trains the model for a fixed number of epochs on the training data. Training proceeds in mini-batches: because we didn't specify `batch_size` in the `model.fit` call, it defaults to 32.

`validation_data` in `model.fit` is used to evaluate the loss and any other model metric at the end of each epoch. The model is not trained on it.

`validation_split` is the fraction of `x` and `y` that is held out and used as validation data.

`steps`, an argument of `Model.evaluate`, is the number of batches to process before evaluation is considered finished. Since `steps` defaults to `None`, evaluation runs until the dataset is exhausted.

The following pseudocode illustrates the different roles of training, validation and test data.
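
With the default batch size, the number of batches per pass follows from the dataset sizes printed earlier (60000 training and 10000 test images). A quick check:

```python
import math

batch_size = 32  # the default when batch_size is not given

# The last batch may be smaller, hence the ceiling.
train_batches = math.ceil(60000 / batch_size)
test_batches = math.ceil(10000 / batch_size)

print(train_batches, test_batches)  # 1875 313
```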
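
According to the Keras documentation, the held-out samples are taken from the end of `x` and `y`, before any shuffling. The sketch below mimics that behaviour on toy arrays; `split_like_validation_split` is a hypothetical helper written for this illustration, not a Keras function.

```python
import numpy as np

def split_like_validation_split(x, y, validation_split):
    """Hold out the last `validation_split` fraction of the samples."""
    split_at = int(len(x) * (1.0 - validation_split))
    return (x[:split_at], y[:split_at]), (x[split_at:], y[split_at:])

# Toy data: 8 samples with one feature each.
x = np.arange(8).reshape(8, 1)
y = np.arange(8)

(train_x, train_y), (val_x, val_y) = split_like_validation_split(x, y, 0.25)
print(len(train_x), val_y.tolist())  # 6 [6, 7]
```

If the data is ordered (for example, sorted by class), it is therefore worth shuffling it before relying on `validation_split`.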
```
for each epoch
    for each training data instance
        propagate error through the network
        adjust the weights
        calculate the accuracy over training data
    for each validation data instance
        calculate the accuracy over the validation data
    if the threshold validation accuracy is met
        exit training
    else
        continue training

```
Basically, validation data is used to detect and limit overfitting. A sign that the model is overfitting is that the training accuracy keeps improving while the validation accuracy stalls or starts to drop.
After training is finished, we run the model against the test dataset to verify that the accuracy is sufficient.

In our case, however, we simply use the number of epochs as the stopping criterion.
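
A common variation on the threshold rule in the pseudocode is to stop once the validation accuracy has not improved for a few epochs (often called "patience"). The sketch below simulates that rule in plain Python on a made-up list of validation accuracies; the function name and the numbers are hypothetical. Keras provides this kind of rule through the `tf.keras.callbacks.EarlyStopping` callback.

```python
def epochs_until_stop(val_accuracies, patience=2):
    """Return the 1-based epoch at which training would stop."""
    best = float("-inf")
    epochs_without_improvement = 0
    for epoch, acc in enumerate(val_accuracies, start=1):
        if acc > best:
            # New best validation accuracy: reset the counter.
            best = acc
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    # Never triggered: training runs for all epochs.
    return len(val_accuracies)

# Made-up validation accuracies, one per epoch.
history = [0.95, 0.97, 0.975, 0.974, 0.973, 0.976]
print(epochs_until_stop(history))  # 5
```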


[1]: https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/Model#fit

In [11]:
model.fit(x_train, y_train, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7f79797c4240>

Now that training is complete, we evaluate our model on the test dataset. [`Model.evaluate()`][1] returns the loss and metric values for the model. The computation is again done in batches, using the default `batch_size` of 32.

[1]: https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/Model#evaluate

In [12]:
model.evaluate(x_test, y_test)



[0.07318823329345323, 0.9772]
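
The returned list holds the loss first, followed by the metrics in the order given to `compile`. As a small sketch, the values printed above can be unpacked and turned into a count of misclassified test images (the variable names are introduced here for illustration):

```python
# Values copied from the output above: [loss, accuracy].
test_loss, test_accuracy = [0.07318823329345323, 0.9772]

n_test = 10000  # number of test images, from x_test.shape
n_wrong = round((1 - test_accuracy) * n_test)

print(n_wrong)  # 228
```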