# IMDB sentiment classification with tf.keras, tf.data and TensorFlow Hub

In [None]:
import sys
import os
import tempfile
import shutil
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_datasets as tfds

import numpy as np
%matplotlib inline
import random
import matplotlib.pyplot as plt

In [None]:
# Python version 3.5 or 3.6
assert sys.version_info >= (3, 5)
assert sys.version_info < (3, 7)
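# TensorFlow 2.0 or later (compare numeric components: plain string comparison would misorder "10.0" and "2.0")
assert tuple(int(part) for part in tf.__version__.split(".")[:2]) >= (2, 0)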

The problem we are trying to solve here is to classify movie reviews as positive or negative based on the text of the review (binary sentiment classification). The dataset we will use is the IMDB reviews dataset, which contains 25,000 labeled reviews for training and 25,000 for testing.

# Input Data Management

## Download the dataset

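The IMDB reviews dataset is loaded here through TensorFlow Datasets (`tfds`). With `batch_size=-1`, `tfds.load` returns each split as a single batch, which `tfds.as_numpy` then converts to NumPy arrays of review texts and 0/1 labels.

Documentation : https://www.tensorflow.org/datasets/catalog/imdb_reviews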

In [None]:
train_data, test_data = tfds.load(name="imdb_reviews", split=["train", "test"], 
                                  batch_size=-1, as_supervised=True)

train_examples, train_labels = tfds.as_numpy(train_data)
test_examples, test_labels = tfds.as_numpy(test_data)

In [None]:
x_val = train_examples[:10000]
partial_x_train = train_examples[10000:]

y_val = train_labels[:10000]
partial_y_train = train_labels[10000:]

## Visualize the data

In [None]:
print("Training entries: {}, test entries: {}".format(len(train_examples), len(test_examples)))

In [None]:
train_examples[:2]

In [None]:
train_labels[:10]

## Create a tf.data Dataset

In [None]:
NUM_EPOCHS = 10
BATCH_SIZE = 100

In [None]:
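ds_train = tf.data.Dataset.from_tensor_slices((partial_x_train, partial_y_train))
ds_train = ds_train.shuffle(buffer_size=10000)  # shuffle using a 10,000-example buffer
ds_train = ds_train.batch(BATCH_SIZE)           # group examples into batches of BATCH_SIZE
ds_train = ds_train.repeat(NUM_EPOCHS)          # replay the training data NUM_EPOCHS times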

In [None]:
ds_test = tf.data.Dataset.from_tensor_slices((x_val, y_val))
ds_test = ds_test.batch(BATCH_SIZE)

# Model Management

## Build the model

In [None]:
module_url = "https://tfhub.dev/google/tf2-preview/gnews-swivel-20dim/1"

Our neural network will be composed of the following layers:
- `KerasLayer`: pass the `module_url`, `output_shape=[OUTPUT_SHAPE]`, `input_shape=[]`, `dtype=tf.string`, and set `trainable` to `False`
- `Dense` Layer : 16 neurons, relu activation
- `Dense` Layer : 1 neuron, sigmoid activation

> <div class="mark">Create the model</div><i class="fa fa-lightbulb-o "></i>

Documentation : 
- https://www.tensorflow.org/hub/api_docs/python/hub/KerasLayer?hl=en
- https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense
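
To make the layer list above concrete, here is a minimal sketch of the two `Dense` layers written in plain Python, without TensorFlow. The weights and the 20-dimensional input vector are random stand-ins chosen for illustration, not the values the hub embedding or a trained model would produce, and the helper names `dense`, `relu`, and `sigmoid` are hypothetical.

```python
import math
import random

EMBEDDING_DIM = 20  # size of the vector produced by the embedding layer (OUTPUT_SHAPE)
HIDDEN_UNITS = 16   # neurons in the first Dense layer


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(inputs . weights + biases)."""
    outputs = []
    for j in range(len(biases)):
        z = sum(x * weights[i][j] for i, x in enumerate(inputs)) + biases[j]
        outputs.append(activation(z))
    return outputs


rng = random.Random(0)  # fixed seed; these weights are random stand-ins, not trained values
w1 = [[rng.uniform(-0.5, 0.5) for _ in range(HIDDEN_UNITS)] for _ in range(EMBEDDING_DIM)]
b1 = [0.0] * HIDDEN_UNITS
w2 = [[rng.uniform(-0.5, 0.5)] for _ in range(HIDDEN_UNITS)]
b2 = [0.0]

# Stand-in for the embedding of a single review (assumption: any 20 floats will do here).
embedding = [rng.uniform(-1.0, 1.0) for _ in range(EMBEDDING_DIM)]

hidden = dense(embedding, w1, b1, relu)          # Dense(16, activation='relu')
probability = dense(hidden, w2, b2, sigmoid)[0]  # Dense(1, activation='sigmoid')

# Weights plus biases of both Dense layers.
dense_params = (EMBEDDING_DIM * HIDDEN_UNITS + HIDDEN_UNITS) + (HIDDEN_UNITS * 1 + 1)

print(f"hidden units: {len(hidden)}, output probability: {probability:.3f}")
print(f"parameters in the two Dense layers: {dense_params}")
```

The sigmoid output is a single number between 0 and 1, which is why one output neuron is enough for a two-class problem. The count of 353 should match the trainable parameters Keras reports for this model, since the embedding layer is frozen with `trainable=False`.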

In [None]:
OUTPUT_SHAPE = 20

model = tf.keras.Sequential()
model.add(hub.KerasLayer(module_url, output_shape=[OUTPUT_SHAPE], input_shape=[], dtype=tf.string, trainable=False))
model.add(tf.keras.layers.Dense(16, activation='relu'))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))

In [None]:
OUTPUT_SHAPE = 20

model = tf.keras.models.Sequential()
# TODO


To make our network ready for training, we need to pick three more things, as part of the "compilation" step:

* A loss function: this is how the network will be able to measure how good a job it is doing on its training data, and thus how it will be able to steer itself in the right direction.
* An optimizer: this is the mechanism through which the network will update itself based on the data it sees and its loss function.
* Metrics to monitor during training and testing. Here we will only care about accuracy (the fraction of the reviews that were correctly classified).

You will implement the following compilation step for your Neural Network : 
- "adam" optimizer
- "binary_crossentropy" loss
- metric : "accuracy"

Documentation : https://www.tensorflow.org/api_docs/python/tf/keras/models/Sequential#compile
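
As a rough illustration of what the loss and the metric measure, the sketch below computes binary cross-entropy and accuracy by hand for four made-up predictions. The labels, the probabilities, the clipping constant `eps`, and the 0.5 decision threshold are illustrative assumptions; this is not a copy of the Keras implementation.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip so that log(0) never occurs
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of examples whose thresholded prediction equals the label."""
    correct = sum(int((p > threshold) == bool(y)) for y, p in zip(y_true, y_pred))
    return correct / len(y_true)


labels = [1, 0, 1, 0]          # made-up ground truth (1 = positive review)
probs = [0.9, 0.2, 0.4, 0.6]   # made-up sigmoid outputs

loss = binary_crossentropy(labels, probs)
acc = accuracy(labels, probs)
print(f"loss={loss:.4f}, accuracy={acc:.2f}")
```

The first two predictions are confident and correct, so they add little to the loss; the last two fall on the wrong side of 0.5, so they dominate the loss and halve the accuracy.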

### Compile the model

In [None]:
optimizer = tf.optimizers.Adam()

model.compile(loss='binary_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])

Summarize the model

In [None]:
model.summary()

## Train the model

We are now ready to train our network, which in Keras is done via a call to the `fit` method of the network: 
we "fit" the model to its training data.

You will fit the network with the following configurations :
- `x`: ds_train
- `epochs`: 10 (passes over the whole dataset)
- `steps_per_epoch`: 150 steps
- `validation_data`: ds_test
- `validation_steps`: 10
- `callbacks`: tensorboard

You will also add a TensorBoard callback, which writes training logs that TensorBoard can display so you can observe how the training is progressing.
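
The step counts in this configuration follow from the dataset sizes, and the short calculation below shows how they fit together. It assumes the values used earlier in this notebook (`BATCH_SIZE = 100`, `NUM_EPOCHS = 10`, and 10,000 of the 25,000 IMDB training reviews held out for validation); the variable names are illustrative only.

```python
import math

num_train_examples = 25000 - 10000  # IMDB training reviews minus the held-out validation slice
batch_size = 100                    # BATCH_SIZE
num_repeats = 10                    # NUM_EPOCHS passed to ds_train.repeat(...)

# Batches produced by one pass over the training data.
batches_per_pass = math.ceil(num_train_examples / batch_size)

# Batches available in total once the dataset has been repeated.
total_batches = batches_per_pass * num_repeats

steps_per_epoch = 150
# Number of epochs fit() can run before the repeated dataset is exhausted.
epochs_supported = total_batches // steps_per_epoch

validation_steps = 10
# Validation examples actually scored at the end of each epoch.
validation_examples_seen = validation_steps * batch_size

print(f"batches per pass: {batches_per_pass}")
print(f"total batches available: {total_batches}")
print(f"epochs supported at {steps_per_epoch} steps each: {epochs_supported}")
print(f"validation examples per epoch: {validation_examples_seen}")
```

With these numbers, 150 steps is exactly one pass over the training data, the repeated dataset supplies enough batches for 10 epochs, and each validation run covers 1,000 of the 10,000 held-out reviews.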

In [None]:
LOG_DIR = './tensorboard/tf_keras_data_transfer'

tensorboard = tf.keras.callbacks.TensorBoard(log_dir=LOG_DIR, histogram_freq=1, update_freq="batch")

> <div class="mark">Fit the model with the above information.</div><i class="fa fa-lightbulb-o "></i>

In [None]:
shutil.rmtree(LOG_DIR, ignore_errors=True)

model.fit(x=ds_train,
          epochs=NUM_EPOCHS,
          steps_per_epoch=150,
          validation_data=ds_test,
          validation_steps=10,
          callbacks=[tensorboard])

In [None]:
shutil.rmtree(LOG_DIR, ignore_errors=True)

model. # TODO

Keras displays the loss and the accuracy of the network over the training data during training and, because `validation_data` is provided, also reports `val_loss` and `val_accuracy` at the end of each epoch.

# Model Performance Evaluation

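Now let's check that our model also performs well on data it was not trained on.

You can do this by calling the `evaluate` method of your network on `ds_test`. Note that `ds_test` wraps the 10,000 held-out validation reviews (`x_val`, `y_val`) in 100 batches, so it is finite and `evaluate` can run without a `steps` argument; asking for 300 steps would exhaust the data after 100 batches.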

Documentation : https://www.tensorflow.org/api_docs/python/tf/keras/models/Sequential#evaluate

> <div class="mark">Evaluate the model performance on test set</div><i class="fa fa-lightbulb-o "></i>

In [None]:
model.evaluate(ds_test)

In [None]:
model. # TODO
