> ### RIIAA 2.0 – Workshop 
> **Deep Learning as a Service** <br>
> **Instructor:** [Rodolfo Ferro](https://rodolfoferro.xyz) <br>
> **Email:** <ferro@cimat.mx> <br>
> **Twitter:** <https://twitter.com/FerroRodolfo/> <br>
> **GitHub:** <https://github.com/RodolfoFerro/> <br>

# Iris Classification Problem

Throughout this notebook we'll explain how to use the power of cloud computing with Google Colab on a classic example, *The Iris Classification Problem*, using the popular [Iris flower dataset](https://en.wikipedia.org/wiki/Iris_flower_data_set).

For this classification problem we will build a simple feed-forward, fully connected artificial neural network.

The Python framework that we will be using is [TensorFlow 2.0](https://www.tensorflow.org) with its [Keras](https://keras.io/) module.


### Problem statement

Before we tackle the problem with an ANN (artificial neural network), let's understand what we'll be doing:

* If we feed our neural network the measurements of an Iris flower, the model should be able to determine which species it belongs to.

> #### What do we need to do?
> Train a _Deep Learning_ model (in this case) using a known dataset: [Iris flower dataset](https://en.wikipedia.org/wiki/Iris_flower_data_set).
>
> Specifically, we are going to do the following:
> - Load the dataset
> - Preprocess the data
> - Build the model
> - Set hyperparameters 
> - Train the model
> - Save and download the trained model
> - Predict data

## Installing dependencies

For our training we will be using TensorFlow 2.0, so we first install that specific version and then check that it is the one in use:

In [None]:
# Let's install TensorFlow 2.0 (pinned to the 2.0.0-rc0 release candidate):
!pip install -q tensorflow==2.0.0-rc0

# And verify which version is now installed:
import tensorflow as tf
print(tf.__version__)

## The Iris dataset

In [None]:
from IPython.display import HTML
url = 'https://en.wikipedia.org/wiki/Iris_flower_data_set'
iframe = f'<iframe src="{url}" width="100%" height="400"></iframe>'
HTML(iframe)

## Importing the dataset

In [None]:
# Importing dataset from scikit-learn and other useful packages:
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
import matplotlib.pyplot as plt
import numpy as np

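# We will fix a random seed for reproducibility, both for NumPy and for
# TensorFlow (which draws the initial weights of the model):
import tensorflow as tf

seed = 11
np.random.seed(seed)
tf.random.set_seed(seed)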

In [None]:
# We now import the Iris dataset:
iris = load_iris()

# And set the features and labels vectors from it:
x = iris.data                        # feature matrix, shape (150, 4)
y = iris.target                      # integer class labels (0, 1, 2)
names = iris.target_names            # species names
feature_names = iris.feature_names   # names of the four measurements

# We can load some elements to verify the contents in the dataset:
elements_to_display = [???]
for element in elements_to_display:
    print(f"Element {element}:")
    print(f"  - Features: {x[element]}")
    print(f"  - Target: {y[element]}")
    # The target is the class index, so it selects the species name directly:
    print(f"  - Species: {names[y[element]]}")
    print()

## Preprocess dataset

Preprocessing is a very important step in many cases. Here we only need a very simple transformation: one-hot encoding the labels.
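
To make the transformation concrete, here is a minimal NumPy sketch of what one-hot encoding does to integer class labels. The helper name `one_hot` and the sample labels are illustrative assumptions; the notebook itself relies on `keras.utils.to_categorical` for this step.

```python
import numpy as np


def one_hot(labels, n_classes):
    """Return a (len(labels), n_classes) matrix with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, n_classes), dtype=np.float32)
    # Row i gets a 1 in the column given by its class index.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


sample_labels = [0, 1, 2, 1]  # hypothetical labels, one integer per flower
encoded = one_hot(sample_labels, n_classes=3)
print(encoded)

# Taking the argmax of each row recovers the original integer labels.
decoded = np.argmax(encoded, axis=1)
print(decoded)
```

Each row contains exactly one `1`, so the original label can always be recovered with `np.argmax`.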

In [None]:
from tensorflow import keras

# One hot encode outputs: 
y = keras.utils.to_categorical(y)

# Set global variables:
n_features = len(feature_names)
n_classes = names.shape[0]

# Let's check the changes:
for element in elements_to_display:
    print(f"Element {element}:")
    print(f"  - Features: {x[element]}")
    print(f"  - Target: {y[element]}")
    # The target is now one-hot encoded, so argmax recovers the class index:
    print(f"  - Species: {names[np.argmax(y[element])]}")
    print()

    
# Split the data set into training and testing sets:
x_train, x_test, y_train, y_test = train_test_split(x, y, 
                                                    test_size=0.3, 
                                                    random_state=seed)

## Let's talk about the model...

We will be using a very simple model, a feed-forward multi-layer perceptron.
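
As a rough illustration of what such a network computes, the sketch below runs a forward pass through a small multi-layer perceptron in plain NumPy. The layer sizes, the random (untrained) weights, and the made-up input rows are all assumptions for demonstration only; this is not the Keras model built in this notebook.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def softmax(z):
    # Subtract the row-wise maximum for numerical stability.
    shifted = z - z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def mlp_forward(inputs, params):
    """Apply ReLU hidden layers followed by a softmax output layer."""
    activation = inputs
    for weights, bias in params[:-1]:
        activation = relu(activation @ weights + bias)
    weights, bias = params[-1]
    return softmax(activation @ weights + bias)


rng = np.random.default_rng(11)
layer_sizes = [4, 4, 8, 12, 3]  # 4 features -> hidden layers -> 3 classes
params = [(rng.normal(scale=0.5, size=(n_in, n_out)), np.zeros(n_out))
          for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

sample_inputs = rng.uniform(0.0, 8.0, size=(5, 4))  # made-up measurements
probabilities = mlp_forward(sample_inputs, params)
print(probabilities.sum(axis=1))  # every row sums to 1 (up to rounding)
```

Training consists of adjusting the weights and biases so that these output probabilities match the one-hot targets.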

### Let's create the model with Keras!

First of all, let's import what we'll use:

In [None]:
# Let's import our Keras stuff:
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

def iris_model(input_dim, output_dim, init_nodes=4, name='model'):
    """FF-MLP model for Iris classification problem."""
    
    # Create model:
    model = Sequential(name=name)
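    # Hidden layers with ReLU activations, widening at each step:
    model.add(Dense(init_nodes, input_dim=input_dim, activation='relu'))
    model.add(Dense(2 * init_nodes, activation='relu'))
    model.add(Dense(3 * init_nodes, activation='relu'))
    # Output layer: one probability per class.
    model.add(Dense(output_dim, activation='softmax'))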
    
    # Compile model:
    model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    
    return model
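
As a sanity check on the architecture, the number of trainable parameters can be worked out by hand. The sketch below assumes the four `Dense` layers described in `iris_model`, the default `init_nodes=4`, 4 input features and 3 output classes, with a bias term on every layer (the Keras default). The helper `dense_param_count` is hypothetical and not part of Keras.

```python
def dense_param_count(layer_sizes):
    """Count weights plus biases for a stack of fully connected layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix + bias vector
    return total


init_nodes = 4
# input -> init_nodes -> 2*init_nodes -> 3*init_nodes -> output
layer_sizes = [4, init_nodes, 2 * init_nodes, 3 * init_nodes, 3]
total_params = dense_param_count(layer_sizes)
print(total_params)  # 20 + 40 + 108 + 39 = 207
```

Under these assumptions, the total reported by `model.summary()` should agree with this count.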

### Useful resources

- Sequential model: <https://keras.io/getting-started/sequential-model-guide/>
- Classifying the Iris Data Set with Keras: <https://janakiev.com/notebooks/keras-iris/>

### Building the model

In [None]:
# Let's build our model:
model = iris_model(n_features, n_classes)
model.summary()

### Training the model

In order to train the model, we first need to set its training hyperparameters.

In [None]:
# Set hyperparameters
epochs = ???
batch = ???

# Fit the model
history = model.fit(x_train, y_train, 
                    validation_data=(x_test, y_test),
                    verbose=True,
                    epochs=epochs, batch_size=batch)
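
To get a feel for how the two hyperparameters interact, the sketch below computes how many weight updates a training run performs. The values of `example_epochs` and `example_batch` are hypothetical and are not prescribed by this notebook; the training-set size follows from the 150 Iris samples and the `test_size=0.3` split used earlier.

```python
import math

n_samples = 150   # size of the Iris dataset
test_size = 0.3   # same fraction used in train_test_split
n_train = n_samples - math.ceil(n_samples * test_size)

example_epochs = 100  # hypothetical value, chosen only for illustration
example_batch = 5     # hypothetical value, chosen only for illustration

# One update per batch; the last batch of an epoch may be smaller.
steps_per_epoch = math.ceil(n_train / example_batch)
total_updates = steps_per_epoch * example_epochs
print(n_train, steps_per_epoch, total_updates)
```

Smaller batches mean more (and noisier) updates per epoch, while more epochs mean more passes over the same training data.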

### Evaluating the results

In [None]:
# Final evaluation of the model:
scores = model.evaluate(x_test, y_test, verbose=1)
print(f'Test accuracy: {scores[1]}')
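
With the loss and metric used when compiling the model, `model.evaluate` returns the loss first and the accuracy second. The following NumPy sketch shows how both quantities can be computed from one-hot targets and predicted probabilities. The arrays are made-up examples, and the function names are illustrative rather than Keras APIs.

```python
import numpy as np


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean cross-entropy between one-hot targets and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(np.sum(y_true * np.log(y_pred), axis=1)))


def accuracy(y_true, y_pred):
    """Fraction of rows whose most probable class matches the target."""
    return float(np.mean(np.argmax(y_true, axis=1) == np.argmax(y_pred, axis=1)))


y_true_demo = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]], dtype=float)
y_pred_demo = np.array([[0.8, 0.1, 0.1],
                        [0.2, 0.7, 0.1],
                        [0.1, 0.3, 0.6],
                        [0.1, 0.3, 0.6]])

demo_loss_value = categorical_crossentropy(y_true_demo, y_pred_demo)
demo_accuracy = accuracy(y_true_demo, y_pred_demo)
print(demo_loss_value, demo_accuracy)
```

In this example the last row is misclassified, so three of the four predictions are correct.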

### Plotting the training over time

In [None]:
def plot_loss(history):
    plt.style.use("ggplot")
    plt.figure(figsize=(8, 4))
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title("Model's training loss")
    plt.xlabel("Epoch #")
    plt.ylabel("Loss")
    plt.legend(['Train', 'Test'], loc='upper left')
    plt.show()


def plot_accuracy(history):
    plt.style.use("ggplot")
    plt.figure(figsize=(8, 4))
    plt.plot(history.history['accuracy'])
    plt.plot(history.history['val_accuracy'])
    plt.title("Model's training accuracy")
    plt.xlabel("Epoch #")
    plt.ylabel("Accuracy")
    plt.legend(['Train', 'Test'], loc='upper left')
    plt.show()

In [None]:
plot_loss(history)
plot_accuracy(history)

_How can we save these plots?_
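
One possible answer, sketched under assumptions: Matplotlib figures can be written to disk with `savefig`. The loss values and the file name `training_loss.png` below are made up for illustration; in the notebook, the curves would come from `history.history` instead.

```python
import os

import matplotlib.pyplot as plt

# Made-up values standing in for history.history['loss'] and ['val_loss'].
demo_loss = [1.10, 0.85, 0.62, 0.48, 0.39]
demo_val_loss = [1.12, 0.90, 0.70, 0.55, 0.47]

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(demo_loss, label="Train")
ax.plot(demo_val_loss, label="Test")
ax.set_title("Model's training loss")
ax.set_xlabel("Epoch #")
ax.set_ylabel("Loss")
ax.legend(loc="upper left")

output_path = "training_loss.png"  # hypothetical file name
fig.savefig(output_path, dpi=150, bbox_inches="tight")
plt.close(fig)

print(os.path.getsize(output_path) > 0)
```

When using the `plt.figure` / `plt.plot` style of the functions above, calling `plt.savefig(...)` before `plt.show()` has the same effect.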

## Saving a model

To save the trained model we will basically do two things:

1. Serialize the model into a JSON file, which will save the architecture of our model.
2. Serialize the weights into an HDF5 file, which will save all trained parameters of our model.

In [None]:
# Serialize model to JSON:
model_json = model.to_json()
with open("iris_model.json", "w") as json_file:
    json_file.write(model_json)

# Serialize weights to HDF5 (h5py needed):
model.save_weights("iris_model.h5")
print("Model saved to disk.")

## Downloading a model

We just need to import the Google Colab module and download the specified files.

In [None]:
from google.colab import files

model_files = ['iris_model.json', 'iris_model.h5']
for file in model_files:
    files.download(file)

## Loading a trained model

We will basically do three things:

1. Load the model architecture from a JSON file.
2. Load the weights from an HDF5 file.
3. (Re)compile the trained model.

In [None]:
# Load json and create model:
from tensorflow.keras.models import model_from_json

# The context manager closes the file automatically:
with open('iris_model.json', 'r') as json_file:
    loaded_model_json = json_file.read()
loaded_model = model_from_json(loaded_model_json)

# Load weights into loaded model:
loaded_model.load_weights("iris_model.h5")
print("Model loaded from disk.")

In [None]:
# Evaluate loaded model on test data:
loaded_model.compile(loss='categorical_crossentropy',
                     optimizer='adam',
                     metrics=['accuracy'])

score = loaded_model.evaluate(x_test, y_test, verbose=1)
print(f'Test accuracy: {score[1]}')

## Predicting from new data

Now that we have a trained model, how do we use it?

It is as simple as follows:

In [None]:
# Revisit the elements displayed earlier and compare them with predictions:
for element in elements_to_display:
    prediction_vector = model.predict(np.array([x[element]]))
    print(f"Element {element}:")
    print(f"  - Features: {x[element]}")
    print(f"  - Target: {y[element]}")
    print(f"  - Species: {names[np.argmax(y[element])]}")
    print(f"  - Predicted species: {names[np.argmax(prediction_vector)]}")
    print()
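
The decoding step used in the loop above can be looked at in isolation. For a softmax output, the prediction is a row of class probabilities, and `np.argmax` picks the index of the most probable class. The probability values below are made up for illustration, and the species names are the ones shipped with scikit-learn's Iris dataset.

```python
import numpy as np

species_names = np.array(["setosa", "versicolor", "virginica"])

# Made-up output for a single flower: one probability per class.
demo_prediction = np.array([[0.02, 0.91, 0.07]])

predicted_index = int(np.argmax(demo_prediction, axis=1)[0])
predicted_species = species_names[predicted_index]
confidence = float(demo_prediction[0, predicted_index])

print(f"Predicted species: {predicted_species} (probability {confidence:.2f})")
```

Reporting the probability alongside the label gives a rough idea of how certain the model is about each prediction.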