# Training Models

In this notebook we implement a machine learning model using one of the classifiers available in scikit-learn.
The idea is to train a model and send all its metrics to the machine learning workspace in the cloud.
The notebook is divided into the following tasks:

1. Install the Azure Machine Learning Workspace SDK
2. Create an Azure Machine Learning Workspace
3. Create the Machine Learning Experiment
4. Train a Random Forest classifier with Scikit-learn
5. Train a Deep Learning classifier with TensorFlow
6. Register best Model

## 1. Install the Azure Machine Learning Workspace SDK

First, we need to install the SDK so that we can log metrics and run the training in the cloud.
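
Assuming the v1 SDK published on PyPI as `azureml-sdk`, it can be installed with `pip install azureml-sdk` (or `%pip install azureml-sdk` from a notebook cell). The short sketch below only checks whether the package can be imported; the helper name `sdk_is_installed` is hypothetical.

```python
import importlib.util


def sdk_is_installed(package: str = "azureml") -> bool:
    """Return True when the given top-level package can be found for import."""
    return importlib.util.find_spec(package) is not None


if sdk_is_installed():
    print("Azure Machine Learning SDK found.")
else:
    print("SDK not found; install it with: pip install azureml-sdk")
```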

## 2. Create an Azure Machine Learning Workspace

In this step we create a machine learning workspace through the SDK.

* Give a name to the workspace
* Use the existing Resource group
* Use the same location that you used for the resources created previously

> *Note: if you have more than one subscription, first set as default the subscription in which you want to deploy the workspace.*
> *Set the default subscription with the following command: `az account set --subscription SUBSCRIPTION_ID`*

> *Hint: use the Azure Machine Learning SDK from [here](https://docs.microsoft.com/en-us/python/api/overview/azure/ml/intro?view=azure-ml-py)*
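
As a minimal sketch of this step, assuming the v1 SDK (`azureml-core`) and valid Azure credentials, the workspace could be created with `Workspace.create`. The helper names and every value below (workspace name, resource group, location) are placeholders, not values from this exercise.

```python
def workspace_settings(name, subscription_id, resource_group, location):
    """Collect the keyword arguments passed to Workspace.create."""
    return {
        "name": name,
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "location": location,
        "create_resource_group": False,  # reuse the existing resource group
        "exist_ok": True,  # do not fail if the workspace already exists
    }


def create_workspace(settings):
    """Create (or fetch) the workspace; needs azureml-core and Azure credentials."""
    # Imported lazily so this sketch can be loaded without the SDK installed.
    from azureml.core import Workspace

    ws = Workspace.create(**settings)
    ws.write_config()  # stores the workspace details for Workspace.from_config()
    return ws


# Placeholder values (assumptions): replace them with your own.
settings = workspace_settings(
    name="my-workspace",
    subscription_id="SUBSCRIPTION_ID",
    resource_group="my-resource-group",
    location="westeurope",
)
print(settings)
# ws = create_workspace(settings)  # uncomment once the placeholders are filled in
```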

## 3. Create the Machine Learning Experiment

Create an experiment where we can log all the metrics and parameters from the training.

1. Get the machine learning workspace

2. Create an experiment to log all the metrics and parameters
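
The two steps above can be sketched as follows, assuming the v1 SDK and a workspace configuration file saved earlier with `write_config()`. The function names and the experiment name in the usage comment are hypothetical.

```python
def get_experiment(experiment_name):
    """Load the workspace from its saved config and return an experiment in it."""
    # Imported lazily so this sketch can be loaded without the SDK installed.
    from azureml.core import Experiment, Workspace

    ws = Workspace.from_config()
    return Experiment(workspace=ws, name=experiment_name)


def log_values(run, values):
    """Log each name/value pair through the run's log(name, value) method."""
    for name, value in values.items():
        run.log(name, value)


# Typical usage once the workspace exists (names are placeholders):
# experiment = get_experiment("image-classification")
# run = experiment.start_logging()
# log_values(run, {"n_estimators": 100})
# run.complete()
```

Keeping the logging in a small helper means the same function can be reused for both the scikit-learn and the TensorFlow training steps.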

## 4. Train a Random Forest classifier with Scikit-learn
First, let's implement the Random Forest classifier.

### 1. Load data and prepare for training

Once the whole image dataset has been processed and the images are normalized, we can create the training dataset that a machine learning algorithm needs.
However, traditional machine learning algorithms do not understand images; they only understand numbers.
In this step we create a flattened image dataset made only of arrays of numbers, and we label them in order to use any of the supervised classifiers.

1. Load all the files in an array:
    * Flatten the images into arrays of numbers.
    * Create the labels of the dataset with the same number of elements.

> *Hint: Use the Image module of Pillow and numpy to flatten the array.*

2. Divide the input dataset into train and test sets
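
A minimal sketch of both steps, using Pillow, numpy and scikit-learn. To stay self-contained it builds synthetic in-memory images instead of reading your files; the target size of 32x32, the scaling to the 0-1 range and the 80/20 split are assumptions to adapt to your dataset.

```python
import numpy as np
from PIL import Image
from sklearn.model_selection import train_test_split


def flatten_image(image, size=(32, 32)):
    """Convert a PIL image to RGB, resize it and flatten it into a 1-D array."""
    rgb = image.convert("RGB").resize(size)
    return np.asarray(rgb, dtype=np.float32).ravel() / 255.0


# Synthetic stand-ins for the real files; with real data use Image.open(path).
images = [Image.new("RGB", (64, 64), color=(i * 10, 0, 0)) for i in range(10)]
labels = [i % 2 for i in range(10)]  # one label per image

X = np.stack([flatten_image(img) for img in images])
y = np.array(labels)

# Hold out 20% of the samples for testing, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print(X_train.shape, X_test.shape)
```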

### 2. Train the model

We are finally here: we are going to train a custom model for the classification task at hand.
To solve this step, we will try one of the models implemented in **scikit-learn**, which are really helpful.
Train the model with different parameters to see how its metrics change.

> *Hint: Use a classifier from scikit-learn, from [here](https://scikit-learn.org/stable/supervised_learning.html#supervised-learning)*
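
One possible way to compare parameters is sketched below with `RandomForestClassifier`. It uses a synthetic dataset from `make_classification` as a stand-in for the flattened images, and the parameter values tried are arbitrary assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the flattened image dataset.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)


def train_and_score(n_estimators, max_depth):
    """Fit a random forest and return it with its accuracy on the test set."""
    model = RandomForestClassifier(
        n_estimators=n_estimators, max_depth=max_depth, random_state=0
    )
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    return model, accuracy


# Try a few forest sizes and keep the accuracy obtained with each one.
results = {}
for n_estimators in (10, 50, 100):
    _, accuracy = train_and_score(n_estimators, max_depth=5)
    results[n_estimators] = accuracy
    print(f"n_estimators={n_estimators} accuracy={accuracy:.3f}")
```

Each parameter and accuracy value collected here is the kind of information to send to the experiment, for example with the `log` method of an Azure ML run.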

## 5. Train a CNN model with TensorFlow
With this approach, we use Convolutional Neural Networks (CNNs) to analyze the images directly and then classify them into the available categories.
With this kind of model, features are extracted automatically from the images, which can help a CNN achieve better classification results.

### 1. Prepare the data for training
Now there is no need to flatten the image matrix; we can train with it directly.
For this purpose we load the images and ensure the training dataset fulfills the following:

* All the data must be a numpy array.
* The labels must be encoded using a one-hot encoding approach, see [here](https://hackernoon.com/what-is-one-hot-encoding-why-and-when-do-you-have-to-use-it-e3c6186d008f)
* Don't forget to divide the dataset into train and test
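
As an illustration of the one-hot requirement, the sketch below encodes integer labels with plain numpy; the helper name `one_hot` and the example labels are hypothetical. Keras offers `tf.keras.utils.to_categorical` for the same purpose.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Turn integer class labels into rows with a single 1 per label."""
    labels = np.asarray(labels, dtype=int)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.float32)[labels]


labels = [0, 2, 1, 2]
encoded = one_hot(labels, num_classes=3)
print(encoded)
```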

### 2. Create the model
After prototyping quickly with scikit-learn, we realize that it is a good thing to use frameworks.
TensorFlow is one of the most common frameworks for Deep Learning.
However, it can sometimes be tedious to use; that is one of the reasons to use Keras, which has TensorFlow as a backend.

For the design of the model we propose to use an architecture with the following layers:

1. Input Layer (3 channel image input layer)
2. Convolutional (2D)
3. Max Pooling
4. Convolutional (2D)
5. Max Pooling
6. Dense (Output layer)

> *Hint: Use [TensorFlow](https://www.tensorflow.org/) with [Keras](https://keras.io/) in order to design the architecture of your model.*
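
The proposed layers could be assembled roughly as follows with the Keras `Sequential` API. The input size, filter counts, kernel sizes and number of classes are assumptions, and a `Flatten` layer is added before the output layer (it is not in the list above) so that the dense layer receives a vector.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_cnn(input_shape=(64, 64, 3), num_classes=2):
    """Build and compile a small CNN following the proposed layer order."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),  # 3 channel image input
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),  # assumption: needed to feed the dense layer
        layers.Dense(num_classes, activation="softmax"),  # output layer
    ])
    # Categorical cross-entropy matches one-hot encoded labels.
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_cnn()
model.summary()
```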

### 3. Train the model

Train the convolutional neural network using the architecture designed previously.
Now you only need to train your network, but remember to log the accuracy and input parameters of your network to the Azure Machine Learning Workspace.
In addition, save your trained model after training.
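
A minimal sketch of training, logging and saving is shown below. It trains on random stand-in data, and `PrintRun` is a hypothetical local stand-in that exposes the same `log(name, value)` call as an Azure ML run; with the real SDK you would pass the run returned by `experiment.start_logging()` instead. The epochs, batch size and file name are assumptions.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers


class PrintRun:
    """Hypothetical stand-in offering the log(name, value) call of an Azure ML run."""

    def __init__(self):
        self.logged = {}

    def log(self, name, value):
        self.logged[name] = value
        print(f"{name}: {value}")


# Random stand-in data: 16 RGB images of 32x32 pixels, 2 one-hot encoded classes.
rng = np.random.default_rng(0)
X_train = rng.random((16, 32, 32, 3), dtype=np.float32)
y_train = np.eye(2, dtype=np.float32)[rng.integers(0, 2, size=16)]

# Deliberately tiny network so that the sketch runs quickly.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    layers.Conv2D(8, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

params = {"epochs": 2, "batch_size": 8}
history = model.fit(X_train, y_train, verbose=0, **params)

# Log the input parameters and the final training accuracy.
run = PrintRun()
for name, value in params.items():
    run.log(name, value)
run.log("accuracy", float(history.history["accuracy"][-1]))

# Save the trained model so that it can be registered later.
model_path = os.path.join(tempfile.mkdtemp(), "cnn_model.h5")
model.save(model_path)
print("Model saved to", model_path)
```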

## 6. Register best Model

After training finishes, it is important to register the best model in the workspace.
Later, we can deploy this model by referring to it.
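
As a sketch of this step, assuming the v1 SDK, the best candidate can be selected by accuracy and registered with `Model.register`. The helper names, model names, file paths and accuracy numbers below are placeholders, not real results.

```python
def pick_best(candidates):
    """Return the candidate dictionary with the highest accuracy."""
    return max(candidates, key=lambda candidate: candidate["accuracy"])


def register_best(workspace, candidate):
    """Register the model file in the workspace; needs azureml-core."""
    # Imported lazily so this sketch can be loaded without the SDK installed.
    from azureml.core.model import Model

    return Model.register(
        workspace=workspace,
        model_path=candidate["path"],
        model_name=candidate["name"],
        tags={"accuracy": str(candidate["accuracy"])},
    )


# Placeholder entries: replace them with your own trained models and metrics.
candidates = [
    {"name": "random-forest", "path": "outputs/random_forest.pkl", "accuracy": 0.80},
    {"name": "cnn", "path": "outputs/cnn_model.h5", "accuracy": 0.85},
]
best = pick_best(candidates)
print("Best model:", best["name"])
# registered = register_best(ws, best)  # ws is the workspace created earlier
```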