# IRIS DATA CLASSIFICATION

This notebook trains a neural network to classify the iris data. The training workflow in the notebook follows the caviar search strategy for hyperparameter tuning, a technique used by machine learning scientists and engineers when training deep learning models.

**Import deep_learn package**

In [None]:
try:
    from deep_learn.nn import ann
except ImportError:
    # package not found on the path: append the parent directory and retry
    from config import *
    append_path('../')
    from deep_learn.nn import ann

**Import necessary packages**

In [None]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn import datasets
from sklearn.preprocessing import OneHotEncoder
import matplotlib.pyplot as plt

## 1. Load and reshape data

**Load iris data**

In [None]:
iris = datasets.load_iris()

In [None]:
X = iris.data
y = iris.target.reshape(-1,1)

**Store the data in pandas dataframe**

In [None]:
# stack the X y data horizontally
data = np.hstack((X,y))
# store the numpy array in pandas dataframe
data = pd.DataFrame(data=data, columns=iris.feature_names +['species'])
# shuffle the data
data = data.sample(frac=1, random_state=1).reset_index(drop=True)

In [None]:
data.head(20)

**Features and output of the data**

In [None]:
features = iris.feature_names
output = 'species'

## 2. Preprocess the data for deep learning model

**Do a train test split**

In [None]:
train_data, test_data = train_test_split(data, test_size = 0.266, random_state = 1)
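
The split above is purely random, so the three species may end up unevenly represented in the test set. As a hedged variant (not what this notebook uses), the sketch below passes the `stratify` argument of `train_test_split` so that each species keeps roughly the same share in both sets. It rebuilds the dataframe from scratch only so that it can run on its own.

```python
import numpy as np
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split

# rebuild the iris dataframe so the sketch is self-contained
iris = datasets.load_iris()
data = pd.DataFrame(data=np.hstack((iris.data, iris.target.reshape(-1, 1))),
                    columns=iris.feature_names + ['species'])

# stratified variant of the split: class proportions are preserved
train_data, test_data = train_test_split(data, test_size=0.266, random_state=1,
                                         stratify=data['species'])

# number of test samples per species
test_counts = test_data['species'].value_counts()
print(test_counts)
```

With an unstratified split the per-species counts depend on the random seed; stratifying removes that source of variation from the test accuracy.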

**A function to extract feature matrix and output vector**

In [None]:
def get_xy_data(dataframe, features = None, output = None):

    '''a function for parsing the feature matrix and output array from a pandas dataframe'''

    # if no features are given then just return a numpy matrix of the dataframe
    # (DataFrame.as_matrix() was removed in pandas 1.0; to_numpy() replaces it)
    if features is None:
        return dataframe.to_numpy()

    # extract the feature matrix and convert it to numpy array
    X = dataframe[features].to_numpy()

    # if there is no output
    if output is None:
        return X

    # extract the output column and convert it to a numpy column vector
    y = dataframe[output].to_numpy().reshape(-1, 1)
    # return the feature matrix and output vector
    return (X, y)

**Extract X y data for train and test set**

In [None]:
X_train, Y_train = get_xy_data(train_data, features=features, output=output)
X_test, Y_test = get_xy_data(test_data, features=features, output=output)

**One-hot encode the y data**

In [None]:
encoder = OneHotEncoder()
Y_train = encoder.fit_transform(Y_train)
Y_train = Y_train.toarray()
Y_test = encoder.transform(Y_test)
Y_test = Y_test.toarray()
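
To make the encoding step concrete, here is a minimal sketch on a toy label vector (the four labels are made up for illustration, not taken from the iris data). Each integer class becomes a row with a single 1 in the column of that class, and `argmax` recovers the original labels.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# toy column vector of class labels (illustrative values only)
labels = np.array([0, 2, 1, 2]).reshape(-1, 1)

# fit the encoder and convert the sparse result to a dense array
encoder = OneHotEncoder()
one_hot = encoder.fit_transform(labels).toarray()
print(one_hot)

# the column index of the 1 in each row is the original class label
decoded = one_hot.argmax(axis=1)
print(decoded)
```

Note that the encoder is fitted on the training labels only and then reused for the test labels, as in the cell above, so both sets share the same column order.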

In [None]:
print(X_train.shape)
print(Y_train.shape)

In [None]:
print(X_test.shape)
print(Y_test.shape)

## 3. Train the first neural network for classification

The first neural network is a quick-and-dirty implementation that lets engineers check whether the network, with its hyperparameters, architecture, and loss function, actually works. Hyperparameter tuning comes after this first implementation. The iris dataset is simple enough that even a first implementation can be very accurate. Things are not so easy for harder problems, for example building a YOLO object detection network that detects vehicles, pedestrians, and road signs for a self-driving system.

**Neural network architecture**

In [None]:
layers_dims = [4,4,8,8,4,3]

**Create a nn model object**

In [None]:
model = ann(layers_dims=layers_dims)

**Hyperparameters of the model**

In [None]:
batch_size = X_train.shape[0]
learning_rate = 0.1*.5
num_iterations = 40000

**Fit the model**

In [None]:
model.fit(X_train, Y_train, X_test, Y_test, batch_size,
          learning_rate = learning_rate, 
          num_iterations = num_iterations, print_cost=True, random_seed = 0)

**Plot of Cost vs Iteration**

In [None]:
# plot the cost
plt.plot(np.squeeze(model.costs))
plt.ylabel('cost')
plt.xlabel('iterations (per tens)')
plt.title("Learning rate =" + str(learning_rate))
plt.show()

## 4. Caviar strategy for hyperparameter tuning

There are two strategies for deep learning hyperparameter tuning: 1) the panda strategy, in which we babysit a single model; this applies when computing resources are limited; 2) the caviar strategy, in which we randomly generate a number of hyperparameter settings, train a neural network with each setting, and then choose the one with the lowest error; this applies when computing resources are plentiful. In this notebook I tune the learning rate and the nn architecture using the caviar strategy.

**Function to generate a given number of nn architectures**

In [None]:
def hidden_layer_and_node_generator(model_num, num_input, num_output, randome_seed = 0, low = 8, high=17):
    
    '''a function to generate a given number of nn architectures'''
    
    # set the random seed
    np.random.seed(randome_seed)
    
    # list to store the architectures
    model_architecture_list = []
    
    # iterate given number of times
    for i in range(model_num):
        # randomly generate number of hidden layers
        num_hidden = np.random.randint(low = 3, high = 6)
        # randomly generate the number of nodes in each layer
        layers_dims = np.random.randint(low = low, high = high, size = num_hidden)
        layers_dims = layers_dims.tolist()
        # insert the input and output layer
        layers_dims.insert(0,num_input)
        layers_dims.append(num_output)
        # append the architecture to the designated list
        model_architecture_list.append(layers_dims)
    
    return model_architecture_list
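
A short usage sketch may help show what the generator produces. The function is repeated here in condensed form (same sampling logic, same parameter names) only so the block runs on its own; the choice of 5 models is arbitrary.

```python
import numpy as np

def hidden_layer_and_node_generator(model_num, num_input, num_output, randome_seed = 0, low = 8, high = 17):
    '''condensed copy of the notebook function, same sampling logic'''
    np.random.seed(randome_seed)
    model_architecture_list = []
    for _ in range(model_num):
        # 3 to 5 hidden layers (the upper bound of randint is exclusive)
        num_hidden = np.random.randint(low = 3, high = 6)
        # low to high-1 nodes in each hidden layer
        layers_dims = np.random.randint(low = low, high = high, size = num_hidden).tolist()
        model_architecture_list.append([num_input] + layers_dims + [num_output])
    return model_architecture_list

# generate 5 architectures for 4 input features and 3 output classes
architectures = hidden_layer_and_node_generator(5, num_input = 4, num_output = 3)
for layers_dims in architectures:
    print(layers_dims, "->", len(layers_dims) - 2, "hidden layers")

# the same seed reproduces the same list of architectures
same_again = hidden_layer_and_node_generator(5, num_input = 4, num_output = 3)
```

Every generated list starts with the input size and ends with the output size, so it can be passed directly as `layers_dims` to the model.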

**Generate a list of learning rates to choose from**

In [None]:
learning_rates = np.round(np.linspace(0.1*0.5,0.1*5,num=200),4)
learning_rates

**Function to randomly generate a given number of learning rates**

In [None]:
def learning_rate_generator(learning_rates, model_num, randome_seed = 0):
    
    '''a function to randomly generate a given number of learning rates'''
    
    np.random.seed(randome_seed)
    
    return np.random.choice(learning_rates, size=model_num).tolist()
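
One detail worth knowing: `np.random.choice` samples with replacement by default, so the same learning rate can be drawn more than once. The sketch below contrasts that default with `replace=False`, a possible variant (not used in this notebook) that guarantees distinct rates as long as `model_num` does not exceed the number of candidates.

```python
import numpy as np

# same candidate grid as in the notebook
learning_rates = np.round(np.linspace(0.1*0.5, 0.1*5, num=200), 4)
model_num = 20

# default behaviour: sampling with replacement, duplicates are possible
np.random.seed(0)
with_replacement = np.random.choice(learning_rates, size=model_num).tolist()

# variant: sampling without replacement, every rate is distinct
np.random.seed(0)
distinct = np.random.choice(learning_rates, size=model_num, replace=False).tolist()

print(len(set(with_replacement)), "unique rates with replacement")
print(len(set(distinct)), "unique rates without replacement")
```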

**Function which implements a caviar strategy search**

In [None]:
def caviar_strategy_search(model_num, batch_size, randome_seed = 0, num_iterations = 40000):
    
    '''a function which implements a caviar strategy search'''
    
    # randomly generate a list of learning rates
    learning_rate_list = learning_rate_generator(learning_rates, model_num, randome_seed = randome_seed)
    # randomly generate a list of architectures 
    model_architecture_list = hidden_layer_and_node_generator(model_num,4,3, randome_seed = randome_seed)
    # lists to store the costs and accuracy of models
    cost_list = []
    accuracy_list = []
    
    # iterate a given number of times
    for i in range(model_num):
        # create and fit a nn model with given architecture
        model = ann(layers_dims=model_architecture_list[i])
        model.fit(X_train, Y_train, X_test, Y_test, batch_size,
                  learning_rate = learning_rate_list[i], 
                  num_iterations = num_iterations, print_cost=False, random_seed = randome_seed)
        # append the average of last 5 costs to the designated list
        cost_list.append(np.average(model.costs[-5:]))
        # append the accuracy to the designated list
        accuracy_list.append(model.accuracy)
        # print statement
        print("Completed training and collected results for the model:",str(i+1))
        
    return model_architecture_list, learning_rate_list, cost_list, accuracy_list

In [None]:
model_num = 20
batch_size = X_train.shape[0]
randome_seed = 0

In [None]:
model_architecture_list, learning_rate_list, cost_list, accuracy_list = \
caviar_strategy_search(model_num, batch_size, randome_seed = randome_seed)

In [None]:
results = pd.DataFrame({"model layers": model_architecture_list,
                        "learning rate": learning_rate_list,
                        "accuracy": accuracy_list,
                        "cost": cost_list})
results = results.reindex(columns=["model layers", "learning rate", "accuracy", "cost"])

In [None]:
results
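
The last step of the caviar strategy is choosing the best setting. A minimal sketch of that selection follows, assuming a `results` table with the same columns as above. The three rows are placeholder values invented for illustration, not output from this notebook, and the sketch assumes a higher `accuracy` is better.

```python
import pandas as pd

# placeholder results table (illustrative values only, not real output)
results = pd.DataFrame({
    "model layers": [[4, 9, 12, 8, 3], [4, 16, 10, 11, 3], [4, 8, 15, 9, 13, 3]],
    "learning rate": [0.0523, 0.2106, 0.4412],
    "accuracy": [0.95, 0.975, 0.975],
    "cost": [0.12, 0.09, 0.07],
})

# rank by accuracy (highest first) and break ties by cost (lowest first)
ranked = results.sort_values(by=["accuracy", "cost"],
                             ascending=[False, True]).reset_index(drop=True)

# the first row is the selected hyperparameter setting
best = ranked.iloc[0]
print(best)
```

Because `accuracy` here is measured on the test set, the selected model's score is an optimistic estimate; holding out a separate validation set for the search would give a cleaner final number.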