### TP6 NN (OBLIGATORY)
In this TP we develop a two-layer fully-connected neural network for classification and test it on the CIFAR-10 dataset. The network applies a ReLU activation after the first fully-connected layer, and the outputs of the second fully-connected layer are the scores for each class. We train the network with a softmax loss function.
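To make the architecture concrete, here is a minimal NumPy sketch of the forward pass and the softmax loss. It is not the code expected in NN.py: the function name `two_layer_forward`, the argument order, and the absence of any regularization term are assumptions made for illustration.

```python
import numpy as np

def two_layer_forward(X, y, W1, b1, W2, b2):
    """Forward pass: affine -> ReLU -> affine, then the mean softmax loss."""
    hidden = np.maximum(0, X.dot(W1) + b1)   # ReLU after the first layer
    scores = hidden.dot(W2) + b2             # one score per class
    # Softmax cross-entropy, shifted by the row max for numerical stability
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    loss = -log_probs[np.arange(X.shape[0]), y].mean()
    return scores, loss

rng = np.random.default_rng(0)
N, D, H, C = 5, 3072, 10, 10                 # toy batch; D and C match CIFAR-10
X = rng.standard_normal((N, D))
y = rng.integers(0, C, size=N)
W1 = 1e-4 * rng.standard_normal((D, H))
b1 = np.zeros(H)
W2 = 1e-4 * rng.standard_normal((H, C))
b2 = np.zeros(C)

scores, loss = two_layer_forward(X, y, W1, b1, W2, b2)
print(scores.shape, loss)
```

With very small random weights all class probabilities are nearly equal, so the loss should be close to log(10), which is a convenient sanity check for an implementation.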
 

In [1]:
# Run some setup code for this notebook.
from __future__ import print_function

import random
import numpy as np
from data_utils import load_IRIS, load_CIFAR10
from NN import *
import matplotlib.pyplot as plt


# make figures appear inline
%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# notebook will reload external python modules;
# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython
%load_ext autoreload
%autoreload 2

The CIFAR-10 dataset contains 60,000 32x32 color images in 10 different classes, split into 50,000 training images and 10,000 test images ([CIFAR-10](https://en.wikipedia.org/wiki/CIFAR-10)).

You have to download the dataset; open a terminal and go to the folder *datasets*, then execute the script *get_datasets.sh*:
```bash
$ ./get_datasets.sh
```

In [2]:
def get_CIFAR10_data(num_training=49000, num_validation=1000, num_test=1000):
    """
    Load the CIFAR-10 dataset from disk and perform preprocessing to prepare
    it for the two-layer neural net classifier. These are the same steps as
    we used for the SVM, but condensed to a single function.  
    """
    # Load the raw CIFAR-10 data
    cifar10_dir = 'datasets/cifar-10-batches-py'
    
    X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)
        
    # Subsample the data
    mask = list(range(num_training, num_training + num_validation))
    X_val = X_train[mask]
    y_val = y_train[mask]
    mask = list(range(num_training))
    X_train = X_train[mask]
    y_train = y_train[mask]
    mask = list(range(num_test))
    X_test = X_test[mask]
    y_test = y_test[mask]

    # Normalize the data: subtract the mean image
    mean_image = np.mean(X_train, axis=0)
    X_train -= mean_image
    X_val -= mean_image
    X_test -= mean_image

    # Reshape data to rows
    X_train = X_train.reshape(num_training, -1)
    X_val = X_val.reshape(num_validation, -1)
    X_test = X_test.reshape(num_test, -1)

    return X_train, y_train, X_val, y_val, X_test, y_test


# Clean up variables to avoid loading the data multiple times (which may cause memory issues)
try:
    del X_train, y_train
    del X_test, y_test
    print('Clear previously loaded data.')
except NameError:
    # Nothing was loaded yet
    pass

# Invoke the above function to get our data.
X_train, y_train, X_val, y_val, X_test, y_test = get_CIFAR10_data()
print('Train data shape: ', X_train.shape)
print('Train labels shape: ', y_train.shape)
print('Validation data shape: ', X_val.shape)
print('Validation labels shape: ', y_val.shape)
print('Test data shape: ', X_test.shape)
print('Test labels shape: ', y_test.shape)

Train data shape:  (49000, 3072)
Train labels shape:  (49000,)
Validation data shape:  (1000, 3072)
Validation labels shape:  (1000,)
Test data shape:  (1000, 3072)
Test labels shape:  (1000,)
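The 3072 columns reported above come from flattening each 32x32x3 image into a single row after subtracting the mean image. The sketch below reproduces these two steps on random stand-in data. It assumes the images are float arrays of shape (N, 32, 32, 3), which is not shown in this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for a few CIFAR-10 images: (N, height, width, channels), as floats
images = rng.uniform(0, 255, size=(8, 32, 32, 3))

mean_image = images.mean(axis=0)               # per-pixel mean, shape (32, 32, 3)
centered = images - mean_image                 # broadcast over the first axis
rows = centered.reshape(images.shape[0], -1)   # one flat row per image

print(rows.shape)                              # (8, 3072)
```

In get_CIFAR10_data the mean image is computed on the training split only and then reused for the validation and test splits, so no information from those splits leaks into the preprocessing.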


### Instructions

We will use the class NN in the file NN.py to represent instances of our network. The network parameters are stored in the instance variable self.params, a dictionary whose keys are string parameter names and whose values are numpy arrays.

- Open the file NN.py and fill in the missing parts of NN.loss. This function takes the data and the weights and computes the class scores, the loss, and the gradients with respect to the parameters (you have to implement both the forward pass and the backward pass).

- Fill in the missing part of NN.train() (you only have to update the parameters of the network).

- Implement the NN.predict method to predict labels for data points.



- Train your network using 1, 10, 100, and 1000 neurons in the hidden layer (hidden_size). For each case, compute the test accuracy and measure the time it takes to train the network. Explain whether and how the size of the hidden layer influences the prediction quality (test accuracy) and the training time.
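The three pieces to implement (loss with gradients, parameter update, prediction) can be sketched as standalone functions. This is an illustrative outline, not the layout of NN.py: the parameter keys `'W1'`, `'b1'`, `'W2'`, `'b2'`, the function names, and the lack of regularization are all assumptions.

```python
import numpy as np

def loss_and_grads(params, X, y):
    """Softmax loss and its gradients for affine -> ReLU -> affine."""
    W1, b1, W2, b2 = params['W1'], params['b1'], params['W2'], params['b2']
    N = X.shape[0]
    # Forward pass
    hidden = np.maximum(0, X.dot(W1) + b1)
    scores = hidden.dot(W2) + b2
    shifted = scores - scores.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    loss = -np.log(probs[np.arange(N), y]).mean()
    # Backward pass: d(loss)/d(scores) = (probs - one_hot(y)) / N
    dscores = probs.copy()
    dscores[np.arange(N), y] -= 1
    dscores /= N
    grads = {'W2': hidden.T.dot(dscores), 'b2': dscores.sum(axis=0)}
    dhidden = dscores.dot(W2.T)
    dhidden[hidden <= 0] = 0          # ReLU passes gradient only where it was active
    grads['W1'] = X.T.dot(dhidden)
    grads['b1'] = dhidden.sum(axis=0)
    return scores, loss, grads

def sgd_step(params, grads, learning_rate):
    """Plain gradient descent update, applied in place."""
    for name in params:
        params[name] -= learning_rate * grads[name]

def predict(params, X):
    """Predicted label = index of the highest class score."""
    hidden = np.maximum(0, X.dot(params['W1']) + params['b1'])
    return np.argmax(hidden.dot(params['W2']) + params['b2'], axis=1)

# Toy problem: 6 points, 4 features, 5 hidden units, 3 classes
rng = np.random.default_rng(0)
params = {'W1': 0.1 * rng.standard_normal((4, 5)), 'b1': np.zeros(5),
          'W2': 0.1 * rng.standard_normal((5, 3)), 'b2': np.zeros(3)}
X_toy = rng.standard_normal((6, 4))
y_toy = rng.integers(0, 3, size=6)

_, loss_before, grads = loss_and_grads(params, X_toy, y_toy)
sgd_step(params, grads, learning_rate=0.1)
_, loss_after, _ = loss_and_grads(params, X_toy, y_toy)
print(loss_before, loss_after)        # a small step typically lowers the loss slightly
print(predict(params, X_toy))
```

A useful way to validate a backward pass like this is to compare a few entries of the analytic gradient against a centered finite difference of the loss.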

#### To train your network use: learning_rate=1e-4, learning_rate_decay=0.95, batch_size=100, num_iters = 1000
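As a rough guide to what these settings imply, the arithmetic below relates the number of iterations to passes over the 49,000 training images. It assumes that the learning rate is multiplied by learning_rate_decay once per completed epoch. That is a common convention, but the actual schedule depends on how NN.train() is written.

```python
num_train = 49000
batch_size = 100
num_iters = 1000
learning_rate = 1e-4
learning_rate_decay = 0.95

iterations_per_epoch = max(num_train // batch_size, 1)   # 490 mini-batches per pass
completed_epochs = num_iters // iterations_per_epoch      # 2 full passes in 1000 iterations

# Assumption: one multiplicative decay per completed epoch
final_learning_rate = learning_rate * learning_rate_decay ** completed_epochs
print(iterations_per_epoch, completed_epochs, final_learning_rate)
```

Under this assumption the network sees the training set only about twice, so accuracies well below those of a fully trained model are to be expected.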

In [None]:
from timeit import default_timer as timer
from matplotlib.backends.backend_pdf import PdfPages

In [None]:
input_size = 32 * 32 * 3
hidden_size = np.array([1, 10, 100, 1000])
num_classes = 10

for size in hidden_size:
    start = timer()
    Neural_Network = NN(input_size=input_size, hidden_size=size, output_size=num_classes)
    loss_history, train_acc_history, test_acc_history = Neural_Network.train(X=X_train, y=y_train, X_test=X_test, y_test=y_test,
                  learning_rate=1e-4, learning_rate_decay=0.95, num_iters=1000,
                  batch_size=100, verbose=False)
    end = timer()
    print('hidden layer size: ', size)
    print('seconds elapsed: ', end - start)
    print('train_acc_history: ', train_acc_history)
    print('test_acc_history: ', test_acc_history)
    # Plot the loss history and the train / test accuracy histories
    with PdfPages("fig{0}.pdf".format(size)) as pdf:
        plt.subplot(2, 1, 1)
        plt.plot(loss_history)
        plt.title('Loss history {0} neurons'.format(size))
        plt.xlabel('Iteration')
        plt.ylabel('Loss')

        plt.subplot(2, 1, 2)
        plt.plot(train_acc_history, label='train')
        plt.plot(test_acc_history, label='test')
        plt.title('Classification accuracy history')
        plt.xlabel('Epoch')
        plt.ylabel('Classification accuracy')
        plt.legend()
        pdf.savefig()
        plt.show()
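When interpreting the timings, it can help to count the trainable parameters for each hidden_size. The helper below is a hypothetical addition (`count_parameters` is not part of NN.py) and assumes that both layers have a bias vector.

```python
input_size = 32 * 32 * 3
num_classes = 10

def count_parameters(input_size, hidden_size, num_classes):
    """Weights and biases of both fully-connected layers."""
    first_layer = input_size * hidden_size + hidden_size
    second_layer = hidden_size * num_classes + num_classes
    return first_layer + second_layer

for hidden_size in (1, 10, 100, 1000):
    print(hidden_size, count_parameters(input_size, hidden_size, num_classes))
```

The count grows almost linearly with hidden_size because the first weight matrix dominates, which suggests that the cost per iteration should grow roughly in proportion for the larger networks. With a single hidden neuron, all ten class scores are affine functions of one number, which severely limits what the network can represent.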

### Comparison with the same network architecture using Keras

In [8]:
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from keras.utils import np_utils
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn import preprocessing
from sklearn.preprocessing import LabelEncoder
from sklearn.pipeline import Pipeline

In [4]:
np.random.seed(1234)

In [5]:
X_train_df = pd.DataFrame(X_train).astype(float)
y_train_df = pd.DataFrame(y_train)

In [None]:
# Get column names first
names = X_train_df.columns
# Create the Scaler object
scaler = preprocessing.StandardScaler()
# Fit your data on the scaler object
scaled_df = scaler.fit_transform(X_train_df)
scaled_df = pd.DataFrame(scaled_df, columns=names)

In [None]:
scaled_df.head()

In [6]:
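encoder = LabelEncoder()
# LabelEncoder expects a 1-D array; passing a single-column DataFrame
# triggers a DataConversionWarning, so flatten the labels first
labels_1d = y_train_df.values.ravel()
encoder.fit(labels_1d)
encoded_Y = encoder.transform(labels_1d)
dummy_y = np_utils.to_categorical(encoded_Y)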


In [16]:
def baseline_model():
    # Create the model: 3072 inputs -> 10 hidden units (ReLU) -> 10 classes (softmax)
    model = Sequential()
    model.add(Dense(10, input_dim=3072, activation='relu'))
    model.add(Dense(10, activation='softmax'))
    # Compile the model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

In [17]:
estimator = KerasClassifier(build_fn=baseline_model, epochs=3, batch_size=100, verbose=0)

In [18]:
kfold = KFold(n_splits=10, shuffle=True, random_state=1234)

In [19]:
results = cross_val_score(estimator, X_train_df, dummy_y, cv=kfold)
print("Baseline: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

Baseline: 19.07% (2.82%)


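The same architecture (10 hidden neurons) built with the Keras library reaches a mean 10-fold cross-validation accuracy of 19.07% (standard deviation 2.82%), which is lower than the test accuracy of around 33% obtained with our own neural network. The two figures are not strictly comparable: the Keras model is trained with the Adam optimizer for 3 epochs and is evaluated by cross-validation on the training set, whereas our network is evaluated on the held-out test set. Note also that cross_val_score is called on X_train_df, not on the standardized scaled_df.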
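To see how the two training budgets compare, the arithmetic below works out the fold sizes and the number of mini-batch updates per fold implied by the Keras settings used above (10 folds, 3 epochs, batch size 100). It assumes that each epoch processes every training sample of the fold exactly once.

```python
num_samples = 49000      # rows in X_train_df
n_splits = 10            # KFold(n_splits=10)
batch_size = 100
epochs = 3

validation_fold = num_samples // n_splits          # samples held out in each fold
training_fold = num_samples - validation_fold      # samples used for fitting in each fold
batches_per_epoch = (training_fold + batch_size - 1) // batch_size   # ceiling division
updates_per_fold = epochs * batches_per_epoch

print(validation_fold, training_fold, updates_per_fold)   # 4900 44100 1323
```

Each Keras model therefore performs a number of updates comparable to the 1000 iterations used for our own network, so under this assumption the accuracy gap is more likely due to the optimizer, the initialization, or the input scaling than to the amount of training alone.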