# CNN for Cifar-10 Dataset

## Requirements

For this project we used:
* Python 3.6
* TensorFlow 1.8.0
* *GPU parallel computing platform* NVIDIA CUDA 9.0
* *GPU-accelerated library* NVIDIA cuDNN 7.1
* or, alternatively, a CPU-optimized TensorFlow build for Intel processors

## Introduction

In this first notebook we explain how we have set up our work.  
The process is divided into several parts:
* Set up the environment with the Cifar-10 Dataset
* Define a convolutional neural network
* Define a quantization method
* Train the convolutional neural network
* Report the CNN's performance and accuracy

In [2]:
import numpy as np
import tensorflow as tf



## Cifar-10 Dataset

The Cifar-10 Dataset is taken from the official website, www.cs.toronto.edu.

The dataset is stored in the data directory: cnn/data. From the Cifar-10 dataset we take x_train, t_train, x_test and t_test.
The training set is used to train the CNN; the test set is used to evaluate the performance and accuracy of the network.

### Load data

In [21]:
from cnn.dense import dataset_preprocessing_by_keras
from cnn.utils.dataset import load_cifar10

In [22]:
x_train, t_train, x_test, t_test = load_cifar10()

In [23]:
x_train.shape, t_train.shape, x_test.shape, t_test.shape

((50000, 32, 32, 3), (50000, 10), (10000, 32, 32, 3), (10000, 10))
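
The label arrays have shape (N, 10), which suggests that each label is stored as a one-hot vector over the 10 classes. The internals of load_cifar10 are not shown here, so the following is only a minimal sketch of such an encoding; the helper name one_hot is hypothetical.

```python
import numpy as np


def one_hot(labels, num_classes=10):
    """Encode integer class ids as one-hot rows (hypothetical helper)."""
    labels = np.asarray(labels).reshape(-1)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.float32)[labels]


encoded = one_hot([3, 0, 9])
print(encoded.shape)
```

Taking the argmax along the last axis recovers the original integer class ids.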

In [5]:
x_train = dataset_preprocessing_by_keras(x_train)
x_train[0, :, :, 0]

array([[-0.68747891, -0.65572952, -0.63985482, ..., -0.95734874,
        -1.00497283, -0.92559935],
       [-0.65572952, -0.63985482, -0.59223073, ..., -0.92559935,
        -1.02084753, -0.86210057],
       [-0.7192283 , -0.63985482, -0.65572952, ..., -0.68747891,
        -0.735103  , -0.70335361],
       ...,
       [-0.36998499, -0.7509777 , -1.00497283, ..., -0.25886212,
        -0.24298742, -0.14773924],
       [-0.25886212, -0.65572952, -1.19546919, ...,  0.31262694,
        -0.00486698, -0.49698256],
       [-0.27473681, -0.35411029, -1.11609571, ...,  0.75711843,
         0.64599556,  0.28087755]])
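
The values above are centered around zero, which is consistent with a standardization step (subtract the mean, divide by the standard deviation). The source of dataset_preprocessing_by_keras is not shown, so this is only a sketch under that assumption; the function name standardize and the eps value are hypothetical.

```python
import numpy as np


def standardize(images, eps=1e-7):
    """Scale an image array to zero mean and unit variance (assumed behavior)."""
    images = images.astype(np.float64)
    mean = images.mean()
    std = images.std()
    # eps guards against division by zero for constant inputs.
    return (images - mean) / (std + eps)


# Fake 8-bit images with the same layout as Cifar-10 samples.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(8, 32, 32, 3))
normalized = standardize(fake_images)
print(normalized.mean(), normalized.std())
```

If the test set is preprocessed too, it would normally reuse the mean and standard deviation computed on the training set.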

## CNN Model and Training

We will use a custom-made wrapper for TensorFlow NN training:

In [6]:
from cnn.model_class import TfClassifier

This CNN is called *dense_cnn*. Here we will explain how it is composed.

The CNN is composed of several layers. The first part alternates 2 **convolutional** layers with 2 **pooling** layers; these are followed by a *flatten* layer, a **relu** layer, a *dropout* layer and finally a **softmax** layer.
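
To make the layer sequence more concrete, the sketch below traces how the tensor shape of one 32x32x3 image could change through two convolution/pooling stages and the flatten layer. The filter counts (32 and 64), the "same" padding and the 2x2 pooling are assumptions chosen for illustration; the actual values used by dense_cnn are not stated here.

```python
def conv_same(shape, filters):
    """A 'same'-padded convolution keeps height/width and sets the channels."""
    h, w, _ = shape
    return (h, w, filters)


def max_pool(shape, size=2):
    """Non-overlapping pooling divides height and width by the pool size."""
    h, w, c = shape
    return (h // size, w // size, c)


def flatten(shape):
    """Flatten collapses height, width and channels into one vector."""
    h, w, c = shape
    return (h * w * c,)


shape = (32, 32, 3)  # one Cifar-10 image
trace = [("input", shape)]
for index, filters in enumerate([32, 64], start=1):  # hypothetical filter counts
    shape = conv_same(shape, filters)
    trace.append(("conv%d" % index, shape))
    shape = max_pool(shape)
    trace.append(("pool%d" % index, shape))
shape = flatten(shape)
trace.append(("flatten", shape))

for name, layer_shape in trace:
    print(name, layer_shape)
```

Under these assumptions the flatten layer feeds a vector of 4096 values into the fully connected part of the network.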

The network uses the Adam optimizer (a stochastic gradient-based method) and a categorical crossentropy loss.  
To judge the performance of our model we used an MSE metric.
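
As an illustration of the loss named above, here is a minimal pure-Python sketch of a softmax followed by categorical crossentropy for a single sample. It is not the project's loss_fn, whose source is not shown; the function names and the eps value are assumptions.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to one."""
    shift = max(logits)  # subtracting the max improves numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot, probs, eps=1e-12):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(p + eps) for t, p in zip(one_hot, probs))


logits = [2.0, 1.0, 0.1]
target = [1.0, 0.0, 0.0]  # the true class is index 0
probs = softmax(logits)
loss = categorical_crossentropy(target, probs)
print(probs, loss)
```

With a one-hot target only the term of the true class contributes, so the loss shrinks toward zero as the predicted probability of that class approaches one.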

In [7]:
from cnn.dense import NET_NAME, eval_fn, forward_pass, loss_fn

In [8]:
model = TfClassifier(NET_NAME, forward_pass, loss_fn, eval_fn,
                     tf.train.AdamOptimizer())

This network is trained for 50 epochs.

In [9]:
# history = model.fit(
#    [x_train, t_train],
#    batch_size=64,
#    validation_split=0.1,
#    epochs=1,
#    verbosity=1)
#
# print(history)

Then it is evaluated on the test set:

In [None]:
evals = model.evaluate([x_test, t_test])

print(evals)

INFO:tensorflow:Restoring parameters from /home/daibak/Documents/Code/Python/aca-tensorflow/cnn/models/dense_cnn/model.ckpt


### test

In [34]:
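def _split_data_dict_in_perc(input_dict, n_samples, percs):
    # Split every array at the sample indices given by the fractions in percs.
    # Note: input_dict is modified in place.
    for k, v in input_dict.items():
        # np.int was removed in NumPy 1.24; the builtin int is equivalent here.
        input_dict[k] = np.split(v, (n_samples * percs).astype(int))

    # Turn the dict of lists into a list of dicts, one per split.
    input_LD = [dict(zip(input_dict, t))
                for t in zip(*input_dict.values())]  # List of Dicts

    return input_LD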


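def _batch_data_dict(input_dict, n_samples, batch_size):
    # Number of full batches; keep at least one so np.array_split never
    # receives zero sections when n_samples < batch_size.
    n_batches = max(n_samples // batch_size, 1)

    # np.array_split spreads any remainder over the batches instead of
    # dropping it, so some batches may be slightly larger than batch_size.
    for k, v in input_dict.items():
        input_dict[k] = np.array_split(v, n_batches)

    out_LD = [dict(zip(input_dict, t)) for t in zip(*input_dict.values())]

    return out_LD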


def _set_train_mode_to_LD(input_LD, mode):
    mode_d = {"train_mode:0": mode}
    for d in input_LD:
        d.update(mode_d)

    return input_LD

In [71]:
inputs = [x_train, t_train]
input_names = ["features", "labels"]
batch_size = 64
validation_split = 0.2
MAX_BATCH_SIZE = 2000

n_samples = inputs[0].shape[0]

input_tensors = input_names
input_DL = dict(zip(input_tensors, inputs))

input_LD = _split_data_dict_in_perc(input_DL, n_samples,
                                    np.array([1 - validation_split]))

train_dict = input_LD[0]
val_dict = input_LD[1]

n_train_samples = train_dict[input_tensors[0]].shape[0]

train_LD = _batch_data_dict(train_dict, n_train_samples, batch_size)

if n_samples - n_train_samples > MAX_BATCH_SIZE:
    val_LD = _batch_data_dict(val_dict, n_samples - n_train_samples,
                              MAX_BATCH_SIZE)
else:
    val_LD = [val_dict]

train_LD = _set_train_mode_to_LD(train_LD, True)
val_LD = _set_train_mode_to_LD(val_LD, False)
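
The cell above splits the data into a training and a validation part and then cuts the training part into batches. The same idea can be sketched on a toy dictionary without modifying the input in place. The function names split_train_val and make_batches are hypothetical and are not part of the cnn package.

```python
import numpy as np


def split_train_val(arrays, validation_split):
    """Return (train, val) dicts; the last fraction of samples goes to val."""
    n_samples = len(next(iter(arrays.values())))
    cut = int(n_samples * (1 - validation_split))
    train = {k: v[:cut] for k, v in arrays.items()}
    val = {k: v[cut:] for k, v in arrays.items()}
    return train, val


def make_batches(arrays, batch_size):
    """Return a list of dicts, one per batch, keeping every sample."""
    n_samples = len(next(iter(arrays.values())))
    n_batches = max(n_samples // batch_size, 1)
    # array_split spreads the remainder over the batches instead of dropping it.
    chunks = {k: np.array_split(v, n_batches) for k, v in arrays.items()}
    return [dict(zip(chunks, parts)) for parts in zip(*chunks.values())]


data = {"features": np.arange(20).reshape(10, 2), "labels": np.arange(10)}
train, val = split_train_val(data, validation_split=0.2)
batches = make_batches(train, batch_size=3)
print(len(train["labels"]), len(val["labels"]), len(batches))
```

Because the remainder is redistributed rather than dropped, a requested batch size of 3 over 8 training samples gives two batches of 4 samples each.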

In [78]:
input_LD.extend(input_names)
input_LD[-2]

'features'