1. Introduction
1. Data preparation
    1. Load data
    1. Normalization, Reshape and Label encoding
    1. Visualize test and train sample
1. Model Building
    1. Split training and validation set
    1. Define the model architecture
    1. Set the optimizer and annealer
    1. Data augmentation
    1. Train model
1. Evaluate the model
    1. Training and validation curves
    1. Visualize Prediction
1. Prediction and submission
    1. Predict and Submit results
* * * *
### 1. Introduction

This kernel is a basic starting point for deep learning.

CIFAR-10 (named after the Canadian Institute For Advanced Research) is a typical “hello world” dataset of computer vision. It is a collection of images commonly used to train machine learning and computer vision algorithms, and it is one of the most widely used datasets for machine learning research. The CIFAR-10 dataset contains 60,000 32x32 color images in 10 different classes: airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships, and trucks. There are 6,000 images of each class. In this competition, the goal is to correctly identify the object in each image from a dataset of tens of thousands of color images.
* * * *

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load in 

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# import backend
import tensorflow as  tf

# Model architecture
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D
from keras.layers import MaxPool2D, Activation, MaxPooling2D
from keras.layers import BatchNormalization  # importable from keras.layers in old and new Keras versions

# Annealer
from keras.callbacks import LearningRateScheduler

# Data processing
from keras.preprocessing.image import ImageDataGenerator
from keras.utils import to_categorical
from keras.preprocessing import image

# Visualization
import matplotlib.pyplot as plt
import seaborn as sns


# Progressor
from tqdm import tqdm
import h5py


**Dataset Description:**

* **train.7z -** a folder containing the training images in png format
* **test.7z -** a folder containing the test images in png format
* **trainLabels.csv -** the training labels
* **sampleSubmission.csv -** a sample submission file; we have to predict labels for all 300,000 test images

The train and test files are provided in .7z compressed format. You can extract them with the code given below, but it takes a long time to run.
<pre><code>
# Reading .7z content
!pip install pyunpack # install decoder package
!pip install patool # install requirement

from pyunpack import Archive

print("start unpacking train data")
%time Archive('/kaggle/input/cifar-10/train.7z').extractall('/kaggle/working/dataset/', auto_create_dir=True, patool_path=None)
print("unpacking finished")
print("start unpacking test data")
%time Archive('/kaggle/input/cifar-10/test.7z').extractall('/kaggle/working/dataset/', auto_create_dir=True, patool_path=None)
print("unpacking finished")

### This will take approx 2-3 hr to read file from test.7z.
</code></pre>

I downloaded the archives, processed them and saved the result as files in batches. You can use this data for your work; it is organised as follows:
* **train_data -** generated from train.7z
* **test_data_1 -** first 50k examples from test.7z
* **test_data_2 -** examples 50k to 100k from test.7z
* **test_data_3 -** examples 100k to 150k from test.7z
* **test_data_4 -** examples 150k to 200k from test.7z
* **test_data_5 -** examples 200k to 250k from test.7z
* **test_data_6 -** examples 250k to 300k from test.7z


### 2. Data Preparation
**Load Data**

In [None]:
# Read File
sample_submission = pd.read_csv('../input/cifar-10/sampleSubmission.csv')
train_labels = pd.read_csv('../input/cifar-10/trainLabels.csv')

print("Number of training sample: ",train_labels.shape[0])
print("Number of test samples: ", sample_submission.shape[0])

In [None]:
import h5py
with h5py.File('../input/cifardata/train_data.h5', 'r') as file:
    #for key in file.keys():
     #   print(key)
    train_ids = pd.DataFrame(np.array(np.squeeze(file['train_ids'])),columns=['id'])
    train_data = np.array(file['train_images']).reshape(-1, 32, 32, 3)
    
with h5py.File('../input/cifardata/test_data_1.h5', 'r') as file:
    test_ids_1 = pd.DataFrame(np.array(np.squeeze(file['test_ids'])),columns=['id'])
    test_data_1 = np.array(file['test_images']).reshape(-1, 32, 32, 3)
    
with h5py.File('../input/cifardata/test_data_2.h5', 'r') as file:
    test_ids_2 = pd.DataFrame(np.array(np.squeeze(file['test_ids'])),columns=['id'])
    test_data_2 = np.array(file['test_images']).reshape(-1, 32, 32, 3)
    
    
with h5py.File('../input/cifardata/test_data_3.h5', 'r') as file:
    test_ids_3 = pd.DataFrame(np.array(np.squeeze(file['test_ids'])),columns=['id'])
    test_data_3 = np.array(file['test_images']).reshape(-1, 32, 32, 3)
    
    
with h5py.File('../input/cifardata/test_data_4.h5', 'r') as file:
    test_ids_4 = pd.DataFrame(np.array(np.squeeze(file['test_ids'])),columns=['id'])
    test_data_4 = np.array(file['test_images']).reshape(-1, 32, 32, 3)
    
    
with h5py.File('../input/cifardata/test_data_5.h5', 'r') as file:
    test_ids_5 = pd.DataFrame(np.array(np.squeeze(file['test_ids'])),columns=['id'])
    test_data_5 = np.array(file['test_images']).reshape(-1, 32, 32, 3)
    
    
with h5py.File('../input/cifardata/test_data_6.h5', 'r') as file:
    test_ids_6 = pd.DataFrame(np.array(np.squeeze(file['test_ids'])),columns=['id'])
    test_data_6 = np.array(file['test_images']).reshape(-1, 32, 32, 3)
    


In [None]:
test_data = np.concatenate([test_data_1, test_data_2, test_data_3, test_data_4, test_data_5, test_data_6], axis=0)
# drop=True discards the old per-batch index instead of keeping it as an extra column
test_ids = pd.concat([test_ids_1, test_ids_2, test_ids_3, test_ids_4, test_ids_5, test_ids_6], axis=0).reset_index(drop=True)
#print(test_data.head())
print(test_ids.head())

In [None]:
# check whether the loaded test data is consistent with the sample submission (all differences should be 0)
test_ids.id.value_counts().sort_index() - sample_submission.id.value_counts().sort_index()

In [None]:
sample_submission.id.value_counts().sort_index()

In [None]:
# check train data is consistent or not
sum(train_ids.id == train_labels.id)

In [None]:
# shape of data
print(train_data.shape)
print(test_data.shape)
print(train_ids.shape)
print(test_ids.shape)

In [None]:
# Distribution of classes in training samples

train_labels.label.value_counts().plot(kind='bar', title='Distribution of classes')

**Normalization, Reshape and Label Encoding**

***Normalization***
* We rescale the pixel values of each image to the range [0, 1] (min-max scaling per color channel) to reduce the effect of illumination differences.
* With normalized inputs, the CNN trains faster.

***Reshape***
* Train and test images are 32 x 32 pixels with 3 color channels.
* We reshape all data to 4D arrays of shape (-1, 32, 32, 3).

***Label Encoding***
* Encode labels to one-hot vectors
> 2 => [0,0,1,0,0,0,0,0,0,0]
> 4 => [0,0,0,0,1,0,0,0,0,0]
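To make the two steps above concrete, here is a minimal NumPy sketch on toy data. The helper names `min_max_scale` and `one_hot` are made up for this illustration, and the integer class indices are assumed; the notebook's own cells use `pd.get_dummies`, which orders the columns by class name.

```python
import numpy as np

def min_max_scale(images):
    """Scale every image to [0, 1], independently per color channel."""
    images = images.astype("float32")
    lo = images.min(axis=(1, 2), keepdims=True)
    hi = images.max(axis=(1, 2), keepdims=True)
    # np.maximum guards against division by zero for constant channels
    return (images - lo) / np.maximum(hi - lo, 1e-7)

def one_hot(labels, num_classes=10):
    """Turn integer class indices into one-hot rows."""
    return np.eye(num_classes, dtype="float32")[labels]

# toy stand-in for a batch of four 32x32 RGB images
rng = np.random.default_rng(0)
toy_images = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)

scaled = min_max_scale(toy_images)
encoded = one_hot(np.array([2, 4]))

print(scaled.shape, scaled.dtype)
print(encoded)
```

The second print shows the same two vectors as the example above: a 1 at position 2 and a 1 at position 4.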

In [None]:
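# Function to min-max scale each image (per channel) to [0, 1] and reshape it
def Scale_Reshape(x):
    # float32 keeps the memory footprint of the 300k test images manageable
    x = x.astype('float32')
    x_min = x.min(axis=(1, 2), keepdims=True)
    x_max = x.max(axis=(1, 2), keepdims=True)

    # np.maximum avoids a division by zero for constant-valued channels
    x = (x - x_min) / np.maximum(x_max - x_min, 1e-7)
    
    x = x.reshape(-1, 32, 32, 3)
    return x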


In [None]:
# Training data processing
train = Scale_Reshape(train_data)

# Test data processing
test = Scale_Reshape(test_data)

# Label processing

Y=train_labels['label']
# convert to one-hot
Y = pd.get_dummies(Y)


print("train shape: ", train.shape)
print("test shape: ", test.shape)
print("one-hot label shape: ", Y.shape)

In [None]:
# Label encoding
from sklearn.preprocessing import LabelEncoder

lb_make = LabelEncoder()
label_int = train_labels[['label']].copy()
label_int.label = lb_make.fit_transform(label_int.label)


**Visualize test and train sample**
##### Train sample

In [None]:
# visualizing training samples
plt.figure(figsize=(15,5))
for i in range(40):  
    plt.subplot(4, 10, i+1)
    plt.imshow(train_data[i].reshape((32, 32, 3)),cmap=plt.cm.hsv)
    plt.title(f"{train_labels.label[i]}")
    plt.axis('off')
plt.subplots_adjust(wspace=0.3, hspace=0.3)
plt.show()

##### Test sample

In [None]:
# visualizing test samples
plt.figure(figsize=(15,5))
for i in range(40):  
    plt.subplot(4, 10, i+1)
    plt.imshow(test[i].reshape((32,32, 3)),cmap=plt.cm.hsv)
    plt.axis('off')
plt.subplots_adjust(wspace=0.0, hspace=0.0)
plt.show()

### 3. Model Building: 

**Split training and validation set**

In [None]:
# split training and validation set.
X_train, X_val, Y_train, Y_val = train_test_split(train, Y, random_state=0, test_size=0.1)
print("X_train shape: ", X_train.shape)
print("Y_train shape: ", Y_train.shape)
print("X_val shape: ", X_val.shape)
print("Y_val shape: ", Y_val.shape)

**Define the model architecture**

**Convolutional neural networks (CNNs)** are the current **state-of-the-art model architecture** for image classification tasks. CNNs apply a series of filters to the raw pixel data of an image to extract and learn higher-level features, which the model can then use for classification. A CNN contains several kinds of components:

***Convolutional layers***, which apply a specified number of convolution filters to the image. For each subregion, the layer performs a set of mathematical operations to produce a single value in the output **feature map**. Convolutional layers then typically apply a **ReLU activation** function to the output to introduce **nonlinearities** into the model.
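As a rough illustration of what a single filter does, here is a small NumPy sketch of one 3x3 filter sliding over a single-channel image, followed by ReLU. The filter values are a hypothetical hand-picked edge filter; in a real CNN the filter weights are learned, and Keras' `Conv2D` also sums over all input channels.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over a 2D `image` with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype="float32")
    for i in range(out_h):
        for j in range(out_w):
            # one output value per subregion: elementwise product, then sum
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Keep positive values, replace negative values with 0."""
    return np.maximum(x, 0.0)

image = np.arange(25, dtype="float32").reshape(5, 5)   # toy 5x5 "image"
kernel = np.array([[1, 0, -1]] * 3, dtype="float32")   # hypothetical vertical-edge filter

raw = conv2d_valid(image, kernel)
feature_map = relu(raw)

print(raw)          # every entry is -6 for this image and filter
print(feature_map)  # ReLU turns the negative responses into 0
```

A 5x5 input with a 3x3 filter and no padding gives a 3x3 feature map, which is why each unpadded convolution in the model below shrinks the image by 2 pixels per side length.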

***BatchNormalization layers***, which normalize the outputs of the previous layer over each mini-batch. This keeps the activations in a stable range and makes training faster.

***Dropout layers***, Dropout is a regularization method, where the layer randomly sets a proportion of its input units to zero at each training step. This forces the net to learn features in a distributed way, not relying too much on any particular unit, and therefore improves generalization.
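The idea can be sketched in a few lines of NumPy. This is a simplified version of so-called inverted dropout, not Keras' actual implementation: a random fraction of the values is zeroed during training and the survivors are scaled up so the expected sum stays the same. The function name and signature are assumptions for this example.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Zero a random fraction `rate` of the values and rescale the rest."""
    if not training or rate == 0.0:
        return activations  # dropout is only active during training
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
acts = np.ones((2, 8), dtype="float32")

dropped = dropout(acts, rate=0.2, rng=rng)
unchanged = dropout(acts, rate=0.2, rng=rng, training=False)

print(dropped)    # each entry is either 0 or 1 / 0.8 = 1.25
print(unchanged)  # at inference time the input passes through untouched
```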

***Activation layers***, Activation functions are essential for a neural network to learn complicated, non-linear mappings between the inputs and the response variable. They introduce **non-linear properties** to our network.

***Pooling layers***, which downsample the image data extracted by the convolutional layers to reduce the dimensionality of the feature map in order to decrease processing time. A commonly used pooling algorithm is max pooling, which extracts subregions of the feature map (e.g., 2x2-pixel tiles), keeps their maximum value, and discards all other values. It gives the model a small degree of **translation invariance** in feature extraction.
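A minimal NumPy sketch of 2x2 max pooling on one small feature map is shown here. The helper `max_pool_2x2` is written for this illustration and assumes non-overlapping tiles (stride 2), which matches the `MaxPooling2D((2, 2))` layers used in the model below.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Keep the maximum of every non-overlapping 2x2 tile."""
    h, w = feature_map.shape
    h, w = h - h % 2, w - w % 2  # drop a trailing odd row or column
    tiles = feature_map[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return tiles.max(axis=(1, 3))

fm = np.array([[1, 3, 2, 0],
               [4, 2, 1, 1],
               [0, 0, 5, 6],
               [1, 2, 7, 8]], dtype="float32")

pooled = max_pool_2x2(fm)
print(pooled)  # [[4. 2.]
               #  [2. 8.]]
```

The 4x4 map becomes 2x2, so each pooling layer halves the height and width of the feature maps.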

***Dense (fully connected) layers***, which perform classification on the features extracted by the convolutional layers and downsampled by the pooling layers. In a dense layer, every node in the layer is connected to every node in the preceding layer.

Typically, a CNN is composed of a stack of convolutional modules that perform feature extraction. Each module consists of a convolutional layer followed by a pooling layer. The last convolutional module is followed by one or more dense layers that perform classification. The final dense layer in a CNN contains a single node for each target class in the model (all the possible classes the model may predict), with a softmax activation function to generate a value between 0–1 for each node (the sum of all these softmax values is equal to 1). We can interpret the softmax values for a given image as relative measurements of how likely it is that the image falls into each target class.
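The softmax step described above can be sketched in NumPy as follows. The ten input scores are made-up values standing in for the raw outputs of the final dense layer.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into values between 0 and 1 that sum to 1."""
    # subtracting the maximum does not change the result but avoids overflow
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

# hypothetical raw scores for one image, one value per class
logits = np.array([[2.0, 1.0, 0.1, -1.0, 0.0, 0.5, 3.0, -2.0, 1.5, 0.2]])

probs = softmax(logits)
predicted_class = int(np.argmax(probs, axis=-1)[0])

print(probs.round(3))
print(probs.sum())       # sums to 1 up to floating point rounding
print(predicted_class)   # 6, the index of the largest score
```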



Let's build a model to classify the images in the CIFAR-10 dataset using the following CNN architecture:


In [None]:
# BUILD CONVOLUTIONAL NEURAL NETWORKS

model = Sequential()

model.add(Conv2D(32,  kernel_size = 3,kernel_initializer='he_normal', activation='relu', input_shape = (32, 32, 3)))
model.add(BatchNormalization())
model.add(Dropout(0.2))
model.add(Conv2D(64, kernel_size = 3, kernel_initializer='he_normal', strides=1, activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(128, kernel_size = 3, strides=1, kernel_initializer='he_normal' ,padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, kernel_size = 3,kernel_initializer='he_normal', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.2))


model.add(Flatten())
model.add(Dense(512,kernel_initializer='he_normal', activation = "relu"))
model.add(Dropout(0.2))
model.add(Dense(10, kernel_initializer='glorot_uniform', activation = "softmax"))


# Compile the model
model.compile(loss="categorical_crossentropy", optimizer="Nadam", metrics=["accuracy"])

# Summary of model
model.summary()


**Set the optimizer and annealer**

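An annealer lowers the learning rate as training progresses: larger steps early on speed things up, and smaller steps later help the optimizer converge. The step-decay schedule below starts at a learning rate of 0.1 and halves it every 3 epochs; Keras provides the `LearningRateScheduler` callback for this. Note that the cell is commented out, so the schedule is not applied in the training run of this kernel.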
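To see what the step-decay rule in the commented-out cell produces, here is a small standalone sketch that evaluates the same formula for the first six epochs. It only needs the standard library; the default arguments mirror the constants used in that cell.

```python
import math

def step_decay(epoch, initial_lrate=0.1, drop=0.5, epochs_drop=3.0):
    """Multiply the learning rate by `drop` once every `epochs_drop` epochs."""
    return initial_lrate * math.pow(drop, math.floor((1 + epoch) / epochs_drop))

schedule = [step_decay(epoch) for epoch in range(6)]
for epoch, lrate in enumerate(schedule):
    print(f"epoch {epoch}: learning rate {lrate}")
# epochs 0-1 use 0.1, epochs 2-4 use 0.05, epoch 5 uses 0.025
```

Because of the `1 + epoch` term, the first drop already happens at epoch index 2, i.e. in the third epoch.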

In [None]:
"""
import math
# learning rate schedule
def step_decay(epoch):
    initial_lrate = 0.1
    drop = 0.5
    epochs_drop = 3.0
    lrate = initial_lrate * math.pow(drop, math.floor((1+epoch)/epochs_drop))
    return lrate

# learning schedule callback
annealer = LearningRateScheduler(step_decay)
callbacks_list = [annealer]
"""

**Data augmentation**

To avoid overfitting, we need to artificially expand our dataset. We can make the existing dataset even larger by altering the training data with small transformations.

For the data augmentation, I considered the following transformations:

* Randomly rotate some training images by up to 10 degrees
* Randomly zoom some training images by up to 10%
* Randomly shift images horizontally by up to 10% of the width
* Randomly shift images vertically by up to 10% of the height
* Randomly flip images horizontally

Note that in the generator settings below only the horizontal and vertical shifts are enabled; rotation, zoom and horizontal flip are switched off (`rotation_range=0`, `zoom_range=0.0`, `horizontal_flip=False`).

I did not apply a vertical_flip since it could lead to misclassification of similar-looking objects such as airplane and bird, or truck and ship.
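To give a feel for what a shift and a horizontal flip do to an image array, here is a rough NumPy sketch. It is a simplification and not how `ImageDataGenerator` works internally: the helper name is made up, and `np.roll` wraps pixels around the border, whereas Keras by default fills the uncovered area with the nearest edge pixels.

```python
import numpy as np

def random_shift_and_flip(image, rng, max_shift=0.1, flip=True):
    """Shift an HxWxC image by up to `max_shift` of its size and maybe mirror it."""
    h, w, _ = image.shape
    max_dy, max_dx = int(h * max_shift), int(w * max_shift)
    dy = int(rng.integers(-max_dy, max_dy + 1))
    dx = int(rng.integers(-max_dx, max_dx + 1))
    # simplification: pixels pushed out on one side re-enter on the other
    shifted = np.roll(image, shift=(dy, dx), axis=(0, 1))
    if flip and rng.random() < 0.5:
        shifted = shifted[:, ::-1, :]  # mirror left-right
    return shifted

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3)).astype("float32")  # toy stand-in for a CIFAR-10 image

augmented = random_shift_and_flip(image, rng)
print(augmented.shape)
```

For a 32x32 image, 10% corresponds to a shift of at most 3 pixels in each direction, so the object stays recognizable while the network sees a slightly different input each time.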

In [None]:
# data augmentation
datagen = ImageDataGenerator(
        rotation_range=0,  
        zoom_range = 0.0,  
        width_shift_range=0.1, 
        height_shift_range=0.1,
        horizontal_flip=False)

# data generators for the training and validation sets
batch_size_1 = 500
train_gen = datagen.flow(X_train, Y_train, batch_size=batch_size_1)
val_gen = datagen.flow(X_val, Y_val, batch_size=batch_size_1)

**Generated sample**

In [None]:
# visualizing augmented images
X_train_augmented = X_train[9,].reshape((1,32,32,3))
Y_train_augmented = np.array(Y_train.iloc[9,:]).reshape((1,10))
plt.figure(figsize=(15,4.5))
for i in range(30):  
    plt.subplot(3, 10, i+1)
    # the built-in next() works on the generator across Keras versions
    X_train2, Y_train2 = next(datagen.flow(X_train_augmented, Y_train_augmented))
    plt.imshow(X_train2[0].reshape((32,32,3)),cmap=plt.cm.gray)
    plt.axis('off')
    if i==9: X_train_augmented = X_train[2000,].reshape((1,32,32,3))
    if i==19: X_train_augmented = X_train[1180,].reshape((1,32,32,3))
plt.subplots_adjust(wspace=-0.1, hspace=-0.1)
plt.show()

**Train model**

In [None]:
# training parameters
epochs = 2
batch_size = 32

In [None]:
# Fit the model
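history = model.fit_generator(train_gen, 
                              epochs = epochs, 
                              # the generators yield batches of batch_size_1 samples,
                              # so the step counts are derived from batch_size_1
                              steps_per_epoch = X_train.shape[0] // batch_size_1,
                              validation_data = val_gen,
                              validation_steps = X_val.shape[0] // batch_size_1,
                              verbose=1)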

In [None]:
final_loss, final_acc = model.evaluate(X_val, Y_val, verbose=1)
print("Final loss: {0:.4f}, final accuracy: {1:.4f}".format(final_loss, final_acc))

### 4. Evaluate the model

**Training and validation curves**



In [None]:
# Plot the loss and accuracy curves for training and validation 
fig, ax = plt.subplots(2,1, figsize=(15, 5))
ax[0].plot(history.history['loss'], color='b', label="Training loss")
ax[0].plot(history.history['val_loss'], color='r', label="Validation loss")
legend = ax[0].legend(loc='best', shadow=True)

ax[1].plot(history.history['accuracy'], color='b', label="Training accuracy")
ax[1].plot(history.history['val_accuracy'], color='r',label="Validation accuracy")
legend = ax[1].legend(loc='best', shadow=True)

**Visualize Result**

In [None]:
# making predictions
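# argmax over the softmax outputs gives the predicted class index
# (predict_classes is not available in newer Keras versions)
prediction = np.argmax(model.predict(X_val), axis=-1)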

# PREVIEW PREDICTIONS
plt.figure(figsize=(20,8))
for i in range(40):  
    plt.subplot(4, 10, i+1)
    plt.imshow(X_val[i].reshape((32,32,3)),cmap=plt.cm.gray)
    plt.title(f"predict={Y.columns.values[prediction[i]]}")
    plt.axis('off')
plt.subplots_adjust(wspace=0.3, hspace=0.3)
plt.show()

### 5. Prediction and submission
**Predict and Submit results**

In [None]:
# prediction on test
# argmax over the softmax outputs gives the predicted class index
prediction = np.argmax(model.predict(test), axis=-1)


***Predicted samples distribution***

In [None]:
# map the predicted class indices to label names in one vectorized step
test_ids['label'] = Y.columns.values[prediction]
print(test_ids.head())
    

In [None]:
# keep the ids from sample_submission and attach the predicted labels
final_file = pd.merge(sample_submission[['id']], test_ids[['id', 'label']], how='inner', on='id')
final_file.to_csv('samplefile.csv', index=False)

In [None]:
# make submission
"""
for i in tqdm(range(sample_submission.shape[0])):
    if i<100000 and i>=0:
        sample_submission.loc[i, 'label'] = Y.columns.values[prediction_1[i]]
    elif i <200000 and i>=100000:
        sample_submission.loc[i, 'label'] = Y.columns.values[prediction_2[i-100000]]
    elif i<300000 and i >=200000:
        sample_submission.loc[i, 'label'] = Y.columns.values[prediction_3[i-200000]]
"""       
#sample_submission.to_csv('sampleSubmission.csv')
#sample_submissions.head(20)

In [None]:
# distribution of predicted classes
#sample_submission.label.value_counts().plot(kind='bar', title='Predicted class distribution', figsize=(15, 4.5))

### References: 

* https://keras.io/models/sequential/
* https://keras.io/layers/core/
* https://keras.io/layers/convolutional/
* https://keras.io/layers/pooling/
* https://www.kaggle.com/yassineghouzam/introduction-to-cnn-keras-0-997-top-6
* https://www.kaggle.com/toregil/welcome-to-deep-learning-cnn-99
* https://www.kaggle.com/kanncaa1/convolutional-neural-network-cnn-tutorial
* https://github.com/MazenAly/Cifar100

In [None]:
"""
# create model
model=Sequential()

#model.add(Lambda(standardize,input_shape=(28,28,1)))    
model.add(Conv2D(filters=32, kernel_size = (5,5), activation="relu", input_shape=(32,32,1)))
model.add(Conv2D(filters=64, kernel_size = (3,3), activation="relu"))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(BatchNormalization())
model.add(Conv2D(filters=64, kernel_size = (3,3), activation="relu"))
#model.add(Dropout(0.25))
model.add(Conv2D(filters=64, kernel_size = (3,3), activation="relu"))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(BatchNormalization())    
model.add(Conv2D(filters=128, kernel_size = (3,3), activation="relu"))

model.add(MaxPooling2D(pool_size=(2,2)))
    
model.add(Flatten())
model.add(BatchNormalization())
model.add(Dense(256,activation="relu"))
model.add(Dropout(0.25)) 
model.add(Dense(10,activation="softmax"))
"""