# CIFAR-10 convolutional neural network

## Exercise - Load data

> **Exercise**: Load the CIFAR-10 data. Normalize the images and split them into train, validation and test sets. Define a `get_batches(X, y, batch_size)` function to generate random X/y batches of size `batch_size` using a Python generator.

### Load and normalize data

The images are normalized the same way as in the previous unit: subtract 128 from each pixel value, then divide by 255.

In [4]:
import numpy as np
import os

CIFAR_NUMBER = 6  # Dataset size in thousands of images (file cifar10-6k.npz)

# Load data
with np.load(os.path.join('data', 'cifar10-{}k.npz'.format(CIFAR_NUMBER)), allow_pickle=False) as npz_file:
    cifar = dict(npz_file.items())
    
# Convert pixels into floating point numbers
data = cifar['data'].astype(np.float32)

# Rescale pixel values from [0, 255] to roughly [-0.5, 0.5]
data = (data - 128) / 255
print("CIFAR10 data shape:", data.shape)

CIFAR10 data shape: (6000, 3072)
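As a quick sanity check on the normalization, the sketch below applies the same formula to every possible 8-bit pixel value. The `pixels` array is a hypothetical stand-in for the real data, used only to show that the resulting range is approximately, not exactly, -0.5 to 0.5 (about -0.502 to 0.498).

```python
import numpy as np

# Hypothetical stand-in for raw image data: every 8-bit pixel value 0..255
pixels = np.arange(256, dtype=np.float32)

# Same transformation as in the cell above
scaled = (pixels - 128) / 255

# The range is slightly asymmetric because 128 is not the exact midpoint of 0..255
print("min:", scaled.min())
print("max:", scaled.max())
```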


### Create train and validation sets

In [7]:
from sklearn.model_selection import train_test_split

img_width = 32
img_height = 32
img_nb_color = 3

# Split into train and validation sets
X_train, X_valid, y_train, y_valid = train_test_split(
    # Reshape flat vectors into 32x32 images with 3 color channels (RGB)
    data.reshape(-1, img_width, img_height, img_nb_color),
    cifar['labels'],
    test_size=500, random_state=0
)

# Print shape
print('Train:', X_train.shape, y_train.shape)
print('Valid:', X_valid.shape, y_valid.shape)

Train: (5500, 32, 32, 3) (5500,)
Valid: (500, 32, 32, 3) (500,)
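The exercise also asks for a test set, which the split above does not create. One possible approach, sketched here on synthetic data, is to call `train_test_split` twice: first to hold out the test set, then to carve the validation set out of what remains. The array sizes, the split sizes and the variable names (`X_tr`, `X_va`, `X_te`) are assumptions chosen for illustration, not values from the notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the normalized images and labels (hypothetical sizes)
rng = np.random.RandomState(0)
X = rng.uniform(-0.5, 0.5, size=(600, 32, 32, 3)).astype(np.float32)
y = rng.randint(0, 10, size=600)

# First split: hold out a test set that is never used during training
X_rest, X_te, y_rest, y_te = train_test_split(
    X, y, test_size=50, random_state=0
)

# Second split: carve the validation set out of the remaining data
X_tr, X_va, y_tr, y_va = train_test_split(
    X_rest, y_rest, test_size=50, random_state=0
)

print('Train:', X_tr.shape, y_tr.shape)
print('Valid:', X_va.shape, y_va.shape)
print('Test: ', X_te.shape, y_te.shape)
```

With integer `test_size` values, each call removes exactly that many samples, so the three sets are disjoint and together cover the whole dataset.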


### Define batch generator function

In [9]:
# Batch generator
def get_batches(X, y, batch_size):
    # Shuffle the sample indexes (X and y are indexed with the same permutation)
    shuffled_idx = np.arange(len(y)) # 0,1,...,n-1
    np.random.shuffle(shuffled_idx)
    
    # Enumerate indexes by steps of batch_size
    # i: 0, b, 2b, 3b, 4b, .. where b is the batch size
    for i in range(0, len(y), batch_size):
        # Batch indexes
        batch_idx = shuffled_idx[i:i+batch_size]
        yield X[batch_idx], y[batch_idx]
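To see how the generator behaves, the sketch below runs it on a small toy dataset. The function is repeated so that the snippet runs on its own; `X_toy` and `y_toy` are hypothetical arrays built so that each row can be matched back to its label. It illustrates two properties: the last batch is smaller when `batch_size` does not divide the number of samples, and every sample appears exactly once per pass.

```python
import numpy as np

def get_batches(X, y, batch_size):
    # Same generator as above, repeated so this snippet runs on its own
    shuffled_idx = np.arange(len(y))
    np.random.shuffle(shuffled_idx)
    for i in range(0, len(y), batch_size):
        batch_idx = shuffled_idx[i:i+batch_size]
        yield X[batch_idx], y[batch_idx]

# Toy data: 10 samples with 2 features each; row i starts with the value 2*i
X_toy = np.arange(20).reshape(10, 2)
y_toy = np.arange(10)

batch_sizes = []
seen_labels = []
for X_batch, y_batch in get_batches(X_toy, y_toy, batch_size=4):
    batch_sizes.append(len(y_batch))
    seen_labels.extend(y_batch.tolist())
    # Rows and labels stay aligned after shuffling
    assert np.array_equal(X_batch[:, 0], 2 * y_batch)

print(batch_sizes)          # [4, 4, 2]
print(sorted(seen_labels))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```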

## Exercise - Create and train a ConvNet

> **Exercise:** Create a convolutional neural network and train it using your batch generator. Evaluate the accuracy on the validation set after each epoch. Test different architectures and parameters. Evaluate your best network on the test set. Save the trained kernel weights of the first convolutional layer in a variable.

In [None]:
???
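Whatever architecture is chosen, the training code usually follows the same pattern: loop over epochs, loop over batches from `get_batches`, update the weights, then measure validation accuracy at the end of each epoch. The sketch below shows only that loop structure. It is not a ConvNet: a plain softmax (linear) classifier written in NumPy stands in for the network, and the data is synthetic. All names, sizes and hyperparameters are assumptions for illustration.

```python
import numpy as np

def get_batches(X, y, batch_size):
    # Same generator as in the first exercise, repeated for self-containment
    shuffled_idx = np.arange(len(y))
    np.random.shuffle(shuffled_idx)
    for i in range(0, len(y), batch_size):
        batch_idx = shuffled_idx[i:i+batch_size]
        yield X[batch_idx], y[batch_idx]

def softmax(z):
    # Subtract the row maximum for numerical stability
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def accuracy(W, b, X, y):
    return np.mean(np.argmax(X @ W + b, axis=1) == y)

# Synthetic, linearly separable stand-in data (not CIFAR-10)
np.random.seed(0)
n_features, n_classes = 20, 3
true_W = np.random.randn(n_features, n_classes)
X_all = np.random.randn(600, n_features)
y_all = np.argmax(X_all @ true_W, axis=1)
X_tr, y_tr = X_all[:500], y_all[:500]
X_va, y_va = X_all[500:], y_all[500:]

# Stand-in model parameters (a ConvNet would have kernels and biases instead)
W = np.zeros((n_features, n_classes))
b = np.zeros(n_classes)
learning_rate = 0.5
valid_acc = []

for epoch in range(20):
    # One pass over the training set in random batches
    for X_batch, y_batch in get_batches(X_tr, y_tr, batch_size=50):
        grad_logits = softmax(X_batch @ W + b)
        # Gradient of the cross-entropy loss w.r.t. the logits: probs - one_hot
        grad_logits[np.arange(len(y_batch)), y_batch] -= 1
        W -= learning_rate * X_batch.T @ grad_logits / len(y_batch)
        b -= learning_rate * grad_logits.mean(axis=0)

    # Evaluate on the validation set after each epoch
    valid_acc.append(accuracy(W, b, X_va, y_va))
    print('Epoch {}: validation accuracy {:.3f}'.format(epoch + 1, valid_acc[-1]))
```

Keeping the per-epoch validation scores in a list makes it easy to compare architectures and to spot overfitting when the validation accuracy stops improving.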

## Exercise - Visualize kernels

> **Exercise**: Plot the kernels from the first convolutional layer with the `imshow()` function.

**Hint**: Remember that the `imshow()` function expects values between 0 and 1 for 3-dimensional arrays.

In [None]:
???
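Trained kernel weights are typically small values centered around zero, so they need to be rescaled before `imshow()` can display them as RGB images. The sketch below shows one way to do this with a min-max rescaling applied to each kernel separately. The kernel array is random stand-in data, and its layout `(height, width, input channels, number of kernels)` is an assumption that should be adapted to how the weights were actually saved. The helper names `rescale_to_unit` and `plot_kernels` are hypothetical.

```python
import numpy as np

def rescale_to_unit(kernel):
    """Linearly map the values of one kernel to the [0, 1] range."""
    k_min, k_max = kernel.min(), kernel.max()
    if k_max == k_min:
        # Constant kernel: avoid dividing by zero
        return np.zeros_like(kernel)
    return (kernel - k_min) / (k_max - k_min)

def plot_kernels(kernels, n_cols=8):
    """Plot each kernel as a small RGB image on a grid."""
    # Imported here so that the rescaling helper works without matplotlib
    import matplotlib.pyplot as plt

    n_kernels = kernels.shape[-1]
    n_rows = int(np.ceil(n_kernels / n_cols))
    fig, axes = plt.subplots(n_rows, n_cols, figsize=(n_cols, n_rows), squeeze=False)
    for i, ax in enumerate(axes.flatten()):
        ax.axis('off')
        if i < n_kernels:
            ax.imshow(rescale_to_unit(kernels[:, :, :, i]))
    return fig

# Random stand-in for trained weights: sixteen 5x5 kernels over 3 color channels
rng = np.random.RandomState(0)
kernels = rng.normal(scale=0.1, size=(5, 5, 3, 16))

first = rescale_to_unit(kernels[:, :, :, 0])
print(first.shape, first.min(), first.max())
```

Calling `plot_kernels(kernels)` followed by `plt.show()` displays the grid. Rescaling each kernel on its own maximizes the contrast of every image, while rescaling all kernels with a shared minimum and maximum would keep their relative magnitudes comparable.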