# Image Classification with Keras

**XBUS-512: Introduction to AI and Deep Learning**

In this exercise, we will build an image classifier with Keras.

We will be using the [CIFAR-10 dataset](http://www.cs.toronto.edu/~kriz/cifar.html), which consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.

The dataset is divided into five training batches and one test batch, each with 10000 images. The test batch contains exactly 1000 randomly-selected images from each class. The training batches contain the remaining images in random order, but some training batches may contain more images from one class than another. Between them, the training batches contain exactly 5000 images from each class.
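
The class balance described above can be checked once a batch's labels are available. The following is a minimal sketch that tallies labels per class; `count_per_class` and `demo_labels` are hypothetical names, and the label list is made up for illustration rather than taken from CIFAR-10.

```python
from collections import Counter


def count_per_class(labels):
    """Return a dict mapping each class id to the number of times it appears."""
    return dict(Counter(labels))


# Made-up labels for illustration only; a real batch holds 10000 labels in 0-9.
demo_labels = [0, 1, 1, 2, 2, 2]
demo_counts = count_per_class(demo_labels)
print(demo_counts)
```

Applied to a single training batch, the counts may differ slightly between classes. Applied to the labels of all five training batches together, each class should total 5000.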


## Imports

In [19]:
import os
import pickle
import tarfile
import requests

## Download and Wrangle the Data

In [16]:
def fetch_data(url, fname):
    """
    Helper method to retrieve the data from `url` and save it to `fname`.
    """
    response = requests.get(url)
    response.raise_for_status()  # fail loudly instead of saving an HTTP error page
    outpath = os.path.abspath(fname)
    with open(outpath, "wb") as f:
        f.write(response.content)

    return outpath

In [23]:
# Fetch the compressed archive (it is extracted in the next cell)

URL = "http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz"
TARRED_FILE_PATH = "cifar-10-python.tar.gz"
UNTARRED_DIR = "cifar-10-python"
# Create the target directory if it does not already exist
os.makedirs(os.path.join("..", "fixtures", UNTARRED_DIR), exist_ok=True)

tarred_files = fetch_data(URL, os.path.join("..", "fixtures", TARRED_FILE_PATH))

In [25]:
# The context manager closes the archive even if extraction fails
with tarfile.open(tarred_files) as tar:
    tar.extractall(path=os.path.join("..", "fixtures", UNTARRED_DIR))
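
Before extracting, it can be useful to look at what an archive contains. The following is a minimal sketch that lists member names without unpacking anything; `list_archive_members` is a hypothetical helper, and the demonstration runs on a tiny throwaway archive built in a temporary directory, not on the real CIFAR-10 download.

```python
import os
import tarfile
import tempfile


def list_archive_members(archive_path):
    """Return the member names of a tar archive without extracting it."""
    with tarfile.open(archive_path) as tar:
        return tar.getnames()


# Build a one-file demo archive so the example runs without any download.
with tempfile.TemporaryDirectory() as tmp:
    member = os.path.join(tmp, "data_batch_demo")
    with open(member, "wb") as f:
        f.write(b"placeholder")

    demo_archive = os.path.join(tmp, "demo.tar.gz")
    with tarfile.open(demo_archive, "w:gz") as tar:
        tar.add(member, arcname="demo-batches/data_batch_demo")

    demo_names = list_archive_members(demo_archive)

print(demo_names)
```

The same helper could be pointed at `tarred_files` to confirm which directory and batch files the extraction will create.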

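The archive extracts to a `cifar-10-batches-py` directory that contains the files `data_batch_1`, `data_batch_2`, ..., `data_batch_5`, as well as `test_batch`.
Each of these files is a Python "pickled" object produced with cPickle, so it has to be unpickled to be read as a dictionary. Because the files were pickled under Python 2, they are loaded with `encoding="bytes"`, which means the dictionary keys come back as `bytes` objects such as `b'data'` and `b'labels'`.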

In [26]:
def unpickle(file):
    with open(file, "rb") as fo:
        return pickle.load(fo, encoding="bytes")
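
Since `unpickle` passes `encoding="bytes"` to `pickle.load`, the keys of the returned dictionary are `bytes` rather than `str`. If string keys are preferred, a small helper along these lines could convert them. This is a sketch only: `decode_keys` is a hypothetical name introduced here, and `demo_batch` is a stand-in dictionary with placeholder values.

```python
def decode_keys(batch):
    """Return a copy of `batch` with any bytes keys decoded to str (values untouched)."""
    return {
        key.decode("utf-8") if isinstance(key, bytes) else key: value
        for key, value in batch.items()
    }


# Stand-in dictionary shaped like an unpickled batch (values are placeholders).
demo_batch = {b"data": [[0, 1, 2]], b"labels": [3]}
decoded = decode_keys(demo_batch)
print(sorted(decoded))
```

The rest of this exercise keeps the `bytes` keys, so the helper is optional.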

In [43]:
train = unpickle(os.path.join(
    "..", 
    "fixtures", 
    UNTARRED_DIR, 
    "cifar-10-batches-py",
    "data_batch_1"
))

test = unpickle(os.path.join(
    "..", 
    "fixtures", 
    UNTARRED_DIR, 
    "cifar-10-batches-py",
    "test_batch"
))

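# Each row of b'data' is one flattened 32x32 colour image (3072 values);
# b'labels' holds the matching class ids (0-9).
X_train = train[b'data']
y_train = train[b'labels']

X_test = test[b'data']
y_test = test[b'labels']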
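
The cell above loads only `data_batch_1`, and each image is still a flat row of 3072 values (all red values, then green, then blue). The following is a minimal sketch of how several unpickled batches could be stacked and reshaped into the channels-last layout that Keras uses by default. `batches_to_arrays` is a hypothetical helper, and the demonstration uses random stand-in data so that it runs without the download.

```python
import numpy as np


def batches_to_arrays(batches):
    """Stack unpickled CIFAR-style batches into image and label arrays.

    Each batch is a dict with b'data' (N x 3072 uint8 rows, stored as all red
    values, then green, then blue) and b'labels' (N class ids).
    """
    data = np.concatenate([np.asarray(b[b"data"]) for b in batches])
    labels = np.concatenate([np.asarray(b[b"labels"]) for b in batches])
    # Rows are channel-first (3, 32, 32); move channels last to get (32, 32, 3).
    images = data.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)
    # Scale pixel values from 0-255 to 0.0-1.0.
    return images.astype("float32") / 255.0, labels


# Synthetic stand-in batches (random pixels), not real CIFAR-10 data.
rng = np.random.default_rng(0)
fake_batches = [
    {
        b"data": rng.integers(0, 256, size=(4, 3072), dtype=np.uint8),
        b"labels": [0, 1, 2, 3],
    }
    for _ in range(2)
]
images, labels = batches_to_arrays(fake_batches)
print(images.shape, labels.shape)
```

With the real data, the same function could be called on a list built by applying `unpickle` to each of `data_batch_1` through `data_batch_5`, and separately on `[test]`.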