# Image Classification: Dogs and Cats
This notebook trains a model that classifies an image as a cat or a dog, using 25,000 labeled images.


Below we will do the following:

1. Set up the training environment.
2. Load 25,000 cat and dog images.
3. Build a classification model to predict whether an image is a cat or a dog.
4. Convert the model to CoreML and download it.

The example is based on [Turi Create's Image Classifier](https://apple.github.io/turicreate/docs/userguide/image_classifier/).

## Environment Setup
Below we confirm that `CUDA 10` is installed and then use pip to install the `turicreate` and `mxnet-cu100` libraries.

In [0]:
# Confirm that you have CUDA 10
!nvcc --version
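
If you would rather check the CUDA release programmatically than read the `nvcc` output by eye, something like the following could work. It is a minimal sketch: the helper names `parse_cuda_release` and `installed_cuda_release` are made up for illustration, and it assumes the output contains a `release X.Y` token, as `nvcc --version` normally prints.

```python
import re
import subprocess


def parse_cuda_release(nvcc_output):
    """Return the CUDA release (e.g. '10.0') found in `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def installed_cuda_release():
    """Run `nvcc --version` and return the parsed release, or None if nvcc cannot be run."""
    try:
        result = subprocess.run(
            ["nvcc", "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
    except OSError:
        # nvcc is not on the PATH (or cannot be executed)
        return None
    return parse_cuda_release(result.stdout)
```

In a notebook cell, `installed_cuda_release()` should then return a string starting with `10.` on a CUDA 10 runtime, and `None` if `nvcc` is missing.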

In [0]:
# Install libraries - you might need to restart the runtime after doing this
!pip install turicreate==5.4
# The wrong version of mxnet is installed as a dependency, so uninstall it
!pip uninstall -y mxnet
# Install CUDA10-compatible version of mxnet 1.1.0
!pip install mxnet-cu100

## Data Preparation and Model Training
The training data for this example consists of 25,000 images: 12,500 cats and 12,500 dogs. The original data set is available [from Microsoft](https://www.microsoft.com/en-us/download/details.aspx?id=54765), and a copy is included in the public bucket listed below.

After the archive is downloaded and extracted, the images are loaded into a Turi Create SFrame, and a label is created for each image based on its path. The data is then randomly split into train and test sets: 80% is used for training and the remaining 20% is held out for optional model evaluation. Training this model on a GPU is much faster than on a CPU. By default, this runtime environment should be using a Python 3 GPU backend instance. Below, we tell Turi Create to use all available GPUs for processing.

In [0]:
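# Import libraries and tell Turi Create to use all available GPUs - this may throw a warning
# (urllib.request must be imported explicitly; a bare `import urllib` does not guarantee it is loaded)
import urllib.request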
import tarfile

import coremltools
import turicreate as tc
tc.config.set_num_gpus(-1)

In [0]:
# Specify the data set download url
data_url = "https://s3.amazonaws.com/skafos.example.data/ImageClassifier/PetImages.tar.gz"
data_path = "PetImages.tar.gz"

# Pull the compressed data and extract it
urllib.request.urlretrieve(data_url, data_path)
with tarfile.open(data_path) as tar:
    tar.extractall()
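
Re-running the cell above downloads the archive again each time. One way to avoid that is to skip the download when the file is already present. This is a sketch only: `fetch_once` is a hypothetical helper, not part of the original notebook, and its `fetch` parameter exists just so the download function can be swapped out.

```python
import os
import urllib.request


def fetch_once(url, dest, fetch=urllib.request.urlretrieve):
    """Download `url` to `dest` unless a non-empty file is already there.

    Returns True if a download happened, False if the existing file was reused.
    """
    if os.path.isfile(dest) and os.path.getsize(dest) > 0:
        return False
    fetch(url, dest)
    return True
```

In this notebook it would be called as `fetch_once(data_url, data_path)`. Note that it only checks that the file exists and is non-empty, so a partially downloaded archive would be reused; delete the file and run the cell again if extraction fails.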

In [0]:
# Load images (Note: you can ignore 'Unexpected JPEG decode failure' errors)
data = tc.image_analysis.load_images('PetImages', with_path=True, ignore_failure=True)

# From the path name, create a label column. This labels each image as either a dog or a cat
data['label'] = data['path'].apply(lambda path: 'dog' if '/Dog' in path else 'cat')

# Randomly split the data into train (80%) and test (20%) sets
train_data, test_data = data.random_split(0.8)
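
The labeling rule above treats every path that does not contain `/Dog` as a cat. A stricter variant could label each image by the folder that directly contains it and fail loudly on anything unexpected. The sketch below uses a hypothetical helper, `label_from_path`, and assumes the images sit in folders named `Dog` and `Cat`; the `Cat` folder name is an assumption, since the notebook itself only checks for `/Dog`.

```python
import posixpath


def label_from_path(path):
    """Label an image by the name of the folder that directly contains it."""
    folder = posixpath.basename(posixpath.dirname(path)).lower()
    if folder in ("dog", "cat"):
        return folder
    # Refuse to guess: an unknown folder is more likely a data problem than a cat
    raise ValueError("cannot infer a label from path: " + path)
```

It would replace the lambda as `data['path'].apply(label_from_path)`. Separately, if you want the same split on every run, `random_split` also accepts a `seed` argument, for example `data.random_split(0.8, seed=42)`; the value 42 is arbitrary.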

In [0]:
train_data.head()

In [0]:
# Train an image classification model - consider increasing max_iterations
model = tc.image_classifier.create(
    dataset=train_data,
    target='label',
    model='resnet-50',
    batch_size=32,
    max_iterations=30
)

# Image Classification Training Docs:
# https://apple.github.io/turicreate/docs/api/generated/turicreate.image_classifier.create.html#turicreate.image_classifier.create
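
The held-out `test_data` is not used anywhere else in this notebook. If you want a rough measure of how well the model generalizes, you could evaluate it on that split. This is a minimal sketch, assuming `model.evaluate(...)` returns a dictionary with an `accuracy` entry, as the Turi Create image classifier documentation describes; `holdout_accuracy` is a hypothetical helper name.

```python
def holdout_accuracy(model, holdout):
    """Evaluate `model` on `holdout` and return the accuracy as a float.

    Assumes `model.evaluate(holdout)` returns a dict with an 'accuracy' key.
    """
    metrics = model.evaluate(holdout)
    return float(metrics["accuracy"])
```

After training, `print("Test accuracy: {:.1%}".format(holdout_accuracy(model, test_data)))` would report the share of held-out images classified correctly.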

## Model Export and Download
We convert the model to CoreML format so that it can run on an iOS device. Then we download it locally so it can be delivered to your apps with **[Skafos](https://skafos.ai)**.

In [0]:
# Specify the CoreML model name
model_name = 'ImageClassifier'
coreml_model_name = model_name + '.mlmodel'

# Export the trained model to CoreML format
model.export_coreml(coreml_model_name)


In [0]:
# Download the model you just trained. This may take a few moments and may throw an exception,
# but the file should still download.
from google.colab import files
files.download(coreml_model_name)

# If it fails to download, or downloads a corrupt file...
# Use the file explorer to download the .mlmodel file manually
# --> On the upper left side, expand the window, click the "Files" tab, right-click the file and select "download"
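
Since the download can occasionally produce a corrupt file, one way to check the copy on your machine is to compare checksums. The sketch below computes a SHA-256 digest with the standard library; `sha256_of_file` is a hypothetical helper, not something the notebook or Colab provides.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in fixed-size chunks so large model files are not loaded into memory at once
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Run `print(sha256_of_file(coreml_model_name))` in the notebook, then compute the digest of the downloaded file locally (for example with `shasum -a 256 ImageClassifier.mlmodel` on macOS). If the two values differ, download the file again.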