# Image Classification: More Pets
Trains a model to classify an image as a rabbit, mouse, hamster, fish, lizard, or snake.

Below we do the following:

1. Set up the training environment.
2. Load images of rabbits, mice, hamsters, fish, lizards, and snakes.
3. Train an image classifier model.
4. Convert the model to CoreML format and download it.

## Environment Setup

Below we confirm that `CUDA 10` is installed and then use pip to install the `turicreate` and `mxnet-cu100` libraries.

In [0]:
# Confirm that you have CUDA 10
!nvcc --version

In [0]:
# Install libraries - you will need to restart the runtime after doing this
!pip install turicreate
# The turicreate install pulls in a version of mxnet that does not match CUDA 10, so remove it
!pip uninstall -y mxnet
# Install the CUDA 10-compatible build of mxnet (mxnet-cu100)
!pip install mxnet-cu100

## Data Preparation and Model Training
The training data for this example are hundreds of images of various animals, pulled from the [Open Images Dataset v4](https://storage.googleapis.com/openimages/web/index.html). 

After unzipping and extracting the images, they are loaded into a Turi Create SFrame, and a label is created for each image from its directory path. The data is randomly split into train and test sets: roughly 80% is used for training and the remaining 20% is held out for optional model evaluation. Training this model on a GPU is much faster than on a CPU. By default, this runtime environment should be using a Python 3 GPU backend instance, and Turi Create can be configured to use all available GPUs for processing.
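
The GPU setting described above can be sketched as follows. `tc.config.set_num_gpus(-1)` asks Turi Create to use every GPU it can find. The `use_all_gpus` wrapper is a hypothetical helper, added here only so the cell also runs in a runtime where `turicreate` has not been installed yet.

```python
def use_all_gpus():
    """Ask Turi Create to use every available GPU.

    Returns True if the setting was applied, False if turicreate
    is not installed in this runtime.
    """
    try:
        import turicreate as tc
    except ImportError:
        return False
    # -1 means "use all available GPUs"; 0 would force CPU-only training
    tc.config.set_num_gpus(-1)
    return True


gpus_configured = use_all_gpus()
print("GPU setting applied:", gpus_configured)
```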

In [0]:
# Import necessary libraries - this may throw warnings
import os
import urllib.request
import tarfile

import coremltools
import turicreate as tc

In [0]:
# Specify the data set download url
data_url = "https://s3.amazonaws.com/skafos.example.data/ImageClassifier/MorePets.tar.gz"
data_path = "MorePets.tar.gz"

# Pull the compressed data and extract it
urllib.request.urlretrieve(data_url, data_path)
with tarfile.open(data_path) as tar:
    tar.extractall()

In [0]:
# Load images - you can ignore various jpeg decode errors
data = tc.image_analysis.load_images('MorePets', with_path=True, ignore_failure=True)

# From the path name, create a label column. This labels each image as the appropriate animal
data['label'] = data['path'].apply(lambda path: os.path.basename(os.path.dirname(path)))
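
To see what the labeling lambda does, here is a minimal standalone sketch that uses plain Python lists instead of an SFrame. The example paths and folder names are hypothetical and assume the archive extracts to one subfolder per animal. `label_from_path` is an illustrative name for the same expression used in the cell above.

```python
import os
from collections import Counter

# Hypothetical paths, shaped like the ones load_images(..., with_path=True) returns
example_paths = [
    "MorePets/rabbit/0001.jpg",
    "MorePets/rabbit/0002.jpg",
    "MorePets/snake/0001.jpg",
]


def label_from_path(path):
    # dirname drops the file name; basename keeps the last remaining folder
    return os.path.basename(os.path.dirname(path))


example_labels = [label_from_path(p) for p in example_paths]
label_counts = Counter(example_labels)
print(example_labels)
print(label_counts)
```

Counting labels this way is a quick check that every class folder was picked up and that no class is badly underrepresented.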

In [0]:
# Make a train-test split (80% train, 20% test)
train_data, test_data = data.random_split(0.8)
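
`random_split` assigns rows at random, so the 80/20 proportions are approximate and change between runs unless a seed is passed, for example `data.random_split(0.8, seed=42)` (the seed value is arbitrary). The sketch below illustrates the idea in plain Python. It is not Turi Create's implementation, and the `random_split` function here is a hypothetical stand-in.

```python
import random


def random_split(rows, fraction, seed=None):
    """Send each row to the first list with probability `fraction`."""
    rng = random.Random(seed)
    first, second = [], []
    for row in rows:
        (first if rng.random() < fraction else second).append(row)
    return first, second


rows = list(range(1000))  # stand-in for 1000 image rows
train_rows, test_rows = random_split(rows, 0.8, seed=42)
# Sizes are close to 800 and 200, but not exactly those values
print(len(train_rows), len(test_rows))
```

Fixing the seed makes the reported test accuracy comparable across reruns of the notebook.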

In [0]:
# Let's take a look at some training data
train_data.head()

In [0]:
# Train an image classification model - consider increasing max_iterations
model = tc.image_classifier.create(
    dataset=train_data,
    target='label',
    model='resnet-50',
    batch_size=32,
    max_iterations=20
)

# Image Classification Training Docs:
# https://apple.github.io/turicreate/docs/api/generated/turicreate.image_classifier.create.html#turicreate.image_classifier.create

## Model Evaluation

In [0]:
# Let's see how the model performs on the hold-out test data
predictions = model.predict(test_data)
accuracy = tc.evaluation.accuracy(test_data['label'], predictions)
print(f"Image classifier is {accuracy*100:.2f}% accurate on the testing dataset", flush=True)
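
Overall accuracy is the fraction of test images whose predicted label matches the true label. It can hide a class the model gets consistently wrong, so a per-class breakdown is a useful companion. The following is a minimal sketch in plain Python. `accuracy_report` is a hypothetical helper and the labels are made up for illustration.

```python
from collections import Counter


def accuracy_report(targets, predictions):
    """Return overall accuracy and per-class accuracy."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    total = Counter(targets)
    correct = Counter(t for t, p in zip(targets, predictions) if t == p)
    overall = sum(correct.values()) / len(targets)
    per_class = {label: correct[label] / n for label, n in total.items()}
    return overall, per_class


# Hypothetical labels, for illustration only
targets = ["rabbit", "rabbit", "snake", "fish"]
predicted = ["rabbit", "mouse", "snake", "fish"]
overall, per_class = accuracy_report(targets, predicted)
print(overall)    # 0.75
print(per_class)  # {'rabbit': 0.5, 'snake': 1.0, 'fish': 1.0}
```

In this notebook the same helper could be called as `accuracy_report(list(test_data['label']), list(predictions))`.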

## Model Export and Download
We convert the model to CoreML format so that it can run on an iOS device. Then we download it locally so it can be delivered to your apps with Skafos.

In [0]:
# Specify the CoreML model name
model_name = 'ImageClassifier'
coreml_model_name = model_name + '.mlmodel'

# Export the trained model to CoreML format
res = model.export_coreml(coreml_model_name)

In [0]:
# Download the model you just trained. This may take a few moments and may throw an exception. It should still download.
from google.colab import files
files.download(coreml_model_name)

# If it fails to download, or downloads a corrupt file...
# Use the file explorer to download the .mlmodel file manually
# --> On the upper left side, expand the window, click the "Files" tab, right-click the file and select "download"
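
Since the download can fail or produce a corrupt file, one way to check the result is to record the file size and a SHA-256 checksum in the runtime and compare them with the downloaded copy. The sketch below uses a hypothetical `file_fingerprint` helper and demonstrates it on a small throwaway file. In the notebook you would pass `coreml_model_name` instead.

```python
import hashlib
import os
import tempfile


def file_fingerprint(path, chunk_size=1 << 20):
    """Return (size_in_bytes, sha256_hex) for the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large model files do not need to fit in memory
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()


# Throwaway file standing in for the exported .mlmodel
with tempfile.NamedTemporaryFile(delete=False, suffix=".mlmodel") as tmp:
    tmp.write(b"example bytes")
size, sha256_hex = file_fingerprint(tmp.name)
os.remove(tmp.name)
print(size, sha256_hex)
```

On macOS, `shasum -a 256 ImageClassifier.mlmodel` prints the checksum of the downloaded file for comparison.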