# Producing CNTK and TensorFlow models for image classification

In this notebook, we illustrate how to retrain pretrained convolutional neural networks to classify aerial images by land use type (developed, forested, cultivated, etc.). We apply transfer learning with Microsoft Cognitive Toolkit (CNTK) and TensorFlow (TF), adapting a pretrained AlexNet and a pretrained residual network (ResNet), respectively, to our classification use case.

This notebook is part of the [Embarrassingly Parallel Image Classification](https://github.com/Azure/Embarrassingly-Parallel-Image-Classification) git repository. It assumes that a dataset and Azure N-series GPU VM have already been created for model training as described in the previous [Image Set Preparation](https://github.com/Azure/Embarrassingly-Parallel-Image-Classification/blob/master/image_set_preparation.ipynb) notebook. For instructions on applying the trained models to large image sets using Spark, see the [Scoring on Spark](https://github.com/Azure/Embarrassingly-Parallel-Image-Classification/blob/master/scoring_on_spark.ipynb) notebook.

## Outline
- [Prepare the VM and training data](#input)
- [Clone or download this repository](#repo)
- [Retrain an AlexNet with Microsoft Cognitive Toolkit (CNTK)](#cntk)
   - [Install CNTK 2.0 RC1 for GPU](#installcntk)
   - [Download the pretrained model](#alexnet)
   - [Update and run the training script](#cntkrun)
- [Retrain a pretrained ResNet with TensorFlow](#tensorflow)
   - [Download a pretrained model](#tfmodel)
   - [Run the training script](#tfrun)
- [Next Steps](#nextsteps)

<a name="input"></a>
## Prepare the VM and training data

If you have not already done so, complete the instructions in the [Image Set Preparation](https://github.com/Azure/Embarrassingly-Parallel-Image-Classification/blob/master/image_set_preparation.ipynb) notebook to prepare an Azure Data Science VM with the Deep Learning Toolkit and the training data needed for this tutorial. If you use our provided training and validation images, it is sufficient to complete the "Prepare an Azure Data Science Virtual Machine for image extraction" and "Dataset preparation for deep learning" sections.

<a name="repo"></a>
## Clone or download this repository

This repository ([Embarrassingly Parallel Image Classification](https://github.com/Azure/Embarrassingly-Parallel-Image-Classification)) contains Python scripts that will be referenced by the code cells below. Clone or download/decompress the repository's contents to a directory on your Azure GPU VM and make note of the path.

<a name="cntk"></a>
## Retrain an AlexNet with Microsoft Cognitive Toolkit (CNTK)
<a name="installcntk"></a>
### Install CNTK 2.0 RC1 for GPU

As of this writing, the Deep Learning Toolkit does not include the most recent version of CNTK. To use the training script as written, you will need to download and install [CNTK 2.0 RC1 for GPU](https://github.com/Microsoft/CNTK/releases/tag/v2.0.rc1). Installation directions can be found [online](https://github.com/Microsoft/CNTK/wiki/Setup-Windows-Binary-Script). We recommend following the instructions for a "script-based" installation, and calling the installation script with the following optional parameters:
```
install.bat -AnacondaBasePath C:\Anaconda -PyVersion 35 -Execute
```

<a name="alexnet"></a>
### Download the pretrained model
You will need to download [the pretrained AlexNet model](https://mawahstorage.blob.core.windows.net/aerialimageclassification/models/AlexNet_cntk2beta15.model) and save the file to a new directory on your temporary storage drive, `D:\models`.
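
The download can also be scripted, mirroring the `urllib` approach this notebook uses for the TensorFlow model. The snippet below is a minimal sketch: the helper names `model_destination` and `download_model` are our own (hypothetical, not part of the repository), and the target directory is whatever you pass in, for example the `D:\models` location suggested above.

```python
import os
import urllib.request
from urllib.parse import urlparse

# URL of the pretrained AlexNet model linked above
MODEL_URL = ('https://mawahstorage.blob.core.windows.net/'
             'aerialimageclassification/models/AlexNet_cntk2beta15.model')


def model_destination(url, model_dir):
    '''Return the local path where the file at `url` would be saved.'''
    filename = urlparse(url).path.rsplit('/', 1)[-1]
    return os.path.join(model_dir, filename)


def download_model(url, model_dir):
    '''Create `model_dir` if needed and download the model file into it.'''
    os.makedirs(model_dir, exist_ok=True)
    destination = model_destination(url, model_dir)
    urllib.request.urlretrieve(url, destination)
    return destination
```

Calling `download_model(MODEL_URL, 'D:\\models')` from a notebook cell on the VM should save the file under `D:\models` with its original file name.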

<a name="cntkrun"></a>
### Update and run the training script
The `retrain.py` script in the `cntk` subfolder of this repo can be used to retrain an AlexNet for aerial image classification. The script is adapted from the [Object Detection using Fast-R-CNN](https://github.com/Microsoft/CNTK/tree/master/Examples/Image/Detection/FastRCNN) example in the [CNTK](https://github.com/Microsoft/CNTK) repository. If training on a multi-GPU VM, see the [CNTK ResNet/CIFAR10 image classification](https://github.com/Microsoft/CNTK/tree/master/Examples/Image/Classification/ResNet/Python) use case for example code illustrating distributed training.

Input, output, and model locations are hardcoded in `retrain.py`. If you used our default locations, no edits should be necessary. If you need to edit or view the script for any reason, we recommend using either WordPad or Visual Studio (both of which are pre-installed on the VM).

Run the `retrain.py` script in the `cntk` subfolder from an Anaconda prompt.

The training script will load the pretrained AlexNet model, removing the final layer and freezing the weights in all retained layers. A transfer learning model is then created by subtracting an approximate mean value from the RGB channels of the input image, applying the frozen retained layers of AlexNet, and finally applying a dense, trainable last layer. The transfer learning model's output label is taken to be the index of the maximally-activated node in the final layer.
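
The forward pass described above can be sketched in plain Python. This is a toy illustration, not the CNTK API and not the code in `retrain.py`: the function names, the tiny weight matrices, and the mean value of `114.0` are all assumptions chosen for readability.

```python
def subtract_mean(image, approx_mean):
    '''Subtract one approximate mean value from every channel value.'''
    return [value - approx_mean for value in image]


def frozen_features(image, frozen_weights):
    '''Stand-in for the retained pretrained layers: fixed linear map plus ReLU.'''
    return [max(0.0, sum(w * x for w, x in zip(row, image)))
            for row in frozen_weights]


def dense_layer(features, weights, biases):
    '''Trainable final layer: one activation per land use class.'''
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, biases)]


def predict_label(image, frozen_weights, dense_weights, dense_biases,
                  approx_mean=114.0):
    '''Return the index of the maximally activated output node.'''
    centered = subtract_mean(image, approx_mean)
    features = frozen_features(centered, frozen_weights)
    activations = dense_layer(features, dense_weights, dense_biases)
    return max(range(len(activations)), key=activations.__getitem__)


# Toy example: a flattened "image" of three values and two output classes
frozen_weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # never updated in training
dense_weights = [[1.0, 0.0], [0.0, 1.0]]             # updated in training
dense_biases = [0.0, 0.0]
label = predict_label([120.0, 200.0, 90.0], frozen_weights,
                      dense_weights, dense_biases)
```

During retraining, only `dense_weights` and `dense_biases` would receive gradient updates; the frozen weights keep the values learned on the original pretraining task.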

The training script applies several transforms when each minibatch's images are loaded, including a random crop/rescaling and random colorization. These transforms generate variety in the input set, limiting the degree of overfitting.
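
As a rough illustration of this kind of augmentation, the sketch below applies a random crop followed by a random brightness shift to an image stored as nested lists of RGB tuples. It is not the CNTK transform pipeline: the function names are hypothetical, rescaling is omitted, and the brightness shift is only a simple stand-in for the color transform the script applies.

```python
import random


def random_crop(image, crop_height, crop_width, rng):
    '''Return a randomly positioned crop of a 2-D list of RGB tuples.'''
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - crop_height)
    left = rng.randint(0, width - crop_width)
    return [row[left:left + crop_width]
            for row in image[top:top + crop_height]]


def random_brightness(image, max_delta, rng):
    '''Shift every channel by one random offset, clamped to [0, 255].'''
    delta = rng.uniform(-max_delta, max_delta)
    return [[tuple(min(255.0, max(0.0, c + delta)) for c in pixel)
             for pixel in row] for row in image]


# Toy 8x8 image; a seeded generator keeps the example reproducible
rng = random.Random(0)
image = [[(float(r), float(c), 128.0) for c in range(8)] for r in range(8)]
augmented = random_brightness(random_crop(image, 6, 6, rng), 20.0, rng)
```

Because a fresh crop and offset are drawn each time an image is loaded, the network rarely sees exactly the same input twice, which is what limits overfitting.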

For details of the model evaluation process, please see the scoring notebook in the [Embarrassingly Parallel Image Classification](https://github.com/Azure/Embarrassingly-Parallel-Image-Classification) repository.

<a name="tensorflow"></a>
## Retrain a pretrained ResNet with TensorFlow

We made use of the [`tf-slim` API](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/slim) for TensorFlow, which provides pre-trained ResNet models and helpful scripts for retraining and scoring. During training set preparation, we created the [TFRecords](https://www.tensorflow.org/how_tos/reading_data/#file_formats) that the training script will use as input. For more details on the training data, please see the image preparation notebook in the [Embarrassingly Parallel Image Classification](https://github.com/Azure/Embarrassingly-Parallel-Image-Classification) repository.

Our retraining script, `retrain.py` in the `tf` folder of [this repository](https://github.com/Azure/Embarrassingly-Parallel-Image-Classification), is a modified version of `train_image_classifier.py` from the [TensorFlow models repo's slim subdirectory](https://github.com/tensorflow/models/tree/master/slim).

<a name="tfmodel"></a>
### Download a pretrained model

We obtained a 50-layer ResNet pretrained on ImageNet from a link in the [TensorFlow models repo's slim subdirectory](https://github.com/tensorflow/models/tree/master/slim). The pretrained model can be obtained and unpacked with the code snippet below:

```python
import urllib.request
import tarfile
import os

# Directory where this repository was cloned or unpacked
repo_dir = 'D:\\repo'

# Download the archive, unpack it into the repo's tf folder, then delete the archive
archive_path = os.path.join(repo_dir, 'tf', 'resnet_v1_50_2016_08_28.tar.gz')
urllib.request.urlretrieve('http://download.tensorflow.org/models/resnet_v1_50_2016_08_28.tar.gz',
                           archive_path)
with tarfile.open(archive_path, 'r:gz') as f:
    f.extractall(path=os.path.join(repo_dir, 'tf'))
os.remove(archive_path)
```

<a name="tfrun"></a>
### Run the training script

We recommend that you run the training script from an Anaconda prompt. The code cell below will help you generate the appropriate command based on your file locations.

```python
# Path where the retrained model and logs will be saved during training
train_dir = os.path.join(repo_dir, 'tf', 'models')
if not os.path.exists(train_dir):
    os.makedirs(train_dir)

# Location of the unpacked pretrained model
checkpoint_path = os.path.join(repo_dir, 'tf', 'resnet_v1_50.ckpt')

# Location of the TFRecords and other files generated during image set preparation
training_image_dir = 'D:\\balanced_training_set'

command = '''activate py35
python {0} --train_dir={1} --dataset_name=aerial --dataset_split_name=train --dataset_dir={2} --checkpoint_path={3}
'''.format(os.path.join(repo_dir, 'tf', 'retrain.py'),
           train_dir,
           training_image_dir,
           checkpoint_path)

print(command)
```

```
activate py35
python D:\repo\tf\retrain.py --train_dir=D:\repo\tf\models --dataset_name=aerial --dataset_split_name=train --dataset_dir=D:\balanced_training_set --checkpoint_path=D:\repo\tf\resnet_v1_50.ckpt
```

The training script will load the pretrained ResNet model, freezing the weights for all but the final logits layer. The transfer learning model's output label is taken to be the index of the maximally-activated node in the final layer.
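
Freezing all but the final layer amounts to choosing which variables the optimizer may update. In tf-slim's `train_image_classifier.py`, this selection is driven by a comma-separated `--trainable_scopes` flag; whether `retrain.py` exposes the same flag or hardcodes the choice is an assumption we have not verified here. The plain-Python sketch below mimics the selection by scope prefix, using a hypothetical helper name and hypothetical variable names that follow tf-slim's ResNet naming pattern.

```python
def select_trainable(variable_names, trainable_scopes):
    '''Keep only variables whose names fall under one of the given scopes.'''
    scopes = [scope.strip() for scope in trainable_scopes.split(',')]
    return [name for name in variable_names
            if any(name.startswith(scope) for scope in scopes)]


# Hypothetical variable names following tf-slim's ResNet naming pattern
variables = [
    'resnet_v1_50/conv1/weights',
    'resnet_v1_50/block1/unit_1/bottleneck_v1/conv1/weights',
    'resnet_v1_50/logits/weights',
    'resnet_v1_50/logits/biases',
]

# Only the logits layer is left trainable; everything else stays frozen
trainable = select_trainable(variables, 'resnet_v1_50/logits')
```

Variables not returned by the filter keep the values restored from the pretrained checkpoint throughout training.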

The training script applies several transforms when each minibatch's images are loaded, including subtracting an approximation of the mean values for each channel (red, green, and blue) and randomly cropping/colorizing the image. The random transforms generate variety in the input set, limiting the degree of overfitting.
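
The per-channel mean subtraction can be sketched as follows. The mean values are the approximate ImageNet channel means used by tf-slim's VGG-style preprocessing; that `retrain.py` uses exactly these values is an assumption, and the function name is hypothetical.

```python
# Approximate channel means in (red, green, blue) order; assumed values
CHANNEL_MEANS = (123.68, 116.78, 103.94)


def subtract_channel_means(image, means=CHANNEL_MEANS):
    '''Subtract a per-channel mean from every (R, G, B) pixel.'''
    return [[tuple(value - mean for value, mean in zip(pixel, means))
             for pixel in row] for row in image]


# One-row toy image: a pixel equal to the means and a white pixel
image = [[(123.68, 116.78, 103.94), (255.0, 255.0, 255.0)]]
centered = subtract_channel_means(image)
```

Centering the inputs this way keeps them in the range the pretrained layers saw during their original training, which matters because those layers are frozen.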

For details of the model evaluation process, please see the scoring notebook in the [Embarrassingly Parallel Image Classification](https://github.com/Azure/Embarrassingly-Parallel-Image-Classification) repository.

<a name="nextsteps"></a>
## Next Steps

For details of the model evaluation process, please see the [scoring notebook](https://github.com/Azure/Embarrassingly-Parallel-Image-Classification/blob/master/scoring_on_spark.ipynb) in the [Embarrassingly Parallel Image Classification](https://github.com/Azure/Embarrassingly-Parallel-Image-Classification) repository.