# Painting Image Recognition with Deep Transfer Learning

This hands-on tutorial shows how to use [Transfer Learning](https://en.wikipedia.org/wiki/Inductive_transfer) to take an existing trained model and adapt it to the painting data.  

This notebook lab adapts and updates the Microsoft CNTK tutorial at https://github.com/Microsoft/CNTK/blob/master/Tutorials/CNTK_301_Image_Recognition_with_Deep_Transfer_Learning.ipynb.

## Problem
You acquired a set of painting images which need to be classified.


However, the number of images is far smaller than what is needed to train a state-of-the-art classifier such as a [Residual Network](https://github.com/KaimingHe/deep-residual-networks) from scratch. You do have access to a richly annotated data set of natural scene images, such as those shown below (courtesy of the [t-SNE visualization site](http://cs.stanford.edu/people/karpathy/cnnembed/)).

![](http://www.cntk.ai/jup/cntk301_imagenet.jpg)

This tutorial introduces deep transfer learning as a means to leverage multiple data sources to overcome the data scarcity problem.

### Why Transfer Learning?

As stated above, Transfer Learning is a useful technique when, for instance, you know you need to classify incoming images into different categories, but you do not have enough data to train a Deep Neural Network (DNN) from scratch. Training DNNs takes a lot of data, all of it labeled, and often you will not have that kind of data on hand. If your problem is similar to one for which a network has already been trained, though, you can use Transfer Learning to modify that network to your problem with a fraction of the labeled images (we are talking tens instead of thousands). 

### What is Transfer Learning?

With Transfer Learning, we use an existing trained model and adapt it to our own problem. We are essentially building upon the features and concepts that were learned during the training of the base model. With a Convolutional DNN (ResNet_18 in this case), we are using the features learned from ImageNet data and _cutting off_ the final classification layer, replacing it with a new dense layer that will predict the class labels of our new domain. 

The input to the old and the new prediction layer is the same; we simply reuse the trained features. We then train this modified network, updating either only the weights of the new prediction layer or all weights of the entire network.
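
The idea of reusing trained features and learning only a new prediction layer can be illustrated without CNTK. The following is a minimal NumPy sketch, not the notebook's actual training code: the "pretrained" layers are replaced by a fixed random projection, and the data, the sizes, and the names `frozen_features`, `W_new` and `b_new` are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the pretrained, frozen layers (hypothetical): a fixed random
# projection followed by ReLU. In the notebook this role is played by the
# ResNet_18 layers up to the last hidden node.
W_frozen = rng.normal(size=(8, 16)) / np.sqrt(8)

def frozen_features(x):
    return np.maximum(x @ W_frozen, 0.0)

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

# Toy 3-class data (made up for illustration): 30 samples, 8 raw inputs each.
x = rng.normal(size=(30, 8))
one_hot = np.eye(3)[rng.integers(0, 3, size=30)]
feats = frozen_features(x)   # computed once; the frozen layers never change

# New dense prediction layer: the only parameters that get trained.
W_new = np.zeros((16, 3))
b_new = np.zeros(3)

def mean_cross_entropy():
    probs = softmax(feats @ W_new + b_new)
    return float(-(one_hot * np.log(probs)).sum(axis=1).mean())

initial_loss = mean_cross_entropy()
for _ in range(200):
    # Gradient of the mean cross-entropy with respect to the logits
    grad_logits = (softmax(feats @ W_new + b_new) - one_hot) / len(x)
    W_new -= 0.1 * feats.T @ grad_logits
    b_new -= 0.1 * grad_logits.sum(axis=0)
final_loss = mean_cross_entropy()

print(initial_loss, final_loss)
```

Because the frozen features never change, they can be computed once and reused in every training step, which is one reason training only the new layer is cheap.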

This can be used, for instance, when we have a small set of images that are in a similar domain to an existing trained model. Training a Deep Neural Network from scratch requires tens of thousands of images, but training one that has already learned features in the domain you are adapting it to requires far fewer. 


In our case, this means adapting a network trained on ImageNet images (dogs, cats, birds, etc.) to paintings. However, Transfer Learning has also been successfully used to adapt existing neural models for translation, speech synthesis, and many other domains - it is a convenient way to bootstrap your learning process.

**Importing CNTK and other useful libraries**

Microsoft's Cognitive Toolkit comes in Python form as `cntk`, and contains many useful submodules for IO, defining layers, training models, and interrogating trained models. We will need many of these for Transfer Learning, as well as some other common libraries for downloading files, unpacking/unzipping them, working with the file system, and loading matrices.

In [1]:
from __future__ import print_function
import glob
import os
import numpy as np
from PIL import Image
import requests
# Some of the flowers data is stored as .mat files
from scipy.io import loadmat
from shutil import copyfile
import sys
import tarfile
import time

# Load the right urlretrieve based on python version
try: 
    from urllib.request import urlretrieve 
except ImportError: 
    from urllib import urlretrieve
    
import zipfile
from pathlib import Path

# Useful for being able to dump images into the Notebook
import IPython.display as D

# Import CNTK and helpers
import cntk as C

Starting Spark application


ID,YARN Application ID,Kind,State,Spark UI,Driver log,Current session?
2,application_1514913157146_0012,pyspark3,idle,Link,Link,✔


SparkSession available as 'spark'.


There are two run modes:
- *Fast mode*: `isFast` is set to `True`. This is the default mode for the notebooks, which means we train for fewer iterations or train / test on limited data. This ensures functional correctness of the notebook though the models produced are far from what a completed training would produce.

- *Slow mode*: We recommend setting this flag to `False` once you are familiar with the notebook content and want to gain insight from running the notebooks for longer, with different training parameters. 

For *Fast mode* we train the model for 5 epochs; the results have low accuracy but are good enough for development. The model yields good accuracy after 10-20 epochs.

Feel free to adjust the learning_params below and observe the results. You can tweak the **max_epochs** to train for longer, **mb_size** to adjust the size of each minibatch, or **lr_per_mb** to play with the speed of convergence (learning rate). 

Note that if you have already trained the model, you need to set `force_retraining` to `True` to force the notebook to re-train your model with the new parameters. 

In [29]:
#isFast = True
isFast = False

# Set parameters for the transfer learning
force_retraining = True

max_training_epochs = 5 if isFast else 20

learning_params = {
    'max_epochs': max_training_epochs,
    'mb_size': 50,
    'lr_per_mb': [0.2]*10 + [0.1],
    'momentum_per_mb': 0.9,
    'l2_reg_weight': 0.0005,
    'freeze_weights': True
}

### Data Access

The data has already been uploaded to the attached Azure Blob Storage.  However, we also need to download an already trained image model.

In [3]:
# By default, we store data in the Image directory
data_root = os.path.abspath(os.path.join('..', 'Image'))
    
datasets_path = os.path.join(data_root, 'DataSets')
output_path = os.path.abspath(os.path.join('.', 'temp', 'Output'))

def ensure_exists(path):
    if not os.path.exists(path):
        print('Making Directory: ', path)
        os.makedirs(path)
        
def download_unless_exists(url, filename, max_retries=3):
    '''Download the file unless it already exists, with retry. Throws if all retries fail.'''
    if os.path.exists(filename):
        print('Reusing locally cached: ', filename)
    else:
        print('Starting download of {} to {}'.format(url, filename))
        retry_cnt = 0
        while True:
            try:
                urlretrieve(url, filename)
                print('Download completed.')
                return
            except Exception:  # a bare except would also swallow KeyboardInterrupt
                retry_cnt += 1
                if retry_cnt == max_retries:
                    print('Exceeded maximum retry count, aborting.')
                    raise
                print('Failed to download, retrying.')
                time.sleep(np.random.randint(1,10))        

def download_model(model_root = os.path.join(data_root, 'PretrainedModels')):
    ensure_exists(model_root)
    resnet18_model_uri = 'https://www.cntk.ai/Models/ResNet/ResNet_18.model'
    resnet18_model_local = os.path.join(model_root, 'ResNet_18.model')
    download_unless_exists(resnet18_model_uri, resnet18_model_local)
    print('Downloaded model to: ', Path(resnet18_model_local).resolve())
    return resnet18_model_local

ensure_exists(datasets_path)
ensure_exists(output_path)
print ('datasets_path: ', Path(datasets_path).resolve())
print ('output_path: ', Path(output_path).resolve())

Making Directory:  /mnt/resource/hadoop/yarn/local/usercache/livy/appcache/application_1514913157146_0012/Image/DataSets
Making Directory:  /mnt/resource/hadoop/yarn/local/usercache/livy/appcache/application_1514913157146_0012/container_1514913157146_0012_01_000001/temp/Output
datasets_path:  /mnt/resource/hadoop/yarn/local/usercache/livy/appcache/application_1514913157146_0012/Image/DataSets
output_path:  /mnt/resource/hadoop/yarn/local/usercache/livy/appcache/application_1514913157146_0012/container_1514913157146_0012_01_000001/temp/Output

In [None]:
os.listdir(os.path.join(datasets_path, 'Paintings','Test'))

In [4]:
azure_blob_root = 'wasbs://images@marktabblob.blob.core.windows.net/'
paintings_stem = 'Image/DataSets/Paintings/'
training_stem = paintings_stem + 'Train/'
testing_stem = paintings_stem + 'Test/'

# "Downloading" the image files from the Azure Blob Storage uri into the Linux file system
def download_paintings_dataset(dataset_root = os.path.join(datasets_path, 'Paintings')):
    training_dir = os.path.join(dataset_root, 'Train')
    testing_dir = os.path.join(dataset_root, 'Test')
    
    # Make Directories if needed
    ensure_exists(dataset_root)
    ensure_exists(training_dir)
    ensure_exists(testing_dir)    
    
    uniqueclasseslocation = azure_blob_root + paintings_stem + 'uniqueclasses.txt'
    uniqueclassesfile = spark.sparkContext.textFile(uniqueclasseslocation)  
    
    transferfilelocation = azure_blob_root + paintings_stem + 'transferimages.txt'
    transferfile = spark.sparkContext.textFile(transferfilelocation)
    for row in transferfile.collect():
        uri = str(row.split(' ', 1)[0])
        dirname = os.path.join(dataset_root, str(row.split(' ', 2)[1]))
        filename = os.path.join(dataset_root, str(row.split(' ', 2)[1]), str(row.split(' ', 2)[2]))
        
        # Creates Linux directory if needed
        ensure_exists(dirname)
        
        download_unless_exists(uri, filename) 
        print("uri: ", uri, "  ", "dirname: ", dirname, "filename: ", Path(filename).resolve())
    return {
        'training_folder': training_dir,
        'testing_folder': testing_dir,
        'azure_training_map': os.path.join(azure_blob_root, training_stem, 'map.txt'),
        'azure_testing_map': os.path.join(azure_blob_root, testing_stem, 'map.txt'),
        'num_classes': uniqueclassesfile.take(1)[0]     
    }

print('Downloading paintings from Azure Blob Storage to Linux file system, this might take a while...')
paintings_data = download_paintings_dataset()
print('All data now available to the notebook!')

paintings_data

Downloading paintings from Azure Blob Storage to Linux file system, this might take a while...
Making Directory:  /mnt/resource/hadoop/yarn/local/usercache/livy/appcache/application_1514913157146_0012/Image/DataSets/Paintings
Making Directory:  /mnt/resource/hadoop/yarn/local/usercache/livy/appcache/application_1514913157146_0012/Image/DataSets/Paintings/Train
Making Directory:  /mnt/resource/hadoop/yarn/local/usercache/livy/appcache/application_1514913157146_0012/Image/DataSets/Paintings/Test
Making Directory:  /mnt/resource/hadoop/yarn/local/usercache/livy/appcache/application_1514913157146_0012/Image/DataSets/Paintings/Train/MonetPainting/
Starting download of https://marktabblob.blob.core.windows.net/images/Image/DataSets/Paintings/Train/MonetPainting/001c0.jpeg to /mnt/resource/hadoop/yarn/local/usercache/livy/appcache/application_1514913157146_0012/Image/DataSets/Paintings/Train/MonetPainting/001c0.jpeg
Download completed.
uri:  https://marktabblob.blob.core.windows.net/images/Im

### Pre-Trained Model (ResNet)

For this task, we have chosen ResNet_18 as our trained model and will use it as the base model. This model will be adapted using Transfer Learning to classify paintings. This model is a [Convolutional Neural Network](https://en.wikipedia.org/wiki/Convolutional_neural_network) built using [Residual Network](https://github.com/KaimingHe/deep-residual-networks) techniques. Convolutional Neural Networks build up layers of convolutions, transforming an input image and distilling it down until they start recognizing composite features; deeper layers of convolutions make it possible to recognize more complex patterns. The author of Keras has a [fantastic post](https://blog.keras.io/how-convolutional-neural-networks-see-the-world.html) describing how Convolutional Networks "see the world", which gives a much more detailed explanation.

Residual Deep Learning is a technique that originated in Microsoft Research and involves "passing through" the main signal of the input data, so that the network winds up "learning" on just the residual portions that differ between layers. This has proven, in practice, to allow the training of much deeper networks by avoiding issues that plague gradient descent on larger networks. These cells bypass convolution layers and then come back in later before ReLU (see below), but some have argued that even deeper networks can be built by avoiding even more nonlinearities in the bypass channel. This is an area of hot research right now, and one of the most exciting parts of Transfer Learning is that you get to benefit from all of the improvements by just integrating new trained models.

![](https://adeshpande3.github.io/assets/ResNet.png)
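
The "pass through" idea can be written as `y = relu(F(x) + x)`, where `F` is the residual branch. The sketch below is a simplified illustration in NumPy: it uses plain matrix multiplications where ResNet uses convolutions and batch normalization, and the function name `residual_block` and its weights are hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """y = relu(F(x) + x), where F is two linear maps with a ReLU in between."""
    f = relu(x @ w1) @ w2   # the residual branch F(x)
    return relu(f + x)      # the skip connection rejoins before the final ReLU

x = np.array([[1.0, -2.0, 3.0]])
zeros = np.zeros((3, 3))

# With all-zero branch weights, F(x) = 0 and the block reduces to relu(x).
y = residual_block(x, zeros, zeros)
print(y)
```

The zero-weight case shows why such blocks are easy to stack: a block that has learned nothing simply passes its (rectified) input along, so the branch only has to learn the difference from that pass-through.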



In [5]:
print('Downloading pre-trained model. Note: this might take a while...')
# Download pretrained model to a Linux directory attached to this application instance
base_model_file = download_model()
print('Downloading pre-trained model complete!')

Downloading pre-trained model. Note: this might take a while...
Making Directory:  /mnt/resource/hadoop/yarn/local/usercache/livy/appcache/application_1514913157146_0012/Image/PretrainedModels
Starting download of https://www.cntk.ai/Models/ResNet/ResNet_18.model to /mnt/resource/hadoop/yarn/local/usercache/livy/appcache/application_1514913157146_0012/Image/PretrainedModels/ResNet_18.model
Download completed.
Downloaded model to:  /mnt/resource/hadoop/yarn/local/usercache/livy/appcache/application_1514913157146_0012/Image/PretrainedModels/ResNet_18.model
Downloading pre-trained model complete!

### Inspecting pre-trained model

We print out all of the layers in ResNet_18 to show how you can interrogate a model. To use a different model than ResNet_18, you would just need to discover the appropriate last hidden layer and feature layer to use. CNTK provides a convenient `get_node_outputs` function under `cntk.logging` that lets you dump all of the model details. We can recognize the final hidden layer as the one just before the final classification into the 1000 ImageNet classes (so in this case, `z.x`).
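
One way to spot the last hidden layer in such a listing is to skip the scalar criterion nodes and the 1000-class score nodes and take the first node that remains. The helper below is a hypothetical heuristic written for this listing only, not a CNTK API; it assumes the nodes are listed output side first, as in the printout that follows, and other models may need a different rule.

```python
def find_last_hidden(nodes, num_output_classes=1000):
    """Return the first listed node that is neither a scalar criterion nor
    a class-score vector. `nodes` is a list of (name, shape) pairs in the
    order the notebook prints them (output side first)."""
    for name, shape in nodes:
        if shape not in ((), (num_output_classes,)):
            return name, shape
    return None

# First few entries copied from the printed layer list of ResNet_18.
printed_nodes = [
    ('ce', ()), ('errs', ()), ('top5Errs', ()), ('z', (1000,)),
    ('ce', ()), ('z', (1000,)), ('z.PlusArgs[0]', (1000,)),
    ('z.x', (512, 1, 1)), ('z.x.x.r', (512, 7, 7)),
]
last_hidden = find_last_hidden(printed_nodes)
print(last_hidden)
```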

In [6]:
# define base model location and characteristics
base_model = {
    'model_file': base_model_file,
    'feature_node_name': 'features',
    'last_hidden_node_name': 'z.x',
    # Channel Depth x Height x Width
    'image_dims': (3, 224, 224)
}

# Print out all layers in the model
print('Loading {} and printing all layers:'.format(base_model['model_file']))
node_outputs = C.logging.get_node_outputs(C.load_model(base_model['model_file']))
for l in node_outputs: print("  {0} {1}".format(l.name, l.shape))

Loading /mnt/resource/hadoop/yarn/local/usercache/livy/appcache/application_1514913157146_0012/Image/PretrainedModels/ResNet_18.model and printing all layers:
  ce ()
  errs ()
  top5Errs ()
  z (1000,)
  ce ()
  z (1000,)
  z.PlusArgs[0] (1000,)
  z.x (512, 1, 1)
  z.x.x.r (512, 7, 7)
  z.x.x.p (512, 7, 7)
  z.x.x.b (512, 7, 7)
  z.x.x.b.x.c (512, 7, 7)
  z.x.x.b.x (512, 7, 7)
  z.x.x.b.x._ (512, 7, 7)
  z.x.x.b.x._.x.c (512, 7, 7)
  z.x.x.x.r (512, 7, 7)
  z.x.x.x.p (512, 7, 7)
  z.x.x.x.b (512, 7, 7)
  z.x.x.x.b.x.c (512, 7, 7)
  z.x.x.x.b.x (512, 7, 7)
  z.x.x.x.b.x._ (512, 7, 7)
  z.x.x.x.b.x._.x.c (512, 7, 7)
  _z.x.x.x.r (512, 7, 7)
  _z.x.x.x.p (512, 7, 7)
  _z.x.x.x.b (512, 7, 7)
  _z.x.x.x.b.x.c (512, 7, 7)
  _z.x.x.x.b.x (512, 7, 7)
  _z.x.x.x.b.x._ (512, 7, 7)
  _z.x.x.x.b.x._.x.c (512, 7, 7)
  z.x.x.x.x.r (256, 14, 14)
  z.x.x.x.x.p (256, 14, 14)
  z.x.x.x.x.b (256, 14, 14)
  z.x.x.x.x.b.x.c (256, 14, 14)
  z.x.x.x.x.b.x (256, 14, 14)
  z.x.x.x.x.b.x._ (256, 14, 14)
  z.x.

### Training the Transfer Learning Model

In the code below, we load the pre-trained ResNet_18 model and clone it up to the last hidden layer, which strips off the original classification layer; the original `features` input node is replaced by a placeholder so that we can feed in our own input. We clone the model so that the same trained model can be re-used multiple times and trained for different tasks. Cloning is not strictly necessary if you are training for a single task, but it is the reason we do not use `CloneMethod.share`: we want to learn new parameters. If `freeze_weights` is true, the weights of all cloned layers are frozen and only the weights of the new final prediction layer are learned. This can often be useful if you are cloning higher up the tree (e.g., cloning after the first convolutional layer to just get basic image features).

We find the final hidden layer (`z.x`) using `find_by_name`, clone it and all of its predecessors, then attach a new `Dense` layer for classification.
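
With the cloned layers frozen, the only trainable parameters are those of the new dense layer. As a quick sanity check, the count can be worked out by hand; the helper name `dense_param_count` is made up for this sketch, while the shapes come from the layer listing (`z.x` is `(512, 1, 1)`) and from the three painter classes in the data.

```python
import numpy as np

def dense_param_count(input_shape, num_classes):
    """Weights plus biases of a dense layer applied to a flattened input."""
    num_inputs = int(np.prod(input_shape))
    return num_inputs * num_classes + num_classes

# `z.x` has shape (512, 1, 1); the paintings data has 3 classes.
trainable = dense_param_count((512, 1, 1), 3)
print(trainable)   # 512 * 3 + 3 = 1539
```

This matches the "Training 1539 parameters in 2 parameter tensors" line in the training log further down: one weight tensor and one bias tensor.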

In [21]:
import cntk.io.transforms as xforms
ensure_exists(output_path)
np.random.seed(123)

# Creates a minibatch source for training or testing
def create_mb_source(map_file, image_dims, num_classes, randomize=True):
    transforms = [xforms.scale(width=image_dims[2], height=image_dims[1], channels=image_dims[0], interpolations='linear')]
    return C.io.MinibatchSource(C.io.ImageDeserializer(map_file, C.io.StreamDefs(
            features=C.io.StreamDef(field='image', transforms=transforms),
            labels=C.io.StreamDef(field='label', shape=num_classes))),
            randomize=randomize)

# Creates the network model for transfer learning
def create_model(model_details, num_classes, input_features, new_prediction_node_name='prediction', freeze=False):
    # Load the pretrained classification net and find nodes
    base_model = C.load_model(model_details['model_file'])
    feature_node = C.logging.find_by_name(base_model, model_details['feature_node_name'])
    last_node = C.logging.find_by_name(base_model, model_details['last_hidden_node_name'])

    # Clone the desired layers, freezing their weights only if `freeze` is set
    cloned_layers = C.combine([last_node.owner]).clone(
        C.CloneMethod.freeze if freeze else C.CloneMethod.clone,
        {feature_node: C.placeholder(name='features')})

    # Add new dense layer for class prediction
    feat_norm = input_features - C.Constant(114)
    cloned_out = cloned_layers(feat_norm)
    z = C.layers.Dense(num_classes, activation=None, name=new_prediction_node_name) (cloned_out)

    return z

We will now train the model just like any other CNTK model: instantiating an input source (in this case a `MinibatchSource` from our image data), defining the loss function, and training for a number of epochs. Since we are training a multi-class classifier network, the loss is cross-entropy with softmax and the error function is classification error; both are conveniently provided by CNTK as `cross_entropy_with_softmax` and `classification_error`.
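
To make the two criteria concrete, here is a small NumPy sketch of what they compute for a batch of raw scores. It illustrates the math only and is not CNTK's implementation; the function names and the example scores are made up.

```python
import numpy as np

def softmax_cross_entropy(logits, one_hot):
    """Per-sample loss: log(sum(exp(logits))) minus the true-class logit."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_sum_exp = np.log(np.exp(shifted).sum(axis=1))
    return log_sum_exp - (shifted * one_hot).sum(axis=1)

def error_rate(logits, one_hot):
    """1.0 where the arg-max class differs from the true class, else 0.0."""
    return (logits.argmax(axis=1) != one_hot.argmax(axis=1)).astype(float)

logits = np.array([[0.0, 0.0, 0.0],    # no preference between the 3 classes
                   [5.0, 0.0, 0.0]])   # confident prediction of class 0
one_hot = np.array([[0.0, 1.0, 0.0],
                    [1.0, 0.0, 0.0]])

loss = softmax_cross_entropy(logits, one_hot)
err = error_rate(logits, one_hot)
print(loss, err)
```

For three classes, a model with no preference has a loss of ln(3), roughly 1.0986, which is a useful baseline when reading the per-epoch loss values in the training log.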

When training a pre-trained model, we are adapting the existing weights to suit our domain. Since the weights are likely already close to correct (especially for earlier layers that find more primitive features), fewer examples and fewer epochs are typically required to get good performance.

In [22]:
# Trains a transfer learning model
def train_model(model_details, num_classes, train_map_file,
                learning_params, max_images=-1):
    num_epochs = learning_params['max_epochs']
    epoch_size = sum(1 for line in open(train_map_file))
    if max_images > 0:
        epoch_size = min(epoch_size, max_images)
    minibatch_size = learning_params['mb_size']
    
    # Create the minibatch source and input variables
    minibatch_source = create_mb_source(train_map_file, model_details['image_dims'], num_classes)
    image_input = C.input_variable(model_details['image_dims'])
    label_input = C.input_variable(num_classes)

    # Define mapping from reader streams to network inputs
    input_map = {
        image_input: minibatch_source['features'],
        label_input: minibatch_source['labels']
    }

    # Instantiate the transfer learning model and loss function
    tl_model = create_model(model_details, num_classes, image_input, freeze=learning_params['freeze_weights'])
    ce = C.cross_entropy_with_softmax(tl_model, label_input)
    pe = C.classification_error(tl_model, label_input)

    # Instantiate the trainer object
    lr_schedule = C.learning_parameter_schedule(learning_params['lr_per_mb'])
    mm_schedule = C.momentum_schedule(learning_params['momentum_per_mb'])
    learner = C.momentum_sgd(tl_model.parameters, lr_schedule, mm_schedule, 
                           l2_regularization_weight=learning_params['l2_reg_weight'])
    trainer = C.Trainer(tl_model, (ce, pe), learner)

    # Get minibatches of images and perform model training
    print("Training transfer learning model for {0} epochs (epoch_size = {1}).".format(num_epochs, epoch_size))
    C.logging.log_number_of_parameters(tl_model)
    progress_printer = C.logging.ProgressPrinter(tag='Training', num_epochs=num_epochs)
    for epoch in range(num_epochs):       # loop over epochs
        sample_count = 0
        while sample_count < epoch_size:  # loop over minibatches in the epoch
            data = minibatch_source.next_minibatch(min(minibatch_size, epoch_size - sample_count), input_map=input_map)
            trainer.train_minibatch(data)                                    # update model with it
            sample_count += trainer.previous_minibatch_sample_count          # count samples processed so far
            progress_printer.update_with_trainer(trainer, with_metric=True)  # log progress
            if sample_count % (100 * minibatch_size) == 0:
                print ("Processed {0} samples".format(sample_count))

        progress_printer.epoch_summary(with_metric=True)

    return tl_model
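
The inner loop of `train_model` requests `min(minibatch_size, epoch_size - sample_count)` samples at a time, so the last minibatch of an epoch is usually smaller than the rest. The sketch below replays that arithmetic on its own; it assumes the reader returns exactly the number of samples requested, and the helper name `minibatch_sizes` is hypothetical.

```python
def minibatch_sizes(epoch_size, minibatch_size):
    """Sizes requested per epoch by a loop that asks for
    min(minibatch_size, remaining) samples at a time."""
    sizes = []
    sample_count = 0
    while sample_count < epoch_size:
        size = min(minibatch_size, epoch_size - sample_count)
        sizes.append(size)
        sample_count += size
    return sizes

# epoch_size = 322 comes from the training log; mb_size = 50 from learning_params.
sizes = minibatch_sizes(322, 50)
print(sizes)
```

With these numbers each epoch consists of seven minibatches. Note also that `sample_count` never reaches a multiple of `100 * minibatch_size` (5000) on a data set this small, so the "Processed ... samples" message in the loop is not printed.
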

When we evaluate the trained model on an image, we have to massage that image into the expected format. In our case we use `Image` to load the image from its path, resize it to the size expected by our model, reverse the color channels (RGB to BGR), and convert it from height x width x channels to a contiguous array ordered as channels x height x width. This corresponds to the 3x224x224 input on which our model was trained.
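
The channel swap and axis reordering are easier to follow on a tiny array. The sketch below applies the same two NumPy operations to a made-up 2x2 image instead of a real 224x224 painting; the pixel values are chosen only so that each channel is easy to recognize.

```python
import numpy as np

# Hypothetical 2x2 RGB image in height x width x channel (HWC) layout.
rgb_hwc = np.array([[[10, 20, 30], [11, 21, 31]],
                    [[12, 22, 32], [13, 23, 33]]], dtype=np.float32)

bgr_hwc = rgb_hwc[..., [2, 1, 0]]                         # swap R and B channels
bgr_chw = np.ascontiguousarray(np.rollaxis(bgr_hwc, 2))   # move channels to the front

print(bgr_chw.shape)   # channels x height x width
print(bgr_chw[0])      # first plane now holds the blue values
```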

The model with which we are doing the evaluation has not had the Softmax and Error layers added, so its output is the raw scores of the new prediction layer. To evaluate the image with the model, we send the input data to the `model.eval` method, apply `softmax` to the results to produce probabilities, and use Numpy's `argmax` method to determine the predicted class. We can then compare that against the true labels to get the overall model accuracy.
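
The step from raw scores to an accuracy figure can be sketched with NumPy alone. The scores and true labels below are invented for illustration; only the class names are taken from the training folders of this lab.

```python
import numpy as np

def softmax(scores):
    e = np.exp(scores - np.max(scores))   # subtract the max for stability
    return e / e.sum()

class_names = ['MonetPainting', 'VanGoghPainting', 'PicassoPainting']

# Made-up raw model outputs for three test images and their true labels.
raw_outputs = [np.array([2.0, 0.5, -1.0]),
               np.array([0.1, 3.0, 0.2]),
               np.array([1.5, 1.0, 0.0])]
true_labels = [0, 1, 2]

predicted = [int(np.argmax(softmax(s))) for s in raw_outputs]
accuracy = sum(p == t for p, t in zip(predicted, true_labels)) / len(true_labels)
print([class_names[p] for p in predicted], accuracy)
```

Since softmax preserves the ordering of the scores, `argmax` gives the same class with or without it; the probabilities matter when you want to report confidence alongside the prediction.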

In [9]:
# Evaluates a single image using the re-trained model
def eval_single_image(loaded_model, image_path, image_dims):
    # load and format image (resize, RGB -> BGR, HWC -> CHW)
    try:
        img = Image.open(image_path)
        
        if image_path.endswith("png"):
            # flatten any transparency onto a white background;
            # converting to RGBA first guarantees an alpha band for the mask
            img = img.convert("RGBA")
            temp = Image.new("RGB", img.size, (255, 255, 255))
            temp.paste(img, img)
            img = temp
        # LANCZOS is the filter formerly exposed as ANTIALIAS
        resized = img.resize((image_dims[2], image_dims[1]), Image.LANCZOS)
        bgr_image = np.asarray(resized, dtype=np.float32)[..., [2, 1, 0]]
        chw_format = np.ascontiguousarray(np.rollaxis(bgr_image, 2))

        # compute model output
        arguments = {loaded_model.arguments[0]: [chw_format]}
        output = loaded_model.eval(arguments)

        # return softmax probabilities
        sm = C.softmax(output[0])
        return sm.eval()
    except FileNotFoundError:
        print("Could not open (skipping file): ", image_path)
        return ['None']
        


# Evaluates an image set using the provided model
def eval_test_images(loaded_model, output_file, test_map_file, image_dims, max_images=-1, column_offset=0):
    num_images = sum(1 for line in open(test_map_file))
    if max_images > 0:
        num_images = min(num_images, max_images)
    if isFast:
        num_images = min(num_images, 300) #We will run through fewer images for test run
        
    print("Evaluating model output node '{0}' for {1} images.".format('prediction', num_images))

    pred_count = 0
    correct_count = 0
    np.seterr(over='raise')
    with open(output_file, 'wb') as results_file:
        with open(test_map_file, "r") as input_file:
            for line in input_file:
                tokens = line.rstrip().split('\t')
                img_file = tokens[0 + column_offset]
                probs = eval_single_image(loaded_model, img_file, image_dims)
                
                if probs[0]=='None':
                    print("Eval not possible: ", img_file)
                    continue

                pred_count += 1
                true_label = int(tokens[1 + column_offset])
                predicted_label = np.argmax(probs)
                if predicted_label == true_label:
                    correct_count += 1

                #np.savetxt(results_file, probs[np.newaxis], fmt="%.3f")
                if pred_count % 100 == 0:
                    print("Processed {0} samples ({1:.2%} correct)".format(pred_count, 
                                                                           (float(correct_count) / pred_count)))
                if pred_count >= num_images:
                    break
    print ("{0} of {1} predictions were correct".format(correct_count, pred_count))
    return correct_count, pred_count, (float(correct_count) / pred_count)

Finally, with all of these helper functions in place we can train the model and evaluate it on our paintings dataset.

You should see the model train and evaluate, with a final accuracy over 90%. At this point you could choose to train longer, or consider taking a look at the confusion matrix to determine whether certain painters are mis-predicted at a greater rate. You could also easily swap out to a different model and see if that performs better, or potentially learn from an earlier point in the model architecture. The hyper-parameter settings appear near the top of this notebook.

Azure Blob storage has no real directories:  the container simply stores key-value pairs.  The slash gives an appearance of directories, but these slashes are just part of the key name.  The original tutorial (which this lab is based on) assumes a directory structure and builds a **map.txt** file; however, in this lab, that file was built at the time the image data were uploaded to the container from Azure ML Workbench.

The images are stored with `Train` and `Test` designators, with the nested designator giving the class name (e.g. the `MonetPainting` and `VanGoghPainting` folders). This is quite common, so it is useful to know how to convert that format into one that can be used for constructing the mapping files CNTK expects. 
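
Because the "folders" are only part of the key, the split and class name can be recovered with plain string handling. The sketch below shows one possible way to turn a key into a map-file line (path, tab, class id); the helper `split_blob_key` and the `class_ids` dictionary are hypothetical, and the key is the path portion of one of the download URLs shown earlier.

```python
def split_blob_key(key):
    """Split a key such as '.../Train/MonetPainting/001c0.jpeg' into
    (split_name, class_name, file_name) using its last three segments."""
    parts = key.strip('/').split('/')
    if len(parts) < 3:
        raise ValueError('expected at least <split>/<class>/<file>: ' + key)
    return parts[-3], parts[-2], parts[-1]

# Assumed class numbering, in the order the training folders are listed.
class_ids = {'MonetPainting': 0, 'VanGoghPainting': 1, 'PicassoPainting': 2}

key = 'Image/DataSets/Paintings/Train/MonetPainting/001c0.jpeg'
split_name, class_name, file_name = split_blob_key(key)

# One line of a map file: image path, a tab, then the numeric class label.
map_line = '{0}\t{1}'.format(key, class_ids[class_name])
print(split_name, class_name, file_name)
print(map_line)
```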

In [None]:
# Set python version variable 
python_version = sys.version_info.major

# Create map file for CNTK ImageDeserializer
def create_map_file_from_blockblob(azure_file, mapname):
    map_file_name = os.path.abspath(os.path.join(data_root, mapname + ".txt"))
    
    map_file = None

    if python_version == 3: 
        map_file = open(map_file_name , 'w', encoding='utf-8')
    else:
        map_file = open(map_file_name , 'w')

    textLines = spark.sparkContext.textFile(azure_file)
    for row in textLines.collect():
        print(os.path.abspath(os.path.join(data_root, row)))
        try:
            # map_file.write(row)
            print(os.path.abspath(os.path.join(data_root, row)), file=map_file)
        except UnicodeEncodeError:
            print('error ***')
            continue

    map_file.close()  
    print(Path(map_file_name).resolve())
    
    return map_file_name

paintings_data['training_map'] = create_map_file_from_blockblob(paintings_data['azure_training_map'], 'trainmap')
paintings_data['testing_map'] = create_map_file_from_blockblob(paintings_data['azure_testing_map'], 'testmap')
paintings_data


In [None]:
with open(paintings_data['training_map'], 'r') as f:
    print(f.read())

In [None]:
with open('../Image/testmap.txt', 'r') as f:
    print(f.read())

In [11]:
# Set python version variable 
python_version = sys.version_info.major

def create_map_file_from_folder(root_folder, class_mapping, include_unknown=False, valid_extensions=['.jpg', '.jpeg', '.png']):
    map_file_name = os.path.join(root_folder, "map.txt")
    
    map_file = None

    if python_version == 3: 
        map_file = open(map_file_name , 'w', encoding='utf-8')
    else:
        map_file = open(map_file_name , 'w')

    print(len(class_mapping))
    for class_id in range(0, len(class_mapping)):
        folder = os.path.abspath(os.path.join(root_folder, class_mapping[class_id]))
        print(folder)
        if os.path.exists(folder):
            print ('path exists')
            for entry in os.listdir(folder):
                filename = os.path.abspath(os.path.join(folder, entry))
                if os.path.isfile(filename) and os.path.splitext(filename)[1].lower() in valid_extensions:
                    print('   attempt write')
                    try:
                        map_file.write("{0}\t{1}\n".format(filename, class_id))
                    except UnicodeEncodeError:
                        continue

    if include_unknown:
        for entry in os.listdir(root_folder):
            filename = os.path.abspath(os.path.join(root_folder, entry))
            if os.path.isfile(filename) and os.path.splitext(filename)[1].lower() in valid_extensions:
                try:
                    map_file.write("{0}\t-1\n".format(filename))
                except UnicodeEncodeError:
                    continue
                    
    map_file.close()  
    
    return map_file_name
  

def create_class_mapping_from_folder(root_folder):
    print(root_folder)
    print(Path(root_folder).resolve())
    classes = []
    for _, directories, _ in os.walk(root_folder):
        for directory in directories:
            classes.append(directory)
            print (directory)
    return np.asarray(classes)

paintings_data['class_mapping'] = create_class_mapping_from_folder(paintings_data['training_folder'])
paintings_data['training_map'] = create_map_file_from_folder(paintings_data['training_folder'], paintings_data['class_mapping'])

# Allows for adding additional images which were never seen
paintings_data['testing_map'] = create_map_file_from_folder(paintings_data['testing_folder'], paintings_data['class_mapping'], 
                                                          include_unknown=True)
paintings_data

/mnt/resource/hadoop/yarn/local/usercache/livy/appcache/application_1514913157146_0012/Image/DataSets/Paintings/Train
/mnt/resource/hadoop/yarn/local/usercache/livy/appcache/application_1514913157146_0012/Image/DataSets/Paintings/Train
MonetPainting
VanGoghPainting
PicassoPainting
3
/mnt/resource/hadoop/yarn/local/usercache/livy/appcache/application_1514913157146_0012/Image/DataSets/Paintings/Train/MonetPainting
path exists
Resolved filename:  /mnt/resource/hadoop/yarn/local/usercache/livy/appcache/application_1514913157146_0012/Image/DataSets/Paintings/Train/MonetPainting/156158.jpeg
   attempt write
Resolved filename:  /mnt/resource/hadoop/yarn/local/usercache/livy/appcache/application_1514913157146_0012/Image/DataSets/Paintings/Train/MonetPainting/1bf2f.jpeg
   attempt write
Resolved filename:  /mnt/resource/hadoop/yarn/local/usercache/livy/appcache/application_1514913157146_0012/Image/DataSets/Paintings/Train/MonetPainting/1841d8.jpeg
   attempt write
Resolved filename:  /mnt/resou

We can now train our model on our small domain and evaluate the results:

In [30]:
paintings_model = {
    'model_file': os.path.join(output_path, 'PaintingsTransferLearning.model'),
    'results_file': os.path.join(output_path, 'PaintingsPredictions.txt'),
    'num_classes': len(paintings_data['class_mapping'])
}

if os.path.exists(paintings_model['model_file']) and not force_retraining:
    print("Loading existing model from %s" % paintings_model['model_file'])
    trained_model = C.load_model(paintings_model['model_file'])
else:
    trained_model = train_model(base_model, 
                                paintings_model['num_classes'], paintings_data['training_map'],
                                learning_params)
    trained_model.save(paintings_model['model_file'])
    print("Stored trained model at %s" % paintings_model['model_file'])    

Training transfer learning model for 20 epochs (epoch_size = 322).
Training 1539 parameters in 2 parameter tensors.
Finished Epoch[1 of 20]: [Training] loss = 0.944333 * 322, metric = 40.68% * 322 7.345s ( 43.8 samples/s);
Finished Epoch[2 of 20]: [Training] loss = 0.214034 * 322, metric = 9.32% * 322 7.236s ( 44.5 samples/s);
Finished Epoch[3 of 20]: [Training] loss = 0.079264 * 322, metric = 1.86% * 322 6.553s ( 49.1 samples/s);
Finished Epoch[4 of 20]: [Training] loss = 0.089584 * 322, metric = 4.97% * 322 6.851s ( 47.0 samples/s);
Finished Epoch[5 of 20]: [Training] loss = 0.052831 * 322, metric = 1.86% * 322 6.327s ( 50.9 samples/s);
Finished Epoch[6 of 20]: [Training] loss = 0.047850 * 322, metric = 1.24% * 322 6.349s ( 50.7 samples/s);
Finished Epoch[7 of 20]: [Training] loss = 0.032318 * 322, metric = 1.86% * 322 6.682s ( 48.2 samples/s);
Finished Epoch[8 of 20]: [Training] loss = 0.015687 * 322, metric = 0.31% * 322 6.375s ( 50.5 samples/s);
Finished Epoch[9 of 20]: [Training]
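
The log line `Training 1539 parameters in 2 parameter tensors` can be checked by hand. The sketch below assumes that the pre-trained ResNet_18 feature layer passes 512 values into the new dense layer. That is the standard ResNet-18 feature width, but it is not stated in this notebook. Under that assumption, the new dense layer consists of one weight matrix and one bias vector:

```python
# Assumption: the pooled ResNet-18 feature vector has 512 elements.
feature_dim = 512
num_classes = 3  # MonetPainting, VanGoghPainting, PicassoPainting

weight_params = feature_dim * num_classes  # dense layer weight matrix
bias_params = num_classes                  # dense layer bias vector
total_params = weight_params + bias_params

print(total_params)  # 1539, spread over 2 parameter tensors
```

Only these parameters are updated during training, which helps explain why a few hundred labeled images can be enough.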

Now that the model is trained on the paintings data, let us evaluate the test images.

In [31]:
# evaluate test images
with open(paintings_data['testing_map'], 'r') as input_file:
    for line in input_file:
        tokens = line.rstrip().split('\t')
        img_file = tokens[0]
        true_label = int(tokens[1])
        probs = eval_single_image(trained_model, img_file, base_model['image_dims'])

        # Skip images for which the evaluation helper reported 'None' instead of probabilities
        if probs[0] == 'None':
            continue
        class_probs = np.column_stack((probs, paintings_data['class_mapping'])).tolist()
        class_probs.sort(key=lambda x: float(x[0]), reverse=True)
        predictions = ' '.join(['%s:%.3f' % (class_probs[i][1], float(class_probs[i][0])) \
                                for i in range(0, paintings_model['num_classes'])])
        true_class_name = paintings_data['class_mapping'][true_label] if true_label >= 0 else 'unknown'
        print('Class: %s, predictions: %s, image: %s' % (true_class_name, predictions, img_file))

Class: MonetPainting, predictions: MonetPainting:1.000 VanGoghPainting:0.000 PicassoPainting:0.000, image: /mnt/resource/hadoop/yarn/local/usercache/livy/appcache/application_1514913157146_0012/Image/DataSets/Paintings/Test/MonetPainting/1f71f8.jpeg
Class: MonetPainting, predictions: MonetPainting:1.000 VanGoghPainting:0.000 PicassoPainting:0.000, image: /mnt/resource/hadoop/yarn/local/usercache/livy/appcache/application_1514913157146_0012/Image/DataSets/Paintings/Test/MonetPainting/d307.jpeg
Class: MonetPainting, predictions: MonetPainting:0.760 VanGoghPainting:0.219 PicassoPainting:0.021, image: /mnt/resource/hadoop/yarn/local/usercache/livy/appcache/application_1514913157146_0012/Image/DataSets/Paintings/Test/MonetPainting/1122e.jpeg
Class: MonetPainting, predictions: MonetPainting:0.656 VanGoghPainting:0.343 PicassoPainting:0.001, image: /mnt/resource/hadoop/yarn/local/usercache/livy/appcache/application_1514913157146_0012/Image/DataSets/Paintings/Test/MonetPainting/3c38.jpeg
Class
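
The ranking step in the loop above pairs each probability with its class name and sorts the pairs so the most likely class comes first. As a minimal sketch of the same idea in plain Python, without NumPy, the hypothetical helper `format_predictions` below does this for one image. The probabilities are made up for illustration and the class order follows the class mapping printed earlier.

```python
def format_predictions(probs, class_names):
    """Pair each probability with its class name, most likely class first."""
    ranked = sorted(zip(probs, class_names), key=lambda pair: pair[0], reverse=True)
    return ' '.join('%s:%.3f' % (name, prob) for prob, name in ranked)

class_names = ['MonetPainting', 'VanGoghPainting', 'PicassoPainting']

# Hypothetical probabilities for a single image, in class-mapping order.
line = format_predictions([0.219, 0.021, 0.760], class_names)
print(line)  # PicassoPainting:0.760 MonetPainting:0.219 VanGoghPainting:0.021
```

Keeping the probabilities as floats avoids the string conversion that happens when numbers and names are stacked into one NumPy array.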

### Evaluate

Evaluate the newly trained paintings classifier, which was obtained by transferring the learning from a pre-trained ResNet model.

In [32]:
# Evaluate the test set
predict_correct, predict_total, predict_accuracy = \
   eval_test_images(trained_model, paintings_model['results_file'], paintings_data['testing_map'], base_model['image_dims'])
print("Done. Wrote output to %s" % paintings_model['results_file'])

Evaluating model output node 'prediction' for 72 images.
67 of 72 prediction were correct
Done. Wrote output to /mnt/resource/hadoop/yarn/local/usercache/livy/appcache/application_1514913157146_0012/container_1514913157146_0012_01_000001/temp/Output/PaintingsPredictions.txt

In [33]:
# Test: Accuracy on paintings data
print("Prediction accuracy: {0:.2%}".format(float(predict_correct) / predict_total))

Prediction accuracy: 93.06%
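
The reported accuracy is the number of correct predictions divided by the number of test images, 67 of 72 here. A single overall number can hide differences between classes, so a per-class breakdown is often useful as well. The sketch below reproduces the overall figure and shows one possible breakdown. The helper `accuracy_by_class` and the label pairs are hypothetical and are not read from the results file.

```python
from collections import Counter

def accuracy_by_class(pairs):
    """Return {class_name: (correct, total)} for (true, predicted) label pairs."""
    correct, total = Counter(), Counter()
    for true_label, predicted_label in pairs:
        total[true_label] += 1
        if true_label == predicted_label:
            correct[true_label] += 1
    return {name: (correct[name], total[name]) for name in total}

# Overall accuracy from the counts reported above.
overall = "{0:.2%}".format(float(67) / 72)
print(overall)  # 93.06%

# Hypothetical (true, predicted) pairs, for illustration only.
pairs = [
    ('MonetPainting', 'MonetPainting'),
    ('MonetPainting', 'VanGoghPainting'),
    ('PicassoPainting', 'PicassoPainting'),
]
per_class = accuracy_by_class(pairs)
print(per_class)
```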

### Final Thoughts, and Caveats

Transfer Learning has limitations. As you may have noticed, we re-trained a model that had been trained on ImageNet images. This meant it already _knew_ what "images" were, and had a good grasp of concepts from low-level (stripes, circles) to high-level (dogs' noses, cats' ears). Re-training such a model to detect sheep or wolves makes sense, but re-training it to detect vehicles from aerial imagery would be more difficult. You can still use Transfer Learning in these cases, but you might want to re-use only the earlier layers of the model (i.e. the early convolutional layers that have learned more primitive concepts), and you will likely require much more training data.

Adding a catch-all category can be a good idea, but only if the training data for that category contains images that are sufficiently similar to the images you expect at scoring time. For example, if we train a classifier with images of sheep and wolves and use it to score an image of a bird, the classifier can still only assign a sheep or wolf label, since it does not know any other categories. If we were to add a catch-all category and add training images of birds to it, then the classifier might predict the class correctly for the bird image. However, if we present it with, e.g., an image of a car, it faces the same problem as before, as it knows only sheep, wolf and bird (which we just happened to call catch-all). Hence, your training data, even for your catch-all, needs to sufficiently cover the concepts and images that you expect later on at scoring time.
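
The reason a two-class classifier must answer sheep or wolf can be seen in a small sketch. It assumes the classifier ends in a softmax over the known classes, which the probabilities summing to 1 in the outputs above suggest. The raw scores are hypothetical.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

class_names = ['sheep', 'wolf']

# Hypothetical raw scores for a bird image: both are low, neither class fits well.
bird_logits = [-2.0, -2.5]
probs = softmax(bird_logits)
best = class_names[probs.index(max(probs))]
print(best, probs)
```

Even though both raw scores are low, the probabilities are normalized over the known classes only, so one of them always wins.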