## Style Transfer Network
In this notebook we convert the style transfer model linked in the README to CoreML and evaluate the result. The model takes an image and a style index (one of 26 possible styles) and outputs the stylized image.

We first download the TF model (a .pb file).

In [None]:
# Download the model
import os
import zipfile
try:
    from urllib.request import urlretrieve  # Python 3
except ImportError:
    from urllib import urlretrieve  # Python 2

def download_file_and_unzip(url, dir_path='.'):
    """Download the zipped frozen TensorFlow model and unzip it.
    url - The URL address of the zipped frozen model
    dir_path - local directory to download and extract into
    """
    if not os.path.exists(dir_path):
        os.makedirs(dir_path)
    # Use the last path component of the URL as the local file name
    k = url.rfind('/')
    fname = url[k+1:]
    fpath = os.path.join(dir_path, fname)

    # Skip the download if the archive is already present
    if not os.path.exists(fpath):
        urlretrieve(url, fpath)
    with zipfile.ZipFile(fpath, 'r') as zip_ref:
        zip_ref.extractall(dir_path)

stylize_url = 'https://storage.googleapis.com/download.tensorflow.org/models/stylize_v1.zip'
download_file_and_unzip(stylize_url)

To convert to CoreML, we need the names of the input and output tensors in the TF graph. We also need them to run the TF graph for a numerical accuracy check. Let's load the TF graph def and find the names. Inputs are generally the tensors that are outputs of the "Placeholder" op.

In [None]:
# Load the TF graph definition
import tensorflow as tf
tf_model_path = './stylize_quantized.pb'
with open(tf_model_path, 'rb') as f:
    serialized = f.read()
tf.reset_default_graph()
original_gdef = tf.GraphDef()
original_gdef.ParseFromString(serialized)

# Print the name and shape of every Placeholder output; these are the graph inputs
with tf.Graph().as_default() as g:
    tf.import_graph_def(original_gdef, name='')
    ops = g.get_operations()
    for op in ops:
        if op.type == 'Placeholder':
            for x in op.outputs:
                print("output name = {}, shape: {}".format(x.name, x.get_shape()))

There are two inputs: the image input named "input:0" and the style index input named "style_num:0". To find the output, let's print some information about the last few ops.

In [None]:
with tf.Graph().as_default() as g:
    tf.import_graph_def(original_gdef, name='')
    ops = g.get_operations()
    N = len(ops)
    # Print the type, inputs and outputs of the last 10 ops in the graph
    for i in range(N-10, N):
        print('\n\nop id {} : op type: "{}"'.format(i, ops[i].type))
        print('input(s):')
        for x in ops[i].inputs:
            print("name = {}, shape: {}".format(x.name, x.get_shape()))
        print('output(s):')
        for x in ops[i].outputs:
            print("name = {}, shape: {}".format(x.name, x.get_shape()))

Some knowledge of the network is generally required to determine the output correctly. In this case, the output of the "Sigmoid" op is the normalized image (values between 0 and 1), which goes into the "Mul" op followed by the "Squeeze" op. The final output we are interested in is the tensor "Squeeze:0", which is the RGB image with values between 0 and 255.
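
The tail of the graph described above can be mimicked in NumPy to see how the value range and shape change at each step. This is only a sketch on synthetic data: the scale factor of 255 is an assumption inferred from the stated 0 to 255 output range, and the tiny 4x4 tensor is a stand-in, not the real network activation.

```python
import numpy as np

def sigmoid(x):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))

# Synthetic pre-activation values with a batch axis, shape (1, H, W, 3)
logits = np.random.randn(1, 4, 4, 3).astype(np.float32)

normalized = sigmoid(logits)        # "Sigmoid": values between 0 and 1
scaled = normalized * 255.0         # "Mul": assumed scale factor of 255
image = np.squeeze(scaled, axis=0)  # "Squeeze": drop the batch axis -> (H, W, 3)
```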

Now let's convert the model to CoreML. The TF graph for this model accepts an image of any size and produces an output image of the same size. CoreML, however, requires us to specify the exact size of all its inputs, so we choose a fixed image size of 256 x 256.

In [None]:
import tfcoreml
mlmodel = tfcoreml.convert(
        tf_model_path = tf_model_path,
        mlmodel_path = './stylize.mlmodel',
        output_feature_names = ['Squeeze:0'],
        input_name_shape_dict = {'input:0':[1,256,256,3], 'style_num:0':[26]})

We see that the CoreML model expects two inputs: 'style\_num\_\_0', which is a multiarray treated as a sequence of length 26, and 'input\_\_0', which is a multiarray of shape (3,256,256) corresponding to the image input. It produces a multiarray output called 'Squeeze\_\_0'.
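
The CoreML feature names above appear to be the TF tensor names with ':' replaced by a double underscore. A minimal sketch of that mapping follows; `tf_to_coreml_name` is a hypothetical helper, and the rule is inferred from the three names in this notebook rather than taken from the tfcoreml documentation.

```python
def tf_to_coreml_name(tensor_name):
    """Map a TF tensor name such as 'input:0' to the CoreML feature
    name seen in this notebook ('input__0').

    Assumption: ':' becomes '__'. This is inferred from the names
    printed by the converter, not a documented guarantee.
    """
    return tensor_name.replace(':', '__')

tf_names = ['input:0', 'style_num:0', 'Squeeze:0']
coreml_names = [tf_to_coreml_name(n) for n in tf_names]
```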

Let's now grab an image and use coremltools to see what the CoreML model predicts.

In [None]:
import numpy as np
import PIL.Image  # "import PIL" alone does not load the Image submodule
import requests
from io import BytesIO
from matplotlib.pyplot import imshow
# This is an image of a golden retriever from Wikipedia
img_url = 'https://upload.wikimedia.org/wikipedia/commons/9/93/Golden_Retriever_Carlos_%2810581910556%29.jpg'
response = requests.get(img_url)
%matplotlib inline
img = PIL.Image.open(BytesIO(response.content))
img = img.resize((256, 256), PIL.Image.LANCZOS)  # LANCZOS is the filter formerly named ANTIALIAS
img_np = np.asarray(img).astype(np.float32)
print(img_np.shape, img_np.flatten()[:5])
imshow(img_np/255.0)

In [None]:
# Transpose the image since CoreML requires C,H,W format (3,256,256)
coreml_image_input = np.transpose(img_np, (2,0,1))

# The style index is a one-hot vector: a vector of zeros of length 26, with a 1 at the index of the style we want
index = np.zeros(26, dtype=np.float32)
index[0] = 1  # Let's say we want style 0

# The CoreML multiarray interpretation is (Seq, Batch, C, H, W). Hence the style index input, which is a sequence,
# must be of shape (26,1,1,1,1)
coreml_style_index = index[:,np.newaxis,np.newaxis,np.newaxis,np.newaxis]

coreml_input = {'input__0': coreml_image_input, 'style_num__0': coreml_style_index}
coreml_out = mlmodel.predict(coreml_input, useCPUOnly = True)['Squeeze__0']
print(coreml_out.shape, coreml_out.flatten()[:5])
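
The one-hot construction and the reshape to (26,1,1,1,1) used above can be wrapped in a small helper so that switching styles is a one-line change. This is a sketch only: `make_style_input` and `NUM_STYLES` are hypothetical names introduced here, not part of tfcoreml or coremltools.

```python
import numpy as np

NUM_STYLES = 26  # number of styles this model supports, as stated above

def make_style_input(style, num_styles=NUM_STYLES):
    """Return a float32 one-hot style vector shaped (num_styles, 1, 1, 1, 1)."""
    if not 0 <= style < num_styles:
        raise ValueError("style must be in [0, {})".format(num_styles))
    one_hot = np.zeros(num_styles, dtype=np.float32)
    one_hot[style] = 1.0
    # Add the Batch, C, H and W axes that follow the Seq axis
    return one_hot.reshape(num_styles, 1, 1, 1, 1)

style_input = make_style_input(10)
```

The result could then be passed as the 'style_num__0' entry of the input dictionary in place of the manually built array.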

In [None]:
# Transpose back to H,W,C for visualization with imshow
coreml_out = np.transpose(np.squeeze(coreml_out), (1,2,0))
imshow(coreml_out/255.0)
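
The squeeze and transpose used for display can also be combined with an explicit clip, so that values slightly outside 0 to 255 do not affect the plot. The sketch below uses synthetic data as a stand-in for the model output, and `to_display_image` is a hypothetical helper name. It assumes the channel axis has size 3, so `np.squeeze` removes only the singleton sequence and batch axes.

```python
import numpy as np

def to_display_image(chw_output):
    """Convert an output laid out as (..., C, H, W) with values of roughly
    0-255 into an (H, W, C) uint8 array suitable for imshow."""
    chw = np.squeeze(np.asarray(chw_output))  # drop singleton Seq/Batch axes
    hwc = np.transpose(chw, (1, 2, 0))        # C,H,W -> H,W,C
    return np.clip(np.rint(hwc), 0, 255).astype(np.uint8)

# Synthetic stand-in for a CoreML output, not real model data
fake_out = np.random.uniform(-5, 260, size=(1, 1, 3, 256, 256))
display = to_display_image(fake_out)
```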

That looks cool! Let's try another style.

In [None]:
index = np.zeros((26)).astype(np.float32)
index[10] = 1 
coreml_style_index = index[:,np.newaxis,np.newaxis,np.newaxis,np.newaxis]
coreml_input = {'input__0': coreml_image_input, 'style_num__0': coreml_style_index}
coreml_out = mlmodel.predict(coreml_input, useCPUOnly = True)['Squeeze__0']
coreml_out = np.transpose(np.squeeze(coreml_out), (1,2,0))
imshow(coreml_out/255.0)

Let's also evaluate the same image and style with the TF model to check that the conversion was correct (we should get similar output).

In [None]:
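# TF expects a batch dimension: (1, 256, 256, 3)
tf_img = np.expand_dims(img_np, axis=0)
tf_input_name_image = 'input:0'
tf_input_name_style_index = 'style_num:0'
# `index` still holds the one-hot vector for style 10 from the previous cell
feed_dict = {tf_input_name_image: tf_img, tf_input_name_style_index: index}
tf_output_name = 'Squeeze:0'
# `g` is the graph imported from the frozen model earlier in the notebook
with tf.Session(graph=g) as sess:
    tf_out = sess.run(tf_output_name,
                      feed_dict=feed_dict)
imshow(tf_out/255.0)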
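
Beyond comparing the two images by eye, the outputs can be compared numerically. Below is a minimal sketch that reports the maximum absolute error and a signal-to-noise ratio in dB. `compare_outputs` is a hypothetical helper, the arrays are synthetic stand-ins, and what counts as "close enough" is a judgment call that the notebook does not specify.

```python
import numpy as np

def compare_outputs(reference, candidate):
    """Return (max absolute error, signal-to-noise ratio in dB)."""
    ref = np.asarray(reference, dtype=np.float64)
    cand = np.asarray(candidate, dtype=np.float64)
    if ref.shape != cand.shape:
        raise ValueError("shape mismatch: {} vs {}".format(ref.shape, cand.shape))
    noise = ref - cand
    max_abs_err = np.max(np.abs(noise))
    noise_power = np.sum(noise ** 2)
    signal_power = np.sum(ref ** 2)
    if noise_power == 0:
        snr_db = np.inf  # identical arrays
    else:
        snr_db = 10.0 * np.log10(signal_power / noise_power)
    return max_abs_err, snr_db

# Synthetic stand-ins for the TF and CoreML outputs, both in H,W,C layout
rng = np.random.RandomState(0)
ref_img = rng.uniform(0, 255, size=(256, 256, 3))
noisy_img = ref_img + rng.normal(0, 0.01, size=ref_img.shape)
max_err, snr = compare_outputs(ref_img, noisy_img)
```

In this notebook the call would be `compare_outputs(tf_out, coreml_out)`, since both arrays have shape (256,256,3) after the CoreML output has been transposed back.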