<a href="https://colab.research.google.com/github/zaidalyafeai/Notebooks/blob/master/WeightTransfer.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## [@Zaid Alyafeai](https://twitter.com/zaidalyafeai)

# Introduction
In this tutorial we explain how to transfer weights from a static-graph model built with TensorFlow to a dynamic-graph model built with Keras. We first train a model in TensorFlow, then build the same architecture in Keras and copy the trained weights from the first model into the second.

![alt text](https://raw.githubusercontent.com/zaidalyafeai/Notebooks/master/images/weightrasnfer.png)

# Dataset

We will use [QuickDraw10](https://github.com/zaidalyafeai/QuickDraw10), a suggested alternative to MNIST. QuickDraw10 contains 100K grayscale images of shape 28 x 28, split into 80K for training and 20K for testing, with labels from 10 classes.

## Download the Data

In [2]:
!git clone https://github.com/zaidalyafeai/QuickDraw10

Cloning into 'QuickDraw10'...
remote: Enumerating objects: 53, done.
remote: Counting objects: 100% (53/53), done.
remote: Compressing objects: 100% (49/49), done.
remote: Total 53 (delta 11), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (53/53), done.


## Load the Data

In [0]:
import numpy as np

train_data = np.load('QuickDraw10/dataset/train-ubyte.npz')
test_data  = np.load('QuickDraw10/dataset/test-ubyte.npz')

x_train, y_train = train_data['a'], train_data['b']
x_test,  y_test  = test_data['a'],  test_data['b']

In [0]:
BATCH_SIZE = 32
N = x_train.shape[0]

x_train = np.reshape(x_train/ 255., (x_train.shape[0], 28, 28, 1)).astype('float32')
x_test = np.reshape(x_test/255., (x_test.shape[0], 28, 28, 1)).astype('float32')
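
To make the preprocessing step concrete, here is a minimal sketch on synthetic data. The random array only stands in for the QuickDraw10 images, and `preprocess` is a hypothetical helper name that does not appear in the notebook. It assumes the raw images are stored as `uint8` values in the range 0 to 255.

```python
import numpy as np

def preprocess(images):
    """Scale uint8 pixels to [0, 1] and reshape to (N, 28, 28, 1) float32."""
    scaled = images / 255.0
    return np.reshape(scaled, (images.shape[0], 28, 28, 1)).astype('float32')

# Synthetic stand-in for the real data: 4 random 28x28 grayscale images.
rng = np.random.RandomState(0)
fake_images = rng.randint(0, 256, size=(4, 28, 28)).astype('uint8')

batch = preprocess(fake_images)
print(batch.shape, batch.dtype)
```

The trailing axis of size 1 is the channel dimension that the convolution layers expect for grayscale input.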

# TensorFlow Graph

In [0]:
import tensorflow as tf 

Define the model inputs and outputs 

In [0]:
#define the data
with tf.name_scope("data"):
  X = tf.placeholder(tf.float32, shape = [None, 28, 28, 1], name = 'X')
  y = tf.placeholder(tf.int32,   shape = [None], name = 'y')

Create the layers

In [0]:
with tf.name_scope("block1"):
  conv1 = tf.layers.conv2d(X, filters = 8, kernel_size = 3, 
                           activation = tf.nn.relu, padding = 'same', name = 'conv1')
  pool1 = tf.layers.max_pooling2d(conv1, pool_size = 2, strides = 2, name = 'pool1')
  
with tf.name_scope("block2"):
  conv2 = tf.layers.conv2d(pool1, filters = 16, kernel_size = 3, 
                           activation = tf.nn.relu, padding = 'same', name = 'conv2')
  pool2 = tf.layers.max_pooling2d(conv2, pool_size = 2, strides = 2, name = 'pool2')
  
with tf.name_scope("block3"):
  conv3 = tf.layers.conv2d(pool2, filters = 32, kernel_size = 3, 
                           activation = tf.nn.relu, padding = 'same', name = 'conv3')
  pool3 = tf.layers.max_pooling2d(conv3, pool_size = 2, strides = 2, name = 'pool3')  
  
with tf.name_scope("flatten"):
  flatten = tf.reshape(pool3, shape = [-1, 3*3*32 ], name = 'flatten')
  
with tf.name_scope("dense"):
  logits = tf.layers.dense(flatten, units = 10)
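
The `3*3*32` in the flatten step follows from the layer shapes: the `'same'` convolutions keep the spatial size, while each 2 x 2 max-pool with stride 2 shrinks it. The sketch below works through that arithmetic, assuming the pooling layers use the default `'valid'` padding. `pooled_size` is a hypothetical helper written for illustration.

```python
def pooled_size(size, pool_size=2, stride=2):
    """Output length of a 'valid' max-pool along one spatial axis."""
    return (size - pool_size) // stride + 1

size = 28
sizes = [size]
for _ in range(3):          # three conv + pool blocks
    size = pooled_size(size)
    sizes.append(size)

# The last block has 32 filters, so the flattened vector has H * W * 32 entries.
flat_units = sizes[-1] * sizes[-1] * 32
print(sizes, flat_units)    # [28, 14, 7, 3] 288
```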

Define the training procedure and the evaluation metrics 

In [0]:
with tf.name_scope("train"):
  #cross entropy loss
  entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits = logits, labels = y)
  loss = tf.reduce_mean(entropy)
  
  #minimize the loss with the Adam optimizer
  optimizer = tf.train.AdamOptimizer()
  backprob = optimizer.minimize(loss)
  
with tf.name_scope("eval"):
  #calculate the accuracy  
  correct = tf.nn.in_top_k(logits,y,1)
  accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
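
For readers who want to see what the loss and the accuracy compute, the following NumPy sketch reproduces both on a tiny hand-written batch. It is an illustration rather than the TensorFlow implementation: the function names are hypothetical, and the accuracy uses a plain `argmax`, which can differ from `tf.nn.in_top_k` when several logits are tied.

```python
import numpy as np

def sparse_softmax_cross_entropy(logits, labels):
    """Per-example cross entropy between softmax(logits) and integer labels."""
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels]

def top1_accuracy(logits, labels):
    """Fraction of examples whose largest logit is at the true class."""
    return float(np.mean(np.argmax(logits, axis=1) == labels))

logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 0.3, 3.0],
                   [1.0, 1.5, 0.2]])
labels = np.array([0, 2, 0])

loss = sparse_softmax_cross_entropy(logits, labels).mean()
acc = top1_accuracy(logits, labels)
print(loss, acc)
```

The first two examples are classified correctly and the third is not, so the accuracy on this batch is 2/3.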

In [0]:
init = tf.global_variables_initializer()

In [9]:
with tf.Session() as sess:
  epochs = 3
  
  #initialize all the variables 
  sess.run(init)
  
  #training 
  for epoch in range(0, epochs):
    i = 0 
    while i < N:
      
      #get the next batch 
      x_batch = x_train[i: i+BATCH_SIZE]
      y_batch = y_train[i: i+BATCH_SIZE]
            
      #run the graph   
      out = sess.run(backprob, feed_dict= {X:x_batch, y:y_batch})
      i = i + BATCH_SIZE
    print('------')  
    acc_test  = accuracy.eval(feed_dict={X: x_test, y: y_test})
    print("Epoch:", epoch+1, "test accuracy:", acc_test)
  
  print('saving the weights ...')
  #extract the trained weights as NumPy arrays and keep them in memory;
  #the list follows the order in which the variables were created
  #(kernel, bias) for each layer
  weights = sess.run(tf.trainable_variables())

------
Epoch: 1 test accuracy: 0.923
------
Epoch: 2 test accuracy: 0.9353
------
Epoch: 3 test accuracy: 0.93875
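
The transfer relies on the order and the shapes of the arrays in `weights`. The sketch below lists the shapes one would expect for this architecture, assuming the variables are returned in creation order as a kernel followed by a bias for each layer. The `expected_shapes` list is written by hand for illustration and is not read from the graph.

```python
import numpy as np

# (kernel_h, kernel_w, in_channels, out_channels) for convolutions,
# (in_units, out_units) for the dense layer, and (out,) for each bias.
expected_shapes = [
    (3, 3, 1, 8),   (8,),    # conv1 kernel, bias
    (3, 3, 8, 16),  (16,),   # conv2 kernel, bias
    (3, 3, 16, 32), (32,),   # conv3 kernel, bias
    (288, 10),      (10,),   # dense kernel, bias
]

total_params = sum(int(np.prod(shape)) for shape in expected_shapes)
print(len(expected_shapes), total_params)  # 8 8778
```

In the notebook, comparing `[w.shape for w in weights]` against a list like this is a quick way to confirm that the extraction produced what the Keras model will need.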


In [0]:
tf.reset_default_graph()

# Keras Model

In [11]:
from keras.layers import Dense, Input, Convolution2D, MaxPooling2D, Flatten
from keras.models import Sequential

Using TensorFlow backend.


In [0]:
model = Sequential()
#input_shape is only needed on the first layer; later layers infer their input shape
model.add(Convolution2D(filters = 8, kernel_size = 3, activation = 'relu', padding = 'same' , input_shape = (28, 28, 1)))
model.add(MaxPooling2D(pool_size = 2, strides = 2))
model.add(Convolution2D(filters = 16, kernel_size = 3, activation = 'relu', padding = 'same'))
model.add(MaxPooling2D(pool_size = 2, strides = 2))
model.add(Convolution2D(filters = 32, kernel_size = 3, activation = 'relu', padding = 'same'))
model.add(MaxPooling2D(pool_size = 2, strides = 2))
model.add(Flatten())
#no activation: the layer outputs raw logits, matching the TensorFlow graph
model.add(Dense(units = 10))
#compiling is only needed so that evaluate() can report the accuracy; the loss value
#is not meaningful here because categorical_crossentropy expects probabilities, not logits
model.compile(loss = 'categorical_crossentropy', optimizer = 'adam', metrics = ['accuracy'])

## Load the Weights

In [0]:
i = 0 
for layer in model.layers:
  #load the weights to the model layers
  if 'conv2d' in layer.name or 'dense' in layer.name:
    W = weights[i]
    b = weights[i+1]
    layer.set_weights([W, b])
    i+=2
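
The loop above assumes that `weights` is a flat list of alternating kernels and biases. As a hedged sketch of how that assumption could be checked before calling `set_weights`, the hypothetical helper `pair_weights` below groups such a list into pairs and verifies that each bias length matches the last axis of its kernel. It is demonstrated on dummy arrays rather than on the trained weights.

```python
import numpy as np

def pair_weights(flat_weights):
    """Group [W1, b1, W2, b2, ...] into [(W1, b1), (W2, b2), ...] with sanity checks."""
    if len(flat_weights) % 2 != 0:
        raise ValueError("expected an even number of arrays (kernel, bias pairs)")
    pairs = []
    for k in range(0, len(flat_weights), 2):
        W, b = flat_weights[k], flat_weights[k + 1]
        # A bias is 1-D and has one entry per output channel/unit of its kernel.
        if b.ndim != 1 or W.shape[-1] != b.shape[0]:
            raise ValueError("kernel/bias mismatch at pair %d" % (k // 2))
        pairs.append((W, b))
    return pairs

# Dummy arrays shaped like the first conv layer and the dense layer.
dummy = [np.zeros((3, 3, 1, 8)), np.zeros(8), np.zeros((288, 10)), np.zeros(10)]
pairs = pair_weights(dummy)
print([(W.shape, b.shape) for W, b in pairs])
```

In the notebook, each pair could additionally be compared with the shapes returned by `layer.get_weights()` for the target layer before assigning it.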

## Prediction

In [14]:
n_values = np.max(y_test) + 1
y_one_hot =  np.eye(n_values)[y_test]
model.evaluate(x = x_test, y = y_one_hot)[1]



0.93875
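
The one-hot conversion used before `model.evaluate` can be seen in isolation on a few made-up labels. Indexing the identity matrix with an integer label array picks one row per label. Note that deriving the number of classes from `np.max` assumes that the highest class id actually occurs in the labels.

```python
import numpy as np

labels = np.array([2, 0, 1, 2])      # made-up integer class labels
n_values = np.max(labels) + 1        # number of classes, here 3
one_hot = np.eye(n_values)[labels]   # one row of the identity matrix per label

# argmax along the class axis recovers the original integer labels.
recovered = np.argmax(one_hot, axis=1)
print(one_hot.shape, recovered)
```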

# References
https://www.kaggle.com/andrewrona22/an-example-of-cnn-using-tensorflow