<a href="https://colab.research.google.com/github/adekunleba/tensorflow_tutorials/blob/master/CNN_with_Tensorflow_Eager.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Learning to Write Convolutional Neural Networks with TensorFlow Eager Execution and the tf.keras API

In [0]:
import numpy as np
import tensorflow as tf

# In TensorFlow 1.x, eager execution must be enabled once, at program startup.
tf.enable_eager_execution()

This notebook uses the MNIST data to build a convolutional neural network (CNN) model with TensorFlow eager execution.

The main focus is on using eager execution throughout the training process: loading the data, converting it to a `Dataset` (the format TensorFlow recommends), passing the data to the model, and computing gradients as training proceeds.





In [0]:
from tensorflow.python.keras.datasets import mnist

In [0]:
# input_data.read_data_sets()

The MNIST loader in Keras datasets gives a quick way to get started: the images and labels come already prepared. In a custom project you might instead need to write your own data generator and read its output into a dataset.

As convenient as the MNIST data is, it arrives as NumPy arrays, so we still have to convert it to a `Dataset`.

The labels are also stored as plain digit values. With `tf.one_hot` we convert the NumPy array of digits to a one-hot encoding.
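
As a rough illustration of what the one-hot conversion produces, here is a plain-Python sketch. The helper `one_hot` is hypothetical and written only for this example; it mimics the result of `tf.one_hot` for a list of integer labels and is not TensorFlow's implementation.

```python
def one_hot(labels, depth):
    """Return one row per label, with 1.0 at the label's index and 0.0 elsewhere."""
    rows = []
    for label in labels:
        row = [0.0] * depth   # start from an all-zero vector of length `depth`
        row[label] = 1.0      # mark the position of the class
        rows.append(row)
    return rows

# Two example labels: the digits 3 and 0, with 10 possible classes
encoded = one_hot([3, 0], depth=10)
print(encoded[0])
print(encoded[1])
```

Each row has exactly one non-zero entry, so taking the `argmax` of a row recovers the original digit.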




In [4]:
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((-1, 28, 28, 1))
x_test = x_test.reshape((-1, 28, 28, 1))

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


In [0]:
# x_train is a NumPy array of shape (60000, 28, 28, 1)

In [0]:
# Convert the integer labels to one-hot vectors with TensorFlow
num_classes = 10
y_train = tf.one_hot(y_train, depth=num_classes)

In [0]:
y_test = tf.one_hot(y_test, depth=num_classes)

Using `tf.data.Dataset.from_tensor_slices`, we can then turn the NumPy arrays of `images` and `labels` into a Dataset.

*Note that we can also apply `map` functions to the dataset if we need to.*


A dataset is also easy to batch: after a call to `batch`, TensorFlow takes care of grouping the data into batches for the model.
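
To make the effect of batching concrete, the following plain-Python sketch lists the index ranges that consecutive batches would cover. The helper `batch_indices` is hypothetical and only illustrates the bookkeeping; it is not how `tf.data` is implemented. The example count of 60000 is the size of the MNIST training split.

```python
def batch_indices(num_examples, batch_size):
    """Yield (start, end) index pairs covering the data in consecutive batches."""
    for start in range(0, num_examples, batch_size):
        # The last batch is shorter when batch_size does not divide num_examples
        yield start, min(start + batch_size, num_examples)

num_train = 60000  # number of MNIST training images
batches = list(batch_indices(num_train, 32))
print(len(batches), batches[0], batches[-1])  # 1875 (0, 32) (59968, 60000)

# A size that does not divide evenly leaves a smaller final batch
print(list(batch_indices(10, 4)))  # [(0, 4), (4, 8), (8, 10)]
```

With a batch size of 32, one pass over the training data therefore takes 1875 steps.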



In [0]:
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train.numpy()))
train_dataset = train_dataset.batch(32)


test_dataset = tf.data.Dataset.from_tensor_slices((x_test, y_test.numpy()))
test_dataset = test_dataset.batch(1)

The model is built with the Model subclassing API, an imperative approach in TensorFlow. It gives a lot of flexibility and makes quick changes to the architecture easy.

With it, you control how the data moves through the forward pass of your network. I have found this most useful when trying out new ideas.

In [0]:
# Build your CNN model

class ConvBN(tf.keras.Model):
  """Convolution block: Conv2D, optional batch norm, optional global pooling, then ReLU."""

  def __init__(self, filters, size, apply_batchnorm=True, apply_pooling=False):
    super(ConvBN, self).__init__()
    self.apply_batchnorm = apply_batchnorm
    self.apply_pooling = apply_pooling
    initializer = tf.random_normal_initializer(0, 0.02)

    # strides=2 with "same" padding halves the spatial size (rounding up)
    self.conv = tf.keras.layers.Conv2D(filters, (size, size), strides=2, padding="same",
                                       kernel_initializer=initializer, use_bias=False)
    self.batchnorm = tf.keras.layers.BatchNormalization()
    self.dropout = tf.keras.layers.Dropout(0.5)  # defined here but not applied in call()
    self.pool = tf.keras.layers.GlobalAveragePooling2D()

  def call(self, inputs, training):
    x = self.conv(inputs)
    if self.apply_batchnorm:
      # Batch norm behaves differently in training and inference mode
      x = self.batchnorm(x, training=training)
    if self.apply_pooling:
      x = self.pool(x)
    x = tf.nn.relu(x)
    return x

In [0]:
class MNISTCNN(tf.keras.Model):
  
  def __init__(self, num_classes):
    super(MNISTCNN, self).__init__()
    self.conv1 = ConvBN(64, 4)
    self.conv2 = ConvBN(128, 4, apply_pooling=False)
    self.conv3 = ConvBN(128, 4, apply_pooling=False)
    
    self.flatten = tf.keras.layers.Flatten()
    self.last = tf.keras.layers.Dense(num_classes)
  
  @tf.contrib.eager.defun
  def call(self, inputs, training):
    x = self.conv1(inputs, training=training)
    x = self.conv2(x, training=training)
    x = self.conv3(x, training=training)
    
    
    last = self.last(self.flatten(x))
    return last
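
The spatial sizes inside this network can be worked out by hand. The sketch below assumes the usual rule for `padding="same"`, where the output size is the input size divided by the stride, rounded up. The helper name `same_padding_output` is hypothetical and used only here.

```python
import math

def same_padding_output(size, stride):
    """Spatial output size of a convolution with "same" padding."""
    return math.ceil(size / stride)

size = 28    # MNIST images are 28 x 28
sizes = []
for _ in range(3):   # three ConvBN blocks, each with strides=2
    size = same_padding_output(size, stride=2)
    sizes.append(size)

# The last block has 128 filters, so Flatten sees size * size * 128 values
flattened = size * size * 128
print(sizes, flattened)  # [14, 7, 4] 2048
```

So the `Dense` layer at the end receives a vector of 2048 features per image.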

In [0]:
model = MNISTCNN(10)

In [0]:
# model.fit_generator(): subclassed TensorFlow models can also be fit on a generator.

In [0]:
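# A subclassed model has no weights until it has seen an input,
# so run one dummy batch through it before asking for a summary.
_ = model(tf.zeros((1, 28, 28, 1)), training=False)
model.summary()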

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv_bn (ConvBN)             multiple                  1280      
_________________________________________________________________
conv_bn_1 (ConvBN)           multiple                  131584    
_________________________________________________________________
conv_bn_2 (ConvBN)           multiple                  262656    
_________________________________________________________________
flatten (Flatten)            multiple                  0         
_________________________________________________________________
dense (Dense)                multiple                  20490     
Total params: 416,010
Trainable params: 415,370
Non-trainable params: 640
_________________________________________________________________
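
The parameter counts in this summary can be reproduced with a little arithmetic. The sketch below assumes 4 x 4 kernels without bias, four batch-norm values per filter (two trainable, two moving statistics), and a 4 x 4 x 128 feature map going into the dense layer (28 halved three times with rounding up). The helper `conv_bn_params` is hypothetical.

```python
def conv_bn_params(kernel, in_channels, filters):
    """Parameters of one ConvBN block: bias-free convolution plus batch norm."""
    conv = kernel * kernel * in_channels * filters   # use_bias=False, so no bias term
    batchnorm = 4 * filters                          # gamma, beta, moving mean, moving variance
    return conv + batchnorm

block_params = [
    conv_bn_params(4, 1, 64),     # first block sees the single grayscale channel
    conv_bn_params(4, 64, 128),
    conv_bn_params(4, 128, 128),
]
dense_params = 4 * 4 * 128 * 10 + 10     # weights plus one bias per class

total = sum(block_params) + dense_params
non_trainable = 2 * (64 + 128 + 128)     # moving mean and variance of each batch norm
print(block_params, dense_params)        # [1280, 131584, 262656] 20490
print(total, total - non_trainable, non_trainable)  # 416010 415370 640
```

The non-trainable parameters are the batch-norm moving statistics, which are updated during training but not by the optimizer.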


Next, set up the optimizer, the checkpoints, and the loss function used for training.

In [0]:
#Train the model

#define optimizer 
optimizer = tf.train.AdamOptimizer(2e-4, beta1=0.5)

In [0]:
import os
# Set up checkpoints
checkpoint_dir = './training_checkpoints'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt")
# Track the model as well as the optimizer so the learned weights are saved too
checkpoint = tf.train.Checkpoint(optimizer=optimizer, model=model)

In [0]:
def gradient_loss(logits, one_hot_labels):
  return tf.losses.softmax_cross_entropy(logits=logits, onehot_labels=one_hot_labels)
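
For intuition about what this loss measures, here is a plain-Python sketch of softmax cross-entropy for a single example. The function below is a hypothetical stand-in written for illustration; the TensorFlow call additionally reduces the per-example losses over the batch to one scalar.

```python
import math

def softmax_cross_entropy(logits, one_hot_label):
    """Cross-entropy between a one-hot label and the softmax of the logits."""
    # Subtracting the maximum logit keeps exp() numerically stable
    largest = max(logits)
    shifted = [z - largest for z in logits]
    log_sum = math.log(sum(math.exp(z) for z in shifted))
    log_probs = [z - log_sum for z in shifted]   # log of the softmax probabilities
    return -sum(t * lp for t, lp in zip(one_hot_label, log_probs))

label = [0.0, 1.0, 0.0]   # the true class is index 1
uniform = softmax_cross_entropy([0.0, 0.0, 0.0], label)     # model has no preference
confident = softmax_cross_entropy([0.0, 10.0, 0.0], label)  # model strongly favors class 1
print(uniform, confident)
```

With equal logits the loss equals `log(3)`, the cost of guessing among three classes; it shrinks toward zero as the logit of the correct class grows.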

In [0]:
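# Helper to print the model's prediction for a single test image

def generate_test(model, test_input, target):
  # Run in inference mode so batch norm uses its moving statistics
  prediction = model(test_input, training=False)
  # prediction has shape (1, 10)
  print("model predicted {} while actual image is {}".format(np.argmax(prediction), np.argmax(target)))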

In [0]:
import time
EPOCHS = 200

for epoch in range(EPOCHS):
  start = time.time()

  for input_image, target in train_dataset:
    # Record the forward pass so the tape can compute gradients of the loss
    with tf.GradientTape() as tape:
      logits = model(input_image, training=True)  # Run a forward pass
      loss = gradient_loss(logits=logits, one_hot_labels=target)  # target is already one-hot encoded

    # Differentiate only with respect to the trainable weights
    gradients = tape.gradient(loss, model.trainable_variables)

    # Apply one optimization step
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))

  # Every second epoch, show the prediction for one test image
  if epoch % 2 == 0:
    for test_image, tar in test_dataset.take(1):
      prediction = model(test_image, training=False)  # inference mode for batch norm
      print("model predicted {} while actual image is {}".format(np.argmax(prediction), np.argmax(tar)))

  # Save a checkpoint every fifth epoch
  if (epoch + 1) % 5 == 0:
    checkpoint.save(file_prefix=checkpoint_prefix)
    print('Time taken for epoch {} is {} sec\n'.format(epoch,
                                                       time.time()-start))

KeyboardInterrupt: ignored

In [0]:
!ls training_checkpoints/

checkpoint		    ckpt-3.data-00000-of-00001
ckpt-1.data-00000-of-00001  ckpt-3.index
ckpt-1.index		    ckpt-4.data-00000-of-00001
ckpt-2.data-00000-of-00001  ckpt-4.index
ckpt-2.index


In [0]:
np.argmax(np.array([0, 0, 1, 0, 0]))

2