##### In this notebook, the model is created and trained to analyse individual frames of the video independently. The MobileNetV2 model is used as a convolutional base to extract features. At this stage the convolutional base is frozen, and only the layers added on top of it are trained.

In [1]:
import tensorflow as tf
from keras.preprocessing.image import load_img
import pandas as pd
import numpy as np
import os
from random import shuffle
import matplotlib.pyplot as plt
from tqdm import tqdm
import keras
from keras.applications import MobileNetV2
from keras.models import Model,load_model
from keras.layers import *
import keras.backend as K
%matplotlib inline

Using TensorFlow backend.


##### To use the tf.data API, a TensorFlow session must be defined explicitly in order to run the initializer of the data iterator

In [2]:
sess=tf.Session()
K.set_session(sess)

##### This cell contains the parsing function that decodes serialized samples read from tfrecords files. It returns a tuple of tf tensors: 1) the image in the format (height, width, color_channel), 2) the label, an array of two numbers, and 3) the weights used in the loss function to account for the unbalanced dataset. The weights are stored in the last two columns of the feature 'train/label'.

In [3]:
def _parse_function(example_proto):
    feature = {'train/image': tf.FixedLenFeature((), tf.string),
               #'train/name': tf.FixedLenFeature((), tf.string),
               'train/label': tf.FixedLenFeature((4,), tf.float32)}
    
    parsed_features = tf.parse_single_example(example_proto, features=feature)
    
    image = tf.cast(tf.image.decode_jpeg(parsed_features['train/image']),dtype=tf.float32)/255.
    image = image[:448,:704,:]#crop the images from different cameras to make them of the same size 
    
    labels = parsed_features['train/label'][0:2]
    
    weights  = parsed_features['train/label'][2:]
   
    return image, labels, weights
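
The slicing done by the parsing function can be tried out without TensorFlow. The following is a minimal NumPy sketch, assuming a frame slightly larger than the crop and a made-up 4-element label vector; `split_label_vector` and `crop_frame` are hypothetical helper names, not part of the notebook.

```python
import numpy as np

def split_label_vector(label_vector):
    """Split a 4-element 'train/label' vector into (labels, weights)."""
    label_vector = np.asarray(label_vector, dtype=np.float32)
    if label_vector.shape != (4,):
        raise ValueError("expected a vector of 4 floats")
    # First two entries are the regression targets, last two the loss weights.
    return label_vector[0:2], label_vector[2:]

def crop_frame(image, height=448, width=704):
    """Crop a (height, width, channels) frame to a common size."""
    return image[:height, :width, :]

# Made-up values, for illustration only.
labels, weights = split_label_vector([0.3, -0.1, 1.0, 2.5])
frame = np.zeros((480, 720, 3), dtype=np.float32)  # assumed camera resolution
cropped = crop_frame(frame)
print(labels.shape, weights.shape, cropped.shape)
```

Note that the first axis of the decoded image is the height, so the crop keeps 448 rows and 704 columns.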

##### Load the model that forms the convolutional base. This base is fully convolutional (FCN), so it accepts images of arbitrary size

In [4]:
mobinet=MobileNetV2(include_top=False) #any fcn model can be used here instead of MobileNetV2 



##### Defining the iterator that feeds data into the model as tf tensors

In [5]:
#the filenames of the dataset 
tfrecord_dir = '../Data_samples/train data/tfrecords'
filenames = [os.path.join(root, file) for root, _, files in os.walk(tfrecord_dir) for file in files if file.endswith('tfrecords')]
batch_size=4

#creating a dataset
dataset = tf.data.TFRecordDataset(filenames)
dataset = dataset.map(_parse_function)
dataset = dataset.repeat(100)
dataset = dataset.batch(batch_size)
iterator = dataset.make_initializable_iterator()
next_element = iterator.get_next()

# creating the generator to feed data for training

def batch_generator(dataset):
    sess.run(iterator.initializer)
    while True:
        inputs, labels, weights = sess.run(next_element)
        features=mobinet.predict_on_batch(inputs)
        yield ([features, labels, weights],np.zeros((len(labels),))) # np.zeros is used as dummy labels, since the real labels
                                                                      # are fed alongside the sample; sized to the actual batch
            
gen=batch_generator(dataset)
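
The structure of what the generator yields can be sketched without a session or a convolutional base. This is a minimal stand-in, assuming an in-memory list of batches instead of the tf.data iterator. `dummy_target_generator` and the batch contents are hypothetical. The 14x22x1280 feature shape assumes a 448x704 input and MobileNetV2's overall stride of 32.

```python
import numpy as np

def dummy_target_generator(batches):
    """Endlessly yield ([features, labels, weights], dummy_targets) tuples.

    `batches` is a hypothetical list of (features, labels, weights) arrays
    standing in for the tf.data iterator and the MobileNetV2 features.
    """
    while True:
        for features, labels, weights in batches:
            # One dummy target per sample; the values are never used.
            dummy_targets = np.zeros((len(labels),), dtype=np.float32)
            yield [features, labels, weights], dummy_targets

# Made-up batch of 4 samples.
batch = (np.ones((4, 14, 22, 1280), dtype=np.float32),
         np.zeros((4, 2), dtype=np.float32),
         np.ones((4, 2), dtype=np.float32))
example_gen = dummy_target_generator([batch])
inputs, targets = next(example_gen)
print([a.shape for a in inputs], targets.shape)
```

Sizing the dummy targets from the batch itself keeps them consistent even if the last batch is smaller than `batch_size`.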

##### Testing the generator

In [6]:
next(gen)

([array([[[[2.4537668 , 0.        , 0.        , ..., 0.        ,
            0.        , 0.        ],
           [4.349057  , 0.        , 0.        , ..., 0.        ,
            0.        , 0.8313675 ],
           [4.85713   , 0.        , 0.        , ..., 0.        ,
            0.        , 0.        ],
           ...,
           [0.        , 0.        , 0.        , ..., 0.        ,
            0.        , 0.        ],
           [0.        , 0.        , 0.        , ..., 0.        ,
            0.        , 0.        ],
           [0.        , 0.        , 0.        , ..., 0.        ,
            0.        , 0.        ]],
  
          [[4.714588  , 0.        , 0.        , ..., 0.        ,
            0.        , 0.        ],
           [6.        , 0.        , 0.        , ..., 0.        ,
            0.        , 0.82178545],
           [4.7617464 , 0.        , 0.        , ..., 0.        ,
            0.        , 0.        ],
           ...,
           [0.        , 0.        , 0.        

##### The definition of the top layers of the model. The first 5 layers of the top represent a self-attention mechanism that replaces the global average pooling of the original MobileNetV2. After that comes what is essentially a linear regression model on the features produced by the convolutional base and the self-attention mechanism. The last 4 layers build a custom loss function that uses weights to compensate for the unbalanced dataset.

In [7]:

inp0 = Input(shape=(None,None,1280))
inp1 = Input(shape=(2,))
inp2 = Input(shape=(2,))

x = inp0
labels = inp1
loss_weights = inp2


weights=Conv2D(1,(3,3),activation='sigmoid',padding='SAME',kernel_initializer=keras.initializers.Zeros(),
                                                   bias_initializer=keras.initializers.Ones())(x)

#next two lines make the weights sum up to 1: weights->weights/sum(weights)
norm = Lambda(lambda t: 1/K.sum(t,axis=[1,2]))(weights)

weights = merge.multiply([norm,weights])

#next two lines calculate weighted average, which is used instead of original global average
x = merge.multiply([x,weights])

x = Lambda(lambda t: K.sum(t,axis=[1,2]))(x)

# calculating the final prediction using linear regression on the previously generated features

prediction = Dense(2)(x)

# constructing the custom weighted loss
err = subtract([prediction,labels])
sqr_err = Lambda(K.square)(err)
weighted_sqr_err = multiply([sqr_err,loss_weights])
loss = Lambda(K.mean)(weighted_sqr_err)
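
The arithmetic of the attention pooling and of the weighted loss can be checked in plain NumPy. This is a minimal sketch rather than the Keras graph itself: `attention_pool` and `weighted_mse` are hypothetical names, and the attention scores are passed in directly instead of coming from the Conv2D layer.

```python
import numpy as np

def attention_pool(features, scores):
    """Weighted average over the spatial axes.

    features: (batch, H, W, C) feature maps
    scores:   (batch, H, W, 1) positive attention scores (e.g. sigmoid outputs)
    """
    norm = 1.0 / scores.sum(axis=(1, 2), keepdims=True)  # (batch, 1, 1, 1)
    weights = scores * norm                               # sums to 1 over H, W
    return (features * weights).sum(axis=(1, 2))          # (batch, C)

def weighted_mse(prediction, labels, loss_weights):
    """Mean of the squared errors, each scaled by its loss weight."""
    return np.mean(np.square(prediction - labels) * loss_weights)

rng = np.random.RandomState(0)
features = rng.rand(2, 3, 5, 8)
# Zero kernel and unit bias give sigmoid(1) at every position before training.
uniform_scores = np.full((2, 3, 5, 1), 1.0 / (1.0 + np.exp(-1.0)))
pooled = attention_pool(features, uniform_scores)
print(np.allclose(pooled, features.mean(axis=(1, 2))))
```

With the zero kernel and unit bias used as initializers, every position starts with the same score, so the pooling initially reduces to an ordinary global average and only departs from it as the convolution is trained.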

##### Creating the model

In [8]:
model_top=Model(inputs=[inp0,inp1,inp2],outputs=[loss])#,weights

In [9]:
model_top.summary()

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_2 (InputLayer)            (None, None, None, 1 0                                            
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, None, None, 1 11521       input_2[0][0]                    
__________________________________________________________________________________________________
lambda_1 (Lambda)               (None, 1)            0           conv2d_1[0][0]                   
__________________________________________________________________________________________________
multiply_1 (Multiply)           (None, None, None, 1 0           lambda_1[0][0]                   
                                                                 conv2d_1[0][0]                   
__________

##### To use the fit_generator method, we need to define a custom dummy loss function. Since the model already produces the loss as its output, the function only needs to return the predicted value

In [10]:
def loss_compile(y_true,y):
    return y
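
The effect of such a pass-through loss can be shown with plain arrays. In this small sketch, assuming made-up values for the model output, `loss_passthrough` is a hypothetical stand-in with the same signature as `loss_compile`.

```python
import numpy as np

def loss_passthrough(y_true, y_pred):
    # Ignore the dummy targets and hand the model output back unchanged.
    return y_pred

model_output = np.array([0.25, 0.75], dtype=np.float32)  # made-up loss values
results = [loss_passthrough(dummy, model_output)
           for dummy in (np.zeros(2), np.ones(2))]
print(all(np.array_equal(r, model_output) for r in results))
```

Whatever dummy targets are supplied, the value being minimized is the output of the model, which is the weighted squared error built into the graph.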
    

##### Defining the optimizer and compiling the model for training

In [11]:
adagr = keras.optimizers.Adagrad(lr=.001)

In [12]:
model_top.compile(optimizer=adagr,loss=loss_compile)

##### If the model has already been trained and saved, uncomment the following line to load the saved model

In [13]:
#model_top=load_model('mobile_top_3.h5',custom_objects={'loss_compile':loss_compile,'relu6':ReLU(6.),'tf':tf})

##### Training of the model. The for-loop is intended to return the decaying learning rate to its original value, creating an additional disturbance that helps avoid getting stuck in a local minimum, and it also saves the model every N iterations.

In [16]:
for i in tqdm(range(2)):
    model_top.fit_generator(gen,epochs=1,steps_per_epoch=2)
    model_top.save('mobile_top_3.h5')



  0%|                                                                                            | 0/2 [00:00<?, ?it/s]

Epoch 1/1




 50%|██████████████████████████████████████████                                          | 1/2 [00:55<00:55, 55.83s/it]

Epoch 1/1




100%|████████████████████████████████████████████████████████████████████████████████████| 2/2 [01:51<00:00, 55.98s/it]
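
Adagrad's step size decays because the learning rate is divided by the root of the accumulated squared gradients. The following sketch shows the textbook update in plain Python, not the Keras implementation; `adagrad_effective_lr` and the gradient values are made up for illustration.

```python
import math

def adagrad_effective_lr(grads, lr=0.001, eps=1e-7):
    """Return lr / (sqrt(accumulator) + eps) after each gradient."""
    accumulator = 0.0
    rates = []
    for g in grads:
        accumulator += g * g  # squared gradients only ever accumulate
        rates.append(lr / (math.sqrt(accumulator) + eps))
    return rates

rates = adagrad_effective_lr([1.0, 1.0, 1.0, 1.0])
print(rates)
```

The effective rate shrinks with every step. Calling the function again starts from an empty accumulator and so from the original rate, which is the kind of reset the training loop aims for.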



##### When training in the cloud, switch off the virtual machine after training has finished to avoid additional costs for GPU time

In [None]:
!sudo poweroff

##### After the top layers have been pretrained, I build the full model, including the convolutional base, and save it for further training

In [13]:
inp0=Input((None,None,None))
inp1 = Input(shape=(2,))
inp2 = Input(shape=(2,))

x = inp0
labels = inp1
loss_weights = inp2

x=mobinet(x)
loss=model_top([x,inp1,inp2])

In [14]:
full_model = Model(inputs=[inp0,inp1,inp2],outputs=[loss])

In [16]:
full_model.summary()

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_8 (InputLayer)            (None, None, None, N 0                                            
__________________________________________________________________________________________________
mobilenetv2_1.00_224 (Model)    (None, None, None, 1 2257984     input_8[0][0]                    
__________________________________________________________________________________________________
input_9 (InputLayer)            (None, 2)            0                                            
__________________________________________________________________________________________________
input_10 (InputLayer)           (None, 2)            0                                            
__________________________________________________________________________________________________
model_1 (M

In [None]:
full_model.save('mobile_mini_top_attention_3.h5')