# Investigate InceptionV3 feature extraction

Here we compare the TensorFlow Hub and Keras implementations of the pretrained InceptionV3 feature extractor.
In both cases the result should be the 2048-dimensional feature vector derived from the last conv layer.

## Questions
* Why does Keras give different results in different sessions? Is it a bug?
* Why do the tf.hub and keras.applications feature vectors differ?
* What pooling does the TensorFlow model use? In Keras we use max pooling.



In [1]:
import numpy as np

import tensorflow as tf
import tensorflow.keras as keras 
import tensorflow_hub as hub
from tensorflow.python.client import device_lib

tf.reset_default_graph()
#tf.set_random_seed(1234)


print(tf.__version__)
print(keras.__version__)
devices = [x.name for x in device_lib.list_local_devices()]
print(devices)


batch_size = 5

batch = np.zeros(shape=(batch_size,299,299,3),dtype=np.float32)
for i in range(batch_size):
    batch[i] = batch[i] + i

1.12.0
2.1.6-tf
['/device:CPU:0', '/device:GPU:0']


In [2]:
with tf.name_scope("Inputs"):
    images = tf.placeholder(shape=(None,299,299,3),dtype=tf.float32)

with tf.name_scope("Tensorflow"):
    tf_inception_v3 = hub.Module("https://tfhub.dev/google/imagenet/inception_v3/feature_vector/1")
    tf_features = tf_inception_v3(images)

with tf.name_scope("Keras"):
    keras_inception_v3 = keras.applications.inception_v3.InceptionV3(include_top=False, weights='imagenet', input_shape=(299,299,3), pooling='max')
    keras_features = keras_inception_v3(images)


for device in devices:

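    # NOTE: tf.device only applies to ops created inside the block; both models above were built outside it.
    with tf.device(device):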
        
        print('Using device: ',device)
        for i in range(3):
            print('Session ',i)
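            with tf.Session() as sess:
                # NOTE: in a fresh session this runs the initializers of all global variables, including the Keras ones.
                sess.run(tf.global_variables_initializer())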
                
                print('first run')
                tf_val,keras_val = sess.run([tf_features, keras_features],feed_dict={images:batch})
                for b in range(batch_size):
                    print('image{}: tf={}  keras={}'.format(b,np.sum(tf_val[b]),np.sum(keras_val[b])))
                    
                print('second run')
                tf_val,keras_val = sess.run([tf_features, keras_features],feed_dict={images:batch})
                for b in range(batch_size):
                    print('image{}: tf={}  keras={}'.format(b,np.sum(tf_val[b]),np.sum(keras_val[b])))
                print('')

INFO:tensorflow:Using C:\Users\xxxxxx\AppData\Local\Temp\tfhub_modules to cache modules.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
Using device:  /device:CPU:0
Session  0
first run
image0: tf=204.5306396484375  keras=0.0
image1: tf=200.6629638671875  keras=1.2956523895263672
image2: tf=379.994873046875  keras=2.5913047790527344
image3: tf=577.4241943359375  keras=3.8869566917419434
image4: tf=1049.082275390625  keras=5.182609558105469
second run
image0: tf=204.5306396484375  keras=0.0
image1: tf=200.6629638671875  keras=1.2956527471542358
image2: tf=379.994873046875  keras=2.5913054943084717
image3: tf=577.4241943359375  keras=3.8869566917419434
image4: tf=1049.082275390625  keras=5.182610988616943

Session  1
first run
image0: tf=204.5306396484375  keras=0.0
image1: tf=200.6629638671875  keras=0.6363641023635864
image2: tf=379.994873046875  keras=1.2727282047271729
image3: tf=577.4241943359375  keras=1.9090931415557861
image4: tf=1049.082

## Observations
### Keras inconsistency
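The TensorFlow model appears to be consistent across all devices and sessions, whereas the Keras one is consistent only within a single session. Is the issue related to variable initialization? Yes: with a fixed seed (tf.set_random_seed(1234)) Keras becomes consistent as well. This suggests that sess.run(tf.global_variables_initializer()) re-initializes the Keras weights randomly in every new session, so the Keras values above probably do not come from the pretrained ImageNet weights.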
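
One more hint that the Keras values above may come from freshly initialized rather than pretrained weights is their pattern: they are zero for the all-zero image and grow in proportion to the constant pixel value (about 1.2957, 2.5913, 3.8870, 5.1826 in session 0). A network built only from bias-free linear layers, ReLU and pooling behaves this way, f(k*x) = k*f(x) for k >= 0. A freshly initialized InceptionV3 plausibly falls into that class if its batch-norm offsets and moving means are still zero, which is an assumption and was not checked here. The NumPy sketch below illustrates the property on a toy network; `toy_features`, `random_weights` and the weight shapes are hypothetical and unrelated to the real model.

```python
import numpy as np


def toy_features(x, w1, w2):
    """Toy bias-free ReLU network: two linear layers, each followed by ReLU."""
    hidden = np.maximum(x @ w1, 0.0)   # linear layer without bias, then ReLU
    return np.maximum(hidden @ w2, 0.0)


def random_weights(seed):
    """Draw fresh random weights, standing in for a variable initializer."""
    rng = np.random.RandomState(seed)
    return rng.normal(size=(12, 16)), rng.normal(size=(16, 32))


# Constant-valued inputs, mirroring batch[i] = i in the notebook.
batch = np.stack([np.full(12, float(i)) for i in range(5)])

w1, w2 = random_weights(seed=0)
sums = toy_features(batch, w1, w2).sum(axis=1)

# Zero input gives zero output, and the output scales linearly with the input.
print(sums[0] == 0.0)
print(np.allclose(sums[2], 2 * sums[1]), np.allclose(sums[4], 4 * sums[1]))

# Different random weights (a "new session") change the values ...
other = toy_features(batch, *random_weights(seed=1)).sum(axis=1)
# ... while re-using the same seed reproduces them exactly.
same = toy_features(batch, *random_weights(seed=0)).sum(axis=1)
print(np.allclose(sums, other), np.array_equal(sums, same))
```

The pretrained TensorFlow Hub model, by contrast, returns a non-zero sum for the all-zero image, which is what one would expect once biases or batch-norm statistics are non-trivial.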

### What is the pooling used in both models?
Looking at the TensorFlow graph, the pooling operation appears to be tf.reduce_mean with keepdims=True and axis/reduction_indices=(1,2), which reduces ?,8,8,2048 to ?,1,1,2048. This still does not explain the difference in the magnitude of the output values: Keras, as configured, uses max pooling and should, for the same feature maps, produce larger values.

Note: 2048 is the number of filters in the last conv layer (I guess), and 8x8 is the size of the "spatial" feature maps of the penultimate conv layer before pooling.
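
To make the pooling question concrete, here is a minimal NumPy sketch of the two reductions being compared: global average pooling (the equivalent of tf.reduce_mean over axes (1,2) with keepdims=True) and global max pooling (what pooling='max' requests in Keras). The random `feature_maps` array is only a stand-in for the real ?,8,8,2048 activations, not actual model output.

```python
import numpy as np

# Stand-in for the last feature maps: (batch, height, width, channels).
rng = np.random.RandomState(0)
feature_maps = rng.uniform(0.0, 1.0, size=(5, 8, 8, 2048)).astype(np.float32)

# Global average pooling, mirroring tf.reduce_mean(..., axis=(1, 2), keepdims=True).
avg_pooled = feature_maps.mean(axis=(1, 2), keepdims=True)   # shape (5, 1, 1, 2048)

# Global max pooling over the same spatial axes, one vector per image.
max_pooled = feature_maps.max(axis=(1, 2))                    # shape (5, 2048)

print(avg_pooled.shape, max_pooled.shape)

# On identical feature maps the max is never smaller than the mean,
# so max pooling alone would make the Keras sums larger, not smaller.
print(np.all(max_pooled >= avg_pooled.reshape(5, 2048)))
```

Since the Keras sums above are far smaller than the TensorFlow ones, the pooling type is unlikely to be the main cause of the gap.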


In [3]:
with tf.Session() as sess:
    # Dump the graph so it can be inspected with `tensorboard --logdir tensorboard`.
    writer = tf.summary.FileWriter("tensorboard", sess.graph)
    writer.close()