<a href="https://colab.research.google.com/github/suchig/Intel-Image-Classification/blob/master/Embedding_Visualization_using_Tensorboard.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Visualization of embeddings using TensorBoard

---

Downloading files from Kaggle

In [0]:
# Download the Intel Image Classification dataset from Kaggle.
# files.upload() prompts for kaggle.json (your Kaggle API token).
!pip install -q kaggle
from google.colab import files
files.upload()
# -p avoids an error if the directory already exists
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json
!kaggle datasets download -d puneet6060/intel-image-classification
!unzip intel-image-classification.zip
!rm intel-image-classification.zip
# Only the test split is used here, so the other archives are deleted.
!rm seg_train.zip
!unzip seg_test.zip -d /tmp
!rm seg_test.zip
!rm seg_pred.zip

In [0]:
import os

import numpy as np
import tensorflow as tf
from PIL import Image
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator

In [0]:
LOG_DIR="."
IMAGE_DIR="/tmp/seg_test"
EMBEDDING = "embedding.tsv"
META = "meta.tsv"
SPRITES = "sprites.png"
CHECKPOINT = "intel_tensorboard.ckpt"

# Image sprite and label creation

In [0]:
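# Collect a 64x64 thumbnail and an integer class label for every test image.
# Folders and files are visited in sorted order so that row i here refers to
# the same image as row i of the features predicted later with shuffle=False.
y = []
sprite = []

for label, class_name in enumerate(sorted(os.listdir(IMAGE_DIR))):
  class_dir = os.path.join(IMAGE_DIR, class_name)
  for f in sorted(os.listdir(class_dir)):
    # target_size is (height, width); the channel count is not part of it
    sprite_image = image.load_img(os.path.join(class_dir, f), target_size=(64, 64))
    sprite.append(image.img_to_array(sprite_image))
    y.append(label)

sprite = np.asarray(sprite)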
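The labels and thumbnails only line up with the embedding rows if the images are visited in a fixed, reproducible order. The sketch below shows one way to get such an order with the standard library alone, assuming the class folders sit directly under the image directory. `list_images_in_order` is a hypothetical helper, and the folder and file names in the usage part are made up.

```python
import os
import tempfile


def list_images_in_order(image_dir):
    """Return (paths, labels, class_names) in a deterministic order.

    Classes are sorted by folder name and files by file name, so the
    result does not depend on the arbitrary order of os.listdir().
    """
    class_names = sorted(
        d for d in os.listdir(image_dir)
        if os.path.isdir(os.path.join(image_dir, d))
    )
    paths, labels = [], []
    for label, class_name in enumerate(class_names):
        class_dir = os.path.join(image_dir, class_name)
        for file_name in sorted(os.listdir(class_dir)):
            paths.append(os.path.join(class_dir, file_name))
            labels.append(label)
    return paths, labels, class_names


# Usage with a throwaway directory tree (folder and file names are made up).
with tempfile.TemporaryDirectory() as root:
    for class_name, files in {"sea": ["b.jpg", "a.jpg"], "forest": ["c.jpg"]}.items():
        os.makedirs(os.path.join(root, class_name))
        for file_name in files:
            open(os.path.join(root, class_name, file_name), "w").close()

    paths, labels, class_names = list_images_in_order(root)
    relative_paths = [os.path.relpath(p, root).replace(os.sep, "/") for p in paths]
```

Returning `class_names` alongside the integer labels keeps the mapping from label index to folder name available for later use, for example when writing the metadata file.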

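Load the model that was saved during classification and remove its last layer, which performs the classification. The model's output then comes from the last remaining fully connected layer (dense_2, 1024 units), and these activations are used as the embedding features. The trailing dropout layer is inactive at prediction time, so it does not alter them.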

In [6]:
files.upload()                 # upload the saved model file (model12)
model = load_model("model12")
model.pop()                    # drop the final classification layer
model.summary()

Saving model12 to model12



Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
vgg19 (Model)                (None, 4, 4, 512)         20024384  
_________________________________________________________________
global_average_pooling2d (Gl (None, 512)               0         
_________________________________________________________________
dense (Dense)                (None, 4096)              2101248   
_________________________________________________________________
dense_1 (Dense)              (None, 2048)              8390656   
_________________________________________________________________
dense_2 (Dense)              (None, 1024)              2098176   
_________________________________________________________________
dropout (Dropout)            (None, 1024)              0         
=================================================================
Total params: 32,620,614
Trainable params: 12,596,230
Non-trainable params: 20,024,384
_________________________________________________________________

Run the truncated model on the test images to obtain one 1024-dimensional feature vector per image, and save the vectors as a tab-separated file (embedding.tsv).

In [7]:
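# Feed the test images through the truncated model in directory order.
# shuffle=False keeps the rows aligned with the labels and the sprite.
validation_datagen = ImageDataGenerator(rescale=1.0/255.)
eval_generator = validation_datagen.flow_from_directory(
    IMAGE_DIR,
    target_size=(150,150),
    batch_size=50,
    class_mode='categorical',
    shuffle=False)

# One row per image, one column per feature of the last dense layer
predict_val = model.predict_generator(eval_generator)

np.savetxt(LOG_DIR+"/"+EMBEDDING, predict_val, delimiter="\t")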


Found 3000 images belonging to 6 classes.
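Before moving on, it can help to confirm that the saved file really has one row per image and the same number of columns in every row. The following is a minimal sketch using only the standard library; `tsv_shape` is a hypothetical helper and the demo file is made up. For this notebook the expected result on embedding.tsv would be 3000 rows and 1024 columns.

```python
import os
import tempfile


def tsv_shape(path):
    """Return (rows, columns) of a tab-separated file of numbers.

    Raises ValueError if the rows do not all have the same length.
    """
    n_rows, n_cols = 0, None
    with open(path) as handle:
        for line in handle:
            if not line.strip():
                continue  # ignore blank lines
            values = [float(v) for v in line.rstrip("\n").split("\t")]
            if n_cols is None:
                n_cols = len(values)
            elif len(values) != n_cols:
                raise ValueError("row %d has %d columns, expected %d"
                                 % (n_rows + 1, len(values), n_cols))
            n_rows += 1
    return n_rows, (n_cols or 0)


# Usage on a tiny made-up file: two points with three features each.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "embedding_demo.tsv")
    with open(demo_path, "w") as handle:
        handle.write("0.1\t0.2\t0.3\n")
        handle.write("1.5e-01\t0.0\t2.0\n")
    demo_shape = tsv_shape(demo_path)
```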


Save the labels (y) as meta.tsv, one label per line and in the same order as the rows of embedding.tsv.

In [0]:
# fmt="%d" writes the labels as integers (0, 1, ...) rather than as floats
np.savetxt(LOG_DIR+"/"+META, y, fmt="%d", delimiter="\t")
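The projector displays the metadata values as they are written, so class names can be easier to read than integer labels. The sketch below writes names instead of numbers, assuming each label is an index into a list of class names. `write_metadata` is a hypothetical helper and the class names in the usage are placeholders. A single-column metadata file carries no header row.

```python
import os
import tempfile


def write_metadata(path, labels, class_names):
    """Write one class name per line, in the same order as the embeddings.

    A single-column metadata file has no header row.
    """
    with open(path, "w") as handle:
        for label in labels:
            handle.write(class_names[label] + "\n")


# Usage with made-up labels and placeholder class names.
with tempfile.TemporaryDirectory() as tmp:
    demo_meta = os.path.join(tmp, "meta_demo.tsv")
    write_metadata(demo_meta, [0, 0, 1], ["forest", "sea"])
    with open(demo_meta) as handle:
        demo_lines = handle.read().splitlines()
```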

Create the sprite image, a single grid of all the thumbnails, and save it.

In [0]:
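def create_sprite(data):
    """Paste the thumbnails row by row into one square sprite image."""
    # Side length of the smallest square grid that holds every thumbnail
    n = int(np.ceil(np.sqrt(data.shape[0])))

    image_height = 64
    image_width = 64

    # Transparent canvas; unused cells at the end of the grid stay empty
    big_image = Image.new(
      mode='RGBA',
      size=(image_width * n, image_height * n),
      color=(0,0,0,0))

    for i in range(len(data)):
      row = i // n
      col = i % n

      row_loc = row * image_height
      col_loc = col * image_width
      big_image.paste(image.array_to_img(data[i]), (col_loc, row_loc))

    return big_image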

In [0]:
sprite_data = create_sprite(sprite)
sprite_data.save(LOG_DIR+"/"+SPRITES)
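The grid arithmetic behind the sprite can be checked on its own. The sketch below computes the grid side and the pixel size of the sprite for the 3000 test images, and the position of a given thumbnail; `sprite_layout` and `thumbnail_origin` are hypothetical helpers that mirror the row-by-row layout used above. The size matters because TensorBoard's documentation gives 8192 x 8192 px as the largest supported sprite.

```python
import math


def sprite_layout(n_images, thumb_size=64):
    """Return (grid_side, sprite_side_px) for a square sprite sheet."""
    grid_side = int(math.ceil(math.sqrt(n_images)))
    return grid_side, grid_side * thumb_size


def thumbnail_origin(index, grid_side, thumb_size=64):
    """Return the (x, y) pixel offset of thumbnail `index`, filled row by row."""
    row, col = divmod(index, grid_side)
    return col * thumb_size, row * thumb_size


# 3000 test images, as reported by the generator in this notebook
grid_side, sprite_side = sprite_layout(3000)

# Index 55 is the first thumbnail of the second row in a 55-wide grid
second_row_start = thumbnail_origin(55, grid_side)
last_thumbnail = thumbnail_origin(2999, grid_side)
```

With 64-pixel thumbnails the sprite comes out at 3520 x 3520 px, well inside that limit.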

The most crucial steps are
- to save the features (embedding vectors) as a checkpoint file using tf.train.Saver.save. This generates the checkpoint data, meta data (not to be confused with the label metadata) and index files.
- to tell the projector the path to the label metadata, the path to the sprite file and the name of the tensor that holds the features.

projector.visualize_embeddings then writes a configuration file (projector_config.pbtxt) to the log directory, based on this embedding information.

Outside of the notebook, the following has to be done
- Store all the generated files in a log directory
- Run tensorboard --logdir=LOG_DIR_PATH
- Go to localhost:6006 and open the Projector tab to view the embeddings
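To make the configuration step less opaque, the sketch below builds by hand the kind of text that ends up in projector_config.pbtxt for a single embedding. It is illustrative only: `projector_config_text` is a hypothetical helper, the values are the ones used in this notebook (a variable named features has the tensor name features:0), and the exact field order and whitespace that TensorBoard writes may differ.

```python
def projector_config_text(tensor_name, metadata_path, sprite_path, thumb_size):
    """Build the text of a single-embedding projector config (illustrative)."""
    lines = [
        "embeddings {",
        '  tensor_name: "%s"' % tensor_name,
        '  metadata_path: "%s"' % metadata_path,
        "  sprite {",
        '    image_path: "%s"' % sprite_path,
        # One single_image_dim entry per thumbnail dimension
        "    single_image_dim: %d" % thumb_size[0],
        "    single_image_dim: %d" % thumb_size[1],
        "  }",
        "}",
    ]
    return "\n".join(lines) + "\n"


config_text = projector_config_text("features:0", "meta.tsv", "sprites.png", (64, 64))
print(config_text)
```

Reading this file in the log directory is a quick way to check which tensor, metadata file and sprite the projector will try to load.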

In [0]:
# tf.contrib and tf.Session exist only in TensorFlow 1.x
from tensorflow.contrib.tensorboard.plugins import projector

# Wrap the features in a variable so they can be stored in a checkpoint
features = tf.Variable(predict_val, name='features')
with tf.Session() as sess:
    saver = tf.train.Saver([features])

    sess.run(features.initializer)
    saver.save(sess, LOG_DIR +"/"+CHECKPOINT)

    config = projector.ProjectorConfig()

    embedding = config.embeddings.add()
    embedding.tensor_name = features.name
    # Relative paths are resolved against the log directory, so the bare
    # file names keep working if LOG_DIR is changed from "."
    embedding.metadata_path = META

    embedding.sprite.image_path = SPRITES
    embedding.sprite.single_image_dim.extend([64,64])

    # Writes projector_config.pbtxt into LOG_DIR
    projector.visualize_embeddings(tf.summary.FileWriter(LOG_DIR), config)