In this notebook we demonstrate how to use TensorBoard to visualize the word embeddings we created in the Training_embeddings_using_gensim.ipynb notebook.

In [1]:
# Run this cell to install only the requirements of this notebook

# ===========================

!pip install tensorflow==1.14.0
!pip install gensim==3.6.0
!pip install numpy==1.19.5

# ===========================


In [2]:
# To install the requirements for the entire chapter, uncomment the lines below and run this cell

# ===========================

# try :
#     import google.colab
#     !curl https://raw.githubusercontent.com/practical-nlp/practical-nlp/master/Ch3/ch3-requirements.txt | xargs -n 1 -L 1 pip install
# except ModuleNotFoundError :
#     !pip install -r "ch3-requirements.txt"

# ===========================

In [3]:
#making the required imports
import warnings #ignoring the generated warnings
warnings.filterwarnings('ignore')

import tensorflow as tf
from tensorflow.contrib.tensorboard.plugins import projector
tf.logging.set_verbosity(tf.logging.ERROR)

import numpy as np
from gensim.models import KeyedVectors
import os

In [4]:
#Loading the model
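cwd = os.getcwd()
#os.path.join builds the path portably and avoids backslash escapes in the string literal
model = KeyedVectors.load_word2vec_format(os.path.join(cwd, 'Models', 'word2vec_cbow.bin'), binary=True)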

In [5]:
#number of words to visualize: the model's vocabulary size minus one (the last word is left out)
max_size = len(model.wv.vocab)-1

In [6]:
#make a numpy array of 0s with one row per word (max_size) and one column per embedding dimension
w2v = np.zeros((max_size,model.wv.vector_size))

In [7]:
#Now we create a new file called metadata.tsv where we save all the words in our model 
#we also store the embedding of each word in the w2v matrix
if not os.path.exists('projections'):
    os.makedirs('projections')
    
with open("projections/metadata.tsv", 'w', encoding="utf-8") as file_metadata:
    
    for i, word in enumerate(model.wv.index2word[:max_size]):
        
        #store the embeddings of the word
        w2v[i] = model.wv[word]
        
        #write the word to a file 
        file_metadata.write(word + '\n')
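The projector pairs line i of metadata.tsv with row i of the embedding matrix, so the two have to stay in step. A minimal sketch of a sanity check follows. It uses a toy vocabulary and a temporary directory instead of the real model, and the helper names `write_metadata` and `count_metadata_lines` are hypothetical, not part of gensim or TensorFlow.

```python
import os
import tempfile


def write_metadata(words, path):
    """Write one label per line; line i labels row i of the embedding matrix."""
    with open(path, "w", encoding="utf-8") as f:
        for word in words:
            f.write(word + "\n")


def count_metadata_lines(path):
    """Count the labels stored in a single-column metadata file."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)


# Toy stand-ins for model.wv.index2word[:max_size] and the w2v matrix.
toy_words = ["human", "machine", "naïve"]
toy_matrix = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

tmp_dir = tempfile.mkdtemp()
metadata_path = os.path.join(tmp_dir, "metadata.tsv")
write_metadata(toy_words, metadata_path)

n_labels = count_metadata_lines(metadata_path)
rows_match = n_labels == len(toy_matrix)
print(n_labels, rows_match)
```

With the real data, the same comparison would be between the line count of projections/metadata.tsv and `w2v.shape[0]`.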

In [8]:
#initializing tf session
sess = tf.InteractiveSession()

In [9]:
#Initialize the tensorflow variable called embedding that holds the word embeddings:
with tf.device("/cpu:0"):
    embedding = tf.Variable(w2v, trainable=False, name='embedding')

In [10]:
#Initialize all variables
tf.global_variables_initializer().run()

In [11]:
#create a Saver object, which saves and restores variables to and from checkpoints
saver = tf.train.Saver()

In [12]:
#FileWriter writes summaries and events to an event file in the projections directory
writer = tf.summary.FileWriter('projections', sess.graph)

In [13]:
#create the projector config and add an embedding entry to it
config = projector.ProjectorConfig()
embed = config.embeddings.add()

In [14]:
#specify our tensor_name as embedding and metadata_path to the metadata.tsv file
#(a relative metadata_path is resolved against the log directory, here projections)
embed.tensor_name = 'embedding'
embed.metadata_path = 'metadata.tsv'

In [15]:
#write the projector config to the log directory, then save the embedding variable as a checkpoint
projector.visualize_embeddings(writer, config)

saver.save(sess, 'projections/model.ckpt', global_step=max_size)

'projections/model.ckpt-161017'
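Before launching TensorBoard it can help to confirm that the log directory holds everything the projector reads. The sketch below assumes the usual TensorFlow 1.x output names: `projector_config.pbtxt` written by `visualize_embeddings`, and a `checkpoint` file plus `model.ckpt-*.index` written by the saver. The helper name `missing_projector_files` is hypothetical.

```python
import glob
import os


def missing_projector_files(log_dir):
    """Return the names of expected projector files that are absent from log_dir."""
    missing = []
    # Files with fixed names: labels, projector config, checkpoint pointer.
    for name in ("metadata.tsv", "projector_config.pbtxt", "checkpoint"):
        if not os.path.isfile(os.path.join(log_dir, name)):
            missing.append(name)
    # The checkpoint index file carries the global step in its name.
    if not glob.glob(os.path.join(log_dir, "model.ckpt-*.index")):
        missing.append("model.ckpt-*.index")
    return missing


# An empty list means all expected files were found.
print(missing_projector_files("projections"))
```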

Open a terminal window and type the following command:

tensorboard --logdir=projections --port=8000

If TensorBoard does not work for you, try providing the absolute path to the projections directory and re-run the above command.
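One way to get that absolute path is to build the command from Python, run from the same working directory as the notebook. This is only a sketch; the variable names `log_dir` and `command` are illustrative.

```python
import os

# Resolve the log directory against the current working directory.
log_dir = os.path.abspath("projections")

# Quote the path so that spaces in directory names do not split the argument.
command = 'tensorboard --logdir="{}" --port=8000'.format(log_dir)
print(command)
```

Copy the printed line into the terminal in place of the relative-path command.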

If you've done everything right up to this point, you will get a link in your terminal through which you can access TensorBoard. Click the link or copy and paste it into your browser. You should see something similar to this:
![TensorBoard-1](Images/TensorBoard-1.png)
<br>
In the top right corner, click the dropdown arrow near "INACTIVE" and select PROJECTOR from the dropdown menu.
![TensorBoard-2](Images/TensorBoard-2.png)
<br>
Wait a few seconds for it to load. You can now see your embeddings. There are a lot of settings you can play around and experiment with.
![TensorBoard-3](Images/TensorBoard-3.png)
<br>
Output when we search for a specific word, in this case "human", and isolate only those points:
![TensorBoard-4](Images/TensorBoard-4.png)