# Integrating Glove and ConceptNet

This notebook is meant to demonstrate the features of Glove Embeddings to see how they can potentially be used in conjunction with the COCO dataset to numerically analyze explanations and such. The link to download the embeddings is [here](https://nlp.stanford.edu/projects/glove), and I downloaded the **6b** one with Wikipedia and all.

[Link to Glove Paper](https://nlp.stanford.edu/pubs/glove.pdf)

## Imports that may be Necessary

```python
from collections import defaultdict
import numpy as np
import gensim
from gensim.models.keyedvectors import KeyedVectors
from sklearn.decomposition import TruncatedSVD
import matplotlib.pyplot as plt
%matplotlib inline
```
*Note: You may need to conda install gensim*

In [1]:
#Imports Cell
from collections import defaultdict
import numpy as np
import gensim
from gensim.models.keyedvectors import KeyedVectors
from sklearn.decomposition import TruncatedSVD
import matplotlib.pyplot as plt
%matplotlib inline

In [2]:
# Put the path to glove here
# TODO: we may want to put a better pointer here.  
file_name = "glove.6B.50d.txt.w2v"
path = f"../glove/{file_name}"

#Now load the model into the variable "glove" (may take some time)
glove = KeyedVectors.load_word2vec_format(path, binary=False)

## How to Use Glove
```python
glove["word"] # Will give glove embedding vector for the word

"word" in glove #Checks if word is in glove (acts like a dictionary

glove["husband"] - glove["man"] + glove["woman"] #Should give representation that is wife

#To find most similar term to a vector:
    
glove.similar_by_vector(query)

#More advanced way to do this

glove.most_similar_cosmul(positive=['husband', 'woman'], negative=['man'])

#Since they are vectors, we can find the distance using dot products
```


In [14]:
#Define our domains based on the labels for searching the images
#Suppose "car" and "train" domain
# Iteratively find the most similar (ex. depth of 2 = for each main label, also add all their friend labels)
concept = "stoplight"
glove.similar_by_vector(glove[concept])

[('stoplight', 1.0000001192092896),
 ('flyspeck', 0.6498768329620361),
 ('one-horse', 0.6459125876426697),
 ('18-wheeler', 0.6452605128288269),
 ('sea-side', 0.615469217300415),
 ('nanopores', 0.613706111907959),
 ('tepee', 0.6063854694366455),
 ('12-block', 0.6035683751106262),
 ('abobo', 0.5967516303062439),
 ('hurriyah', 0.5955226421356201)]

In [15]:
# Compare to the reasonableness monitor 
import argparse
import requests
import sys
import logging

import nltk
from sympy import *  # python symbolic package
from sympy.logic import SOPform
import itertools
from time import process_time

# TODO we may want to do some data science stuff 
import pandas as pd
import numpy as np
import csv

#import commonsense.conceptnet as kb
import synthesizer.synthesize as synthesize

import monitor.reasonableness_monitor as monitor
anchors = ['animal', 'object', 'place', 'plant']
relations = ['AtLocation', 'LocatedNear'] # technically these are vehicle relations for now

In [16]:
#TODO: Simple parser entity extraction for the captions and their labels

In [17]:
stoplightExplain = monitor.snapshot_monitor([concept], anchors, relations, True)

REASONS ARE [['stoplight IsA object', 'Default anchor point']]
REASONS ARE [['stoplight IsA object', 'Default anchor point']]
[['stoplight IsA object', 'Default anchor point']]
Are we here in reasonable ['stoplight']


# General Ideas for Symbolic Reasoning

Suppose we had a bunch of **labels** from each of the outputs for the subsystems (the sem. seg & the two captions), we try to figure out how close all the different labels are across different systems.

We figure out a threshold and if 2-3 of them are super close in their vectors, and another one is not, we suspect that one, generally, but here are some other general ideas:

- Since each subsystem will present its own set of labels, (all the labels must be relatively close to each other) if **any of them seem abnormaly far away from the others** (maybe we can scramble to see this), then we say that it is not reasonable
- We do the same thing across multiple ones as well, and see the distances (maybe min distances) and try to figure out at a high level who is not reasonable
- We combine these local, and high-level checks with symbolic checks to determine overall reasonability

# Testing Demonstration

Let's take 3 different images from the CoCo Dataset where:

- Two of the images will be similar, and 1 image will be different

We will see if we can use the **semantic segmentation tags** + some basic distance calculations to see which images should be closest to each other:

Links to the three images used:

- 
- 
- 

# Math for Calculating Distances Between Captions

Suppose we have **sem. seg** labels that identify **O** objects, assuming we are using **N** dimensional vectors, we have a (**O**x**N**) array representing our values. 

Now suppose we have **caption labels** that identify **K** objects, assuming we are still using **N** dimensional vectors, we have a (**K**x**N**) array representing our values. 

## Finding Distances

When we are trying to find the distances, we need to use **multiplication** to make this easy. However, we need to remember that order matters if we were to just blindly do it:

```python
[[man], [woman], [cat]] * [[man], [woman], [cat]] = 0 #As distances between each element and itself would be zero

#However
[[cat], [woman], [man]] * [[man], [woman], [cat]] !=0 #As order is different so what will get multiplied is different
```

Therefore we will specifically use **matrix multiplication**. Therefore, the steps to find distances based on this are:


The python code to do so is as follows (uses `numpy`):

```python
O = 7
N = 2
K = 6

#Create two matrixes based on dimensions
x = np.arange(14).reshape((O,N))
y = np.arange(12).reshape((K,N))

# Distances are x^2 + y^2 - 2*x*y
dists = np.sum(x**2, axis=1)[:,np.newaxis] + np.sum(y**2, axis=1) 
dists -= 2*np.matmul(x,y.T)

distances = np.sqrt(dists)

#Now we find the minimum of this to represent as our distance
#For the axis, select whichever axis is smaller (will either be zero or 1)


np.min(distances,axis = np.argmin(distances.shape))

#By doing the above, we cover the one with more objects, so that there is a greater chance of greater distances (vs not properly accounting for an object's distance to others. Though, it shouldn't really matter either way.

```

In [5]:
arr_a = np.array([glove["man"],glove["woman"],glove["cat"]])
arr_b = np.array([glove["man"],glove["woman"],glove["cat"],glove["dog"]])

a = np.sum(arr_a**2,axis = 1)[:,np.newaxis]
b = np.sum(arr_b**2,axis = 1)

dists = a + b - 2*np.matmul(arr_a,arr_b.T)
dists[dists < 1e-6] = float(0.0)
dists = np.sqrt(dists)
type(np.min(dists,axis = np.argmin(dists.shape)))

numpy.ndarray

In [6]:
# Function based on all the computations above
def calcuate_distances(label_set_a:list, 
                       label_set_b:list) -> np.ndarray:
    """
    This function takes in two sets of glove embeddings vectors and returns the min distances between the two
    
    Parameters
    -------------    
    label_set_a : list 
            the first set of glove embedding vectors from one input source
    label_set_b : list
            the second set of glove embedding vectors from the second source
    
    Returns
    ---------
    numpy.ndarray
        The list of distances, where length = max(len(label_set_a),len(label_set_b))
    """
    
    #Turn both into numpy arrays
    arr_a = np.array(label_set_a)
    arr_b = np.array(label_set_b)
    
    #Square and transform as needed
    a = np.sum(arr_a**2,axis = 1)[:,np.newaxis]
    b = np.sum(arr_b**2,axis = 1)
    
    #Calculate the distances and take the square root
    #We are also cutting off where values too small
    dists = a + b - 2*np.matmul(arr_a,arr_b.T)
    dists[dists < 1e-6] = float(0.0)
    dists = np.sqrt(dists)
    
    #Return the minimum values across the axis with more glove embeddings
    return np.min(dists,axis = np.argmin(dists.shape))

In [7]:
# Function based on all the computations above
def calcuate_distance(label_set_a:list, 
                       label_set_b:list) -> np.ndarray:
    """
    This function takes in two sets of glove embeddings vectors and returns a single value representing the distance between the two values
    
    Parameters
    -------------    
    label_set_a : list 
            the first set of glove embedding vectors from one input source
    label_set_b : list
            the second set of glove embedding vectors from the second source
    
    Returns
    ---------
    float32
        A single value representing the distance between label_set_a and label_set_b
    """
    
    #Turn both into numpy arrays
    arr_a = np.array(label_set_a)
    arr_b = np.array(label_set_b)
    
    #Square and transform as needed
    a = np.sum(arr_a**2,axis = 1)[:,np.newaxis]
    b = np.sum(arr_b**2,axis = 1)
    
    #Calculate the distances and take the square root
    #We are also cutting off where values too small
    dists = a + b - 2*np.matmul(arr_a,arr_b.T)
    dists[dists < 1e-6] = float(0.0)
    dists = np.sqrt(dists)
    
    #Return the minimum values across the axis with more glove embeddings
    return np.sum(np.min(dists,axis = np.argmin(dists.shape)))

## Now let's use our above function with labels of 2 similar images and 1 different

For this test, we will simply **only use the sem.seg labels to see how well we can tell distances with that** (idea is that captions will use similar idea):
- [Similar Image A](https://cocodataset.org/#explore?id=5253) and [Similar Image B](https://cocodataset.org/#explore?id=277614) selected on coco site by clicking *stop sign* (stop) and *traffic light* (stoplight)
- [Different Image A](https://cocodataset.org/#explore?id=360877) selected on coco site by clicking *apple* and *chair*

Let's see the 3 way comparison test for distances:

In [8]:
#Code to manually look through available words
# asd = list(glove.vocab.keys())
# asd.sort()
#print(asd[360000:370000])
"traffic-sign" in glove

False

In [9]:
#List of representaion based on sem. seg labels
similar_image_a = [glove["stop"],glove["stoplight"]]
similar_image_b = [glove["stop"],glove["stoplight"],glove["train"],glove["clock"]]
different_image_a = [glove["person"],glove["bottle"],glove["banana"],glove["apple"],glove["chair"]]

In [10]:
print("Distances between similar_image_a, similar_image_b: ")
print(calcuate_distances(similar_image_a,similar_image_b))

print("\n")
print("Distances between similar_image_a, different_image_a: ")
print(calcuate_distances(similar_image_a,different_image_a))

print("\n")
print("Distances between similar_image_b, different_image_a: ")
print(calcuate_distances(similar_image_b,different_image_a))

print("\n")
print("Printing the sums of each:")
print(calcuate_distance(similar_image_a,similar_image_b))
print(calcuate_distance(similar_image_a,different_image_a))
print(calcuate_distance(similar_image_b,different_image_a))

Distances between similar_image_a, similar_image_b: 
[0.        0.        3.8940194 4.6098065]


Distances between similar_image_a, different_image_a: 
[4.951073  5.6056447 5.913838  6.0824575 5.8651605]


Distances between similar_image_b, different_image_a: 
[4.951073  5.6056447 5.913838  5.834715  4.8754997]


Printing the sums of each:
8.503826
28.418173
27.18077


#### Now we will try the same process as above but with scrambling some of the ordering (not making it nice and uniform across the diff. inputs)

In [11]:
#List of representaion based on sem. seg labels but with different ordering
similar_image_a = [glove["stoplight"], glove["stop"]]
similar_image_b = [glove["train"], glove["stop"], glove["stoplight"], glove["clock"]]
different_image_a = [glove["bottle"],glove["banana"],glove["person"],glove["chair"], glove["apple"]]

In [12]:
print("Distances between similar_image_a, similar_image_b: ")
print(calcuate_distances(similar_image_a,similar_image_b))

print("\n")
print("Distances between similar_image_a, different_image_a: ")
print(calcuate_distances(similar_image_a,different_image_a))

print("\n")
print("Distances between similar_image_b, different_image_a: ")
print(calcuate_distances(similar_image_b,different_image_a))

print("\n")
print("Printing the sums of each:")
print(calcuate_distance(similar_image_a,similar_image_b))
print(calcuate_distance(similar_image_a,different_image_a))
print(calcuate_distance(similar_image_b,different_image_a))

Distances between similar_image_a, similar_image_b: 
[3.8940194 0.        0.        4.6098065]


Distances between similar_image_a, different_image_a: 
[5.6056447 5.913838  4.951073  5.8651605 6.0824575]


Distances between similar_image_b, different_image_a: 
[5.6056447 5.913838  4.951073  4.8754997 5.834715 ]


Printing the sums of each:
8.503826
28.418175
27.18077


## Observations

As you can see from the cell above, the two *similar images* were close to each other, **and had similar distances to the different image** which can be really useful for **outlier analysis**. 

Another great thing that you can see is that *independent of the order of the different label input* **the outputs for the overall distances were identical** (and the vectors just in a different ordering, same values)

- We can also try this with higher dimensional (this was just with 50D vectors) so higher dimensional might lead to even tighter distances (closer things are closer, farther things are farther)
- We can also try this process and calculate distances **within 1 single image label set to find outlier labels** (test with scrambling labels)

# Labels within a Singular Object Distance

Now let's see what happens when we scramble labels within a single image and see if there is any possible way to identify that:

In [13]:
from scipy import spatial

#Labels from similar image b, with a refridgerator
test_labels = [glove["train"], glove["stop"], glove["stoplight"], glove["clock"],glove["refrigerator"]]

#Now let's see which one is the furthest from the others
empty_list =np.array([[0 for index in range(len(test_labels))] for index in range(len(test_labels))]).reshape(len(test_labels),-1)

for index in range(len(test_labels)):
    for sec_ind in range(len(test_labels)):
        print(test_labels[index])
        print(test_labels[sec_ind])
        print(1 - spatial.distance.cosine(test_labels[index], test_labels[sec_ind]))

print(empty_list)

[ 0.94971    0.34328    0.84504   -0.88519   -0.72078   -0.29309
 -0.74678    0.65122    0.47295   -0.74011    0.1877    -0.38279
 -0.55899    0.42952   -0.26984   -0.42383   -0.31236    1.3423
 -0.78567   -0.6302     0.91819    0.21126   -0.57442    1.4549
  0.75456   -1.6165    -0.0085015  0.0029134  0.51304   -0.47447
  2.5306     0.85944   -0.30667    0.057765   0.66231    0.20804
  0.64237   -0.5246    -0.053416   1.1404    -0.13703   -0.18361
  0.45459   -0.50963   -0.025539  -0.02861    0.18048   -0.4483
  0.40525   -0.36821  ]
[ 0.94971    0.34328    0.84504   -0.88519   -0.72078   -0.29309
 -0.74678    0.65122    0.47295   -0.74011    0.1877    -0.38279
 -0.55899    0.42952   -0.26984   -0.42383   -0.31236    1.3423
 -0.78567   -0.6302     0.91819    0.21126   -0.57442    1.4549
  0.75456   -1.6165    -0.0085015  0.0029134  0.51304   -0.47447
  2.5306     0.85944   -0.30667    0.057765   0.66231    0.20804
  0.64237   -0.5246    -0.053416   1.1404    -0.13703   -0.18361
  0.45

 -3.2283e-01 -2.4891e-01]
0.47796374559402466
[ 0.044751  0.22425   0.74523  -0.3927    0.73664  -0.086284 -0.010413
  0.41628   0.5249   -0.16484   0.29205  -0.095553  0.61566   1.602
  0.4641    1.2693   -1.1416    0.88237   0.47702  -0.78944   1.5937
 -0.34953   0.42561   0.56537  -0.32054  -0.5776   -0.39622   1.129
  1.8284   -0.10004   0.82675   0.46756  -0.23037   0.80698   0.52753
  1.0272    0.21814   1.1406    1.4022    0.30931   0.57774   0.2999
 -0.21302  -0.046468  0.076242  0.91611   0.47489  -0.55653  -0.45134
  0.049276]
[ 0.044751  0.22425   0.74523  -0.3927    0.73664  -0.086284 -0.010413
  0.41628   0.5249   -0.16484   0.29205  -0.095553  0.61566   1.602
  0.4641    1.2693   -1.1416    0.88237   0.47702  -0.78944   1.5937
 -0.34953   0.42561   0.56537  -0.32054  -0.5776   -0.39622   1.129
  1.8284   -0.10004   0.82675   0.46756  -0.23037   0.80698   0.52753
  1.0272    0.21814   1.1406    1.4022    0.30931   0.57774   0.2999
 -0.21302  -0.046468  0.076242  0.91611   

# Domain Generation

The cells below are specifically related to generating the terms related to the specific domain, which we can generate with a list of words:

```python
def gen_domain(list_words:list, depth:int) -> list:
```

In [14]:
def gen_domain(list_domain:list, depth:int) -> set:
    """
    This function takes in a list of strings, which represents the domain, and generates the relavent list of glove embeddings that represents this domain. 
    
    This does it through a recursive methodology
    
    
    Parameters
    -------------
    list_domain: list
    List of string terms that represent the domain
    
    depth: int
    How many layers should be used to generate the domain
    
    Returns
    ---------
    Set of strings of the terms that we should get the glove embeddings for
    """
    

    list_embeddings = set() #The final set representing the domain embeddings
    
    if depth == 0: #If we have gotten back to depth 0 it means we have added all the words to the depth we want
        return list_embeddings
    
    
    #We will use a BFS type function to generate our domain
    queue = set(list_domain)
    
    for word in queue: #For each domain word
        
        #Add the first thing in the queue to our list
        list_embeddings.add(word)
        
        #Get similar terms
        list_terms = set(word for word, similarity in glove.similar_by_vector(glove[word]))
        #Add its most similar words using a recursive call
        list_embeddings.update(gen_domain(list_terms, depth - 1))
    
    return list_embeddings    

Now let's make a more advanced version of the above, but where we try to make sure that the terms are similar to existing values in the domain.

The primary step in doing this is to define a **threshold** value that we only add values above that:
- Through basic visual analysis, both *0.7* and *0.75* can be tried for threshold values

In [15]:
def gen_thresh_domain(list_domain:list, depth:int, threshold=0.7) -> set:
    """
    This function takes in a list of strings, which represents the domain, and generates the relavent list of glove embeddings that represents this domain. 
    
    This does it through a recursive methodology and uses a threshold value to make sure that only relavent values are pulled up.
    
    
    Parameters
    -------------
    list_domain: list
    List of string terms that represent the domain
    
    depth: int
    How many layers should be used to generate the domain
    
    threshold: float
    The similarity threshold that we will only add values above, by default = 0.7
    
    Returns
    ---------
    Set of strings of the terms that we should get the glove embeddings for
    """
    

    list_embeddings = set() #The final set representing the domain embeddings
    
    if depth == 0: #If we have gotten back to depth 0 it means we have added all the words to the depth we want
        return list_embeddings
    
    
    #We will use a BFS type function to generate our domain
    queue = set(list_domain)
 
    for word in queue: #For each domain word
        
        #Add the first thing in the queue to our list
        list_embeddings.add(word)
        
        #Get similar terms
        list_terms = set(word for word, similarity in glove.similar_by_vector(glove[word]) if similarity >= threshold)
        
        #Add its most similar words using a recursive call
        list_embeddings.update(gen_thresh_domain(list_terms, depth - 1))
    
    return list_embeddings    

In [16]:
def convert_to_embeddings(domain:set) -> list:
    """
    Converts the domain of terms to a list of related embeddings
    
    Parameters
    -------------
    domain: set
    The set of terms that define the domain, each term is a string
    
    Returns
    ---------
    list of glove embeddings
    """
    return [glove[word] for word in domain]

In [17]:
b = gen_thresh_domain(["stop","stoplight"], 2)
len(b)

#print(set.intersection(vals, vally))


11

Let's do a test with this **huge** domain and see how close the vectors we had above are (the three vectors in the section above)

We need to make sure that we **convert our set of terms to list of glove embeddings**

In [18]:
depth = 4
bb = convert_to_embeddings(gen_thresh_domain(["stop","stoplight"], depth))
print("doneA")
cc = convert_to_embeddings(gen_thresh_domain(["apple","banana"], depth))
# print("Distances between similar_image_a, domain: ")
# print(calcuate_distances(similar_image_a,bb))

# print("\n")
# print("Distances between different_image_a, domain: ")
# print(calcuate_distances(different_image_a,bb))

# print("\n")
# print("Distances between similar_image_b, domain: ")
# print(calcuate_distances(similar_image_b,bb))

# print("\n")
# print("Printing the sums of each:")
# print("\n")
print("Dist Similar A to Domain: with Depth {}".format(depth))
a = calcuate_distance(similar_image_a,bb)
print(a)
print("\n")
print("Dist Similar B to Domain: with Depth {}".format(depth))
b = calcuate_distance(similar_image_b,bb)
print(b)
print("\n")
print("Dist Different A to Domain: with Depth {}".format(depth))
c = calcuate_distance(different_image_a,bb)
print(c)

print("_____________________________")
print("Sim A, Sim B  = {}".format(abs(b-a)))
print("Sim A, Diff A= {}".format(abs(c-a)))
print("Sim B, Diff A  = {}".format(abs(c-b)))

doneA
Dist Similar A to Domain: with Depth 4
1106.8966


Dist Similar B to Domain: with Depth 4
1075.3406


Dist Different A to Domain: with Depth 4
1343.2302
_____________________________
Sim A, Sim B  = 31.5560302734375
Sim A, Diff A= 236.3336181640625
Sim B, Diff A  = 267.8896484375


# Generating Vectors from Captions

To generate vectors from captions, there are several different methods, but I think the most basic way to do it for now is to make use of **stop-words**. 

## Stop Words

These are words in the english language like "a" "an" "the" and other terms we may consider to be *filler terms*. The idea is that by stripping these words, we can create more comprehensive and representative vectors. The `list_of_stop_words` below is from [this link](https://gist.github.com/sebleier/554280)

In [19]:
list_of_stop_words = ["i", "me", "my", "myself", "we", "our", "ours", "ourselves", "you", "your", "yours", "yourself", "yourselves", "he", "him", "his", "himself", "she", "her", "hers", "herself", "it", "its", "itself", "they", "them", "their", "theirs", "themselves", "what", "which", "who", "whom", "this", "that", "these", "those", "am", "is", "are", "was", "were", "be", "been", "being", "have", "has", "had", "having", "do", "does", "did", "doing", "a", "an", "the", "and", "but", "if", "or", "because", "as", "until", "while", "of", "at", "by", "for", "with", "about", "against", "between", "into", "through", "during", "before", "after", "above", "below", "to", "from", "up", "down", "in", "out", "on", "off", "over", "under", "again", "further", "then", "once", "here", "there", "when", "where", "why", "how", "all", "any", "both", "each", "few", "more", "most", "other", "some", "such", "no", "nor", "not", "only", "own", "same", "so", "than", "too", "very", "s", "t", "can", "will", "just", "don", "should", "now"]

In [64]:
import string
def gen_cap_terms(caption:str) -> list:
    """
    This function takes in a caption as a string and will return a list of the most important words that are also found in Glove
    
    Parameters
    -------------
    caption : str
        The caption in a string form
    
    Returns
    ---------
    set[str]
        Returns a set of important terms recognized by Glove as a String
    """
    caption = "".join(token for token in caption if token not in string.punctuation)

    caption_list = caption.split() #Tokenizes the words
    
    ret_set = set()  #set of terms we will return
    
    for word in caption_list:
        if word not in list_of_stop_words and word in glove:
            ret_set.add(word)
    
    return ret_set

In [68]:
#All test are from one image Sim Image A
test = "a pole with both street lights and a stop sign."
test2 = "a stop sign, road sign and traffic light sitting near buildings."
#Trying with stuff from diff image (Diff Image A)
diff1 = "a bunch of bananas are hanging."
diff2 = "a store with lots of unripe bananas and other products."
print(gen_cap_terms(diff1))


{'bunch', 'hanging', 'bananas'}


In [49]:
"car." in glove

False

Now that we have a way to get caption related info, let's run this with captions and see what distances between **same caption and image** are and **different captions and image** as well.

In [23]:
cap_a = convert_to_embeddings(gen_cap_terms(test))
a = calcuate_distance(similar_image_a,cap_a)
print("Dist Similar A to Caption A = {}".format(a))

print("\n")

cap_b = convert_to_embeddings(gen_cap_terms(test2))
b = calcuate_distance(similar_image_a,cap_b)
print("Dist Similar A to Caption B = {}".format(b))

print("\n")

cap_c = convert_to_embeddings(gen_cap_terms(diff1))
c = calcuate_distance(similar_image_a,cap_c)
print("Dist Similar A to Diff Cap A = {}".format(c))

print("\n")

cap_d = convert_to_embeddings(gen_cap_terms(diff1))
d = calcuate_distance(different_image_a,cap_d)
print("Dist Diff Image A to Diff Cap A = {}".format(d))

print("\n")

cap_e= convert_to_embeddings(gen_cap_terms(diff2))
e = calcuate_distance(different_image_a,cap_e)
print("Dist Diff Image A to Diff Cap B = {}".format(c))

print("\n")

cap_f = convert_to_embeddings(gen_cap_terms(test))
f = calcuate_distance(different_image_a,cap_f)
print("Dist Diff Image A to Sim Cap A = {}".format(f))

print("\n")

Dist Similar A to Caption A = 19.996829986572266


Dist Similar A to Caption B = 27.7014102935791


Dist Similar A to Diff Cap A = 15.85655403137207


Dist Diff Image A to Diff Cap A = 22.23456382751465


Dist Diff Image A to Diff Cap B = 15.85655403137207


Dist Diff Image A to Sim Cap A = 26.68285369873047




In [65]:
#Here we will do distances to domai instead, since this is already the baseline
print(gen_cap_terms(test2))
print(gen_cap_terms(test))
print(gen_cap_terms(diff1))
print(gen_cap_terms(diff2))



cap_a = convert_to_embeddings(gen_cap_terms(test))
cap_b = convert_to_embeddings(gen_cap_terms(test2))
cap_d = convert_to_embeddings(gen_cap_terms(diff1))
cap_e= convert_to_embeddings(gen_cap_terms(diff2))

a = calcuate_distance(bb,cap_a)
print(a)
a = calcuate_distance(bb,cap_b)
print(a)
a = calcuate_distance(bb,cap_d)
print(a)
a = calcuate_distance(bb,cap_e)
print(a)

print("--------------------")
setty = convert_to_embeddings(set(["sign","light","road","traffic"]))
a = calcuate_distance(bb,setty)
print(a)
setty = convert_to_embeddings(set(["lights","pole","sign","street"]))
a = calcuate_distance(bb,setty)
print(a)
setty = convert_to_embeddings(set(["bananas","food","bunch"]))
a = calcuate_distance(bb,setty)
print(a)
setty = convert_to_embeddings(set(["bananas","store","unripe"]))
a = calcuate_distance(bb,setty)
print(a)

{'near', 'stop', 'buildings', 'traffic', 'road', 'light', 'sign', 'sitting'}
{'stop', 'lights', 'pole', 'street', 'sign'}
{'bunch', 'car', 'hanging', 'bananas'}
{'bananas', 'unripe', 'lots', 'products', 'store'}
1092.3867
1063.3086
1328.0569
1351.4886
--------------------
1134.8907
1179.7527
1469.3591
1615.6538


# Verbs
We will try the same thing with stripping all the verbs from the domain and see if we get better results for the two domains we specified. Verbs from [link](https://github.com/datmt/English-Verbs)

First we have to read in the verbs from the text file and keep them as a list or set

In [25]:
set_verbs = set() #Set of Terms

f = open("verbsList.txt", "r")
for x in f:
  set_verbs.add(x[:-1])
f.close()

In [26]:
sorted(set_verbs)

['abandon',
 'abase',
 'abate',
 'abbreviate',
 'abdicate',
 'abduct',
 'abet',
 'abhor',
 'abide',
 'abjure',
 'abnegate',
 'abolish',
 'abominate',
 'abort',
 'abound',
 'abrade',
 'abridge',
 'abrogate',
 'abscond',
 'abseil',
 'absent',
 'absolve',
 'absorb',
 'abstain',
 'abstract',
 'abuse',
 'abut',
 'accede',
 'accelerate',
 'accent',
 'accentuate',
 'accept',
 'access',
 'accessorise',
 'accessorize',
 'acclaim',
 'acclimate',
 'acclimatise',
 'acclimatize',
 'accommodate',
 'accompany',
 'accomplish',
 'accord',
 'accost',
 'account',
 'accouter',
 'accoutre',
 'accredit',
 'accrue',
 'acculturate',
 'accumulate',
 'accuse',
 'accustom',
 'ace',
 'ache',
 'achieve',
 'acidify',
 'acknowledge',
 'acquaint',
 'acquiesce',
 'acquire',
 'acquit',
 'act',
 'action',
 'activate',
 'actualise',
 'actualize',
 'actuate',
 'adapt',
 'add',
 'addle',
 'address',
 'adduce',
 'adhere',
 'adjoin',
 'adjourn',
 'adjudge',
 'adjudicate',
 'adjure',
 'adjust',
 'administer',
 'admire',
 'adm

In [27]:
depth = 3
set(word for word in gen_thresh_domain(["stop","stoplight"], depth) if word not in set_verbs)

{'any',
 'attempting',
 'before',
 'blocking',
 'calling',
 'calls',
 'causing',
 'could',
 'declaring',
 'did',
 'forcing',
 'him',
 'instead',
 'intended',
 'kept',
 'making',
 'meant',
 'moves',
 'passing',
 'prevented',
 'preventing',
 'rather',
 'referring',
 'saying',
 'started',
 'stoplight',
 'stopped',
 'stopping',
 'stops',
 'them',
 'they',
 'threatened',
 'threatening',
 'to',
 'trains',
 'tried',
 'trying',
 'urging',
 'wanted',
 'way',
 'went',
 'when',
 'without',
 'would'}

Now that we have our verbs (from the `verbsList.text` placed in the **same level** as this file), we will remove **all verbs** from the domain and the captions and see if we get better performance

In [28]:
depth = 4
aa = convert_to_embeddings(set(word for word in gen_thresh_domain(["stop","stoplight"], depth) if word not in set_verbs))
print("doneA")
dd = convert_to_embeddings(set(word for word in gen_thresh_domain(["apple","banana"], depth) if word not in set_verbs))
# print("Distances between similar_image_a, domain: ")
# print(calcuate_distances(similar_image_a,bb))

# print("\n")
# print("Distances between different_image_a, domain: ")
# print(calcuate_distances(different_image_a,bb))

# print("\n")
# print("Distances between similar_image_b, domain: ")
# print(calcuate_distances(similar_image_b,bb))

# print("\n")
# print("Printing the sums of each:")
# print("\n")
print("Dist Similar A to Domain: with Depth {}".format(depth))
a = calcuate_distance(similar_image_a,aa)
print(a)
print("\n")
print("Dist Similar B to Domain: with Depth {}".format(depth))
b = calcuate_distance(similar_image_b,aa)
print(b)
print("\n")
print("Dist Different A to Domain: with Depth {}".format(depth))
c = calcuate_distance(different_image_a,aa)
print(c)

print("_____________________________")
print("Sim A, Sim B  = {}".format(abs(b-a)))
print("Sim A, Diff A= {}".format(abs(c-a)))
print("Sim B, Diff A  = {}".format(abs(c-b)))

doneA
Dist Similar A to Domain: with Depth 4
776.0975


Dist Similar B to Domain: with Depth 4
755.53284


Dist Different A to Domain: with Depth 4
926.8762
_____________________________
Sim A, Sim B  = 20.56463623046875
Sim A, Diff A= 150.77874755859375
Sim B, Diff A  = 171.3433837890625


In [69]:
#Here we will do distances to domai instead, since this is already the baseline
print(gen_cap_terms(test2))
print(gen_cap_terms(test))
print(gen_cap_terms(diff1))
print(gen_cap_terms(diff2))




cap_a = convert_to_embeddings(gen_cap_terms(test))
cap_b = convert_to_embeddings(gen_cap_terms(test2))
cap_d = convert_to_embeddings(gen_cap_terms(diff1))
cap_e= convert_to_embeddings(gen_cap_terms(diff2))

a = calcuate_distance(dd,cap_a)
print(a)
a = calcuate_distance(dd,cap_b)
print(a)
a = calcuate_distance(dd,cap_d)
print(a)
a = calcuate_distance(dd,cap_e)
print(a)

print("--------------------")
setty = convert_to_embeddings(set(["sign","light","road","traffic"]))
a = calcuate_distance(dd,setty)
print(a)
setty = convert_to_embeddings(set(["lights","pole","sign","street"]))
a = calcuate_distance(dd,setty)
print(a)
setty = convert_to_embeddings(set(["bananas","food","bunch"]))
a = calcuate_distance(dd,setty)
print(a)
setty = convert_to_embeddings(set(["bananas","store","unripe"]))
a = calcuate_distance(dd,setty)
print(a)

{'near', 'stop', 'buildings', 'traffic', 'road', 'light', 'sign', 'sitting'}
{'stop', 'lights', 'pole', 'street', 'sign'}
{'bunch', 'hanging', 'bananas'}
{'bananas', 'unripe', 'lots', 'products', 'store'}
2264.1357
2203.3262
1913.2384
1774.3337
--------------------
2232.8982
2280.894
1888.7606
1835.6624


# General Observations on Verbs

When we don't get rid of verbs, the general **distances are larger** and the **distances between different captions of different images are larger** but sometimes those distances can be sporadically different even if they are from the *same image*.


When we do get rid of the verbs, **the overall distances are smaller** and the **distances between different captions of different images is smaller** but these distances among captions from the same images **is a lot less**

So to summarize **having verbs gives greater absolute distances** whereas **not having verbs gives tighter distances among the same image captions**