# Simple examples

In this notebook we include a complete example for a pair of concepts (*gun-knife*) using the representation model presented in the (under review) article: "Vector Symbolic Architectures for Concept Representation". 

Many of the steps carried out in the following cells are not performed explicitly in the main files of the project. This demonstration is for didactic purposes only.


### Libraries

In [38]:
import random
import numpy as np  # later cells refer to NumPy through the alias `np`
import pandas as pd
import matplotlib.pyplot as plt

# Declaring all functions in HDComputingBasics & EncodingDataset
%run EncodingDataset.ipynb

### Semantic features of *knife* and *gun*.

These are some of the semantic features for *knife* and *gun* from the McRae et al. dataset.

In [39]:
# General dictionary of definitions
Dictionary = {} #['knife', 'gun']

Dictionary['knife'] = ['has a handle', 'made of metal', 'is cutlery', 'is sharp', 
                 'used for cutting', 'is a weapon', 'used for killing', 'found in kitchens']

Dictionary['gun'] = ['has a trigger', 'made of metal', 'used for hunting', 'used for killing', 
              'is dangerous', 'is a weapon', 'used by police', 'requires bullets']

### Formatting semantic features 

Each semantic feature is divided into a relation and a feature.

Some features are lemmatized.

In [40]:
Dictionary['knife'] = [['has','handle'], ['made_of', 'metal'], ['is', 'cutlery'], ['is', 'sharp'], ['used_for', 'cut'],
                ['is','weapon'], ['found_in', 'kitchen']]

Dictionary['gun'] = [['has', 'trigger'], ['made_of', 'metal'], ['used_for', 'hunt'], ['used_for','kill'], ['is', 'dangerous'],
                ['is', 'weapon'], ['used_by', 'police'], ['requires', 'bullets']]
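
The split into relation and feature was done by hand in the cell above. As a rough sketch of how it could be automated, the hypothetical helper below (`split_feature`, `RELATIONS` and `ARTICLES` are illustrative names, not part of the project) matches a known relation prefix and drops articles. It does not lemmatize, so `'kitchens'` stays plural, unlike the hand-made list.

```python
# Hypothetical helper: split a raw feature string into [relation, feature].
# The relation list is an assumption taken from the features shown above.
RELATIONS = ['used for', 'used by', 'made of', 'found in', 'requires', 'has', 'is']
ARTICLES = {'a', 'an', 'the'}


def split_feature(raw):
    """Return [relation, feature] for a raw feature string (no lemmatization)."""
    for relation in RELATIONS:
        if raw.startswith(relation + ' '):
            rest = raw[len(relation):].split()
            words = [w for w in rest if w not in ARTICLES]
            return [relation.replace(' ', '_'), '_'.join(words)]
    raise ValueError(f"no known relation in {raw!r}")


print(split_feature('has a handle'))       # ['has', 'handle']
print(split_feature('made of metal'))      # ['made_of', 'metal']
print(split_feature('found in kitchens'))  # ['found_in', 'kitchens']
```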

### From semantic features to HD vectors

#### Extracting all single words (relations & features) from Dictionary

In [41]:
All_words_set = set(Dictionary.keys())

for concept in Dictionary.values():
    for SemFeature in concept:
        All_words_set.add(SemFeature[0]) 
        All_words_set.add(SemFeature[1]) 

All_words = list(All_words_set)
print(All_words)

['dangerous', 'kill', 'requires', 'found_in', 'cut', 'sharp', 'trigger', 'gun', 'bullets', 'used_by', 'cutlery', 'knife', 'police', 'made_of', 'is', 'used_for', 'weapon', 'metal', 'handle', 'has', 'hunt', 'kitchen']


#### Assigning a random high-dimensional vector to each relation, feature and concept

In [42]:
# Dictionary of vectors
Dictionary_vecs = {}

for word in All_words:
    Dictionary_vecs[word] = HDvector(N)  # N = 10,000... 

# Dictionary_vecs
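
The `HDvector` class is defined in `EncodingDataset.ipynb` and is not shown here. For readers without that file, the following is a minimal stand-in, assuming dense binary vectors with equally likely 0s and 1s (which is consistent with the outputs printed in this notebook). The names `random_hd_vector`, `hamming` and `item_memory` are illustrative, not the project's API.

```python
import numpy as np

N = 10_000                       # dimensionality used in this notebook
rng = np.random.default_rng(0)   # seeded only to make the sketch reproducible


def random_hd_vector(n=N):
    """Hypothetical stand-in for HDvector(N): a dense random binary vector."""
    return rng.integers(0, 2, size=n, dtype=np.uint8)


def hamming(a, b):
    """Number of positions at which two binary vectors differ."""
    return int(np.count_nonzero(a != b))


# One random vector per word, as in Dictionary_vecs.
item_memory = {word: random_hd_vector() for word in ['knife', 'gun', 'has', 'handle']}
print(hamming(item_memory['knife'], item_memory['gun']))  # close to N / 2
```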

#### Verifying (approximate) orthogonality

This loop measures the Hamming distance between every pair of vectors declared so far. For random vectors of dimension N, each distance should be close to N/2.

In [43]:
Num_of_vecs = len(Dictionary_vecs.values())
list_vectors = list(Dictionary_vecs.values())
All_distances = []

for i in range(Num_of_vecs):
    for j in range(i+1, Num_of_vecs):
        # Function to measure distance between two vectors
        All_distances.append( HDvector.dist(list_vectors[i], list_vectors[j]) )
        
# Pandas series
All_distances_ser = pd.Series(All_distances)

# Descriptive analytics of distances:
All_distances_ser.describe()

count     231.000000
mean     4996.731602
std        51.353902
min      4870.000000
25%      4959.000000
50%      5003.000000
75%      5033.000000
max      5162.000000
dtype: float64
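
These statistics match what theory predicts. If each bit is an independent fair coin flip, the Hamming distance between two vectors follows a binomial distribution with mean N/2 and standard deviation sqrt(N)/2, i.e. 5,000 and 50 for N = 10,000. The sketch below checks this with plain NumPy arrays (an assumption standing in for `HDvector`) and the same number of vectors, 22, which gives the 231 pairs counted above.

```python
import math
import numpy as np

N = 10_000
expected_mean = N / 2              # binomial mean: N * 0.5
expected_std = math.sqrt(N) / 2    # binomial std: sqrt(N * 0.5 * 0.5)

rng = np.random.default_rng(1)
vectors = rng.integers(0, 2, size=(22, N), dtype=np.uint8)

# Hamming distance for every unordered pair of the 22 vectors.
distances = [int(np.count_nonzero(vectors[i] != vectors[j]))
             for i in range(22) for j in range(i + 1, 22)]

print(len(distances))                          # 231
print(expected_mean, expected_std)             # 5000.0 50.0
print(np.mean(distances), np.std(distances))   # empirical values, close to the above
```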

#### Multiplication leads to an orthogonal vector

Multiplying two random (approximately orthogonal) vectors yields a third vector that is approximately orthogonal to both of them, i.e. at a distance of about N/2 from each.

In [44]:
vec_A = HDvector(N) # A random vector
vec_B = HDvector(N) # Another random vector
print("Vector A:", vec_A)
print("Vector B:", vec_B)
print("Distance: ", vec_A.dist(vec_B))

mult = vec_A * vec_B
print('Multiplication...:', mult)
print('Distance from A to mult:', vec_A.dist(mult))
print('Distance from B to mult:', vec_B.dist(mult))

Vector A: [1 0 0 ... 0 1 0]
Vector B: [0 1 0 ... 1 1 0]
Distance:  4977
Multiplication...: [1 1 0 ... 1 0 0]
Distance from A to mult: 4988
Distance from B to mult: 5035
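
The printed bits above are consistent with multiplication being an elementwise XOR, although the actual `HDvector.__mul__` is defined elsewhere. Under that assumption, the sketch below illustrates two properties that the rest of the notebook relies on: the product is far from both operands, and multiplying again by one operand recovers the other exactly.

```python
import numpy as np

N = 10_000
rng = np.random.default_rng(2)
a = rng.integers(0, 2, size=N, dtype=np.uint8)
b = rng.integers(0, 2, size=N, dtype=np.uint8)


def hamming(x, y):
    """Number of positions at which two binary vectors differ."""
    return int(np.count_nonzero(x != y))


bound = np.bitwise_xor(a, b)   # assumed binding (multiplication) operation

print(hamming(a, bound), hamming(b, bound))   # both close to N / 2

# XOR is its own inverse: binding again with b recovers a exactly.
print(np.array_equal(np.bitwise_xor(bound, b), a))  # True
```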


### Creating semantic pointers

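The general equation for a semantic pointer is:

$$\text{Semantic Pointer} = \text{Relation}_0 * \text{Feature}_0 + \text{Relation}_1 * \text{Feature}_1 + \dots + \text{Relation}_n * \text{Feature}_n$$

Here $*$ is the multiplication shown in the previous cell and $+$ is the addition performed by the `ADD` function. In this cell we evaluate this equation for the concepts *knife* and *gun*.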

In [45]:
knife_SP = ADD([Dictionary_vecs[x[0]] * Dictionary_vecs[x[1]] for x in Dictionary['knife']])

gun_SP = ADD([Dictionary_vecs[x[0]] * Dictionary_vecs[x[1]] for x in Dictionary['gun']])
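
`ADD` is defined in the project's helper notebooks and is not shown here. A common way to add binary HD vectors is a bitwise majority vote, and the sketch below assumes that behaviour, with ties broken at random when the number of inputs is even. The function name `bundle` is illustrative. The point to notice is that the result stays measurably close to each of its inputs while remaining far from unrelated vectors.

```python
import numpy as np

N = 10_000
rng = np.random.default_rng(3)


def bundle(vectors):
    """Bitwise majority vote; ties (possible for even counts) are broken at random."""
    stacked = np.stack(vectors)
    ones = stacked.sum(axis=0)           # how many inputs have a 1 at each position
    half = len(vectors) / 2
    result = (ones > half).astype(np.uint8)
    ties = ones == half
    result[ties] = rng.integers(0, 2, size=int(ties.sum()), dtype=np.uint8)
    return result


# Seven random vectors stand in for the seven bound pairs of 'knife'.
pairs = [rng.integers(0, 2, size=N, dtype=np.uint8) for _ in range(7)]
pointer = bundle(pairs)
unrelated = rng.integers(0, 2, size=N, dtype=np.uint8)

print(np.count_nonzero(pointer != pairs[0]))   # well below N / 2
print(np.count_nonzero(pointer != unrelated))  # close to N / 2
```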

### Measuring distance

In [46]:
HDvector.dist(knife_SP, gun_SP)

4127

#### Measuring similarity

To measure the similarity we apply the following formula:

$$\text{Semantic similarity} = 1 - \frac{\text{Hamming distance}}{N}$$

In [47]:
SemSim = 1 - float(HDvector.dist(knife_SP, gun_SP)) / N

SemSim

0.5872999999999999

This semantic similarity value is well above the 0.5 expected for unrelated vectors, which indicates that the concepts are related (the closer to 1, the more similar). As seen before, the mean distance between two random vectors is (in this case) approximately 5,000, whereas the distance between the two semantic pointers is 4,127. The hyperdimensional computing operations were therefore able to encode the semantic similarity between the two sets of semantic features.

The average semantic similarity according to humans for this pair was **4.88**.
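
To put the 0.587 value in context, the short calculation below reuses only numbers printed earlier in this notebook: the knife-gun distance (4127) and the mean and standard deviation of the distances between random vectors. It expresses the gap in standard deviations, which comes out at roughly 17. The function name `semantic_similarity` is illustrative.

```python
N = 10_000


def semantic_similarity(hamming_distance, n=N):
    """1 - normalized Hamming distance; about 0.5 for unrelated random vectors."""
    return 1 - hamming_distance / n


random_mean = 4996.731602   # mean distance between random vectors (table above)
random_std = 51.353902      # standard deviation of those distances (table above)

knife_gun = semantic_similarity(4127)        # distance measured above
baseline = semantic_similarity(random_mean)  # similarity of a typical random pair

# Gap between the random baseline and the knife-gun distance, in standard deviations.
z = (random_mean - 4127) / random_std

print(round(knife_gun, 4), round(baseline, 4), round(z, 1))
```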

### Interpretability

**What does a knife have? What does a gun have?**

HD representations have the useful property of being interpretable: given a concept's semantic pointer, it is possible to recover the semantic features that were used to create it.

In [48]:
# Multiplying each semantic pointer by the relation vector 'has'
knife_has = knife_SP * Dictionary_vecs['has']
gun_has = gun_SP * Dictionary_vecs['has']

In [49]:
# Finding the closest vector 
knife_has_dists = []
gun_has_dists = []

for i in range(Num_of_vecs):
    knife_has_dists.append( HDvector.dist(knife_has, list_vectors[i]) )
    gun_has_dists.append( HDvector.dist(gun_has, list_vectors[i]) )
        
knife_has_dists = np.array(knife_has_dists)
gun_has_dists = np.array(gun_has_dists)

print(All_words[np.argmin(knife_has_dists)])
print(All_words[np.argmin(gun_has_dists)])

handle
trigger
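
The reason this lookup works can be seen in a small self-contained sketch. It assumes XOR for multiplication and a bitwise majority vote for addition (assumptions about the project's `HDvector` and `ADD`, which are defined elsewhere). Multiplying the pointer by `has` cancels the relation in the `has * handle` term and leaves a noisy copy of `handle`, while the other terms turn into noise.

```python
import numpy as np

N = 10_000
rng = np.random.default_rng(4)
words = ['has', 'handle', 'is', 'sharp', 'made_of', 'metal']
memory = {w: rng.integers(0, 2, size=N, dtype=np.uint8) for w in words}

# Bundle three bound pairs with a bitwise majority vote (odd count, so no ties).
bound_pairs = [memory['has'] ^ memory['handle'],
               memory['is'] ^ memory['sharp'],
               memory['made_of'] ^ memory['metal']]
pointer = (np.sum(bound_pairs, axis=0) >= 2).astype(np.uint8)

# Unbind with 'has', then look up the nearest stored vector.
query = pointer ^ memory['has']
closest = min(words, key=lambda w: np.count_nonzero(query != memory[w]))
print(closest)  # handle
```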


**Why are *knife* and *gun* similar?**

To answer this question, the analysis in the preceding cell should be repeated on each concept vector for every relation, with the goal of finding the semantic features that the two concepts share.
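
A possible version of that analysis is sketched below. It is self-contained and does not use the project's classes: XOR stands in for multiplication, a majority vote with random tie-breaking stands in for `ADD`, and the distance threshold of 0.45 N is an arbitrary illustrative choice that sits well below the roughly N/2 distance of unrelated vectors. The names `pointer`, `recovered` and `shared` are hypothetical.

```python
import numpy as np

N = 10_000
rng = np.random.default_rng(5)

features = {
    'knife': [['has', 'handle'], ['made_of', 'metal'], ['is', 'cutlery'], ['is', 'sharp'],
              ['used_for', 'cut'], ['is', 'weapon'], ['found_in', 'kitchen']],
    'gun': [['has', 'trigger'], ['made_of', 'metal'], ['used_for', 'hunt'], ['used_for', 'kill'],
            ['is', 'dangerous'], ['is', 'weapon'], ['used_by', 'police'], ['requires', 'bullets']],
}
vocab = sorted({w for pairs in features.values() for pair in pairs for w in pair})
memory = {w: rng.integers(0, 2, size=N, dtype=np.uint8) for w in vocab}


def pointer(pairs):
    """Bind each pair with XOR, then bundle by majority (random tie-break)."""
    ones = np.sum([memory[r] ^ memory[f] for r, f in pairs], axis=0)
    tie_break = rng.integers(0, 2, size=N, dtype=np.uint8)
    return (2 * ones + tie_break > len(pairs)).astype(np.uint8)


def recovered(sp, relation, threshold=0.45 * N):
    """Words whose vector is clearly closer than chance to sp unbound by relation."""
    query = sp ^ memory[relation]
    return {w for w in vocab if np.count_nonzero(query != memory[w]) < threshold}


sps = {concept: pointer(pairs) for concept, pairs in features.items()}
relations = sorted({r for pairs in features.values() for r, _ in pairs})
shared = {r: recovered(sps['knife'], r) & recovered(sps['gun'], r) for r in relations}
print({r: sorted(ws) for r, ws in shared.items() if ws})
# {'is': ['weapon'], 'made_of': ['metal']}
```

With the feature lists used in this notebook, the two concepts overlap on *is weapon* and *made of metal*, which is what pulls their semantic pointers closer together than two random vectors.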