# local vector similarity measure across slices of data (e.g. word embeddings across time slices)

For the algorithm and the mathematical formula, see the paper:
William L. Hamilton, Jure Leskovec, Dan Jurafsky. Cultural Shift or Linguistic Drift? Comparing Two Computational Measures of Semantic Change.
https://www.aclweb.org/anthology/D16-1229/

1. Compute the cosine similarity between word_i and every word_j in the union of word_i's k nearest neighbours at time t and its k nearest neighbours at time t+1. The similarity to word_j is stored at position j of the vector s. This gives the vector s_i at time t.

2. Do the same for times t+1 and t+2 to get s_i at time t+1.

3. Compute the cosine distance, i.e. 1 - cosine similarity, between s_i at time t and s_i at time t+1.
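
The three steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the paper's reference code: `emb_t` and `emb_t1` are hypothetical embedding matrices for two slices whose rows are aligned to the same vocabulary, and the helper names `cosine_to_all`, `top_k_neighbours` and `local_distance` are made up for this example. It also assumes that both similarity vectors are computed over the same neighbour union, once with each slice's embeddings, so that position j refers to the same word in both vectors.

```python
import numpy as np

def cosine_to_all(vec, matrix):
    """Cosine similarity between vec and every row of matrix."""
    norms = np.linalg.norm(matrix, axis=1) * np.linalg.norm(vec)
    return matrix @ vec / norms

def top_k_neighbours(emb, i, k):
    """Indices of the k rows of emb most cosine-similar to row i (excluding i)."""
    sims = cosine_to_all(emb[i], emb)
    sims[i] = -np.inf  # never pick the word itself
    return np.argsort(-sims)[:k]

def local_distance(emb_t, emb_t1, i, k):
    """Cosine distance between word i's similarity vectors in two slices."""
    # Union of the k nearest neighbours found in each slice (hypothetical setup).
    union = np.union1d(top_k_neighbours(emb_t, i, k),
                       top_k_neighbours(emb_t1, i, k))
    # Similarity of word i to each union member, computed per slice.
    s_t = cosine_to_all(emb_t[i], emb_t[union])
    s_t1 = cosine_to_all(emb_t1[i], emb_t1[union])
    # Cosine distance between the two similarity vectors.
    cos = s_t @ s_t1 / (np.linalg.norm(s_t) * np.linalg.norm(s_t1))
    return 1.0 - cos

rng = np.random.default_rng(0)  # toy data, not real embeddings
emb_t = rng.normal(size=(50, 8))
emb_t1 = emb_t + 0.1 * rng.normal(size=(50, 8))
d = local_distance(emb_t, emb_t1, i=0, k=10)
```

Because the union can contain anywhere between k and 2k distinct words, this sketch does not assume a fixed vector length of 2k.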

In [2]:
import numpy as np

In [3]:
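def cosinesim(x, y):
    '''Cosine similarity of two 1-D vectors x and y.

    Note: np.linalg.norm(y) is a single scalar even when y is 2-D (the Frobenius
    norm of the whole matrix), so for a matrix y the result is x @ y divided by
    one constant, not a vector of per-column cosine similarities.
    '''
    return x @ y / (np.linalg.norm(x) * np.linalg.norm(y))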
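
Since the function is later called with a matrix as second argument, it may help to see how a per-neighbour normalisation differs from dividing by a single matrix norm. The sketch below is illustrative only; `cosinesim_rows` is a hypothetical helper that is not part of this notebook.

```python
import numpy as np

def cosinesim_rows(x, m):
    """Cosine similarity between vector x and each row of matrix m."""
    row_norms = np.linalg.norm(m, axis=1)  # one norm per row
    return m @ x / (np.linalg.norm(x) * row_norms)

x = np.array([1.0, 0.0])
m = np.array([[2.0, 0.0],   # same direction as x
              [0.0, 5.0],   # orthogonal to x
              [3.0, 3.0]])  # 45 degrees from x

per_row = cosinesim_rows(x, m)  # approximately [1.0, 0.0, 0.7071]

# np.linalg.norm(m) without an axis is the Frobenius norm of the whole matrix,
# so every entry is divided by the same constant.
single_norm = m @ x / (np.linalg.norm(x) * np.linalg.norm(m))

print(per_row)
print(single_norm)
```

With the per-row version the first neighbour gets similarity 1 regardless of its length, whereas with a single norm the entries still depend on each row's magnitude.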

In [10]:
def local_vec_similarity(word_i, neighbour_matrix_t1, neighbour_matrix_t2, k_neighbours=10):
    '''
    Returns the cosine distance (1 - cosine similarity) between word_i's
    similarity vectors s_i for two consecutive slices.

    inputs:
        - word_i: the embedding vector of the word whose change in meaning over time
        (or across geographical regions, or however the data is sliced) we are trying to track

        - neighbour_matrix_t1: the vertically stacked, ordered vectors
        (from most to least cosine-similar) of word_i's k_neighbours at time t and at time t+1.
        Its dimension is: 2*k_neighbours x embedding_size

        - neighbour_matrix_t2: same as above, but for word_i's k_neighbours at time t+1 and time t+2

        - k_neighbours: number of neighbours to compare against. It should be between 10 and 50 (cf. op. cit., p. 2118).
        (Programmatically this input is not necessary, because it can be inferred as #rows/2 of the neighbour_matrix,
        but since there is a clear recommendation as to how many neighbours should be used,
        it is good to make it explicit in the input.)
    '''

    # number of stacked neighbour vectors expected in each matrix
    j = k_neighbours * 2

    assert j == neighbour_matrix_t1.shape[0] == neighbour_matrix_t2.shape[0]

    # One entry per stacked neighbour vector. cosinesim divides by one norm for
    # the whole matrix, so the entries are dot products up to a common scale factor.
    s_i_t1 = cosinesim(word_i, neighbour_matrix_t1.T)
    s_i_t2 = cosinesim(word_i, neighbour_matrix_t2.T)

    assert j == s_i_t1.shape[0] == s_i_t2.shape[0]

    # cosine distance between the two similarity vectors
    return 1 - cosinesim(s_i_t1, s_i_t2)
    

In [36]:
word = np.ones(4)
mtx1 = np.ones((20,4)) * 3
mtx2 = np.ones((20,4))/2
local_vec_similarity(word_i=word, neighbour_matrix_t1=mtx1, neighbour_matrix_t2=mtx2, k_neighbours=10)

0.0
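
A result of 0.0 is expected here: both matrices are constant multiples of an all-ones matrix, so the two similarity vectors point in the same direction and differ only in scale, which the final cosine ignores. The short check below reproduces this by hand; the helper `cosine` is a stand-in defined only for this example.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity of two 1-D vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

word = np.ones(4)
mtx1 = np.ones((20, 4)) * 3
mtx2 = np.ones((20, 4)) / 2

# Dot product of word with every neighbour row. A common scale factor per
# matrix does not change the direction of the resulting vector.
s1 = mtx1 @ word  # every entry is 12.0
s2 = mtx2 @ word  # every entry is 2.0

distance = 1 - cosine(s1, s2)
print(np.isclose(distance, 0.0))  # True
```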

In [39]:
mtx11 = np.random.rand(20,4)
mtx22 = np.arange(20*4).reshape(20,-1)
local_vec_similarity(word_i=word, neighbour_matrix_t1=mtx11, neighbour_matrix_t2=mtx22, k_neighbours=10)

0.1351533051132151

In [35]:
mtx3 = np.random.rand(20,4)
local_vec_similarity(word_i=word, neighbour_matrix_t1=mtx11, neighbour_matrix_t2=mtx3, k_neighbours=10)

0.060764273511923816

In [38]:
mtx4 = np.flip(np.arange(20*4)).reshape(20,-1)
local_vec_similarity(word_i=word, neighbour_matrix_t1=mtx22, neighbour_matrix_t2=mtx4, k_neighbours=10)

0.5085434341020432