### Transforming word vectors

$$XR\approx Y $$

> In translation, X is the matrix of English word vectors and Y is the matrix of French word vectors. R is the mapping matrix that transforms X so that it approximates Y.

**Learn R**

- Initialize R
- For loop
    - Loss = $\lVert XR-Y \rVert_{F}$
    - $g=\frac{d}{dR}Loss$
    - $R = R-\alpha * g$
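
The loop above can be sketched in NumPy. This is only a minimal sketch under stated assumptions: it swaps in the squared Frobenius norm averaged over the $m$ rows of $X$ as the loss, because its gradient has the simple closed form $\frac{2}{m}X^T(XR-Y)$. The function name `learn_r`, the learning rate, the step count, and the synthetic data are all illustrative and not taken from the notes.

```python
import numpy as np

def learn_r(X, Y, steps=500, alpha=0.1, seed=0):
    """Gradient descent on the assumed loss ||XR - Y||_F^2 / m."""
    rng = np.random.default_rng(seed)
    m, d = X.shape
    # Initialize R randomly
    R = rng.normal(size=(d, Y.shape[1]))
    for _ in range(steps):
        diff = X @ R - Y               # residual of the current mapping
        g = (2.0 / m) * (X.T @ diff)   # gradient of the assumed loss w.r.t. R
        R = R - alpha * g              # gradient descent update
    return R

# Synthetic check: build Y from a known mapping and try to recover it
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
R_true = rng.normal(size=(3, 3))
Y = X @ R_true

R = learn_r(X, Y)
loss = np.linalg.norm(X @ R - Y) ** 2 / X.shape[0]
print(loss)
```

On this noise-free toy data the learned `R` should end up close to `R_true`; with real word vectors the fit is only approximate.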

**Frobenius norm**

$$A = \begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix}$$
$$\lVert A \rVert_{F} = \sqrt{2^2+2^2+2^2+2^2} = 4$$
$$\lVert A \rVert_{F} = \sqrt{\sum_{i=1}^m \sum_{j=1}^n \lvert a_{ij}\rvert^2}$$
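
As a quick check of the formula, the same matrix can be evaluated in NumPy. The sketch below computes the norm once by hand and once with `np.linalg.norm`; the variable names are illustrative.

```python
import numpy as np

A = np.array([[2, 2],
              [2, 2]])

# Square every entry, sum them, and take the square root
frob_manual = np.sqrt(np.sum(np.abs(A) ** 2))

# NumPy's built-in Frobenius norm for matrices
frob_numpy = np.linalg.norm(A, ord="fro")

# Both evaluate to 4 for this matrix
print(frob_manual, frob_numpy)
```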

**Hash table and hash functions**

A hash function takes data of arbitrary size and maps it to a fixed-size value. The returned values are known as hash values, or simply hashes. A hash table stores each item in the bucket selected by its hash value.
> Hash function(vector) -> Hash value

> hash value = vector % number of buckets

In [None]:
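def basic_has_table(value_l, n_buckets):
    """Group the values in value_l into n_buckets buckets by hash value."""
    def hash_function(value, n_buckets):
        # The hash value is the remainder after dividing by the number of buckets
        return int(value) % n_buckets
    # One empty bucket per possible hash value
    hash_table = {i: [] for i in range(n_buckets)}
    for value in value_l:
        hash_value = hash_function(value, n_buckets)
        hash_table[hash_value].append(value)
    return hash_table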
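
To see the bucketing in action, here is a small self-contained sketch of the same modulo scheme. The function name `bucket_by_modulo` and the input values are illustrative assumptions, chosen so that two pairs of values collide in the same bucket.

```python
def bucket_by_modulo(values, n_buckets):
    # One empty bucket per possible hash value
    hash_table = {i: [] for i in range(n_buckets)}
    for value in values:
        # hash value = value % number of buckets
        hash_table[int(value) % n_buckets].append(value)
    return hash_table

table = bucket_by_modulo([100, 10, 14, 17, 97], n_buckets=10)
print(table[0])  # [100, 10]
print(table[4])  # [14]
print(table[7])  # [17, 97]
```

Values that share a bucket (100 and 10, or 17 and 97) are not necessarily close to each other, which is why a plain modulo hash is not enough for nearest-neighbor search.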
    

**Approximate nearest neighbors**

In [None]:
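import numpy as np

def side_of_plane_matrix(P, v):
    # Each row of P is the normal vector of one plane; the sign of the
    # dot product tells which side of that plane v lies on
    dotproduct = np.dot(P, v.T)
    sign_of_dot_product = np.sign(dotproduct)
    return sign_of_dot_product

num_dimensions = 2
num_planes = 3
random_planes_matrix = np.random.normal(size=(num_planes, num_dimensions))
v = np.array([[2, 2]])  # example vector; this value is an assumption, v is not defined in these notes
num_planes_matrix = side_of_plane_matrix(random_planes_matrix, v)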

**Multiple Planes**

Given a point $v$, you can run it through several projections $P_1, P_2, P_3$ to get a single hash value. Suppose $P_1v^T$ is positive, so you set $h_1 = 1$. $P_2v^T$ is also positive, so $h_2 = 1$. $P_3v^T$ is negative, so you set $h_3 = 0$. Then

$hash = 2^0 \times h_1 +2^1 \times h_2 +2^2 \times h_3 = 1 \times 1 + 2\times 1 + 4 \times 0 = 3$

$$\text{hash value} = \sum_{i=1}^{H} 2^{i-1} \times h_i$$

In [None]:
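def has_multiple_plane(P_l, v):
    # P_l: list of planes, each given by its normal vector
    hash_value = 0
    for i, P in enumerate(P_l):
        # Scalar sign of the dot product between plane P and vector v
        sign = side_of_plane_matrix(P, v).item()
        hash_i = 1 if sign >= 0 else 0
        # i starts at 0, so the first plane contributes 2**0
        hash_value += 2**i * hash_i
    return hash_value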
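
The worked example from the text (two positive projections, one negative, hash value 3) can be reproduced with a self-contained sketch. The plane normals and the vector below are hypothetical values picked to give exactly those signs, and `hash_from_planes` is an illustrative name.

```python
import numpy as np

def hash_from_planes(planes, v):
    hash_value = 0
    for i, plane in enumerate(planes):
        # h_i is 1 when v is on the non-negative side of the plane, else 0
        h_i = 1 if np.dot(plane, v) >= 0 else 0
        # enumerate starts at 0, so the first plane is weighted by 2**0
        hash_value += (2 ** i) * h_i
    return hash_value

# Hypothetical planes: the first two give positive dot products with v, the third a negative one
planes = [np.array([1, 1]), np.array([1, 0]), np.array([-1, -1])]
v = np.array([2, 2])

hash_value = hash_from_planes(planes, v)
print(hash_value)  # 3 = 1*1 + 2*1 + 4*0
```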

**Searching documents**

In [None]:
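# Toy word embeddings
word_embedding = {"I": np.array([1, 0, 1]),
                  "love": np.array([-1, 0, 1]),
                  "learning": np.array([1, 0, 1])}
words_in_document = ["I", "love", "learning"]
# The document embedding is the sum of the embeddings of its words
document_embedding = np.array([0, 0, 0])
for word in words_in_document:
    # Words missing from the dictionary contribute 0
    document_embedding += word_embedding.get(word, 0)
print(document_embedding)  # [1 0 3]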
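
Once documents are represented as vectors, searching comes down to comparing a query vector against each document vector. The sketch below is one possible way to do that, assuming cosine similarity as the comparison and a brute-force scan over all documents. The helper names, the two toy documents, and the query are illustrative; the word vectors are the ones defined above.

```python
import numpy as np

word_embedding = {"I": np.array([1, 0, 1]),
                  "love": np.array([-1, 0, 1]),
                  "learning": np.array([1, 0, 1])}

def embed_document(words, embeddings, dim=3):
    # Sum the word vectors; unknown words contribute a zero vector
    doc = np.zeros(dim)
    for word in words:
        doc += embeddings.get(word, np.zeros(dim))
    return doc

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical mini-corpus and query
documents = {"doc_a": ["I", "love", "learning"],
             "doc_b": ["love"]}
query = embed_document(["I", "learning"], word_embedding)

scores = {name: cosine_similarity(query, embed_document(words, word_embedding))
          for name, words in documents.items()}
best = max(scores, key=scores.get)
print(best)  # doc_a
```

A brute-force scan like this compares the query against every document; the hashing with multiple planes described earlier is a way to restrict the comparison to documents in the same bucket.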