Ah! Ok, the simple idea is "cosine similarity of vectors"

In [2]:
import numpy as np
import random

In [None]:
user_pref_latent = np.array([random.randint(0, 1) for _ in range(10)])
company_data_latent = np.array([random.randint(0, 1) for _ in range(10)])
print(user_pref_latent)
print(company_data_latent)

[1 0 1 0 0 0 1 1 0 1]
[0 0 0 0 0 1 1 0 1 1]


Now, what do these vectors mean? In this case, nothing yet, but we can still measure a similarity score between them.

In [None]:
cos_similarity = np.dot(user_pref_latent, company_data_latent) / (np.linalg.norm(user_pref_latent) * np.linalg.norm(company_data_latent))
cos_similarity

np.float64(0.4472135954999579)
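One edge case worth guarding: with random 0/1 vectors, a vector can come out all zeros, and then the norm is 0 and the formula divides by zero. A minimal sketch of a helper follows; the name `cosine_similarity` and the choice to return 0.0 for a zero vector are assumptions for illustration, not something the cells above define.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        # Assumption: treat "no information" as zero similarity.
        return 0.0
    return float(np.dot(a, b) / denom)

# The two vectors printed above.
user = np.array([1, 0, 1, 0, 0, 0, 1, 1, 0, 1])
company = np.array([0, 0, 0, 0, 0, 1, 1, 0, 1, 1])
print(cosine_similarity(user, company))
```

For the two vectors printed above this should agree with the value from the cell (about 0.447), since they share two 1's and have norms sqrt(5) and sqrt(4).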

So, we would store the user-company interactions as a table (matrix).

Each row represents a user, each column a company, and each entry their interaction result:

*   1 - the user liked the company
*   0 - the user swiped away the company
*   2 (placeholder value) - we don't have an interaction between them yet
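Because 2 is only a placeholder, it helps to separate "known" entries from "unknown" ones before doing any math on the table. A small sketch, assuming a made-up 3x3 example matrix and the hypothetical names `UNKNOWN` and `known_mask`:

```python
import numpy as np

UNKNOWN = 2  # placeholder value from the list above

# Hypothetical 3 users x 3 companies example, just for illustration.
example_matrix = np.array([
    [1, 0, 2],
    [2, 2, 1],
    [0, 1, 2],
])

# True where we actually have a like (1) or a swipe-away (0).
known_mask = example_matrix != UNKNOWN

print(known_mask)
print(known_mask.sum(), "known interactions")
print(np.argwhere(~known_mask))  # (user, company) pairs we still need to predict
```

Keeping the mask around means the 2's never get treated as real scores by accident.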





In [None]:
user_company_matrix = np.array([[random.randint(0, 2) for _ in range(5)] for _ in range(5)])
user_company_matrix

array([[2, 0, 0, 2, 0],
       [1, 0, 0, 0, 0],
       [0, 1, 2, 1, 1],
       [2, 2, 0, 1, 2],
       [2, 2, 1, 2, 2]])

Since our database right now contains some 1's and 0's, we want to use those to predict the spots where there's currently a 2: will each one be a 1 or a 0? Will the user like or dislike the company?

So, what we can do is express this matrix as the product of 2 latent matrices (one row per user in the first, one row per company in the second):

In [None]:
user_latent_matrix = np.array([[round(random.random(), 2) for _ in range(10)] for _ in range(5)])
company_latent_matrix = np.array([[round(random.random(), 2) for _ in range(10)] for _ in range(5)])
print(user_latent_matrix)
print()
print(company_latent_matrix)

[[0.27 0.7  0.58 0.62 0.45 0.72 0.87 0.77 0.65 0.51]
 [0.73 0.3  0.76 0.55 0.92 0.24 0.31 0.82 0.39 0.66]
 [0.14 0.94 0.63 0.91 0.92 0.66 0.52 0.32 0.47 0.31]
 [0.86 0.69 0.99 0.42 0.65 0.26 0.5  0.22 0.27 0.82]
 [0.71 0.27 0.56 0.68 0.72 0.2  0.79 0.09 0.87 0.06]]

[[0.67 0.84 0.71 0.61 0.25 0.05 0.68 0.4  0.18 0.73]
 [0.93 0.63 0.92 0.3  0.16 0.85 0.72 0.76 0.39 0.9 ]
 [0.69 0.66 0.3  0.16 0.08 0.41 0.66 0.11 0.62 0.14]
 [0.55 0.5  0.5  0.87 0.58 0.33 0.81 0.99 0.97 0.72]
 [0.87 0.04 0.31 0.91 0.59 0.17 0.75 0.44 0.21 0.58]]


When we multiply the two matrices (the dot product of every user row with every company row, divided by the norms so that each entry is a cosine similarity), we get our calculated user-company matrix.

Of course the two look very different right now; that's because all I did was fill these big tables with random values.

But if the latent values are fitted so that the calculated matrix lines up with the entries we already have, the spots where we don't have answers get filled in with NEW values.

Those are our predictions for whether the user will like the company: if the value is close to 1, recommend it to them; if not, don't.

In [None]:
norm_term = np.linalg.norm(user_latent_matrix, axis=1, keepdims=True) * np.transpose(np.linalg.norm(company_latent_matrix, axis=1, keepdims=True))
user_company_matrix_calculated = np.matmul(user_latent_matrix, np.transpose(company_latent_matrix)) / norm_term
print(user_company_matrix)
print()
print(user_company_matrix_calculated)

[[2 0 0 2 0]
 [1 0 0 0 0]
 [0 1 2 1 1]
 [2 2 0 1 2]
 [2 2 1 2 2]]

[[0.85045583 0.8943326  0.82869308 0.9457052  0.78460214]
 [0.83894642 0.84702692 0.65423694 0.89915751 0.87215615]
 [0.80165129 0.74245792 0.72683745 0.83584909 0.7312311 ]
 [0.93979793 0.89721841 0.77962471 0.8102233  0.81981079]
 [0.75248948 0.70464034 0.83062123 0.85182542 0.83984843]]
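To put a number on "how different" the two tables are, one option is to compare them only where we actually have an answer. The sketch below assumes a hypothetical helper named `masked_mse` and reuses the two matrices printed above; it is one reasonable way to score the fit, not the only one.

```python
import numpy as np

def masked_mse(actual, predicted, unknown=2):
    """Mean squared error over entries of `actual` that are not the placeholder."""
    mask = actual != unknown
    return float(np.mean((predicted[mask] - actual[mask]) ** 2))

# Values copied from the output above.
user_company_matrix = np.array([
    [2, 0, 0, 2, 0],
    [1, 0, 0, 0, 0],
    [0, 1, 2, 1, 1],
    [2, 2, 0, 1, 2],
    [2, 2, 1, 2, 2],
])
user_company_matrix_calculated = np.array([
    [0.85045583, 0.8943326, 0.82869308, 0.9457052, 0.78460214],
    [0.83894642, 0.84702692, 0.65423694, 0.89915751, 0.87215615],
    [0.80165129, 0.74245792, 0.72683745, 0.83584909, 0.7312311],
    [0.93979793, 0.89721841, 0.77962471, 0.8102233, 0.81981079],
    [0.75248948, 0.70464034, 0.83062123, 0.85182542, 0.83984843],
])

print(masked_mse(user_company_matrix, user_company_matrix_calculated))
```

With random latent values this error is large, mostly because every prediction sits near 0.8 while many known entries are 0. Training should push it down.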


So now here comes the question that I need to answer:


*   How do I get an initial vector representation for a user's vector?
*   How do I get an initial vector representation for a company's vector?


*   How do I update the user's vector as they make choices to like/dislike companies?
*   Should I update the company's vectors? If so, how as well?


As far as I can see, it is about "tracking and updating gradients" between these vectors.

Given 2 vectors (user and company), I would want to "nudge" the user vector and the company vector closer together if they should have a result of 1.

And vice versa: move the 2 vectors further apart if they should have a result of 0.

This would keep being updated as we get more and more data about users and companies?

In [21]:
user_latent_vector = np.array([round(random.random(), 2) for _ in range(10)])
company_latent_vector = np.array([round(random.random(), 2) for _ in range(10)])
print(user_latent_vector)
print(company_latent_vector)

[0.87 0.62 0.16 0.79 0.92 0.02 0.01 0.61 0.57 0.52]
[0.54 0.26 0.52 0.36 0.71 0.93 0.6  0.2  0.89 0.63]


Assume that these two should have a result value of 1.

In [22]:
old_distance = np.dot(user_latent_vector, company_latent_vector) / (np.linalg.norm(user_latent_vector) * np.linalg.norm(company_latent_vector))
print(old_distance)

0.719319229837224


Now we need to update the vectors to make this value larger?

Let's call the vectors these variables respectively:

*   U - user vector
*   C - company vector

I need Delta (U) and Delta (C) with respect to the cosine similarity function, and going one step further, with respect to a loss function: the squared difference between the cosine similarity and the value we want. That requires multivariable calculus.

But da faq how do I do this with vectors and matrices bro... GPT time

In [23]:
u = user_latent_vector.copy()
c = company_latent_vector.copy()
target = 1
learning_rate = 0.1

u_norm = np.linalg.norm(u)
c_norm = np.linalg.norm(c)

u_hat = u / u_norm
c_hat = c / c_norm

cos_similarity = np.dot(u_hat, c_hat)
loss = (cos_similarity - target) ** 2

# Gradient of the cosine similarity itself, with respect to u and to c
grad_u_cos = (c_hat - cos_similarity * u_hat) / u_norm
grad_c_cos = (u_hat - cos_similarity * c_hat) / c_norm

# Chain Rule To Loss Function (God I forgot this part)
grad_u = 2 * (cos_similarity - target) * grad_u_cos
grad_c = 2 * (cos_similarity - target) * grad_c_cos

# Update! (step against the gradient to shrink the loss)
u -= learning_rate * grad_u
c -= learning_rate * grad_c

# Check the new similarity (called "distance" here, but larger means closer)
new_distance = np.dot(u, c) / (np.linalg.norm(u) * np.linalg.norm(c))
print(old_distance)
print(new_distance)
print(new_distance - old_distance)

0.719319229837224
0.7339560594124371
0.014636829575213173


As you can see, we got a tad bit closer!
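For reference, the gradient lines in the cell above follow from differentiating the cosine similarity directly. A sketch of the derivation, writing $s$ for the cosine similarity, $t$ for the target, and $\eta$ for the learning rate (symbols chosen here for illustration):

```latex
\begin{aligned}
s &= \frac{U \cdot C}{\lVert U \rVert \, \lVert C \rVert},
\qquad \hat{U} = \frac{U}{\lVert U \rVert},
\quad \hat{C} = \frac{C}{\lVert C \rVert} \\[4pt]
\frac{\partial s}{\partial U}
  &= \frac{C}{\lVert U \rVert \, \lVert C \rVert}
   - \frac{(U \cdot C)\, U}{\lVert U \rVert^{3} \, \lVert C \rVert}
   = \frac{\hat{C} - s\,\hat{U}}{\lVert U \rVert} \\[4pt]
\frac{\partial s}{\partial C}
  &= \frac{\hat{U} - s\,\hat{C}}{\lVert C \rVert} \\[4pt]
L &= (s - t)^{2} \\[4pt]
\frac{\partial L}{\partial U} &= 2\,(s - t)\,\frac{\partial s}{\partial U},
\qquad
\frac{\partial L}{\partial C} = 2\,(s - t)\,\frac{\partial s}{\partial C} \\[4pt]
U &\leftarrow U - \eta\,\frac{\partial L}{\partial U},
\qquad
C \leftarrow C - \eta\,\frac{\partial L}{\partial C}
\end{aligned}
```

One thing that falls out of this: $\hat{U} \cdot (\hat{C} - s\,\hat{U}) = s - s = 0$, so the gradient is perpendicular to $U$. The update rotates the vector toward (or away from) the other one rather than stretching it along its own direction.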

In [24]:
print(user_latent_vector)
print(company_latent_vector)
print(u)
print(c)

[0.87 0.62 0.16 0.79 0.92 0.02 0.01 0.61 0.57 0.52]
[0.54 0.26 0.52 0.36 0.71 0.93 0.6  0.2  0.89 0.63]
[0.86853641 0.61704284 0.16618182 0.78667225 0.92058324 0.0340369
 0.01908858 0.60623478 0.57726346 0.52383654]
[0.54747484 0.2666829  0.51680475 0.36820343 0.71639485 0.92020413
 0.59363556 0.20718133 0.88907251 0.63113017]
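One step only moves things a little, so the natural next experiment is to repeat it. A sketch, assuming hypothetical helpers `cosine` and `cosine_step` that wrap the same math as the cell above, and an arbitrary choice of 200 steps:

```python
import numpy as np

def cosine(u, c):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(u, c) / (np.linalg.norm(u) * np.linalg.norm(c)))

def cosine_step(u, c, target, learning_rate=0.1):
    """One gradient step on (cos(u, c) - target)**2; returns new u and c."""
    u_norm, c_norm = np.linalg.norm(u), np.linalg.norm(c)
    u_hat, c_hat = u / u_norm, c / c_norm
    cos = np.dot(u_hat, c_hat)
    grad_u = 2 * (cos - target) * (c_hat - cos * u_hat) / u_norm
    grad_c = 2 * (cos - target) * (u_hat - cos * c_hat) / c_norm
    return u - learning_rate * grad_u, c - learning_rate * grad_c

# Same starting vectors as printed above.
u = np.array([0.87, 0.62, 0.16, 0.79, 0.92, 0.02, 0.01, 0.61, 0.57, 0.52])
c = np.array([0.54, 0.26, 0.52, 0.36, 0.71, 0.93, 0.6, 0.2, 0.89, 0.63])

history = [cosine(u, c)]
for _ in range(200):  # 200 is an arbitrary number of steps
    u, c = cosine_step(u, c, target=1)
    history.append(cosine(u, c))

print(history[0], history[1], history[-1])
```

The first two numbers should reproduce the before/after values from the single-step cell, and the last one should have crept further toward 1. Passing `target=0` instead pushes the two vectors apart.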


Now the question is...

*   How do I extend this behavior to matrices? How frequently do I update the prediction matrix?
*   Does this work well enough? What do other algorithms do?
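One possible answer to the first question, as a rough sketch rather than a recommendation: loop over every known (user, company) pair and apply the same single-pair update to the matching rows of the two latent matrices. The names `interactions`, `predict`, and `known_loss`, the seed, the learning rate, and the 200 passes are all assumptions for illustration.

```python
import numpy as np

UNKNOWN = 2
rng = np.random.default_rng(0)

# Interaction table from earlier: 1 = liked, 0 = swiped away, 2 = unknown.
interactions = np.array([
    [2, 0, 0, 2, 0],
    [1, 0, 0, 0, 0],
    [0, 1, 2, 1, 1],
    [2, 2, 0, 1, 2],
    [2, 2, 1, 2, 2],
])
n_users, n_companies = interactions.shape
users = rng.random((n_users, 10))          # one latent row per user
companies = rng.random((n_companies, 10))  # one latent row per company

def predict(users, companies):
    """Cosine similarity between every user row and every company row."""
    u_hat = users / np.linalg.norm(users, axis=1, keepdims=True)
    c_hat = companies / np.linalg.norm(companies, axis=1, keepdims=True)
    return u_hat @ c_hat.T

def known_loss(users, companies):
    """Mean squared error over the known (non-placeholder) entries only."""
    mask = interactions != UNKNOWN
    return float(np.mean((predict(users, companies)[mask] - interactions[mask]) ** 2))

loss_before = known_loss(users, companies)
known_pairs = np.argwhere(interactions != UNKNOWN)
learning_rate = 0.1

for epoch in range(200):
    rng.shuffle(known_pairs)  # visit the known pairs in a random order each pass
    for i, j in known_pairs:
        u, c = users[i], companies[j]
        u_norm, c_norm = np.linalg.norm(u), np.linalg.norm(c)
        u_hat, c_hat = u / u_norm, c / c_norm
        cos = u_hat @ c_hat
        err = 2 * (cos - interactions[i, j])
        grad_u = err * (c_hat - cos * u_hat) / u_norm
        grad_c = err * (u_hat - cos * c_hat) / c_norm
        users[i] = u - learning_rate * grad_u
        companies[j] = c - learning_rate * grad_c

loss_after = known_loss(users, companies)
print(loss_before, loss_after)
print(np.round(predict(users, companies), 2))  # entries at the 2's are the predictions
```

In this sketch both the user rows and the company rows get updated, and the prediction matrix is only recomputed on demand by calling `predict`. On the second question: standard matrix factorization recommenders usually fit the raw dot product (often with regularization) rather than the cosine, so that is a natural baseline to compare against.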

