Suppose we have a matrix holding 4 latent factors for each of 5 users, so its shape is 5x4 (one row per user).

In [None]:
n_users = 5
n_factors = 4


In [2]:
from torch import Tensor

user_factors = Tensor(
    [
        [1, 0, 2, 1],
        [2, 1, 3, 1],
        [4, 2, 1, 3],
        [2, 3, 1, 0],
        [0, 4, 3, 1],
    ]
)
assert user_factors.shape == (n_users, n_factors)


## For one user

Suppose we want to select the factors for user 1 (the second row, since indices start at 0) using matrix multiplication, so that the operation can benefit from hardware acceleration as it scales.

First, turn the user index 1 into a one-hot-encoded vector:

In [None]:
one_hot_user = Tensor([0, 1, 0, 0, 0])
assert one_hot_user.shape == (n_users,)
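The one-hot vector does not have to be typed by hand. A minimal sketch, assuming `torch.nn.functional.one_hot` (not used elsewhere in this notebook) and a hypothetical `user_index` variable holding the integer index:

```python
import torch
import torch.nn.functional as F

n_users = 5
user_index = torch.tensor(1)  # hypothetical name: user 1, i.e. the second row

# F.one_hot returns an integer (int64) tensor, so cast it to float
# before multiplying it with a float matrix of factors.
one_hot_user = F.one_hot(user_index, num_classes=n_users).float()
print(one_hot_user)
```

The cast matters because the matrix product needs both operands to share a dtype.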


Then, multiply the user factors by the one-hot-encoded user vector. This selects the latent factors for that user.

Strictly speaking, a dot product takes two vectors of the same length and returns a scalar. What we need here is a matrix-vector product, which takes the dot product of each row of the matrix with the vector. For that to work, the number of columns in the matrix must match the number of elements in the vector. The vector is implicitly treated as a column vector.

`one_hot_user` has 5 elements (there are 5 users), but `user_factors` has 5 ***rows***, not 5 columns, because each user corresponds to a row. So we need to transpose `user_factors` and then perform the product to obtain the latent factors for the user.

In [None]:
user_factors.t() @ one_hot_user


tensor([2., 1., 3., 1.])

Note how the above corresponds to the second row.
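As a sanity check, here is a small sketch comparing three ways of getting that row. It assumes the same `user_factors` values as above, rebuilt with `torch.tensor` so the block runs on its own; the `via_*` names are only illustrative.

```python
import torch

user_factors = torch.tensor(
    [
        [1, 0, 2, 1],
        [2, 1, 3, 1],
        [4, 2, 1, 3],
        [2, 3, 1, 0],
        [0, 4, 3, 1],
    ],
    dtype=torch.float32,
)
one_hot_user = torch.tensor([0.0, 1.0, 0.0, 0.0, 0.0])

# Matrix-vector product, as above: (4, 5) @ (5,) -> (4,)
via_transpose = user_factors.t() @ one_hot_user

# Vector-matrix product: (5,) @ (5, 4) -> (4,), no transpose needed,
# because matmul treats a 1-D left operand as a row vector.
via_left_product = one_hot_user @ user_factors

# Plain indexing of the second row.
via_indexing = user_factors[1]

assert torch.equal(via_transpose, via_left_product)
assert torch.equal(via_transpose, via_indexing)
```

Putting the one-hot vector on the left is a design choice rather than a requirement: it just lines up the `n_users` dimension without transposing.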

## For several users at once

Now let's build a matrix of one-hot-encoded user indices, with one row per selected user. Let's try two: the first and second users.

In [8]:
one_hot_users = Tensor(
    [
        [1, 0, 0, 0, 0],
        [0, 1, 0, 0, 0],
    ]
)
assert one_hot_users.shape == (2, n_users)


Remember our user factors:

In [12]:
assert user_factors.shape == (n_users, n_factors)
user_factors


tensor([[1., 0., 2., 1.],
        [2., 1., 3., 1.],
        [4., 2., 1., 3.],
        [2., 3., 1., 0.],
        [0., 4., 3., 1.]])

For matrix multiplication, the number of columns in the first matrix must equal the number of rows in the second matrix, so `n_users` has to be that shared dimension. With `user_factors` as the first operand, `n_users` must be its number of columns (so we must transpose it), and it must be the number of rows of `one_hot_users` (so we must transpose that too!).

So transpose both, multiply, and behold the user factors for the first two users:

In [None]:
selected_user_factors = user_factors.t() @ one_hot_users.t()
selected_user_factors


tensor([[1., 2.],
        [0., 1.],
        [2., 3.],
        [1., 1.]])

Right. I need to transpose the result too, so that each row holds the factors of one user:

In [None]:
selected_user_factors.t()


tensor([[1., 0., 2., 1.],
        [2., 1., 3., 1.]])
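The three transposes can be avoided by swapping the operands, since the transpose of `A.t() @ B.t()` equals `B @ A`. A minimal sketch, assuming the same values as above (rebuilt here so the block is self-contained):

```python
import torch

user_factors = torch.tensor(
    [
        [1, 0, 2, 1],
        [2, 1, 3, 1],
        [4, 2, 1, 3],
        [2, 3, 1, 0],
        [0, 4, 3, 1],
    ],
    dtype=torch.float32,
)
one_hot_users = torch.tensor(
    [
        [1, 0, 0, 0, 0],
        [0, 1, 0, 0, 0],
    ],
    dtype=torch.float32,
)

# The approach above: transpose both operands, multiply, transpose back.
with_transposes = (user_factors.t() @ one_hot_users.t()).t()

# Swapped operands: (2, n_users) @ (n_users, n_factors) -> (2, n_factors)
without_transposes = one_hot_users @ user_factors

assert torch.equal(with_transposes, without_transposes)
print(without_transposes)
```

Each row of `one_hot_users` picks out one row of `user_factors`, so the result already has one user per row.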

## But...

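One-hot-encoded vectors are not as efficient as using an "embedding", a layer that indexes into a matrix using an integer. The one-hot approach stores a mostly-zero vector per user and spends most of the multiplication multiplying by zero, whereas indexing goes straight to the row.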

TODO: Wait, you can index into any matrix using a vector, no?

Jargon: Embedding

Multiplying by a one-hot-encoded matrix, using the computational shortcut that it can be implemented by simply indexing directly. This is quite a fancy word for a very simple concept. The thing that you multiply the one-hot-encoded matrix by (or, using the computational shortcut, index into directly) is called the embedding matrix.
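To make the shortcut concrete, here is a hedged sketch comparing the one-hot product, plain integer indexing, and an `nn.Embedding` layer. It assumes the same `user_factors` values as above and uses `nn.Embedding.from_pretrained` only to load those values for comparison. In a real model the embedding weights would normally be learned rather than fixed.

```python
import torch
import torch.nn.functional as F
from torch import nn

user_factors = torch.tensor(
    [
        [1, 0, 2, 1],
        [2, 1, 3, 1],
        [4, 2, 1, 3],
        [2, 3, 1, 0],
        [0, 4, 3, 1],
    ],
    dtype=torch.float32,
)
n_users = user_factors.shape[0]
user_indices = torch.tensor([0, 1])  # first and second users

# 1. One-hot encode, then multiply.
one_hot_users = F.one_hot(user_indices, num_classes=n_users).float()
via_one_hot = one_hot_users @ user_factors

# 2. Index into the matrix directly with a tensor of integers.
via_indexing = user_factors[user_indices]

# 3. Look the rows up through an embedding layer holding the same weights.
embedding = nn.Embedding.from_pretrained(user_factors)
via_embedding = embedding(user_indices)

assert torch.equal(via_one_hot, via_indexing)
assert torch.equal(via_indexing, via_embedding)
```

So yes, any matrix can be indexed with a tensor of integers. The embedding layer wraps that same lookup in a module whose weight matrix can be trained.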