# Movie Recommender
### Olteanu Fabian Cristian, FMI, AI Master, Year 1

In this project I present my implementation of the [Hierarchical Poisson Factorization (HPF) model](http://jakehofman.com/inprint/poisson_recs.pdf), written in Julia with Gen and trained with MCMC on the following dataset: https://www.kaggle.com/datasets/gargmanas/movierecommenderdataset

I will start by importing the necessary packages:

In [2]:
using CSV
using DataFrames
using Distributions
using Gen

The next part covers preprocessing. First, I read the data from the CSV files into DataFrames.

In [3]:
ratings_df = DataFrame(CSV.File("ratings.csv"))
movies_df = DataFrame(CSV.File("movies.csv"))

Row,movieId (Int64),title (String),genres (String)
1,1,Toy Story (1995),Adventure|Animation|Children|Comedy|Fantasy
2,2,Jumanji (1995),Adventure|Children|Fantasy
3,3,Grumpier Old Men (1995),Comedy|Romance
4,4,Waiting to Exhale (1995),Comedy|Drama|Romance
5,5,Father of the Bride Part II (1995),Comedy
6,6,Heat (1995),Action|Crime|Thriller
7,7,Sabrina (1995),Comedy|Romance
8,8,Tom and Huck (1995),Adventure|Children
9,9,Sudden Death (1995),Action
10,10,GoldenEye (1995),Action|Adventure|Thriller


The goal is a matrix whose rows are users (userId), whose columns are movies (movieId) and whose entries are the ratings each user gave each movie. To build it, I join the two DataFrames, drop the columns that are not needed and pivot the result:

In [4]:
DataFrames.select!(ratings_df, Not(:timestamp)) #drop timestamp column as it's not needed

data = innerjoin(ratings_df, movies_df, on = :movieId) #join the two dataframes

DataFrames.select!(data, Not([:title, :genres])) #drop title,genre columns

data = unstack(data, :movieId, :rating) #create a pivot table (show each user grouped by each movie)

DataFrames.select!(data, Not(:userId)) #drop userId as each row is the userId

Row,1,3,6,47,50,70,101,110,151,157,163,216,223,231,235,260,296,316,333,349,356,362,367,423,441,457,480,500,527,543,552,553,590,592,593,596,608,648,661,673,733,736,780,804,919,923,940,943,954,1009,1023,1024,1025,1029,1030,1031,1032,1042,1049,1060,1073,1080,1089,1090,1092,1097,1127,1136,1196,1197,1198,1206,1208,1210,1213,1214,1219,1220,1222,1224,1226,1240,1256,1258,1265,1270,1275,1278,1282,1291,1298,1348,1377,1396,1408,1445,1473,1500,1517,1552,⋯
(column type: Float64? for every rating column shown)
1,4.0,4.0,4.0,5.0,5.0,3.0,5.0,4.0,5.0,5.0,5.0,5.0,3.0,5.0,4.0,5.0,3.0,3.0,5.0,4.0,4.0,5.0,4.0,3.0,4.0,5.0,4.0,3.0,5.0,4.0,4.0,5.0,4.0,4.0,4.0,5.0,5.0,3.0,5.0,3.0,4.0,3.0,3.0,4.0,5.0,5.0,5.0,4.0,5.0,3.0,5.0,5.0,5.0,5.0,3.0,5.0,5.0,4.0,5.0,4.0,5.0,5.0,5.0,4.0,5.0,5.0,4.0,5.0,5.0,5.0,5.0,5.0,4.0,5.0,5.0,4.0,2.0,5.0,5.0,5.0,5.0,5.0,5.0,3.0,4.0,5.0,5.0,5.0,5.0,5.0,5.0,4.0,3.0,3.0,3.0,3.0,4.0,4.0,5.0,4.0,⋯
2,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,4.0,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,⋯
3,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,0.5,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,3.5,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,⋯
4,missing,missing,missing,2.0,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,2.0,5.0,1.0,missing,missing,missing,missing,missing,missing,missing,1.0,5.0,missing,missing,missing,missing,missing,2.0,missing,missing,5.0,missing,5.0,3.0,missing,missing,missing,missing,missing,missing,5.0,missing,missing,missing,missing,missing,missing,missing,4.0,missing,missing,missing,missing,missing,missing,2.0,4.0,5.0,missing,missing,missing,missing,missing,5.0,5.0,5.0,3.0,missing,missing,missing,4.0,missing,4.0,missing,missing,missing,missing,missing,missing,missing,4.0,missing,missing,missing,5.0,4.0,missing,missing,missing,missing,missing,missing,missing,4.0,4.0,missing,⋯
5,4.0,missing,missing,missing,4.0,missing,missing,4.0,missing,missing,missing,missing,missing,missing,missing,missing,5.0,2.0,missing,3.0,missing,missing,4.0,missing,missing,4.0,missing,missing,5.0,missing,missing,missing,5.0,3.0,missing,5.0,3.0,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,⋯
6,missing,5.0,4.0,4.0,1.0,missing,missing,5.0,4.0,missing,3.0,4.0,missing,3.0,missing,missing,2.0,5.0,5.0,5.0,5.0,3.0,4.0,missing,missing,5.0,5.0,5.0,3.0,3.0,3.0,5.0,5.0,3.0,4.0,3.0,3.0,missing,missing,missing,missing,5.0,5.0,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,3.0,5.0,missing,3.0,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,⋯
7,4.5,missing,missing,missing,4.5,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,5.0,missing,missing,missing,missing,5.0,missing,missing,missing,missing,missing,5.0,missing,missing,missing,missing,missing,missing,3.0,5.0,missing,missing,4.0,missing,missing,missing,missing,4.5,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,4.0,missing,missing,missing,4.0,4.0,missing,missing,5.0,4.5,missing,missing,missing,5.0,missing,missing,missing,5.0,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,3.5,missing,⋯
8,missing,missing,missing,4.0,5.0,missing,missing,3.0,missing,missing,missing,missing,missing,4.0,3.0,missing,4.0,missing,missing,missing,3.0,missing,3.0,missing,missing,3.0,4.0,2.0,5.0,missing,missing,missing,5.0,3.0,4.0,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,⋯
9,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,4.0,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,5.0,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,5.0,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,5.0,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,⋯
10,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,1.0,missing,missing,missing,3.5,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,missing,⋯


Next, I replace the missing values with zeroes and order the columns by movieId. I also multiply every score by two so that the input matrix consists of integers: ratings come in half-star steps, so doubling maps the 0.5-5 scale onto 1-10. This matters later when the data is fed to the model, because the y matrix it generates is sampled from a Poisson distribution, whose support is the non-negative integers.

In [5]:
for col in eachcol(data)
    replace!(col, missing => 0)
end

ordered_columns = sort([parse(Int, x) for x in names(data)]) 

data = data[:, string.(ordered_columns)]
mapcols!(col -> 2 * col, data) # double every score so half-star ratings become whole numbers (0 - 5 -> 0 - 10); still stored as floats here
data = convert.(Int64, data) #convert column types from Vector{Union{Missing, Float64}} to Vector{Int64}

Row,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,34,36,38,39,40,41,42,43,44,45,46,47,48,49,50,52,53,54,55,57,58,60,61,62,63,64,65,66,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,85,86,87,88,89,92,93,94,95,96,97,99,100,101,102,103,104,105,106,107,108,110,111,112,⋯
(column type: Int64 for every rating column shown)
1,8,0,8,0,0,8,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,10,0,0,10,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,10,0,0,0,0,0,0,0,8,0,0,⋯
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,⋯
3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,⋯
4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6,0,0,0,0,0,0,0,0,0,0,4,0,0,0,0,0,0,0,0,0,6,0,4,0,0,0,6,0,0,0,0,6,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,0,0,0,0,0,⋯
5,8,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,0,0,0,0,0,0,0,0,0,0,0,8,8,0,6,0,0,0,0,0,0,0,0,0,0,8,0,0,0,0,0,10,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,0,0,⋯
6,0,8,10,6,10,8,8,6,0,6,8,0,6,0,8,8,8,0,4,0,4,10,0,8,6,8,6,0,0,0,6,8,8,10,0,0,0,8,0,8,0,6,8,8,0,0,2,0,0,8,0,0,0,8,8,8,0,0,6,6,0,0,0,0,0,0,0,0,8,0,0,6,0,0,0,0,0,10,6,4,8,8,8,0,8,0,0,0,6,0,2,0,8,6,0,0,0,10,0,8,⋯
7,9,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,9,0,0,0,0,0,6,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,⋯
8,0,8,0,0,0,0,0,0,0,4,8,0,0,0,0,0,0,0,0,0,8,0,0,0,0,0,0,0,0,0,0,6,10,0,0,6,0,0,0,0,0,0,0,8,0,0,10,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6,0,0,⋯
9,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,⋯
10,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,⋯
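
The rescaling step can be illustrated on a plain vector. This is a minimal sketch assuming half-star ratings; the variable names are illustrative and not part of the notebook:

```julia
# Half-star ratings become integers once doubled.
ratings = [0.5, 3.5, 5.0, 4.0]
scaled = Int.(2 .* ratings)    # exact, because every rating is a multiple of 0.5

# Dividing by two recovers the original scale.
recovered = scaled ./ 2
```

Because the doubled values are whole numbers, `Int.` (like `convert.(Int64, ...)` above) does not throw an `InexactError`.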


As a sanity check, I will extract a 5x5 submatrix (five users, five movies) from the DataFrame above:

In [6]:
test_data = Matrix{Int64}(data[[19, 21, 475, 476, 477], [1, 2, 3, 4, 10]])

K = 5
no_users = size(test_data, 1)
no_items = size(test_data, 2)
test_data

5×5 Matrix{Int64}:
 8  6  6  0   4
 7  7  0  0  10
 0  9  0  0   0
 8  8  0  0   6
 8  8  6  0   0

I will now show my implementation of the HPF model. By default, all of its hyperparameters are set to 0.3. K is an integer giving the number of latent components, which can be thought of as an abstraction of the movie genres in the dataset. There are four latent variables: $\xi$, $\eta$, $\theta$ and $\beta$. The first two are vectors whose entries are sampled from $\Gamma(a', a'/b')$ (one per user) and $\Gamma(c', c'/d')$ (one per item), respectively. The last two are matrices of dimensions $no\_users \times K$ and $no\_items \times K$, respectively. Given these, the observations in the y matrix ($no\_users \times no\_items$) can be sampled. Note that the model reads no_users and no_items from the global variables defined above.
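
Written out for users $u$, items $i$ and components $k$, the generative process reads as follows. This is a sketch following the HPF paper, in which the second argument of each Gamma distribution is a rate:

$$
\begin{aligned}
\xi_u &\sim \mathrm{Gamma}(a', a'/b'), & \theta_{uk} &\sim \mathrm{Gamma}(a, \xi_u),\\
\eta_i &\sim \mathrm{Gamma}(c', c'/d'), & \beta_{ik} &\sim \mathrm{Gamma}(c, \eta_i),\\
y_{ui} &\sim \mathrm{Poisson}(\theta_u^\top \beta_i).
\end{aligned}
$$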

In [7]:
@gen function hpf_model(
    K, 
    a_prime::Float64 = 0.3, b_prime::Float64 = 0.3, 
    c_prime::Float64 = 0.3, d_prime::Float64 = 0.3, 
    a::Float64 = 0.3, c::Float64 = 0.3
)
    
    xi = Float64[]
    for u = 1:no_users
        push!(xi, {(:xi, u)} ~ gamma(a_prime, a_prime/b_prime))
    end
    theta = Vector{Float64}[]
    for u = 1:no_users
        push!(theta, [{(:theta, u, k)} ~ gamma(a, xi[u]) for k = 1:K])
    end

    eta = Float64[]
    for i = 1:no_items
        push!(eta, {(:eta, i)} ~ gamma(c_prime, c_prime/d_prime))
    end
    beta = Vector{Float64}[]
    for i = 1:no_items
        push!(beta, [{(:beta, i, k)} ~ gamma(c, eta[i]) for k = 1:K])
    end

    y = Vector{Int64}[]
    for u = 1:no_users
        push!(y, [{(:y, u, i)} ~ poisson(transpose(theta[u]) * beta[i]) for i = 1:no_items])
    end
    
    y
end

DynamicDSLFunction{Any}(Dict{Symbol, Any}(), Dict{Symbol, Any}(), Type[Any, Float64, Float64, Float64, Float64, Float64, Float64], true, Union{Nothing, Some{Any}}[nothing, Some(0.3), Some(0.3), Some(0.3), Some(0.3), Some(0.3), Some(0.3)], var"##hpf_model#312", Bool[0, 0, 0, 0, 0, 0, 0], false)
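
One detail worth checking is the Gamma parameterization. The HPF paper writes Gamma(shape, rate), whereas Gen's `gamma` and the `Gamma` type in Distributions.jl both take (shape, scale). With the default hyperparameters $a'/b' = c'/d' = 1$, so the priors on $\xi$ and $\eta$ are the same under either reading, but `gamma(a, xi[u])` treats $\xi_u$ as a scale. The sketch below uses Distributions.jl with an illustrative value for $\xi_u$ (not taken from the notebook) to show how the two readings differ in their means:

```julia
using Distributions

a = 0.3       # shape, as in the notebook defaults
xi_u = 4.0    # illustrative value for one user's xi

as_scale = Gamma(a, xi_u)        # scale = xi_u, which is how gamma(a, xi[u]) is read
as_rate  = Gamma(a, 1 / xi_u)    # rate = xi_u, i.e. scale = 1 / xi_u

mean_scale = mean(as_scale)      # shape * scale
mean_rate  = mean(as_rate)       # shape / rate
```

If the paper's rate convention is what is intended, the corresponding calls in the model would be `gamma(a, 1 / xi[u])` and `gamma(c, 1 / eta[i])`. This change has not been tested against the results shown here.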

Here is the function I used to create the constraints:

In [8]:
function make_constraints(ratings::Matrix{Int64})
    constraints = Gen.choicemap()
    for u = 1:size(ratings, 1)
        for i = 1:size(ratings, 2)
            constraints[(:y, u, i)] = ratings[u,i]
        end
    end
    constraints
end

make_constraints (generic function with 1 method)

Here is the block resimulation update, built on Gen's `mh` (Metropolis-Hastings), that I will use for inference:

In [9]:
function block_resimulation_update(tr)
    latent_variable = Gen.select(:xi)
    (tr, _) = mh(tr, latent_variable)

    latent_variable = Gen.select(:theta)
    (tr, _) = mh(tr, latent_variable)

    latent_variable = Gen.select(:eta)
    (tr, _) = mh(tr, latent_variable)

    latent_variable = Gen.select(:beta)
    (tr, _) = mh(tr, latent_variable)
    
    tr

end

block_resimulation_update (generic function with 1 method)
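
One thing to verify here is whether these selections match anything in the trace. The model uses tuple addresses such as `(:xi, u)`, while `Gen.select(:xi)` selects the single address `:xi`. The sketch below uses a hypothetical `toy_model` (not part of the notebook) to compare that selection with one built from the full tuple addresses:

```julia
using Gen

# Hypothetical toy model that uses the same tuple-style addresses as hpf_model.
@gen function toy_model(n::Int)
    for u = 1:n
        {(:xi, u)} ~ gamma(0.3, 1.0)
    end
end

n = 3
by_prefix  = Gen.select(:xi)                          # selects the address :xi only
by_address = Gen.select([(:xi, u) for u = 1:n]...)    # selects each tuple address

(tr, _) = generate(toy_model, (n,))
(tr_prefix, _)  = mh(tr, by_prefix)
(tr_address, _) = mh(tr, by_address)

# Membership of one tuple address in each selection.
println((:xi, 1) in by_prefix, " ", (:xi, 1) in by_address)
# Value at (:xi, 1) before the update and after each kind of update.
println(tr[(:xi, 1)], " ", tr_prefix[(:xi, 1)], " ", tr_address[(:xi, 1)])
```

If the first membership test is false, `mh` has nothing to resimulate for that selection and the trace stays as it was. An alternative would be hierarchical addresses such as `{:xi => u}` in the model, which `Gen.select(:xi)` does cover.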

Putting everything together, the following function runs MCMC inference on the model:

In [10]:

function block_resimulation_inference(K::Int64, ratings::Matrix{Int64}, n_burnin::Int64, n_samples::Int64)
    observations = make_constraints(ratings)
    (tr, _) = generate(hpf_model, (K,), observations)
    
    for iter = 1:n_burnin
        tr = block_resimulation_update(tr)
    end

    trs = []
    for iter = 1:n_samples
        tr = block_resimulation_update(tr)
        push!(trs, tr)
    end

    trs

end

block_resimulation_inference (generic function with 1 method)

Running block_resimulation_inference on test_data with 50000 burn-in iterations followed by 90000 retained samples yields the traces, from which I extract one $\theta$ and one $\beta$ matrix per sample (each stored as a vector of vectors):

In [11]:
n_iter = 90000
n_burnin = 50000

trs = block_resimulation_inference(K, test_data, n_burnin, n_iter)

theta_samples = [[[trs[iter][(:theta, u, k)] for k=1:K] for u = 1:no_users] for iter=1:n_iter]
beta_samples = [[[trs[iter][(:beta, i, k)] for k=1:K] for i = 1:no_items] for iter=1:n_iter]

90000-element Vector{Vector{Vector{Float64}}}:
 [[0.03664859551315018, 0.16769764604846651, 0.00031838376940813176, 0.07282470969931838, 0.4131668872836315], [0.11676130530245563, 0.3155492846092558, 0.18119371996448158, 0.5028428518705034, 0.46023322756082397], [3.9830811121462384e-9, 1.7827897524096808e-8, 1.8970203719114552e-7, 1.1329494474168394e-9, 9.390379886488782e-11], [0.0020342931253977217, 0.00014026066749833323, 0.001179754503032824, 1.1627167867944792, 7.482373701146454e-7], [4.496980694650402e-6, 0.0018339821696443752, 0.003322496399470185, 0.09918764586160261, 9.227258882676746e-8]]
 [[0.03664859551315018, 0.16769764604846651, 0.00031838376940813176, 0.07282470969931838, 0.4131668872836315], [0.11676130530245563, 0.3155492846092558, 0.18119371996448158, 0.5028428518705034, 0.46023322756082397], [3.9830811121462384e-9, 1.7827897524096808e-8, 1.8970203719114552e-7, 1.1329494474168394e-9, 9.390379886488782e-11], [0.0020342931253977217, 0.00014026066749833323, 0.001179754503
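
The first two samples printed above agree in every visible entry. A quick way to quantify this is to count how often consecutive samples coincide: a high fraction would suggest that proposals are rarely accepted, or that the updates do not touch these variables at all. A minimal sketch (the helper `fraction_unchanged` and the toy chain are hypothetical, not part of the notebook):

```julia
# Fraction of consecutive sample pairs that are exactly equal.
function fraction_unchanged(samples)
    n = length(samples)
    n < 2 && return 0.0
    count(t -> samples[t] == samples[t - 1], 2:n) / (n - 1)
end

# Toy chain with the same nesting as theta_samples: only the last step changes anything.
toy_samples = [[[0.1, 0.2]], [[0.1, 0.2]], [[0.1, 0.2]], [[0.3, 0.2]]]
toy_fraction = fraction_unchanged(toy_samples)   # 2 of the 3 transitions are unchanged
```

Applied to the notebook, the call would be `fraction_unchanged(theta_samples)`.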

To get the recommendations, I average the sampled $\theta$ and $\beta$ matrices over all retained iterations. Each sample is stored as a Vector{Vector{Float64}} (an array of arrays), so it first has to be turned into a matrix. hcat does this by placing each inner vector in its own column, which gives a $K \times no\_users$ matrix for $\theta$ and a $K \times no\_items$ matrix for $\beta$. Here every dimension is 5, so a wrong orientation would not raise an error, but in general it matters. The function then returns transpose(theta_mean) * beta_mean, a $no\_users \times no\_items$ matrix of Poisson rates, i.e. the expected (doubled) rating of each user for each movie rather than a probability. These values can then be compared with the input matrix.

In [19]:
function get_recommendations(theta_samples, beta_samples)
    # hcat stacks each length-K vector as a column, so the accumulators are K × no_users and K × no_items
    theta_mean = zeros(K, no_users)
    beta_mean = zeros(K, no_items)
    
    for i = 1:n_iter
        theta_mean += hcat(theta_samples[i]...)
        beta_mean += hcat(beta_samples[i]...)
    end
    theta_mean = theta_mean / n_iter
    beta_mean = beta_mean / n_iter
    return transpose(theta_mean) * beta_mean
end

get_recommendations(theta_samples, beta_samples)

5×5 Matrix{Float64}:
 0.0721665   0.236351    5.0123e-9    0.202203     0.0171473
 0.161471    0.462337    1.21113e-8   0.624685     0.0535991
 0.055904    0.068881    2.97455e-9   0.00940762   0.000857033
 0.00264512  0.00313517  3.38258e-11  0.000313078  2.71218e-5
 0.163472    0.262164    3.00304e-9   0.169428     0.0145574

In [17]:
test_data

5×5 Matrix{Int64}:
 8  6  6  0   4
 7  7  0  0  10
 0  9  0  0   0
 8  8  0  0   6
 8  8  6  0   0
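
On a non-square toy example the orientation of the stacked matrices does matter: `hcat` turns each length-K vector into a column, so the result is K × no_users (or K × no_items). A minimal sketch with made-up numbers (2 users, 3 items, K = 2; none of these values come from the notebook):

```julia
theta = [[1.0, 2.0], [0.5, 0.0]]                # one length-K vector per user
beta  = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]    # one length-K vector per item

theta_mat = reduce(hcat, theta)    # K × no_users (2 × 2)
beta_mat  = reduce(hcat, beta)     # K × no_items (2 × 3)

# no_users × no_items matrix of Poisson rates; entry (u, i) is dot(theta[u], beta[i]).
rates = transpose(theta_mat) * beta_mat

size(rates)    # (2, 3)
rates[1, 3]    # 1.0 * 2.0 + 2.0 * 2.0 = 6.0
```

`reduce(hcat, v)` is used here instead of `hcat(v...)` only to avoid splatting; both give the same matrix.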

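In conclusion, the results are not satisfactory: the matrix produced by fitting the model did not correlate with the test input matrix, so my sanity check failed. I was unfortunately not able to confirm the reason. Two things worth checking are whether Gen.select(:xi) and the analogous selections actually match the tuple addresses such as (:xi, u) used in the model, and whether Gen's gamma, which takes a shape and a scale, agrees with the shape and rate convention of the HPF paper.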