Inference at test time? #5

smeyerhot · 2021-09-28T17:35:39Z

How should the FM be used to make predictions? For example, say I train this model on 1000 user-movie pairs. I want to make a prediction for an unseen user: a vector whose values are the predicted ratings for all possible movies. However, in the examples it looks like the same users are used for both training and testing, i.e., for user A the model trains on 80% of the known movie ratings and then tries to predict the remaining 20%. How should we call the model when we want to predict 80% of the ratings for an unseen user, i.e., one not in the training set?

In other words, I would like to take a vector of length n in which m ratings are known and infer the remaining n - m. Would I have to include the m known ratings in the training set?

tohtsky · 2021-09-29T01:54:51Z

So you are considering a pure matrix factorization model (i.e., the only features are user IDs and item IDs), right?
In that case, as you said, you have to include the known ratings of the user in the training set.
Is there any reason why you can't do that? (For example, does training take too much time?)

smeyerhot · 2021-09-29T02:29:20Z

Thanks for getting back to me!

Yes, a pure matrix factorization model. No particular reason right now, but I am worried that it may be expensive to train a new model every time I want to give a new recommendation. I guess that is necessary, though.

Just to recap, I have n items and m < n ratings. I should pass in a table like this:

| User ID | Item ID |
| --- | --- |
| 1 | 1 |
| 1 | 2 |
| 1 | ... |
| 1 | n - m |

where I have n - m rows, one for each unrated item (obviously they won't all be in order).
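To make the table above concrete, here is a minimal sketch of how the query rows for one user's unrated items could be built and one-hot encoded into the kind of design matrix a factorization machine takes as input. The helper names `unrated_pairs` and `one_hot_design` are hypothetical, item and user IDs are assumed to be the integers 1..n, and the dense NumPy matrix is only for illustration (a real data set would normally use a sparse matrix). This is not taken from the library's own examples.

```python
import numpy as np

def unrated_pairs(user_id, n_items, rated_items):
    """Return (user_id, item_id) rows for every item the user has not rated."""
    all_items = np.arange(1, n_items + 1)  # assumption: item IDs are 1..n
    rated = np.asarray(list(rated_items), dtype=int)
    unrated = np.setdiff1d(all_items, rated)  # sorted IDs of the n - m unrated items
    return np.column_stack([np.full(unrated.shape, user_id), unrated])

def one_hot_design(pairs, n_users, n_items):
    """Dense one-hot encoding [user block | item block], one row per pair."""
    X = np.zeros((len(pairs), n_users + n_items))
    rows = np.arange(len(pairs))
    X[rows, pairs[:, 0] - 1] = 1.0            # user indicator column
    X[rows, n_users + pairs[:, 1] - 1] = 1.0  # item indicator column
    return X

# Hypothetical toy setting: 3 users, n = 6 items, user 1 has rated m = 2 of them.
pairs = unrated_pairs(user_id=1, n_items=6, rated_items={2, 5})
X_query = one_hot_design(pairs, n_users=3, n_items=6)
```

The same encoding would be applied to the user's m known ratings when they are added to the training set, so that the user's column exists at training time and `X_query` can then be passed to the trained model's prediction call.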

tohtsky · 2021-09-29T04:09:18Z

Thank you for clarifying the setting!
I still believe you have to include known ratings of the user in the training set.
I think this is a kind of cold-start problem.

For reference, in a recent article (though it is for the implicit-feedback setting), https://arxiv.org/abs/1911.07698 ,
there is a similar problem regarding the evaluation of the "Mult-VAE" model (page 31 of the latest version).
The workaround there (and in the references therein) is to pick other users (known at training time) who have similar rating logs and average their latent factors.
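A rough sketch of that workaround, under several assumptions: similarity is measured by cosine similarity on the items the new user has rated, missing training ratings are treated as 0, and the `k` nearest training users are averaged with equal weight. The function name `approximate_user_factor`, the choice of similarity, and the toy arrays are all hypothetical. They are not the exact procedure from the article, and the factor matrices are hand-written placeholders rather than the output of a trained model.

```python
import numpy as np

def approximate_user_factor(new_ratings, train_ratings, user_factors, k=2):
    """Average the latent factors of the k training users most similar to a new user.

    new_ratings:   (n_items,) array, np.nan where the new user has no rating
    train_ratings: (n_users, n_items) array, np.nan where unrated
    user_factors:  (n_users, rank) latent factors of the training users
    """
    known = ~np.isnan(new_ratings)
    a = new_ratings[known]
    # Compare only on items the new user rated; missing training ratings become 0.
    B = np.nan_to_num(train_ratings[:, known], nan=0.0)
    sims = B @ a / (np.linalg.norm(B, axis=1) * np.linalg.norm(a) + 1e-12)
    neighbours = np.argsort(-sims)[:k]  # indices of the k most similar users
    return user_factors[neighbours].mean(axis=0), neighbours

# Placeholder data: 3 training users, 4 items, rank-2 factors.
train_ratings = np.array([
    [5.0, 1.0, np.nan, 4.0],
    [1.0, 5.0, 2.0, np.nan],
    [5.0, 2.0, 4.0, 5.0],
])
user_factors = np.array([[1.0, 0.0], [0.0, 1.0], [3.0, 2.0]])
item_factors = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]])

new_ratings = np.array([5.0, 1.0, np.nan, np.nan])  # m = 2 known ratings
u_new, neighbours = approximate_user_factor(new_ratings, train_ratings, user_factors)
scores = item_factors @ u_new  # interaction term only, one score per item
```

This covers only the user-item interaction term. A factorization machine also has a global bias and per-feature linear weights, so the new user's linear weight would presumably need to be approximated in the same way (for example, by averaging the neighbours' weights) before the scores are comparable to the model's usual predictions.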

smeyerhot changed the title ~~Test time?~~ Inference at test time? Sep 28, 2021

tohtsky closed this as completed Jan 23, 2022
