
Recommender system tutorial evidence #449

Closed
makobot-sh opened this issue Oct 13, 2023 · 7 comments · Fixed by #464

Comments

@makobot-sh

Hi everyone,
I've been trying to get the evidence for the model from the Recommender System Tutorial in the docs but I haven't been able to figure out how. I tried using mixture modelling as described in Computing model evidence for model selection but the resulting evidence has been very small (near 0). I'm not sure I'm doing things right so if anyone could give me feedback on the code I'd appreciate it enormously. Here is what I did (look at the RecommenderTutorialFromRepository.Evidence function).

I'm getting

evidence 0
log(evidence) -2.57E+010

Despite using the following arguments for the data generation:

static int numUsers = 50;
static int numItems = 10;
static int numTraits = 2;
static int numObs = 100;
static int numLevels = 2;

And having been able to reproduce the results from the tutorial with said arguments:

true parameters      learned parameters
 1.00   0.00           1.00   0.00
 0.00   1.00           0.00   1.00
-0.42   0.73          -0.23  -0.07
-0.06  -0.03          -0.42  -0.04
 0.80  -0.92           0.04   0.86

As an aside, I haven't been able to reproduce the results at the end of the tutorial (20000 observations instead of 100). The arguments for that one are the following (you can copy-paste them into my code to test, at RecommenderTutorialFromRepository.cs:17):

static int numUsers = 200;
static int numItems = 200;
static int numTraits = 2;
static int numObs = 20000;
static int numLevels = 2;

The estimated item traits don't match the ground truth or the tutorial's results at all. Instead, I'm getting the following estimates:

true parameters      learned parameters
 1.00   0.00           1.00   0.00
 0.00   1.00           0.00   1.00
 0.44  -1.07           3.01   2.80
-0.38  -0.83           3.01   2.80
 0.11   0.68           3.03   2.82

Any ideas why this could be?

@tminka
Contributor

tminka commented Nov 26, 2023

  1. To compute evidence correctly, you must put the If block around the whole model, especially the parameter declarations. The linked code doesn't do that.
  2. You are right, the results aren't good anymore. To debug this problem, I tried a few things and finally I set engine.Compiler.OptimiseInferenceCode = false; and it worked. So it seems there is a bug in how Infer.NET is optimizing the inference code. Thanks for bringing this to my attention.
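To make point 1 concrete, here is a minimal sketch of the evidence pattern from the Infer.NET documentation, with the workaround from point 2 included. The `weight`/`noisy` variables are placeholders standing in for the tutorial's actual parameter declarations and observations:

```csharp
using Microsoft.ML.Probabilistic.Models;
using Microsoft.ML.Probabilistic.Distributions;

// Evidence gate: the If block must wrap the ENTIRE model,
// including every parameter prior, not just the likelihood.
Variable<bool> evidence = Variable.Bernoulli(0.5).Named("evidence");
using (Variable.If(evidence))
{
    // Placeholder parameter; in the tutorial this is where userTraits,
    // itemTraits, biases, etc. are declared and the observations attached.
    Variable<double> weight = Variable.GaussianFromMeanAndVariance(0, 1).Named("weight");
    Variable<double> noisy = Variable.GaussianFromMeanAndVariance(weight, 0.1).Named("noisy");
    noisy.ObservedValue = 0.3;
}

InferenceEngine engine = new InferenceEngine();
engine.Compiler.OptimiseInferenceCode = false; // workaround discussed above
double logEvidence = engine.Infer<Bernoulli>(evidence).LogOdds;
```

Because `Bernoulli(0.5)` gives equal prior odds to the two branches, `LogOdds` of the posterior on `evidence` equals the model's log evidence.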

@makobot-sh
Author

makobot-sh commented Nov 29, 2023

Thank you so much for the response! Turning off code optimization improved the estimates in both cases, and putting the If block around the whole model has definitely improved the geometric mean of the evidence for the 100-observations case (now 0.65, computed as $\exp(\log(\text{evidence})/\text{numObs})$).
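For anyone following along, the per-observation geometric mean is just this; the `logEvidence` value below is hypothetical, chosen so the result lands near the 0.65 quoted above (the real value comes from `engine.Infer<Bernoulli>(evidence).LogOdds`):

```csharp
using System;

class EvidenceSummary
{
    static void Main()
    {
        // Hypothetical total log evidence, for illustration only.
        double logEvidence = -43.0;
        int numObs = 100;

        // Geometric mean of the per-observation evidence:
        // exp(log(evidence) / numObs). Working in log space avoids the
        // underflow that makes the raw evidence print as 0.
        double geomMean = Math.Exp(logEvidence / numObs);
        Console.WriteLine(geomMean); // ≈ 0.65
    }
}
```

This also shows why the raw `evidence` prints as 0 for 20k observations: a total log evidence in the tens of thousands (negative) underflows `double` when exponentiated, so only log-space quantities are meaningful at that scale.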

For the 20k observations case, however, the geometric mean evidence (and the evidence itself) is still 0. Do you have any idea why this could be? (Here's the amended code, also a fork with the code to reproduce on branch tutorial_prints in case that's more convenient, and a diff against the original infer repo code)

On a different note, would the bug affect the full Matchbox Recommender implementation as well? How can we change said configuration for it if so?

@tminka
Contributor

tminka commented Dec 12, 2023

It seems that toggling OptimiseInferenceCode doesn't fix everything; it just makes some of the estimates better. I will look into a better solution. The full Matchbox Recommender implementation is not affected.

@makobot-sh
Author

I see; if you do, please let me know! It would be nice to get the evidence for a trained Matchbox Recommender, but as far as I can tell it's not possible. Thanks for everything!

@tminka
Contributor

tminka commented Jan 11, 2024

This is fixed by #457

@makobot-sh
Author

Thank you Tom! The tutorial works great now for both values of largeData. I was playing around with it a bit more and found that with bigData=true and 5+ traits instead of 2, the geometric mean of the evidence drops back down to 0 (and the "Evidence is too low" exception triggers). Is this expected behaviour? With bigData=false, 5 traits seem to work well (the geometric mean of the evidence is below randomness, but this could be attributed to the low number of observations).

@tminka
Contributor

tminka commented Mar 7, 2024

Damping is required for large numbers of traits. The attached PR shows how to do this.
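For readers who land here later: the general damping idiom in Infer.NET uses the `Damp.Backward` factor. The sketch below shows only that general idiom, with a placeholder variable and an illustrative step size; see the PR for the actual change:

```csharp
using Microsoft.ML.Probabilistic.Models;
using Microsoft.ML.Probabilistic.Factors;

// Placeholder trait variable; in the tutorial this would be an element
// of the itemTraits/userTraits arrays.
Variable<double> trait = Variable.GaussianFromMeanAndVariance(0, 1).Named("trait");

// Damp.Backward damps the messages flowing backward to 'trait'.
// A step size of 1.0 means no damping; smaller values damp more.
Variable<double> traitDamped =
    Variable<double>.Factor(Damp.Backward<double>, trait, 0.5).Named("traitDamped");

// Use traitDamped in place of trait in downstream factors.
```

Damping slows message updates, which trades convergence speed for stability; with many traits the unshifted updates can oscillate or diverge, which is consistent with the evidence collapsing to 0.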
