model_name(als) trains out unexist movieIds #59

Closed
capricorn9527 opened this issue Oct 26, 2017 · 1 comment

Comments

@capricorn9527

I copied the contents of the example file "movielens.py" into my own program file "datatrain.py" and trained the als model on the ml-20m dataset.
Some nonexistent movieIds show up in the results. How did that happen?

movieId | name | similar_movieId | score
26462 | Bad Boys (1983) | 113812 | 0.926960830278

Traceback (most recent call last):
  File "datatrain.py", line 108, in <module>
    min_rating=args.min_rating)
  File "datatrain.py", line 86, in calculate_similar_movies
    o.write("%s\t%s\t%s\n" % (movie, movie_lookup[other], score))
KeyError: 113812
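
For anyone hitting the same KeyError before upgrading, one possible workaround is to skip ids that have no entry in the lookup table instead of indexing it directly. This is only a sketch: `write_similar` is a hypothetical helper, not part of the example script, and it assumes `movie_lookup` is a plain dict from movieId to title and that the similar items arrive as `(movieId, score)` pairs.

```python
import io


def write_similar(movie, similar, movie_lookup, out):
    """Write one tab-separated line per similar item, skipping unknown ids.

    Hypothetical helper: returns the ids that had no title so the caller
    can log or inspect them.
    """
    skipped = []
    for other, score in similar:
        title = movie_lookup.get(other)  # None when the id has no label
        if title is None:
            skipped.append(other)
            continue
        out.write("%s\t%s\t%s\n" % (movie, title, score))
    return skipped


# Usage with the ids from the report above (the lookup table is made up).
movie_lookup = {26462: "Bad Boys (1983)"}
out = io.StringIO()
skipped = write_similar(
    "Bad Boys (1983)", [(26462, 1.0), (113812, 0.92)], movie_lookup, out
)
```

This only hides the symptom; the unlabeled ids are still being returned by the model, so the `skipped` list is worth checking.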

benfred added a commit that referenced this issue Nov 1, 2017
There used to be problems with the movielens example, where
occasionally items with no users would be returned. Since these
items sometimes had no matching label, this would trip up the
code looking up the string to show (#59)

Fix this by detecting this case, and setting the factors to 0.
When returning similar items also make sure that we don't introduce
NaN values here.
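
The idea in this commit message can be illustrated roughly as follows. This is a plain-Python sketch and not the code from the commit: the function names, the list-of-lists factor layout, and the choice to skip zero vectors when ranking are all assumptions made for illustration. The point is that an item with no users gets an all-zero factor vector, and the similarity step never divides by a zero norm (which would produce NaN with NumPy arrays).

```python
import math


def zero_unused_item_factors(item_factors, item_user_counts):
    """Replace the factors of items that have no users with zeros."""
    return [
        [0.0] * len(f) if count == 0 else list(f)
        for f, count in zip(item_factors, item_user_counts)
    ]


def similar_items(item_factors, itemid, n=10):
    """Rank items by cosine similarity, skipping zero-norm vectors."""
    norms = [math.sqrt(sum(x * x for x in f)) for f in item_factors]
    query = item_factors[itemid]
    scores = []
    for other, f in enumerate(item_factors):
        denom = norms[itemid] * norms[other]
        if denom == 0.0:
            continue  # assumed policy: never divide by a zero norm
        dot = sum(a * b for a, b in zip(query, f))
        scores.append((other, dot / denom))
    scores.sort(key=lambda pair: -pair[1])
    return scores[:n]


# Item 2 has no users, so its factors are zeroed and it is never returned.
factors = [[1.0, 0.0], [0.5, 0.5], [0.3, 0.9]]
cleaned = zero_unused_item_factors(factors, item_user_counts=[3, 2, 0])
result = similar_items(cleaned, itemid=0)
```

With the sample data, only items 0 and 1 can appear in `result`, since item 2 has a zero norm after cleaning.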
@benfred
Owner

benfred commented Nov 1, 2017

Thanks for pointing this out! There were some bugs related to the handling of movies with no users (which the MovieLens dataset has after filtering down to only positive reviews). I've fixed this in that last PR, so it should work well now.

@benfred benfred closed this as completed Nov 1, 2017