AlgoBase.predict() returns the same value for all uid and iid values #140

skurzhanskyi · 2018-02-02T15:54:26Z

Description

I tried to test Matrix Factorization-based algorithms on my own dataset and noticed that AlgoBase.predict() returns the same value for all uid and iid values. I then tested on the default dataset and got the same result. Maybe there's a mistake in my code, but I can't find it at the moment.

Steps/Code to Reproduce

from surprise import SVD
from surprise import Dataset


data = Dataset.load_builtin('ml-100k')
algo = SVD()
trainset = data.build_full_trainset()
algo.fit(trainset)

users_ids = trainset.all_users()
items_ids = trainset.all_items()

ratings = []
for i in range(min(100, len(users_ids))):
    for j in range(min(100, len(items_ids))):
        ratings.append(algo.predict(users_ids[i], items_ids[j]).est)
print(len(set(ratings)))

Expected Results

10000 or somewhat fewer (not 1)

Actual Results

1

Versions

Darwin-16.7.0-x86_64-i386-64bit
Python 2.7.14 (v2.7.14:84471935ed, Sep 16 2017, 12:01:12)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]
surprise 1.0.5


NicolasHug · 2018-02-02T17:22:39Z

That's because predict() expects raw ids, but trainset.all_users() and trainset.all_items() return inner ids. Please see this note.
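A minimal sketch of the conversion described here, assuming the `trainset` and fitted `algo` objects from the reproduction script. `to_raw_uid` and `to_raw_iid` are the Trainset methods that map inner ids back to raw ids; the helper name `distinct_estimates` is made up for illustration.

```python
from itertools import islice


def distinct_estimates(algo, trainset, limit=100):
    """Count distinct estimates over the first `limit` users and items.

    Inner ids from the trainset are converted to raw ids before each
    predict() call, because predict() looks ids up as raw ids.
    """
    estimates = set()
    for inner_uid in islice(trainset.all_users(), limit):
        raw_uid = trainset.to_raw_uid(inner_uid)
        for inner_iid in islice(trainset.all_items(), limit):
            raw_iid = trainset.to_raw_iid(inner_iid)
            estimates.add(algo.predict(raw_uid, raw_iid).est)
    return len(estimates)
```

Calling `distinct_estimates(algo, trainset)` in place of the nested loop from the report should give a count well above 1, since every id passed to predict() is then one the model saw during training.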

skurzhanskyi · 2018-02-02T20:08:15Z

Thank you for your swift answer!
I really hadn't noticed that. Actually, I first tried to use data from my own file, but I got something like "this item not in unknown". Unfortunately, I am not at my work computer at the moment. Would it be OK if I describe my problem in detail on Monday?

NicolasHug · 2018-02-02T20:11:09Z

Yes, if you tried but didn't find a solution on your own, feel free to ask (of course, please do some research first :) )

skurzhanskyi · 2018-02-02T20:47:48Z

Of course. The thing is that I spent half a day trying to solve this problem. I wouldn't have written here if I had spent less.

skurzhanskyi · 2018-02-05T12:38:45Z

You were actually right. The problem was in my dataset. There were non-ASCII characters in the item ids, so the unicode and str values differed. to_raw_iid returns a str, but the field in the DataFrame is unicode. So, if you check their equality, you'll get True, but the result of predict depends on the type.

Thanks for your help anyway. The issue may be closed. But I think converting to string in predict would be great.
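One way to catch this kind of mismatch before calling predict() is to ask the trainset whether it recognises the raw ids. This sketch relies on `Trainset.to_inner_uid` and `Trainset.to_inner_iid` raising `ValueError` for ids that were not in the training data; `ids_are_known` is a hypothetical helper name.

```python
def ids_are_known(trainset, raw_uid, raw_iid):
    """Return (user_known, item_known) for a pair of raw ids."""

    def known(lookup, raw_id):
        # to_inner_uid / to_inner_iid raise ValueError for unknown raw ids.
        try:
            lookup(raw_id)
        except ValueError:
            return False
        return True

    return (
        known(trainset.to_inner_uid, raw_uid),
        known(trainset.to_inner_iid, raw_iid),
    )
```

If either flag is False for an id you believe is in the data, comparing `type(raw_id)` with the type of the ids used to build the dataset is a reasonable next step.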

elaine-peiru · 2018-05-05T14:32:32Z

@skurzhanskyi I have the same problem when using my own dataset. Can you specify how you solved it? I also get the same estimated rating every time. I understand the raw id and inner id distinction from the note, but I have no idea what I should modify in the original code: I retrieved the inner ids of the user and item, converted them, and then passed them to predict(), but I still got the same est=3.44 every time.
It would be great if you could share your experience. Thank you :)

skurzhanskyi · 2018-05-06T17:22:31Z

@elaine-peiru, this code looks strange. You first get the inner id and then convert back to the raw id. Maybe the problem is with str().

elaine-peiru · 2018-05-06T21:54:25Z

@skurzhanskyi thanks! The problem was solved by removing the str().

alpalalpal · 2019-01-09T05:34:56Z

Solved this issue by converting the ids from string to integer or float.
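To see which type the trainset actually stores, one option is to collect the types of its raw ids and cast query ids to match. A small sketch, assuming a surprise `Trainset`; `raw_id_types` is a name invented here.

```python
def raw_id_types(trainset):
    """Return the sets of Python type names used for raw user and item ids."""
    user_types = {
        type(trainset.to_raw_uid(inner_uid)).__name__
        for inner_uid in trainset.all_users()
    }
    item_types = {
        type(trainset.to_raw_iid(inner_iid)).__name__
        for inner_iid in trainset.all_items()
    }
    return user_types, item_types
```

A result such as `({'int'}, {'int'})` would mean the trainset holds integer raw ids, so string ids like `'42'` passed to predict() would be treated as unknown.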

olegyablokov · 2020-05-28T15:18:26Z

I had the same issue, but the problem turned out to be that the matrix I used to fit the model had NaNs. I fixed this and it worked.
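A quick pre-fit check for this case, sketched with the standard library only. It assumes the ratings are available as (user, item, rating) tuples; `find_nan_ratings` is a hypothetical helper, not part of surprise.

```python
import math


def find_nan_ratings(rows):
    """Return the (user, item, rating) rows whose rating is None or NaN."""
    bad = []
    for user, item, rating in rows:
        is_nan = isinstance(rating, float) and math.isnan(rating)
        if rating is None or is_nan:
            bad.append((user, item, rating))
    return bad
```

An empty list means no missing ratings were found in the rows that were checked; any returned rows can be dropped or repaired before building the dataset.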

skurzhanskyi closed this as completed Feb 7, 2018
