Calling the fit method on the load_from_df dataset produces AttributeError: DatasetAutoFolds instance has no attribute 'global_mean' #161
Comments
You need to use |
Hi, were you able to find a solution to your problem? It would be great if you could share how you overcame it.
Hi kirtigroover, I'm pretty sure you know how to do that with reference to this snippet:
uid = str(196)  # raw user id (as in the ratings file). They are **strings**!
iid = str(302)  # raw item id (as in the ratings file). They are **strings**!
# get a prediction for specific users and items.
pred = algo.predict(uid, iid, r_ui=4, verbose=True)
To retrieve the est result, read it from the returned pred. Hope this helps. :)
Thanks for your reply!! :-) I have also been trying to use the file reader, which allows user-defined train and test data. Hopefully it will work.
Hi, were you able to find a solution to your problem? It would be great if you could share how you overcame it.
Yes, I was able to find a solution. Go through the documentation and look for "load_from_folds()"; this is the function you will be using. It can only read from a file, not directly from a dataframe, so just write your dataframe to a file. You would also need to make sure your files have 4 columns (userid, itemid, rating, timestamp). If you don't have data for any of these columns, just keep them blank. As you mentioned, your test set doesn't have ratings, so you would need to add a ratings column as well as a timestamp column.

You can also use xinyuewang1's approach: train on your train data using build_full_trainset(), then traverse your test dataset with a for loop and get predictions one by one, instead of getting predictions for the entire test set in one go. I hope this makes sense to you and helps!
Description
Calling the fit method on the load_from_df dataset without split produces AttributeError: DatasetAutoFolds instance has no attribute 'global_mean'
Steps/Code to Reproduce
import pandas as pd
from surprise import Dataset
from surprise import Reader
from surprise import SVD
from surprise.model_selection import cross_validate, KFold

# Creation of the dataframe. Column names are irrelevant.
ratings_dict = {'itemID': [1, 1, 1, 2, 2],
                'userID': [9, 32, 2, 45, 'user_foo'],
                'rating': [3, 2, 4, 3, 1]}
df1 = pd.DataFrame(ratings_dict)
reader = Reader()
data = Dataset.load_from_df(df1[['userID', 'itemID', 'rating']], reader)
algo = SVD()
algo.fit(data)
Expected Results
I expected the model to fit on this training data. I then planned to run the predict method on the test data to get the actual predictions.
Actual Results
AttributeError Traceback (most recent call last)
in ()
17
18 algo = SVD()
---> 19 algo.fit(data)
20
21 #kf = KFold(n_splits=3)
C:\Users\Sachin\Anaconda2\lib\site-packages\surprise\prediction_algorithms\matrix_factorization.pyx in surprise.prediction_algorithms.matrix_factorization.SVD.fit()
153
154 AlgoBase.fit(self, trainset)
--> 155 self.sgd(trainset)
156
157 return self
C:\Users\Sachin\Anaconda2\lib\site-packages\surprise\prediction_algorithms\matrix_factorization.pyx in surprise.prediction_algorithms.matrix_factorization.SVD.sgd()
202 cdef int u, i, f
203 cdef double r, err, dot, puf, qif
--> 204 cdef double global_mean = self.trainset.global_mean
205
206 cdef double lr_bu = self.lr_bu
AttributeError: DatasetAutoFolds instance has no attribute 'global_mean'
Versions
Windows-10-10.0.16299
('Python', '2.7.14 |Anaconda custom (64-bit)| (default, Oct 15 2017, 03:34:40) [MSC v.1500 64 bit (AMD64)]')
('surprise', '1.0.5')
Additional info.
However, if I run fit and predict using a KFold split, it works properly. I have separate train and test data, and I want to fit the model on the train data without any K-folding and then run predict on the test data.