fail to run when I only have rating matrix #4

Closed
wyfoffice opened this issue Jul 17, 2018 · 5 comments

Comments

@wyfoffice

When I use only the rating matrix, the code raises a MemoryError. However, once I also pass the user side-information matrix, it runs without problems.

All my matrices are pandas DataFrames.

My code (rating matrix only) is here:

from cmfrec import CMF

recommender = CMF(k=40, k_main=0, k_user=0, reg_param=0.1, w_main=1,
                  w_user=0, add_user_bias=False, add_item_bias=False, reindex=False)

# Fit on the ratings only (this is the call that raises MemoryError)
recommender.fit(ratings=product_train.copy())

rec_item = list(recommender.topN(user=altered_pair[i][0], n=300))

And my code with user information is:

recommender = CMF(k=40, k_main=0, k_user=0, reg_param=0.1, w_main=1,
                  w_user=0, add_user_bias=False, add_item_bias=False, reindex=False)

# Fit on the ratings plus binary user side information (this call succeeds)
recommender.fit(ratings=product_train.copy(),
                user_info=user_info.copy(),
                cols_bin_user=[cl for cl in user_info.columns if cl != 'UserId'],
                cols_bin_item=None)

rec_item = list(recommender.topN(user=altered_pair[i][0], n=300))

The error message is as follows:


MemoryError Traceback (most recent call last)
in ()
9 # recommender.fit(ratings=product_train.copy(),
10 # cols_bin_user=[cl for cl in user_info.columns if cl!='UserId'], cols_bin_item=None)
---> 11 recommender.fit(ratings=product_train.copy())
12
13 # recommender.fit(ratings=product_train.copy())

~/.local/lib/python3.6/site-packages/cmfrec/__init__.py in fit(self, ratings, user_info, item_info, cols_bin_user, cols_bin_item)
407 self._fit(self.w1, self.w2, self.w3, self.reg_param,
408 self.k, self.k_main, self.k_item, self.k_user,
--> 409 self.random_seed, self.maxiter)
410
411 self.is_fitted = True

~/.local/lib/python3.6/site-packages/cmfrec/__init__.py in _fit(self, w1, w2, w3, reg_param, k, k_main, k_item, k_user, random_seed, maxiter)
678 I_nonmissing:self._item_arr_notmissing, U_nonmissing:self._user_arr_notmissing,
679 I_nonmissing_bin:self._item_arr_notmissing_bin,
--> 680 U_nonmissing_bin:self._user_arr_notmissing_bin})
681 self.A = A.eval(session=sess)
682 self.B = B.eval(session=sess)

~/.local/lib/python3.6/site-packages/tensorflow/contrib/opt/python/training/external_optimizer.py in minimize(self, session, feed_dict, fetches, step_callback, loss_callback, **run_kwargs)
205 packed_bounds=self._packed_bounds,
206 step_callback=step_callback,
--> 207 optimizer_kwargs=self.optimizer_kwargs)
208 var_vals = [
209 packed_var_val[packing_slice] for packing_slice in self._packing_slices

~/.local/lib/python3.6/site-packages/tensorflow/contrib/opt/python/training/external_optimizer.py in _minimize(self, initial_val, loss_grad_func, equality_funcs, equality_grad_funcs, inequality_funcs, inequality_grad_funcs, packed_bounds, step_callback, optimizer_kwargs)
400
401 import scipy.optimize # pylint: disable=g-import-not-at-top
--> 402 result = scipy.optimize.minimize(*minimize_args, **minimize_kwargs)
403
404 message_lines = [

~/.local/lib/python3.6/site-packages/scipy/optimize/_minimize.py in minimize(fun, x0, args, method, jac, hess, hessp, bounds, constraints, tol, callback, options)
601 elif meth == 'l-bfgs-b':
602 return _minimize_lbfgsb(fun, x0, args, jac, bounds,
--> 603 callback=callback, **options)
604 elif meth == 'tnc':
605 return _minimize_tnc(fun, x0, args, jac, bounds, callback=callback,

~/.local/lib/python3.6/site-packages/scipy/optimize/lbfgsb.py in _minimize_lbfgsb(fun, x0, args, jac, bounds, disp, maxcor, ftol, gtol, eps, maxfun, maxiter, iprint, callback, maxls, **unknown_options)
309 x = array(x0, float64)
310 f = array(0.0, float64)
--> 311 g = zeros((n,), float64)
312 wa = zeros(2*m*n + 5*n + 11*m*m + 8*m, float64)
313 iwa = zeros(3*n, int32)

MemoryError:
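For context on the last frame: SciPy's L-BFGS-B wrapper allocates the gradient `g`, the work array `wa`, and the integer array `iwa` up front, with sizes that depend on `n` (the number of optimized variables) and `m` (the number of stored corrections, `maxcor`, which defaults to 10). The following is a rough sketch of how much memory those allocations request. The array sizes come from the traceback lines above; the helper name `lbfgsb_workspace_bytes`, the example dimensions, and the guess that `n` is roughly `(n_users + n_items) * k` are assumptions, not something confirmed for cmfrec.

```python
def lbfgsb_workspace_bytes(n, m=10):
    """Approximate bytes requested by the arrays shown in the traceback.

    n: number of variables being optimized
    m: number of stored corrections (SciPy's `maxcor`)
    """
    float64_size, int32_size = 8, 4
    x = n * float64_size                                          # x = array(x0, float64)
    g = n * float64_size                                          # g = zeros((n,), float64)
    wa = (2 * m * n + 5 * n + 11 * m * m + 8 * m) * float64_size  # wa = zeros(...)
    iwa = 3 * n * int32_size                                      # iwa = zeros(3*n, int32)
    return x + g + wa + iwa


# Hypothetical sizes, for illustration only (not taken from the issue).
n_users, n_items, k = 100_000, 50_000, 40
n_params = (n_users + n_items) * k  # assumed parameter count for a plain factorization
print(f"approx. workspace: {lbfgsb_workspace_bytes(n_params) / 1e9:.2f} GB")
```

If the estimate for the real data is far below the available RAM, the size passed in as `n` is probably larger than expected, which would point at how the inputs were indexed rather than at the optimizer itself.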

@david-cortes
Owner

david-cortes commented Jul 17, 2018

Please upload here a pickle of the model that fails, saved after calling the fit method, and a pickle of the one that succeeds. For the latter, stop it before it finishes fitting (if you can see messages about iterations in the console, please stop it right there) and pickle it afterwards. They are likely to be heavy files, so please use something like Dropbox or Google Drive.

In order to save them, you need to use the pickle module:

import pickle

# Use a context manager so the file is closed even if pickling fails
with open("model.p", "wb") as f:
    pickle.dump(model_obj, f)
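Since the files are expected to be large, a slightly more defensive version of the same idea may help. This is a minimal sketch: `save_model` and `load_model` are hypothetical helper names, the dictionary stands in for the fitted model object, and whether a cmfrec model object pickles cleanly is assumed rather than verified here. It uses `pickle.HIGHEST_PROTOCOL` because protocol 4 and later support very large objects.

```python
import os
import pickle
import tempfile


def save_model(model_obj, path="model.p"):
    """Pickle an object to `path`, closing the file even if pickling fails."""
    with open(path, "wb") as fh:
        pickle.dump(model_obj, fh, protocol=pickle.HIGHEST_PROTOCOL)


def load_model(path="model.p"):
    """Load a previously pickled object back from `path`."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Round trip with a stand-in object (a real model object would go here).
demo_path = os.path.join(tempfile.mkdtemp(), "model.p")
save_model({"k": 40, "reg_param": 0.1}, demo_path)
restored = load_model(demo_path)
print(restored, os.path.getsize(demo_path), "bytes")
```

Loading the file back once before uploading is a cheap way to confirm that the pickle is complete.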

@david-cortes
Owner

By the way, seeing as you are using reindex=False, please also try passing a numpy array instead of a dataframe and see if you still get the same error. It should have 3 columns, corresponding, in this order, to: UserId, ItemId, Rating.
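A minimal sketch of that conversion, assuming the ratings DataFrame has columns named `UserId`, `ItemId` and `Rating` (the toy values below are made up, and `ratings_to_array` is a hypothetical helper). The call to `fit` is left as a comment because it needs the real model and data.

```python
import numpy as np
import pandas as pd


def ratings_to_array(ratings_df):
    """Return a 2-D array with columns in the order UserId, ItemId, Rating."""
    cols = ["UserId", "ItemId", "Rating"]
    # Selecting by name fixes the column order regardless of the DataFrame layout.
    return ratings_df[cols].to_numpy(dtype=np.float64)


# Toy data for illustration only; the columns are deliberately out of order.
product_train = pd.DataFrame({
    "Rating": [5.0, 3.0, 4.0],
    "UserId": [0, 0, 1],
    "ItemId": [1, 2, 0],
})

ratings_arr = ratings_to_array(product_train)
print(ratings_arr.shape, ratings_arr.dtype)

# Then, with the real data:
# recommender.fit(ratings=ratings_arr)
```

Note that a single NumPy array has one dtype, so integer IDs end up stored as floats next to the ratings.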

@david-cortes
Owner

One last question: could you also verify that the version you have installed says 0.5? If you installed it with pip, you can see this info with pip show cmfrec.
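The same check can be done from inside the Python session that runs the model, which rules out pip pointing at a different environment. This is a sketch that assumes Python 3.8 or newer (where `importlib.metadata` is in the standard library); `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


print("cmfrec version:", installed_version("cmfrec"))
```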

@wyfoffice
Author

@david-cortes Sorry for the late reply. I will try to use pickle to record the errors. Also, the installed version of cmfrec is 0.5.

@david-cortes
Owner

Closing as there's a new version which removed the code where this error happened.
