Question about large-scale data #41

Closed
Caojunxu opened this issue Jul 18, 2016 · 2 comments

Comments

@Caojunxu

First of all, thank you for this nice library; it is very interesting.

When I run BiasedMF on about 600M of training data, the system runs very slowly, and after half an hour it throws an out-of-memory exception. The JVM configuration I set was:

HEAP_OPTS="-Xmx2048m -XX:PermSize=64m -XX:MaxPermSize=256m -XX:-UseGCOverheadLimit"

Can I solve this simply by increasing -Xmx, or is the system only suitable for small amounts of data? I ask because there seem to be several copies of the training data held in memory.
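
Before raising -Xmx, it may be worth confirming that the HEAP_OPTS value actually reaches the JVM that runs the training job. The following is a minimal sketch using only the standard `java.lang.Runtime` API; the class name `HeapCheck` and the helper `toMiB` are made up for illustration and are not part of the library.

```java
// Minimal sketch (hypothetical class): report the heap limits of the running JVM.
final class HeapCheck {
    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory(): the most heap the JVM will attempt to use (roughly the -Xmx value).
        System.out.println("max heap  (MiB): " + toMiB(rt.maxMemory()));
        // totalMemory(): heap currently reserved by the JVM.
        System.out.println("committed (MiB): " + toMiB(rt.totalMemory()));
        // freeMemory(): unused part of the currently reserved heap.
        System.out.println("free      (MiB): " + toMiB(rt.freeMemory()));
    }
}
```

If it is started with the same options as the training job and the reported maximum is far below 2048 MiB, the options are probably not being passed through by the launch script.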

@ranlocar
Contributor

ranlocar commented Aug 2, 2016

Has it completed any iterations? Has it finished loading the data, and started building the model? How many users and how many items do you have? How many factors are you trying to extract?

Have you seen a message like this one:

[INFO ] 2016-08-02 09:18:17,810 -- Dataset: ./train_strings.txt
[DEBUG] 2016-08-02 09:18:23,331 -- With Specs: {Users, Items, Ratings} = {88715, 8577, 1864230}, Scale = {1.0,5.0,10.0,15.0}
[INFO ] 2016-08-02 09:18:24,208 -- With Setup: test-set -f ./test_strings.txt --test-view all --early-stop RMSE
[INFO ] 2016-08-02 09:18:24,209 -- Dataset: ./test_strings.txt
[DEBUG] 2016-08-02 09:18:24,588 -- With Specs: {Users, Items, Ratings} = {88715, 8577, 132767}, Scale = {1.0,10.0,15.0}
[DEBUG] 2016-08-02 09:18:24,988 -- BPR: [factors, lRate, maxLRate, regB, regU, regI, iters, boldDriver] = [-1.0,40,0.001,-1.0,0.001,0.001,100]

Can you post your stacktrace?
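
The questions about users, items, and factors matter because those counts set a lower bound on the memory a matrix factorization model needs. The sketch below is a back-of-the-envelope estimate, not the library's actual storage layout. It assumes dense `double` factor matrices and ratings held as (int, int, double) triples. The counts come from the sample log above, and the factor count of 10 is purely hypothetical.

```java
// Rough lower-bound estimate (hypothetical class, not the library's real layout).
final class FactorMemory {
    /** Bytes for two dense double matrices: users x factors and items x factors. */
    static long factorBytes(long users, long items, int factors) {
        return (users + items) * factors * Double.BYTES;
    }

    /** Bytes for ratings stored as (int user, int item, double value) triples. */
    static long ratingBytes(long ratings) {
        return ratings * (Integer.BYTES + Integer.BYTES + Double.BYTES);
    }

    public static void main(String[] args) {
        long users = 88715, items = 8577, ratings = 1864230; // from the sample log
        int factors = 10;                                    // assumption, for illustration
        System.out.println("factor matrices: " + factorBytes(users, items, factors) + " bytes");
        System.out.println("rating triples:  " + ratingBytes(ratings) + " bytes");
    }
}
```

Real usage will be higher because of object headers, index structures, and any additional copies of the data kept during training.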

@KeqiangWang

Just add more -Xmx memory to fix this problem. I think -Xmx6144m is needed for 600M of data (Java needs about 10x the space of the data). You can try it.
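
The arithmetic behind that suggestion can be sketched as follows. This assumes "600M" means roughly 600 MB of input and takes the 10x factor as a rule of thumb from the comment above, not a measured value. Rounding up to a multiple of 1024 MB is only a guess at how 6144 was reached, and the class and method names are hypothetical.

```java
// Rule-of-thumb heap sizing (hypothetical helper, assumptions as stated above).
final class HeapEstimate {
    /** Data size times an assumed in-memory overhead factor, in MB. */
    static long suggestedHeapMb(long dataMb, int overheadFactor) {
        return dataMb * overheadFactor;
    }

    /** Rounds valueMb up to the next multiple of stepMb. */
    static long roundUpTo(long valueMb, long stepMb) {
        return ((valueMb + stepMb - 1) / stepMb) * stepMb;
    }

    /** Formats a heap size in MB as a JVM -Xmx option. */
    static String xmxFlag(long heapMb) {
        return "-Xmx" + heapMb + "m";
    }

    public static void main(String[] args) {
        long raw = suggestedHeapMb(600, 10);   // 6000 MB
        long rounded = roundUpTo(raw, 1024);   // next multiple of 1 GB
        System.out.println(xmxFlag(rounded));  // prints -Xmx6144m
    }
}
```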
