run_example failed in MacOS #246

Closed
cuiwow opened this issue Apr 12, 2019 · 9 comments

Comments

@cuiwow

cuiwow commented Apr 12, 2019

I built the latest version with build.sh on macOS (Xcode and CMake were installed first).
Running run_example.sh aborted with this error message:
(screenshot of the error message; image not included)

MacOS: Mojave 10.14.4
xlearn: 0.43

(The pip-installed version 0.40a works on my computer, but it has a bug that was only fixed in the latest version.)

@aksnzhy
Owner

aksnzhy commented Apr 12, 2019

@cuiwow I see the same error. You can work around it like this:

Create a build directory outside the xlearn source tree, go into it, and run:

cmake ../xlearn
make -j4

After that, the demo should run successfully.

I will figure out why this error occurs. Thanks!

@aksnzhy
Owner

aksnzhy commented Apr 12, 2019

@cuiwow I fixed the bug and you can use the latest code to test it. Thank you!

@cuiwow
Author

cuiwow commented Apr 12, 2019

@aksnzhy Thanks, that works.

You have to accept the Xcode license agreement before using /usr/bin/cc; otherwise it throws an error. I don't know whether that was the cause, or how I managed to install it before without /usr/bin/cc.

@cuiwow
Author

cuiwow commented Apr 12, 2019

@aksnzhy Another case: training fails on my own data with the same error message.

Each line is about 2.1 MB.
CSV file format: label,value0,value1... The label is an int (0/1); the values are binary floats (0.0/1.0).
cmd: xlearn_train ./fm_train.csv -s 2 -v ./fm_eval.csv -x acc

It works with just 10 values per line, but it fails on the real data with more than 500,000 columns, as in the attached file.
I have checked the number of columns, and every element is an integer or a float.
fm_train.txt
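
A check like the one described above can be done in a single pass over the file. The sketch below is a standalone script, not part of xLearn; the function name `scan_csv` and the keys it returns are made up for illustration. It reports how many fields each line has, the longest line in bytes (relevant when a reader uses a fixed-size line buffer), and any field that does not parse as a number.

```python
from collections import Counter


def scan_csv(path, sep=","):
    """Scan a dense CSV once and report basic shape statistics.

    Hypothetical helper for illustration; it is not part of xLearn.
    """
    column_counts = Counter()   # number of fields -> how many lines have it
    max_line_bytes = 0          # longest line in bytes, excluding the newline
    bad_fields = []             # (line_number, field) pairs that are not numeric

    with open(path, "rb") as f:
        for line_no, raw in enumerate(f, start=1):
            raw = raw.rstrip(b"\r\n")
            if not raw:
                continue  # ignore blank lines
            max_line_bytes = max(max_line_bytes, len(raw))
            fields = raw.decode("utf-8").split(sep)
            column_counts[len(fields)] += 1
            for field in fields:
                try:
                    float(field)
                except ValueError:
                    bad_fields.append((line_no, field))

    return {
        "column_counts": dict(column_counts),
        "max_line_bytes": max_line_bytes,
        "bad_fields": bad_fields,
    }
```

If `column_counts` has more than one key, the lines differ in width; `max_line_bytes` gives a lower bound on the line buffer a parser would need for this file.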

@aksnzhy
Owner

aksnzhy commented Apr 12, 2019

@etveritas Could you please help @cuiwow to solve this issue? Thanks!

@etveritas
Collaborator

@aksnzhy Okay. @cuiwow I'll try to solve this issue; please give me a minute.

@etveritas
Collaborator

etveritas commented Apr 12, 2019

@aksnzhy There was another kMaxLineSize that had been left unchanged, and I have now updated it. @cuiwow After changing this constant, I tested on WSL with your data and it passed, so you can use the latest source code. By the way, xLearn still skips zeros when reading CSV; you should comment this line out if you need them.
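
To make the zero-skipping remark concrete: a reader that skips zeros keeps only the nonzero entries of each dense row, which is the same information a sparse `index:value` line (libsvm-style layout) carries. The sketch below only illustrates that idea. It is not xLearn's CSV reader, `dense_row_to_sparse` is a hypothetical name, and the 0-based feature indices are an assumption chosen for the example.

```python
def dense_row_to_sparse(line, sep=",", keep_zeros=False):
    """Turn 'label,v0,v1,...' into 'label idx:value ...'.

    Illustration only; hypothetical helper, not xLearn's CSV reader.
    Feature indices are the 0-based column positions after the label.
    """
    fields = line.strip().split(sep)
    label, values = fields[0], fields[1:]
    pairs = []
    for idx, text in enumerate(values):
        value = float(text)
        if value == 0.0 and not keep_zeros:
            continue  # this is the "skip zeros" behavior
        pairs.append(f"{idx}:{value:g}")
    return " ".join([label] + pairs)


print(dense_row_to_sparse("1,0.0,1.0,0.0,1.0"))                   # 1 1:1 3:1
print(dense_row_to_sparse("1,0.0,1.0,0.0,1.0", keep_zeros=True))  # 1 0:0 1:1 2:0 3:1
```

With binary 0.0/1.0 features and mostly-zero rows, the sparse line is far shorter than the dense one, which is why skipping zeros matters for files with hundreds of thousands of columns.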

@cuiwow
Author

cuiwow commented Apr 12, 2019

@aksnzhy @etveritas
Thanks a lot. I ran the demo successfully, but then ran into another problem...

Each line is about 2 MB. Industrial recommendation datasets usually have more than 1M lines, often 10M or more, so the total data size would exceed 2 TB, while the original data is only 50 GB.
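
The arithmetic behind that estimate, as a short sketch. It assumes decimal units (1 MB = 10^6 bytes) and that every line is about the same length; the helper name `dense_size_bytes` is illustrative.

```python
MB = 10**6   # decimal megabyte, in bytes
GB = 10**9
TB = 10**12


def dense_size_bytes(line_bytes, num_lines):
    """Total size of a file whose lines all have roughly the same length."""
    return line_bytes * num_lines


line_bytes = 2 * MB          # about 2 MB per line, as reported above
original_bytes = 50 * GB     # size of the original data, as reported above

size_1m = dense_size_bytes(line_bytes, 1_000_000)
size_10m = dense_size_bytes(line_bytes, 10_000_000)

print(size_1m / TB)               # 2.0  (TB for 1M lines)
print(size_10m / TB)              # 20.0 (TB for 10M lines)
print(size_1m / original_bytes)   # 40.0 (blow-up factor at 1M lines)
```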

I want to train on a single server to build a baseline, but my disk is not large enough.

I think the data should be converted into the right format batch by batch during training, rather than all at once.
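
As a rough sketch of the batch-by-batch idea: a generator can hand out fixed-size batches from any record stream, so only one batch is held in memory or written out at a time. This is a generic Python pattern, not an xLearn API, and `iter_batches` is a hypothetical name.

```python
from itertools import islice


def iter_batches(records, batch_size):
    """Yield lists of at most batch_size items from any iterable.

    Generic illustration; hypothetical helper, not part of xLearn.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    it = iter(records)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return  # the source is exhausted
        yield batch


# Example: five records consumed two at a time.
for batch in iter_batches(range(5), 2):
    print(batch)
# [0, 1]
# [2, 3]
# [4]
```

Because `records` can itself be a generator that reads the compact original data, the expanded form of the dataset never has to exist on disk in full.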

Thanks for your efforts.

@etveritas
Collaborator

etveritas commented Apr 12, 2019

@cuiwow I'm not sure whether you mean that you don't have enough memory. If so, xLearn supports on-disk training for large datasets that do not fit in your machine's memory; you can find more at https://xlearn-doc.readthedocs.io/en/latest/large/index.html.
