Trying to install and getting xgboost errors #63

Closed
TimusLetap opened this issue Apr 17, 2019 · 19 comments

Comments

@TimusLetap

The system is a Kaggle kernel, which is Ubuntu-based and seems to be the desired environment.

I ran this:

!apt-get install build-essential
!pip install cmake
!pip install xgboost>=0.6a2
!pip install lightgbm>=2.0.2
!pip install mlbox

Resulting in this:

Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-xib6_1h7/xgboost/

Can you please help me out? I see your examples are also Kaggle-based, but they don't include the install steps. Do you somehow install packages from setup within the kernel?
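A side note on the commands above, offered as a sketch rather than a diagnosis: a notebook `!` line is handed to a shell, where an unquoted `>` is output redirection, so `!pip install xgboost>=0.6a2` most likely installs plain `xgboost` and writes pip's output to a file named `=0.6a2`. Quoting the requirement (`!pip install "xgboost>=0.6a2"`) avoids this. The same idea in Python is to pass the command as an argument list so that no shell parses it. The helper name `pip_install_command` is hypothetical, and this is probably not the cause of the `egg_info` failure reported here.

```python
import sys


def pip_install_command(requirement):
    """Build a pip command as an argument list so no shell interprets '>='."""
    return [sys.executable, "-m", "pip", "install", requirement]


# The three requirements from the original post, version specifiers intact.
commands = [
    pip_install_command(req)
    for req in ("xgboost>=0.6a2", "lightgbm>=2.0.2", "mlbox")
]

# To actually run the installs, pass each list to subprocess.check_call(cmd).
```

Because each requirement is a single list element, the `>=` reaches pip unchanged.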

@TimusLetap
Author

When I re-run the install multiple times, the error changes to different variants of xgboost errors; I'm not sure what that means. I've been trying to replicate your Titanic Kaggle build to see how to use the package and make it work, so that I can confidently apply it to other datasets.

@AxeldeRomblay
Owner

I'll work on the setup (to replace xgboost with LightGBM). Meanwhile, you can try to install xgboost==0.6a2 manually...

@AxeldeRomblay
Owner

See #55.

@TimusLetap
Author

The same error(s) come up. I think you may be right that xgboost is causing the problem, and replacing it might help. LightGBM doesn't seem to be causing issues. Then again, I haven't been able to run the actual package, so I don't know what the limitations are.

@TimusLetap
Author

Can you perhaps share a screenshot of how you installed the package on a Kaggle kernel to get your examples working?

@AxeldeRomblay
Owner

Did you manage to install xgboost first? If not, please refer to: https://xgboost.readthedocs.io/en/latest/build.html

@TimusLetap
Author

Are you able to share screenshots of how you install and import these libraries in your Kaggle examples? That would be super useful and would most likely address such issues. I was trying to follow your Kaggle examples, but without being able to actually run the package it has proven quite difficult. I am able to install xgboost, and xgboost on its own runs fine. The xgboost-specific errors only arise when installing mlbox.

@TimusLetap
Author

TimusLetap commented Apr 25, 2019

I hope I am being clear and concise. If you can re-create your use of mlbox in your Kaggle examples and share screenshots of how you were able to load and run the package in that environment, it would be highly appreciated and useful for recreating the process. Thank you.

@richinvest

I have the same issue as Timus. No issues with xgboost itself, only when installing mlbox.

@AxeldeRomblay
Owner

OK, I have just removed XGBoost from the setup file (and the imports in the code). Can you try to reinstall MLBox from pip or from GitHub, please? It should work now. See https://github.com/AxeldeRomblay/MLBox/tree/master/python-package
Thanks!
PS: I will share screenshots next week, I just need more time...
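After reinstalling, one way to confirm that the installed release no longer asks for xgboost is to inspect the package metadata. This is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8+); the helper name `declares_dependency` is hypothetical, and the prefix match is deliberately crude (it would also match a package whose name merely starts with `xgboost`).

```python
from importlib import metadata


def declares_dependency(dist_name, dep_name):
    """Report whether an installed distribution lists dep_name as a requirement.

    Returns True or False when dist_name is installed, None when it is not.
    """
    try:
        # requires() returns a list of requirement strings, or None if empty.
        requirements = metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return None
    dep = dep_name.lower()
    return any(req.strip().lower().startswith(dep) for req in requirements)


# Example: declares_dependency("mlbox", "xgboost")
# None means mlbox is not installed in this environment at all.
```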

@TimusLetap
Author

No worries! I understand. I just want to be able to use the package and spread the news about it to others! Take your time. It's better to have this working than not. I'll post as soon as I run some tests.

@TimusLetap
Author

I ran MLBox and it seems to run and import OK; I'm just having issues loading the dataset. Are there parameters we can use for the read function? Or is the tqdm wrapper compatible?

@TimusLetap
Author

Is there a way to pass dataframes (pandas, if that needs to be specified) to MLBox for processing?

@AxeldeRomblay
Owner

OK, great :) Can you open a new issue, please, with screenshots/snippets of the code and error message?
For the moment you have to dump your dataframes to files and then read and preprocess them using MLBox. In the next release these will be two separate tasks, so that you can skip the reading task if you already have data loaded.
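The "dump your dataframes" workaround might look like the sketch below. It assumes `train_df` and `test_df` are pandas DataFrames (anything with a `to_csv(path, index=False)` method works), that the target column is named `"target"`, and that MLBox's `Reader.train_test_split(paths, target_name)` is called as I recall it from the MLBox documentation; the helper `dump_frames`, the file names, and the directory name are hypothetical, so check them against your installed version.

```python
import os


def dump_frames(train_df, test_df, out_dir):
    """Write two DataFrame-like objects to CSV files and return their paths."""
    os.makedirs(out_dir, exist_ok=True)
    train_path = os.path.join(out_dir, "train.csv")
    test_path = os.path.join(out_dir, "test.csv")
    # index=False keeps the pandas index from becoming an extra feature column.
    train_df.to_csv(train_path, index=False)
    test_df.to_csv(test_path, index=False)
    return [train_path, test_path]


# Assumed usage, not run here because it needs pandas and mlbox installed:
#   paths = dump_frames(train_df, test_df, "mlbox_input")
#   from mlbox.preprocessing import Reader
#   data = Reader(sep=",").train_test_split(paths, "target")
```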

@TimusLetap
Author

What size CSV files does it handle at the moment? I am having trouble reading in a large dataset.

@AxeldeRomblay
Owner

How many rows and features do you have?

@TimusLetap
Author

I have 2 columns with 64M rows. One column is the feature and the other is the target variable.

@TimusLetap
Author

TimusLetap commented May 10, 2019 via email

@AxeldeRomblay
Owner

Hello! The install problem is solved (xgboost is now removed in the latest release). For the reading issue, I think your dataset may just be too large. Maybe you can open a new issue with a screenshot (is it the reading or the preprocessing that takes a lot of time?). Thanks!
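To check whether size is the problem, one option is to cut a smaller file from the 64M-row CSV and see whether reading and preprocessing succeed on it. The sketch below uses only the standard library and streams the file, so it never loads everything into memory; the helper name `write_head` and the file names are hypothetical, and taking the first rows assumes the file is not sorted in a way that makes its head unrepresentative.

```python
import csv
from itertools import islice


def write_head(src_path, dst_path, n_rows):
    """Copy the header plus the first n_rows data rows; return rows copied."""
    with open(src_path, newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader, None)
        if header is None:
            return 0  # empty source file, nothing to copy
        writer.writerow(header)
        copied = 0
        # islice stops after n_rows without reading the rest of the file.
        for row in islice(reader, n_rows):
            writer.writerow(row)
            copied += 1
        return copied


# Example: write_head("train.csv", "train_1M.csv", 1_000_000)
```

Timing the read and the preprocessing separately on such a subset would also answer the question of which step is slow.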


3 participants