[Feature] Python Bindings for LightGBM #11

Closed
chivee opened this issue Oct 18, 2016 · 13 comments

@chivee
Collaborator

chivee commented Oct 18, 2016

Hi all, we have a tentative plan to extend LightGBM to Python users. Please share your opinions with us.

@ParadoxShmaradox

I think this would be a great step forward for adoption. Of course, it should also be easy to install and use.
These are the features I think you should consider for Python support:

  • install via pip
  • scikit-learn interface so people can use fit/predict and have a drop-in replacement for xgboost/GBM
  • Loading/Saving models via pickle or dedicated methods
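To make the last bullet concrete, here is a minimal sketch of the two persistence routes: pickle and dedicated save/load methods. `TinyModel` and its methods `to_state`, `from_state`, `model_to_string`, and `model_from_string` are hypothetical stand-ins invented for illustration, not LightGBM classes or functions.

```python
import json
import pickle


class TinyModel:
    """Hypothetical stand-in for a trained booster; not a LightGBM class."""

    def __init__(self, params=None, leaf_values=None):
        self.params = dict(params or {})
        # Each "tree" is reduced to one constant output to keep the sketch short.
        self.leaf_values = list(leaf_values or [])

    def predict(self, rows):
        # Every row gets the sum of all tree outputs.
        total = sum(self.leaf_values)
        return [total for _ in rows]

    def to_state(self):
        # Plain dict of built-in types, so it is safe for both pickle and JSON.
        return {"params": self.params, "leaf_values": self.leaf_values}

    @classmethod
    def from_state(cls, state):
        return cls(state["params"], state["leaf_values"])

    def model_to_string(self):
        # Dedicated method: a readable text format that other tools can parse.
        return json.dumps(self.to_state())

    @classmethod
    def model_from_string(cls, text):
        return cls.from_state(json.loads(text))


model = TinyModel({"num_leaves": 63}, [0.5, 0.25])

# Route 1: pickle round-trip of the model state.
restored = TinyModel.from_state(pickle.loads(pickle.dumps(model.to_state())))

# Route 2: dedicated text serialization.
from_text = TinyModel.model_from_string(model.model_to_string())
```

Pickle is the convenient route inside Python, while a dedicated text format stays readable from other languages and across library versions.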

@yychenca

Agree with ParadoxShmaradox. For me, early stopping would also be a great feature to have.
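For readers unfamiliar with the term, early stopping can be sketched roughly as follows. The function name, the `patience` parameter, and the list of validation losses are assumptions made for this illustration; a real trainer would compute each loss after a boosting round rather than read it from a list.

```python
def best_round_with_early_stopping(valid_losses, patience):
    """Return (best_round, best_loss), stopping after `patience` rounds without improvement.

    valid_losses stands in for the validation loss observed after each
    boosting round; lower is better.
    """
    best_round, best_loss = -1, float("inf")
    for round_idx, loss in enumerate(valid_losses):
        if loss < best_loss:
            best_round, best_loss = round_idx, loss
        elif round_idx - best_round >= patience:
            # No improvement for `patience` consecutive rounds: stop training.
            break
    return best_round, best_loss


# The final 0.10 is never reached because training stops at round 5.
losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.63, 0.10]
best_round, best_loss = best_round_with_early_stopping(losses, patience=3)
```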

@chivee
Collaborator Author

chivee commented Oct 19, 2016

@ParadoxShmaradox @yychenca, we can break the work into two phases: 1. Python bindings. 2. A scikit-learn (or other popular framework) interface.

Model (de)serialization, cross-validation, and early stopping seem to be other popular features we should work on. I'll open issues for these features.
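As a rough illustration of the cross-validation feature mentioned here, the fold splitting could look something like the sketch below. `k_fold_indices` is a hypothetical helper written for this example; it does not shuffle or stratify, which a real implementation would likely offer.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, valid_idx) so that each sample is used for validation exactly once."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        valid_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, valid_idx
        start += size


folds = list(k_fold_indices(10, 3))
```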

@geoHeil

geoHeil commented Oct 19, 2016

I want to add that something like XGBoost's scale_pos_weight for handling imbalanced classes, or even better an implementation of example-dependent cost-sensitive classification, would be very interesting: http://nbviewer.jupyter.org/github/albahnsen/CostSensitiveClassification/blob/master/doc/tutorials/tutorial_edcs_credit_scoring.ipynb
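For context, XGBoost's documentation suggests the ratio of negative to positive examples as a typical value for scale_pos_weight. The sketch below computes that ratio and the equivalent per-sample weights; the helper names are hypothetical and nothing here calls XGBoost or LightGBM.

```python
def suggested_scale_pos_weight(labels):
    """Return (number of negatives) / (number of positives) for 0/1 labels."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    if n_pos == 0:
        raise ValueError("labels contain no positive examples")
    return n_neg / n_pos


def per_sample_weights(labels, pos_weight):
    """Weight every positive example by pos_weight and every negative by 1."""
    return [pos_weight if y == 1 else 1.0 for y in labels]


labels = [0] * 90 + [1] * 10
pos_weight = suggested_scale_pos_weight(labels)
weights = per_sample_weights(labels, pos_weight)

# After weighting, both classes contribute the same total weight.
total_pos = sum(w for w, y in zip(weights, labels) if y == 1)
total_neg = sum(w for w, y in zip(weights, labels) if y == 0)
```

Example-dependent cost-sensitive learning goes further by assigning each example its own cost instead of one weight per class.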

@mxbi

mxbi commented Oct 19, 2016

I think we need a way to create the LightGBM binary format directly from a NumPy array instead of having to do an expensive write and read through CSV/SVMLight. For really big datasets (e.g. the Kaggle Bosch competition) it can take an hour to read and write CSVs, so something like this (functionality similar to xgboost.DMatrix()) would be excellent.

@modkzs

modkzs commented Oct 20, 2016

I think user-defined loss and eval functions are also important.
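A sketch of what such user-defined functions might look like, assuming the booster asks the objective for per-sample first and second derivatives with respect to the raw score (the convention XGBoost uses for custom objectives). The function names and the `(name, value, higher_is_better)` return shape of the eval function are assumptions of this example, not a confirmed LightGBM interface.

```python
import math


def logistic_objective(raw_scores, labels):
    """Return per-sample gradients and Hessians of log loss w.r.t. the raw score."""
    grads, hessians = [], []
    for score, y in zip(raw_scores, labels):
        p = 1.0 / (1.0 + math.exp(-score))  # sigmoid turns the raw score into a probability
        grads.append(p - y)                 # first derivative of log loss
        hessians.append(p * (1.0 - p))      # second derivative of log loss
    return grads, hessians


def error_rate(raw_scores, labels):
    """Custom eval: fraction of samples whose predicted class differs from the label."""
    wrong = sum(1 for s, y in zip(raw_scores, labels) if (s > 0.0) != (y == 1))
    return "error", wrong / len(labels), False  # False: lower is better


grads, hessians = logistic_objective([0.0, 2.0], [1, 0])
metric = error_rate([1.0, -1.0], [1, 0])
```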

@yychenca

@chivee Thank you, chivee. I had a chance to try LightGBM on a regression problem with a dataset of 180k rows and 30 features. Training completed in 8 seconds, compared to 52 seconds for XGBoost with comparable parameters (63 leaves vs. a depth of 8). And the accuracy of LightGBM (L1/MAE) was even a bit better. Truly impressed!

Noticed that early stopping has already been added and I'll probably give it another try soon.

Anyways, great job!

@ArdalanM

A quick wrapper for LightGBM: https://github.com/ArdalanM/pyLightGBM
It still dumps the input to SVM format before training.

@chivee
Collaborator Author

chivee commented Oct 21, 2016

@ArdalanM, thank you, that will be very helpful!

@guolinke guolinke added this to the 0.1 milestone Oct 23, 2016
@chivee
Collaborator Author

chivee commented Nov 10, 2016

Hi guys, a simple Python binding using ctypes can be found at
https://github.com/Microsoft/LightGBM/blob/master/tests/c_api_test/test.py

Any feedback on the bindings would be great.
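For anyone new to ctypes, the core of such a binding is handing Python data to C as a contiguous buffer, which also avoids the file round-trip raised earlier in the thread. The sketch below shows only that marshalling step; `to_c_double_matrix` is a hypothetical helper, and no LightGBM C API function is called or assumed.

```python
import ctypes


def to_c_double_matrix(rows):
    """Flatten a row-major list of lists into a contiguous C double buffer."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    flat = [float(v) for row in rows for v in row]
    buf = (ctypes.c_double * len(flat))(*flat)  # allocates a C array of doubles
    return buf, n_rows, n_cols


buf, n_rows, n_cols = to_c_double_matrix([[1, 2, 3], [4, 5, 6]])

# A C function would typically receive this pointer together with the shape.
ptr = ctypes.cast(buf, ctypes.POINTER(ctypes.c_double))

# Row-major indexing as the C side would do it: element (1, 2).
value = ptr[1 * n_cols + 2]
```

In a real binding the pointer and shape would be passed to a function loaded with `ctypes.CDLL`, after declaring its `argtypes` and `restype`.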

@ternaus

ternaus commented Nov 15, 2016

The existing Python binding is great, but I would love to be able to replace existing algorithms in a pipeline, and most likely they have a scikit-learn interface.

At least on the level of:

    1. fit
    2. transform
    3. fit_transform
    4. predict
    5. predict_proba
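The point of matching these method names is that pipeline code can swap estimators without changes. Below is a minimal sketch covering fit, predict, and predict_proba only; `PriorClassifier` is a hypothetical toy model that ignores the features, and `fit_and_score` stands in for existing pipeline code.

```python
class PriorClassifier:
    """Hypothetical toy estimator using scikit-learn's method names; not a LightGBM class."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold

    def get_params(self, deep=True):
        return {"threshold": self.threshold}

    def set_params(self, **params):
        for name, value in params.items():
            setattr(self, name, value)
        return self

    def fit(self, X, y):
        # "Training" just records the share of positive labels.
        self.p_positive_ = sum(y) / len(y)
        return self

    def predict_proba(self, X):
        # One [P(class 0), P(class 1)] pair per row.
        return [[1.0 - self.p_positive_, self.p_positive_] for _ in X]

    def predict(self, X):
        return [int(p[1] >= self.threshold) for p in self.predict_proba(X)]


def fit_and_score(estimator, X, y):
    """Pipeline code that relies only on the fit/predict contract."""
    preds = estimator.fit(X, y).predict(X)
    return sum(int(p == t) for p, t in zip(preds, y)) / len(y)


X = [[0.0], [1.0], [2.0], [3.0]]
y = [1, 1, 1, 0]
accuracy = fit_and_score(PriorClassifier(), X, y)
```

Any estimator exposing the same methods could be passed to `fit_and_score` unchanged.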

@guolinke
Collaborator

Refer to #94.
The basic classes are finished. I think we can build on these classes to implement sklearn-like interfaces.

guolinke added a commit that referenced this issue Dec 1, 2016
@guolinke
Collaborator

guolinke commented Dec 1, 2016

Hi all,

The python-package branch is merged; see https://github.com/Microsoft/LightGBM/tree/master/python-package
You are welcome to try it and to report feedback and issues.


9 participants