-
Notifications
You must be signed in to change notification settings - Fork 85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add LightGBM Regressor #1459
Add LightGBM Regressor #1459
Conversation
Codecov Report
@@ Coverage Diff @@
## main #1459 +/- ##
=========================================
+ Coverage 100.0% 100.0% +0.1%
=========================================
Files 236 238 +2
Lines 16933 17136 +203
=========================================
+ Hits 16925 17128 +203
Misses 8 8
Continue to review full report at Codecov.
|
…o bc_296_lgbm_regressor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bchen1116 nice going! Great work debugging that weird looking glass error, haha.
I left some questions about the parameters, and some comments on the perf test doc. I think it would be cool to change the default max_depth
to something greater than 0, and see how that impacts the accuracy in the first batch. It would be cool to try perf testing a couple other tweaks too but not required.
SEED_MIN = 0 | ||
SEED_MAX = SEED_BOUNDS.max_bound | ||
|
||
def __init__(self, boosting_type="gbdt", learning_rate=0.1, n_estimators=100, max_depth=0, num_leaves=31, min_child_samples=20, n_jobs=-1, random_state=0, bagging_fraction=0.9, bagging_freq=0, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bchen1116 any particular reason for choosing these default values? It could be interesting to do some scanning on these parameters and see if they should be tweaked. max_depth=0
jumps out to me as being too low--the default in lightgbm is -1 i.e. "no limit", but I don't think we should do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, max_depth <= 0
results in the same behavior of setting "no limit" to the depth. I can definitely tweak it though, but the values I chose are the defaults for LightGBM, and I remember having a discussion to use these provided defaults rather than choosing arbitrary defaults on our end.
…o bc_296_lgbm_regressor
Perf docs updated with new tests that determine the idea |
…o bc_296_lgbm_regressor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bchen1116 This is great! The performance tests to see the impact of different default parameter values are awesome.
There's a lot of overlap between this class and the lgbm classifier. It'd be nice if we could share at least the logic for overriding the parameters in init and _encode_categories
. What do you think? Not blocking and probably best to do in another issue anyways if you agree.
@freddyaboulton definitely agree! It would be nice to run some perf tests on the LGBM Classifier and see if we can find better default values, especially since we have much more test data for classification. I think that is a good separate issue, which I've just submitted here |
fix #296
Perf tests here
Documentation here