Fix 'RF' error for LightGBM Classifier #1302
Why default `bagging_freq` to 0? Won't that cause the bug when `boosting_type="rf"`? What default does lightgbm choose for this parameter?
LightGBM defaults to 0 for `bagging_freq`. Users can set it to 1 and change `bagging_fraction` if they want to speed up computation and randomly select data for other boosting types, but it's required to be 1 for `boosting_type=rf` (along with `0 < bagging_fraction < 1.0`).
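To make the constraint concrete, here is one valid parameter combination for random forest mode. The parameter names are LightGBM's real ones; the specific `bagging_fraction` value is just one choice that satisfies the constraint, not a required default:

```python
# One valid parameter combination for LightGBM's random forest mode.
# For boosting_type="rf", bagging must be enabled: bagging_freq set to 1
# and bagging_fraction strictly between 0 and 1.
rf_params = {
    "boosting_type": "rf",
    "bagging_freq": 1,
    "bagging_fraction": 0.9,
}

# Sanity-check the rf constraints described above.
assert rf_params["bagging_freq"] == 1
assert 0 < rf_params["bagging_fraction"] < 1.0
```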
Got it. This looks good. Is 0.9 the default `bagging_fraction` in lightgbm?
@dsherry it defaults to 1.0
@bchen1116 could you please explain why adding these two parameters fixed the bug?
As some background, LightGBM has 4 boosting types: "gbdt", "dart", "goss", "rf". `bagging_freq` refers to the frequency of bagging, where it bags every `bagging_freq = k` iterations (0 means it doesn't bag). `bagging_fraction` refers to the fraction of data randomly selected without resampling (1 means select all, 0 means none). This can help speed up the training process.

The default `bagging_freq` that LightGBM sets is 0, which works with `gbdt`, `dart`, and `goss`. However, for `rf`, since it's a random forest, LightGBM requires that it uses bagging, which means `bagging_freq` must be 1 and `bagging_fraction` must be set below 1.0. By adding those two parameters and changing `bagging_freq` when `boosting_type=rf`, we do a simple fix to avoid this bug.
Thanks for the clear explanation! That makes sense.
Can we tweak the comment you left on line 48: