[R-package] add a tree plotting function #6729
base: master
Let's simplify this, please.
Let's please make this 1-based, as that's a direction we eventually want to move in the package: #4970 (review)
I'm not totally convinced about this idea... it should be possible to recover the feature names from the model directly.
But before you remove this... can you please expand this doc and add examples and tests showing what this would look like? Right now, it's hard for me to understand what the content of `rules` is supposed to be.
I understand how `min_data = 1L` is related (growing a deeper tree makes the resulting plot more interesting). But I think we can safely remove `metric = "l2"` (that will be the default for the `regression` objective) and any customization of the learning rate (since here we're only interested in showing the structure of one tree).

Let's simplify this, please.
We do not need to repeat in a comment here the same information that's already in the roxygen comments.
I can't think of any situation where it would be ok for `model` or `tree` to be `NULL`, can you?

If not, let's please require callers to provide values explicitly.
Please follow the patterns used elsewhere in the library for this:
LightGBM/R-package/R/lgb.restore_handle.R
Lines 42 to 44 in 83c0ff3
Please don't use the name `dt`. That is a function in the `{stats}` package (for finding the density of a t-distribution)... try `?dt` to see that.

Shadowing names from the standard library can lead to confusing errors. Please use `modelDT` as the name for this `data.table` instead.
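
To illustrate the kind of confusion meant here, a minimal base-R sketch (not code from this PR). It assumes a fresh session in which neither `dt` nor `modelDT` has been assigned, for example because the line creating the table was skipped or misspelled.

```r
# `dt` already resolves to the t-distribution density function from {stats}.
stopifnot(is.function(stats::dt))

# No local `dt` exists, so subsetting reaches stats::dt. The resulting error
# complains about the function, not about a missing object.
shadow_msg <- tryCatch(
    dt[1L, ],
    error = function(e) conditionMessage(e)
)
print(shadow_msg)

# With a name that collides with nothing, the same mistake produces an error
# that names the missing object directly.
clear_msg <- tryCatch(
    modelDT[1L, ],
    error = function(e) conditionMessage(e)
)
print(clear_msg)
```

The first message refers to a closure that cannot be subset, which gives no hint that the table was never created. The second names `modelDT` as not found.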
Please modify this error message so that it has enough information for someone to quickly debug the issue, like the provided value of `tree` and the number of trees in the model. And please combine it with the other check that the value is `>=0`.

Something like this:
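
A minimal sketch of a combined check, in base R only. The helper name `.check_tree_index()` and the message wording are hypothetical, and the lower bound of 1 assumes the 1-based indexing requested earlier in this review; with 0-based indexing both bounds would shift down by one.

```r
# Hypothetical helper, not part of LightGBM: validates a 1-based tree index
# and reports both the provided value and the number of trees in the model.
.check_tree_index <- function(tree, num_trees) {
    is_valid <- is.numeric(tree) &&
        length(tree) == 1L &&
        !is.na(tree) &&
        tree >= 1L &&
        tree <= num_trees
    if (!is_valid) {
        stop(sprintf(
            "lgb.plot.tree: 'tree' must be a single number between 1 and %s (the number of trees in the model). Got: %s",
            num_trees,
            toString(tree)
        ))
    }
    return(invisible(NULL))
}

# A valid index passes silently.
.check_tree_index(tree = 2L, num_trees = 3L)

# An out-of-range index fails with both values in the message.
bad_index_msg <- tryCatch(
    .check_tree_index(tree = 5L, num_trees = 3L),
    error = function(e) conditionMessage(e)
)
print(bad_index_msg)
```

Doing both bounds in one condition means a caller sees the same message whether the value is too small, too large, missing, or of the wrong type.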
I don't understand this... what's the purpose of setting all rows to `0.0` and then immediately overwriting them? It seems to me that the `0.0` could probably be removed.
Can you please add some comments to make it a bit easier to understand what's happening in this wall of code? It's very difficult to read (at least for me) as currently written.
Let's please avoid re-defining internal helper functions every time `lgb.plot.tree()` is called. This is a little bit expensive, and makes the code harder to read and develop.

Please move this up near the top of the file, and give it a name beginning with a `.` to clarify that it's internal-only, like `.zero_present`.
Similar to my previous comment, please move this up out of the definition of `lgb.plot.tree()` and give it a name beginning with a `.`, and without any other inner `.`, like `.levels_to_names`.

Avoiding the inner dots is useful to reduce the risk of that function accidentally being interpreted as an S3 method in the future.
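
For illustration, a minimal base-R sketch of that S3 risk. The dotted name `levels.to.names` and the class `"to.names"` are made up for the example and are not taken from this PR. The name reads as `<generic>.<class>` for base R's `levels()` generic.

```r
# Hypothetical helper with inner dots in its name.
levels.to.names <- function(x) {
    return("helper was called")
}

# An object whose class happens to be "to.names" dispatches to the helper
# when the levels() generic is called on it.
obj <- structure(list(), class = "to.names")
dispatched <- levels(obj)
print(dispatched)

# Renamed with a leading dot and underscores, the helper can no longer be
# picked up as a method; levels() falls back to levels.default().
rm(levels.to.names)
.levels_to_names <- function(x) {
    return("helper was called")
}
not_dispatched <- levels(obj)
print(not_dispatched)
```

After the rename, `levels(obj)` simply returns the object's `levels` attribute, which is `NULL` here.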
In this project, we prefer having an explicit `return()` statement in every function... to make the intention clearer and to avoid returning data unintentionally. See #3352 for some background.

Please add an explicit return statement to every function you're defining here.
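
A small base-R sketch of the difference, using hypothetical function names that do not appear in this PR:

```r
# Without return(): the last expression is an assignment, so its value is
# handed back invisibly and a caller can capture it by accident.
.scale_implicit <- function(x) {
    scaled <- x * 2.0
}

# With return(): the returned value is stated explicitly.
.scale_explicit <- function(x) {
    scaled <- x * 2.0
    return(scaled)
}

# A function called only for its side effect returns NULL on purpose.
.log_explicit <- function(msg) {
    message(msg)
    return(invisible(NULL))
}

leaked <- .scale_implicit(3.0)
intended <- .scale_explicit(3.0)
nothing <- .log_explicit("plotting tree")
```

Both scaling functions give the caller the same number, but only the second makes it obvious from the source that this is intended.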
In this project, by convention we:
`%>%` operator

Please update this code and all the other code you're adding to follow that. Keeping all of the code looking the same across the codebase helps us to develop and review changes.
`xgboost`'s implementation of similar functionality might be useful as a reference. See https://github.com/dmlc/xgboost/blob/e988b7cf1515b08ad0f949c26beb043ce0b33fe8/R-package/R/xgb.plot.tree.R#L159-L181