[python] more detailed docs for trees_to_dataframe(), create_tree_digraph(), plot_tree() #3618

jameslamb · 2020-12-01T06:13:19Z

I've been looking at the [lightgbm] tag on Stack Overflow recently, and have seen a few questions about how to interpret the output of some visualization methods in the Python package.

I think there is an opportunity to improve the documentation for those methods. This PR proposes more detailed docs on trees_to_dataframe(), create_tree_digraph(), and plot_tree() to help people understand them better. It's very possible that I've also misunderstood the meaning of some of these things, so I appreciate your thorough reviews.

Notes for Reviewers

I opened this from a LightGBM branch so that we can hopefully enable RTD builds and see if I made any formatting mistakes. @StrikerRUS if you agree with the spirit of this PR, could you enable RTD builds for this branch?

StrikerRUS · 2020-12-01T13:27:47Z

@jameslamb

> could you enable RTD builds for this branch?

Sure, done!
https://readthedocs.org/projects/lightgbm/builds/

jameslamb · 2020-12-01T16:32:29Z

There are some warnings, will fix them tonight

/basic.py:docstring of lightgbm.Booster.trees_to_dataframe:6: WARNING: Unexpected indentation.
/basic.py:docstring of lightgbm.Booster.trees_to_dataframe:8: WARNING: Block quote ends without a blank line; unexpected unindent.
/plotting.py:docstring of lightgbm.create_tree_digraph:26: WARNING: Unexpected indentation.
/plotting.py:docstring of lightgbm.create_tree_digraph:27: WARNING: Block quote ends without a blank line; unexpected unindent.
/plotting.py:docstring of lightgbm.plot_tree:31: WARNING: Unexpected indentation.
/plotting.py:docstring of lightgbm.plot_tree:32: WARNING: Block quote ends without a blank line; unexpected unindent.
looking for now-outdated files... none found

https://readthedocs.org/projects/lightgbm/builds/12463523/
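For reference, these two docutils warnings typically appear when an indented block (such as a bullet list or its continuation lines) is not separated from the preceding text by a blank line, or is not indented consistently. The sketch below shows one docstring layout that avoids them, assuming numpydoc-style sections. `describe_columns` is a hypothetical function used only to carry the docstring, and this layout issue may not be the exact cause of the warnings above.

```python
def describe_columns():
    """Return the names of the documented columns (hypothetical helper).

    Returns
    -------
    result : list of str
        Names of the documented columns.

        - ``split_feature``: string, name of the feature used for splitting.
          ``None`` for leaf nodes.
    """
    # Layout points illustrated by the docstring above:
    #   1. a blank line separates the paragraph from the bullet list;
    #   2. the continuation line is aligned with the text after "- ".
    return ["split_feature"]
```

The blank line before the list and the aligned continuation line are the two details that matter here; the function body itself is incidental.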

jameslamb · 2020-12-02T06:31:54Z

The build is passing (https://readthedocs.org/projects/lightgbm/builds/12467879/) and I think the docs look OK, so this is ready for review.

Please check that my descriptions are correct; I'm not sure about all of them.

StrikerRUS

@jameslamb Thanks for this great PR! I believe it will help a lot of users better understand the results returned by these methods.

Please check my comments below.

StrikerRUS · 2020-12-02T15:05:50Z

python-package/lightgbm/basic.py

+            - ``split_feature``: string, identifier for the feature used for splitting.
+              This is of the form ``"Column_i"``, where ``i`` refers to the number of the
+              feature. For example, the first feature would be ``"Column_0"``. ``None``
+              for leaf nodes.


This description is not always true. Feature names can have custom values passed by the user.

Suggested change

-            - ``split_feature``: string, identifier for the feature used for splitting.
-              This is of the form ``"Column_i"``, where ``i`` refers to the number of the
-              feature. For example, the first feature would be ``"Column_0"``. ``None``
-              for leaf nodes.
+            - ``split_feature``: string, name of the feature used for splitting.
+              ``None`` for leaf nodes.

jameslamb · 2020-12-03

OK, I'll change this then. But I don't think just saying "name of the feature used for splitting" is sufficient. We should document both cases.

What do you think about this?

string, name of the feature used for splitting. If feature_name was not provided to the fit() call that produced this model, this will be of the form "Column_i", where i refers to the number of the feature. So for example, the first feature would be "Column_0".

I think it's important to specify that these Column_i names are 0-based (since that might not be obvious), and that they're based on the order of the input data.

StrikerRUS · 2020-12-04 (edited)

@jameslamb

> was not provided to the fit() call that produced this model

Feature names can be set through many methods, not only fit() from the sklearn wrapper.

I believe it is enough to say that default feature names are of the form "Column_i" (zero-based). But please read my next paragraph first.

> We should document both cases.

Yeah, I support this idea! But I think this can be documented in a more "general" place, because default feature names are common to all wrappers.

jameslamb · 2020-12-04

Oh, great points! OK, I'll change it. I agree with you.
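To illustrate the convention discussed in this thread: when no feature names are supplied, the default names are zero-based and follow the order of the input columns. The sketch below is illustrative only. `default_feature_names` and `split_features_used` are hypothetical helpers (not part of LightGBM), and the second one assumes `numpy`, `pandas`, and `lightgbm` are installed.

```python
def default_feature_names(num_features):
    """Return zero-based default names: Column_0, Column_1, ...

    Hypothetical helper for illustration only. It mirrors the naming
    convention described in this thread, not LightGBM's internal code.
    """
    return [f"Column_{i}" for i in range(num_features)]


def split_features_used(feature_name="auto"):
    """Train a tiny model and return the set of ``split_feature`` values.

    Assumes numpy, pandas, and lightgbm are installed. They are imported
    lazily so the helper above can be used without them.
    """
    import numpy as np
    import lightgbm as lgb

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    y = X[:, 0] + 0.1 * rng.normal(size=200)

    # "auto" (the default) means no custom names are supplied for a NumPy array.
    train_set = lgb.Dataset(X, label=y, feature_name=feature_name)
    booster = lgb.train(
        {"objective": "regression", "verbose": -1, "min_data_in_leaf": 5},
        train_set,
        num_boost_round=2,
    )
    tree_df = booster.trees_to_dataframe()
    # Leaf rows have no split feature, so drop the missing values.
    return set(tree_df["split_feature"].dropna())
```

Under these assumptions, `split_features_used()` should contain only names such as `"Column_0"`, while `split_features_used(["age", "income", "tenure"])` should contain only the custom names.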

StrikerRUS

@jameslamb Great improvement! Very clean description!

@guolinke Could you please also check that field descriptions are correct?

guolinke

LGTM

github-actions · 2023-08-24T04:01:18Z

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

[python] more detailed docs for trees_to_dataframe(), create_tree_digraph(), plot_tree()

bd73a0e

jameslamb added the doc label Dec 1, 2020

jameslamb requested review from guolinke and StrikerRUS December 1, 2020 06:13

jameslamb requested review from chivee, henry0312 and wxchan as code owners December 1, 2020 06:13

fixing warnings

ae4fc07

fix warnings

02a20b5

jameslamb commented Dec 2, 2020

undo unnecessary space

56bee7c

StrikerRUS requested changes Dec 2, 2020

jameslamb and others added 6 commits December 3, 2020 16:18

Merge branch 'master' into docs/trees-to-dataframe

54e2dc9

Apply suggestions from code review

57da659

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

Merge branch 'docs/trees-to-dataframe' of github.com:microsoft/LightGBM into docs/trees-to-dataframe

4f65232

single line, better weight descriptions

6a79dfa

Apply suggestions from code review

a8921a2

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

column names

6efad43

jameslamb mentioned this pull request Dec 4, 2020

[docs] Add details on improving training speed #3628

Merged

StrikerRUS approved these changes Dec 6, 2020

Update python-package/lightgbm/plotting.py

88ad5e3

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

guolinke approved these changes Dec 7, 2020

StrikerRUS merged commit eb03501 into master Dec 7, 2020

StrikerRUS deleted the docs/trees-to-dataframe branch December 7, 2020 12:06

github-actions bot locked as resolved and limited conversation to collaborators Aug 24, 2023
