Show leaf values, i.e. leaf weights, for classification trees #239

mepland · 2023-01-07T00:45:52Z

Instead of printing the argmax predicted class name at each leaf for classification trees, allow the user to show the numeric value, i.e. weight, of the leaf as is done for regression trees. We may want to retain the current argmax class name behavior as an option for the user.

Somewhat related to #178

Current relevant code: trees.py

    prediction = node.prediction_name()

    if leaftype == 'pie':
        _draw_piechart(counts, size=size, colors=colors, filename=filename, label=f"n={nsamples}\n{prediction}",
                      graph_colors=graph_colors, fontname=fontname)
    elif leaftype == 'barh':
        _draw_barh_chart(counts, size=size, colors=colors, filename=filename, label=f"n={nsamples}\n{prediction}",
                      graph_colors=graph_colors, fontname=fontname)

For a get_prediction() example, see the sklearn_decision_trees.py implementation:

    def get_prediction(self, id):
        if self.is_classifier():
            counts = self.tree_model.tree_.value[id][0]
            return np.argmax(counts)
        else:
            return self.tree_model.tree_.value[id][0][0]
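
For illustration, here is a minimal sketch of what a value-returning counterpart could look like for sklearn classifiers. The helper name `get_leaf_class_probabilities` is hypothetical and not part of dtreeviz; it only relies on the public `tree_.value` and `tree_.children_left` arrays of a fitted `DecisionTreeClassifier`. Normalizing by the sum is an assumption made so the result is a class fraction whether the stored values are counts or already proportions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier


def get_leaf_class_probabilities(tree_model, node_id):
    """Hypothetical helper: per-class fractions stored at one node of a fitted sklearn classifier."""
    # tree_.value[node_id][0] holds one entry per class. Depending on the
    # scikit-learn version these are weighted counts or already fractions,
    # so dividing by the sum covers both cases.
    values = np.asarray(tree_model.tree_.value[node_id][0], dtype=float)
    return values / values.sum()


X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Leaves are the nodes without children (children_left == -1).
leaf_ids = np.where(clf.tree_.children_left == -1)[0]
for leaf_id in leaf_ids:
    probs = get_leaf_class_probabilities(clf, leaf_id)
    # Print the leaf id, the argmax class index (current behavior), and the numeric values.
    print(leaf_id, int(np.argmax(probs)), np.round(probs, 3))
```

The argmax of the returned array is the class that `get_prediction()` returns today, so the class name could stay the default and the numeric values could be shown only on request.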

mepland · 2023-01-09T04:11:47Z

Also discussed here.

parrt · 2023-01-14T19:09:59Z

Yeah, let's see what @tlapusan thinks about creating a special function for classifiers, depending on the decision tree library, that returns a value to display.

tlapusan · 2023-01-18T17:14:17Z

The most important information of a leaf to display is the predicted class and after that the probability of the predictions, which shows the confidence of the predicted class. So IMO, we can add an option to display the probability, but not making it the default one. Indirectly... the user can deduce the probability of the predicted class by looking at the leaf pie chart...

All the dtreeviz visualisations were created to interpret trees that are independent (not interconnected), like a tree from a random forest... Indeed, xgboost is a little different and we can make some adjustments for it.

I'm on vacation this week, but I will think about it while skiing ⛷️.

mepland · 2023-01-19T23:55:33Z

> So IMO, we can add an option to display the probability, but not making it the default one.

Totally happy to have the class name remain the default behavior. I would just like to extend it to also be able to show the leaf values if the user wants to enable them.

> Indirectly... the user can deduce the probability of the predicted class by looking at the leaf pie chart...

For most tree models yes, but the FIGS model of csinva/imodels does not use the leaf positive class fraction for its leaf values; instead they are the residuals of the other trees in the ensemble for the points in the leaf. Plus it is always good to have a quantitative display option, rather than trying to read the leaf graph by eye for the % positive.

parrt · 2023-01-22T18:08:53Z

Rather than a user having to specify a dictionary, I think it's better if we come up with a function that is generic across libraries and returns a value that makes sense for that library. Then there is an option to flip it to show that value.

Or, we allow a lambda or function as an argument that gets applied to each leaf node to get a value.
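
A rough sketch of that second option, under stated assumptions: `LeafInfo`, `leaf_label`, and the `leaf_value_fn` parameter are hypothetical names invented here for illustration and are not existing dtreeviz APIs. The point is only that the default label keeps the class name, while a user-supplied callable can swap in a numeric value per leaf.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class LeafInfo:
    """Hypothetical, library-neutral view of a leaf (not a dtreeviz class)."""
    nsamples: int
    class_names: Sequence[str]
    class_counts: Sequence[float]

    def prediction_name(self) -> str:
        # Name of the class with the largest count (the argmax behavior).
        best = max(range(len(self.class_counts)), key=lambda i: self.class_counts[i])
        return self.class_names[best]


def leaf_label(leaf: LeafInfo,
               leaf_value_fn: Optional[Callable[[LeafInfo], float]] = None) -> str:
    """Build the text shown under a leaf; without a callable the class name is kept."""
    if leaf_value_fn is None:
        return f"n={leaf.nsamples}\n{leaf.prediction_name()}"
    # The callable decides which number makes sense for the model at hand.
    return f"n={leaf.nsamples}\n{leaf_value_fn(leaf):.3f}"


leaf = LeafInfo(nsamples=40, class_names=["no", "yes"], class_counts=[10, 30])

print(leaf_label(leaf))  # default: class name
print(leaf_label(leaf, lambda lf: max(lf.class_counts) / sum(lf.class_counts)))  # fraction of the top class
```

With a callable, a model whose leaf values are not class fractions could pass its own function instead of the fraction used in this example.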

mepland · 2023-01-22T18:58:06Z

Yeah, makes sense - that is the elegant solution. I will work on writing up an implementation for sklearn.

mepland mentioned this issue Jan 7, 2023

Graphical tweaks and bug fixes #200

Closed

parrt added the enhancement label Jan 8, 2023

mepland mentioned this issue Feb 14, 2023

FIGS dtreeviz support broken csinva/imodels#161

Closed
