No native way to change colors for plots such as Waterfall plots #3266

eZWALT · 2023-09-18T10:08:55Z

Problem Description

Hello, I've been using shap plots for a while and I've always had the same problem. Most plots have a cmap parameter for changing the colors, but some of them, like shap.plots.waterfall and shap.plots.bar, don't support it. I've tried editing things and, to some extent, figured out a workaround to change the main colors. But there's no way to change things like the text color, the edge color (which is pretty relevant, because even if you change the main color, the border is still red) and other details. I suggest that a new parameter is introduced so that colors are parametrized. Thank you!

Alternative Solutions

import matplotlib
import matplotlib.pyplot as plt
import shap

# shap_values is assumed to be a shap.Explanation computed beforehand

default_pos_color = "#ff0051"
default_neg_color = "#008bfb"
positive_color = "#ca0020"
negative_color = "#92c5de"
shap.plots.waterfall(shap_values[0], show = False)
for fc in plt.gcf().get_children():
    for fcc in fc.get_children():
        if (isinstance(fcc, matplotlib.patches.FancyArrow)):
            if (matplotlib.colors.to_hex(fcc.get_facecolor()) == default_pos_color):
                fcc.set_facecolor(positive_color)
            elif (matplotlib.colors.to_hex(fcc.get_facecolor()) == default_neg_color):
                fcc.set_color(negative_color)
        elif (isinstance(fcc, plt.Text)):
            if (matplotlib.colors.to_hex(fcc.get_color()) == default_pos_color):
                fcc.set_color(positive_color)
            elif (matplotlib.colors.to_hex(fcc.get_color()) == default_neg_color):
                fcc.set_color(negative_color)
plt.show()

This is a workaround to change colors, but it doesn't work if you change the positive color to anything other than red, because the edge color is still red.

Additional Context

No response

Feature request checklist

I have checked the issue tracker for duplicate issues.
I'd be interested in making a PR to implement this feature


thatlittleboy · 2023-10-01T07:27:41Z

Seems like there is strong support in the community for this feature request, let's prioritize this for the following release. cc @connortann @dsgibbons

eZWALT · 2023-10-01T08:16:58Z

Thank you, @thatlittleboy. I would love to help implement some features if possible, but I don't understand the code structure at all (I haven't dug into it much, though). If I can help with something, just let me know. Discord: eZWALT

connortann · 2023-11-14T18:53:42Z

@CloseChoice , we have two PRs with possible implementations for this feature. I think it's quite an important design choice which direction we set, so I wanted to discuss some ideas with you before deciding how to proceed.

Design considerations

Some considerations that come to mind are:

Separation of config from code (e.g. perhaps using static config files)
Compatibility with the wider ecosystem (e.g. matplotlib)
Simplicity
Customisability

In terms of functionality, we might want users to be able to set a global "style" that is persistent and affects every plot they create.

Option 1: Config options for each plot

I think a config dataclass makes sense for a single plot, as in #3377. However, I'm a bit concerned that it could get rather messy if we extend this to handle other plots.

It becomes a bit tricky if we want to re-use certain parameters between plots, as it looks like each plot might have its own set of config options. Moreover, there is a really large number of possible style options, from fonts, text sizes, tick sizes, and so on. Some options would probably be shared between plots, whereas others are plot-specific.

If our config dataclasses accounted for every customisable option, they might become quite large and bloated. Another difficulty is that it pollutes the signature of the plotting functions, as this config object has to be passed to each plot.
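
To make the trade-off concrete, here is a minimal sketch of what a per-plot config dataclass could look like. The names `WaterfallConfig`, `edge_for` and the stand-in `waterfall` function are hypothetical and are not part of the shap API or of #3377; the two default colours are the ones quoted in the workaround above, and the stand-in only returns the colours it would use instead of drawing anything.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class WaterfallConfig:
    """Hypothetical per-plot config object (illustration only)."""

    positive_color: str = "#ff0051"  # default red quoted in the workaround
    negative_color: str = "#008bfb"  # default blue quoted in the workaround
    edge_color: Optional[str] = None  # None means "use the fill colour"

    def edge_for(self, fill: str) -> str:
        """Return the edge colour to use for a bar with the given fill."""
        return fill if self.edge_color is None else self.edge_color


def waterfall(values, config: Optional[WaterfallConfig] = None):
    """Stand-in for a plotting function: returns (value, fill, edge) per bar."""
    config = WaterfallConfig() if config is None else config
    bars = []
    for value in values:
        fill = config.positive_color if value >= 0 else config.negative_color
        bars.append((value, fill, config.edge_for(fill)))
    return bars


# Override two fields and keep the rest of the defaults.
custom = replace(WaterfallConfig(), positive_color="#ca0020", negative_color="#92c5de")
bars = waterfall([0.4, -0.1], config=custom)
default_bars = waterfall([0.4, -0.1])
```

Every further option (fonts, tick sizes, spines) would need another field here, and a second plot would need either its own class or a shared base, which is where the bloat concern comes from.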

Option 2: matplotlib's rcParams

What do you think about using the matplotlib rcParams (runtime params) system? This is the mechanism that seaborn uses, and I think it works well. It allows us to separate the style configuration from the plotting, and the plotting functions just pick up whatever style is currently defined.

From the matplotlib docs:

You can dynamically change the default rc (runtime configuration) settings in a python script or interactively from the python shell. All rc settings are stored in a dictionary-like variable called matplotlib.rcParams, which is global to the matplotlib package.

Seaborn offers a high-level API on top of this system: functions such as sns.set_style internally change the matplotlib rcParams object:

The style parameters control properties like the color of the background and whether a grid is enabled by default. This is accomplished using the matplotlib rcParams system.

To my mind, the advantage of using this kind of config system would be:

Consistency with wider ecosystem
Separation of concerns: "setting style options" vs "making plots"
Ability to override the global defaults, e.g. with a rcParams file
Ability to share config settings between different plots (e.g. primary color, secondary color...)
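
As a rough illustration of the mechanism, the sketch below uses only existing matplotlib calls (`matplotlib.rcParams` and `matplotlib.rc_context`). The plotting code never receives a style argument; it picks up whatever is currently set. The particular keys and values are arbitrary examples, not a proposal for which settings shap should read.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

# Arbitrary example overrides, using keys that matplotlib already defines.
style = {"text.color": "#333333", "axes.edgecolor": "#92c5de", "font.size": 9.0}

default_size = matplotlib.rcParams["font.size"]

# Inside the context the overrides are active; on exit they are restored.
with matplotlib.rc_context(style):
    fig, ax = plt.subplots()
    ax.barh(["feature A", "feature B"], [0.4, -0.1])
    size_inside = matplotlib.rcParams["font.size"]
    spine_color = ax.spines["left"].get_edgecolor()  # set from axes.edgecolor
    plt.close(fig)

size_after = matplotlib.rcParams["font.size"]
```

The same overrides could be applied globally with `matplotlib.rcParams.update(style)`, or loaded from a style file, without touching the plotting code.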

connortann · 2023-11-14T19:08:37Z

I note that matplotlib does not allow us to define custom parameters, as far as I can see. So, we could either make use of existing settings, or perhaps create our own rcParams singleton.

I note that another plotting library, ArviZ, implemented an rcParams system that follows the behaviour of matplotlib. We could perhaps take that as inspiration:

arviz-devs/arviz#734
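
For illustration, a custom rcParams singleton could be sketched roughly as follows. This is not the ArviZ implementation and not an existing shap API: the class, the key names (`plot.positive_color`, `plot.negative_color`) and the validator are all assumptions made up for this example. The idea is a dict-like object that rejects unknown keys and validates values on assignment, plus a context manager for temporary overrides.

```python
from collections.abc import MutableMapping
from contextlib import contextmanager


def _validate_color(value):
    """Very loose check for the sketch; a real validator would be stricter."""
    if not isinstance(value, str) or not value:
        raise ValueError(f"expected a colour string, got {value!r}")
    return value


class RcParams(MutableMapping):
    """Dict-like store that only accepts known keys and validated values."""

    # Hypothetical keys, chosen for this example only.
    _validators = {
        "plot.positive_color": _validate_color,
        "plot.negative_color": _validate_color,
    }

    def __init__(self, defaults):
        self._data = {}
        self.update(defaults)  # goes through __setitem__, so defaults are validated

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        if key not in self._validators:
            raise KeyError(f"{key!r} is not a valid rc parameter")
        self._data[key] = self._validators[key](value)

    def __delitem__(self, key):
        raise TypeError("rc parameters cannot be deleted")

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


# Module-level singleton, analogous to matplotlib.rcParams.
rcParams = RcParams({"plot.positive_color": "#ff0051", "plot.negative_color": "#008bfb"})


@contextmanager
def rc_context(overrides):
    """Temporarily apply overrides, restoring the previous values on exit."""
    saved = dict(rcParams)
    try:
        rcParams.update(overrides)
        yield rcParams
    finally:
        rcParams.update(saved)


with rc_context({"plot.positive_color": "#ca0020"}):
    inside = rcParams["plot.positive_color"]
after = rcParams["plot.positive_color"]

rejected = False
try:
    rcParams["plot.colour"] = "red"  # unknown key
except KeyError:
    rejected = True
```

Because the set of valid keys lives in one place, it can also serve as the documentation of what is configurable.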

CloseChoice · 2023-11-15T20:20:10Z

@connortann
I can see your point and also see the downsides of the dataclass solution. Here is a breakdown of my thoughts:

Option 1: Config options for each plot

this allows us and the user to specifically document what can be changed by which parameter (in contrast to rcParams). E.g. there are exactly four parameters that we allow to change in the waterfall plots (of course this can be extended in the future).
this option would give us a transparent overview of all styling options, and we can then generalize styling options from there. Of course this might involve breaking changes, which should be done with care, but it allows us to adopt the styling feature incrementally.
I do not really see the issue (not considering my PR here) of this polluting the signature. We can basically just allow the dataclass + one other option, which should not lead to highly bloated signatures (or avoid type hints here altogether).

Option 2: some sort of rcParams

since it is not obvious which plot supports rcParams settings it is hard to do this incrementally without confusing users. IMO we would need a major effort to implement this in most/all of our plots at once
when I am working with rcParams I am often confused which setting to use. So the lack of transparency is an issue IMO.
consistency with the ecosystem is an absolute plus
reusability is IMO in most cases a plus but it is also shared state between plots which as you pointed out might be hard to get right (which style settings should be applied for all plots, which shouldn't).

Why don't we go for a dataclass-ish solution where we initialize the dataclass values with settings from rcParams?
What would that look like? The style (or rc) argument in the plot is optional and defaults to None; if it is not set, the dataclass is instantiated with the matplotlib rcParam values. Then we can do things incrementally and also take global styling into account. Of course, this would mean one dataclass for each plot.
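
A minimal sketch of this idea, under stated assumptions: the class name `WaterfallStyle`, its fields and the stand-in `waterfall` function are hypothetical, and the choice of which fields map to which rcParams keys is arbitrary. Only `matplotlib.rcParams` and `matplotlib.rc_context` are real matplotlib APIs here.

```python
from dataclasses import dataclass, field
from typing import Optional

import matplotlib


def _from_rc(key):
    """Field whose default is read from matplotlib.rcParams at instantiation time."""
    return field(default_factory=lambda: matplotlib.rcParams[key])


@dataclass
class WaterfallStyle:
    """Hypothetical style object; field names are illustrative only."""

    positive_color: str = "#ff0051"  # plot-specific, no rcParams equivalent
    negative_color: str = "#008bfb"  # plot-specific, no rcParams equivalent
    text_color: str = _from_rc("text.color")
    font_size: float = _from_rc("font.size")


def waterfall(values, style: Optional[WaterfallStyle] = None):
    """Stand-in for a plotting function: returns the style it would draw with."""
    return WaterfallStyle() if style is None else style


# No style passed: the global rcParams (here temporarily overridden) are used.
with matplotlib.rc_context({"text.color": "#333333", "font.size": 9.0}):
    from_global = waterfall([0.4, -0.1])

# Explicit style passed: it overrides the global setting for that field.
explicit = waterfall([0.4, -0.1], style=WaterfallStyle(text_color="black"))
```

Using `default_factory` matters: a plain default would capture the rcParams value once at import time and ignore later changes to the global style.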

connortann · 2023-11-29T15:14:33Z

Related to the rcParams discussion: #1430

connortann · 2023-11-30T11:10:30Z

I found quite a few issues from the past few years relating to the plotting API. I created a label "visualisation" to track these, and a "meta-issue" to discuss the over-arching effort: #3411

One thing that strikes me is that there is a desire not just to be able to customise elements such as colours, but also to have consistency between plots in things such as fonts, tick params & spines. I also think having separation of concerns is going to be really important, as I can see this getting very complex very quickly.

So, how about a hybrid approach of the options above: custom style options, implemented as a global setting or context manager?

Inspired by seaborn's example, the API could be like this:

# Global settings
shap.plots.set_style(...)
shap.plots.bar(shap_values)
shap.plots.beeswarm(shap_values)

# Or as a context manager
with shap.plots.set_style(...):
    shap.plots.bar(shap_values)
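
One way a single `set_style` could support both usages is to apply the overrides immediately and return a handle that restores the previous style when used in a `with` block. The sketch below is a hypothetical stand-alone prototype, not shap code: the option names, `get_style` and the `_StyleHandle` class are assumptions for illustration.

```python
# Hypothetical style options with the default colours quoted in this thread.
_DEFAULTS = {"positive_color": "#ff0051", "negative_color": "#008bfb"}
_current = dict(_DEFAULTS)


class _StyleHandle:
    """Returned by set_style; restores the previous style on leaving a `with` block."""

    def __init__(self, previous):
        self._previous = previous

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        _current.clear()
        _current.update(self._previous)
        return False  # never swallow exceptions


def set_style(**overrides):
    """Apply overrides now; optionally use the return value as a context manager."""
    unknown = set(overrides) - set(_DEFAULTS)
    if unknown:
        raise KeyError(f"unknown style options: {sorted(unknown)}")
    previous = dict(_current)
    _current.update(overrides)
    return _StyleHandle(previous)


def get_style():
    """What a plotting function would call to read the active style."""
    return dict(_current)


# As a context manager: the change is undone on exit.
with set_style(positive_color="#ca0020"):
    inside = get_style()["positive_color"]
after_with = get_style()["positive_color"]

# As a plain call: the change persists.
set_style(negative_color="#92c5de")
after_global = get_style()["negative_color"]
```

Seaborn keeps these two roles in separate functions (`set_style` for the global setting, `axes_style` for the context manager), which avoids the slightly surprising "apply on call, undo on exit" behaviour of this prototype.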

CloseChoice · 2023-11-30T16:16:59Z

@connortann This looks like a good approach to me. I think we might need a prototype for this and then decide on how to roll this out for other plots.

CloseChoice · 2023-12-06T13:30:30Z

related discussion #2566
related issues: #2767

RiccardoGTolli · 2024-03-19T20:10:04Z

import matplotlib
import matplotlib.pyplot as plt
import shap

# Change color (shap_values_obs is assumed to be a single-observation shap.Explanation)
default_pos_color = "#ff0051"
default_neg_color = "#008bfb"
positive_color = "#ca0020"
negative_color = "#92c5de"
shap.plots.waterfall(shap_values_obs, show=False)
for fc in plt.gcf().get_children():
    for fcc in fc.get_children():
        if isinstance(fcc, matplotlib.patches.FancyArrow):
            if matplotlib.colors.to_hex(fcc.get_facecolor()) == default_pos_color:
                fcc.set_facecolor(positive_color)
                fcc.set_edgecolor(positive_color)  # Set edge color for positive bars
            elif matplotlib.colors.to_hex(fcc.get_facecolor()) == default_neg_color:
                fcc.set_facecolor(negative_color)
                fcc.set_edgecolor(negative_color)  # Set edge color for negative bars
        elif isinstance(fcc, plt.Text):
            if matplotlib.colors.to_hex(fcc.get_color()) == default_pos_color:
                fcc.set_color(positive_color)
            elif matplotlib.colors.to_hex(fcc.get_color()) == default_neg_color:
                fcc.set_color(negative_color)
plt.show()

eZWALT added the enhancement Indicates new feature requests label Sep 18, 2023

eZWALT changed the title ~~ENH:~~ No native way to change colors for plots such as Waterfall plots Sep 20, 2023

thatlittleboy added this to the 0.44.0 milestone Oct 1, 2023

thatlittleboy added the todo Indicates issues that should be excluded from being marked as stale label Oct 5, 2023

This was referenced Nov 2, 2023

add waterfall color config for waterfall plot #3377

Open

#2806 waterfall color customisation #2807

Open

connortann added the visualization Relating to plotting label Nov 29, 2023

connortann mentioned this issue Nov 29, 2023

[Meta-issue] Improve plotting API for consistency #3411

Open

23 tasks

connortann modified the milestones: 0.44.0, 0.45.0 Nov 29, 2023

connortann mentioned this issue Mar 1, 2024

Expose arg s for marker size in beeswarm #3530

Merged

2 tasks

connortann modified the milestones: 0.45.0, 0.46.0 Mar 4, 2024
