No native way to change colors for plots such as Waterfall plots #3266

Open
2 tasks done
eZWALT opened this issue Sep 18, 2023 · 10 comments · May be fixed by #3377
Labels
enhancement Indicates new feature requests todo Indicates issues that should be excluded from being marked as stale visualization Relating to plotting

@eZWALT

eZWALT commented Sep 18, 2023

Problem Description

Hello, I've been using shap plots for a while and I've always had the same problem. Most plots have a cmap parameter for changing the colors, but some of them, like shap.plots.waterfall and shap.plots.bar, don't support it. I've tried to edit things by hand and, to some extent, figured out a workaround to change the main colors. But there's no way to change things like the text color or the edge color (which is pretty relevant, because even if you change the main color, the border is still red) and other details. I suggest that a new parameter be introduced so that the colors are parametrized. Thank you!

Alternative Solutions

import matplotlib
import matplotlib.pyplot as plt
import shap

# shap_values is assumed to be a shap.Explanation computed beforehand.
default_pos_color = "#ff0051"  # default color of positive contributions
default_neg_color = "#008bfb"  # default color of negative contributions
positive_color = "#ca0020"
negative_color = "#92c5de"

shap.plots.waterfall(shap_values[0], show=False)
# Walk the figure's artists and swap the default colors for the custom ones.
for fc in plt.gcf().get_children():
    for fcc in fc.get_children():
        if isinstance(fcc, matplotlib.patches.FancyArrow):
            if matplotlib.colors.to_hex(fcc.get_facecolor()) == default_pos_color:
                fcc.set_facecolor(positive_color)  # the edge color is left untouched
            elif matplotlib.colors.to_hex(fcc.get_facecolor()) == default_neg_color:
                fcc.set_color(negative_color)  # sets both face and edge color
        elif isinstance(fcc, plt.Text):
            if matplotlib.colors.to_hex(fcc.get_color()) == default_pos_color:
                fcc.set_color(positive_color)
            elif matplotlib.colors.to_hex(fcc.get_color()) == default_neg_color:
                fcc.set_color(negative_color)
plt.show()

This is a workaround to change colors, but it doesn't work if you change the positive color to anything other than red, because the edge color is still red.

Additional Context

No response

Feature request checklist

  • I have checked the issue tracker for duplicate issues.
  • I'd be interested in making a PR to implement this feature
@eZWALT eZWALT added the enhancement Indicates new feature requests label Sep 18, 2023
@eZWALT eZWALT changed the title ENH: No native way to change colors for plots such as Waterfall plots Sep 20, 2023
@thatlittleboy
Collaborator

thatlittleboy commented Oct 1, 2023

Seems like there is strong support in the community for this feature request, let's prioritize this for the following release. cc @connortann @dsgibbons

@thatlittleboy thatlittleboy added this to the 0.44.0 milestone Oct 1, 2023
@eZWALT
Author

eZWALT commented Oct 1, 2023

Thank you, @thatlittleboy. I would love to help implement some features if possible, but I don't understand the code structure at all (I haven't dug into it much, though). If I can help with something, just let me know. Discord: eZWALT

@thatlittleboy thatlittleboy added the todo Indicates issues that should be excluded from being marked as stale label Oct 5, 2023
@connortann
Collaborator

connortann commented Nov 14, 2023

@CloseChoice , we have two PRs with possible implementations for this feature. I think it's quite an important design choice which direction we set, so I wanted to discuss some ideas with you before deciding how to proceed.

Design considerations

Some considerations that come to mind are:

  • Separation of config from code (e.g. perhaps using static config files)
  • Compatibility with the wider ecosystem (e.g. matplotlib)
  • Simplicity
  • Customisability

In terms of functionality, we might want users to be able to set a global "style" that is persistent and affects every plot they create.

Option 1: Config options for each plot

I think a config dataclass makes sense for a single plot, as in #3377. However, I'm a bit concerned that it could get rather messy if we extend this to handle other plots.

It becomes a bit tricky if we want to re-use certain parameters between plots, as it looks like each plot might have its own set of config options. Moreover, there is a really large number of possible style options, from fonts, text sizes, tick sizes, and so on. Some options would probably be shared between plots, whereas others are plot-specific.

If our config dataclasses accounted for every customisable option, they might become quite large and bloated. Another difficulty is that it pollutes the signature of the plotting functions, as this config object has to be passed to each plot.
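
To make Option 1 concrete, here is a minimal sketch of what a per-plot config dataclass might look like. The class name `WaterfallStyle`, its fields, and the stand-in function `describe` are hypothetical and are not taken from #3377; the default colors are the ones that appear in the workaround earlier in this thread.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class WaterfallStyle:
    """Hypothetical style options for one plot type (not shap's actual API)."""

    positive_color: str = "#ff0051"  # default seen in the workaround above
    negative_color: str = "#008bfb"  # default seen in the workaround above


def describe(style=None):
    """Stand-in for a plotting function that takes an optional style object."""
    style = style or WaterfallStyle()
    return f"pos={style.positive_color} neg={style.negative_color}"


# Override a single option and keep the remaining defaults.
custom = replace(WaterfallStyle(), positive_color="#ca0020")
print(describe())
print(describe(custom))
```

Every further option (fonts, tick sizes, edge colors) would be another field here, and each plot type would need its own class, which is the growth concern raised above.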

Option 2: matplotlib's rcParams

What do you think about using the matplotlib rcParams (runtime params) system? This is the mechanism that seaborn uses, and I think it works well. It allows us to separate the style configuration from the plotting, and the plotting functions just pick up whatever style is currently defined.

From the matplotlib docs:

You can dynamically change the default rc (runtime configuration) settings in a python script or interactively from the python shell. All rc settings are stored in a dictionary-like variable called matplotlib.rcParams, which is global to the matplotlib package.

Seaborn offers a high-level API on top of this system: functions such as sns.set_style internally change the matplotlib rcParams object. From the seaborn docs:

The style parameters control properties like the color of the background and whether a grid is enabled by default. This is accomplished using the matplotlib rcParams system.

To my mind, the advantage of using this kind of config system would be:

  • Consistency with wider ecosystem
  • Separation of concerns: "setting style options" vs "making plots"
  • Ability to override the global defaults, e.g. with a rcParams file
  • Ability to share config settings between different plots (e.g. primary color, secondary color...)
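
As a small illustration of the mechanism, the sketch below sets a global default through `matplotlib.rcParams` and then overrides it temporarily with `matplotlib.rc_context`. The particular keys and values are arbitrary examples, not a proposal for which settings shap should read.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Global change: every figure created afterwards picks this up.
matplotlib.rcParams["axes.edgecolor"] = "gray"

# Temporary override: the previous values are restored when the block exits.
with matplotlib.rc_context({"axes.edgecolor": "black", "font.size": 14}):
    fig, ax = plt.subplots()
    inside = matplotlib.rcParams["axes.edgecolor"]
    plt.close(fig)

after = matplotlib.rcParams["axes.edgecolor"]
print(inside, after)
```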

@connortann
Collaborator

connortann commented Nov 14, 2023

I note that matplotlib does not allow us to define custom parameters, as far as I can see. So, we could either make use of existing settings, or perhaps create our own rcParams singleton.

Another plotting library, ArviZ, implemented an rcParams system that follows the behaviour of matplotlib. We could perhaps take that as inspiration:

arviz-devs/arviz#734
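
The sketch below first shows matplotlib rejecting an unknown key, then outlines one way a library-owned parameter store could validate its own keys. `StyleParams`, `style_params`, and the key names are hypothetical; this is not how ArviZ or shap implement it.

```python
import matplotlib

# matplotlib validates keys, so a custom name is rejected with a KeyError.
try:
    matplotlib.rcParams["shap.primary_color"] = "#ff0051"
    accepted = True
except KeyError:
    accepted = False


class StyleParams(dict):
    """Hypothetical dict-like store that only accepts known keys."""

    _defaults = {"primary_color": "#ff0051", "secondary_color": "#008bfb"}

    def __init__(self):
        super().__init__(self._defaults)

    def __setitem__(self, key, value):
        # Note: dict.update() bypasses this check in this simplified sketch.
        if key not in self._defaults:
            raise KeyError(f"{key!r} is not a valid style parameter")
        super().__setitem__(key, value)


# Module-level instance playing the role of the singleton.
style_params = StyleParams()
style_params["primary_color"] = "#ca0020"
```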

@CloseChoice
Collaborator

CloseChoice commented Nov 15, 2023

@connortann
I can see your point and also see the downsides of the dataclass solution. Here is a breakdown of my thoughts:

Option 1: Config options for each plot

  • this allows us and the user to specifically document what can be changed by which parameter (in contrast to rcParams). E.g. there are exactly four parameters that we allow to be changed in the waterfall plots (of course this can be extended in the future).
  • this option would give us a transparent overview of all styling options, and then we can generalize styling options from there. Of course this might involve breaking changes, which should be done with care, but it allows us to adopt the styling feature incrementally.
  • I do not really see the issue (not considering my PR here) that this is polluting the signature. We can basically just allow the dataclass + one other option, which should not lead to highly bloated signatures (or avoid type hints here altogether).

Option 2: some sort of rcParams

  • since it is not obvious which plots support rcParams settings, it is hard to do this incrementally without confusing users. IMO we would need a major effort to implement this in most/all of our plots at once
  • when I am working with rcParams I am often confused about which setting to use. So the lack of transparency is an issue IMO.
  • consistency with the ecosystem is an absolute plus
  • reusability is IMO in most cases a plus but it is also shared state between plots which as you pointed out might be hard to get right (which style settings should be applied for all plots, which shouldn't).

Why don't we go for a dataclass-ish solution where we initialize the dataclass values with settings from rcParams?
What would that look like? The style (or rc) argument in the plot is optional and defaults to None. If it is not set, the dataclass is instantiated with the matplotlib rcParam values. Then we can do things incrementally and also take global styling into account. Of course this would mean one dataclass for each plot.
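
A rough sketch of that idea, assuming a hypothetical `WaterfallStyle` dataclass and a stand-in `waterfall` function (neither is shap's real API): each field reads its default from `matplotlib.rcParams` at instantiation time, so global styling is picked up unless the caller passes an explicit style.

```python
from dataclasses import dataclass, field

import matplotlib


def _rc(key):
    """Return a dataclass field whose default is read from rcParams on instantiation."""
    return field(default_factory=lambda: matplotlib.rcParams[key])


@dataclass
class WaterfallStyle:
    """Hypothetical per-plot style; unset fields fall back to rcParams."""

    text_color: str = _rc("text.color")
    font_size: float = _rc("font.size")


def waterfall(style=None):
    """Stand-in for a plot function with an optional `style` argument."""
    if style is None:
        style = WaterfallStyle()  # picks up the current global styling
    return style


# Global styling is honoured when no style is passed...
with matplotlib.rc_context({"text.color": "navy"}):
    from_rc = waterfall()

# ...and explicit values win when one is.
explicit = waterfall(WaterfallStyle(text_color="black"))
```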

@connortann
Collaborator

Related to the rcParams discussion: #1430

@connortann connortann modified the milestones: 0.44.0, 0.45.0 Nov 29, 2023
@connortann
Collaborator

connortann commented Nov 30, 2023

I found quite a few issues from the past few years relating to the plotting API. I created a label "visualisation" to track these, and a "meta-issue" to discuss the over-arching effort: #3411

One thing that strikes me is that there is a desire not just to be able to customise elements such as colours, but also to have consistency between plots in things such as fonts, tick params & spines. I also think having separation of concerns is going to be really important, as I can see this getting very complex very quickly.

So, how about a hybrid approach of the options above: custom style options, implemented as a global setting or context manager?

Inspired by seaborn's example, the API could be like this:

# Global settings
shap.plots.set_style(...)
shap.plots.bar(shap_values)
shap.plots.beeswarm(shap_values)

# Or as a context manager
with shap.plots.set_style(...):
    shap.plots.bar(shap_values)
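
One possible way to get both behaviours from a single call is to apply the settings immediately and return an object that restores the previous values when used in a `with` block. The sketch below does this on top of `matplotlib.rcParams`; `set_style` and `_StyleHandle` are hypothetical names, and accepting a plain dict of rc keys is an assumption made for brevity.

```python
import matplotlib


class _StyleHandle:
    """Applies rc settings immediately; restores the old values on exit."""

    def __init__(self, rc):
        # Remember the current values so a `with` block can undo the change.
        self._previous = {key: matplotlib.rcParams[key] for key in rc}
        matplotlib.rcParams.update(rc)

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        matplotlib.rcParams.update(self._previous)
        return False  # do not swallow exceptions


def set_style(rc):
    """Hypothetical helper: global setter that also works as a context manager."""
    return _StyleHandle(rc)


# Global settings: the change persists.
set_style({"font.size": 12})

# Or as a context manager: the change is undone afterwards.
with set_style({"font.size": 16}):
    inside = matplotlib.rcParams["font.size"]
after = matplotlib.rcParams["font.size"]
```

If the return value is ignored, the call behaves like a global setter; the restore step only runs when the handle is used in a `with` statement.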

@CloseChoice
Collaborator

@connortann This looks like a good approach to me. I think we might need a prototype for this and then decide on how to roll this out for other plots.

@CloseChoice
Collaborator

CloseChoice commented Dec 6, 2023

related discussion #2566
related issues: #2767

@RiccardoGTolli

RiccardoGTolli commented Mar 19, 2024

import matplotlib
import matplotlib.pyplot as plt
import shap

# Change color (shap_values_obs is assumed to be the Explanation for one observation)
default_pos_color = "#ff0051"
default_neg_color = "#008bfb"
positive_color = "#ca0020"
negative_color = "#92c5de"
shap.plots.waterfall(shap_values_obs, show=False)
for fc in plt.gcf().get_children():
    for fcc in fc.get_children():
        if isinstance(fcc, matplotlib.patches.FancyArrow):
            if matplotlib.colors.to_hex(fcc.get_facecolor()) == default_pos_color:
                fcc.set_facecolor(positive_color)
                fcc.set_edgecolor(positive_color)  # Set edge color for positive bars
            elif matplotlib.colors.to_hex(fcc.get_facecolor()) == default_neg_color:
                fcc.set_facecolor(negative_color)
                fcc.set_edgecolor(negative_color)  # Set edge color for negative bars
        elif isinstance(fcc, plt.Text):
            if matplotlib.colors.to_hex(fcc.get_color()) == default_pos_color:
                fcc.set_color(positive_color)
            elif matplotlib.colors.to_hex(fcc.get_color()) == default_neg_color:
                fcc.set_color(negative_color)
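
The same workaround can be wrapped in a reusable helper. The sketch below is one way to do that; `recolor` is a hypothetical name, and it uses `Figure.findobj`, which searches the whole artist tree rather than only two levels. The demo runs on a plain matplotlib figure so that it does not depend on a fitted model; with shap, the figure would come from `plt.gcf()` after calling the plot with `show=False`.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib.colors import to_hex
from matplotlib.patches import FancyArrow
from matplotlib.text import Text


def recolor(fig, mapping):
    """Swap colors on arrows and text in `fig`.

    `mapping` maps old hex colors to new ones.
    Returns the number of artists that were changed.
    """
    changed = 0
    for artist in fig.findobj(lambda a: isinstance(a, (FancyArrow, Text))):
        if isinstance(artist, FancyArrow):
            new = mapping.get(to_hex(artist.get_facecolor()))
            if new is not None:
                artist.set_facecolor(new)
                artist.set_edgecolor(new)  # keep the border in sync
                changed += 1
        else:
            new = mapping.get(to_hex(artist.get_color()))
            if new is not None:
                artist.set_color(new)
                changed += 1
    return changed


# Demo on a plain figure that mimics one bar and one label.
fig, ax = plt.subplots()
arrow = ax.add_patch(
    FancyArrow(0, 0, 1, 0, width=0.3, facecolor="#ff0051", edgecolor="#ff0051")
)
label = ax.text(0.5, 0.5, "+0.12", color="#008bfb")
n_changed = recolor(fig, {"#ff0051": "#ca0020", "#008bfb": "#92c5de"})
```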

5 participants