Improve multiple semantic legend titles #1440

mwaskom · 2018-05-30T14:02:48Z

Currently the legends for lineplot (#1285) and scatterplot (#1436) don't indicate which variable each of the levels represents. I punted on this but it needs a solution. One option is to cram everything into the legend title but that's not ideal. I think doing something with legend entries that have invisible handles and then bumping the legend text so it looks centered would work:

import matplotlib.pyplot as plt
from matplotlib import transforms

f, ax = plt.subplots()

h1 = ax.scatter([], [], s=0, label="Title 1")
h2 = ax.scatter([], [], label="level 1")
h3 = ax.scatter([], [], s=0, label="Title 2")
h4 = ax.scatter([], [], label="level 2")

legend = ax.legend()
for t in legend.get_texts()[::2]:
    xfm = transforms.offset_copy(t.get_transform(), ax.figure, x=-10, units="points")
    t.set_transform(xfm)

But getting this to be really robust is going to be tricky.

CRiddler · 2018-06-07T22:45:37Z

Had my eye on this one for a few days now, and I just got some time to look into it, so here are some thoughts.

I think doing something with legend entries that have invisible handles and then bumping the legend text so it looks centered would work

This was my thought at first too. However, even when you specify s=0 and/or marker="", a legend marker is still drawn and is technically visible. At least based on your example, instead of looking for a marker that's invisible, we can look for a marker with a size of 0 (or an alpha of 0 if we want to go that route) and set it to be invisible. Once we set the marker to be invisible, we can set the legend column(s) to have a center align and that will properly align the elements that no longer have a visible marker. Copy paste example:

import matplotlib.pyplot as plt
import matplotlib

f, ax = plt.subplots()

h1 = ax.scatter([], [], alpha=0, label="Title 1") #example with alpha=0
h2 = ax.scatter([], [], label="level 1")
h3 = ax.scatter([], [], s=0, label="Title 2") #example with size=0
h4 = ax.scatter([], [], label="level 2")

legend = ax.legend()

for vpack in legend._legend_handle_box.get_children():
    vpack.align = 'center'
    
    for hpack in vpack.get_children():
        draw_area, text_area = hpack.get_children()
        for collection in draw_area.get_children():
            alpha = collection.get_alpha()
            sizes = collection.get_sizes()
            if alpha == 0 or all(sizes == 0):
                draw_area.set_visible(False)

plt.show()

mwaskom · 2018-06-13T19:38:15Z

That's a nice idea but unfortunately it won't work because we can't touch the private legend._legend_handle_box attribute. Is there a way to get at those boxes through the public API?

mwaskom · 2018-06-13T21:36:04Z

It's occurred to me that any tricks we do with the legend drawn by the axes-level plots won't propagate to the legends drawn by FacetGrid/PairGrid, so I'm starting to come to terms with adding labels for the semantics but just leaving them left-justified in the right column.

Perhaps there could be a special glyph to use that will identify what is a section title rather than an entry? Not sure what, though.

CRiddler · 2018-06-14T16:53:38Z

It's occurred to me that any tricks we do with the legend drawn by the axes-level plots won't propagate to the legends drawn by FacetGrid/PairGrid

Agreed. However since you end up building the legend by hand and adding it in for the user for both your axes level plots and your FacetGrids could we supply a post-hoc way of centering labels?

#Ex1 posthoc center legend labels that have alpha=0
def center_subtitles(legend):
    """Centers legend labels with alpha=0
    """
    vpackers = legend.findobj(matplotlib.offsetbox.VPacker)
    for vpack in vpackers[:-1]: #Last vpack will be the title box
        vpack.align = 'center'
        for hpack in vpack.get_children():
            draw_area, text_area = hpack.get_children()
            for collection in draw_area.get_children():
                alpha = collection.get_alpha()
                sizes = collection.get_sizes()
                if alpha == 0 or all(sizes == 0):
                    draw_area.set_visible(False)
    return legend


def ex1_mpl_posthoc_centering():
    """Explicit call to center the legend subtitles
    """
    f, ax = plt.subplots()
    
    h1 = ax.scatter([], [], alpha=0, label="Title 1")
    h2 = ax.scatter([], [], label="level 1")
    h3 = ax.scatter([], [], label="level 2")
    h4 = ax.scatter([], [], alpha=0, label="Title 2")
    h5 = ax.scatter([], [], label="level 1")
    
    ax.set_title("Ex1. outside func to center legend\nafter it has been created")
    ax_legend = ax.legend(loc="upper left")
    fig_legend = f.legend()
    
    #Center legend labels w/ alpha==0
    center_subtitles(ax_legend)
    center_subtitles(fig_legend)
    
    plt.show()

ex1_mpl_posthoc_centering()

You could build the legend regularly, adding in labels with alpha=0 and then call it for

mwaskom · 2018-06-14T18:50:26Z

That looks great.

I'm a little worried about unintended consequences of simply grabbing all the vpackers, but it seems robust to trying a few things.

One thing that doesn't work, unfortunately, is that it's not robust to multiple calls to legend(). That is going to make it difficult to revise the legend after it's been drawn, e.g. to move it to a new place. The easiest way to do that in other places in seaborn is just to call ax.legend again.
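To illustrate the point about repeated calls, here is a minimal sketch using only public matplotlib API. It shows that each ax.legend() call builds a new Legend object, so anything adjusted on the earlier one is lost. The title tweak is only a stand-in for the alignment adjustments discussed above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.scatter([], [], label="level 1")

first = ax.legend()
first.set_title("tweaked after creation")  # stand-in for any post-hoc adjustment

# Calling ax.legend() again builds a brand-new Legend and replaces the old one,
# so the adjustment made to `first` does not carry over.
second = ax.legend(loc="upper left")

replaced = second is not first and ax.get_legend() is second
title_after = second.get_title().get_text()
```

This is why a post-hoc helper would have to be called again after every ax.legend() call.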

As for interfacing with FacetGrid, the basic approach there has been to make that as general as possible and to not make any assumptions about the functions that are being passed to it to draw. But there will be a higher-level function to handle important bookkeeping when faceting lineplot and scatterplot, similar to the relationship between factorplot and the categorical plotters. This logic could go in there. And perhaps it could be a public-facing utility function, although that feels like a broken abstraction.

Ideally the best solution would be official support for "sectioned" legends in matplotlib but seaborn wouldn't be able to take advantage of that at this point anyway.

CRiddler · 2018-06-14T19:40:00Z

I'm a little worried about unintended consequences of simply grabbing all the vpackers, but it seems robust to trying a few things.

Yeah I was worried about this as well. But, as long as the user hasn't meddled with the legend packing (I'm not sure if anyone does, because it would mean drawing up their own ax.legend() function essentially) this should work.

unfortunately, is that it's not robust to multiple calls to legend()

Yep. Maybe we could expose the centering function to users and they could use it if needed? Definitely not an ideal way of setting it up though. I did come up with a more robust solution, but it involves monkey patching the legend handlers, which I'm not too keen on. I'll post it here in case you want to take a look at it.

import matplotlib.pyplot as plt
from matplotlib.legend_handler import HandlerPathCollection
from matplotlib.legend import Legend
import functools

def subtitle_decorator(handler):
    @functools.wraps(handler)
    def wrapper(legend, orig_handle, fontsize, handlebox):
        handle_marker = handler(legend, orig_handle, fontsize, handlebox)
        if handle_marker.get_alpha() == 0:
            handlebox.set_visible(False)
        # Return the artist so Legend still receives a handle for this entry
        return handle_marker
    return wrapper

def mpl_patch_default_handler_map():
    
    #Adds our decorator to all legend handler functions
    for handler in Legend.get_default_handler_map().values():
        handler.legend_artist = subtitle_decorator(handler.legend_artist)
    
    f, ax = plt.subplots()
    
    h1 = ax.scatter([], [], alpha=0, label="Title 1")
    h2 = ax.scatter([], [], label="level 1")
    h3 = ax.scatter([], [], label="level 2")
    h4 = ax.scatter([], [], alpha=0, label="Title 2")
    h5 = ax.scatter([], [], label="level 1")
    
    ax.set_title("Ex2. set width of draw area to 0\nleft adjusts the text")
    ax_legend = ax.legend(loc="upper left")
    fig_legend = f.legend()
    plt.show()

mpl_patch_default_handler_map()

But as you'll notice, we're gaining robustness at the cost of mucking around with matplotlib's internals which will cause the unintended side effect of users wondering why all of their markers with alpha=0 are left justifying their associated text in the legend.

Ideally the best solution would be official support for "sectioned" legends in matplotlib but seaborn wouldn't be able to take advantage of that at this point anyway

Yeah, it'd be a lot of code refactoring :(

mwaskom · 2018-06-14T19:52:20Z

That is clever, but illegal :)

However, is there no way to get something similar to work with the handler_map that you can pass to ax.legend?

CRiddler · 2018-06-14T20:04:47Z

I did have a working function where you would pass it the handles you wanted to be subtitles and it would return a handler_map you can pass to ax.legend, but I figured we would hit the same issue as earlier: if a user repeated the call to ax.legend, it would lose the handler_map data unless the user had access to it and knew to pass it into ax.legend().

However, thinking about it- I may have gotten another idea. I'll do some testing and post back.

CRiddler · 2018-06-14T22:05:33Z

Instead of changing the defaults to check for alpha=0 we can create our own handler_map with a class factory. Essentially we iterate through the handles we want to be "subtitles", look up their appropriate Handler class, subclass it, and override the legend_artist method to make the handle box invisible. Then we return a handler_map mapping each subtitle handle instance to an instance of the subclassed Handler. This way we don't step on any of matplotlib's defaults.

The only downside is that the user will need to supply the handler_map to calls of ax.legend().

import matplotlib
import matplotlib.pyplot as plt
from matplotlib.legend import Legend
from matplotlib.legend_handler import HandlerBase, update_from_first_child

def subtitle_handler_factory(inherit_from):
    """Class factory to subclass Handlers and add our custom functionality
    """
    class SubtitleHandler(inherit_from):
        def legend_artist(self, legend, orig_handle, fontsize, handlebox):
            handlebox.set_visible(False)
            return inherit_from.legend_artist(self, legend,
                                              orig_handle, fontsize,
                                              handlebox)
    
    #HandlerPatch class needs a special update_func
    if inherit_from is matplotlib.legend_handler.HandlerPatch:
        return SubtitleHandler(update_func=update_from_first_child)
    return SubtitleHandler()

def subtitle_handler_map(subtitles):
    defaults_handler_map = Legend.get_default_handler_map()
    handler_map = {}
    
    for orig_handle in subtitles:
        handler = Legend.get_legend_handler(defaults_handler_map, orig_handle)
        
        #Subclass the Handler
        new_handler = subtitle_handler_factory(type(handler))
        handler_map[orig_handle] = new_handler
    return handler_map

def mpl_new_default_handler_map():
    f, ax = plt.subplots()

    h1 = ax.scatter([],[], label="Title 1")
    h2 = ax.scatter([],[], label="level 1")
    h3 = ax.scatter([],[], label="level 2")
    h4 = ax.scatter([],[], label="Title 2")
    h5 = ax.scatter([],[], label="level 1")

    h6 = ax.bar(0,0, label="Title 3")
    h7 = ax.bar(0,0, label="level 1")
    h8 = ax.bar(0,0, label="level 2")
    h9 = ax.bar(0,0, label="Title 4")
    h10 = ax.bar(0,0, label="level 1")

    subtitles = [h1, h4, h6, h9]
    handler_map = subtitle_handler_map(subtitles)
    
    ax.set_title("Ex2. Class factory to pass subtitles to")
    ax_legend = ax.legend(loc="upper left", handler_map=handler_map)
    fig_legend = f.legend(handler_map=handler_map)
    
    plt.show()

mpl_new_default_handler_map()

My only other idea, which I'm not even 100% sure how to implement, is to dynamically subclass the Handles themselves to be a Subtitle.[OriginalHandle] (in addition to subclassing their Handler counterparts), then insert the new {Subtitle.[OriginalHandle]: subclassed-Handler} pair into the default Handler dictionary. Seaborn would still need to keep track of which handles it wants to become subtitles, but it would circumvent the need to pass a handler_map dictionary to ax.legend() all the time.

mwaskom · 2018-06-15T16:13:44Z

This last approach is definitely the most sophisticated, but I think if both options require user involvement, "call this function and it will left-align labels for non-visible legend entries" is a bit easier to understand than "pass this handler map when you call ax.legend and magic will happen".

CRiddler · 2018-06-15T17:27:23Z

I’m in agreement here. I like the nonmagicalness of the center_legend function. However I’m not 100% a fan of how it relies on detecting invisible legend entries to determine if a label becomes a subtitle. Seems like a roundabout way and not as explicit as I would like.

Just bouncing ideas here, but what if the center_legend also had an argument where the user could pass a dictionary to restructure the legend ordering. Something like

subtitle_legend(ax, legend_order)

Where legend_order would be a user-generated dictionary of subtitles they wish to add, and the labels that would appear under the given subtitle, like so:

{'subtitle1': ['group1', 'group2'],
 'subtitle2': ['level1', 'level2']}

Then we can go through the legend handles/labels, pull out the handles that match the inputted labels and add a subtitle to them. Then submit that ordering to a call of ax.legend(handles, labels). I’ll write up a function to better demonstrate what I’m talking about when I get some time today. The only down side I see here is that a label under one subtitle can’t be the same as a label under another subtitle.

mwaskom · 2018-06-15T18:10:20Z

I think rather than checking for 0 alpha/size, we can use legend artist visibility as a proxy for whether the label should be left-aligned (which I actually think looks a bit better than centered).
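A minimal sketch of the visibility-as-proxy idea, assuming that legend artists inherit the visible flag from the handle they were built from. The helper name hide_invisible_handle_boxes is hypothetical and not a seaborn or matplotlib API. It walks the legend with the public findobj method and hides the handle box of any entry whose artists are all invisible.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
from matplotlib.offsetbox import DrawingArea
from matplotlib.patches import Patch


def hide_invisible_handle_boxes(legend):
    """Hide the handle box of every entry whose artists are all invisible.

    Hypothetical helper for illustration. Returns the number of boxes hidden.
    """
    hidden = 0
    for draw_area in legend.findobj(DrawingArea):
        artists = draw_area.get_children()
        if artists and not any(a.get_visible() for a in artists):
            draw_area.set_visible(False)
            hidden += 1
    return hidden


fig, ax = plt.subplots()
handles = [
    Patch(visible=False, label="Title 1"),  # invisible handle marks a section title
    Line2D([0], [0], marker="o", linestyle="", label="level 1"),
    Line2D([0], [0], marker="o", linestyle="", label="level 2"),
]
legend = ax.legend(handles=handles)
n_hidden = hide_invisible_handle_boxes(legend)
```

Hiding the box rather than inspecting alpha or size keeps the check independent of the artist type used for the handle.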

CRiddler · 2018-06-15T19:35:25Z

Ah, I just saw your comment as I was coming to post this. I think checking the legend artist visibility wouldn't be too hard, and it would be a more appropriate proxy than alpha because we can set the visibility of the handle afterwards. What I was just thinking is this: right now we're calling ax.scatter([], [], label="title 1") solely for the label, and we don't really want anything to do with the scatter. So instead of adding the subtitles to the Axes at all, what if we had the user supply the ordering in a wrapper around ax.legend()? Sorry for throwing a billion ideas at you!

import seaborn as sns
import matplotlib
import matplotlib.pyplot as plt

def subtitle_legend(ax, legend_format):
    new_handles = []
    
    handles, labels = ax.get_legend_handles_labels()
    label_dict = dict(zip(labels, handles))
    
    #Means 2 labels were the same
    if len(label_dict) != len(labels):
        raise ValueError("Can not have repeated levels in labels!")
    
    for subtitle, level_order in legend_format.items():
        #Roll a blank handle to add in the subtitle
        blank_handle = matplotlib.patches.Patch(visible=False, label=subtitle)
        new_handles.append(blank_handle)
        
        for level in level_order:
            handle = label_dict[level]
            new_handles.append(handle)

    #Labels are populated from handle.get_label() when we only supply handles as an arg
    legend = ax.legend(handles=new_handles)

    #Turn off DrawingArea visibility to left justify the text if it contains a subtitle
    for draw_area in legend.findobj(matplotlib.offsetbox.DrawingArea):
        for handle in draw_area.get_children():
            if handle.get_label() in legend_format:
                draw_area.set_visible(False)

    return legend


def seaborn_scatter():
    sns.set()
    
    tips = sns.load_dataset('tips')
    ax = sns.scatterplot(x="total_bill", y="tip",
                         hue="sex", style="smoker", size="size",
                         data=tips)

    #nice and explicit call to reorder the legend
    legend_format = {'Sex (hue)': ['Male', 'Female'],
                     'Group Size (size)': ['0', '2', '4', '6'],
                     'Smoker (shape)': ['Yes', 'No']
                    }

    # Use this instead of `ax.legend()`
    #   this returns the legend as well for further tweaking
    #   can also set it up to take kwargs to pass onto `ax.legend` call
    subtitle_legend(ax, legend_format=legend_format)

    plt.show()

seaborn_scatter()

mwaskom · 2018-06-22T20:37:32Z

So to summarize, there are two good options:

1. A function that can be called with no arguments and relies on the assumption that invisible handles in an existing legend are meant to be treated as section titles. All it does is adjust their alignment.
2. A function that can flexibly restructure a legend by adding invisible artists to the axes and using their labels as section titles.

I think both functions are useful and are complementary, since they do more or less different things. I would lean towards (1) as being most useful for a user who needs to quickly regenerate a legend, since they won't necessarily have access to the ordering that seaborn originally used. But (2) is a very nice utility function itself and I think would be nice to have ... it also is probably the most robust option to use inside of seaborn itself.

If you agree, can you please open a PR adding these functions to seaborn.utils?

CRiddler · 2018-06-27T14:51:14Z

I agree that these options are good. I think they can be wrapped into a single function that changes how it works depending whether or not a legend_format argument is supplied.

My only qualm with this approach now is that we're planning on scatterplot and lineplot to insert these subtitles automatically- which means that if a user does this:

ax = sns.scatterplot(…, hue="hello") #make a plot with hue
ax.legend()

The legend will have an odd right-adjusted subtitle of "hello", which won't be what the user expects. We could use an extra function to hide the subtitle legend entries from ax.legend() (by prefixing the label with "_") and then use the seaborn legend functions to "unhide" them when they draw the legend.
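The underscore trick relies on a matplotlib convention: artists whose label starts with "_" are skipped when handles and labels are collected automatically. A small sketch of that behavior (the label "_hello" is just an example):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.scatter([], [], label="_hello")   # leading underscore: skipped by automatic legends
ax.scatter([], [], label="level 1")
ax.scatter([], [], label="level 2")

# Only the entries without a leading underscore are collected.
handles, labels = ax.get_legend_handles_labels()
```

A seaborn-side helper would then have to find the hidden artists itself and pass them to ax.legend explicitly.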

mwaskom · 2018-06-27T15:10:26Z

To be clear, scatterplot will draw its own legend and use this function internally to make the subtitles look nice.

A user might want to call ax.legend again because sometimes that's the easiest way to adjust the legend (I don't think the matplotlib legend object has an obvious public API for moving it, for example). And that's why we are making this function public, so that they can call it to fix the new legend. It's basically a hack, but I don't see any other way to get subtitles into the legend. (And subtitles in the right column convey the same information, they just don't look as nice.)

CRiddler · 2018-06-27T21:38:35Z

I see, so scatterplot will be changed to add in the subtitles as it adds in the points? Then these subtitle labels will show up in the legend no matter what (except for maybe the case where the user only passes 1 argument of hue or size or shape)?

What if by default the subtitles were hidden from the legend? So we had something that works like this:

User calls ax.legend(), legend is formed- all subtitles are invisible
User/seaborn calls sns.subtitle_legend(ax, **kwargs), legend is formed, and subtitles that were tucked away are now present.
User/seaborn calls sns.subtitle_legend(ax, legend_format, **kwargs) (where legend_format is a dictionary representing what subtitles map to what labels, as seen earlier in this thread). Legend is formed, and subtitles that were tucked away are ignored. New subtitles are added depending on the keys in the legend_format dictionary

subtitle_legend(fig_or_ax, legend_format=None, **kwargs), where fig_or_ax is a Figure or Axes, legend_format is an optional dictionary mapping {"subtitle1": ["label1", "label2"]…}, and kwargs are passed to a call of fig_or_ax.legend

This flexibility lets the user choose whether they want subtitles in the first place, redraw the legend to update some argument in **kwargs, or entirely change the ordering of the subtitles/labels via the legend_format option. I prefer this approach, because then the user isn't stuck in a box of "If you don't want the subtitles, then you must delete the handle from inside the Axes."

fig, ax = plt.subplots()

h1 = ax.scatter([], [], label="Title 1")
h2 = ax.scatter([], [], label="level 1")
h3 = ax.scatter([], [], label="level 2")
h4 = ax.scatter([], [], label="Title 2")
h5 = ax.scatter([], [], label="group A")
h6 = ax.scatter([], [], label="group B")
register_subtitles(h1, h4)

ax.set_title("Flexible Legend handling")
legend_format = {
    "custom Title 1": ["level 1", "level 2"],
    "custom Title 2": ["group A", "group B"],
}

### Add our Axes level legends
# Regular calls to construct legend, subtitles remain hidden
original_legend = ax.legend(loc="upper left", title="ax.legend()")
ax.add_artist(original_legend)

# Seaborn call to construct legend, subtitles are visible and centered
new_legend_noargs = subtitle_legend(ax, loc="upper center", title="subtitle_legend(ax)")
ax.add_artist(new_legend_noargs)

# Seaborn call to construct legend, subtitles follow the format provided
subtitle_legend(
    ax,
    legend_format=legend_format,
    loc="upper right",
    title="subtitle_legend(\n    ax, legend_format\n)",
)


### Do the same as above, but at the Figure level
fig.legend(loc="lower left", title="fig.legend()")

subtitle_legend(
    fig, loc="lower center", title="subtitle_legend(fig)"
)

subtitle_legend(
    fig,
    legend_format=legend_format,
    loc="lower right",
    title="subtitle_legend(\n    fig, legend_format\n)",
)

plt.show()

This is a little more complex to set up on the backend, but I think it is magical when it needs to be ("subtitle_legend(ax) creates a legend with subtitles from seemingly nothing") and transparent when it should be ("subtitle_legend(ax, legend_format) makes subtitles that I choose").

But if we want to keep it simple we can definitely implement a function that centers legend entries with alpha==0 and the other subtitled legend function that takes a dictionary.

mwaskom · 2018-06-27T21:41:52Z

What if by default the subtitles were hidden from the legend? So we had something that works like this.

seaborn's default behavior in most contexts is to add semantic labels where they exist, and would be here too.

CRiddler · 2018-06-27T21:51:02Z

Fair enough, I'll get started on a PR implementing the two aforementioned functions.

center_subtitles(legend)
centers any labels that have a handle with alpha=0
subtitle_legend(ax_or_fig, legend_format)
creates a legend that follows the legend_format provided and uses the handles found in the supplied Axes (or if a Figure is provided, handles from Figure.axes)

mwaskom · 2018-06-27T21:59:25Z

A few initial suggestions:

center_subtitles(legend)

This would be better as align_subtitles (or ideally align_legend_subtitles) with a position parameter that defaults to "left" but accepts other alignments.

centers any labels that have a handle with alpha=0

I think we agreed that visible=False is a stronger signal.

mwaskom · 2018-06-27T22:00:22Z

subtitle_legend could have an optional remove boolean parameter that does the work of getting rid of the subtitles. Not an obvious API but would save some effort for those who know about it.

mwaskom · 2018-06-27T22:01:28Z

Also not something that needs to be in the first pass but it would be nice if subtitle_legend were eventually enhanced to let each subtitled group take up a column in the legend. I think this will often make for a more efficient use of space in a wide-aspect plot.

So to the extent that there are design decisions with tradeoffs, try to avoid making this impossible to add in the future (unclear if this will be a concern).
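One way the column layout could work, sketched in plain Python under the assumption that matplotlib fills legend columns top to bottom in entry order: pad every group to the same number of rows, then request one column per group. The function name pad_groups_for_columns and the (label, kind) tuples are hypothetical, for illustration only.

```python
def pad_groups_for_columns(legend_format):
    """Flatten {subtitle: [levels]} so each group fills exactly one column.

    Returns (entries, ncol), where entries is a list of (label, kind) tuples
    and kind is "subtitle", "level", or "padding".
    """
    # Every column needs one row for the subtitle plus the longest level list.
    rows = max(len(levels) for levels in legend_format.values()) + 1
    entries = []
    for subtitle, levels in legend_format.items():
        column = [(subtitle, "subtitle")] + [(level, "level") for level in levels]
        column += [("", "padding")] * (rows - len(column))
        entries.extend(column)
    return entries, len(legend_format)


legend_format = {
    "Sex": ["Male", "Female"],
    "Smoker": ["Yes", "No"],
    "Size": ["2", "4", "6"],
}
entries, ncol = pad_groups_for_columns(legend_format)
```

The padded entries would then need invisible handles with empty labels, and ncol would be passed to the legend call.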

CRiddler · 2018-06-28T17:56:13Z

visible=False

Yes, but we can't pass visible=False to ax.scatter or the like, whereas we can with alpha.

mwaskom · 2018-06-28T18:06:45Z

Why not?

f, ax = plt.subplots()
ax.scatter(1, 1, visible=False)

mwaskom changed the title ~~Improve multiple semantic legend titles:~~ Improve multiple semantic legend titles May 30, 2018

This was referenced Jul 4, 2018

Omnibus documentation updates for v0.9 #1465

Merged

First approach to relational plot legends #1483

Merged

mwaskom closed this as completed in #1483 Jul 5, 2018
