[ENH]: Add split feature for violin plots #27812

Closed
anjabeck opened this issue Feb 22, 2024 · 4 comments · Fixed by #27815

anjabeck commented Feb 22, 2024

Problem

Recently, I wanted to get split violin plots at defined x-positions (see image).
My problem is that while it's possible to specify the x-position in pyplot.violinplot(), there is no option for split violins.
seaborn.violinplot() on the other hand allows split violins but seems to enforce discrete x-positions with fixed intervals. (Edit: this is no longer true as of seaborn v0.13.)

This example shows two data sets (black and red/blue) in four categories defined by $A=0,1$ and $B=0,1$.
[Image: example_violins]

The code that produces the plot is based on this answer on Stack Overflow.

import matplotlib.pyplot as plt
import numpy as np

np.random.seed(0)

cols = ["firebrick","dodgerblue"]
leg = {"patches": [], "labels": []}

# Loop over categories A and B
for A in range(0,2):
    xpos   = []
    dLeft  = []
    dRight = []

    for B in range(0,2):
        # Create data for the left and right violins in the current category
        mu = np.random.randint(10,100)
        dLeft.append(np.random.poisson(mu,1000))
        dRight.append(np.random.normal(mu+10,np.sqrt(mu+10),1000))

        # Create x-positions for the violins
        xpos.append(B+(A-0.5)*0.15)

    # Create right violins (colors) once all B positions are collected
    vRight = plt.violinplot(dRight,positions=xpos,widths=0.1,showextrema=False,showmedians=False)
    for vr in vRight["bodies"]:
        # Get the center
        m = np.mean(vr.get_paths()[0].vertices[:, 0])
        # Get the rightmost point
        r = np.max(vr.get_paths()[0].vertices[:, 0])
        # Only allow values right of the center
        vr.get_paths()[0].vertices[:, 0] = np.clip(vr.get_paths()[0].vertices[:, 0], m, r)
        # Set category A color
        vr.set_color(cols[A])
        vr.set_alpha(1.)

    # Create left violins (black)
    vLeft = plt.violinplot(dLeft,positions=xpos,widths=0.15,showextrema=False,showmedians=False)
    for vl in vLeft["bodies"]:
        # Get the center
        m = np.mean(vl.get_paths()[0].vertices[:, 0])
        # Get the leftmost point
        l = np.min(vl.get_paths()[0].vertices[:, 0])
        # Only allow values left of the center
        vl.get_paths()[0].vertices[:, 0] = np.clip(vl.get_paths()[0].vertices[:, 0], l, m)
        # Set color
        vl.set_color("k")
        vl.set_alpha(1.)

    # Add legend entry
    leg["patches"].append(vRight["bodies"][0])
    leg["labels"].append(fr"$A={A}$")

plt.legend(leg["patches"],leg["labels"])
plt.xlabel(r"$B$")
plt.xticks([0,1],["0","1"])
plt.xlim(-0.5,1.5)
plt.tight_layout()
plt.savefig("example_violins.png")

Proposed solution

An easy-to-use solution would be to add an argument to the violinplot() function that lets the user choose which side to plot, for example asymetric with the values None, "below", or "above", where "below"/"above" plots only the side below/above the values given to positions.

This would mean that calling

vRight = plt.violinplot(dRight,positions=xpos,widths=0.1,showextrema=False,showmedians=False,asymetric="above")

internally executes this loop

for vr in vRight["bodies"]:
    # Get the center
    m = np.mean(vr.get_paths()[0].vertices[:, 0])
    # Get the rightmost point
    r = np.max(vr.get_paths()[0].vertices[:, 0])
    # Only allow values right of the center
    vr.get_paths()[0].vertices[:, 0] = np.clip(vr.get_paths()[0].vertices[:, 0], m, r)

timhoffm (Member) commented

Thanks for the suggestion. This is a reasonable addition. 👍 Do you want to make a pull request?

Technical remarks:

  • I would call the parameter side with values "low" / "high" / "both"
  • Rather than modifying the bodies afterwards, the implementation should change the creation of the bodies here:
    bodies += [fill(stats['coords'], -vals + pos, vals + pos,
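
The second remark could be sketched roughly as follows: rather than clipping the polygon vertices afterwards, the two edges handed to the fill call are chosen according to side. This is only an illustration under assumptions, not the code from the pull request. violin_edges and draw_half_violin are hypothetical helper names, coords and vals stand in for stats['coords'] and the scaled density values in the line above, and Axes.fill_betweenx is assumed to be the fill function for vertical violins.

```python
def violin_edges(vals, pos, side="both"):
    """Return (lower, upper) edge lists for one violin body centred at pos.

    vals holds the already-scaled density (half-width) at each coordinate.
    side="low" keeps the half below pos, side="high" the half above pos,
    and side="both" keeps the full, symmetric violin.
    """
    if side not in ("both", "low", "high"):
        raise ValueError(f"side must be 'both', 'low' or 'high', got {side!r}")
    # An edge collapses onto pos when its half of the violin is suppressed.
    lower = [pos - v if side in ("both", "low") else pos for v in vals]
    upper = [pos + v if side in ("both", "high") else pos for v in vals]
    return lower, upper


def draw_half_violin(ax, coords, vals, pos, side="both", **kwargs):
    """Draw one vertical violin body on a Matplotlib Axes (hypothetical helper)."""
    lower, upper = violin_edges(vals, pos, side)
    # fill_betweenx(y, x1, x2) fills horizontally between the two edges.
    return ax.fill_betweenx(coords, lower, upper, **kwargs)
```

With a real Axes, a call such as draw_half_violin(ax, coords, vals, pos=0.0, side="high", color="firebrick") would draw only the half above the position. Because the suppressed edge sits exactly at pos, no estimate of the center from the vertices is needed.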

anjabeck (Author) commented

Sweet. I'll create a pull request :)

mwaskom commented Feb 23, 2024

> seaborn.violinplot() on the other hand allows split violins but seems to enforce discrete x-positions with fixed intervals.

BTW this is not true as of seaborn v0.13

anjabeck (Author) commented

> > seaborn.violinplot() on the other hand allows split violins but seems to enforce discrete x-positions with fixed intervals.
>
> BTW this is not true as of seaborn v0.13

Oh that's cool actually. I completely missed that!

QuLogic added this to the v3.9.0 milestone on Mar 4, 2024