Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for multiple hatches, edgecolors and linewidths in histograms #28073

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

Impaler343
Copy link
Contributor

@Impaler343 Impaler343 commented Apr 13, 2024

PR summary

Closes #26718 Distributes keyword args passed to each Patch using a cycler. Probably not the best way to do this?

PR checklist

@jklymak
Copy link
Member

jklymak commented Apr 13, 2024

I'd suggest showing what this does with an example either in the GitHub pr description, or ideally in the gallery

@github-actions github-actions bot added the Documentation: examples files in galleries/examples label Apr 14, 2024
@Impaler343
Copy link
Contributor Author

I'm not really sure if I need to add a new test or just modify an exisiting one(test_hist_stacked_bar) in test_axes.py

@oscargus
Copy link
Contributor

I guess there are two things here:

  1. Add/modify an example in the gallery to illustrate how it is used. (Will be easier for the reviewers to get an idea of how it is used etc.)
  2. Add a test to get the code coverage up and make sure no one breaks it later. Maybe there is some old example that one can "hi-jack", but otherwise create a new test with all the bells and whistles turned on.

@Impaler343
Copy link
Contributor Author

pinging @story645 for review. The failing tests are unrelated

Copy link
Member

@story645 story645 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also needs updated documentation that the patch properties are now vectorized (@timhoffm any concerns here?)

lib/matplotlib/axes/_axes.py Outdated Show resolved Hide resolved
lib/matplotlib/axes/_axes.py Outdated Show resolved Hide resolved
galleries/examples/statistics/histogram_multihist.py Outdated Show resolved Hide resolved
@timhoffm
Copy link
Member

timhoffm commented May 19, 2024

also needs updated documentation that the patch properties are now vectorized (@timhoffm any concerns here?)

What should we be concerned about? We already have vectorized label and color. It's only fair to vectorize these inside hist() as well. This should be documented with **kwargs (can't immediately suggest, because that part has not been touched in the PR). Something like:

        **kwargs
            `~matplotlib.patches.Patch` properties. The following properties additionally
            accept lists of property values, one element for each dataset:
            *edgecolors*, *linewidths*, *linestyles*, *hatches*.

This should also get a what's new entry.

@story645
Copy link
Member

What should we be concerned about? We already have vectorized label and color.

My bias is vectorize everything so I don't have concerns, but in the past for some vectorization discussions there have been concerns about the tradeoffs. But if there isn't opposition, awesome!

@timhoffm
Copy link
Member

I don't see any drawbacks for hist()

@Impaler343
Copy link
Contributor Author

So we are planning to vectorize all parameters of Patches? Like joinstyle, capstyle etc.

@story645
Copy link
Member

So we are planning to vectorize all parameters of Patches? Like joinstyle, capstyle etc

Not at this time w/ the current architecture, especially because nobody has asked for those.

@Impaler343 Impaler343 force-pushed the histogram branch 2 times, most recently from 150c63b to 2351cbd Compare May 29, 2024 07:34
@Impaler343
Copy link
Contributor Author

Codecov is acting fishy, it passed once and failed again after squashing. Anything else to add/change?

lib/matplotlib/axes/_axes.py Outdated Show resolved Hide resolved
lib/matplotlib/tests/test_axes.py Outdated Show resolved Hide resolved
lib/matplotlib/axes/_axes.py Outdated Show resolved Hide resolved
galleries/examples/statistics/histogram_multihist.py Outdated Show resolved Hide resolved
galleries/examples/statistics/histogram_multihist.py Outdated Show resolved Hide resolved
galleries/examples/statistics/histogram_multihist.py Outdated Show resolved Hide resolved
galleries/examples/statistics/histogram_multihist.py Outdated Show resolved Hide resolved
@Impaler343
Copy link
Contributor Author

Impaler343 commented Jun 5, 2024

As of the last commit, this behavior is consistent for all histtypes unless mentioned otherwise:

  • If edgecolor, facecolor and color are mentioned, color is completely ignored.
  • If edgecolor and color are mentioned, color is applied to only the faces and edgecolor to the edges.
  • If facecolor and color are mentioned, color is applied to only the edges and facecolor to the faces.
  • If facecolor and edgecolor are mentioned, edgecolor is applied to edges, and facecolor is applied to faces.
  • If only color is mentioned, the face is colored for all filled histograms, and only the edge is colored for unfilled histograms with the values in color.
  • If only facecolor is mentioned, the face is colored for all filled histograms with the value in facecolor, and the edges are colored with the default color cycle for unfilled histograms.
  • If only edgecolor is mentioned, the face is colored by the default color cycle for all filled histograms, and the edge is colored with the values in edgecolor for all histograms.
    I feel this is the required behavior, but I am unsure if the color setting for the kwargs can be written more concisely.
    Let me know if I should start writing tests based on this structure.

@story645
Copy link
Member

story645 commented Jun 6, 2024

@Impaler343 I started trying to flow chart that and then realized it's probably clearer as a table. Let me know if this jives with what you're saying:

hist type color facecolor edgecolor patch face color patch edge color
all set set set facecolor edgecolor
all set None set color edgecolor
all set set None facecolor color
all None set set facecolor edgecolor
bar, barstacked, stepfilled set None None color 'none'
step set None None 'none' color
bar, barstacked, stepfilled None set None color 'none'
step None set None 'none' default color cycle
bar, barstacked, stepfilled None None set default color cycle 'none'
step None None set 'none' edgecolor
bar, barstacked, stepfilled None None None
step None None None

@Impaler343
Copy link
Contributor Author

hist type color facecolor edgecolor patch face color patch edge color
all set set set facecolor edgecolor
all set None set color edgecolor
all set set None facecolor color
all None set set facecolor edgecolor
bar, barstacked, stepfilled set None None color 'none'
step set None None 'none' color
bar, barstacked, stepfilled None set None facecolor 'none'
step None set None 'none' default color cycle
bar, barstacked, stepfilled None None set default color cycle edgecolor
step None None set 'none' edgecolor
bar, barstacked, stepfilled None None None default color cycle 'none'
step None None None 'none' default color cycle

Have bolded the corrected ones

@Impaler343
Copy link
Contributor Author

I'm unable to fix CircleCI errors for docs. Could someone help me out?

@story645
Copy link
Member

story645 commented Jun 7, 2024

Hi, so the error is in your what's new:

/home/circleci/project/doc/users/next_whats_new/histogram_vectorized_parameters.rst:47: WARNING: Explicit markup ends without a blank line; unexpected unindent.
/home/circleci/project/doc/users/next_whats_new/histogram_vectorized_parameters.rst:51: ERROR: Unexpected indentation.
/home/circleci/project/doc/users/next_whats_new/histogram_vectorized_parameters.rst:64: WARNING: Explicit markup ends without a blank line; unexpected unindent.
/home/circleci/project/doc/users/next_whats_new/histogram_vectorized_parameters.rst:68: ERROR: Unexpected indentation.

error: https://app.circleci.com/pipelines/github/matplotlib/matplotlib/31532/workflows/2533c013-7833-4d0c-ae72-dead7c7fbc76/jobs/83481?invite=true#step-113-207133_109
rendering: https://output.circle-artifacts.com/output/job/2c962d26-1c66-49dd-a987-e5bf25c8fcbe/artifacts/0/doc/build/html/users/next_whats_new/histogram_vectorized_parameters.html

lib/matplotlib/axes/_axes.py Outdated Show resolved Hide resolved
lib/matplotlib/tests/test_axes.py Outdated Show resolved Hide resolved
lib/matplotlib/axes/_axes.py Show resolved Hide resolved
@Impaler343
Copy link
Contributor Author

Impaler343 commented Jun 8, 2024

Ok it turns out the vectorization of colors is somewhat incomplete:

  • When we pass None, calls the color cycle, which is fine.
  • When we pass a list of valid colors, it uses them, which is also fine.
  • When we pass a list of Nones or even if one value in the list is None, it throws an error saying invalid RBGA argument. What can we do in this situation? Should we say that we use the colour cycle even if one value in the list is None and raise a warning? Or call the color cycle's next color whenever a None is encountered in a list? Or simply assume that the user will not pass a list like color= [None, "green", None] and ignore this check?

@story645
Copy link
Member

story645 commented Jun 9, 2024

or simply assume that the user will not pass a list like color= [None, "green", None]

I think this input is ambiguous cause it can mean either:

  1. opt those two lines into the color cycle such that it maps to ['C0', "green", 'C1']
  2. set both lines to the same default ['C0', "green", 'C0']

So I think it's ok to error out (unless/until someone comes around expecting 1 or 2) but we should maybe special case and have a better error message, like if not all(colors) (if any are NaN) "Ambiguous color specification: colors in the list may not be None"

Or something like that.

@Impaler343
Copy link
Contributor Author

The added lines have all been covered, codecov seems to still fail

@story645
Copy link
Member

Now, to test the table we need to write 10 individual tests, so that we dont pass all of the parameters

Can parameterize this as a kwargs arg dict [{'color': 'blue' }, {'edgecolor': 'blue', 'facecolor':orange}, ...]

@Impaler343
Copy link
Contributor Author

Now, to test the table we need to write 10 individual tests, so that we dont pass all of the parameters

Can parameterize this as a kwargs arg dict [{'color': 'blue' }, {'edgecolor': 'blue', 'facecolor':orange}, ...]

Oh yes! I completely missed this method

@Impaler343
Copy link
Contributor Author

hist type color facecolor edgecolor patch face color patch edge color
bar, barstacked, stepfilled set set set facecolor edgecolor
step set set set 'none' edgecolor
bar, barstacked, stepfilled set None set color edgecolor
step set None set 'none' edgecolor
bar, barstacked, stepfilled set set None facecolor color
step set set None 'none' color
bar, barstacked, stepfilled None set set facecolor edgecolor
step None set set 'none' edgecolor
bar, barstacked, stepfilled set None None color rcParam(black)
step set None None 'none' color
bar, barstacked, stepfilled None set None facecolor rcParam(black)
step None set None 'none' default color cycle(blue)
bar, barstacked, stepfilled None None set default color cycle(blue) edgecolor
step None None set 'none' edgecolor
bar, barstacked, stepfilled None None None default color cycle(blue) rcParam(black)
step None None None 'none' default color cycle(blue)

Updated table by expanding the first few cases and correcting:

  • whenever edgecolor is not set, it defaults to its rcParam in the stepfilled case.
  • facecolor is none in the case of step

Copy link
Member

@story645 story645 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please don't special case none. Here you end up only checking the alpha channel. Otherwise, I think this hits the balance of everything tested but way fewer images tests?

lib/matplotlib/tests/test_axes.py Outdated Show resolved Hide resolved
@story645
Copy link
Member

PR cleanliness test is failing on the added/removed test images, that can get fixed by a squash rebase
https://matplotlib.org/devdocs/devel/development_workflow.html#manage-commit-history

Specify extensions for test

Added modified baseline images

Modified test for histogram with single parameters

Fixed test

Add modified baseline images

Made changes concise according to suggestion

Made a more detailed gallery example

Fix Docs

Added whats new note, documentation for vectorization, doc fix

Added new test, and reverted changes in old test

Added baseline images

Modified test to pass codecov, added plot in whats new entry

Fix test

Added baseline images

Altered whats new entry, docs and gallery example

Resolved edgecolor and facecolor setting

Minor fix

Fix docs

Modified files to include facecolor and added test

Removed figsize from test

Add multiple baseline image names

Fixed test?

Fixed test?

Removed parametrize usage

Add baseline images

Add baseline image

Fix docs

Fix docs

Deleted baseline images, changed test

Fix test

Fix test

Handled None array passing to color

Handled passing None list

Modified nested patch condition

Minor Fix

Grammar nits

Modified test, edited None handling in sequence so that it errors out
Separated tests

Modified test

Modified test to pass by using halved zorders
Edited expected facecolors for step

Changed blue to C0
galleries/examples/statistics/histogram_multihist.py Outdated Show resolved Hide resolved
lib/matplotlib/axes/_axes.py Show resolved Hide resolved
@@ -4614,6 +4672,10 @@ def test_hist_emptydata():
ax.hist([[], range(10), range(10)], histtype="step")


def test_hist_none_patch():
plt.hist([1, 2], label=["First", "Second"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
plt.hist([1, 2], label=["First", "Second"])
plt.hist([ ], label=["First", "Second"])

None patch is the empty list/array case

Copy link
Member

@story645 story645 Jun 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And what you probably want to test is

assert len(patches) == 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This too works, basically the None condition is hit when we don't have enough datasets to cover each label. This is the doing of itertools.zip_longest

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should i still change it up? As the previous one seems clearer

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I thought that zip_longest would copy the patch to both labels. I think there needs to be some sort of assert that you're test is doing what you think it is - maybe then checking that len(patches) != len(labels)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is len(lbs) here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The number of labels that have been used in the plot

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I meant what were the actual numbers/does this test pass.

But yes if this test passes and coverage improves (add a comment that this test is to cover the if not patch case), then I'll let this concern go.

Copy link
Member

@story645 story645 Jul 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But also assert len(patches) < len(labels) too I think?

Copy link
Contributor Author

@Impaler343 Impaler343 Jul 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the tests pass, and yes we can add the length(patches) < len(labels) one as well, does the same comparison.
The actual numbers are 1 < 2

@story645
Copy link
Member

story645 commented Jul 1, 2024

Cause I think this maybe got lost, I was suggesting that this construction makes the skip case clearer:

        for patch, lbl in itertools.zip_longest(patches, labels):
            
            if not patch: continue 

            p = patch[0]
            kwargs.update({
                'hatch': next(hatches),
                'linewidth': next(linewidths),
                'linestyle': next(linestyles),
                'edgecolor': next(edgecolors),
                'facecolor': next(facecolors),
            })
            p._internal_update(kwargs)
            if lbl is not None:
                p.set_label(lbl)
            for p in patch[1:]:
                p._internal_update(kwargs)
                p.set_label('_nolegend_')

Copy link
Member

@story645 story645 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all your patience and extra thanks for tracing through and figuring out and testing our histogram color semantics/precedence 😅

I think the plot tests are now at a state where it's easy to see which parameter failed but there's also now only six image tests.

I'm pretty sure that the color semantics test doesn't try to draw anything but there might be a "build only artist tree" function that I'm missing.

Comment on lines 4675 to 4682
def test_hist_none_patch():
# To cover None patches when excess labels are provided
labels = ["First", "Second"]
patches = [[1, 2]]
fig, ax = plt.subplots()
ax.hist(patches, label=labels)
_, lbls = ax.get_legend_handles_labels()
assert (len(lbls) < len(labels) and len(patches) < len(labels))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please elaborate what this tests? What are None patches? Does this express what you want to test?

Suggested change
def test_hist_none_patch():
# To cover None patches when excess labels are provided
labels = ["First", "Second"]
patches = [[1, 2]]
fig, ax = plt.subplots()
ax.hist(patches, label=labels)
_, lbls = ax.get_legend_handles_labels()
assert (len(lbls) < len(labels) and len(patches) < len(labels))
def test_hist_unused_labels():
# When a list with one dataset and N elements is provided and N labels, ensure
# hat the first label is used for the dataset and all other labels are ignored
fig, ax = plt.subplots()
ax.hist([[1, 2, 3]], label=["values", "unused", "also unused"])
_, labels = ax.get_legend_handles_labels()
assert labels == ["values"]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None patches are the ones generated when the number of labels specified are more than the number of patches(Filled in by zip_longest). The suggested test also covers the line and has clearer labels, ill change it up

lib/matplotlib/tests/test_axes.py Outdated Show resolved Hide resolved
Comment on lines 4624 to 4626
for i, (xi, value) in enumerate(zip(x, values)):
axr.hist(xi, bins=bins, histtype=histtype, **{kw: value},
zorder=(len(x)-i)/2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the index/zorder acrobatics is only to get the draw order. It's simpler to do this via

Suggested change
for i, (xi, value) in enumerate(zip(x, values)):
axr.hist(xi, bins=bins, histtype=histtype, **{kw: value},
zorder=(len(x)-i)/2)
for x, value in reversed(zip(xs, values)):
axr.hist(x, bins=bins, histtype=histtype, **{kw: value})

Copy link
Contributor Author

@Impaler343 Impaler343 Jul 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it now occurred to me that I had tried this earlier - gives completely different plots
Vectorized:
image
Superimposed:
image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you fix the typo:

 for xi, value in reversed(zip(x, values)):
            axr.hist(xi, bins=bins, histtype=histtype, **{kw: value})

Copy link
Contributor Author

@Impaler343 Impaler343 Jul 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

        for x, value in reversed(list(zip(xs, values))):
                    axr.hist(x, bins=bins, histtype=histtype, **{kw: value})

yes, I tried this one

galleries/examples/statistics/histogram_multihist.py Outdated Show resolved Hide resolved
# * linestyles
#
#
# Edge-Colors
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced that there is much to learn by having individual examples for every parameter. This feels lengthy without providing much additional insight. Maybe the whats new example or 2 individually crafted examples (one simple and one using multiple values simultaneously (e.g. edgecolor, facecolor, hatch) are better?

But I won't block over this. However, if you keep this, please use a consistent section naming, preferably as the parameter is written.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's most important to me is that the example only set one parameter rather than trying to set multiple. But also the reason I encouraged this approach was b/c https://matplotlib.org/devdocs/gallery/pie_and_polar_charts/pie_features.html#sphx-glr-gallery-pie-and-polar-charts-pie-features-py was in response to folks not knowing that they could set all these parameters. Basically, I wanted something that would pop up in a search.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is that these styling parameters all work conceptually similar. Repeating the same pattern for every parameter feels a bit lengthy and boring to me. I think the pie parameters are somewhat different in that they have different effects and one cannot guess what one is doing based on the example of another.

But it's not worth a longer discussion. If you feel they are all valuable, I won't block over that.

@story645
Copy link
Member

story645 commented Jul 9, 2024

#28533 reminded me that this should get a versionadded directive

https://matplotlib.org/devdocs/devel/api_changes.html#announce-new-and-deprecated-api

datasets in *x*:
*edgecolors*, *facecolors*, *linewidths*, *linestyles*, *hatches*.

.. versionadded:: 3.10
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry wgough

Suggested change
.. versionadded:: 3.10
.. versionadded:: 3.10
List input to hatch parameters

Sorry forgot to add, here especially you should briefly describe what's added. My sentence here is super clunky English, so more here for example than something you should stick with.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didnt get you....

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Documentation: examples files in galleries/examples New feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: stacked histogram does not properly handle edgecolor and hatches
6 participants