New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bugfix in stackplot (Issue #22393): Removes artifacts when input height is zero #22424
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for opening your first PR into Matplotlib!
If you have not heard from us in a while, please feel free to ping @matplotlib/developers
or anyone who has commented on the PR. Most of our reviewers are volunteers and sometimes things fall through the cracks.
You can also join us on gitter for real-time discussion.
For details on testing, writing docs, and our review process, please see the developer guide
We strive to be a welcoming and open project. Please follow our Code of Conduct.
lib/matplotlib/stackplot.py
Outdated
@@ -41,6 +41,9 @@ def stackplot(axes, x, *args, | |||
size of each layer. It is also called 'Streamgraph'-layout. More | |||
details can be found at http://leebyron.com/streamgraph/. | |||
|
|||
hide_empty : bool, optional | |||
If set, hides inputs where they have zero height |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs a note about the kwargs this is incompatible with.
lib/matplotlib/stackplot.py
Outdated
@@ -107,9 +110,17 @@ def stackplot(axes, x, *args, | |||
first_line = center - 0.5 * total | |||
stack += first_line | |||
|
|||
edgecolor = kwargs.pop('edgecolor', None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to detect if the user passed any of these three kwargs (k in kwargs
) and raise if they have. As implemented this will silently disregard user input which is most frustrating things that a library can do ("I know I passed edgecolor in, I can see it right there in the code, why isn't doing anything!?!") and is something that we should do our utmost to avoid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a great point. I included an error message if that is the case and enclodes the kwargs.pop
calls in an if-term.
I expect that the option of chopping up each level of the stack into N disconnected fill_betweens is not an option because it will break back-compatibility of the return type? I am very concerned about the "coupled kwargs" here. That is, the meaning (or if we pay attention to!) user input depends on the values of other kwargs. This sort of coupling is (in my experiance) one of the things that makes APIs hard to use and remember (because you not only have to remember what the inputs are, you have to remember exactly how they affect each other). |
lib/matplotlib/stackplot.py
Outdated
@@ -107,9 +111,22 @@ def stackplot(axes, x, *args, | |||
first_line = center - 0.5 * total | |||
stack += first_line | |||
|
|||
if hide_empty: | |||
if any(k in kwargs for k in ['edgecolor', 'interpolate', 'where']): | |||
raise ValueError('hide_empty is not compatible with edgecolor, ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is probably worth putting in the code to sort out which of the conflicting keys it is. The more information we can give the user in the error message the easier it will be for them to fix the problem, and (hopefully) the happier they will be :)
lib/matplotlib/stackplot.py
Outdated
raise ValueError('hide_empty is not compatible with edgecolor, ' | ||
'interpolate or where') | ||
else: | ||
edgecolor = kwargs.pop('edgecolor', None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can probably get away without doing the pop
and putting in the implicit defaults here by making another dictionary (hide_kwargs
or something like that)and then calling
fill_between` as
axes.fill_between(..., **kwargs, **hide_kwargs)
and conditionally put things in hide_kwargs
which means we can have the conditional a smaller number of places in the code (I think 3 total, once at the top for error checking, once for the base, and once in the loop rather than as 6 ternary invocations + error checking).
I do not feel super strongly about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I generally like the idea of having a hide_kwargs
dict as the if ... else ...
calls I wrote are rather ugly. But I don't think it would work here because where
and edgecolor
are unique to every iteration of the loop so that won't work for them.
It will only work for interpolate
but then I thought it was better to treat all 3 arguments the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can update / make a new dictionary under the same conditional in the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that makes sense.
I think at this point though I favor the solution of allowing where
to be a list (see below). Are you also fine with that?
I agree with you on that. The arguments are definitely convoluted now. I think the issue is that the real problem lies is that the anti-aliasing creates artifacts when there is a zero-width polygon. This can be fixed by setting So ideally, the user should never worry about |
The issue seems to be quite well known and appears in a bunch of different issues (see e.g. #9574). This is not a matplotlib error but rather a bug on the side of the renderer. So there won't be a way to fix the underlying issue. |
I think I have an idea how to solve this issue and also circumvent the whole interfering If we allow It would look like: where = kwargs.pop('where', None)
if hasattr(where, '__len__'):
if len(where) != len(stack):
raise ValueError("where has to have the same length as y")
else:
where = len(stack) * [where] and then coll = axes.fill_between(x, stack[i, :], stack[i + 1, :],
where=where[i],
facecolor=color, label=next(labels, None),
**kwargs) @tacaswell, are you happy with this solution? |
It will have to be done carefully to not break the case where someone passed in a 1d where and expects it to be re-used for each of the fill_betweens. We have a number of APIs that will do this automatic broadcasting / inference so it would fit in well with the rest of the library and user expectations, but that also means we are aware of how messy they can get ;) Would it be easier to check for |
Yep that sounds good. Could you guide me towards an implementation of the automatic broadcasting / inference? Def gonna be messy if I try to come up with something myself :D Checking for |
Given that we do not have to worry about unit support on |
Ok so I have overhauled this whole PR as discussed above. I think this solution is way better than the old one. Now we don't have competing kwargs, and |
@tacaswell, do you mind having a look at the recent changes? I think that could be a small PR that doesn't break anything else but solves the issue |
67e3753
to
f47d057
Compare
@tlkaufmann I apologize that this fell off my radar! I took the liberty of rebasing and squashing this down to one commit (twice ❤️ flake8). If you want to push additional commits to this branch, before you do anything else do: git remote update
git checkout stackplot_PR
# where YOUR_REMOTE_NAME is the name of the remote that points to your fork
git reset --hard YOUR_REMOTE_NAME/stackplot_PR which will discard (😱) any local commits and replace them with the current state of this branch. If you do not do this the old commits will be resurrected! I'm going to leave one comment about the docstring which my request some additional work. |
lib/matplotlib/stackplot.py
Outdated
regions from being filled. The filled regions are defined by the | ||
coordinates `x[where]`. Can be either a single bool, an array of shape | ||
(N,) or an array of shape (M, N). | ||
Should be used together with `interpolate=True`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we check if interpolate=True
is set and raise/warn or if the user does not pass this, correctly track interpolate?
Is this comment still relevant or is this no longer needed with the new simpler approach?
@tacaswell , thanks for getting back to me.] Rebasing was a good idea as the git history was kind of a mess :D I added a warning to check whether |
The errors in the above checks seem to be irrespective of my changes |
Also reordered imports in test_axes to fit main branch
43e5765
to
c93a33d
Compare
I took the liberty of rebasing this. |
This probably needs a whats new and/or an example of how to use this to get the desired effect. |
@@ -58,6 +58,15 @@ def stackplot(axes, x, *args, | |||
data : indexable object, optional | |||
DATA_PARAMETER_PLACEHOLDER | |||
|
|||
where: bool or array of bool, optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm struggling with the where
parameter. While it is reasonable solve the issue internally by passing where
to fill_between
, is there any reasonable value for where other than y!=0
?
- The generality of fill_between (where=) is not needed. It does not make sense to not plot regions in a stack plot where there are finite values. That would leave gaps.
- I assume
where
needs the shape ofy
; i.e. must be 2D. Not having a resolution in x does not make sense. Using the same where array for all datasets should be the very exception; usually you need.
It seems like a simpler high-level API such as a boolean parameter clip_zero_regions
should be enough, and that would internally use where=(y!=0)
.
Pushed to 3.8 as Tim makes a good point about the API and the tests are failing. |
@tlkaufmann any interest in taking this up. It seems your implementation was close, but maybe we do not want to expose all of the complication of |
PR Summary
As detailed in Issue #22393, when the height of an input to
stackplot
is zero, the anti-aliasing of the underlyingfill_between
calls create artifacts (thin lines). These thin lines are not removed when settinglw=0
but are gone when settingaa=False
, therefore definitely being an unwanted anti-aliasing issue.Minimal example
before the fix
after the fix
PR Checklist
Note:
There currently does not exist a test for
stackplot
, so I didn't add anything.Tests and Styling
pytest
passes).flake8-docstrings
and runflake8 --docstring-convention=all
).Documentation
doc/users/next_whats_new/
(follow instructions in README.rst there).doc/api/next_api_changes/
(follow instructions in README.rst there).