[Bug]: Large file size when using fill_between() #22803
Comments
Yes, |
Thanks for the additional information! From looking at the performance page and the code of |
I don't think so - Polygons derive from Patch and do not have the Path logic in them. |
Ah, I see; in this case, maybe the documentation should be updated to reflect that? It currently states (emphasis mine):
|
Maybe? I could be wrong, but I don't think it applies to polygons. |
Do you mean the outline of a polygon is different than a polygon? |
Well there are "polygons" and then there are instances of the |
I agree with Jody, I think in that prose we are using "polygon" in the colloquial sense (in that we all look at the screen and agree it is "the outline of a polygon") rather than a claim "that is a It is probably worth looking into whether we skip the path simplification logic when drawing patches because we cannot use it (at least with clipping there are good reasons we cannot clip filled shapes) or because no one has tried to write the code to use the simplification logic. I think the steps here are:
I'm going to label this a good first issue as it should involve no new API design (public API has to stay the same, we already have the path simplification code used elsewhere in the code base, use the same rcparams), but medium difficulty because it will require reading and understanding where we do the path simplification in Line2D and then adapting it for use in |
Hey @tacaswell, could I be assigned to this issue? I am a first-year computer science student looking to take on a project to gain some experience, and for fun of course. Just note that it may take me a bit of time, as I have to balance this with uni. |
@Nishantppanchal We do not assign or reserve issues. Thanks for your understanding. |
@jklymak No problem. |
Hi, I wanted to give this issue a try but first wanted to clarify the tasks, based on my understanding of the comment @tacaswell has written. Is this simply a documentation issue, or should the implementation of the simplification logic be completed as well? |
@wannieman98 The first step is to sort out if we need to do any implementation or just document how to opt-into the optimizations. If we already have the functionality, then it is just documentation. If not, then it needs to be both implemented and documented. |
Hi, I've tried to investigate why the First, the matplotlib/lib/matplotlib/path.py Lines 193 to 198 in e26efa5
I'm not too familiar with the matplotlib internals, but I believe that tweaking this snippet to allow paths that end with a CLOSEPOLY command should make sense, as it could essentially be replaced by a LINETO back to the first vertex.
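The equivalence suggested above can be sketched in plain Python. This is only an illustration: `replace_trailing_closepoly` is a hypothetical helper, not a matplotlib function, and it assumes a path made of a single sub-path that starts at the first vertex. The integer codes match the values used by `matplotlib.path.Path`.

```python
# Path codes, using the same integer values as matplotlib.path.Path.
MOVETO, LINETO, CLOSEPOLY = 1, 2, 79


def replace_trailing_closepoly(vertices, codes):
    """Return (vertices, codes) with a final CLOSEPOLY rewritten as a LINETO.

    Hypothetical helper for illustration only; it is not part of matplotlib
    and assumes the path consists of one sub-path starting at vertices[0].
    """
    vertices = [tuple(v) for v in vertices]
    codes = list(codes)
    if codes and codes[-1] == CLOSEPOLY:
        # The vertex attached to CLOSEPOLY is ignored when drawing, so
        # overwrite it with the starting point of the path.
        vertices[-1] = vertices[0]
        codes[-1] = LINETO
    return vertices, codes


square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
square_codes = [MOVETO, LINETO, LINETO, LINETO, CLOSEPOLY]
new_vertices, new_codes = replace_trailing_closepoly(square, square_codes)
print(new_codes)
```

After the rewrite, every code is at most `LINETO`, which is the kind of path a "lines only" check would accept.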
Second, when a backend is drawing the path, it decides whether it should really simplify it, by looking at the matplotlib/lib/matplotlib/backends/backend_svg.py Lines 677 to 678 in e26efa5
And the PDF backend: matplotlib/lib/matplotlib/backends/backend_pdf.py Lines 1846 to 1852 in e26efa5
matplotlib/lib/matplotlib/backends/backend_pdf.py Lines 1983 to 1986 in e26efa5
I'm not too sure why a path should not be simplified if it has an In any case, by both making Let me know if I should try to turn this into a proper patch! |
Not entirely because at the next layer down a It looks like we actually already did the c++ side work: #21387 and just missed enabling it everywhere?
At least with clipping, being filled makes everything harder (because you have to care about the area "inside" the path, not just the points in the path). It is possible that there is a matching issue with simplification, or we may simply have been cautious. This will probably require some investigation, but conceptually seems less problematic than clipping.... If you look through git-blame on those lines, is there anything suggestive? The next step is to make sure the test suite passes and open a PR with your changes! |
Following your suggestion, I've tried to split the work by first enabling paths that finish with a When doing only this, the Original image: However, for the But when running the corresponding function directly from a script, I get a resulting image where the ellipse is correctly positioned and not cropped, so I'm not sure what is causing the problem in this case? Finally, when also changing the backends as in my previous message, I get the following image for the |
Are you running with the right style? We set a bunch of rcparams to "old" values so we do not need to change the images! That looks like auto-scaling going funny.... |
I've sorted out the problem for the Do you think it's worth doing a first PR with this bit of work before sorting out how to work with the |
Do we have enough dev feedback on whether we want path simplification here? I tend to think path simplification is a bad idea, leading to aliasing and weird effects. If the user cannot handle all the data in the plot then they should either simplify or rasterize. |
We do simplification by default on Line2D, independent of whether that is a good idea; if we do it in one place, it makes sense to do it in the other. The path simplification we are talking about here is done at a very low level in the backends to drop points that "have no effect" (with a knob for what "no effect" means). We only know how to evaluate this at the last possible moment (the "right" simplification for a given data set depends on the actual output pixel density, which we are working on hiding even from the user!) and I do not think we should (or really can) push that back up to the user. At the bottom this is fundamentally a performance feature (it can massively speed up Agg rendering / reduce file size) and should be an available option everywhere we can technically implement it.
It is worrying that the tick selection is that sensitive. It may be the case that we are applying this simplification too high in the stack (the renderer should see the simplified path, the auto limit code should not). |
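To make the idea of dropping points that "have no effect" concrete, here is a minimal pure-Python sketch. It is not matplotlib's algorithm (that lives in the C++ layer and is more careful); `drop_negligible_points` and its `threshold` parameter are hypothetical names, with `threshold` standing in for the knob mentioned above, measured in the same units as the points (think output pixels).

```python
import math


def drop_negligible_points(points, threshold):
    """Drop points closer than `threshold` to the line joining their neighbours.

    Each candidate is compared against the line through the last kept point
    and the point that follows the candidate. Illustrative only.
    """
    if len(points) < 3:
        return list(points)
    kept = [points[0]]
    for current, following in zip(points[1:], points[2:]):
        (x0, y0), (x1, y1), (x2, y2) = kept[-1], current, following
        dx, dy = x2 - x0, y2 - y0
        length = math.hypot(dx, dy)
        if length == 0:
            # Degenerate line: fall back to the distance between points.
            distance = math.hypot(x1 - x0, y1 - y0)
        else:
            # Perpendicular distance from `current` to that line.
            distance = abs(dx * (y1 - y0) - dy * (x1 - x0)) / length
        if distance > threshold:
            kept.append(current)
    kept.append(points[-1])
    return kept


points = [(0, 0), (1, 0), (2, 0), (3, 1), (4, 0)]
print(drop_negligible_points(points, threshold=0.1))
print(drop_negligible_points(points, threshold=2.0))
```

With a small threshold only the collinear point is removed; with a large one the spike disappears as well. This is why the appropriate threshold depends on how the data maps to output pixels.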
OK, I guess that is right for zoom-ability. I remove my objection! |
In this particular case, I think that the settings used in this specific test do not help:
The path is being simplified through a call to
In this case, a workaround could be to set the |
Knee-jerk that makes sense to me, but that may not survive contact with code...
This may come back to what DPI it thinks it is using. If it is doing this at 72 then it very likely is being too aggressively simplified. @tgrohens Thank you for your work so far on this :) |
OK, I think I've understood the issue here. When computing the bounding box for a patch (in However, when calling Interestingly, when calling I think the best course of action is therefore to switch off path simplification in matplotlib/lib/matplotlib/axes/_base.py Line 2398 in 357276b
|
👍🏻 |
So, this bit sorted out the problem with tick selection. I've tried removing the I'm not sure that not considering the |
Bug summary
Hi,
I am trying to save a plot to PDF (or SVG) that uses the fill_between function between two time series, and the resulting file is much larger than when plotting each series individually.
Code for reproduction
Actual outcome
Expected outcome
The file size of the file drawn with fill_between() should stay roughly the same as the file drawing the min and the max only.
Additional information
I have plotted the resulting file sizes for different numbers of data points.
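A comparison of this kind could be sketched as follows. This is not the original reproduction script, which is not shown in this thread; the random data, the helper names (`saved_size`, `draw_lines`, `draw_fill`) and the choice of SVG output are assumptions made for illustration.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np


def saved_size(draw, n_points, fmt="svg"):
    """Return the size in bytes of a figure drawn by `draw` and saved as `fmt`."""
    rng = np.random.default_rng(0)  # fixed seed, so runs are comparable
    x = np.arange(n_points)
    low = rng.normal(size=n_points)
    high = low + 1.0
    fig, ax = plt.subplots()
    draw(ax, x, low, high)
    buffer = io.BytesIO()
    fig.savefig(buffer, format=fmt)
    plt.close(fig)
    return buffer.getbuffer().nbytes


def draw_lines(ax, x, low, high):
    # Plot the lower and upper series as two separate lines.
    ax.plot(x, low)
    ax.plot(x, high)


def draw_fill(ax, x, low, high):
    # Fill the area between the two series.
    ax.fill_between(x, low, high)


for n in (1_000, 10_000):
    print(n, saved_size(draw_lines, n), saved_size(draw_fill, n))
```

The printed sizes depend on the matplotlib version and rcParams in use, so no particular numbers are claimed here.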
The file size of the file generated with fill_between() grows linearly with the number of data points (as could be expected), but not the file size of the file plotting the min and the max directly: maybe there's an optimization in plot() that's not being used in fill_between()?
Operating system
macOS
Matplotlib Version
3.4.3
Matplotlib Backend
module://matplotlib_inline.backend_inline
Python version
Python 3.9.7
Jupyter version
No response
Installation
pip