
Faster image generation in WebAgg/NbAgg backends #5389

Merged

merged 2 commits into matplotlib:master from mdboom:webagg-speed on Nov 4, 2015

Conversation


@mdboom mdboom commented Nov 3, 2015

  • Write the PNG data to string buffer, rather than an io.BytesIO
    to eliminate the cost of memory reallocation
  • Add compression and filter arguments to _png.write_png
  • Use compression=3 and filter=NONE, which seems to be a sweet spot
    for processing time vs. file size

Most of the speed increase comes from this last point. There are two knobs that libpng provides: compression level, which is just the zlib lossless compression level 0-9, and filter, which is described here: http://www.w3.org/TR/PNG-Filters.html

While the PNG standard says that the filter is to prepare the image for optimum compression, for typical plotting images which have lots of solid colors, it actually appears to make compression worse. (I suspect the filtering helps mainly on photographic images). Worse yet, the algorithm that automatically determines which filtering algorithm will be optimal takes a significant amount of runtime overhead. By selecting an algorithm explicitly, we can eliminate that overhead entirely.
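The compression knob maps directly onto zlib's lossless levels 0-9, so the time-vs-size trade-off can be sketched outside libpng entirely. A rough illustration (plain zlib on synthetic flat-color data standing in for typical plot output — this is not the actual PNG code path):

```python
import time
import zlib

# Synthetic "plot-like" data: long runs of a few solid RGBA colors,
# similar to typical Agg output with flat backgrounds.
data = (b"\xff\xff\xff\xff" * 200000) + (b"\x1f\x77\xb4\xff" * 50000)

for level in (1, 3, 6, 9):
    start = time.perf_counter()
    out = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    print(f"level={level} size={len(out)} time={elapsed:.4f}s")
```

On data like this, higher levels buy little extra compression while costing noticeably more time, which is consistent with the sweet spot described above.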

To benchmark, I used a series of 16 difference images collected directly from a WebAgg session. I then ran these images through different combinations of compression and filter and compared the runtime and resulting file sizes. These benchmark images are available at mdboom/webagg-speed.

The results are below:

[figure_1: benchmark plot of runtime vs. resulting file size for each compression/filter combination]

The behavior prior to this PR was compression of 6, filter of "auto", which corresponds to the "X" at time ~2.2s in the plot. After this PR, I've changed it to compression of 6, filter of "NONE", which seems like a good low risk choice. It is not significantly worse in terms of data size, and more than 4x better in runtime. One could be more aggressive on runtime by decreasing compression to 1 or 2 at the expense of a little file size, but that could be detrimental over a low bandwidth connection.

@@ -227,8 +274,13 @@ static PyObject *Py_write_png(PyObject *self, PyObject *args, PyObject *kwds)
}

if (PyErr_Occurred()) {
Py_XDECREF(buff.str);
Member

Should you check that buff.str is not NULL?

Member Author

Py_XDECREF already does that (as opposed to Py_DECREF which does not).

@WeatherGod
Member

In principle, this all makes sense, and the changes look good to me. My C/C++ skills are rusty, though, so we probably want another +1 on this.

Maybe we should use the new memcheck script and exercise the webagg/nbagg backends with it, somehow?

@mdboom
Member Author

mdboom commented Nov 3, 2015

WebAgg is darn-tootin' snappy after this change. This was the major bottleneck.

It's a tricky one to test automatically. We'd either have to use something like Selenium or write a stubby websocket "browser-like thing". Though I agree it would be nice long term.

- Write the PNG data to string buffer, rather than an io.BytesIO
  to eliminate the cost of memory reallocation

- Add compression and filter arguments to `_png.write_png`

- Use compression=3 and filter=NONE, which seems to be a sweet spot
  for processing time vs. file size
@@ -131,7 +164,14 @@ static PyObject *Py_write_png(PyObject *self, PyObject *args, PyObject *kwds)
py_file = filein;
}

if ((fp = mpl_PyFile_Dup(py_file, (char *)"wb", &offset))) {
if (filein == Py_None) {
buff.size = width * height * 4 + 1024;
Member

Presumably, this is large enough for uncompressed + PNG header and other overhead?

Member Author

Exactly.
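An illustrative reading of that sizing line (not code from the PR): the initial allocation covers the whole uncompressed 8-bit RGBA payload plus a generous fixed pad for the PNG signature, IHDR/IEND chunks, and other overhead, so the common case never needs a resize.

```python
def png_buffer_estimate(width, height):
    # 4 bytes per RGBA pixel, uncompressed, plus a generous 1024 bytes
    # for the PNG signature, IHDR/IEND chunks, and other overhead.
    # Mirrors `buff.size = width * height * 4 + 1024` from the diff.
    return width * height * 4 + 1024

print(png_buffer_estimate(640, 480))  # 1229824
```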

@tacaswell
Member

Does the 'normal' save to png go through the same code path? We should probably tweak those parameters / expose them out there as well.

Do we document these internal functions at all?

It is a bit of a code smell that by passing in None you can make this function return the string buffer, but if you pass in a buffer it returns None.

@tacaswell tacaswell added this to the proposed next point release (2.1) milestone Nov 4, 2015
@mdboom
Member Author

mdboom commented Nov 4, 2015

Does the 'normal' save to png go through the same code path? We should probably tweak those parameters / expose them out there as well.

Yes, though it's not passing any compression or filter parameters. It probably does make sense to expose that up to the surface. I don't know if we have a standard way to pass file-format-specific parameters down through savefig, though. If not, we should: JPEG is another format where people like to tweak.

All that said, though, I think for normal use, this kind of speed savings isn't important. Though it is interesting that the automatic filter selection doesn't really help on this particular data set. It might make sense to turn off filtering by default.

Do we document these internal functions at all?

No. _png is explicitly privatized. Internal documentation wouldn't be a bad idea, but I like the idea of keeping the low-level C extension private and building porcelain on top.

It is a bit of a code smell that by passing in None you can make this function return the string buffer, but if you pass in a buffer it returns None.

Yes, but I'm not sure how else to do it. The speed trick here is that the extension creates the buffer itself so that it can resize it more efficiently. The docs explicitly say you can't resize a bytes object that was passed in from Python or that is referenced anywhere else:

https://docs.python.org/3/c-api/bytes.html#c._PyBytes_Resize

Could make a new method that returns a buffer, I suppose, that just does exactly this underneath. But given that _png is already private, maybe not worth it.
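The grow-in-place idea can be sketched in pure Python, with a bytearray standing in for the privately owned C buffer (the real code resizes a PyBytes object it created itself; the exact growth policy here is an assumption for illustration):

```python
class GrowingBuffer:
    """Toy stand-in for the C extension's private, resizable buffer."""

    def __init__(self, initial_size):
        self._buf = bytearray(initial_size)  # preallocated up front
        self._pos = 0

    def write(self, data):
        needed = self._pos + len(data)
        if needed > len(self._buf):
            # Grow geometrically (at least doubling) rather than
            # reallocating per write, which is the kind of repeated
            # reallocation cost the PR avoids.
            self._buf.extend(bytearray(max(len(self._buf), needed)))
        self._buf[self._pos:needed] = data
        self._pos = needed

    def getvalue(self):
        # Trim to the bytes actually written, analogous to the final
        # shrinking resize before handing the bytes back to Python.
        return bytes(self._buf[:self._pos])
```

Because the buffer is never visible to anyone else until `getvalue`, resizing it in place is safe — which is exactly the invariant `_PyBytes_Resize` demands of the C code.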

@tacaswell
Member

I know @efiring has talked about running scripts that generate many output figures so there might be use-cases where this helps.

Main reason for internal docs is bus factor. Completely agree about keeping it private. The exposed C API seems to be one of the most painful parts of moving numpy forward. [This is unrelated but true.]

Fair enough re code-smell.

@mdboom
Member Author

mdboom commented Nov 4, 2015

I've added docstrings to _png.cpp.

@tacaswell
Member

Great, I'll give this a try and merge it once Travis passes again (looks like unrelated network failures).

@mdboom
Member Author

mdboom commented Nov 4, 2015

Should I expose this to the outside world (in savefig and friends) in this PR or in a follow-on?

@jenshnielsen
Member

I think it's fine to leave to a follow-up PR.

jenshnielsen added a commit that referenced this pull request Nov 4, 2015
Faster image generation in WebAgg/NbAgg backends
@jenshnielsen jenshnielsen merged commit cb3c6f3 into matplotlib:master Nov 4, 2015
@mdboom mdboom deleted the webagg-speed branch November 10, 2015 02:46
tacaswell pushed a commit to tacaswell/matplotlib that referenced this pull request Jun 30, 2016
Faster image generation in WebAgg/NbAgg backends

Conflicts:
	src/_png.cpp
	  Due to having previously backported matplotlib#5910 ( 3675841 ) as
	  ( 99ad89d )
tacaswell added a commit that referenced this pull request Jun 30, 2016
Merge pull request #5389 from mdboom/webagg-speed