Avoid temporaries when preparing step plots. #7454

Merged
merged 1 commit into from Nov 15, 2016

Conversation

Projects
None yet
4 participants
Contributor

anntzer commented Nov 14, 2016

The previous implementation of pts_to_{pre,mid,post}step was fairly
unefficient: it allocated a large array during validation (vstack),
then a second array to build the return tuple, then converted the
columns to a tuple, then immediately converted the tuple back to a new
array at the call site (to construct a Path). Instead, just create a
single array and work with it all along.

Also some minor related cleanups (moving imports in lines.py, in
particular suggesting to not expose the individual pts_to_*step
functions anymore (they are not directly used, only via the
STEP_LOOKUP_MAP mapping).

lib/matplotlib/cbook.py
+ steps = np.zeros((1 + len(args), 2 * len(x) - 1))
+ steps[0, 0::2] = x
+ steps[0, 1::2] = steps[0, 0::2][:-1]
+ for out, y in zip(steps[1:], args):
@tacaswell

tacaswell Nov 14, 2016

Owner

This is trading memory duplication for a python for-loop. Did you bench mark this it terms of time?

@anntzer

anntzer Nov 14, 2016

Contributor

Actually the previous implementation also had a for loop over the arguments during validation (line 2314 args = tuple(np.asanyarray(y) for y in args)). So we're even on that.

@anntzer

anntzer Nov 14, 2016

Contributor

But in fact I can even do without that loop -- see latest version.

- x = np.asanyarray(x)
- if x.ndim != 1:
- raise ValueError("x must be 1 dimensional")
- if len(args) == 0:
@tacaswell

tacaswell Nov 14, 2016

Owner

Looks like this no longer raises if there are no y-values. I think I am 👍 on this, but just checking that it was intentional.

@anntzer

anntzer Nov 14, 2016

Contributor

Yes, I even removed the test for that (test_step_fails).

tacaswell added this to the 2.1 (next point release) milestone Nov 14, 2016

- cbook._step_validation(np.arange(12), 'a')
-
- with pytest.raises(ValueError):
- cbook._step_validation(np.arange(12))
@tacaswell

tacaswell Nov 14, 2016

Owner

This test case got dropped.

@anntzer

anntzer Nov 14, 2016

Contributor

See comment above re don't error when no y is passed.

Owner

tacaswell commented Nov 14, 2016

I am mostly 👍 on this, tad concerned about the immutable -> mutable change in the return type.

Owner

tacaswell commented Nov 14, 2016

Looks like the tuple return is legacy of how the step logic got factored out the line code. I wrote it and no longer remember why.

+0.5 on removing the pts_to_* function from the lines.py namespace (with an API changes note), but leaving them does little harm.

All of the widows test are failing on conda-build related issues.

tacaswell changed the title from Avoid temporaries when preparing step plots. to [MRG+1] Avoid temporaries when preparing step plots. Nov 14, 2016

lib/matplotlib/cbook.py
- return tuple(steps)
+ steps = np.zeros((1 + len(args), 2 * len(x) - 1))
+ steps[0, 0::2] = x
+ steps[0, 1::2] = steps[0, 0::2][:-1]
@QuLogic

QuLogic Nov 14, 2016

Member

Is there any reason to use two slices here?

@anntzer

anntzer Nov 14, 2016

Contributor

I find this easier to read -- otherwise can you tell immediately whether it should be steps[0, 0:-1:2] or steps[0, 0:-2:2]? Also, I'm not writing x[-1] because that slicing may be not so cheap (I'm looking at you xarray). Same reasons apply below.

@QuLogic

QuLogic Nov 14, 2016

Member

Either one works, so I guess that's just up to you...

I'm not sure to which x[-1] you are referring in the second sentence?

lib/matplotlib/cbook.py
- return tuple(steps)
+ steps = np.zeros((1 + len(args), 2 * len(x) - 1))
+ steps[0, 0::2] = x
+ steps[0, 1::2] = steps[0, 0::2][1:]
@QuLogic

QuLogic Nov 14, 2016

Member

Again, any reason for two slices?

lib/matplotlib/lines.py
@@ -7,32 +7,29 @@
from __future__ import (absolute_import, division, print_function,
unicode_literals)
-import six
@QuLogic

QuLogic Nov 14, 2016

Member

This belongs up here.

@anntzer

anntzer Nov 14, 2016

Contributor

The PEP8 order is stdlib (warnings), third-party (numpy, six) and internals (everything else). Do we prefer always putting six first? (I don't really care.)

@QuLogic

QuLogic Nov 14, 2016

Member

Yes, I know; six is analogous to a future-import in our ordering.

@anntzer

anntzer Nov 14, 2016

Contributor

put it back there.

Contributor

anntzer commented Nov 14, 2016

Can we remove them without a deprecation period? :-)
(There's actually quite a few such imports that could be removed; something such as https://pypi.python.org/pypi/metamodule/ could be useful.)

Owner

tacaswell commented Nov 14, 2016

They have only been there from 1.5.0 on so relatively young. If @anntzer is concerned, I am perfectly happy to leave the imports in place 😉

Contributor

anntzer commented Nov 14, 2016

Too late, they're dead now :-)

@@ -0,0 +1,6 @@
+Functions removed from the `lines` module
@tacaswell

tacaswell Nov 14, 2016

Owner

This seems more 'api_changes' than 'whats_new'

@anntzer

anntzer Nov 14, 2016

Contributor

moved.

Owner

tacaswell commented Nov 14, 2016

One major issue with this is that it is going to strip units and cast everything to a float.

attn @dopplershift

Contributor

anntzer commented Nov 14, 2016

That (casting everything to a float) was already the case before?

Owner

tacaswell commented Nov 14, 2016

This is true. Should not hold the PR over that, but we will probably have to address the units sooner or later.

Contributor

dopplershift commented Nov 14, 2016

So long as it's no worse than before, it's fine by me. I'll just keep this in the back of my mind until I need a step plot.

lib/matplotlib/cbook.py
+ steps[0, 0::2] = x
+ steps[0, 1::2] = steps[0, 0::2][:-1]
+ steps[1:, 0::2] = args
+ steps[1:, 1::2] = steps[1:, ::2][:, 1:]
@QuLogic

QuLogic Nov 14, 2016 edited

Member

This is one place where I find two slices more confusing.
Start at 2 and take every other one, is much clearer than take every other one and start at 1 (which is really 2 from the original array).

@anntzer

anntzer Nov 14, 2016

Contributor

OK, I squashed the two slices together.

@anntzer anntzer Avoid temporaries when preparing step plots.
The previous implementation of `pts_to_{pre,mid,post}`step was fairly
unefficient: it allocated a large array during validation (`vstack`),
then a second array to build the return tuple, then converted the
columns to a tuple, then immediately converted the tuple back to a new
array at the call site (to construct a Path).  Instead, just create a
single array and work with it all along.

Also some minor related cleanups (moving imports in lines.py, in
particular not exposing the individual `pts_to_*step` functions anymore
(they are not directly used, only via the `STEP_LOOKUP_MAP` mapping).
cd2608e

@tacaswell tacaswell merged commit 901562f into matplotlib:master Nov 15, 2016

3 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage decreased (-0.007%) to 61.835%
Details

QuLogic changed the title from [MRG+1] Avoid temporaries when preparing step plots. to Avoid temporaries when preparing step plots. Nov 15, 2016

anntzer deleted the anntzer:step-converters branch Nov 15, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment