Some general cleanups #6573

Merged
merged 3 commits into from Jun 17, 2016

Conversation

Projects
None yet
6 participants
Contributor

anntzer commented Jun 11, 2016 edited

  • Remove some very old compatibility layers (sets module, md5 module, etc.)
  • Use astype(float) instead of astype(np.float) and astype(np.float_)
  • Use isinstance(..., MaskedArray) instead of isMaskedArray / isMA.

Some other cleanup ideas:

  • Get rid of if debug prints (mostly present in backend implementations just to indicate that a method has been called): I think that a tool such as https://pypi.python.org/pypi/hunter is typically more appropriate anyways.
  • Reimplement the verbose class on top of the stdlib logging and read off an environment variable such as MPLVERBOSITY instead of command-line arguments which are likely to be set for other purposes.
@anntzer anntzer General cleanups.
- Remove some very old compatibility layers (sets module, md5 module,
etc.)
- Use `astype(float)` instead of `astype(np.float)` and
`astype(np.float_)`
- Use `isinstance(..., MaskedArray)` instead of `isMaskedArray` /
`isMA`.
71d9386

mdboom added the needs_review label Jun 11, 2016

Owner

efiring commented Jun 11, 2016

On 2016/06/10 8:56 PM, Antony Lee wrote:

Use |isinstance(..., MaskedArray)| instead of |isMaskedArray| / |isMA|.

What is the advantage of this change?

@jenshnielsen jenshnielsen commented on the diff Jun 11, 2016

lib/matplotlib/transforms.py
from .path import Path
DEBUG = False
-# we need this later, but this is very expensive to set up
-MINFLOAT = np.MachAr(float).xmin
@jenshnielsen

jenshnielsen Jun 11, 2016

Owner

Have you benchmarked this change. It was done like this for performance reasons. See
2d0cc7d

@jenshnielsen

jenshnielsen Jun 11, 2016

Owner

Ok read the docs it seems like finfo is a cached version of MachAr so it is probably fine

@anntzer

anntzer Jun 11, 2016

Contributor

It is a slower than just reading off a global but still much faster than MachAr and even than allocating a new array of size 1.

@tacaswell

tacaswell Jun 17, 2016

Owner

I am also mildly concerned about this, but this is almost certainly not the bottle neck any more.

Contributor

anntzer commented Jun 11, 2016

re: isinstance(..., MaskedArray). I always felt a bit uneasy about isMaskedArray / isMA: when I read a function like this my first instinct is "oh, I guess if there's such a function it must do a bit more than checking the type (does it check exact type equality ignoring subclassing? does it do something with array-likes? etc.)". But in fact it's just implemented (after a page-long docstring...) just as a call to isinstance.
Say Python suddenly decided to add an isfloat() function... I think it would widely be considered as unpythonic. I think the same of isMaskedArray.
On the other hand I hardly feel strongly about this change.

Owner

efiring commented Jun 11, 2016

On 2016/06/11 5:48 AM, Antony Lee wrote:

re: |isinstance(..., MaskedArray)|. I always felt a bit uneasy about
|isMaskedArray| / |isMA|: when I read a function like this my first
instinct is "oh, I guess if there's such a function it must do a bit
more than checking the type (does it check exact type equality ignoring
subclassing? does it do something with array-likes? etc.)". But in fact
it's just implemented (after a page-long docstring...) just as a call to
|isinstance|.
Say Python suddenly decided to add an |isfloat()| function... I think it
would widely be considered as unpythonic. I think the same of
|isMaskedArray|.
On the other hand I hardly feel strongly about this change.

I have no objection to making the change; your rationale is fine.

@tacaswell tacaswell and 2 others commented on an outdated diff Jun 12, 2016

lib/matplotlib/cbook.py
@@ -2344,7 +2343,7 @@ def pts_to_poststep(x, *args):
# do normalization
vertices = _step_validation(x, *args)
# create the output array
- steps = ma.zeros((vertices.shape[0], 2 * len(x) - 1), np.float)
+ steps = np.zeros((vertices.shape[0], 2 * len(x) - 1), np.float)
@tacaswell

tacaswell Jun 12, 2016

Owner

These got missed in the np.float -> float conversion.

@QuLogic

QuLogic Jun 12, 2016

Member

It's also not the masked array zeros, if that makes a difference.

@anntzer

anntzer Jun 13, 2016 edited

Contributor

There's quite a few np.float left; this one is safe to replace but I'd need to do some more research to figure out how this changes calls to np.issubdtype and others.
WRT to np.zeros vs ma.zeros, see below.

tacaswell added this to the 2.1 (next point release) milestone Jun 12, 2016

Owner

tacaswell commented Jun 12, 2016

Moving to use logging is yet another of the big projects that I think everyone agrees should be done, but no one has yet found the time to do.

@QuLogic QuLogic and 2 others commented on an outdated diff Jun 12, 2016

lib/matplotlib/__init__.py
@@ -126,6 +126,7 @@
cycler)
import numpy
+import numpy.ma
@QuLogic

QuLogic Jun 12, 2016

Member

What effect does this have?

@anntzer

anntzer Jun 13, 2016

Contributor

Ensure that everyone can directly use np.ma.<xyz> without importing numpy.ma.

@efiring

efiring Jun 13, 2016

Owner

Actually, it doesn't do anything at all. It's superfluous.

@anntzer

anntzer Jun 13, 2016

Contributor

Indeed, removed.

@efiring efiring and 2 others commented on an outdated diff Jun 12, 2016

lib/matplotlib/cbook.py
@@ -2385,7 +2384,7 @@ def pts_to_midstep(x, *args):
# do normalization
vertices = _step_validation(x, *args)
# create the output array
- steps = ma.zeros((vertices.shape[0], 2 * len(x)), np.float)
+ steps = np.zeros((vertices.shape[0], 2 * len(x)), np.float)
@efiring

efiring Jun 12, 2016

Owner

This and the similar change in pts_to_poststep is changing the behavior, and I suspect it will break something. _step_validation accepts subclasses including masked arrays.

@anntzer

anntzer Jun 13, 2016

Contributor

pts_to_prestep was already using np.zeros so if there was a bug it was already there, if any; but in fact Line2D handles replacing masked arrays by nan-filled arrays (see Line2D.recache).

@QuLogic

QuLogic Jun 13, 2016

Member

The only change on this line is ma -> np, so I don't understand what you mean, unless you are referring to somewhere else.

@anntzer

anntzer Jun 13, 2016

Contributor

Before this patch, pts_to_prestep was using np.zeros whereas pts_to_midstep and pts_to_poststep were using ma.zeros, so the behavior was inconsistent to start with. I guess that the use of ma.zeros was to handle the hypothetical case where a masked array would be passed to a step-plotting function, but Line2D.recache (which ultimately dispatches to pts_to_*step) already replaces masked arrays by nan-filled regular arrays, so there's no reason to return a masked array here.

@efiring efiring commented on an outdated diff Jun 12, 2016

lib/matplotlib/colors.py
else:
result = result.astype(np.float32)
else:
is_scalar = True
- result = ma.array([value]).astype(np.float)
+ result = ma.array([value]).astype(float)
@efiring

efiring Jun 12, 2016 edited

Owner

If you are standardizing on np.ma, here is another opportunity for it. I used to use ma, but now I always use np.ma. (But after looking at things farther along, I am wondering whether the extra verbosity does more harm than good.)

@efiring efiring and 1 other commented on an outdated diff Jun 12, 2016

lib/matplotlib/mlab.py
@@ -3824,8 +3817,8 @@ def poly_below(xmin, xs, ys):
xv, yv = poly_below(0, x, y)
ax.fill(xv, yv)
"""
- if ma.isMaskedArray(xs) or ma.isMaskedArray(ys):
- numpy = ma
+ if isinstance(xs, np.ma.MaskedArray) or isinstance(ys, np.ma.MaskedArray):
+ numpy = np.ma
@efiring

efiring Jun 12, 2016

Owner

This (previous and present) use of numpy for np.ma or ma looks odd. Maybe switch it to nx or num? Long ago, in the days when we had to support Numeric, numarray, and numpy, we used nx as the shim module.

@anntzer

anntzer Jun 13, 2016

Contributor

The problem is that there's another variable called Nx in the same function ("size of xs"), which makes things a bit awkward... and num sounds too generic, doesn't it?

@efiring efiring and 2 others commented on an outdated diff Jun 12, 2016

lib/matplotlib/mlab.py
@@ -3853,9 +3846,10 @@ def poly_between(x, ylower, yupper):
Return value is *x*, *y* arrays for use with
:meth:`matplotlib.axes.Axes.fill`.
"""
- if (ma.isMaskedArray(ylower) or ma.isMaskedArray(yupper) or
- ma.isMaskedArray(x)):
- numpy = ma
+ if (isinstance(ylower, np.ma.MaskedArray) or
+ isinstance(yupper, np.ma.MaskedArray) or
+ isinstance(x, np.ma.MaskedArray)):
@efiring

efiring Jun 12, 2016

Owner

This is getting pretty clunky. I'm almost ready to argue for standardizing on isMA as a readability win. Also, there is nothing wrong with making code a bit more concise and readable by importing frequently used symbols at the top so as to avoid so much np.ma. repetition in a case like this.

@QuLogic

QuLogic Jun 12, 2016

Member

Something like:

    if any(isinstance(var, np.ma.MaskedArray) for var in (ylower, yupper, x)):  

will fit on one line.

@anntzer

anntzer Jun 13, 2016

Contributor

Fixed following @QuLogic's suggestion.

anntzer added some commits Jun 13, 2016

@anntzer anntzer Finish swapping np.float{,_} -> float; ma -> np.ma
plus other comments on the PR.
91034ee
@anntzer anntzer Remove a reference to Python 2.6.
850ef93
Owner

tacaswell commented Jun 17, 2016

👍 if @efiring is happy with the way the MA stuff ended up.

@efiring efiring merged commit 7ae1b06 into matplotlib:master Jun 17, 2016

2 of 3 checks passed

continuous-integration/appveyor/pr AppVeyor build failed
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage increased (+0.7%) to 70.264%
Details

efiring removed the needs_review label Jun 17, 2016

anntzer deleted the anntzer:cleanups branch Jun 28, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment