Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: plotting with Numpy array subclasses is slow with Matplotlib 3.1.0 (regression) #14274

Closed
astrofrog opened this issue May 20, 2019 · 6 comments
Labels
Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions. topic: units and array ducktypes
Milestone

Comments

@astrofrog
Copy link
Contributor

astrofrog commented May 20, 2019

Bug summary

Astropy implements a ConversionInterface so as to be able to plot quantities with units (represented by the astropy Quantity class, which is a Numpy array subclass). However, this functionality is broken with Matplotlib 3.1 as it looks like Matplotlib now tries to iterate over every element of the array we are trying to plot.

Code for reproduction

import numpy as np
from astropy import units as u
from matplotlib import pyplot as plt
from astropy.visualization.units import quantity_support

values = np.arange(1000000) * u.m

with quantity_support():
    plt.hist(values, bins=50)

plt.savefig('test.png')

Actual outcome

Python hangs.

Expected outcome

This was previously very efficient.

The commit that broke this was 9eac832 (sorry @jklymak!) - and in particular the issue is that Quantity is an instance of collections.abc.Iterable, so the first 'if' statement doesn't get executed and instead Matplotlib iterates over all values. Actually what is strange is that this should also cause slowdown for plain numpy arrays, which are also instances of Iterable, and this might happen, but isn't as noticeable. What is different here is that the Quantity initializer is slow enough that this is noticeable.

We didn't notice this sooner as this was only really tested with large arrays in the docs, which aren't run against Matplotlib dev...

@jklymak
Copy link
Member

jklymak commented May 20, 2019

Ping @anntzer who suggested this change :-). I’m away from my computer for a few days so I can’t look at this. Hopefully it’s an easy fix.

@jklymak
Copy link
Member

jklymak commented May 20, 2019

It’s a bit strange that the first element of your quantity is an iterable. Note the check is for X[0] which is mean to see if this is a list of lists. Does each element have to be an iterable?

@astrofrog
Copy link
Contributor Author

Ah I see the issue, indeed Quantity(...)[0] is also a Quantity which is still an array subclass, hence why it is still considered an iterable.

@anntzer
Copy link
Contributor

anntzer commented May 20, 2019

I guess the offending line could be changed to X.ndim == 1 and np.ndim(X[0]) == 0. Sure, you'll pay an (implicit) conversion of X[0] to ndarray but we already converted X to an ndarray the line just above so it's not a big deal...

@astrofrog
Copy link
Contributor Author

See #14289 for a fix

@dstansby dstansby added this to the v3.1.1 milestone May 21, 2019
@dstansby dstansby added Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions. topic: units and array ducktypes labels May 21, 2019
@anntzer
Copy link
Contributor

anntzer commented May 21, 2019

Closed by #14289 (AFAICS).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions. topic: units and array ducktypes
Projects
None yet
Development

No branches or pull requests

4 participants