
Axes instantiation is very slow #6664

Closed
Kojoley opened this issue Jun 28, 2016 · 34 comments
@Kojoley (Member) commented Jun 28, 2016

> python -V
Python 2.7.11
> pip freeze | grep matplotlib
matplotlib==1.5.1
> python -m platform -s "print platform()"
Windows-7-6.1.7601-SP2

I have discovered that my application spends a lot of time in Axes creation. Results of class-instantiation profiling are below:

> python -m timeit -s "from matplotlib.figure import Figure; from matplotlib.axes import Axes; fig = Figure()" "Axes(fig, (0, 0, 1 ,1))"
10 loops, best of 3: 33.2 msec per loop

[profiler screenshot]
Inside Axes.__init__, most of the execution time is taken by the self.cla() call, and about half of that is spent in this loop:

        for name, spine in six.iteritems(self.spines):
            spine.cla()

After commenting out these two lines I got:

> python -m timeit -s "from matplotlib.figure import Figure; from matplotlib.axes import Axes; fig = Figure()" "Axes(fig, (0, 0, 1 ,1))"
100 loops, best of 3: 17.9 msec per loop

So we have 33 ms per Axes creation, and half of that time is spent creating spines that may never be used (in my case I have a bunch of sharex axes and only one of them has spines).
I see a possible solution in which creation of spines, scales (you can see creation of a logarithmic scale, which I do not use either), and other objects is delayed until the point where they are actually needed/initialized.

And the fun part: for 20 created Axes we have 60115 calls to Rectangle.stale (3005 calls per Axes instantiation).
[profiler screenshot showing the stale call counts]
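(Not part of the original report.) Per-function call counts like the Rectangle.stale figure above can be reproduced with the stdlib instrumenting profiler. A minimal sketch, using a toy stand-in class rather than matplotlib's actual Artist:

```python
import cProfile
import io
import pstats


class ToyArtist:
    """Toy stand-in for a matplotlib artist with a `stale` flag."""

    def __init__(self):
        self._stale = True

    @property
    def stale(self):
        return self._stale

    @stale.setter
    def stale(self, value):
        # In matplotlib this setter also propagates staleness to the parent,
        # which is why it shows up so often in profiles.
        self._stale = value


def fake_axes_init(n_children=5):
    """Mimic an __init__ that repeatedly toggles children's `stale`."""
    children = [ToyArtist() for _ in range(n_children)]
    for child in children:
        child.stale = True
        child.stale = False
    return children


profiler = cProfile.Profile()
profiler.enable()
for _ in range(20):  # 20 instantiations, as in the measurement above
    fake_axes_init()
profiler.disable()

buffer = io.StringIO()
# Restrict the report to entries whose name matches "stale".
pstats.Stats(profiler, stream=buffer).print_stats("stale")
print(buffer.getvalue())
```

Here the `stale` setter is recorded 20 × 5 × 2 = 200 times; the same approach pinpoints which artist's stale setter dominates in a real run.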

@tacaswell tacaswell added this to the 2.1 (next point release) milestone Jun 28, 2016
@tacaswell (Member):

Any help improving the performance would be greatly appreciated!

It is a bit odd that the stale that gets clobbered is the one on Rectangle. Can you tell which Rectangle artist that is on?

In the original implementation of stale the propagation was short-circuited, but that led to the figure auto-refresh getting stuck.

@Kojoley (Member, author) commented Jun 28, 2016

Here is the test and the profiler diagram:

import timeit

timeit.timeit(
    "Axes(fig, (0, 0, 1, 1))",
    "from matplotlib.figure import Figure;"
    "from matplotlib.axes import Axes;"
    "fig = Figure()",
    number=20,
)

[profiler call-graph diagram]

@efiring (Member) commented Jun 28, 2016

How did you generate that profiler diagram?

@Kojoley (Member, author) commented Jun 28, 2016

It was generated by the PyCharm IDE, which uses yFiles and yappi.

@Kojoley (Member, author) commented Jun 28, 2016

The bottleneck is the reset_ticks() call inside Axis.cla().

After commenting it out I got only 2 ms instead of 33 ms per Axes instantiation:

100 loops, best of 3: 2.41 msec per loop

Call graph becomes much simpler too
[simplified call-graph diagram]

@anntzer (Contributor) commented Jun 28, 2016

I think stale is a red herring, mostly there because of the overhead of using an instrumenting profiler (?). On the other hand, reset_ticks is indeed a bottleneck; see http://vmprof.com/#/207c1fb01e6cedab167a82487d425da1 for the vmprof (statistical profiler) analysis of

from matplotlib.figure import Figure; from matplotlib.axes import Axes

for _ in range(1000):
    fig = Figure()
    Axes(fig, (0, 0, 1 ,1))

(matplotlib 2.0b1)

@WeatherGod (Member):

Wow... that is a very informative graph. There seem to be several degeneracies going on here. First and foremost, I think it is obvious that we are calling cla() a bit indiscriminately. This may partly be due to multiple entry points that may require calls to cla(), but we could probably cut back on it a bit. Second, each call to cla() does a lot of work that may not be necessary. We might want to have a sentinel that indicates whether the axis is still in a cleared state, so that some calls to cla() would short-circuit.
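A sentinel like that could look roughly like this. A pure-Python sketch, not matplotlib code; all names here are illustrative:

```python
class SketchAxis:
    """Sketch of the proposed sentinel: cla() returns early when the
    axis is already in a cleared state."""

    def __init__(self):
        self._is_cleared = False
        self.reset_count = 0  # counts how often the expensive path runs
        self.cla()

    def cla(self):
        if self._is_cleared:
            return  # already cleared; skip reset_ticks() and friends
        self._expensive_reset()
        self._is_cleared = True

    def _expensive_reset(self):
        # Stand-in for reset_ticks(), spine resets, etc.
        self.ticks = []
        self.reset_count += 1

    def add_tick(self):
        self.ticks.append(object())
        self._is_cleared = False  # any mutation invalidates the sentinel
```

Redundant cla() calls from multiple entry points then become cheap no-ops; the cost is keeping the sentinel accurate everywhere the axis mutates, which would be the tricky part in real matplotlib code.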


@efiring (Member) commented Jun 30, 2016

Every possible grid and tick line is being instantiated; in practice, most of them don't get used. Lazy instantiation would solve that, at the cost of a little more complexity.

@Kojoley (Member, author) commented Jun 30, 2016

On a newly created axis we do not even have ticks, yet here we see the creation of a single tick for internal purposes. This should not be a problem, but it is, because for a single Axes we get multiple reset_ticks calls (in Axes.__init__, XAxis.cla, YAxis.cla). It is definitely some kind of recursion.
[profiler screenshot]

I have tried to figure out whether it is possible to simplify the reset_ticks method by removing that tick creation (and entirely removing the _lastNumMajorTicks and _lastNumMinorTicks variables). I have not found anything that would prevent such an optimization; the result of this refactoring would be about an 11x performance boost to Axes instantiation (my application runs 30% faster after it) and cleaner code too.
But even after that optimization the multiple reset_ticks calls will not be gone. Yes, they will be much less resource-heavy than now, but it is still not good.

As this optimization will affect multiple places and involves rewriting the get_ticks_position, get_minor_ticks, get_major_ticks, and set_tick_params methods, I'm going to set up a test environment to develop a PR.

@larsoner (Contributor):

> the result of this refactoring would be about an 11x performance boost to Axes instantiation (my application runs 30% faster after it) and cleaner code too. But even after that optimization the multiple reset_ticks calls will not be gone. Yes, they will be much less resource-heavy than now, but it is still not good.

It seems like using such changes to get a performance boost is still worthwhile, even if it's not the best solution. After all, these changes were proposed about 10 months ago, so that's 10 months during which things could have been faster...

@efiring would you be okay with such a workaround in the meantime until lazy instantiation can be implemented?

@efiring (Member) commented Apr 28, 2017

@Kojoley, any more thoughts on this? Or did you run into a roadblock?

@QuLogic mentioned this issue May 16, 2017
QuLogic added a commit to QuLogic/cartopy that referenced this issue May 23, 2017
We don't need any Spines since GeoAxes uses its own outline_patch to
draw the outline. We also don't need any Axis drawing, so override its
generation of a "proto-Tick". This is expensive [1] and it's not used to
draw any ticks anyway.

[1] matplotlib/matplotlib#6664
@QuLogic (Member) commented May 25, 2017

Somehow, Ticks are inexplicably tied to Spines. If I apply this patch to defer Tick creation until they're used:

diff --git a/lib/matplotlib/axis.py b/lib/matplotlib/axis.py
index 870c5e8..307b425 100644
--- a/lib/matplotlib/axis.py
+++ b/lib/matplotlib/axis.py
@@ -764,10 +764,8 @@ class Axis(artist.Artist):
         del self.majorTicks[:]
         del self.minorTicks[:]
 
-        self.majorTicks.extend([self._get_tick(major=True)])
-        self.minorTicks.extend([self._get_tick(major=False)])
-        self._lastNumMajorTicks = 1
-        self._lastNumMinorTicks = 1
+        self._lastNumMajorTicks = 0
+        self._lastNumMinorTicks = 0
 
     def set_tick_params(self, which='major', reset=False, **kw):
         """

then the first test that fails is clip_path_clipping, which is really odd, as I'd have expected the ticks to be the ones that break.

@tacaswell tacaswell modified the milestones: 2.1 (next point release), 2.2 (next next feature release) Oct 3, 2017
@timhoffm (Member) commented Nov 7, 2017

What is the status here? I need to create plots with some tens of axes, and this issue results in instantiation times of seconds, which is a pain in interactive GUI applications.

If there are problems with the above solutions, one could use a LazyDefaultTickList that replaces itself with the default list on first use. This basically only requires changes in reset_ticks(). In a test, I could speed up Axes creation by a factor of 3 using this lazy tick-list instantiation.

If there is interest, I can provide a pull request.
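The self-replacing list idea can be sketched as a non-data descriptor. This is a hypothetical sketch, not matplotlib's actual code; _LazyTickList, _get_tick, and the class layout are illustrative:

```python
class _LazyTickList:
    """Non-data descriptor: builds the real tick list only on first access,
    then caches it in the instance dict so later lookups are plain attribute
    reads with no overhead."""

    def __init__(self, major):
        self._major = major

    def __set_name__(self, owner, name):
        self._name = name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        ticks = [instance._get_tick(major=self._major)]
        instance.__dict__[self._name] = ticks  # shadow the descriptor
        return ticks


class Axis:
    majorTicks = _LazyTickList(major=True)
    minorTicks = _LazyTickList(major=False)

    def __init__(self):
        self.tick_cost = 0  # counts expensive Tick constructions

    def _get_tick(self, major):
        self.tick_cost += 1  # stand-in for building a real Tick artist
        return ("major" if major else "minor") + "-tick"

    def reset_ticks(self):
        # Drop any instantiated ticks; the lazy descriptor takes over again.
        self.__dict__.pop("majorTicks", None)
        self.__dict__.pop("minorTicks", None)
```

Bare Axes creation then never pays for Tick construction; the cost moves to the first real access of majorTicks/minorTicks.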

@efiring (Member) commented Nov 8, 2017

We have two overlapping PRs, #8626 and #8752, that will improve the situation quite a bit by eliminating most of the unnecessary cla calls. Based on timing in the former, that will give a speedup of a factor of 4. I hope we can get this functionality merged soon.

We would certainly be interested to see a PR with your proposal as well, which is orthogonal to the cla-elimination PRs so it should provide additional speedup in cases where ticks are not used at all.

@michalkahle:
I have run into the same problem with 384 subplots. I solved it with this hack (thanks @tacaswell) and got down from 20 minutes to 2 seconds. Not sure if it helps anybody, but I have measured that the time to produce a figure grows exponentially(!) with the number of subplots.

[plot: figure-creation time vs. number of subplots]

@efiring (Member) commented Jan 5, 2018

I'm seeing linear, not exponential, growth in the time to create multiple subplots (with mpl very close to current master):

In [1]: import matplotlib.pyplot as plt

In [2]: %timeit plt.subplots(2, 6); plt.close()
342 ms ± 4.22 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [3]: %timeit plt.subplots(2, 12); plt.close()
1.09 s ± 4.76 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [4]: %timeit plt.subplots(2, 24); plt.close()
2.2 s ± 6.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [5]: %timeit plt.subplots(4, 24); plt.close()
4.42 s ± 9.75 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [6]: %timeit plt.subplots(8, 24); plt.close()
8.98 s ± 53.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

@efiring (Member) commented Jan 5, 2018

@michalkahle, can you provide a minimal and self-contained test case that reproduces something like your curve?

@jklymak (Member) commented Jan 5, 2018

2.0.2:

In [2]:  %timeit plt.subplots(2, 24); plt.close()
1.72 s ± 17.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [2]:  %timeit plt.subplots(4, 24); plt.close()
3.6 s ± 32.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [3]:  %timeit plt.subplots(8, 24); plt.close()
10.3 s ± 746 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In 2.1.1

In [4]:  %timeit plt.subplots(2, 24); plt.close()
1.28 s ± 139 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [2]: %timeit plt.subplots(4, 24); plt.close()
2.23 s ± 32.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [3]:  %timeit plt.subplots(8, 24); plt.close()
4.61 s ± 57.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

master is the same as 2.1.1 for this test...

My guess would be that @michalkahle is using 2.0.2?

@michalkahle:
Hi, my mpl is 2.1.0.

import math
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
from time import time

s = pd.Series(np.random.randn(10).cumsum(), index=np.arange(0, 10))

def test_grid(n, s):
    nrows = int(math.sqrt(n))
    ncols = math.ceil(n / nrows)
    fig, ax = plt.subplots(
        nrows=nrows, ncols=ncols, figsize=(24, 16), sharex=True, sharey=True, squeeze=False)
    for a in range(n):
        row, col = divmod(a, ncols)
        s.plot(ax=ax[row,col])
    plt.close()

results = []
for n in range(1, 150, 10):
    print(n)
    start = time()
    test_grid(n, s)
    results.append((n, time() - start))

pd.DataFrame(results, columns=['n', 'time [s]']).set_index('n').plot()

[plot: figure-creation time vs. number of subplots]

@jklymak (Member) commented Jan 5, 2018

It's the sharex and sharey that's doing it. If I take those out:

[plot: timing without sharex/sharey]

@efiring (Member) commented Jan 5, 2018

It's not so much the sharing, it's the way Pandas is handling the sharing. Removing Pandas and leaving the sharing speeds it up greatly. Removing the sharing speeds it up even more.
[plot: many_axes_plain.png, shared vs. unshared timing]

import matplotlib.pyplot as plt
import numpy as np
from time import time


def test_grid(n, share):
    start = time()
    nrows = int(np.sqrt(n))
    ncols = int(np.ceil(n / nrows))
    kw = dict(sharex=True, sharey=True) if share else {}
    fig, ax = plt.subplots(
        nrows=nrows, ncols=ncols, figsize=(24, 16), squeeze=False, **kw)
    for i in range(nrows):
        for j in range(ncols):
            ax[i, j].plot([1,2,3])
    plt.close()
    return nrows * ncols, time() - start

nums = list(range(1, 151, 10))
times = np.zeros((len(nums), 2), dtype=float)
naxes = np.zeros((len(nums),), dtype=int)
for i, share in enumerate((True, False)):
    for j, n in enumerate(nums):
        na, duration = test_grid(n, share)
        times[j, i] = duration
        naxes[j] = na
        print("%5s   %3d   %7.2f" % (share, na, duration))

fig, ax = plt.subplots()
lines = ax.plot(naxes, times)
ax.set_ylabel('time (s)')
ax.set_xlabel('number of Axes')
ax.legend(lines, ['shared', 'unshared'])
fig.savefig('many_axes_plain.png')

@michalkahle:
Yes, you are right.
[plot: timing results]

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from time import time

array = np.random.randn(10).cumsum()
series = pd.Series(array)

def test_subplots(n, data):
    start = time()
    fig, ax = plt.subplots(nrows=1, ncols=n, sharex=True, sharey=True, squeeze=False)
    for i in range(n):
        if isinstance(data, pd.Series):
            data.plot(ax=ax[0, i])
        else:
            ax[0, i].plot(data)
    plt.close()
    return n, time() - start

@QuLogic (Member) commented Feb 5, 2018

So #8626 was applied and then reverted due to some issues with polar stuff. For 1-, 2-, and 10-square subplots, timings are 118.42ms, 374.95ms, and 3.91s, respectively. In the meantime, #10302 was merged, which drops time to 79.00ms, 284.09ms, and 2.90s, respectively. You can see that change in the benchmarks. master also includes #10327, which nets 77.55ms, 279.91ms, and 2.84s.

If we re-apply #8626, we could perhaps save another 2-4% at 77.03ms, 268.03ms, and 2.77s. But unfortunately, it's still breaking polar plots somehow.

@TomNicholas:
What is the status of this? I'm using matplotlib in a Jupyter notebook and it seems that plt.subplots() takes longer than processing GBs of data does! Is there another reason why Jupyter in particular would be very slow?

@timhoffm (Member) commented Jan 16, 2019

We got a speedup of a factor of 4 by lazily instantiating ticks (#9727 / #10302). However, the essential problems still persist. One is the tick handling, which treats each tick as a separate object. The other is the layered architecture and its initialization/update mechanism. Both are hard to change.

@efiring (Member) commented Jan 16, 2019

There is certainly more work to do on this, but it is usually not noticeable when working interactively unless you are making a plot with a large number of axes. Is this your use case, @TomNicholas? If not, I wonder whether you might be running into some other problem.

@rth (Contributor) commented Jan 22, 2019

> There is certainly more work to do on this, but it is usually not noticeable when working interactively unless you are making a plot with a large number of axes.

Another place where this might have a significant impact is animations / interactive plot updates. Blitting is a workaround, but from a user's perspective it's sometimes easier to just redraw everything, in which case axes instantiation time will be non-negligible.

@aasensio:
I'm seeing the same slow behavior when doing a bunch of imshow calls in a figure in Jupyter, and I don't know whether the issue belongs here or to Jupyter. Something as simple as this takes almost 18 s on my computer. The larger the image, the longer it takes:

import numpy as np
import matplotlib.pyplot as plt

f, ax = plt.subplots(nrows=3, ncols=3, figsize=(13, 8))
for i in range(3):
    for j in range(3):
        ax[i, j].imshow(np.random.randn(100, 100))

@Kojoley (Member, author) commented Feb 12, 2019

> Another place where this might have a significant impact is animations / interactive plot updates. Blitting is a workaround, but from a user's perspective it's sometimes easier to just redraw everything, in which case axes instantiation time will be non-negligible.

The original issue is about axes creation; redrawing is a different problem. There is an RGBA<->BGRA conversion which takes a considerable amount of time if you do hundreds of redraws per second. (I have a patch to make it faster, but it is actually three patches, one for each of GCC, Clang, and MSVC, because their optimizers and vectorizers are far from good.)

> The larger the image, the longer it takes

I tried your code and it takes 600 ms for me.
The same comment as for the case above applies here. In other words, the original issue was about the plt.subplots call of your example. Axes instantiation time does not depend on plot size, so you are experiencing some other issue.

@timhoffm (Member):

@aasensio which matplotlib version? Also please time with

%%timeit -n1 -r1 f, ax = plt.subplots(nrows=3, ncols=3, figsize=(13,8))
for i in range(3):
    for j in range(3):
        ax[i,j].imshow(np.random.randn(100,100))

and

%%timeit -n1 -r1
f, ax = plt.subplots(nrows=3, ncols=3, figsize=(13,8))
for i in range(3):
    for j in range(3):
        ax[i,j].imshow(np.random.randn(100,100))

to see which fraction of the time is covered by the subplot creation.

@aasensio:
@timhoffm My version of matplotlib is 3.0.2, running with notebook server 5.7.4. The timing results are 684 ms for the subplots creation only and 12.7 s when the imshows are added. I'm starting to suspect that it is somehow related to the server-client I/O on my system, which might be firewalled. I'll investigate that.

@dstansby (Member) commented Dec 30, 2021

I think the original issue is solved now; at least for me, the performance bottlenecks mentioned at the top are not present in the current main branch. I'm going to close this to avoid it becoming a general posting place for performance issues, and recommend that others open new, more specific issues if they notice any performance problems.

@Kojoley (Member, author) commented Dec 30, 2021

I would not say it is solved, but the situation has indeed improved: from 'very slow' it became just 'slow'. #6664 (comment) implied that there could be a 13-14x improvement, while #10302 gave 3-4x, so there should still be some bottlenecks. Also, there have been performance regressions since 2.2.5, totaling a 1.6x slowdown:

CPython   matplotlib   Axes creation
2.7       1.5.1        22 ms
2.7       1.5.3        22.1 ms
2.7       2.0.2        25.2 ms
2.7       2.1.2        17.9 ms
2.7       2.2.0        4.9 ms
2.7       2.2.2        5.95 ms
2.7       2.2.5        5.98 ms
3.8       2.2.5        4.65 ms
3.8       3.1.3        5.48 ms
3.8       3.2.2        5.78 ms
3.8       3.3.4        8.51 ms
3.8       3.5.1        7.29 ms
3.10      3.5.1        7.37 ms
[plot: Axes creation time across versions]

@jklymak (Member) commented Dec 30, 2021

@Kojoley I think the point here is that it's pretty hard to track this in a really old issue. For sure, if we think there are current bottlenecks that could be fixed, we should fix them.
