Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

matplotlib.path.contains_points is a LOT slower in 1.51 #6444

Closed
Fafa87 opened this issue May 18, 2016 · 17 comments
Closed

matplotlib.path.contains_points is a LOT slower in 1.51 #6444

Fafa87 opened this issue May 18, 2016 · 17 comments
Assignees
Milestone

Comments

@Fafa87
Copy link

Fafa87 commented May 18, 2016

Hi!

I have a code using matplotlib.path to create a mask of points inside a polygon. Lately we moved from version 1.4 to 1.51 and at first I thought that our program started to hang but it turned out that it just took about 30x more time to create masks.

Profiler showed that the problem is with path.contains_points(pts).
I've extracted the problem to a unit test and did tests 1.4 vs 1.51:
1.4 - 70 ms
1.51 - 2215 ms

import unittest
import numpy as np
import math

import matplotlib
from matplotlib.path import Path

print matplotlib.__version__

class MyTestCase(unittest.TestCase):
    def get_polygon_path(self, polygon_x, polygon_y):
        vertices = zip(polygon_x, polygon_y)
        p = Path(vertices, None, closed=True, readonly=True)
        return p

    def get_in_polygon(self, x1, x2, y1, y2, path):
        x, y = np.meshgrid(np.arange(x1, x2), np.arange(y1, y2))
        x, y = x.flatten(), y.flatten()
        pts = np.vstack((x, y)).T

        # Find points that belong to snake in minimal rectangle
        grid = path.contains_points(pts)

    def test_matplotlib(self):
        for k in range(10):
            angles = np.linspace(0, 2 * math.pi, 90)
            px = 150 + 100 * np.cos(angles)
            py = 150 + 100 * np.sin(angles)

            poly = self.get_polygon_path(px, py)
            self.get_in_polygon(50, 250, 50, 250, poly)

        self.assertEqual(True, True)


if __name__ == '__main__':
    unittest.main()

Am I doing or setting something wrong or is it indeed a problem with a new version?

Regards
Filip Mróz

@tacaswell tacaswell added this to the 2.0 (style change major release) milestone May 18, 2016
@tacaswell
Copy link
Member

attn @mdboom

Exactly which version of 1.4.x were you using?

refactored the code a bit to remove unittest overhead

import numpy as np
import math

import matplotlib
from matplotlib.path import Path

print(matplotlib.__version__)


def get_polygon_path(polygon_x, polygon_y):
    vertices = list(zip(polygon_x, polygon_y))
    p = Path(vertices, None, closed=True, readonly=True)
    return p


def get_in_polygon(x1, x2, y1, y2, path):
    x, y = np.meshgrid(np.arange(x1, x2), np.arange(y1, y2))
    x, y = x.flatten(), y.flatten()
    pts = np.vstack((x, y)).T

    # Find points that belong to snake in minimal rectangle
    path.contains_points(pts)


def test_matplotlib():
    for k in range(10):
        angles = np.linspace(0, 2 * math.pi, 90)
        px = 150 + 100 * np.cos(angles)
        py = 150 + 100 * np.sin(angles)

        poly = get_polygon_path(px, py)
        get_in_polygon(50, 250, 50, 250, poly)

I see

#  1.5.1.post1832+g76a884c

In [39]: %timeit test_matplotlib()
1 loop, best of 3: 487 ms per loop
# 1.4.3

In [4]: %timeit test_matplotlib()
10 loops, best of 3: 147 ms per loop

so definitely a slow down, but not 30x

@Fafa87
Copy link
Author

Fafa87 commented May 18, 2016

I use the 1.4.0 version.
Could you test it on that version too because on my side it is still 30x?

@tacaswell
Copy link
Member

Can you also test 1.4.3 and current master?

On Wed, May 18, 2016 at 6:38 PM Fafa87 notifications@github.com wrote:

I use the 1.4.0 version.
Could you test it on that version too because on my side it is still 30x?


You are receiving this because you commented.
Reply to this email directly or view it on GitHub
#6444 (comment)

@Fafa87
Copy link
Author

Fafa87 commented May 18, 2016

This is like the first time that I play with virtualenv but these are my results:

-- End pasted text --

1.4.0
In [2]: %timeit test_matplotlib()
10 loops, best of 3: 68.9 ms per loop

-- End pasted text --

1.4.3
In [6]: %timeit test_matplotlib()
10 loops, best of 3: 118 ms per loop

-- End pasted text --

1.5.1
In [2]: %timeit test_matplotlib()
1 loops, best of 3: 2.17 s per loop

I did not manage to build master (freetype, png failed).

@tacaswell
Copy link
Member

Python 3.5.1 |Continuum Analytics, Inc.| (default, Dec  7 2015, 11:16:01) 
Type "copyright", "credits" or "license" for more information.

IPython 4.2.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: python.el: native completion setup loaded

In [2]: 1.4.0

In [3]: %timeit test_matplotlib()
10 loops, best of 3: 49.6 ms per loop

In [4]: 
Do you really want to exit ([y]/n)? y

Process Python finished
Python 3.5.1 |Continuum Analytics, Inc.| (default, Dec  7 2015, 11:16:01) 
Type "copyright", "credits" or "license" for more information.

IPython 4.2.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: python.el: native completion setup loaded

In [2]: 1.4.3

In [3]: %timeit test_matplotlib()
10 loops, best of 3: 82.1 ms per loop

In [4]: 
Do you really want to exit ([y]/n)? 

Process Python finished
Python 3.5.1 |Continuum Analytics, Inc.| (default, Dec  7 2015, 11:16:01) 
Type "copyright", "credits" or "license" for more information.

IPython 4.2.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: 1.5.1

In [2]: %timeit test_matplotlib()
1 loop, best of 3: 268 ms per loop

In [4]: 
Do you really want to exit ([y]/n)? 

Process Python finished
Python 3.5.1 |Continuum Analytics, Inc.| (default, Dec  7 2015, 11:16:01) 
Type "copyright", "credits" or "license" for more information.

IPython 4.2.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: 1.5.1.post1958+g273da36

In [2]: python.el: native completion setup loaded

In [3]: %timeit test_matplotlib()
1 loop, best of 3: 259 ms per loop

In [4]: 

so a factor of ~5x

@QuLogic
Copy link
Member

QuLogic commented May 19, 2016

I bisected 1.4.0 to 1.4.3 (about 1.6x slower for me):

4b63f75925c14b87f8b41dffb1eeefdcecf33bb6 is the first bad commit
commit 4b63f75925c14b87f8b41dffb1eeefdcecf33bb6
Author: Michael Droettboom <mdboom@gmail.com>
Date:   Wed Nov 5 18:54:24 2014 -0500

    Skip points with NaNs in point_in_path and friends

:040000 040000 9e12d2c8fa030d66fe7bdefdefd72f3e08a79d6d e6068c4e187b6760899f6ecb823df31ccd2fe8b2 M  lib
:040000 040000 761240a9ed258ca1498003adbad54650e1e94fe8 8204b2e95bec59b6adca6738596adb67bac0565a M  src

Haven't done the 1.4.3->1.5.1 bisection yet. Master runs about the same speed as 1.5.1.

@tacaswell
Copy link
Member

Comparing point_in_path_impl between 1.4.0 and 1.5.1 (which involved a major re-write to remove CXX) it looks like the inputs changed from a double pointer to (what I assume) is a numpy array object (as it has a size method and supports double [][] indexing).

Internally there is one other change, going from a malloced int array to a vector<bool> for keeping track of if points are inside or not.

@tacaswell tacaswell mentioned this issue May 19, 2016
@Fafa87
Copy link
Author

Fafa87 commented May 19, 2016

I've noticed that you use Python 3.5.1 and I am stuck on Python 2.7.8 (should have mentioned that in the begining).
Maybe thats why for you it is just x5 but for me contains_points eats almost entire profiler cake.

@QuLogic
Copy link
Member

QuLogic commented May 19, 2016

Here are some remaining bisections for additional performance drops. Apparently, the C++ stdlib/STL is not very performant...

For me, 1.4.0 runs in ~0.059s; there is an increase to ~0.096s at:

4b63f75925c14b87f8b41dffb1eeefdcecf33bb6 is the first bad commit
commit 4b63f75925c14b87f8b41dffb1eeefdcecf33bb6
Author: Michael Droettboom <mdboom@gmail.com>
Date:   Wed Nov 5 18:54:24 2014 -0500

    Skip points with NaNs in point_in_path and friends

There is another increase to ~0.195s between dbeed94 and d8c1d0d (this is the CXX removal, I believe.) Performance varies a little up to ~0.216-0.220s around 6c33b0d, but then jumps to ~0.291s due to:

2c1bc6b057fef01c20cd1a47da9c35bd6264b140 is the first bad commit
commit 2c1bc6b057fef01c20cd1a47da9c35bd6264b140
Author: Michael Droettboom <mdboom@gmail.com>
Date:   Fri Nov 21 10:41:04 2014 -0500

    points_in_path etc. should return bool.

    Fix #3824

Finally, there is another jump to ~0.335s at:

91b51137259bc78a2d8dcf13f77188ab7fef299b is the first bad commit
commit 91b51137259bc78a2d8dcf13f77188ab7fef299b
Author: Michael Droettboom <mdboom@gmail.com>
Date:   Mon Jun 15 10:44:55 2015 -0400

    Use C++ stdlib for isfinite etc.

Using the proposed changes from #6446 (which mostly address the first commit) drops runtime to ~0.160s.

@QuLogic
Copy link
Member

QuLogic commented May 19, 2016

All these tests were run with 3.5.1, but as all the changes appear to originate in the C++ code, I don't think it really makes a difference if you are using 2.7.8.

@Fafa87
Copy link
Author

Fafa87 commented May 19, 2016

I've checked on Python 3.4.1 then I get to your x5:
3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:38:22) [MSC v.1600 32 bit (Intel)]
1.4.0 -> 1.5.1
0.3 -> 1.6

My Python2 also x64 (2.7.8 (default, Jun 30 2014, 16:08:48) [MSC v.1500 64 bit (AMD64)]).
I would really love to get to x5.

@mdboom
Copy link
Member

mdboom commented May 19, 2016

@QuLogic: Thanks for the very helpful analysis. I think 4b63f75 we have to keep, otherwise the results are just bogus. 2c1bc6b is easy enough to revert. 91b5113 also needs to be kept or another solution found, see #4525.

@QuLogic
Copy link
Member

QuLogic commented May 19, 2016

Performance varies a little up to ~0.216-0.220s around 6c33b0d,

The cause of this jump seems to be from d7cc9eb (runtime of ~0.214s), but that is a merge commit. The parents are e2061f7 and c31ffe8, which have runtimes of ~0.193s and ~0.092s, respectively. If I understood the diffs correctly, the left side is the CXX changes, and the right side contains 4b63f75, so this one would probably have been covered by #6446?

Using the new #6450, the runtime is ~0.158s, so about the same as #6446.

@mdboom
Copy link
Member

mdboom commented May 19, 2016

For me, with #6450, I get better runtimes than 1.4.3. Can you confirm that, @QuLogic?

I think #6450 is preferable to #6446, despite them being relatively the same on this benchmark because it doesn't require allocating additional memory (which probably matters as the data size gets larger).

@QuLogic
Copy link
Member

QuLogic commented May 19, 2016

For me, with #6450, I get better runtimes than 1.4.3. Can you confirm that, @QuLogic?

Oops, yes, I forgot to switch branches. With #6450, runtime is ~0.072s which is as you say, better than 1.4.3.

@QuLogic
Copy link
Member

QuLogic commented May 20, 2016

@Fafa87 Can you try out the v1.5.x branch?

@Fafa87
Copy link
Author

Fafa87 commented May 21, 2016

Building matplotlib on windows turned out not to be that easy.
I will check it when it is released.

Thanks for the quick work!

tacaswell pushed a commit to tacaswell/matplotlib that referenced this issue May 22, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants