New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

import menpo is now twice as fast #405

Merged
merged 4 commits into from Aug 27, 2014

Conversation

Projects
None yet
3 participants
@jabooth
Member

jabooth commented Jul 26, 2014

Based on a few experiments I ran this PR inlines a few expensive imports to speed up import menpo by around a factor of 2. There is no change to the API and usage of Menpo - we are simply being a little more lazy in loading code.

Master:

%%time
import menpo
CPU times: user 320 ms, sys: 64.2 ms, total: 384 ms
Wall time: 401 ms

This PR:

%%time
import menpo
CPU times: user 101 ms, sys: 39.4 ms, total: 141 ms
Wall time: 157 ms
@@ -1088,6 +1086,7 @@ def gaussian_pyramid(self, n_levels=3, downscale=2, sigma=None,
image_pyramid:
Generator yielding pyramid layers as menpo image objects.
"""
from skimage.transform import pyramid_gaussian # expensive

This comment has been minimized.

@jalabort

jalabort Jul 28, 2014

Member

Does this mean we pay a penalty here everytime we use Image.gaussian_pyramid?

This comment has been minimized.

@jabooth

jabooth Jul 28, 2014

Member

The first time the function is invoked the import penalty is paid (~170ms). Subsequent calls pay a tiny penalty (something like ~4μs, will check). This should be in the noise for Gaussian Pyramid computation...

This comment has been minimized.

@jalabort

jalabort Jul 28, 2014

Member

This is important knowledge for real-time applications using Gaussian Pyramid then...

@@ -29,6 +28,7 @@ def scipy_interpolation(pixels, points_to_sample, mode='constant', order=1):
sampled_image : ndarray
The pixel information sampled at each of the points.
"""
from scipy.ndimage import map_coordinates # expensive

This comment has been minimized.

@jalabort

jalabort Jul 28, 2014

Member

Same question here... This is highly used...

This comment has been minimized.

@jabooth

jabooth Jul 28, 2014

Member

Yeah this one is maybe more bold, although I'm pretty sure there is still a good order of magnitude difference between the import statement and any warp of non-trivial size. I'll do the timings with line profiler to check.

This comment has been minimized.

@patricksnape

patricksnape Jul 28, 2014

Contributor

This is just a lookup in the module dictionary once loaded. It should be seriously fast, I can't imagine it ever being a problem.

@patricksnape

This comment has been minimized.

Contributor

patricksnape commented Jul 28, 2014

Looks good to me as long as all tests pass +1

@jabooth

This comment has been minimized.

Member

jabooth commented Jul 28, 2014

Here's some numbers:

import menpo
takeo = menpo.io.import_builtin_asset('takeo.ppm').as_greyscale()

first time:

%lprun -f menpo.image.interpolation.scipy_interpolation takeo.rescale(0.3)
Timer unit: 1e-06 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    31         2         6318   3159.0     92.4      from scipy.ndimage import map_coordinates  # expensive
    32         2            3      1.5      0.0      sampled_pixel_values = []
    33                                               # Loop over every channel in image - we know last axis is always channels
    34                                               # Note that map_coordinates uses the opposite (dims, points) convention
    35                                               # to us so we transpose
    36         2            5      2.5      0.1      points_to_sample_t = points_to_sample.T
    37         4            8      2.0      0.1      for i in xrange(pixels.shape[-1]):
    38         2            9      4.5      0.1          sampled_pixel_values.append(map_coordinates(pixels[..., i],
    39         2            2      1.0      0.0                                                      points_to_sample_t,
    40         2            2      1.0      0.0                                                      mode=mode,
    41         2          465    232.5      6.8                                                      order=order))
    42         4            9      2.2      0.1      sampled_pixel_values = [v.reshape([-1, 1]) for v in sampled_pixel_values]
    43         2           17      8.5      0.2      return np.concatenate(sampled_pixel_values, axis=1)

subsequently:

%lprun -f menpo.image.interpolation.scipy_interpolation takeo.rescale(0.3)
Timer unit: 1e-06 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    31         2           17      8.5      2.8      from scipy.ndimage import map_coordinates  # expensive
    32         2            2      1.0      0.3      sampled_pixel_values = []
    33                                               # Loop over every channel in image - we know last axis is always channels
    34                                               # Note that map_coordinates uses the opposite (dims, points) convention
    35                                               # to us so we transpose
    36         2            2      1.0      0.3      points_to_sample_t = points_to_sample.T
    37         4            9      2.2      1.5      for i in xrange(pixels.shape[-1]):
    38         2            7      3.5      1.2          sampled_pixel_values.append(map_coordinates(pixels[..., i],
    39         2            0      0.0      0.0                                                      points_to_sample_t,
    40         2            2      1.0      0.3                                                      mode=mode,
    41         2          538    269.0     89.2                                                      order=order))
    42         4            9      2.2      1.5      sampled_pixel_values = [v.reshape([-1, 1]) for v in sampled_pixel_values]
    43         2           17      8.5      2.8      return np.concatenate(sampled_pixel_values, axis=1)

so it's 8.5 micro seconds. That is larger than I expected - thoughts?

@patricksnape

This comment has been minimized.

@jabooth

This comment has been minimized.

Member

jabooth commented Jul 28, 2014

Very interesting, I'll try that in a mo and report back

@jabooth

This comment has been minimized.

Member

jabooth commented Jul 28, 2014

first:

    32                                               global map_coordinates
    33         2            7      3.5      0.1      if map_coordinates is None:
    34         1         6993   6993.0     93.1          from scipy.ndimage import map_coordinates  # expensive
    35         2            2      1.0      0.0      sampled_pixel_values = []
    36                                               # Loop over every channel in image - we know last axis is always channels
    37                                               # Note that map_coordinates uses the opposite (dims, points) convention
    38                                               # to us so we transpose
    39         2            5      2.5      0.1      points_to_sample_t = points_to_sample.T
    40         4            7      1.8      0.1      for i in xrange(pixels.shape[-1]):
    41         2            7      3.5      0.1          sampled_pixel_values.append(map_coordinates(pixels[..., i],
    42         2            2      1.0      0.0                                                      points_to_sample_t,
    43         2            1      0.5      0.0                                                      mode=mode,
    44         2          464    232.0      6.2                                                      order=order))
    45         4            9      2.2      0.1      sampled_pixel_values = [v.reshape([-1, 1]) for v in sampled_pixel_values]
    46         2           17      8.5      0.2      return np.concatenate(sampled_pixel_values, axis=1)

second:

    32                                               global map_coordinates
    33         2            4      2.0      0.7      if map_coordinates is None:
    34                                                   from scipy.ndimage import map_coordinates  # expensive
    35         2            2      1.0      0.3      sampled_pixel_values = []
    36                                               # Loop over every channel in image - we know last axis is always channels
    37                                               # Note that map_coordinates uses the opposite (dims, points) convention
    38                                               # to us so we transpose
    39         2            4      2.0      0.7      points_to_sample_t = points_to_sample.T
    40         4            8      2.0      1.4      for i in xrange(pixels.shape[-1]):
    41         2            6      3.0      1.0          sampled_pixel_values.append(map_coordinates(pixels[..., i],
    42         2            2      1.0      0.3                                                      points_to_sample_t,
    43         2            1      0.5      0.2                                                      mode=mode,
    44         2          533    266.5     91.3                                                      order=order))
    45         4            9      2.2      1.5      sampled_pixel_values = [v.reshape([-1, 1]) for v in sampled_pixel_values]
    46         2           15      7.5      2.6      return np.concatenate(sampled_pixel_values, axis=1)

I'm happy with that overhead, thoughts?

@jabooth

This comment has been minimized.

Member

jabooth commented Jul 28, 2014

Moved towards the global style for performance critical imports

@jabooth

This comment has been minimized.

Member

jabooth commented Aug 26, 2014

has anyone got any more thoughts on this? Would be good to make a decision one way or another...

@jabooth jabooth added the in progress label Aug 27, 2014

@jabooth jabooth self-assigned this Aug 27, 2014

Merge remote-tracking branch 'upstream/master' into importopt
Conflicts:
	menpo/io/input/mesh/base.py
@jabooth

This comment has been minimized.

Member

jabooth commented Aug 27, 2014

Going to bring this in following discussions with @patricksnape. It can be tweaked later on, but is a nice win in general.

jabooth added a commit that referenced this pull request Aug 27, 2014

Merge pull request #405 from jabooth/importopt
import menpo is now twice as fast

@jabooth jabooth merged commit 3bda08d into menpo:master Aug 27, 2014

1 of 2 checks passed

continuous-integration/appveyor AppVeyor build failed
Details
continuous-integration/travis-ci The Travis CI build passed
Details

@jabooth jabooth deleted the jabooth:importopt branch Aug 27, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment