
6% speedup by using hypot ufunc #4643

Merged
merged 1 commit into astropy:master from mhvk:pr-4636 on Feb 28, 2016

mhvk commented Feb 28, 2016

This is a copy of #4636, which corrects the PEP8 issue.

mhvk added a commit that referenced this pull request Feb 28, 2016

Merge pull request #4643 from mhvk/pr-4636
6% speedup by using hypot ufunc

@mhvk mhvk merged commit f6bd9c6 into astropy:master Feb 28, 2016

0 of 2 checks passed
continuous-integration/appveyor/pr — waiting for the AppVeyor build to complete
continuous-integration/travis-ci/pr — Travis CI build in progress

@mhvk mhvk added this to the v1.2.0 milestone Feb 28, 2016

@mhvk mhvk self-assigned this Feb 28, 2016

mhvk commented Feb 28, 2016

Merged, since tests passed in #4636 and that one was based on current master.

juliantaylor commented Feb 29, 2016

Note that on Linux this will typically be significantly slower: the glibc hypot function is very accurate but also very slow.
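The accuracy difference is easy to see at the extremes: the naive expression squares its inputs first and can overflow, while hypot rescales internally. A minimal sketch:

```python
import numpy as np

x = np.array([1e200])
y = np.array([1e200])

# The naive form squares the inputs first, overflowing float64 to inf.
with np.errstate(over="ignore"):
    naive = np.sqrt(x**2 + y**2)

# np.hypot avoids the intermediate overflow and returns the correct magnitude.
accurate = np.hypot(x, y)

print(naive[0])     # inf
print(accurate[0])  # sqrt(2) * 1e200
```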

drhirsch commented Feb 29, 2016

@juliantaylor I tested this on Ubuntu 14.04 and observed the speedup of hypot over sqrt(x**2 + y**2).

drhirsch commented Feb 29, 2016

@mhvk thanks for adding the PEP8 space, I was offline for a bit.

juliantaylor commented Feb 29, 2016

That is very strange; on an Ubuntu system it should be about half as fast, as it is on mine.
Typically this is only faster on Mac.

drhirsch commented Feb 29, 2016

@juliantaylor

cat /proc/cpuinfo | grep "model name"
Intel(R) Core(TM) i7-4910MQ CPU @ 2.90GHz

uname -a
Linux wpad-pc 4.2.0-30-generic #36-Ubuntu SMP Fri Feb 26 00:58:07 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Python 3.5.1
numpy.__version__ '1.10.4'
mkl.__version__ '1.1.2'

juliantaylor commented Feb 29, 2016

The CPU should not matter much; hypot has no specializations.
What is your benchmark?

mhvk commented Feb 29, 2016

I also ran tests on my Debian system and for a 100000-element array, np.hypot is indeed slower, while for a million-element array, it wins. So, it seems that storing and retrieving the intermediate copies are what make the explicit expression slower for large numbers. Overall, the differences are not that large, so if this improves accuracy in some corner cases, I think it is still a win.

In [2]: a = np.arange(1000000)

In [3]: b = np.arange(1000000)

In [4]: %timeit np.hypot(a, b)
10 loops, best of 3: 26.8 ms per loop

In [5]: %timeit np.sqrt(a**2+b**2)
10 loops, best of 3: 27.6 ms per loop

In [6]: a =  np.arange(100000)

In [7]: b = np.arange(100000)

In [8]: %timeit np.hypot(a, b)
100 loops, best of 3: 2.54 ms per loop

In [9]: %timeit np.sqrt(a**2+b**2)
1000 loops, best of 3: 1.6 ms per loop
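For reference, the same comparison can be reproduced outside IPython with the stdlib timeit module (the sizes here are illustrative; the crossover point will depend on the libm in use and on cache behaviour):

```python
import timeit

import numpy as np

# Illustrative array sizes spanning the small/intermediate/large regimes
# discussed above; actual timings are machine- and libm-dependent.
for n in (100, 100_000, 1_000_000):
    a = np.arange(n, dtype=float)
    b = np.arange(n, dtype=float)
    t_hypot = min(timeit.repeat(lambda: np.hypot(a, b), number=10, repeat=3))
    t_naive = min(timeit.repeat(lambda: np.sqrt(a**2 + b**2), number=10, repeat=3))
    print(f"n={n}: hypot {t_hypot:.5f}s  sqrt(a**2+b**2) {t_naive:.5f}s")
```

Whichever is faster, the two expressions should agree numerically on these inputs.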
drhirsch commented Feb 29, 2016

import numpy as np
x = np.array([1., 2, 3, 4])  # float64
y = np.array([5., 6, 7, 8])
%timeit np.sqrt(x**2 + y**2)
100000 loops, best of 3: 1.95 µs per loop
%timeit np.hypot(x, y)
1000000 loops, best of 3: 413 ns per loop
drhirsch commented Feb 29, 2016

from astropy.coordinates.angle_utilities import angular_separation  # 1.1.1

%timeit angular_separation(1,2,3,4)
100000 loops, best of 3: 8.32 µs per loop

(modify angle_utilities.py angular_separation line 659 to use np.hypot)

%timeit angular_separation(1,2,3,4)
100000 loops, best of 3: 7.56 µs per loop
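The one-line change being benchmarked swaps a sqrt-of-sum-of-squares for np.hypot inside the Vincenty angular-separation formula. A standalone sketch of both variants (the function names here are illustrative, not astropy's actual code):

```python
import numpy as np

def angular_separation_sqrt(lon1, lat1, lon2, lat2):
    """Vincenty angular separation using the explicit sqrt form."""
    sdlon, cdlon = np.sin(lon2 - lon1), np.cos(lon2 - lon1)
    slat1, slat2 = np.sin(lat1), np.sin(lat2)
    clat1, clat2 = np.cos(lat1), np.cos(lat2)
    num1 = clat2 * sdlon
    num2 = clat1 * slat2 - slat1 * clat2 * cdlon
    denominator = slat1 * slat2 + clat1 * clat2 * cdlon
    return np.arctan2(np.sqrt(num1**2 + num2**2), denominator)

def angular_separation_hypot(lon1, lat1, lon2, lat2):
    """Same formula, with np.hypot replacing sqrt of the sum of squares."""
    sdlon, cdlon = np.sin(lon2 - lon1), np.cos(lon2 - lon1)
    slat1, slat2 = np.sin(lat1), np.sin(lat2)
    clat1, clat2 = np.cos(lat1), np.cos(lat2)
    num1 = clat2 * sdlon
    num2 = clat1 * slat2 - slat1 * clat2 * cdlon
    denominator = slat1 * slat2 + clat1 * clat2 * cdlon
    return np.arctan2(np.hypot(num1, num2), denominator)

# Both forms agree to machine precision on ordinary inputs (radians).
print(angular_separation_sqrt(1, 2, 3, 4))
print(angular_separation_hypot(1, 2, 3, 4))
```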
drhirsch commented Feb 29, 2016

These are from Python 3.5.1, Ipython 4.1.1.

ldd --version
ldd (Ubuntu GLIBC 2.21-0ubuntu4.1) 2.21
drhirsch commented Feb 29, 2016

@mhvk @juliantaylor
For the edge case of a.dtype == b.dtype == int64, sqrt(a**2 + b**2) can sometimes be slightly faster.

However, for the typical use case of a.dtype == b.dtype == float64, hypot(a, b) was faster than sqrt(a**2 + b**2) in the cases I tried, as in your tests above as well.

If you use a = np.arange(100000, dtype=float), that will show np.hypot is faster.
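The dtype distinction matters because np.arange(100000) produces an integer array, which np.hypot must upcast to float64 before computing. A minimal illustration:

```python
import numpy as np

a_int = np.arange(100_000)                # default integer dtype (platform dependent)
a_flt = np.arange(100_000, dtype=float)   # float64, no cast needed inside hypot

print(a_int.dtype)
print(a_flt.dtype)  # float64

# hypot is a floating-point ufunc, so integer inputs are upcast to float64 first.
print(np.hypot(a_int, a_int).dtype)  # float64
```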

mhvk commented Feb 29, 2016

@scienceopen - that's odd: I confirm np.hypot is faster also for small arrays, but for intermediate arrays I still get that it is somewhat slower, even when I change to float. Anyway, probably not worth getting to the bottom of that...

In [20]: a =  np.arange(100000.)

In [21]: b = np.arange(100000.)

In [22]: %timeit np.hypot(a, b)
100 loops, best of 3: 2.25 ms per loop

In [23]: %timeit np.sqrt(a**2+b**2)
1000 loops, best of 3: 1.86 ms per loop

In [24]: a =  np.arange(100.)

In [25]: b = np.arange(100.)

In [26]: %timeit np.hypot(a, b)
100000 loops, best of 3: 3.02 µs per loop

In [27]: %timeit np.sqrt(a**2+b**2)
100000 loops, best of 3: 4.73 µs per loop
drhirsch commented Mar 2, 2016

@mhvk looks like the breakpoint where hypot becomes slower is at a few hundred elements; scroll down to the plots at

https://github.com/scienceopen/python-performance

As noted in bench_hypot.f90, Fortran is the same speed either way, despite Fortran also using a more numerically stable algorithm for hypot().

@mhvk mhvk deleted the mhvk:pr-4636 branch Jun 15, 2016
