not helping the np.sign/np.abs/np.power #4190

Open
isaac-you opened this issue Jun 17, 2019 · 2 comments
Labels: bug, performance (performance related issue)

Comments


isaac-you commented Jun 17, 2019

from numba import njit
import numpy as np

def signedpower(x, a):
    '''
    x: the vector
    a: the power
    '''
    signedV = np.sign(x)    # sign vector: 1 -1 1 -1 ...
    absV = np.abs(x)
    powerV = np.power(absV, a)
    return signedV * powerV

signedpower_ = njit(signedpower)

v1 = np.arange(-500,500)
%timeit signedpower(v1,10)   
8.53 µs ± 28.7 ns per loop
%timeit signedpower_(v1,10)
75.6 µs ± 1.14 µs per loop

njit is actually slowing down the function.

stuartarchibald (Contributor)

NOTE: edited code to include imports and updated markdown.

stuartarchibald (Contributor)

Thanks for the report, I can reproduce. I think that what is observed is due to a number of issues, including:

  1. There's not a huge amount of data or work in the function, so dispatch will have some cost.
  2. Numba doesn't by default do multi-statement shortcut deforestation, so each of those ufunc calls creates a new temporary array. See Performance hit with local temporary variables #3980 for discussion (a manually fused workaround is sketched after this list).
  3. Anaconda's NumPy ufuncs are heavily optimised and backed by the Intel MKL VML library, which is very fast.
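
As an aside (not part of the original report): one way to sidestep the temporaries from point 2 is to write the loop explicitly inside the jitted function so the whole computation happens in a single pass. A minimal sketch, where signedpower_fused is a hypothetical name:

from numba import njit
import numpy as np

@njit
def signedpower_fused(x, a):
    # Manually fused variant (illustrative only): a single pass with no
    # temporary arrays for the intermediate sign/abs/power results.
    # Note it always produces float64 output, unlike the original.
    out = np.empty(x.shape[0], dtype=np.float64)
    for i in range(x.shape[0]):
        v = x[i]
        s = 1.0 if v > 0 else (-1.0 if v < 0 else 0.0)
        out[i] = s * float(abs(v)) ** a
    return out

Whether this actually closes the gap on this particular example would still need measuring.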

However, I'm going to mark this as a bug needing more investigation because even throwing all the optimisations at the function, including developer-only ones, NumPy is still winning by a suspicious amount:

from numba import njit
import numpy as np
from IPython import get_ipython
ipython = get_ipython()

from numba import parfor
parfor.sequential_parfor_lowering = True

@njit(error_model="numpy", parallel=True, fastmath=True)
def signedpower(x, a):
    '''
    x: the vector
    a: the power
    '''
    signedV = np.sign(x)    # sign vector: 1 -1 1 -1 ...
    absV = np.abs(x)
    powerV = np.power(absV, a)
    ret = signedV * powerV
    return ret

v1 = np.arange(-50000, 50000)
p = 10
signedpower(v1, p)

print("numpy : %s" % ipython.magic("timeit -o -q signedpower.py_func(v1, p)").best)
print("numba : %s" % ipython.magic("timeit -o -q signedpower(v1, p)").best)


signedpower.parallel_diagnostics(level=3)

gives:

numpy : 0.0029059834200052138
numba : 0.01115928922999955

If, however, the dtype of v1 is changed to float64, Numba wins by a considerable amount:

numpy : 0.012640262770000845
numba : 0.009875944289997279

The fact that this is faster than the int64 variant suggests there is perhaps something unusual going on.
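
For reference, one way (my assumption, not shown above) to produce the float64 variant is to cast the input and re-run the same benchmark:

v1 = np.arange(-50000, 50000).astype(np.float64)
signedpower(v1, p)  # warm-up call so the float64 signature is compiled before timing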

stuartarchibald added the bug and performance (performance related issue) labels on Jun 17, 2019