In this post I teach myself how to speed up iterative computations with NumPy ufuncs, using np.frompyfunc to turn a plain Python function into one.

In [1]:
import numpy as np

In [3]:
help(np.frompyfunc)

Help on built-in function frompyfunc in module numpy.core.umath:

frompyfunc(...)
    frompyfunc(func, nin, nout)
    
    Takes an arbitrary Python function and returns a Numpy ufunc.
    
    Can be used, for example, to add broadcasting to a built-in Python
    function (see Examples section).
    
    Parameters
    ----------
    func : Python function object
        An arbitrary Python function.
    nin : int
        The number of input arguments.
    nout : int
        The number of objects returned by `func`.
    
    Returns
    -------
    out : ufunc
        Returns a Numpy universal function (``ufunc``) object.
    
    Notes
    -----
    The returned ufunc always returns PyObject arrays.
    
    Examples
    --------
    Use frompyfunc to add broadcasting to the Python function ``oct``:
    
    >>> oct_array = np.frompyfunc(oct, 1, 1)
    >>> oct_array(np.array((10, 30, 100)))
    array([012, 036, 0144], dtype=object)
    >>> np.array((oct(10), oct(30), oct(100))) # for
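
To make the nin and nout parameters concrete, here is a minimal sketch that wraps the built-in divmod, which takes two arguments and returns two values. The names divmod_ufunc, values, quotients and remainders are my own, picked for illustration.

```python
import numpy as np

# divmod takes two arguments and returns a 2-tuple, so nin=2 and nout=2.
divmod_ufunc = np.frompyfunc(divmod, 2, 2)

values = np.array([7, 8, 9])

# The scalar 2 is broadcast against every element of `values`.
# With nout=2 the ufunc returns a tuple of two arrays.
quotients, remainders = divmod_ufunc(values, 2)

print(quotients)
print(remainders)
print(quotients.dtype)  # object, as the Notes section of the help text says
```

With nout greater than 1 the ufunc returns one array per output, and each of them has dtype object.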

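A simple example: add a number to every element of a 2-D array, first with an explicit double loop and then with a ufunc built by frompyfunc.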

In [36]:
def myloop(arr, number):
    # Add `number` to every element of the 2-D array `arr`, in place.
    for i in range(arr.shape[0]):
        for j in range(arr.shape[1]):
            arr[i, j] = arr[i, j] + number

In [37]:
myarr = np.zeros([100, 100])

In [38]:
%timeit myloop(myarr, 5)

100 loops, best of 3: 4.55 ms per loop


In [26]:
def myufunc(element, number):
    return element + number

In [30]:
# nin=2 (element, number), nout=1 (the sum)
loop2 = np.frompyfunc(myufunc, 2, 1)

In [32]:
%timeit loop2(myarr, 5)

1000 loops, best of 3: 2.07 ms per loop
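
Two things are worth checking before reading too much into this comparison. The explicit loop modifies the array in place, while the frompyfunc version returns a new array, and according to the help text above that new array always has dtype object. The sketch below checks both points and compares the result against NumPy's built-in addition. The names add_number, add_ufunc, via_frompyfunc, via_native and as_float are mine; nothing here is timed.

```python
import numpy as np

def add_number(element, number):
    return element + number

add_ufunc = np.frompyfunc(add_number, 2, 1)

arr = np.zeros([100, 100])

via_frompyfunc = add_ufunc(arr, 5)  # new array, dtype=object
via_native = arr + 5                # new array, dtype=float64

print(via_frompyfunc.dtype)
print(via_native.dtype)

# The input array is left untouched by both calls.
print(arr.sum())

# Cast the object array back to float to compare the values.
as_float = via_frompyfunc.astype(float)
print(np.array_equal(as_float, via_native))
```

If the result feeds into further numeric work, casting it back with astype(float) is probably needed, because operations on object arrays go through Python objects element by element.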


In [39]:
a = 5

In [40]:
%timeit a*a

The slowest run took 25.10 times longer than the fastest. This could mean that an intermediate result is being cached 
10000000 loops, best of 3: 63.7 ns per loop


In [41]:
%timeit a**2

The slowest run took 8.04 times longer than the fastest. This could mean that an intermediate result is being cached 
1000000 loops, best of 3: 246 ns per loop


In [42]:
import math

In [43]:
%timeit np.sqrt(a)

The slowest run took 33.64 times longer than the fastest. This could mean that an intermediate result is being cached 
1000000 loops, best of 3: 820 ns per loop


In [44]:
%timeit math.sqrt(5)

The slowest run took 32.11 times longer than the fastest. This could mean that an intermediate result is being cached 
10000000 loops, best of 3: 141 ns per loop
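
The %timeit magic only exists inside IPython. As a rough sketch, the same four scalar measurements can be repeated in a plain Python script with the standard timeit module. The names statements, timings and number are my own, and wrapping each expression in a lambda adds a little call overhead, so the absolute figures will not match the ones above.

```python
import math
import timeit

import numpy as np

a = 5

# How many times each statement is executed.
number = 100_000

# Each expression is wrapped in a lambda so timeit can call it.
# The lambda call itself adds some overhead to every measurement.
statements = {
    "a*a": lambda: a * a,
    "a**2": lambda: a ** 2,
    "np.sqrt(a)": lambda: np.sqrt(a),
    "math.sqrt(a)": lambda: math.sqrt(a),
}

timings = {}
for label, func in statements.items():
    total_seconds = timeit.timeit(func, number=number)
    timings[label] = total_seconds / number * 1e9  # nanoseconds per call
    print(f"{label:>12}: {timings[label]:.1f} ns per call")
```

Unlike the cells above, this uses math.sqrt(a) rather than math.sqrt(5), so both square-root calls receive the same argument.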
