## Universal Functions

Computation on NumPy arrays can be very fast, or it can be very slow. The key to making it fast is to use vectorized operations, which are generally implemented through NumPy's universal functions (ufuncs).

### The Slowness of Loops

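The default implementation of Python (known as CPython) performs some operations very slowly. This is partly a result of the dynamic, interpreted nature of the language: because types are flexible, sequences of operations cannot be compiled down to efficient machine code as they can in languages such as C or Fortran.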

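Recently there have been various attempts to address this weakness. Well-known examples are:

    PyPy project, a just-in-time compiled implementation of Python
    Cython project, which converts Python code to compilable C code
    Numba project, which converts snippets of Python code to fast LLVM bytecode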

In [2]:
# Computing the reciprocal of each value in an array: a straightforward approach might look like this:

import numpy as np
np.random.seed(0)

def compute_reciprocals(values):
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1 / values[i]
    return output

In [3]:
values = np.random.randint(1, 10, size = 5)

values

array([6, 1, 4, 4, 8])

In [4]:
compute_reciprocals(values)

array([0.16666667, 1.        , 0.25      , 0.25      , 0.125     ])

In [5]:
compute_reciprocals([1,2])

array([1. , 0.5])

This implementation feels fairly natural to someone from, say, a C or Java background. But it is too slow with large input.

    We can check this using the %timeit magic:

In [6]:
big_array = np.random.randint(1, 100, size =  1000000)

%timeit compute_reciprocals(big_array)

348 ms ± 6.47 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


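It takes several hundred milliseconds to compute these million operations and store the results. The bottleneck is not the operations themselves but the type checking and function dispatch that CPython must perform on every pass through the loop.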

## Introducing UFuncs

For many types of operations, NumPy provides a convenient interface to statically typed, compiled routines. This is known as a vectorized operation.

You can accomplish this by simply performing an operation on the array, which will then be applied to each element.

This vectorized approach is designed to push the loop into the compiled layer that underlies NumPy, leading to much faster execution.

In [7]:
# Compare the results of the following two:

print(compute_reciprocals(values))
print(1 / values)

[0.16666667 1.         0.25       0.25       0.125     ]
[0.16666667 1.         0.25       0.25       0.125     ]

In [8]:
%time (1 / big_array)

CPU times: total: 15.6 ms
Wall time: 8 ms


array([0.1       , 0.01190476, 0.04545455, ..., 0.01428571, 0.01098901,
       0.01149425])

In [9]:
%time (compute_reciprocals(big_array))

CPU times: total: 406 ms
Wall time: 506 ms


array([0.1       , 0.01190476, 0.04545455, ..., 0.01428571, 0.01098901,
       0.01149425])
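
%timeit and %time are IPython magics, so they are not available in a plain Python script. As a rough sketch, assuming the standard-library timeit module as a substitute, the same comparison might be reproduced as below. The array size, the repeat count, and the name sample are arbitrary choices made for this illustration, and absolute timings will differ from machine to machine.

```python
import timeit

import numpy as np


def compute_reciprocals(values):
    """Element-by-element reciprocal using an explicit Python loop."""
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1 / values[i]
    return output


# Smaller than big_array so the script finishes quickly (assumed size).
sample = np.random.randint(1, 100, size=100_000)

repeats = 3
loop_seconds = timeit.timeit(lambda: compute_reciprocals(sample), number=repeats) / repeats
ufunc_seconds = timeit.timeit(lambda: 1 / sample, number=repeats) / repeats

print(f"Python loop: {loop_seconds * 1e3:.2f} ms per call")
print(f"ufunc:       {ufunc_seconds * 1e3:.2f} ms per call")

# Both approaches produce the same numbers; only the speed differs.
same_result = np.allclose(compute_reciprocals(sample), 1 / sample)
print(same_result)
```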

> Vectorized operations in NumPy are implemented via ufuncs, whose main purpose is to quickly execute repeated operations on values in NumPy arrays.

In [10]:
# Ufuncs are very flexible: besides scalar-array operations, they also work between two arrays

np.arange(5) / np.arange(1, 6)

array([0.        , 0.5       , 0.66666667, 0.75      , 0.8       ])

> Ufunc operations are not limited to one-dimensional arrays; they also act on multidimensional arrays:

In [11]:
x = np.arange(9).reshape((3,3))

x, 2**x

(array([[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]]),
 array([[  1,   2,   4],
        [  8,  16,  32],
        [ 64, 128, 256]], dtype=int32))

#### Computations using vectorization are nearly always more efficient than the equivalent Python loops, especially as arrays grow in size.

### Exploring NumPy's UFuncs

There are two types of ufuncs:

    1. unary ufuncs, which operate on a single input
    2. binary ufuncs, which operate on two inputs
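
One way to see the unary/binary distinction is to inspect the nin and nout attributes that every ufunc object carries. The following is a small sketch; the choice of the four ufuncs is arbitrary.

```python
import numpy as np

# nin is the number of inputs a ufunc expects; nout is the number of outputs.
for ufunc in (np.negative, np.absolute, np.add, np.power):
    kind = "unary" if ufunc.nin == 1 else "binary"
    print(f"{ufunc.__name__:>10}: nin={ufunc.nin}, nout={ufunc.nout} ({kind})")
```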

### Array Arithmetic

NumPy's ufuncs feel very natural to use because they make use of Python's native arithmetic operators (+, -, *, /, and so on).

In [12]:
print('x =\n', x)
print('\nx + 5 =\n', x + 5)
print('\nx - 5 =\n', x - 5)
print('\nx * 5 =\n', x * 5)
print('\nx / 5 =\n', x / 5)
print('\nx // 5 =\n', x // 5)
print('\n-x =\n', -x)
print('\nx ** 5 =\n', x ** 5)
print('\nx % 5 =\n', x % 5)

x =
 [[0 1 2]
 [3 4 5]
 [6 7 8]]

x + 5 =
 [[ 5  6  7]
 [ 8  9 10]
 [11 12 13]]

x - 5 =
 [[-5 -4 -3]
 [-2 -1  0]
 [ 1  2  3]]

x * 5 =
 [[ 0  5 10]
 [15 20 25]
 [30 35 40]]

x / 5 =
 [[0.  0.2 0.4]
 [0.6 0.8 1. ]
 [1.2 1.4 1.6]]

x // 5 =
 [[0 0 0]
 [0 0 1]
 [1 1 1]]

-x =
 [[ 0 -1 -2]
 [-3 -4 -5]
 [-6 -7 -8]]

x ** 5 =
 [[    0     1    32]
 [  243  1024  3125]
 [ 7776 16807 32768]]

x % 5 =
 [[0 1 2]
 [3 4 0]
 [1 2 3]]


These operations can be strung together however you wish, and the standard order of operations is respected.

In [13]:
-(5.0 * np.arange(4) +1) **2

array([  -1.,  -36., -121., -256.])

> All of these arithmetic operations are simply convenient wrappers around specific functions built into NumPy; e.g., the + operator is a wrapper for the add function.

In [14]:
np.add(np.arange(4, 8) , 1)

array([5, 6, 7, 8])

![Arithmetic_operators_implemented_in_NumPy](../../Pictures/Arithmetic_operators_implemented_in_NumPy.jpg)
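
Because the figure may not render in every viewer, the operator-to-ufunc correspondence can also be checked directly. The sketch below pairs each arithmetic operator (through the standard-library operator module) with the NumPy function it is understood to wrap; the test array and the scalar 5 are arbitrary.

```python
import operator

import numpy as np

x = np.arange(1, 10).reshape((3, 3))

# (Python operator, equivalent NumPy ufunc) pairs for binary arithmetic
binary_pairs = [
    (operator.add, np.add),                # +
    (operator.sub, np.subtract),           # -
    (operator.mul, np.multiply),           # *
    (operator.truediv, np.divide),         # /
    (operator.floordiv, np.floor_divide),  # //
    (operator.pow, np.power),              # **
    (operator.mod, np.mod),                # %
]

all_match = all(np.array_equal(op(x, 5), ufunc(x, 5)) for op, ufunc in binary_pairs)
negation_matches = np.array_equal(-x, np.negative(x))  # unary minus
print(all_match, negation_matches)
```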

#### Absolute Value

Just as NumPy understands Python's built-in arithmetic operators, it also understands Python's built-in absolute value function:

In [15]:
x = np.array([1,2,-3,4,-5,-6.3])

abs(x) # make all values positive

array([1. , 2. , 3. , 4. , 5. , 6.3])

> The corresponding NumPy ufunc is np.absolute, which is also available under the alias np.abs:

In [16]:
np.absolute(x)

array([1. , 2. , 3. , 4. , 5. , 6.3])

In [17]:
np.abs(x)

array([1. , 2. , 3. , 4. , 5. , 6.3])

    This ufunc can also handle complex data, in which case the absolute value returns the magnitude:

In [18]:
x = np.array([3 - 4j, 4 - 3j, 2 + 0j, 0 + 1j, 1 - 1j])

np.abs(x)

array([5.        , 5.        , 2.        , 1.        , 1.41421356])
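
To make "magnitude" concrete: for a complex number a + bj it is sqrt(a**2 + b**2). A minimal check of that formula against np.abs, reusing the same values under the name z, could look like this:

```python
import numpy as np

z = np.array([3 - 4j, 4 - 3j, 2 + 0j, 0 + 1j, 1 - 1j])

# Magnitude of a + bj computed by hand from its real and imaginary parts
manual_magnitude = np.sqrt(z.real**2 + z.imag**2)
print(manual_magnitude)
print(np.allclose(np.abs(z), manual_magnitude))
```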

#### Trigonometric Functions

NumPy provides a large number of useful ufuncs, and some of the most useful for the data scientist are the trigonometric functions.

    We'll start by defining an array of angles.

In [19]:
theta = np.linspace(0, np.pi, 3)

In [20]:
print('theta = ', theta)
print('sin(theta) =', np.sin(theta))
print('cos(theta) =', np.cos(theta))
print('tan(theta) =', np.tan(theta))
print('sinh(theta) =', np.sinh(theta))
print('cosh(theta) =', np.cosh(theta))
print('tanh(theta) =', np.tanh(theta))

theta =  [0.         1.57079633 3.14159265]
sin(theta) = [0.0000000e+00 1.0000000e+00 1.2246468e-16]
cos(theta) = [ 1.000000e+00  6.123234e-17 -1.000000e+00]
tan(theta) = [ 0.00000000e+00  1.63312394e+16 -1.22464680e-16]
sinh(theta) = [ 0.          2.3012989  11.54873936]
cosh(theta) = [ 1.          2.50917848 11.59195328]
tanh(theta) = [0.         0.91715234 0.99627208]
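
Note that the values are computed to within machine precision, which is why sin(pi) and cos(pi/2) print as tiny numbers such as 1.2246468e-16 rather than exactly zero, and why tan(pi/2) is very large rather than infinite. When checking such results, a tolerance-based comparison is usually safer than exact equality. A sketch, with the tolerance 1e-12 chosen arbitrarily:

```python
import numpy as np

theta = np.linspace(0, np.pi, 3)

exact_sin = np.array([0.0, 1.0, 0.0])
exact_cos = np.array([1.0, 0.0, -1.0])

# Exact comparison is fragile for entries that are only approximately zero.
print(np.array_equal(np.sin(theta), exact_sin))

# np.allclose compares within a tolerance; the absolute tolerance (atol)
# is the part that matters when the expected value is 0.
sin_close = np.allclose(np.sin(theta), exact_sin, atol=1e-12)
cos_close = np.allclose(np.cos(theta), exact_cos, atol=1e-12)
print(sin_close, cos_close)
```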


Inverse trigonometric functions are also available:

In [21]:
x = [-1, 0, 1]

print('x    =', x)
print('arcsin(x) =', np.arcsin(x))
print('arccos(x) =', np.arccos(x))
print('arctan(x) =', np.arctan(x))

x    = [-1, 0, 1]
arcsin(x) = [-1.57079633  0.          1.57079633]
arccos(x) = [3.14159265 1.57079633 0.        ]
arctan(x) = [-0.78539816  0.          0.78539816]


#### Exponents and Logarithms

Another common type of operation available as NumPy ufuncs is the exponentials:

In [22]:
x = [1, 2, 3]

print('x    =', x)
print('e^x =', np.exp(x))
print('2^x =', np.exp2(x))
print('3^x =', np.power(3, x))

x    = [1, 2, 3]
e^x = [ 2.71828183  7.3890561  20.08553692]
2^x = [2. 4. 8.]
3^x = [ 3  9 27]


The inverses of the exponentials, the logarithms, are also available. The basic np.log gives the natural logarithm; if you prefer the base-2 or base-10 logarithm, np.log2 and np.log10 are available as well.

In [23]:
x = [1,2,4,10]

print('x =', x)
print('ln(x) =', np.log(x))
print('log2(x) =', np.log2(x))
print('log10(x) =', np.log10(x))

x = [1, 2, 4, 10]
ln(x) = [0.         0.69314718 1.38629436 2.30258509]
log2(x) = [0.         1.         2.         3.32192809]
log10(x) = [0.         0.30103    0.60205999 1.        ]
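
For bases other than e, 2, and 10, one option is the change-of-base identity log_b(x) = ln(x) / ln(b). The helper log_base below is a hypothetical name introduced only for this sketch; it is not part of NumPy.

```python
import numpy as np

x = np.array([1, 2, 4, 10])


def log_base(values, base):
    """Logarithm in an arbitrary base via the change-of-base identity.

    Hypothetical helper for illustration; not a NumPy function.
    """
    return np.log(values) / np.log(base)


# Agrees with the dedicated ufuncs for bases 2 and 10
print(np.allclose(log_base(x, 2), np.log2(x)))
print(np.allclose(log_base(x, 10), np.log10(x)))

# Base 3, for which no dedicated ufunc is used here
powers_of_three = np.array([1, 3, 9, 27])
print(log_base(powers_of_three, 3))
```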


### Specialized ufuncs

NumPy has many more ufuncs available, including hyperbolic trig functions, bitwise arithmetic, comparison operators, conversions from radians to degrees, rounding and remainders, and much more.

Another excellent source for more specialized ufuncs is the submodule scipy.special.

In [24]:
from scipy import special

In [25]:
# Gamma functions (generalized factorials) and related functions
x = [1, 5, 10]
print("gamma(x) =", special.gamma(x))
print("ln|gamma(x)| =", special.gammaln(x))
print("beta(x, 2) =", special.beta(x, 2))

gamma(x) = [1.0000e+00 2.4000e+01 3.6288e+05]
ln|gamma(x)| = [ 0.          3.17805383 12.80182748]
beta(x, 2) = [0.5        0.03333333 0.00909091]


In [26]:
# Error function (integral of Gaussian)
# its complement, and its inverse
x = np.array([0, 0.3, 0.7, 1.0])
print("erf(x) =", special.erf(x))
print("erfc(x) =", special.erfc(x))
print("erfinv(x) =", special.erfinv(x))

erf(x) = [0.         0.32862676 0.67780119 0.84270079]
erfc(x) = [1.         0.67137324 0.32219881 0.15729921]
erfinv(x) = [0.         0.27246271 0.73286908        inf]
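
The outputs above follow from a few standard identities: gamma(n) = (n - 1)! for positive integers, beta(a, b) = gamma(a) * gamma(b) / gamma(a + b), and erf(t) + erfc(t) = 1. As an independent cross-check, the sketch below verifies them with the scalar functions in Python's standard-library math module rather than scipy.special. Those functions work on one number at a time, so unlike ufuncs they need an explicit loop.

```python
import math

import numpy as np

n_values = [1, 5, 10]

# gamma(n) equals (n - 1)! for positive integers n
gamma_values = np.array([math.gamma(n) for n in n_values])
factorials = np.array([math.factorial(n - 1) for n in n_values], dtype=float)
print(np.allclose(gamma_values, factorials))

# beta(a, b) = gamma(a) * gamma(b) / gamma(a + b), here with b = 2
beta_values = np.array(
    [math.gamma(n) * math.gamma(2) / math.gamma(n + 2) for n in n_values]
)
print(beta_values)

# erfc is the complement of erf: erf(t) + erfc(t) = 1
t_values = [0, 0.3, 0.7, 1.0]
complement_ok = all(math.isclose(math.erf(t) + math.erfc(t), 1.0) for t in t_values)
print(complement_ok)
```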


## Advanced Ufunc Features

Many NumPy users make use of ufuncs without ever learning their full set of features. Here are a few of the more specialized ones.

## 1. Specifying output

For large calculations, it is sometimes useful to be able to specify the array where the result of the calculation will be stored. Rather than creating a temporary array, you can use this to write computation results directly to the memory location where you'd like them to be.

For all ufuncs, you can do this using the out argument of the function:

In [29]:
x = np.arange(5)

y = np.empty(5)

np.multiply(x, 10, out = y)

# the value of the result is stored now in y

array([ 0., 10., 20., 30., 40.])
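
A quick way to confirm that no new array is allocated is to check that the object returned by the ufunc is the very array passed as out. A minimal sketch, where the name result is introduced only for this check:

```python
import numpy as np

x = np.arange(5)
y = np.empty(5)

result = np.multiply(x, 10, out=y)

# The ufunc hands back the array given as out, not a copy of it.
print(result is y)
print(y)
```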

This can even be used with array views. For example, we can write the results of the computation to every other element of a specified array:

In [34]:
y = np.zeros(10)

np.power(2, x, out= y[::2])

y

array([ 1.,  0.,  2.,  0.,  4.,  0.,  8.,  0., 16.,  0.])
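
For comparison, the sketch below places the out form next to plain slice assignment. Both give the same values, but the slice assignment first builds a temporary array holding 2 ** x and then copies it into place. For an array this small the difference is negligible; the savings matter mainly for very large arrays. The names y_direct and y_copied are chosen for this illustration.

```python
import numpy as np

x = np.arange(5)

# Variant 1: the ufunc writes straight into every other slot of y_direct.
y_direct = np.zeros(10)
np.power(2, x, out=y_direct[::2])

# Variant 2: 2 ** x first creates a temporary 5-element array, whose
# values are then copied into every other slot of y_copied.
y_copied = np.zeros(10)
y_copied[::2] = 2 ** x

print(np.array_equal(y_direct, y_copied))
```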

## 2. Aggregates

For binary ufuncs, there are some interesting aggregates that can be computed directly from the object. For example, if we'd like to reduce an array with a particular operation, we can use the reduce method of any ufunc. A reduce repeatedly applies a given operation to the elements of an array until only a single result remains.

For example, calling reduce on the add ufunc returns the sum of all elements in the array:

In [43]:
x = np.arange(1, 6)

print(x)

np.add.reduce(x)

# For example, add.reduce() is equivalent to sum().


[1 2 3 4 5]


15

Similarly, calling reduce on the multiply ufunc results in the product of all array elements:

In [45]:
np.multiply.reduce(x)

120

In [46]:
help(np.multiply.reduce)

Help on built-in function reduce:

reduce(...) method of numpy.ufunc instance
    reduce(array, axis=0, dtype=None, out=None, keepdims=False, initial=<no value>, where=True)
    
    Reduces `array`'s dimension by one, by applying ufunc along one axis.
    
    Let :math:`array.shape = (N_0, ..., N_i, ..., N_{M-1})`.  Then
    :math:`ufunc.reduce(array, axis=i)[k_0, ..,k_{i-1}, k_{i+1}, .., k_{M-1}]` =
    the result of iterating `j` over :math:`range(N_i)`, cumulatively applying
    ufunc to each :math:`array[k_0, ..,k_{i-1}, j, k_{i+1}, .., k_{M-1}]`.
    For a one-dimensional array, reduce produces results equivalent to:
    ::
    
     r = op.identity # op = ufunc
     for i in range(len(A)):
       r = op(r, A[i])
     return r
    
    For example, add.reduce() is equivalent to sum().
    
    Parameters
    ----------
    array : array_like
        The array to act on.
    axis : None or int or tuple of ints, optional
        Axis or axes along which a reduction is performed.
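
The help text above describes the one-dimensional case as a simple loop and shows axis=0 as the default. The sketch below spells out that loop as a hypothetical helper named reduce_1d, then shows what the axis argument does on a small 2-D array. One detail worth noticing is that np.sum collapses every axis by default, while reduce collapses only axis 0 unless told otherwise.

```python
import numpy as np


def reduce_1d(ufunc, values):
    """Plain-Python rendering of the loop shown in the help text.

    Hypothetical helper for illustration; it assumes the ufunc has an
    identity element (true for np.add and np.multiply).
    """
    result = ufunc.identity
    for value in values:
        result = ufunc(result, value)
    return result


x = np.arange(1, 6)
print(reduce_1d(np.add, x), reduce_1d(np.multiply, x))

grid = np.arange(6).reshape((2, 3))
column_sums = np.add.reduce(grid)          # default axis=0
row_sums = np.add.reduce(grid, axis=1)
grand_total = np.add.reduce(grid, axis=None)
print(column_sums, row_sums, grand_total, np.sum(grid))
```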
 

If we'd like to store all the intermediate results of the computation, we can instead use accumulate:

In [55]:
np.add.accumulate(x) # x = 1,2,3,4,5

array([ 1,  3,  6, 10, 15])

In [58]:
# the same works for multiply

np.multiply.accumulate(x)

# 1*2 = 2, then 2*3 = 6, then 6*4 = 24, then 24*5 = 120

array([  1,   2,   6,  24, 120])

> Note that for these particular cases, there are dedicated NumPy functions to 
compute the results (np.sum, np.prod, np.cumsum, np.cumprod)
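
A short check of those equivalences, using the same x as above, might look like the following. The last line also illustrates that the final accumulated value is the fully reduced result.

```python
import numpy as np

x = np.arange(1, 6)

print(np.add.reduce(x) == np.sum(x))
print(np.multiply.reduce(x) == np.prod(x))
print(np.array_equal(np.add.accumulate(x), np.cumsum(x)))
print(np.array_equal(np.multiply.accumulate(x), np.cumprod(x)))

# The last accumulated value equals the reduced value.
print(np.add.accumulate(x)[-1] == np.add.reduce(x))
```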

## 3. Outer products

Finally, any binary ufunc can compute the output of all pairs of two different inputs using the outer method. This allows you, in one line, to do things like create a multiplication table:

In [59]:
x = np.arange(1, 6)

np.multiply.outer(x, x)

array([[ 1,  2,  3,  4,  5],
       [ 2,  4,  6,  8, 10],
       [ 3,  6,  9, 12, 15],
       [ 4,  8, 12, 16, 20],
       [ 5, 10, 15, 20, 25]])
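
As a sketch of what outer computes, the same table can be built by multiplying a column version of x by a row version. This relies on NumPy's broadcasting rules, which this section does not cover, so treat it as an aside. The second part shows outer on a different binary ufunc, add, chosen arbitrarily.

```python
import numpy as np

x = np.arange(1, 6)

# Same table via broadcasting: a (5, 1) column times a (5,) row.
table_broadcast = x[:, np.newaxis] * x
print(np.array_equal(np.multiply.outer(x, x), table_broadcast))

# outer works on other binary ufuncs too, e.g. an addition table.
addition_table = np.add.outer(x, x)
print(addition_table)
```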

## Ufuncs: Learning More


More information on universal functions (including the full list of available functions) can be found on the NumPy and SciPy documentation websites.