# Computation on NumPy Arrays: Universal Functions

Computation on NumPy arrays is one of the reasons NumPy is so important in Python data science. That computation can be very fast, or it can be very slow. The key to making it fast
is to use vectorized operations, generally implemented through NumPy's universal functions (ufuncs).

This section motivates the need for NumPy's ufuncs, which can be used to make repeated calculations on array elements much more efficient.

## The Slowness of Loops

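Python's default implementation (known as CPython) does some operations very slowly. This shows up most clearly when many small operations are repeated, for instance when looping over an array to operate on each element. The function below uses such a loop to compute the reciprocal of each value in an array: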



In [2]:
import numpy as np
np.random.seed(0)

def compute_reciprocals(values):
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output

values = np.random.randint(1, 10, size=5)
compute_reciprocals(values)

array([0.16666667, 1.        , 0.25      , 0.25      , 0.125     ])

In [7]:
big_array = np.random.randint(1, 100, size=1_000_000)
%timeit compute_reciprocals(big_array)

2.12 s ± 107 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


Even cell phones have processing speeds measured in giga-FLOPS, so this seems almost absurdly slow. The bottleneck here is not the operation itself
but the type checking and function dispatches that CPython must do at each cycle of the loop.

Each time the reciprocal is computed, Python first examines the object's type and does a dynamic lookup of the correct function to use for that type. If we were working in
compiled code instead, this type specification would be known before the code executes and the result could be computed much more efficiently.

## Introducing UFuncs

For many types of operations, NumPy provides a convenient interface into just this kind of statically typed, compiled routine. This is known as a vectorized operation. The vectorized approach is designed to push the loop into the compiled layer that underlies NumPy, leading to much faster execution.

In [None]:
print(compute_reciprocals(values))
print(1.0 / values)

In [None]:
%timeit (1.0 / big_array)
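Because `%timeit` is an IPython magic, the timing cells above only run inside a notebook or IPython shell. As a rough sketch, a similar comparison can be made in a plain script with the standard-library `timeit` module. The array size, the repeat count, and the helper name `reciprocals_loop` are arbitrary choices for this illustration, and absolute timings will differ from machine to machine.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
big_array = rng.integers(1, 100, size=100_000)


def reciprocals_loop(values):
    """Compute 1 / values one element at a time in a Python loop."""
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output


# Time each approach a few times and keep the best run.
loop_time = min(timeit.repeat(lambda: reciprocals_loop(big_array), number=1, repeat=3))
ufunc_time = min(timeit.repeat(lambda: 1.0 / big_array, number=1, repeat=3))

loop_result = reciprocals_loop(big_array)
ufunc_result = 1.0 / big_array

print(f"loop:  {loop_time:.4f} s")
print(f"ufunc: {ufunc_time:.6f} s")
print("same values:", np.allclose(loop_result, ufunc_result))
```

Both approaches should produce the same values; only the place where the loop runs changes.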

Vectorized operations in NumPy are implemented via ufuncs, whose main purpose is to quickly execute repeated operations on values in NumPy arrays. Ufuncs are extremely flexible: before, we saw an operation between a scalar and an array, but we can also operate between two arrays:

In [11]:
np.arange(5) / np.arange(1,6)

array([0.        , 0.5       , 0.66666667, 0.75      , 0.8       ])

Ufunc operations are not limited to one-dimensional arrays; they can also act on multi-dimensional arrays:

In [12]:
x = np.arange(9).reshape(3,3)
3 ** x

array([[   1,    3,    9],
       [  27,   81,  243],
       [ 729, 2187, 6561]], dtype=int32)

Computations using vectorization through ufuncs are nearly always more efficient
than their counterparts implemented using Python loops.

Any time you see such a loop in a Python script, you should consider whether it can be
replaced with a vectorized expression.
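As a small illustration of that kind of rewrite, the sketch below replaces an element-by-element loop with a single vectorized expression. The function names and the temperature-conversion task are made up for this example and are not part of NumPy.

```python
import numpy as np


def to_fahrenheit_loop(celsius):
    """Convert temperatures element by element in a Python loop."""
    result = np.empty(len(celsius))
    for i in range(len(celsius)):
        result[i] = celsius[i] * 9 / 5 + 32
    return result


def to_fahrenheit_vectorized(celsius):
    """Same conversion; the looping happens inside NumPy's ufuncs."""
    return celsius * 9 / 5 + 32


celsius = np.array([-40.0, 0.0, 37.0, 100.0])
loop_values = to_fahrenheit_loop(celsius)
vectorized_values = to_fahrenheit_vectorized(celsius)

print(loop_values)
print(vectorized_values)
```

The vectorized version is shorter and, for large arrays, would typically be much faster as well.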

# Exploring NumPy's Ufuncs
Ufuncs exist in two flavors: unary ufuncs, which operate on a single input, and binary ufuncs, which operate on two inputs. We'll see examples of both these types
of functions here.

## Array arithmetic

NumPy's ufuncs feel very natural to use because they make use of Python's native arithmetic operators. The standard addition, subtraction, multiplication, and division can all be used:

In [15]:
x = np.arange(4)
print("x    =", x)
print("x + 5=", x + 5)
print("x - 5=", x - 5)
print("x * 5=", x * 5)
print("x / 5=", x / 5)
print("x // 2=", x // 2)

x    = [0 1 2 3]
x + 5= [5 6 7 8]
x - 5= [-5 -4 -3 -2]
x * 5= [ 0  5 10 15]
x / 5= [0.  0.2 0.4 0.6]
x // 2= [0 0 1 1]


There is also a unary ufunc for negation, a `**` operator for exponentiation, and a `%` operator for modulus:

In [16]:
print("-x   =", -x)
print("x ** 2 = ", x ** 2)
print("x % 2 = ", x % 2)

-x   = [ 0 -1 -2 -3]
x ** 2 =  [0 1 4 9]
x % 2 =  [0 1 0 1]


In addition, these can be strung together however you wish, and the standard order of operations is respected:

In [17]:
-(0.5 * x + 1) ** 2

array([-1.  , -2.25, -4.  , -6.25])

Each of these arithmetic operations is simply a convenient wrapper around a specific function built into NumPy; for example, the `+` operator is a wrapper for the `add` function:

In [18]:
np.add(x ,2)

array([2, 3, 4, 5])

The following table lists the arithmetic operators implemented in NumPy:


| Operator | Equivalent ufunc | Description |
|----------|------------------|-------------|
| `+` | `np.add` | Addition (e.g., `1 + 1 = 2`) |
| `-` | `np.subtract` | Subtraction (e.g., `3 - 2 = 1`) |
| `-` | `np.negative` | Unary negation (e.g., `-2`) |
| `*` | `np.multiply` | Multiplication (e.g., `2 * 3 = 6`) |
| `/` | `np.divide` | Division (e.g., `3 / 2 = 1.5`) |
| `//` | `np.floor_divide` | Floor division (e.g., `3 // 2 = 1`) |
| `**` | `np.power` | Exponentiation (e.g., `2 ** 3 = 8`) |
| `%` | `np.mod` | Modulus/remainder (e.g., `9 % 4 = 1`) |
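One way to convince yourself of the correspondence in the table is to compare each operator against its named ufunc on a small array. The sketch below does that; the sample array and the `pairs` dictionary are just for illustration.

```python
import numpy as np

x = np.arange(1, 5)

# Each entry is (result from the operator, result from the named ufunc).
pairs = {
    "add": (x + 2, np.add(x, 2)),
    "subtract": (x - 2, np.subtract(x, 2)),
    "negative": (-x, np.negative(x)),
    "multiply": (x * 2, np.multiply(x, 2)),
    "divide": (x / 2, np.divide(x, 2)),
    "floor_divide": (x // 2, np.floor_divide(x, 2)),
    "power": (x ** 2, np.power(x, 2)),
    "mod": (x % 2, np.mod(x, 2)),
}

for name, (from_operator, from_ufunc) in pairs.items():
    print(f"np.{name:<13}", np.array_equal(from_operator, from_ufunc))

# A compound expression can be spelled out with ufunc calls as well.
compound = -(0.5 * x + 1) ** 2
spelled_out = np.negative(np.power(np.add(np.multiply(0.5, x), 1), 2))
print(np.array_equal(compound, spelled_out))
```

The operator form is usually easier to read; the explicit ufunc form becomes useful when you need ufunc-specific arguments or methods.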

## Absolute Value

In [20]:
x = np.array([-2, -1, 0, 1, 2])
abs(x)

array([2, 1, 0, 1, 2])

In [21]:
np.absolute(x)

array([2, 1, 0, 1, 2])
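Python's built-in `abs` and the `np.absolute` ufunc give the same result on an array, and the ufunc is also available under the shorter alias `np.abs`. As a brief sketch with made-up sample values, the ufunc also handles complex data, in which case it returns the magnitude of each element:

```python
import numpy as np

x = np.array([-2, -1, 0, 1, 2])

# The built-in abs, np.absolute, and the np.abs alias agree.
print(np.array_equal(abs(x), np.absolute(x)))
print(np.array_equal(np.abs(x), np.absolute(x)))

# For complex input, the result is the magnitude of each element.
z = np.array([3 - 4j, 4 - 3j, 2 + 0j, 0 + 1j])
magnitudes = np.abs(z)
print(magnitudes)
```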

## Trigonometric functions

NumPy provides a large number of useful ufuncs, and some of the most useful for the data scientist are the trigonometric functions.
We'll start by defining an array of angles:

In [22]:
theta = np.linspace(0, np.pi, 3)

Now we can compute some trigonometric functions on these values. The values are computed to within machine precision, which is why values that should be
zero do not always hit exactly zero.

In [23]:
print("theta    = ", theta)
print("sin(theta) = ", np.sin(theta))
print("cos(theta) = ", np.cos(theta))
print("tan(theta) = ", np.tan(theta))

theta    =  [0.         1.57079633 3.14159265]
sin(theta) =  [0.0000000e+00 1.0000000e+00 1.2246468e-16]
cos(theta) =  [ 1.000000e+00  6.123234e-17 -1.000000e+00]
tan(theta) =  [ 0.00000000e+00  1.63312394e+16 -1.22464680e-16]
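Because results such as `sin(pi)` come out as a tiny nonzero number rather than exactly zero, exact equality tests on floating-point results can be misleading. A minimal sketch of the usual workaround, comparing within a tolerance using `np.isclose` (the `expected_sin` array is ours, chosen for this example):

```python
import numpy as np

theta = np.linspace(0, np.pi, 3)
expected_sin = np.array([0.0, 1.0, 0.0])

# Exact comparison fails for sin(pi), which is about 1.2e-16 rather than 0.
exact_match = np.sin(theta) == expected_sin
# np.isclose compares within a small tolerance instead.
close_match = np.isclose(np.sin(theta), expected_sin)

print("exact:", exact_match)
print("close:", close_match)
```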


Inverse trigonometric functions are also available:

In [24]:
x = [-1, 0, 1]
print("x    =", x)
print("arcsin(x) = ", np.arcsin(x))
print('arccos(x) = ', np.arccos(x))
print('arctan(x) = ', np.arctan(x))

x    = [-1, 0, 1]
arcsin(x) =  [-1.57079633  0.          1.57079633]
arccos(x) =  [3.14159265 1.57079633 0.        ]
arctan(x) =  [-0.78539816  0.          0.78539816]


There are also some specialized versions of the exponential and logarithm functions that are useful for maintaining precision with very small input: `np.expm1` (exp(x) - 1) and `np.log1p` (log(1 + x)). When `x` is very small, these functions give more precise values than the raw `np.exp` or `np.log` would, as a comparison of the next two cells shows:

In [33]:
x = [0, 0.0000000001, 0.01, 0.1]
print('exp(x) - 1 =', np.expm1(x))
print("log(1 + x)", np.log1p(x))

exp(x) - 1 = [0.00000000e+00 1.00000000e-10 1.00501671e-02 1.05170918e-01]
log(1 + x) [0.00000000e+00 1.00000000e-10 9.95033085e-03 9.53101798e-02]


In [34]:
print('exp(x) - 1 = ', np.exp(x) -1)
print('log(1 + x)', np.log(1 + np.array(x)))

exp(x) - 1 =  [0.00000000e+00 1.00000008e-10 1.00501671e-02 1.05170918e-01]
log(1 + x) [0.00000000e+00 1.00000008e-10 9.95033085e-03 9.53101798e-02]
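To put a number on the difference, the sketch below measures the relative error of each approach at `x = 1e-10`. It assumes that the first two Taylor terms (`x + x**2/2` for exp(x) - 1 and `x - x**2/2` for log(1 + x)) are an adequate reference at this scale, since the next term is smaller by roughly another factor of `x`. The helper `relative_error` is ours.

```python
import numpy as np

x = 1e-10

# Reference values from the leading Taylor terms (an assumption of this sketch).
expm1_reference = x + x**2 / 2   # exp(x) - 1
log1p_reference = x - x**2 / 2   # log(1 + x)


def relative_error(value, reference):
    """Relative distance between a computed value and the reference."""
    return abs(value - reference) / abs(reference)


expm1_naive_err = relative_error(np.exp(x) - 1, expm1_reference)
expm1_ufunc_err = relative_error(np.expm1(x), expm1_reference)
log1p_naive_err = relative_error(np.log(1 + x), log1p_reference)
log1p_ufunc_err = relative_error(np.log1p(x), log1p_reference)

print(f"exp(x) - 1 : {expm1_naive_err:.1e}   np.expm1: {expm1_ufunc_err:.1e}")
print(f"log(1 + x) : {log1p_naive_err:.1e}   np.log1p: {log1p_ufunc_err:.1e}")
```

The naive expressions lose precision because `1 + x` and `exp(x)` are both rounded to a value near 1 before the small part is extracted.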


## Specialized ufuncs

NumPy has many more ufuncs available, including hyperbolic trig functions, bitwise arithmetic, comparison operators, conversion from radians to degrees,
rounding and remainders, and much more. A look through the NumPy documentation reveals a lot of interesting functionality.

Another excellent source for more specialized and obscure ufuncs is the submodule `scipy.special`. If you want to compute some obscure mathematical function on
your data, chances are it is implemented in `scipy.special`. There are far too many functions to list them all, but the following snippets show a couple that might come up in a statistics context:

In [35]:
from scipy import special

In [38]:
# Gamma functions (generalized factorials) and related functions
x = [1, 5, 10]
print("gamma(x) =", special.gamma(x))
print('ln|gamma(x)| =', special.gammaln(x))
print("beta(x, 2) =", special.beta(x, 2))

gamma(x) = [1.0000e+00 2.4000e+01 3.6288e+05]
ln|gamma(x)| = [ 0.          3.17805383 12.80182748]
beta(x, 2) = [0.5        0.03333333 0.00909091]
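The "generalized factorial" description can be checked directly: for a positive integer `n`, gamma(n) equals (n - 1)!. The sketch below also hints at why `gammaln` exists, using an arbitrarily chosen large argument (200) for which the gamma function itself exceeds the range of double precision. It assumes SciPy is installed.

```python
import math

import numpy as np
from scipy import special

n = np.array([1, 5, 10])

# For positive integers, gamma(n) equals (n - 1)!.
factorials = np.array([math.factorial(int(k) - 1) for k in n], dtype=float)
print(np.allclose(special.gamma(n), factorials))

# For large arguments gamma overflows, while its logarithm stays finite.
big = 200.0
gamma_big = special.gamma(big)
gammaln_big = special.gammaln(big)
print(gamma_big, gammaln_big)
```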


In [39]:
# Error function (integral of Gaussian)
x = np.array([0, 0.3, 0.7, 1.0])
print("erf(x) = ", special.erf(x))
print('erfc(x) =', special.erfc(x))
print('erfinv(x) =', special.erfinv(x))

erf(x) =  [0.         0.32862676 0.67780119 0.84270079]
erfc(x) = [1.         0.67137324 0.32219881 0.15729921]
erfinv(x) = [0.         0.27246271 0.73286908        inf]
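The three functions above are related by simple identities, which makes for a quick sanity check. A minimal sketch, assuming SciPy is installed: `erfc` is the complement of `erf`, and `erfinv` undoes `erf`.

```python
import numpy as np
from scipy import special

x = np.array([0, 0.3, 0.7, 1.0])

# erfc is the complement of erf, so the two sum to 1.
complement_ok = np.allclose(special.erf(x) + special.erfc(x), 1.0)
# erfinv is the inverse of erf, so a round trip recovers x.
round_trip = special.erfinv(special.erf(x))

print(complement_ok)
print(round_trip)
```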


## Advanced Ufunc Features

Many NumPy users make use of ufuncs without ever learning their full set of features. We'll outline a few specialized features of ufuncs here.

## Specifying output

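For large calculations, it is sometimes useful to specify the array where the result of the calculation will be stored. Rather than creating a temporary array, you can write computation results directly to the memory location where you'd like them to be. For all ufuncs, this can be done using the `out` argument of the function: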

In [43]:
x = np.arange(5)
y = np.empty(5)

np.multiply(x, 10, out=y)
print(y)

[ 0. 10. 20. 30. 40.]
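A short sketch of what `out` does with memory: the ufunc hands back the very array that was passed in, and passing the input itself as `out` updates it in place. The array `a` and the choice of `np.sqrt` are just for illustration.

```python
import numpy as np

x = np.arange(5)
y = np.empty(5)

# The ufunc returns the same array object that was passed as `out`.
result = np.multiply(x, 10, out=y)
print(result is y)

# Passing the input itself as `out` overwrites it in place.
a = np.arange(5, dtype=float)
np.sqrt(a, out=a)
print(a)
```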


This can even be used with array views. For example, we can write the results of a computation to every other element of a specified array:

In [54]:
y = np.zeros(10)
np.power(2, x, out=y[::2])
print(y)

[ 1.  0.  2.  0.  4.  0.  8.  0. 16.  0.]


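If we had instead written `y[::2] = 2 ** x`, as in the cell below, this would have created a temporary array to hold the results of `2 ** x`, followed by a second operation copying those values into the `y` array. This doesn't make much of a difference for such a small computation, but for very large arrays the
memory savings from careful use of the `out` argument can be significant.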

In [57]:
y[::2] = 2 ** x
print(y)

[ 1.  0.  2.  0.  4.  0.  8.  0. 16.  0.]
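The reason `out=y[::2]` works is that the slice is a view onto the memory of `y` rather than a copy. A minimal sketch that makes this visible with `np.shares_memory` (the variable name `target` is ours):

```python
import numpy as np

x = np.arange(5)
y = np.zeros(10)

# Slicing with a step gives a view onto y's buffer, not a copy.
target = y[::2]
print(np.shares_memory(target, y))

# Writing into the view through `out` therefore changes y itself.
np.power(2, x, out=target)
print(y)
```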


## Aggregates
For binary ufuncs, there are some interesting aggregates that can be computed
directly from the object. For example, if we'd like to reduce an array with a particular operation, we can use the `reduce` method of any ufunc. A reduce repeatedly applies a given operation to the elements of an array until only a single result remains.

For example, calling `reduce` on the `add` ufunc returns the sum of all elements in the array:

In [58]:
x = np.arange(1, 6)
np.add.reduce(x)

15

Similarly, calling `reduce` on the `multiply` ufunc results in the product of all array elements:

In [60]:
np.multiply.reduce(x)

120
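To make "repeatedly applies a given operation" concrete, the sketch below compares `np.add.reduce` with Python's own `functools.reduce`, which performs the same pairwise folding one step at a time. It also shows `reduce` on another binary ufunc and along a chosen axis of a small two-dimensional array; the `grid` array is made up for this example.

```python
import functools
import operator

import numpy as np

x = np.arange(1, 6)

# functools.reduce folds pairwise in Python: ((((1 + 2) + 3) + 4) + 5).
python_sum = functools.reduce(operator.add, x)
ufunc_sum = np.add.reduce(x)
print(python_sum, ufunc_sum)

# reduce is available on other binary ufuncs too.
largest = np.maximum.reduce(x)

# For multi-dimensional input, `axis` selects the direction to collapse.
grid = np.arange(1, 7).reshape(2, 3)
column_sums = np.add.reduce(grid, axis=0)
row_sums = np.add.reduce(grid, axis=1)
print(largest, column_sums, row_sums)
```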

If we'd like to store all the intermediate results of the computation, we can instead use `accumulate`:

In [61]:
np.add.accumulate(x)

array([ 1,  3,  6, 10, 15], dtype=int32)

In [63]:
np.multiply.accumulate(x)

array([  1,   2,   6,  24, 120], dtype=int32)
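`accumulate` is not restricted to `add` and `multiply`. As one possible use, the sketch below tracks a running maximum and minimum with `np.maximum` and `np.minimum`; the `readings` values are made up for this example.

```python
import numpy as np

readings = np.array([3, 1, 4, 1, 5, 9, 2, 6])

# Each position holds the largest (or smallest) value seen so far.
running_max = np.maximum.accumulate(readings)
running_min = np.minimum.accumulate(readings)

print(running_max)
print(running_min)

# The last accumulated value matches the full reduction.
print(running_max[-1] == np.maximum.reduce(readings))
```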

For these particular cases, there are dedicated NumPy functions (`np.sum`, `np.prod`, `np.cumsum`, and `np.cumprod`) that compute the same results:

In [64]:
np.sum(x)

15

In [65]:
np.prod(x)

120

In [66]:
np.cumsum(x)

array([ 1,  3,  6, 10, 15], dtype=int32)

In [67]:
np.cumprod(x)

array([  1,   2,   6,  24, 120], dtype=int32)

## Outer products

Finally, any ufunc can compute the output of all pairs of two different inputs using the `outer` method. This allows you, in one line, to do things like create a multiplication table:

In [68]:
x = np.arange(1, 6)

In [71]:
np.multiply.outer(x, x)

array([[ 1,  2,  3,  4,  5],
       [ 2,  4,  6,  8, 10],
       [ 3,  6,  9, 12, 15],
       [ 4,  8, 12, 16, 20],
       [ 5, 10, 15, 20, 25]])
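The `outer` method works the same way for other binary ufuncs. As a rough sketch, the code below builds an addition table with `np.add.outer` and checks that the multiplication table matches what you get by multiplying a column of `x` against a row of `x`, which relies on NumPy's broadcasting behavior.

```python
import numpy as np

x = np.arange(1, 6)

table = np.multiply.outer(x, x)

# Same result via broadcasting: x as a column times x as a row.
broadcast_table = x[:, np.newaxis] * x
print(np.array_equal(table, broadcast_table))

# outer on a different ufunc gives an addition table.
addition_table = np.add.outer(x, x)
print(addition_table)
```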