# Computation on NumPy Arrays: Universal Functions

Up until now, we have been discussing some of the basic nuts and bolts of NumPy; in the next few sections, we will dive into the reasons that NumPy is so important in the Python data science world.
Namely, it provides an easy and flexible interface to optimized computation with arrays of data.

Computation on NumPy arrays can be very fast, or it can be very slow.
The key to making it fast is to use *vectorized* operations, generally implemented through NumPy's *universal functions* (ufuncs).
This section motivates the need for NumPy's ufuncs, which can be used to make repeated calculations on array elements much more efficient.
It then introduces many of the most common and useful arithmetic ufuncs available in the NumPy package.

## The Slowness of Loops

Python's default implementation (known as CPython) does some operations very slowly.
This is in part due to the dynamic, interpreted nature of the language: the fact that types are flexible, so that sequences of operations cannot be compiled down to efficient machine code as in languages like C and Fortran.

The relative sluggishness of Python generally manifests itself in situations where many small operations are being repeated – for instance looping over arrays to operate on each element.
For example, imagine we have an array of values and we'd like to compute the reciprocal of each.
A straightforward approach might look like this:

In [52]:
import numpy as np
np.random.seed(0)

def compute_reciprocals(values):
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output

In [53]:
values = np.random.randint(1, 10, size=5)
values

array([6, 1, 4, 4, 8])

In [54]:
compute_reciprocals(values)

array([0.16666667, 1.        , 0.25      , 0.25      , 0.125     ])

This implementation probably feels fairly natural to someone from, say, a C or Java background.
But if we measure the execution time of this code for a large input, we see that this operation is very slow, perhaps surprisingly so!
We'll benchmark this with IPython's ``%timeit`` magic (discussed in [Profiling and Timing Code](01.07-Timing-and-Profiling.ipynb)):

In [56]:
big_array = np.random.randint(1, 100, size=1000000)
big_array

array([83, 29, 82, ..., 25, 44, 36])

In [59]:
%timeit compute_reciprocals(big_array)

In [58]:
%timeit 1/big_array

17.4 ms ± 1.84 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


The Python loop takes several seconds to compute these million operations and to store the result, whereas the vectorized expression ``1/big_array`` needs only milliseconds!
When even cell phones have processing speeds measured in Giga-FLOPS (i.e., billions of numerical operations per second), this seems almost absurdly slow.
It turns out that the bottleneck here is not the operations themselves, but the type-checking and function dispatches that CPython must do at each cycle of the loop.
Each time the reciprocal is computed, Python first examines the object's type and does a dynamic lookup of the correct function to use for that type.
If we were working in compiled code instead, this type specification would be known before the code executes and the result could be computed much more efficiently.

## Introducing UFuncs

For many types of operations, NumPy provides a convenient interface into just this kind of statically typed, compiled routine. This is known as a *vectorized* operation.
This can be accomplished by simply performing an operation on the array, which will then be applied to each element.
This vectorized approach is designed to push the loop into the compiled layer that underlies NumPy, leading to much faster execution.

Compare the results of the following two:

In [60]:
values

array([6, 1, 4, 4, 8])

In [62]:
print(compute_reciprocals(values))
print(1.0 / values)

[0.16666667 1.         0.25       0.25       0.125     ]
[0.16666667 1.         0.25       0.25       0.125     ]


In [63]:
1 / [6, 1, 4, 4, 8]

TypeError: unsupported operand type(s) for /: 'int' and 'list'

In [64]:
1 / np.array([6, 1, 4, 4, 8])

array([0.16666667, 1.        , 0.25      , 0.25      , 0.125     ])

Looking back at the execution time of ``1/big_array`` for our big array (about 17 ms per loop in the benchmark above), we see that it completes orders of magnitude faster than the Python loop.
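
Outside IPython, a comparable measurement can be sketched with the standard-library `timeit` module. The array size and repetition count below are arbitrary choices for illustration, and absolute timings will vary by machine; only the relative gap matters.

```python
import timeit

import numpy as np


def compute_reciprocals(values):
    """Element-by-element reciprocal using an explicit Python loop."""
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output


rng = np.random.default_rng(0)
# Smaller than the million-element array so the loop finishes quickly.
sample = rng.integers(1, 100, size=100_000)

# Average time per call over a few repetitions.
loop_time = timeit.timeit(lambda: compute_reciprocals(sample), number=3) / 3
ufunc_time = timeit.timeit(lambda: 1.0 / sample, number=3) / 3

print(f"Python loop: {loop_time * 1e3:.2f} ms per call")
print(f"Vectorized:  {ufunc_time * 1e3:.2f} ms per call")

# Both approaches produce the same values.
same_result = np.allclose(compute_reciprocals(sample), 1.0 / sample)
print("Results match:", same_result)
```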

Vectorized operations in NumPy are implemented via *ufuncs*, whose main purpose is to quickly execute repeated operations on values in NumPy arrays.
Ufuncs are extremely flexible – before we saw an operation between a scalar and an array, but we can also operate between two arrays:

In [65]:
np.arange(5) 

array([0, 1, 2, 3, 4])

In [66]:
np.arange(1, 6)

array([1, 2, 3, 4, 5])

In [67]:
np.arange(5) / np.arange(1, 6)

array([0.        , 0.5       , 0.66666667, 0.75      , 0.8       ])

And ufunc operations are not limited to one-dimensional arrays; they can act on multi-dimensional arrays as well:

In [128]:
x = np.arange(9).reshape((3, 3))
x

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [70]:
x/2

array([[0. , 0.5, 1. ],
       [1.5, 2. , 2.5],
       [3. , 3.5, 4. ]])

Computations using vectorization through ufuncs are nearly always more efficient than their counterpart implemented using Python loops, especially as the arrays grow in size.
Any time you see such a loop in a Python script, you should consider whether it can be replaced with a vectorized expression.
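
As a sketch of what such a replacement can look like, the hypothetical helper `scale_and_shift_loop` below applies `2 * v + 1` element by element, and the single expression after it does the same work through ufuncs. The function name and the formula are illustrative assumptions, not part of NumPy.

```python
import numpy as np


def scale_and_shift_loop(values):
    """Hypothetical loop-based helper: computes 2 * v + 1 for each element."""
    result = np.empty(len(values))
    for i, v in enumerate(values):
        result[i] = 2 * v + 1
    return result


data = np.arange(5)

looped = scale_and_shift_loop(data)
vectorized = 2 * data + 1  # same computation, with the loop pushed into NumPy

print(looped)
print(vectorized)
```

The two results hold the same values. The loop version returns floats because `np.empty` defaults to a floating-point dtype, while the vectorized expression keeps the integer dtype of its input.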

In [9]:
y = np.arange(9)
y

array([0, 1, 2, 3, 4, 5, 6, 7, 8])

In [10]:
sum(y)

36

In [66]:
np.sum(y)

36

In [129]:
x

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [130]:
sum(x)

array([ 9, 12, 15])

In [132]:
np.sum(x)

36

In [74]:
np.sum(x, axis = 0)

array([ 9, 12, 15])

In [76]:
np.sum(x, axis = 1)

array([ 3, 12, 21])

In [11]:
x + x

array([[ 0,  2,  4],
       [ 6,  8, 10],
       [12, 14, 16]])

In [12]:
np.add(x,x)

array([[ 0,  2,  4],
       [ 6,  8, 10],
       [12, 14, 16]])

In [13]:
x.sum(axis = 1)

array([ 3, 12, 21])

In [14]:
x.sum(axis = 0)

array([ 9, 12, 15])

## Exploring NumPy's UFuncs

Ufuncs exist in two flavors: *unary ufuncs*, which operate on a single input, and *binary ufuncs*, which operate on two inputs.
We'll see examples of both these types of functions here.
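
One way to check which flavor a given ufunc is, offered here as a small sketch: every ufunc object exposes `nin` and `nout` attributes giving its number of inputs and outputs.

```python
import numpy as np

for ufunc in (np.negative, np.absolute, np.add, np.power):
    kind = "unary" if ufunc.nin == 1 else "binary"
    print(f"{ufunc.__name__}: {ufunc.nin} input(s), {ufunc.nout} output(s), {kind}")

# Both flavors are instances of the same np.ufunc type.
print(isinstance(np.negative, np.ufunc), isinstance(np.add, np.ufunc))
```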

### Array arithmetic

NumPy's ufuncs feel very natural to use because they make use of Python's native arithmetic operators.
The standard addition, subtraction, multiplication, and division can all be used:

In [71]:
lis = [0, 1, 2, 3]
lis

[0, 1, 2, 3]

In [72]:
[i+5 for i in lis]

[5, 6, 7, 8]

In [74]:
x = np.arange(4)
x

array([0, 1, 2, 3])

In [75]:
x + 5

array([5, 6, 7, 8])

In [76]:
print("x     =", x)
print("x + 5 =", x + 5)
print("x - 5 =", x - 5)
print("x * 2 =", x * 2)
print("x / 2 =", x / 2)
print("x // 2 =", x // 2)  # floor division

x     = [0 1 2 3]
x + 5 = [5 6 7 8]
x - 5 = [-5 -4 -3 -2]
x * 2 = [0 2 4 6]
x / 2 = [0.  0.5 1.  1.5]
x // 2 = [0 0 1 1]


There is also a unary ufunc for negation, a ``**`` operator for exponentiation, and a ``%`` operator for modulus:

In [77]:
print("-x     = ", -x)
print("x ** 2 = ", x ** 2)
print("x % 2  = ", x % 2)

-x     =  [ 0 -1 -2 -3]
x ** 2 =  [0 1 4 9]
x % 2  =  [0 1 0 1]


In addition, these can be strung together however you wish, and the standard order of operations is respected:

In [78]:
-(0.5*x + 1) ** 2

array([-1.  , -2.25, -4.  , -6.25])
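
To see how Python parses that expression, here is the same computation spelled out with explicit ufunc calls, as a sketch. Note that ``**`` binds more tightly than the unary minus, so the negation is applied last.

```python
import numpy as np

x = np.arange(4)

with_operators = -(0.5 * x + 1) ** 2
# Equivalent nesting of ufunc calls: multiply, add, power, then negate.
with_ufuncs = np.negative(np.power(np.add(np.multiply(0.5, x), 1), 2))

print(with_operators)
print(with_ufuncs)
```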

Each of these arithmetic operations is simply a convenient wrapper around a specific function built into NumPy; for example, the ``+`` operator is a wrapper for the ``add`` function:

In [79]:
x + x

array([0, 2, 4, 6])

In [80]:
x + 3

array([3, 4, 5, 6])

In [81]:
np.add(x, 3)

array([3, 4, 5, 6])

In [82]:
np.add(x, 2)

array([2, 3, 4, 5])

In [83]:
np.add(x, x)

array([0, 2, 4, 6])

In [84]:
np.multiply(x, x)

array([0, 1, 4, 9])

The following table lists the arithmetic operators implemented in NumPy:

| Operator      | Equivalent ufunc    | Description                           |
|---------------|---------------------|---------------------------------------|
|``+``          |``np.add``           |Addition (e.g., ``1 + 1 = 2``)         |
|``-``          |``np.subtract``      |Subtraction (e.g., ``3 - 2 = 1``)      |
|``-``          |``np.negative``      |Unary negation (e.g., ``-2``)          |
|``*``          |``np.multiply``      |Multiplication (e.g., ``2 * 3 = 6``)   |
|``/``          |``np.divide``        |Division (e.g., ``3 / 2 = 1.5``)       |
|``//``         |``np.floor_divide``  |Floor division (e.g., ``3 // 2 = 1``)  |
|``**``         |``np.power``         |Exponentiation (e.g., ``2 ** 3 = 8``)  |
|``%``          |``np.mod``           |Modulus/remainder (e.g., ``9 % 4 = 1``)|
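
The correspondence in the table can be checked directly. The sketch below pairs each operator (through the standard-library `operator` module) with its ufunc and confirms that they agree on a small sample; the sample values are arbitrary.

```python
import operator

import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([2, 2, 3, 3])

# (symbol, Python operator, NumPy ufunc) for each binary row of the table
binary_pairs = [
    ("+", operator.add, np.add),
    ("-", operator.sub, np.subtract),
    ("*", operator.mul, np.multiply),
    ("/", operator.truediv, np.divide),
    ("//", operator.floordiv, np.floor_divide),
    ("**", operator.pow, np.power),
    ("%", operator.mod, np.mod),
]

results = {}
for symbol, op, ufunc in binary_pairs:
    results[symbol] = np.array_equal(op(a, b), ufunc(a, b))
    print(f"{symbol:>2} matches its ufunc: {results[symbol]}")

# Unary negation is the one single-input row in the table.
negation_matches = np.array_equal(operator.neg(a), np.negative(a))
print(f"unary - matches np.negative: {negation_matches}")
```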

Additionally, there are Boolean/bitwise operators; we will explore these in [Comparisons, Masks, and Boolean Logic](02.06-Boolean-Arrays-and-Masks.ipynb).

### Absolute value

Just as NumPy understands Python's built-in arithmetic operators, it also understands Python's built-in absolute value function:

In [85]:
x = np.array([-2, -1, 0, 1, 2])
abs(x)

array([2, 1, 0, 1, 2])

The corresponding NumPy ufunc is ``np.absolute``, which is also available under the alias ``np.abs``:

In [86]:
np.absolute(x)

array([2, 1, 0, 1, 2])

In [87]:
np.abs(x)

array([2, 1, 0, 1, 2])

This ufunc can also handle complex data, in which case the absolute value returns the magnitude:

In [88]:
x = np.array([3 - 4j, 4 - 3j, 2 + 0j, 0 + 1j])
np.abs(x)

array([5., 5., 2., 1.])
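
As a quick check of that statement, the sketch below compares ``np.abs`` on the same complex array with the magnitude computed by hand from the real and imaginary parts.

```python
import numpy as np

z = np.array([3 - 4j, 4 - 3j, 2 + 0j, 0 + 1j])

# Magnitude from the definition: sqrt(real**2 + imag**2)
by_hand = np.sqrt(z.real ** 2 + z.imag ** 2)

print(np.abs(z))
print(by_hand)
```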

### Trigonometric functions

NumPy provides a large number of useful ufuncs, and some of the most useful for the data scientist are the trigonometric functions.
We'll start by defining an array of angles:

In [89]:
theta = np.linspace(0, np.pi, 3)
theta

array([0.        , 1.57079633, 3.14159265])

Now we can compute some trigonometric functions on these values:

In [91]:
print("theta      = ", theta)
print("sin(theta) = ", np.sin(theta))
print("cos(theta) = ", np.cos(theta))
print("tan(theta) = ", np.tan(theta))

theta      =  [0.         1.57079633 3.14159265]
sin(theta) =  [0.0000000e+00 1.0000000e+00 1.2246468e-16]
cos(theta) =  [ 1.000000e+00  6.123234e-17 -1.000000e+00]
tan(theta) =  [ 0.00000000e+00  1.63312394e+16 -1.22464680e-16]


The values are computed to within machine precision, which is why values that should be zero do not always hit exactly zero.
Inverse trigonometric functions are also available:

In [92]:
x = [-1, 0, 1]
print("x         = ", x)
print("arcsin(x) = ", np.arcsin(x))
print("arccos(x) = ", np.arccos(x))
print("arctan(x) = ", np.arctan(x))

x         =  [-1, 0, 1]
arcsin(x) =  [-1.57079633  0.          1.57079633]
arccos(x) =  [3.14159265 1.57079633 0.        ]
arctan(x) =  [-0.78539816  0.          0.78539816]
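
Because of the rounding noted earlier for the forward trigonometric functions, exact equality tests against zero can fail for results such as `sin(pi)`. A common remedy, sketched here, is to compare with a tolerance using ``np.isclose``; the default tolerances are assumed to be adequate for this illustration.

```python
import numpy as np

theta = np.linspace(0, np.pi, 3)
sines = np.sin(theta)
expected = np.array([0.0, 1.0, 0.0])

# Exact comparison: the last entry is a tiny nonzero float, not 0.
exact_match = sines == expected
print(exact_match)

# Tolerance-based comparison treats the tiny residual as zero.
close_match = np.isclose(sines, expected)
print(close_match)
```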


### Exponents and logarithms

Another common type of operation available through NumPy ufuncs is exponentiation:

In [105]:
x = [1, 2, 3]
print("x     =", x)
print("e^x   =", np.exp(x))
print("2^x   =", np.exp2(x))
print("3^x   =", np.power(3, x))

x     = [1, 2, 3]
e^x   = [ 2.71828183  7.3890561  20.08553692]
2^x   = [2. 4. 8.]
3^x   = [ 3  9 27]


In [112]:
np.pi

3.141592653589793

The inverse of the exponentials, the logarithms, are also available.
The basic ``np.log`` gives the natural logarithm; if you prefer to compute the base-2 logarithm or the base-10 logarithm, these are available as well:

In [94]:
x = [1, 2, 4, 10]
print("x        =", x)
print("ln(x)    =", np.log(x))
print("log2(x)  =", np.log2(x))
print("log10(x) =", np.log10(x))

x        = [1, 2, 4, 10]
ln(x)    = [0.         0.69314718 1.38629436 2.30258509]
log2(x)  = [0.         1.         2.         3.32192809]
log10(x) = [0.         0.30103    0.60205999 1.        ]


There are also some specialized versions that are useful for maintaining precision with very small input:

In [95]:
x = [0, 0.001, 0.01, 0.1]
print("exp(x) - 1 =", np.expm1(x))
print("log(1 + x) =", np.log1p(x))

exp(x) - 1 = [0.         0.0010005  0.01005017 0.10517092]
log(1 + x) = [0.         0.0009995  0.00995033 0.09531018]


When ``x`` is very small, these functions give more precise values than if the raw ``np.log`` or ``np.exp`` were to be used.
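
A small sketch of the difference: for an input far below machine epsilon (the value `1e-20` is an arbitrary choice), adding 1 before taking the logarithm discards the input entirely, while ``np.log1p`` retains it. The same holds for ``np.exp(x) - 1`` versus ``np.expm1``.

```python
import numpy as np

tiny = 1e-20

naive_log = np.log(1 + tiny)    # 1 + tiny rounds to exactly 1.0
precise_log = np.log1p(tiny)    # computed without forming 1 + tiny

naive_exp = np.exp(tiny) - 1    # exp(tiny) rounds to exactly 1.0
precise_exp = np.expm1(tiny)

print("log(1 + x):", naive_log, "vs log1p(x):", precise_log)
print("exp(x) - 1:", naive_exp, "vs expm1(x):", precise_exp)
```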

## Ufuncs: Learning More

More information on universal functions (including the full list of available functions) can be found on the [NumPy](http://www.numpy.org) and [SciPy](http://www.scipy.org) documentation websites.

Recall that you can also access information directly from within IPython by importing the packages and using IPython's tab-completion and help (``?``) functionality, as described in [Help and Documentation in IPython](01.01-Help-And-Documentation.ipynb).

In [96]:
x

[0, 0.001, 0.01, 0.1]

In [99]:
x1 = np.array([4,5,6,8,9,10])

In [100]:
x1

array([ 4,  5,  6,  8,  9, 10])

In [101]:
x2 = np.array([8,7,3,5,6,1])
x2

array([8, 7, 3, 5, 6, 1])

In [102]:
np.greater(x1,x2)

array([False, False,  True,  True,  True,  True])

In [103]:
np.less(x1,x2)

array([ True,  True, False, False, False, False])

In [104]:
np.equal(x1,x2)

array([False, False, False, False, False, False])

Logical ufuncs are also available: `np.logical_and`, `np.logical_or`, and `np.logical_not`.
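
As a brief sketch of how these logical ufuncs combine with the comparison ufuncs shown above, the example below builds Boolean masks from two small arrays; the threshold of 5 is an arbitrary choice.

```python
import numpy as np

a = np.array([4, 5, 6, 8, 9, 10])
b = np.array([8, 7, 3, 5, 6, 1])

a_large = np.greater(a, 5)  # elementwise a > 5
b_large = np.greater(b, 5)  # elementwise b > 5

both = np.logical_and(a_large, b_large)
either = np.logical_or(a_large, b_large)
a_small = np.logical_not(a_large)

print("both > 5:  ", both)
print("either > 5:", either)
print("a not > 5: ", a_small)
```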

In [113]:
import numpy as np


array1 = np.array([[1, 2], [3, 4]])
array2 = np.array([[5, 6], [7, 8]])

In [114]:

addition_result = array1 + array2
addition_result

array([[ 6,  8],
       [10, 12]])

In [115]:
division_result = array1 / array2
division_result

array([[0.2       , 0.33333333],
       [0.42857143, 0.5       ]])

In [116]:
x = np.array([[1, 2], [3, 4]] )

y = np.array([[2, 2], [2, 2]])

print(np.greater(x,y))
print(np.less(x,y))

[[False False]
 [ True  True]]
[[ True False]
 [False False]]


In [117]:
arr1 = np.array([1, 2, 3]) 
arr2 = np.array([4, 5, 6])

print(np.add(arr1,arr2))
print(np.multiply(arr1,arr2))
print(np.sqrt(arr1))

[5 7 9]
[ 4 10 18]
[1.         1.41421356 1.73205081]


In [119]:
array1 = np.array([range(1, 4)])
array2 = np.array([range(4, 7)])
array_add = array1 + array2
array_add

array([[5, 7, 9]])

In [120]:
x1 = np.array([1,2,3])
np.sqrt(x1)

array([1.        , 1.41421356, 1.73205081])

In [121]:
arr1 = np.array([[1,2],[3,4]])
arr2 = np.array([[2,2],[2,2]])
np.greater(arr1,arr2)
np.logical_and(arr1 > 2, arr2 > 2)

array([[False, False],
       [False, False]])

In [127]:
import numpy as np

# Create two 1D arrays
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# Perform element-wise operations
addition = arr1 + arr2
multiplication = arr1 * arr2

# Compute the square root
sqrt_arr1 = np.sqrt(arr1)

# Create a 2D array
arr_2d = np.array([[1, 2], [3, 4]])

# Multiply by a scalar
scaled_arr = arr_2d * 3

# Compute the exponential
exp_arr = np.exp(arr_2d)

# Print results
print("Addition:", addition, '\n')
print("Multiplication:", multiplication, '\n')
print("Square Root:", sqrt_arr1, '\n')
print("Scaled 2D Array:\n", scaled_arr, '\n')
print("Exponential 2D Array:\n", exp_arr)

Addition: [5 7 9] 

Multiplication: [ 4 10 18] 

Square Root: [1.         1.41421356 1.73205081] 

Scaled 2D Array:
 [[ 3  6]
 [ 9 12]] 

Exponential 2D Array:
 [[ 2.71828183  7.3890561 ]
 [20.08553692 54.59815003]]


## __Coding Exercises__

#### __1. Basic Exercise__

1. Create two 1D arrays [1, 2, 3] and [4, 5, 6].

    * Perform element-wise addition and multiplication.

    * Compute the square root of the first array.

2. Create a 2D array [[1, 2], [3, 4]].

    * Multiply the array by a scalar value of 3.

    * Compute the exponential of each element.

#### __2. Intermediate Exercise__

1. Create two 2D arrays [[1, 2], [3, 4]] and [[5, 6], [7, 8]].

    * Perform element-wise addition and division.

    * Compute the natural logarithm of the first array.

2. Create a 1D array [0, np.pi/4, np.pi/2].

    * Compute the sine and cosine of each angle.

#### __3. Advanced Exercise__

1. Create two 2D arrays [[1, 2], [3, 4]] and [[2, 2], [2, 2]].

    * Perform element-wise comparison (greater than).

    * Use logical ufuncs to find elements where both arrays are greater than 2.

2. Create a 2D array [[1, 2], [3, 4]].

    * Normalize the array by subtracting the mean and dividing by the standard deviation.

## __Solutions to Exercises__

#### __1. Basic Solution__

In [1]:
import numpy as np

# Basic Exercise 1
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
print("Addition:", np.add(arr1, arr2))  # Output: [5, 7, 9]
print("Multiplication:", np.multiply(arr1, arr2))  # Output: [4, 10, 18]
print("Square root of arr1:", np.sqrt(arr1))  # Output: [1., 1.414, 1.732]

# Basic Exercise 2
arr_2d = np.array([[1, 2], [3, 4]])
print("Multiplied by scalar:\n", np.multiply(arr_2d, 3))  # Output: [[3, 6], [9, 12]]
print("Exponential:\n", np.exp(arr_2d))  # Output: [[2.718, 7.389], [20.085, 54.598]]

Addition: [5 7 9]
Multiplication: [ 4 10 18]
Square root of arr1: [1.         1.41421356 1.73205081]
Multiplied by scalar:
 [[ 3  6]
 [ 9 12]]
Exponential:
 [[ 2.71828183  7.3890561 ]
 [20.08553692 54.59815003]]


#### __2. Intermediate Solution__

In [2]:
# Intermediate Exercise 1
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])
print("Addition:\n", np.add(arr1, arr2))  # Output: [[6, 8], [10, 12]]
print("Division:\n", np.divide(arr1, arr2))  # Output: [[0.2, 0.333], [0.428, 0.5]]
print("Natural logarithm:\n", np.log(arr1))  # Output: [[0., 0.693], [1.098, 1.386]]

# Intermediate Exercise 2
angles = np.array([0, np.pi/4, np.pi/2])
print("Sine:", np.sin(angles))  # Output: [0., 0.707, 1.]
print("Cosine:", np.cos(angles))  # Output: [1., 0.707, 0.]

Addition:
 [[ 6  8]
 [10 12]]
Division:
 [[0.2        0.33333333]
 [0.42857143 0.5       ]]
Natural logarithm:
 [[0.         0.69314718]
 [1.09861229 1.38629436]]
Sine: [0.         0.70710678 1.        ]
Cosine: [1.00000000e+00 7.07106781e-01 6.12323400e-17]


#### __3. Advanced Solution__

In [3]:
# Advanced Exercise 1
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[2, 2], [2, 2]])
print("Greater than:\n", np.greater(arr1, arr2))  # Output: [[False, False], [True, True]]
print("Logical AND:\n", np.logical_and(arr1 > 2, arr2 > 2))  # Output: [[False, False], [False, False]]

# Advanced Exercise 2
arr_2d = np.array([[1, 2], [3, 4]])
mean = np.mean(arr_2d)
std = np.std(arr_2d)
print("Normalized array:\n", (arr_2d - mean) / std)
# Output: [[-1.342, -0.447], [0.447, 1.342]]

Greater than:
 [[False False]
 [ True  True]]
Logical AND:
 [[False False]
 [False False]]
Normalized array:
 [[-1.34164079 -0.4472136 ]
 [ 0.4472136   1.34164079]]
