# Performing Array Calculations in NumPy

From: K.A.

Performing calculations on arrays in the NumPy library can be very fast or very slow. The key to making them fast is to use vectorized operations, usually implemented through NumPy's universal functions (ufuncs). NumPy provides basic methods for manipulating large arrays and matrices.
To get started:

1. You need to import the library into Python:




In [1]:
import numpy as np

2. Create an array:

In [2]:
x = np.array([1,2,3])

## Universal functions
For many types of operations, the NumPy library provides a convenient interface to statically typed, compiled routines. This is known as a vectorized operation: you perform an operation on an array, and it is applied to each of the array's elements. The vectorized approach is designed to push the loop into the compiled layer underlying the NumPy library, which provides much better performance.
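
To make the contrast concrete, the sketch below computes reciprocals twice: once with an explicit Python loop and once with a single vectorized expression. The helper name `reciprocals_loop` and the sample values are hypothetical and exist only for this illustration. Both approaches should give the same values, but the vectorized form runs its loop in compiled code.

```python
import numpy as np

def reciprocals_loop(values):
    """Compute 1/v for each element with an explicit Python loop (slow path)."""
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output

values = np.array([1, 2, 4, 5, 8])

looped = reciprocals_loop(values)   # the loop runs in the Python interpreter
vectorized = 1.0 / values           # the loop runs inside NumPy's compiled code

print(looped)
print(vectorized)
print(np.allclose(looped, vectorized))
```

On an array this small the difference is negligible; it tends to become visible only once the array holds many thousands of elements.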

There are two kinds of universal functions: unary universal functions, with one argument, and binary functions, with two arguments. We will look at examples of both types of functions.

### Arithmetic functions over arrays
The universal functions in the NumPy library are very easy to use because they use the native Python arithmetic operators. You can perform the usual addition, subtraction, multiplication, and division:

In [3]:
x = np.array([1,2,3])     
print("x     =", x)       
print("x + 5 =", x + 5)       
print("x - 5 =", x - 5)       
print("x * 2 =", x * 2)       
print("x / 2 =", x / 2)       
print("x // 2 =", x // 2)  # floor division (rounds down)

x     = [1 2 3]
x + 5 = [6 7 8]
x - 5 = [-4 -3 -2]
x * 2 = [2 4 6]
x / 2 = [0.5 1.  1.5]
x // 2 = [0 1 1]


There is also a unary universal function for negation, the ** operator for exponentiation, and the % operator for the modulus (the remainder of a division):

In [4]:
x = np.array([1,2,3])
print("-x     = ", -x)       
print("x ** 2 = ", x ** 2)       
print("x % 2  = ", x % 2)

-x     =  [-1 -2 -3]
x ** 2 =  [1 4 9]
x % 2  =  [1 0 1]


In addition, these operations can be combined in any way, and the standard order of operations is respected:

In [5]:
x = np.array([1,2,3])
print(-(0.5*x + 1) ** 2)

[-2.25 -4.   -6.25]
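
Each of these operators is a thin wrapper around a named NumPy ufunc, which can be handy when a function object is needed instead of an operator. The following is a minimal sketch of that correspondence (the variable name `v` is illustrative only). The `nin` attribute reports how many inputs a ufunc takes, which is one way to tell unary and binary ufuncs apart.

```python
import numpy as np

v = np.array([1, 2, 3])

# Each operator and its named ufunc equivalent give identical results.
print(np.array_equal(v + 5, np.add(v, 5)))
print(np.array_equal(v - 5, np.subtract(v, 5)))
print(np.array_equal(-v, np.negative(v)))
print(np.array_equal(v * 2, np.multiply(v, 2)))
print(np.array_equal(v / 2, np.divide(v, 2)))
print(np.array_equal(v // 2, np.floor_divide(v, 2)))
print(np.array_equal(v ** 2, np.power(v, 2)))
print(np.array_equal(v % 2, np.mod(v, 2)))

# nin is the number of input arguments a ufunc accepts.
print(np.negative.nin, np.add.nin)
```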


### Example of using arithmetic functions

Create two 2D arrays:

In [8]:
a = np.array([[2,5,1],[3,7,5],[1,9,9]])
b = np.array([[4,1,1],[3,9,4],[3,2,5]])

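We apply the standard arithmetic operators to the two arrays a and b. Each operation works element by element, so a * b multiplies matching entries and is not a matrix product: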

In [9]:
print("a + b = \n", a + b)       
print("a - b = \n", a - b)       
print("a * b = \n", a * b)       
print("a / b = \n", a / b)

a + b = 
 [[ 6  6  2]
 [ 6 16  9]
 [ 4 11 14]]
a - b = 
 [[-2  4  0]
 [ 0 -2  1]
 [-2  7  4]]
a * b = 
 [[ 8  5  1]
 [ 9 63 20]
 [ 3 18 45]]
a / b = 
 [[0.5        5.         1.        ]
 [1.         0.77777778 1.25      ]
 [0.33333333 4.5        1.8       ]]
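
Because the word "matrices" can suggest linear-algebra multiplication, it may help to see the difference directly. The sketch below reuses the same `a` and `b` and compares `*` with the `@` operator, which NumPy uses for the matrix product. The `@` operator is not part of the original example and is shown only for contrast.

```python
import numpy as np

a = np.array([[2, 5, 1], [3, 7, 5], [1, 9, 9]])
b = np.array([[4, 1, 1], [3, 9, 4], [3, 2, 5]])

elementwise = a * b      # multiplies matching entries
matrix_product = a @ b   # row-by-column matrix multiplication

print("a * b = \n", elementwise)
print("a @ b = \n", matrix_product)
```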


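### **Absolute value**

NumPy also understands Python's built-in absolute value function, which is applied to each element of the array:
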
In [10]:
x = np.array([-2, -1, 0, 1, 2])        
abs(x)

array([2, 1, 0, 1, 2])

This universal function can also handle complex values by returning their modulus:

In [11]:
x = np.array([3 - 4j, 4 - 3j, 2 + 0j, 0 + 1j])
np.abs(x)

array([5., 5., 2., 1.])

### **Trigonometric functions**
The NumPy library provides many universal functions. Some of the most important are the trigonometric functions:

In [12]:
theta = np.linspace(0, np.pi, 3)
print("theta      = ", theta)        
print("sin(theta) = ", np.sin(theta))        
print("cos(theta) = ", np.cos(theta))        
print("tan(theta) = ", np.tan(theta))

theta      =  [0.         1.57079633 3.14159265]
sin(theta) =  [0.0000000e+00 1.0000000e+00 1.2246468e-16]
cos(theta) =  [ 1.000000e+00  6.123234e-17 -1.000000e+00]
tan(theta) =  [ 0.00000000e+00  1.63312394e+16 -1.22464680e-16]
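
Values such as 1.2246468e-16 and 6.123234e-17 in this output are zeros perturbed by floating-point rounding, and the very large tangent at pi/2 comes from dividing by a cosine that is almost, but not exactly, zero. When checking results like these, a tolerance-based comparison is usually safer than exact equality. A small sketch using `np.isclose` with its default tolerances:

```python
import numpy as np

theta = np.linspace(0, np.pi, 3)
expected = [0.0, 1.0, 0.0]

# Exact comparison: the last entry differs because sin(pi) is only approximately zero.
print(np.sin(theta) == expected)

# np.isclose compares within a small absolute and relative tolerance.
print(np.isclose(np.sin(theta), expected))
```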


Inverse trigonometric functions are also available for use:

In [13]:
x = [-1, 0, 1]        
print("x         = ", x)        
print("arcsin(x) = ", np.arcsin(x))        
print("arccos(x) = ", np.arccos(x))        
print("arctan(x) = ", np.arctan(x))

x         =  [-1, 0, 1]
arcsin(x) =  [-1.57079633  0.          1.57079633]
arccos(x) =  [3.14159265 1.57079633 0.        ]
arctan(x) =  [-0.78539816  0.          0.78539816]


### **Exponential functions and logarithms**
Exponential functions are another common type of operation available in NumPy:

In [14]:
x = [1, 2, 3]        
print("x     =", x)        
print("e^x   =", np.exp(x))        
print("2^x   =", np.exp2(x))        
print("3^x   =", np.power(3, x))

x     = [1, 2, 3]
e^x   = [ 2.71828183  7.3890561  20.08553692]
2^x   = [2. 4. 8.]
3^x   = [ 3  9 27]


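The inverse operations, logarithms, are also available. The np.log function gives the natural logarithm, while np.log2 and np.log10 give the base-2 and base-10 logarithms: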

In [15]:
x = [1, 2, 4, 10]        
print("x        =", x)        
print("ln(x)    =", np.log(x))        
print("log2(x)  =", np.log2(x))        
print("log10(x) =", np.log10(x))

x        = [1, 2, 4, 10]
ln(x)    = [0.         0.69314718 1.38629436 2.30258509]
log2(x)  = [0.         1.         2.         3.32192809]
log10(x) = [0.         0.30103    0.60205999 1.        ]


There are also specialized versions of these functions that maintain precision for very small input values:

In [16]:
x = [0, 0.001, 0.01, 0.1]        
print("exp(x) - 1 =", np.expm1(x))        
print("log(1 + x) =", np.log1p(x))

exp(x) - 1 = [0.         0.0010005  0.01005017 0.10517092]
log(1 + x) = [0.         0.0009995  0.00995033 0.09531018]


For very small values of the elements of the vector x, these functions return much more accurate results than the usual functions np.log and np.exp.
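
The loss of accuracy is easiest to see with a single tiny input. The sketch below is illustrative only (the names `tiny`, `naive`, and `accurate` are not from the original). It uses the fact that for very small x, exp(x) - 1 is approximately x, so x itself serves as a reference value.

```python
import numpy as np

tiny = 1e-10

naive = np.exp(tiny) - 1     # the subtraction cancels most significant digits
accurate = np.expm1(tiny)    # computed without forming exp(tiny) first

# For tiny inputs exp(x) - 1 = x + x**2/2 + ..., so tiny is a good reference.
print("naive   :", naive)
print("accurate:", accurate)
print("naive relative error   :", abs(naive - tiny) / tiny)
print("accurate relative error:", abs(accurate - tiny) / tiny)
```

The naive version loses precision because exp(tiny) is first rounded to a double very close to 1, and the subtraction then exposes that rounding error.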

## Advanced features of universal functions


### Specifying an array to output the result
For large calculations, it is convenient to specify an array in which to store the result of the calculation. Instead of creating a temporary array, you can write the results of calculations directly to the memory location you need. You can do this for any universal function with the ***out*** argument:

In [17]:
x = np.arange(5)        
y = np.empty(5)        
np.multiply(x, 10, out=y)        
print(y)

[ 0. 10. 20. 30. 40.]


This feature can be used in combination with array views. For example, you can write calculation results to every second element of a given array:

In [18]:
y = np.zeros(10)
x = np.arange(5)         
np.power(2, x, out=y[::2])        
print(y)

[ 1.  0.  2.  0.  4.  0.  8.  0. 16.  0.]


If we wrote y[::2] = 2 ** x instead, we would create a temporary array to store the results of the 2 ** x operation and then copy those values into the y array. For such a small amount of computation there is not much difference, but for very large arrays the memory savings from careful use of the out argument can be significant.
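
The same idea extends to multi-step calculations, where one preallocated array can be reused for every intermediate result. The following is a sketch under the assumption that the result should have a floating-point type; the names `data` and `buffer` are hypothetical.

```python
import numpy as np

data = np.arange(5, dtype=float)
buffer = np.empty_like(data)       # allocated once, reused below

np.multiply(data, 3, out=buffer)   # buffer now holds 3 * data
np.add(buffer, 1, out=buffer)      # buffer updated in place to 3 * data + 1

print(buffer)

# A ufunc called with out= returns that same array, not a copy.
result = np.add(buffer, 0, out=buffer)
print(result is buffer)
```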

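Binary universal functions also provide aggregation methods. For example, the reduce method repeatedly applies an operation to the elements of an array until only a single result remains:

In [19]:
# calling reduce on the add ufunc returns the sum of all elements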
x = np.arange(1, 6)        
np.add.reduce(x)

15

In [20]:
x = np.arange(1, 6)
np.multiply.reduce(x)

120
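
If the intermediate results are needed as well, which is what "cumulative values" usually means, ufuncs offer an `accumulate` method alongside `reduce`. A short sketch (the variable name `values` is illustrative):

```python
import numpy as np

values = np.arange(1, 6)

running_sum = np.add.accumulate(values)           # partial sums
running_product = np.multiply.accumulate(values)  # partial products

print(running_sum)
print(running_product)
```

The last element of each result equals the corresponding `reduce` value, 15 and 120.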

## Outer products

Any binary universal function can compute the result of applying its operation to all pairs of elements from two inputs using the outer method. This makes it possible to create, for example, a multiplication table with one line of code:

In [21]:
x = np.arange(1, 6)        
np.multiply.outer(x, x)

array([[ 1,  2,  3,  4,  5],
       [ 2,  4,  6,  8, 10],
       [ 3,  6,  9, 12, 15],
       [ 4,  8, 12, 16, 20],
       [ 5, 10, 15, 20, 25]])
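
For comparison, the same table can be produced by multiplying a column vector by a row vector, and `outer` is available on other binary ufuncs too. The following is a sketch only; the names `n`, `table`, and `via_broadcasting` are illustrative.

```python
import numpy as np

n = np.arange(1, 6)

table = np.multiply.outer(n, n)
via_broadcasting = n[:, np.newaxis] * n   # column vector times row vector

print(np.array_equal(table, via_broadcasting))

# outer works with other binary ufuncs as well, e.g. an addition table.
addition_table = np.add.outer(n, n)
print(addition_table)
```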

## Aggregation: minimum, maximum, average

Very often when working with large amounts of data, the first step is to calculate summary statistics for that data. NumPy has aggregation functions for working with arrays.

### Summing values from an array

As an example, consider calculating the sum of array values. In "pure" Python, this can be done using the built-in sum function:

In [22]:
L = np.random.random(100)       
sum(L)

50.759315020996425

Let's compare it with the equivalent function from the NumPy library, which runs the operation in compiled code:

In [23]:
big_array = np.random.rand(1000000)       
%timeit sum(big_array)       
%timeit np.sum(big_array)

10 loops, best of 5: 84 ms per loop
1000 loops, best of 5: 342 µs per loop


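The functions sum and np.sum are not identical, however. np.sum is aware of multiple array dimensions, and its optional arguments have different meanings from those of the built-in sum, so the two cannot always be swapped freely.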
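
A small 2D array is enough to show where the two functions diverge. The following is a sketch only, and the array name `grid` is hypothetical.

```python
import numpy as np

grid = np.array([[1, 2], [3, 4]])

# The built-in sum iterates over the first axis, adding the rows together.
builtin_total = sum(grid)        # array([4, 6])
# np.sum adds every element by default.
numpy_total = np.sum(grid)       # 10

# The second positional argument also differs in meaning.
with_start = sum(grid, 10)       # start value: 10 + [1, 2] + [3, 4]
along_axis = np.sum(grid, 1)     # axis: the sum within each row

print(builtin_total, numpy_total)
print(with_start, along_axis)
```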

### **Min and Max**

In "pure" Python, there are built-in functions min and max used to calculate the minimum and maximum values of any given array:

In [24]:
big_array = np.random.rand(1000000)
min(big_array), max(big_array)

(1.0654156229472633e-06, 0.9999969650848898)

The syntax of the corresponding functions from the NumPy library is similar, and they also work much faster:

In [25]:
big_array = np.random.rand(1000000)
np.min(big_array), np.max(big_array)

(6.22653164583653e-08, 0.9999997582474399)

In [26]:
%timeit min(big_array)       
%timeit np.min(big_array)

10 loops, best of 5: 60 ms per loop
1000 loops, best of 5: 413 µs per loop


A shorter syntax for min, max, sum, and several other aggregation functions is to call them as methods of the array object itself:

In [27]:
print(big_array.min(), big_array.max(), big_array.sum())

6.22653164583653e-08 0.9999997582474399 499929.5077814219


### **The case of multidimensionality**

Aggregation by column or row is one of the frequently used types of aggregation operations.

In [28]:
M = np.random.random((3, 4))       
print(M)

[[0.68210579 0.10031222 0.86708552 0.74715353]
 [0.20032658 0.43263162 0.46879247 0.72008642]
 [0.55280628 0.69362647 0.63984686 0.19136443]]


By default, each aggregation function of the NumPy library returns the aggregate over the entire array:

In [29]:
M.sum()

6.296138183533368

Aggregation functions also accept an additional axis argument that specifies the axis along which the aggregate is computed. The axis you name is the one that gets collapsed. For example, you can find the minimum value within each column by specifying axis=0:

In [32]:
M.min(axis=0)

array([0.20032658, 0.10031222, 0.46879247, 0.19136443])

In [33]:
# similarly, axis=1 gives the maximum value within each row
M.max(axis=1)

array([0.86708552, 0.72008642, 0.69362647])
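
The average named in this section's title works the same way through `mean`. The sketch below uses a small fixed array (the name `values2d` is illustrative) so that the effect of `axis` on both the values and the result shape is easy to check by hand.

```python
import numpy as np

values2d = np.array([[1, 2, 3, 4],
                     [5, 6, 7, 8],
                     [9, 10, 11, 12]])   # shape (3, 4)

overall = values2d.mean()            # mean of all 12 values
per_column = values2d.mean(axis=0)   # axis 0 collapsed: one value per column
per_row = values2d.mean(axis=1)      # axis 1 collapsed: one value per row

print(overall)
print(per_column, per_column.shape)
print(per_row, per_row.shape)
```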