## NumPy

NumPy provides an efficient way to store and manipulate multidimensional dense arrays in Python. The important features of NumPy are:
- It provides an ndarray structure, which allows efficient storage and manipulation of vectors, matrices, and higher-dimensional datasets.
- It provides a readable and efficient syntax for operating on this data, from simple element-wise arithmetic to more complicated linear algebraic operations.
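
As a minimal sketch of the two features above, the following builds a small array and applies both element-wise arithmetic and a matrix-vector product. The names `A` and `v` and their values are illustrative assumptions, not data used elsewhere in this notebook.

```python
import numpy as np

# Illustrative values only: a 2x2 matrix and a length-2 vector
A = np.array([[1, 2], [3, 4]])
v = np.array([10, 20])

elementwise = A * 2   # element-wise arithmetic: every entry is doubled
matvec = A @ v        # linear algebra: matrix-vector product

print(elementwise)
print(matvec)
```

The `*` operator acts entry by entry, while `@` performs matrix multiplication.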

In [1]:
import numpy as np

np.random.seed(0) # Seed to get the same results

In [54]:
x0 = np.zeros((5,5))
print(x0, '\n')

x1 = np.random.randint(10, size=6)
print(x1, x1.ndim, x1.shape, x1.size, '\n')

x2 = np.random.randint(10, size=(3, 4))  # 2D array: rows by columns
print(x2, x2.ndim, x2.shape, x2.size, '\n')

x3 = np.random.randint(10, size=(3, 4, 5))  # 3D array: depth, rows, columns
print(x3, x3.ndim, x3.shape, x3.size, '\n')

[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]] 

[9 9 7 6 4 5] 1 (6,) 6 

[[4 8 1 0]
 [6 5 9 5]
 [1 1 0 2]] 2 (3, 4) 12 

[[[6 4 8 7 0]
  [7 4 5 5 1]
  [3 8 2 3 8]
  [8 8 8 9 6]]

 [[1 5 4 1 3]
  [3 9 3 5 0]
  [5 7 9 3 4]
  [2 3 2 5 3]]

 [[6 9 2 4 2]
  [7 1 0 8 1]
  [9 1 8 0 1]
  [0 4 6 9 9]]] 3 (3, 4, 5) 60 



In [17]:
print("dtype:", x3.dtype)
print("itemsize:", x3.itemsize, "bytes")
print("nbytes:", x3.nbytes, "bytes")

dtype: int32
itemsize: 4 bytes
nbytes: 240 bytes


In [20]:
print(x1[0])
print(x1[-1])
print(x2[0, 1])  # row 0, column 1
print(x3[0,1,2])

8
7
0
4


In [21]:
x2[0,0] = 12
print(x2)

[[12  0  4  1]
 [ 1  6  6  0]
 [ 2  3  7  9]]


In [2]:
x = np.arange(1, 10)
print(x)
print(x**2)

[1 2 3 4 5 6 7 8 9]
[ 1  4  9 16 25 36 49 64 81]


## Multidimensional slicing

In [24]:
print(x)
print(x[:5])
print(x[4:7])

[1 2 3 4 5 6 7 8 9]
[1 2 3 4 5]
[5 6 7]


In [25]:
print(x2)

[[12  0  4  1]
 [ 1  6  6  0]
 [ 2  3  7  9]]


In [28]:
print(x2[:2, :3])  # first two rows, first three columns

[[12  0  4]
 [ 1  6  6]]


In [30]:
print(x2[:3, ::2])  # all three rows, every other column

[[12  4]
 [ 1  6]
 [ 2  7]]


In [31]:
x2[::-1, ::-1]

array([[ 9,  7,  3,  2],
       [ 0,  6,  6,  1],
       [ 1,  4,  0, 12]])

In [33]:
x2_copy = np.copy(x2[:2, :2])
print(x2_copy)

[[12  0]
 [ 1  6]]
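
The explicit `np.copy` in the cell above matters because basic slicing returns a view onto the original data rather than a new array. The sketch below illustrates the difference on a separate, hypothetical array `a`, so that `x2` is left unchanged.

```python
import numpy as np

a = np.arange(12).reshape((3, 4))   # hypothetical array, not the x2 above

view = a[:2, :2]           # basic slicing returns a view that shares memory with `a`
copy = a[:2, :2].copy()    # an explicit copy owns its own data

view[0, 0] = 99            # writes through to `a`
copy[0, 1] = -1            # leaves `a` untouched

print(a[0, 0], a[0, 1])
print(np.shares_memory(a, view), np.shares_memory(a, copy))
```

Modifying the view changes `a`, while modifying the copy does not.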


## Reshaping arrays

In [34]:
grid = np.arange(1, 10).reshape((3, 3))
print(grid)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


In [35]:
x = np.array([1, 2, 3])

# row vector via reshape
x.reshape((1, 3))

array([[1, 2, 3]])

In [36]:
# column vector via reshape
x.reshape((3, 1))

array([[1],
       [2],
       [3]])

## Computation on NumPy Arrays

In [50]:
def compute_reciprocals(values):
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output

In [52]:
big_array = np.random.randint(1, 100, size=10000)
%timeit compute_reciprocals(big_array)

68 ms ± 2.42 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [55]:
%timeit 1.0 / big_array

65.4 µs ± 611 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
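
The two timings above compare a Python-level loop with a single vectorized expression. As a quick sanity check, the sketch below confirms that both approaches give the same values. The generator `rng`, its seed, and the smaller array size are assumptions chosen for illustration.

```python
import numpy as np

def compute_reciprocals(values):
    # Same loop-based implementation as in the cell above
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output

rng = np.random.default_rng(0)            # assumed seed, for reproducibility
values = rng.integers(1, 100, size=1000)  # integers in [1, 100), so no division by zero

looped = compute_reciprocals(values)
vectorized = 1.0 / values                 # one expression, evaluated in compiled code

print(np.allclose(looped, vectorized))
```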


In [59]:
x = np.arange(9).reshape((3, 3))
print(x, '\n')
print(2 ** x, '\n')
print(x**2, '\n')

[[0 1 2]
 [3 4 5]
 [6 7 8]] 

[[  1   2   4]
 [  8  16  32]
 [ 64 128 256]] 

[[ 0  1  4]
 [ 9 16 25]
 [36 49 64]] 



In [61]:
x = np.arange(4)
print("x     =", x)
print("x + 5 =", x + 5)
print("x - 5 =", x - 5)
print("x * 2 =", x * 2)
print("x / 2 =", x / 2)
print("x // 2 =", x // 2)  # floor division
print("-x     = ", -x)
print("x ** 2 = ", x ** 2)
print("x % 2  = ", x % 2)

x     = [0 1 2 3]
x + 5 = [5 6 7 8]
x - 5 = [-5 -4 -3 -2]
x * 2 = [0 2 4 6]
x / 2 = [0.  0.5 1.  1.5]
x // 2 = [0 0 1 1]
-x     =  [ 0 -1 -2 -3]
x ** 2 =  [0 1 4 9]
x % 2  =  [0 1 0 1]


The following table lists the arithmetic operators implemented in NumPy:

| Operator      | Equivalent ufunc    | Description                           |
|---------------|---------------------|---------------------------------------|
|``+``          |``np.add``           |Addition (e.g., ``1 + 1 = 2``)         |
|``-``          |``np.subtract``      |Subtraction (e.g., ``3 - 2 = 1``)      |
|``-``          |``np.negative``      |Unary negation (e.g., ``-2``)          |
|``*``          |``np.multiply``      |Multiplication (e.g., ``2 * 3 = 6``)   |
|``/``          |``np.divide``        |Division (e.g., ``3 / 2 = 1.5``)       |
|``//``         |``np.floor_divide``  |Floor division (e.g., ``3 // 2 = 1``)  |
|``**``         |``np.power``         |Exponentiation (e.g., ``2 ** 3 = 8``)  |
|``%``          |``np.mod``           |Modulus/remainder (e.g., ``9 % 4 = 1``)|

Additionally, there are Boolean/bitwise operators; we will explore these in [Comparisons, Masks, and Boolean Logic](02.06-Boolean-Arrays-and-Masks.ipynb).
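
The operator and ufunc pairs in the table can be checked directly. The following sketch compares each operator with its ufunc on a small array. The dictionary `pairs` and its keys are hypothetical names used only for this illustration.

```python
import numpy as np

x = np.arange(4)

# Each entry pairs the operator form with the equivalent ufunc call
pairs = {
    "+":   (x + 5,  np.add(x, 5)),
    "-":   (x - 5,  np.subtract(x, 5)),
    "neg": (-x,     np.negative(x)),
    "*":   (x * 2,  np.multiply(x, 2)),
    "/":   (x / 2,  np.divide(x, 2)),
    "//":  (x // 2, np.floor_divide(x, 2)),
    "**":  (x ** 2, np.power(x, 2)),
    "%":   (x % 2,  np.mod(x, 2)),
}

for name, (op_result, ufunc_result) in pairs.items():
    print(name, np.array_equal(op_result, ufunc_result))
```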


In [62]:
x = np.array([-2, -1, 0, 1, 2])
np.absolute(x)

array([2, 1, 0, 1, 2])

In [70]:
L = np.random.random(100)
sum(L)

48.07540284704033

In [71]:
big_array = np.random.rand(1000)
%timeit sum(big_array)
%timeit np.sum(big_array)

305 µs ± 5.47 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
10.9 µs ± 119 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [72]:
np.min(big_array), np.max(big_array)

(0.0025779507390595313, 0.9962946991893353)

In [73]:
%timeit min(big_array)
%timeit np.min(big_array)

148 µs ± 4.18 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
10.5 µs ± 28.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [74]:
print(big_array.min(), big_array.max(), big_array.sum())

0.0025779507390595313 0.9962946991893353 512.3625932678347


### Trigonometric functions

In [65]:
theta = np.linspace(0, np.pi, 3) # Return evenly spaced numbers over a specified interval
print("theta      = ", theta)
print("sin(theta) = ", np.sin(theta))
print("cos(theta) = ", np.cos(theta))
print("tan(theta) = ", np.tan(theta))

theta      =  [0.         1.57079633 3.14159265]
sin(theta) =  [0.0000000e+00 1.0000000e+00 1.2246468e-16]
cos(theta) =  [ 1.000000e+00  6.123234e-17 -1.000000e+00]
tan(theta) =  [ 0.00000000e+00  1.63312394e+16 -1.22464680e-16]


In [66]:
x = [-1, 0, 1]
print("x         = ", x)
print("arcsin(x) = ", np.arcsin(x))
print("arccos(x) = ", np.arccos(x))
print("arctan(x) = ", np.arctan(x))

x         =  [-1, 0, 1]
arcsin(x) =  [-1.57079633  0.          1.57079633]
arccos(x) =  [3.14159265 1.57079633 0.        ]
arctan(x) =  [-0.78539816  0.          0.78539816]


### Exponents and logarithms

In [67]:
x = [1, 2, 3]
print("x     =", x)
print("e^x   =", np.exp(x))
print("2^x   =", np.exp2(x))
print("3^x   =", np.power(3, x))

x     = [1, 2, 3]
e^x   = [ 2.71828183  7.3890561  20.08553692]
2^x   = [2. 4. 8.]
3^x   = [ 3  9 27]


In [68]:
x = [1, 2, 4, 10]
print("x        =", x)
print("ln(x)    =", np.log(x))
print("log2(x)  =", np.log2(x))
print("log10(x) =", np.log10(x))

x        = [1, 2, 4, 10]
ln(x)    = [0.         0.69314718 1.38629436 2.30258509]
log2(x)  = [0.         1.         2.         3.32192809]
log10(x) = [0.         0.30103    0.60205999 1.        ]


### Aggregations: min, max, and other functions

In [75]:
M = np.random.random((3, 4))
print(M)

[[0.8925672  0.34194799 0.74399095 0.26325457]
 [0.0173899  0.93186845 0.44132285 0.12601083]
 [0.93272773 0.80479568 0.31515394 0.98550982]]


In [76]:
M.sum()

6.7965398940341855

In [78]:
M.min(axis=0) # find the minimum value within each column by specifying axis=0

array([0.0173899 , 0.34194799, 0.31515394, 0.12601083])

In [80]:
M.min(axis=1) # find the minimum value within each row by specifying axis=1

array([0.26325457, 0.0173899 , 0.31515394])

The way the axis is specified here can be confusing to users coming from other languages. The axis keyword specifies the dimension of the array that will be collapsed, rather than the dimension that will be returned. So specifying axis=0 means that the first axis will be collapsed: for two-dimensional arrays, this means that values within each column will be aggregated.
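
One way to see which axis is collapsed is to look at the shape of the result. The sketch below uses a hypothetical 3-by-4 array of consecutive integers (not the random `M` above), so the values are easy to check by hand.

```python
import numpy as np

grid = np.arange(12).reshape((3, 4))   # hypothetical array: 3 rows, 4 columns

col_min = grid.min(axis=0)   # axis 0 (rows) is collapsed: one value per column
row_min = grid.min(axis=1)   # axis 1 (columns) is collapsed: one value per row

print(col_min.shape, col_min)
print(row_min.shape, row_min)
```

With `axis=0` the result has one entry per column (shape `(4,)`), and with `axis=1` it has one entry per row (shape `(3,)`).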

|Function Name      |   NaN-safe Version  | Description                                   |
|-------------------|---------------------|-----------------------------------------------|
| ``np.sum``        | ``np.nansum``       | Compute sum of elements                       |
| ``np.prod``       | ``np.nanprod``      | Compute product of elements                   |
| ``np.mean``       | ``np.nanmean``      | Compute mean of elements                      |
| ``np.std``        | ``np.nanstd``       | Compute standard deviation                    |
| ``np.var``        | ``np.nanvar``       | Compute variance                              |
| ``np.min``        | ``np.nanmin``       | Find minimum value                            |
| ``np.max``        | ``np.nanmax``       | Find maximum value                            |
| ``np.argmin``     | ``np.nanargmin``    | Find index of minimum value                   |
| ``np.argmax``     | ``np.nanargmax``    | Find index of maximum value                   |
| ``np.median``     | ``np.nanmedian``    | Compute median of elements                    |
| ``np.percentile`` | ``np.nanpercentile``| Compute rank-based statistics of elements     |
| ``np.any``        | N/A                 | Evaluate whether any elements are true        |
| ``np.all``        | N/A                 | Evaluate whether all elements are true        |
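
To illustrate the NaN-safe column of the table, the sketch below compares a standard aggregate with its NaN-safe counterpart on a small array containing one missing value. The array `data` is an illustrative assumption.

```python
import numpy as np

data = np.array([1.0, np.nan, 3.0])   # illustrative data with one missing value

plain_sum = np.sum(data)       # NaN propagates through the standard aggregate
safe_sum = np.nansum(data)     # NaN entries are ignored
safe_mean = np.nanmean(data)   # mean over the non-NaN entries only

print(plain_sum, safe_sum, safe_mean)
```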