# Numpy crash course

Numpy is a scientific computing library for Python.

In [1]:
import numpy as np

We can convert a Python list into a numpy array with `np.array`.

In [5]:
my_list = [1, 2, 3, 4]
print(my_list)
print(type(my_list))
my_list = np.array(my_list)
print(my_list)
print(type(my_list))

[1, 2, 3, 4]
<class 'list'>
[1 2 3 4]
<class 'numpy.ndarray'>
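
As a small extension of the conversion above, the sketch below passes a nested list to `np.array`, which gives a 2-dimensional array, and converts it back with `tolist`. The names `nested`, `matrix` and `round_trip` are illustrative only and do not appear in the original notebook.

```python
import numpy as np

# A nested list (a list of rows) becomes a 2-dimensional array.
nested = [[1, 2, 3], [4, 5, 6]]
matrix = np.array(nested)

print(matrix.shape)  # (rows, columns)
print(matrix.ndim)   # number of dimensions

# tolist() converts the array back into plain Python lists.
round_trip = matrix.tolist()
print(round_trip == nested)
```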


We can use the numpy `arange` function (pronounced *a range*) to create arrays. This function takes two arguments: the start value (included) and the stop value (not included). We can also optionally supply a `step` argument.

In [7]:
a = np.arange(0, 10)
print(a)
print(type(a))

[0 1 2 3 4 5 6 7 8 9]
<class 'numpy.ndarray'>


In [8]:
a = np.arange(0, 10, 2)
print(a)
print(type(a))

[0 2 4 6 8]
<class 'numpy.ndarray'>
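
Two further variations on `arange` may be worth knowing. This is a minimal sketch, and the variable names are made up for illustration. With a single argument the start defaults to 0, and a negative `step` counts downwards (the stop value is still excluded).

```python
import numpy as np

# With one argument, arange starts at 0 and stops before the given value.
only_stop = np.arange(5)

# A negative step counts down; the stop value (0) is still excluded.
counting_down = np.arange(10, 0, -2)

print(only_stop)
print(counting_down)
```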


We can use the `zeros` and `ones` functions to create arrays filled with zeros or ones. We set the shape of the array by passing a tuple to the function, where each value is the size of one dimension. So, for example, the tuple `(2, 3)` creates a matrix with 2 rows and 3 columns.

In [9]:
np.zeros((5,5))

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

In [11]:
np.ones((2,6))

array([[1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1.]])
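
To check that the tuple really ends up as the shape of the array, we can inspect the `shape` attribute. The sketch below also shows the optional `dtype` argument, which the cells above do not use. The names `z` and `int_ones` are illustrative.

```python
import numpy as np

z = np.zeros((2, 3))
print(z.shape)  # the tuple we passed in
print(z.dtype)  # floating point by default

# dtype=int requests integer ones instead of the default floats.
int_ones = np.ones((2, 3), dtype=int)
print(int_ones)
```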

We can generate random numbers with numpy. For example, to create a random integer we use `np.random.randint(low, high)`. In the call below, numpy returns an integer from 0 up to, but not including, 100, drawn from a uniform distribution.

In [47]:
np.random.randint(0, 100)

38

We can also generate entire arrays by supplying a tuple to the `size` argument.

In [48]:
np.random.randint(0, 100, (5,5))

array([[45, 75, 16, 28, 94],
       [20, 72, 35,  9, 85],
       [ 8, 41,  3, 73, 38],
       [43, 33, 15,  3,  1],
       [22, 31, 59, 96, 74]])
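
One way to convince ourselves that the upper bound is excluded is to draw many samples and look at the extremes. This is only an empirical check under the assumption that 10,000 draws are enough to be informative, and the names `samples` and `grid` are illustrative.

```python
import numpy as np

# Draw many integers and inspect the smallest and largest values.
samples = np.random.randint(0, 100, 10_000)
print(samples.min() >= 0)   # lower bound is included
print(samples.max() < 100)  # upper bound is never reached

# The size can also be passed by keyword.
grid = np.random.randint(0, 100, size=(5, 5))
print(grid.shape)
```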

We can also use numpy to create linearly spaced arrays. For that we use the `linspace` function with three arguments: the lower bound, the upper bound, and the number of evenly spaced values to generate. By default both bounds are included in the result.

In [53]:
print(np.linspace(0, 10, 1))
print(np.linspace(0, 10, 2))
print(np.linspace(0, 10, 3))
print(np.linspace(0, 10, 6))
print(np.linspace(0, 10, 11))
print(np.linspace(0, 10, 20))

[0.]
[ 0. 10.]
[ 0.  5. 10.]
[ 0.  2.  4.  6.  8. 10.]
[ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10.]
[ 0.          0.52631579  1.05263158  1.57894737  2.10526316  2.63157895
  3.15789474  3.68421053  4.21052632  4.73684211  5.26315789  5.78947368
  6.31578947  6.84210526  7.36842105  7.89473684  8.42105263  8.94736842
  9.47368421 10.        ]
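
The spacing in the outputs above follows from the bounds and the number of points: with both bounds included, the step is `(stop - start) / (num - 1)`. As a sketch, `linspace` can report the step it used through `retstep=True`, and `endpoint=False` leaves the upper bound out. The variable names are illustrative.

```python
import numpy as np

start, stop, num = 0, 10, 6

# retstep=True returns the array together with the spacing between values.
points, step = np.linspace(start, stop, num, retstep=True)
print(points)
print(step)  # equals (stop - start) / (num - 1)

# endpoint=False excludes the upper bound from the result.
half_open = np.linspace(start, stop, 5, endpoint=False)
print(half_open)
```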


## Numpy operations

Computers don't actually generate truly random numbers. They produce pseudo-random sequences that are fully determined by a starting value called the seed. By setting the seed ourselves, we make our results reproducible.

In [77]:
np.random.seed(101)
np.random.randint(0, 100, 10)

array([95, 11, 81, 70, 63, 87, 75,  9, 77, 40])
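
To see the reproducibility in action, we can set the same seed twice and compare the two draws. This is a minimal sketch; `first` and `second` are illustrative names.

```python
import numpy as np

np.random.seed(101)
first = np.random.randint(0, 100, 10)

# Setting the seed again resets the generator to the same state.
np.random.seed(101)
second = np.random.randint(0, 100, 10)

# Both draws contain exactly the same values.
print(np.array_equal(first, second))
```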

Numpy arrays have several methods that we can use:

In [104]:
arr = np.random.randint(0, 100, 10)
print(arr.min()) # min value
print(arr.max()) # max value
print(arr.mean()) # mean value
print(arr.argmax()) # index location of the max value
print(arr.argmin()) # index location of the min value
print(arr.reshape((2,5))) # reshaping the array

7
86
50.5
0
8
[[86 17 68 17 18]
 [60 83 82  7 67]]
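
The sketch below rebuilds the array printed above from literal values, so it does not depend on the random state, and checks how `argmax` and `argmin` relate to `max` and `min`. It also uses the `axis` argument, which the cell above does not cover, to reduce a 2-dimensional array along columns or rows. The name `grid` is illustrative.

```python
import numpy as np

# The same values as in the output above, written out explicitly.
arr = np.array([86, 17, 68, 17, 18, 60, 83, 82, 7, 67])

# argmax/argmin give the positions of the values that max/min return.
print(arr[arr.argmax()] == arr.max())
print(arr[arr.argmin()] == arr.min())

grid = arr.reshape((2, 5))
print(grid.max(axis=0))  # maximum of each column
print(grid.max(axis=1))  # maximum of each row
```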


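Numpy supports indexing and slicing of arrays with square brackets (`[]`). Indexing a single position returns one element, while selecting an entire row or column of a matrix returns a 1-dimensional array, which we can reshape by chaining the `reshape` method.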

In [105]:
arr = np.arange(0, 100).reshape(10, 10)
print(arr)

[[ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]
 [20 21 22 23 24 25 26 27 28 29]
 [30 31 32 33 34 35 36 37 38 39]
 [40 41 42 43 44 45 46 47 48 49]
 [50 51 52 53 54 55 56 57 58 59]
 [60 61 62 63 64 65 66 67 68 69]
 [70 71 72 73 74 75 76 77 78 79]
 [80 81 82 83 84 85 86 87 88 89]
 [90 91 92 93 94 95 96 97 98 99]]


In [109]:
print(arr[5,2]) # single element
print(arr[:,2]) # entire column
print(arr[:,2].reshape(5,2)) # entire column, reshaped into a matrix
print(arr[5,:]) # entire row

52
[ 2 12 22 32 42 52 62 72 82 92]
[[ 2 12]
 [22 32]
 [42 52]
 [62 72]
 [82 92]]
[50 51 52 53 54 55 56 57 58 59]
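
Slices can also select a rectangular block rather than a whole row or column. The sketch below shows that, along with one way to keep a column 2-dimensional by slicing it instead of indexing it. The names `block`, `column_1d` and `column_2d` are illustrative.

```python
import numpy as np

arr = np.arange(0, 100).reshape(10, 10)

# Rows 0-1 and columns 0-2: the stop index is excluded, as with arange.
block = arr[0:2, 0:3]
print(block)

column_1d = arr[:, 2]    # integer index drops the axis: shape (10,)
column_2d = arr[:, 2:3]  # slice keeps the axis: shape (10, 1)
print(column_1d.shape, column_2d.shape)
```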


Masking allows us to use conditional filters to select elements. For example, `arr > 50` will return a numpy array of the same shape, where every value is `True` if the original value in that cell is greater than 50, and `False` otherwise.

In [111]:
arr > 50

array([[False, False, False, False, False, False, False, False, False,
        False],
       [False, False, False, False, False, False, False, False, False,
        False],
       [False, False, False, False, False, False, False, False, False,
        False],
       [False, False, False, False, False, False, False, False, False,
        False],
       [False, False, False, False, False, False, False, False, False,
        False],
       [False,  True,  True,  True,  True,  True,  True,  True,  True,
         True],
       [ True,  True,  True,  True,  True,  True,  True,  True,  True,
         True],
       [ True,  True,  True,  True,  True,  True,  True,  True,  True,
         True],
       [ True,  True,  True,  True,  True,  True,  True,  True,  True,
         True],
       [ True,  True,  True,  True,  True,  True,  True,  True,  True,
         True]])

We can now use this boolean array to filter the original matrix. The result is a 1-dimensional array containing only the elements where the mask is `True`.

In [112]:
arr[arr > 50]

array([51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67,
       68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84,
       85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99])
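
Masks can be combined as well. As a sketch, the example below joins two conditions with `&` (element-wise "and"); the parentheses around each condition are required. It also counts matching elements by summing the mask, since `True` counts as 1. The name `even_above_90` is illustrative.

```python
import numpy as np

arr = np.arange(0, 100).reshape(10, 10)

# Keep values that are both greater than 90 and even.
even_above_90 = arr[(arr > 90) & (arr % 2 == 0)]
print(even_above_90)

# Summing a boolean mask counts how many elements satisfy the condition.
print((arr > 50).sum())
```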