In [1]:
import numpy as np

# NumPy Introduction

**Let's start by creating and reshaping NumPy arrays**

NumPy provides `arange`, which returns evenly spaced integers starting at 0 and stopping just before the given value:

In [2]:
a = np.arange(15)
print(a)
a.shape

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14]


(15,)
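
As a small illustrative sketch (the name `stepped` is made up for this example), `arange` also accepts a start, a stop and a step, and the stop value is always excluded:

```python
import numpy as np

# start=2, stop=11 (excluded), step=3
stepped = np.arange(2, 11, 3)
print(stepped)        # [2 5 8]

# A single argument counts from 0 up to, but not including, that value
print(np.arange(4))   # [0 1 2 3]
```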

Every NumPy array has a `reshape` method that returns the same data arranged in a new shape.  
Important: the product of the new dimensions must equal the number of elements in the array.  
For example, `np.arange(15)` has 15 elements, so it can be reshaped to `(3, 5)` because `3 * 5 = 15`.

In [3]:
a = a.reshape(3,5)
print(a)
a.shape

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]]


(3, 5)
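
The size rule for `reshape` can be checked directly. The sketch below (the name `inferred` is illustrative) uses `-1` to let NumPy work out one dimension, then shows what happens when the sizes do not match:

```python
import numpy as np

a = np.arange(15)

# -1 asks NumPy to infer that dimension: 15 / 3 = 5
inferred = a.reshape(3, -1)
print(inferred.shape)   # (3, 5)

# 4 * 4 = 16 does not equal 15, so this reshape is rejected
try:
    a.reshape(4, 4)
except ValueError as err:
    print("reshape failed:", err)
```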

The number of dimensions of the array is available through its `ndim` attribute:

In [4]:
a.ndim

2

We can construct an array with `np.fromfunction`.  
NumPy builds one index array per dimension of the requested shape and calls our function once with them, so each element's value is computed from its own indices. Here is an example:

In [9]:
def f(x,y):
    return x*y

f = np.fromfunction(f, (5,4), dtype=np.float32)
f

array([[ 0.,  0.,  0.,  0.],
       [ 0.,  1.,  2.,  3.],
       [ 0.,  2.,  4.,  6.],
       [ 0.,  3.,  6.,  9.],
       [ 0.,  4.,  8., 12.]], dtype=float32)
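
To see what `np.fromfunction` passes to the function, the following sketch rebuilds the same table by hand with `np.indices`. The names `product`, `table`, `rows`, `cols` and `manual` are chosen for illustration only:

```python
import numpy as np

def product(x, y):
    # x and y arrive as whole index arrays of shape (5, 4), not as scalars
    return x * y

table = np.fromfunction(product, (5, 4), dtype=np.float32)

# Roughly the same thing written out by hand with np.indices
rows, cols = np.indices((5, 4), dtype=np.float32)
manual = rows * cols

print(np.array_equal(table, manual))   # True
```

Because the function receives arrays, it should be written with vectorized operations; a function that only works on Python scalars may not behave as expected here.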

**Building arrays from pieces with `np.r_`**

`np.r_` concatenates slices, scalars and arrays into a single 1-D array, using slice notation in place of explicit `arange` calls:

In [10]:
np.r_[1:4,0,4]

array([1, 2, 3, 0, 4])

In [11]:
np.r_[1:4,np.array([0]*10),4:10]

array([1, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 5, 6, 7, 8, 9])

In [12]:
np.r_[1:4,0,4:10]

array([1, 2, 3, 0, 4, 5, 6, 7, 8, 9])
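
As a rough equivalence, the last `np.r_` expression can be spelled out with `np.arange` and `np.concatenate`. The names `with_r` and `explicit` are illustrative:

```python
import numpy as np

with_r = np.r_[1:4, 0, 4:10]

# The same result spelled out with arange and concatenate
explicit = np.concatenate([np.arange(1, 4), [0], np.arange(4, 10)])

print(np.array_equal(with_r, explicit))   # True
```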

**Casting the array into another type**

In [14]:
a_float = a.astype('float32')
print(a_float)
a_float.dtype

[[ 0.  1.  2.  3.  4.]
 [ 5.  6.  7.  8.  9.]
 [10. 11. 12. 13. 14.]]


dtype('float32')
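
Two properties of `astype` are worth checking with a short sketch: it returns a new array rather than modifying the original, and converting floats to integers truncates instead of rounding.

```python
import numpy as np

a = np.arange(15).reshape(3, 5)
a_float = a.astype('float32')

# astype returns a new array; the original keeps its integer dtype
a_float[0, 0] = 99.0
print(a[0, 0])          # 0

# Converting floats to integers truncates toward zero rather than rounding
print(np.array([1.7, -1.7]).astype(np.int32))   # [ 1 -1]
```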

**Creating a NumPy array filled with zeros**

In [21]:
b = np.zeros((3,4), dtype=np.int16)
b

array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]], dtype=int16)
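
`np.zeros` has a few close relatives that follow the same pattern. This is a brief sketch; the variable names are invented for the example:

```python
import numpy as np

ones = np.ones((2, 3))                   # float64 by default
sevens = np.full((2, 3), 7)              # every element set to 7
template = np.arange(6).reshape(2, 3)
zeros_like = np.zeros_like(template)     # same shape and dtype as template

print(ones.dtype, sevens.sum(), zeros_like.shape)   # float64 42 (2, 3)
```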

**Evenly spaced values with `linspace`**  
`np.linspace` returns a fixed number of evenly spaced values over an interval, with both endpoints included by default. Here the interval is 0 to 1 and the number of values is 10:

In [23]:
l_space = np.linspace(0,1,10)
l_space

array([0.        , 0.11111111, 0.22222222, 0.33333333, 0.44444444,
       0.55555556, 0.66666667, 0.77777778, 0.88888889, 1.        ])
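
The spacing above is 1/9 rather than 0.1 because both endpoints are included. The sketch below shows two optional parameters of `np.linspace` that make this explicit (the names `half_open`, `closed` and `step` are illustrative):

```python
import numpy as np

# endpoint=False drops the stop value, so the spacing becomes 0.1
half_open = np.linspace(0, 1, 10, endpoint=False)

# retstep=True additionally returns the spacing that was used
closed, step = np.linspace(0, 1, 10, retstep=True)

print(half_open[-1])   # last value is below 1
print(step)            # spacing is 1/9 when both endpoints are included
```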

**Creating 3D arrays**  
Just as before, when reshaping an array we need to make sure the dimensions multiply to the number of elements (here `2 * 6 * 5 = 60`).

In [28]:
a_3d = np.arange(60)
print(a_3d)  # show the flat array before reshaping
a_3d = a_3d.reshape((2,6,5))
a_3d

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
 48 49 50 51 52 53 54 55 56 57 58 59]


array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29]],

       [[30, 31, 32, 33, 34],
        [35, 36, 37, 38, 39],
        [40, 41, 42, 43, 44],
        [45, 46, 47, 48, 49],
        [50, 51, 52, 53, 54],
        [55, 56, 57, 58, 59]]])
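
Indexing a 3D array takes one index per axis. As a short sketch using the same `(2, 6, 5)` shape, the three indices can be read as block, row and column:

```python
import numpy as np

a_3d = np.arange(60).reshape((2, 6, 5))

print(a_3d[1, 2, 3])        # 43: block 1, row 2, column 3
print(a_3d[0].shape)        # (6, 5): the first 6x5 block
print(a_3d[:, 0, :].shape)  # (2, 5): first row of every block
```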

For random values we can use a seeded `np.random.RandomState`; `randint(0, 10, shape)` draws integers from 0 up to, but not including, 10:

In [7]:
rng = np.random.RandomState(47)
arr_3d = rng.randint(0,10,(2,5,6))
arr_3d

array([[[7, 6, 7, 8, 8, 3],
        [0, 7, 0, 7, 7, 1],
        [7, 2, 2, 1, 7, 4],
        [8, 9, 2, 9, 1, 5],
        [0, 9, 2, 0, 2, 1]],

       [[4, 3, 4, 5, 9, 9],
        [6, 6, 1, 5, 3, 9],
        [8, 9, 4, 3, 5, 6],
        [4, 3, 1, 4, 4, 2],
        [2, 0, 9, 0, 1, 8]]])
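
Seeding is what makes the output above repeatable. The following sketch checks that, and also shows `np.random.default_rng`, a newer interface that is not used in this notebook. The names `first`, `second` and `modern` are illustrative:

```python
import numpy as np

# Two generators built from the same seed produce the same sequence
first = np.random.RandomState(47).randint(0, 10, (2, 5, 6))
second = np.random.RandomState(47).randint(0, 10, (2, 5, 6))
print(np.array_equal(first, second))          # True

# The upper bound is exclusive, so every value lies in 0..9
print(first.min() >= 0 and first.max() <= 9)  # True

# Newer NumPy code often uses default_rng instead; it is a different
# generator, so its values generally differ from RandomState's
modern = np.random.default_rng(47).integers(0, 10, (2, 5, 6))
print(modern.shape)                           # (2, 5, 6)
```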

**Let's try some built-in operations**

Consider the following matrices:

In [31]:
c = np.arange(30).reshape((6,5))
c

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24],
       [25, 26, 27, 28, 29]])

In [33]:
d = np.arange(30).reshape(5,6)
d

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29]])

The matrix product `c @ d` multiplies the `6x5` matrix by the `5x6` matrix and yields a `6x6` matrix:

In [35]:
c @ d

array([[ 180,  190,  200,  210,  220,  230],
       [ 480,  515,  550,  585,  620,  655],
       [ 780,  840,  900,  960, 1020, 1080],
       [1080, 1165, 1250, 1335, 1420, 1505],
       [1380, 1490, 1600, 1710, 1820, 1930],
       [1680, 1815, 1950, 2085, 2220, 2355]])

Element-wise multiplication requires both operands to have the same shape, so we transpose `d` with `d.T` to get a `6x5` matrix.  
The result is also a `6x5` matrix:

In [36]:
c * d.T

array([[  0,   6,  24,  54,  96],
       [  5,  42,  91, 152, 225],
       [ 20,  88, 168, 260, 364],
       [ 45, 144, 255, 378, 513],
       [ 80, 210, 352, 506, 672],
       [125, 286, 459, 644, 841]])
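
The shape rules for the two kinds of product can be verified with a short sketch that reuses the definitions of `c` and `d`:

```python
import numpy as np

c = np.arange(30).reshape(6, 5)
d = np.arange(30).reshape(5, 6)

print((c @ d).shape)     # (6, 6): inner dimensions 5 and 5 match
print((c * d.T).shape)   # (6, 5): both operands are 6x5

# Without the transpose the shapes (6, 5) and (5, 6) cannot be broadcast
try:
    c * d
except ValueError as err:
    print("element-wise product failed:", err)
```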

**Built-in functions such as `sum`, `min` and `cumsum`**

Summing the entire matrix returns a scalar:

In [42]:
d.sum()

435

The sum of each column is obtained with `axis=0`, which returns an array of the column sums:

In [39]:
d.sum(axis=0)

array([60, 65, 70, 75, 80, 85])

The sum of each row is obtained with `axis=1`, which returns an array of the row sums:

In [40]:
d.sum(axis=1)

array([ 15,  51,  87, 123, 159])

Just as with `sum`, calling `min` without arguments returns the minimum element of the entire matrix:

In [41]:
d.min()

0

We can also find the minimum of each column, which returns an array with the minimum element of each column:

In [43]:
d.min(axis=0)

array([0, 1, 2, 3, 4, 5])

We can also find the minimum of each row, which returns an array with the minimum element of each row:

In [44]:
d.min(axis=1)

array([ 0,  6, 12, 18, 24])
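
A reduction along an axis removes that axis from the result. As a sketch of how the shapes change, and of the optional `keepdims` argument that keeps the reduced axis with length 1 (the name `row_share` is illustrative):

```python
import numpy as np

d = np.arange(30).reshape(5, 6)

print(d.sum(axis=0).shape)                  # (6,)  one total per column
print(d.sum(axis=1).shape)                  # (5,)  one total per row
print(d.sum(axis=1, keepdims=True).shape)   # (5, 1)

# With keepdims the row totals broadcast against d, e.g. to get row fractions
row_share = d / d.sum(axis=1, keepdims=True)
print(row_share.sum(axis=1))                # each row sums to 1 (up to rounding)
```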

**The `cumsum` function returns running totals, which is useful, for example, when building a cumulative histogram.**

Cumulative sum along each row (`axis=1`)

In [45]:
d.cumsum(axis=1) 

array([[  0,   1,   3,   6,  10,  15],
       [  6,  13,  21,  30,  40,  51],
       [ 12,  25,  39,  54,  70,  87],
       [ 18,  37,  57,  78, 100, 123],
       [ 24,  49,  75, 102, 130, 159]])

Cumulative sum down each column (`axis=0`)

In [46]:
d.cumsum(axis=0)

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  8, 10, 12, 14, 16],
       [18, 21, 24, 27, 30, 33],
       [36, 40, 44, 48, 52, 56],
       [60, 65, 70, 75, 80, 85]])
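
Two quick checks help confirm what `cumsum` computes. In this sketch (the names `running` and `recovered` are illustrative), the last running total of each row equals the row sum, and differencing the running totals gives back the original values:

```python
import numpy as np

d = np.arange(30).reshape(5, 6)
running = d.cumsum(axis=1)

# The last running total in each row is that row's sum
print(np.array_equal(running[:, -1], d.sum(axis=1)))   # True

# Differencing the running totals recovers the original values
recovered = np.diff(running, axis=1, prepend=0)
print(np.array_equal(recovered, d))                     # True
```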

## Indexing and slicing NumPy arrays

In [47]:
e = np.arange(10)
e

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Reverse array with `e[::-1]`

In [49]:
e[::-1]

array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])

A slice has the form `start:stop:step`. Here we take every second element, starting at index 0 and stopping before index 5:

In [48]:
e[0:5:2]

array([0, 2, 4])

In [56]:
f = np.arange(16).reshape((4,4))
f

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

Selecting the whole second column in the 2D array

In [59]:
f[:,1]

array([ 1,  5,  9, 13])

Equivalently, with explicit row bounds:

In [60]:
f[0:4,1]

array([ 1,  5,  9, 13])

Selecting the second and third row with all columns included:

In [61]:
f[1:3, :]

array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
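
One detail worth knowing about the slices above is that they are views onto the original data, not copies. This sketch (with the illustrative names `middle` and `safe`) shows the effect:

```python
import numpy as np

f = np.arange(16).reshape((4, 4))

# Negative indices count from the end: last row, last column
print(f[-1, -1])        # 15

# A basic slice is a view, so writing through it changes f
middle = f[1:3, :]
middle[0, 0] = -1
print(f[1, 0])          # -1

# .copy() gives independent data
safe = f[1:3, :].copy()
safe[0, 0] = 100
print(f[1, 0])          # still -1
```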

### Operations on arrays

We can flatten the matrix into a 1-D array. There are two methods and one attribute for this:
`f.flatten()` (always returns a copy), `f.ravel()` (returns a view when possible) and `f.flat` (an iterator over the elements)

In [65]:
f.flatten()

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])

In [66]:
f.ravel()

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])

The attribute `f.flat` is a 1-D iterator over the array; passing it to `np.array` turns it into a new NumPy array

In [68]:
np.array(f.flat)

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])
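
The three approaches give the same values but differ in whether they copy the data. A minimal sketch using `np.shares_memory` to tell them apart:

```python
import numpy as np

f = np.arange(16).reshape((4, 4))

print(np.shares_memory(f, f.flatten()))   # False: flatten always copies
print(np.shares_memory(f, f.ravel()))     # True here: f is contiguous, so ravel returns a view

# f.flat is an iterator; it can be indexed or looped over without building a new array
print(f.flat[5])                          # 5
```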

## Splitting and combining arrays

NumPy provides several functions for combining and splitting arrays:  
`vstack`, `hstack`, `column_stack`, `hsplit`, `vsplit`  
We will explore them here.

**Stacking arrays**

Consider the following arrays:

In [69]:
f

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [14]:
e = np.arange(16) * 5
e = e.reshape((4,4))
e

array([[ 0,  5, 10, 15],
       [20, 25, 30, 35],
       [40, 45, 50, 55],
       [60, 65, 70, 75]])

With `vstack` we can combine 2 arrays one on top of another:

In [77]:
v = np.vstack((f,e))
v

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [ 0,  5, 10, 15],
       [20, 25, 30, 35],
       [40, 45, 50, 55],
       [60, 65, 70, 75]])

With `hstack` we can combine 2 arrays one next to another:

In [78]:
h = np.hstack((f,e))
h

array([[ 0,  1,  2,  3,  0,  5, 10, 15],
       [ 4,  5,  6,  7, 20, 25, 30, 35],
       [ 8,  9, 10, 11, 40, 45, 50, 55],
       [12, 13, 14, 15, 60, 65, 70, 75]])

Another way to stack arrays is with the `column_stack` function:

In [87]:
np.column_stack((f,e))

array([[ 0,  1,  2,  3,  0,  5, 10, 15],
       [ 4,  5,  6,  7, 20, 25, 30, 35],
       [ 8,  9, 10, 11, 40, 45, 50, 55],
       [12, 13, 14, 15, 60, 65, 70, 75]])

The `column_stack` function behaves a bit differently with 1D arrays: it treats each array as a column and places the columns side by side in the resulting matrix

In [88]:
a = np.array([2,3])
b = np.array([4,5])
np.column_stack((a,b))

array([[2, 4],
       [3, 5]])
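
For comparison, here is a short sketch of how the three stacking functions treat the same pair of 1D arrays:

```python
import numpy as np

a = np.array([2, 3])
b = np.array([4, 5])

print(np.hstack((a, b)))         # [2 3 4 5]
print(np.vstack((a, b)))         # [[2 3]
                                 #  [4 5]]
print(np.column_stack((a, b)))   # [[2 4]
                                 #  [3 5]]
```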

**Splitting arrays**

The `hsplit` function splits the matrix column-wise into `n` equal parts; the number of columns must be divisible by `n`

In [107]:
f_hsplit = np.hsplit(f,4)
f_hsplit

[array([[ 0],
        [ 4],
        [ 8],
        [12]]),
 array([[ 1],
        [ 5],
        [ 9],
        [13]]),
 array([[ 2],
        [ 6],
        [10],
        [14]]),
 array([[ 3],
        [ 7],
        [11],
        [15]])]

The `vsplit` function splits the matrix row-wise into `n` equal parts; the number of rows must be divisible by `n`

In [104]:
f_vsplit = np.vsplit(f,4)
f_vsplit

[array([[0, 1, 2, 3]]),
 array([[4, 5, 6, 7]]),
 array([[ 8,  9, 10, 11]]),
 array([[12, 13, 14, 15]])]
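
Splits do not have to be equal. As a sketch (the names `left`, `middle`, `right` and `parts` are illustrative), `hsplit` also accepts a list of column indices, and `np.array_split` allows parts of uneven size:

```python
import numpy as np

f = np.arange(16).reshape((4, 4))

# A list of indices splits before columns 1 and 3: widths 1, 2 and 1
left, middle, right = np.hsplit(f, [1, 3])
print(left.shape, middle.shape, right.shape)   # (4, 1) (4, 2) (4, 1)

# An integer that does not divide the 4 columns evenly is rejected
try:
    np.hsplit(f, 3)
except ValueError as err:
    print("hsplit failed:", err)

# np.array_split tolerates uneven parts
parts = np.array_split(f, 3, axis=1)
print([p.shape for p in parts])                # [(4, 2), (4, 1), (4, 1)]
```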