### Numpy is the numerical computing package in Python

In [1]:
import numpy as np

The import statement is used to import external libraries. All functions within the imported library are now accesible through that library's namespace. Since we imported numpy as "np", we can now access the functions within the Numpy library through: np.\<function name\>

In [4]:
np.array( [1, 2, 3, 4] )

array([1, 2, 3, 4])

In [6]:
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [9]:
np.arange(0, 10, 1)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

The arange function takes in the "start" value, "stop" value, and the "step" size as inputs and returns an evenly spaced array. For more information, read the documentation: 

In [8]:
np.linspace?

[0;31mSignature:[0m
[0mnp[0m[0;34m.[0m[0mlinspace[0m[0;34m([0m[0;34m[0m
[0;34m[0m    [0mstart[0m[0;34m,[0m[0;34m[0m
[0;34m[0m    [0mstop[0m[0;34m,[0m[0;34m[0m
[0;34m[0m    [0mnum[0m[0;34m=[0m[0;36m50[0m[0;34m,[0m[0;34m[0m
[0;34m[0m    [0mendpoint[0m[0;34m=[0m[0;32mTrue[0m[0;34m,[0m[0;34m[0m
[0;34m[0m    [0mretstep[0m[0;34m=[0m[0;32mFalse[0m[0;34m,[0m[0;34m[0m
[0;34m[0m    [0mdtype[0m[0;34m=[0m[0;32mNone[0m[0;34m,[0m[0;34m[0m
[0;34m[0m    [0maxis[0m[0;34m=[0m[0;36m0[0m[0;34m,[0m[0;34m[0m
[0;34m[0m[0;34m)[0m[0;34m[0m[0;34m[0m[0m
[0;31mDocstring:[0m
Return evenly spaced numbers over a specified interval.

Returns `num` evenly spaced samples, calculated over the
interval [`start`, `stop`].

The endpoint of the interval can optionally be excluded.

.. versionchanged:: 1.16.0
    Non-scalar `start` and `stop` are now supported.

.. versionchanged:: 1.20.0
    Values are rounded towards ``-inf``

In [3]:
np.linspace(0, 9,10)

array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])

In [3]:
np.zeros(10)

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

In [12]:
np.array([ [1,1], [1,1] ])

array([[1, 1],
       [1, 1]])

In [3]:
#arr = np.ones( (2,2) )
arr = np.ones ( (3,4) )
arr

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

Above, we created a matrix with 3 rows and 4 columns. It has a shape of 3 x 4

In [4]:
arr.shape

(3, 4)

In [8]:
np.full_like(np.zeros(10), 10)

array([10., 10., 10., 10., 10., 10., 10., 10., 10., 10.])

In [16]:
np.ones(10) * 10

array([10., 10., 10., 10., 10., 10., 10., 10., 10., 10.])

In [19]:
x = 2 #integer
y = "this is a string"
x * y

'this is a stringthis is a string'

## Why use numpy?

Firstly, numpy provides access to efficient and optimized implementation of array handling operations. For instance, let's repeat the exercise from the previous notebook, where we created a list of 1000 elements and performed certain operations on each element.

In [21]:
%%timeit
x = []
for i in range(0, 1000, 1):
    x.append(i**2 + 0.5 * i + 2.5)

305 µs ± 3.13 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [22]:
%%timeit 
xn = np.arange(0, 1000, 1)
xn = xn**2 + 0.5 * xn + 2.5

8.98 µs ± 215 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


<br/><br/>
Numpy is ~2 orders of magnitude faster!

The pure Python way of doing things is slower because Python is a dynamically typed language and the compiler may not implement memory optimizations such as loading up the array elements into memory before the operation is carried out.

But the numpy way of doing things is faster because the underlying code is written in C, and we get all the optimization that comes with having defined types and compiler optimizations with memory management. We also avoid the overheads that come with storing the data type and checking it before every operation is carried out, leading to much faster running code!

### Indexing in numpy

In [2]:
x = np.arange(5, 10, 1)

In [5]:
x

array([5, 6, 7, 8, 9])

In [28]:
x.shape

(5,)

In [27]:
print(x[0], x[1], x[2], x[3], x[4])

5 6 7 8 9


In [29]:
print(x[5])

IndexError: index 5 is out of bounds for axis 0 with size 5

#### Index 5 threw an error because our array has elements in the 0th, 1st, 2nd, 3rd, and 4th positions only. Remember, indexing in Python begins from 0

In [22]:
x[ [0,1,2,3] ]

array([5, 6, 7, 8])

You can call elements of the array by passing an array of indices!

In [24]:
x[ np.arange(4) ]

array([5, 6, 7, 8])

The above two cells are equivalent to each other.

### Boolean indexing:
Pass an array of boolean (True/False) values and corresponding elements of the original array are selected

In [6]:
x[[True, False, True, False, True]]

array([5, 7, 9])

In [10]:
x[((x % 2) == 1)]

array([5, 7, 9])

What happened above?

In [11]:
x % 2

array([1, 0, 1, 0, 1])

The percentage sign is known as the modulo operator, which returns the remainder after division. So 5 divided by 2 is equal to 2 with a remainder of 1, 6 divided by 2 is equal to 3 with a remainder of 0 and so on...

In [12]:
(x % 2) == 1

array([ True, False,  True, False,  True])

The above is a logical test for each element of the array produced after the modulo operation. And finally, we pass the boolean array produced from this logical test as boolean indexers of the original array!

### Slicing

In [38]:
print(x[0: 5])

[5 6 7 8 9]


In [33]:
print(x[-1], x[-2], x[-3], x[-4], x[-5])

9 8 7 6 5


Numpy slicing operator is ":" the colon. Use the operator with the lower index and the higher index to access slices of the array

In [32]:
x[0:3]

array([5, 6, 7])

In [15]:
x[0:-1]

array([5, 6, 7, 8])

In [16]:
x[0:-2]

array([5, 6, 7])

You can step over elements of the array using double colons ::<step_size> followed by the step size

In [20]:
x[::2]

array([5, 7, 9])

In [18]:
x[::-1]

array([9, 8, 7, 6, 5])

It also works in reverse!

In [21]:
x[::-2]

array([9, 7, 5])

The indexing and slicing operations work on any N dimensional array. Just seperate the indices by commas:

In [31]:
twoDarray = np.array([ [1,2,3], 
              [4,5,6],
              [7,8,9]])

In [32]:
twoDarray

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

The same can be created using the reshape function in numpy

In [34]:
np.arange(1, 10, 1).reshape(3,3)

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [35]:
twoDarray[0,0]

1

In [36]:
twoDarray[0,1]

2

In [37]:
twoDarray[1,0]

4

In [38]:
twoDarray[ [0, 1, 2], [0, 0, 0] ]

array([1, 4, 7])

In [39]:
twoDarray[:, 0]

array([1, 4, 7])

The above two operations are equivalent, can you work out how? In the 1st case, an array of row indices and an array of column indices were passed. The corresponding elements located at (0,0), (1,0), and (2, 0) were retrieved. In the 2nd cell, the slicing operator selects all the rows and the 0th column.

To see the slicing operation more explicitly, see below:

In [40]:
twoDarray[0:3, 0]

array([1, 4, 7])

### Array operations

All operations on the numpy object are by default applied to every element in the array

In [41]:
x

array([5, 6, 7, 8, 9])

In [42]:
x + 10

array([15, 16, 17, 18, 19])

In [45]:
x

array([1.5, 1.6, 1.7, 1.8, 1.9])

In [44]:
x = (x + 10) / 10

In [51]:
x

array([ 7.5,  1.6,  1.7, 37.5,  1.9])

In [49]:
x[3] = x[0] * 5

In [50]:
x

array([ 7.5,  1.6,  1.7, 37.5,  1.9])

### Exercise 03: Using both arange and linspace

1. Create an array of length 5, 10, 100, 1000 of equally spaced numbers between 0 and 1 
1. Create a 2 dimensional array, "arr", of shape 3x3, with the first row having all 1's, second row having all 2's, and the third row having all 3's
1. Access the diagonal elements of "arr" created above and multiply it by 10
1. Access the first column of "arr" and add it with the third column of "arr"

In [28]:
x = np.arange(100).reshape((10,10))

In [42]:
x

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
       [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
       [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])

In [30]:
x[ np.arange(10), np.arange(10) ]

array([ 0, 11, 22, 33, 44, 55, 66, 77, 88, 99])

## Exercise 04: Fancy indexing

1. Create an array ranging from 0 to 100: arange(0, 100, 1).
1. Reshape this array into 10 rows and 10 columns.
1. Using arrays of indices, select the diagonal elements alone.
1. Using slicing operators, select the 1st 3 elements of the 1st row.
1. Select all elements that are greater than 49.
1. use np.triu_indices to extract the upper triangular elements of the matrix (all elements that lie on or above the diagonal)