# AIDM7330 Basic Programming for Data Science
# Tutorial: Numerical Python (NumPy)
## Introduction to NumPy

NumPy stands for Numerical Python and it is the fundamental package for scientific computing with Python. 

It is a package that lets you efficiently store and manipulate numerical **arrays**. 

NumPy's core feature is its multi-dimensional array object. In NumPy, dimensions are called axes, and the number of axes is called the rank.

First, let's import NumPy as `np`. This lets us use the shortcut `np` to refer to NumPy.

In [None]:
import numpy as np

## Creating Arrays
Create a list and convert it to a NumPy array.

In [None]:
mylist = [1, 2, 3]
x = np.array(mylist)
x

We can do it more succinctly by passing the list directly.

In [None]:
y = np.array([4, 5, 6])
y

Pass in a list of lists to create a multidimensional array (here, a 2\*3 array).

In [None]:
m = np.array([[7, 8, 9], [10, 11, 12]])
m

Use the `shape` attribute to find the dimensions of the array. For a 2-D array this is (rows, columns).

In [None]:
m.shape

In [None]:
x.shape
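
The axes and rank mentioned in the introduction can be inspected directly. The following is a minimal sketch; the names `vec` and `mat` are illustrative and chosen so they do not overwrite the notebook's own variables.

```python
import numpy as np

vec = np.array([1, 2, 3])                    # 1-D array
mat = np.array([[7, 8, 9], [10, 11, 12]])    # 2-D array

# ndim is the number of axes; shape gives the length of each axis
print(vec.ndim, vec.shape)   # 1 (3,)
print(mat.ndim, mat.shape)   # 2 (2, 3)

# size is the total number of elements (the product of the shape)
print(mat.size)              # 6
```

Note that a 1-D array has a one-element shape such as `(3,)`, so it has no separate row and column counts.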

We pass in a start, a stop, and a step size, and `arange` returns evenly spaced values within a given interval.

In [None]:
n = np.arange(0, 30, 2) # start at 0 count up by 2, stop before 30
n

In [None]:
n.reshape(3, 5)

In [None]:
n

`reshape` returns an array with the same data with a <span style="color:DARKORANGE">new</span> shape (a 3\*5 array).

In [None]:
n = n.reshape(3, 5) # reshape array to be 3x5
n
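
As a further illustration of `reshape`, the sketch below (with the illustrative names `seq`, `grid` and `auto`) shows that the original array keeps its shape, that one dimension can be given as `-1` so that NumPy infers it, and that the total number of elements must match.

```python
import numpy as np

seq = np.arange(0, 30, 2)        # 15 values: 0, 2, ..., 28
grid = seq.reshape(3, 5)         # new array object with shape (3, 5)

print(seq.shape)                 # (15,)  the original is unchanged
print(grid.shape)                # (3, 5)

# -1 asks NumPy to infer that dimension from the total size
auto = seq.reshape(3, -1)
print(auto.shape)                # (3, 5)

# The element count must match: 15 values cannot fill a 4x4 array
try:
    seq.reshape(4, 4)
except ValueError as err:
    print("reshape failed:", err)
```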

`linspace` returns evenly spaced numbers over a specified interval.

In [None]:
o = np.linspace(0, 4, 9) # return 9 evenly spaced values from 0 to 4
o

In [None]:
o = np.linspace(0, 4, 9, endpoint=False) # return 9 evenly spaced values from 0 up to, but not including, 4
o
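
To make the effect of `endpoint` concrete, here is a small sketch that uses the `retstep=True` option of `linspace` to also return the spacing between samples. The variable names are illustrative.

```python
import numpy as np

# retstep=True also returns the spacing between samples
closed, step_closed = np.linspace(0, 4, 9, retstep=True)
half_open, step_open = np.linspace(0, 4, 9, endpoint=False, retstep=True)

print(closed[-1], step_closed)    # 4.0 0.5  (stop value included)
print(half_open[-1] < 4)          # True     (stop value excluded)
print(step_open)                  # 4/9, roughly 0.444

# arange takes a step size instead of a count and always excludes stop
print(np.arange(0, 4, 0.5))
```

In short, `linspace` fixes the number of samples while `arange` fixes the step size.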

`resize` changes the shape and size of an array <span style="color:DARKORANGE">in-place</span>.

In [None]:
o.resize(3, 3)
o
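
The `resize` method should not be confused with the function `np.resize`. The sketch below contrasts the two when the new shape needs more elements than the array holds. The names `base`, `repeated` and `grown` are illustrative, and `refcheck=False` is used here only as an assumption that no other array shares the data.

```python
import numpy as np

base = np.arange(4)                       # [0 1 2 3]

# np.resize (function): returns a new array, repeating the data to fill it
repeated = np.resize(base, (2, 3))
print(repeated)                           # rows: [0 1 2] and [3 0 1]
print(base.shape)                         # (4,)  base is untouched

# ndarray.resize (method): modifies the array in place, padding with zeros.
# refcheck=False skips the reference check that can raise ValueError in
# interactive sessions; use it only when no other array shares the data.
grown = np.arange(4)
grown.resize((2, 3), refcheck=False)
print(grown)                              # rows: [0 1 2] and [3 0 0]
```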

`ones` returns a new array of given shape and type, filled with ones.

In [None]:
np.ones((3, 2))

`zeros` returns a new array of given shape and type, filled with zeros.

In [None]:
np.zeros((2, 3))

`eye` returns a 2-D array with ones on the diagonal and zeros elsewhere.

In [None]:
np.eye(4)

`diag` extracts a diagonal or constructs a diagonal array.

In [None]:
print(x)
np.diag(x)
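
The cell above shows only the "construct" direction of `diag`. A minimal sketch of both directions follows, with the illustrative names `values` and `square`.

```python
import numpy as np

values = np.array([1, 2, 3])

# 1-D input: build a 2-D array with the values on the main diagonal
square = np.diag(values)
print(square)

# 2-D input: extract the main diagonal as a 1-D array
print(np.diag(square))            # [1 2 3]

# k shifts the diagonal: k=1 is the one just above the main diagonal
print(np.diag(square, k=1))       # [0 0]
```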

Create an array by repeating a list (or see `np.tile`).

In [None]:
np.array([1, 2, 3] * 3)

Repeat elements of an array using `repeat`.

In [None]:
np.repeat([1, 2, 3], 3)
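
The difference between repeating a whole sequence and repeating each element is easiest to see side by side. A small sketch with illustrative variable names:

```python
import numpy as np

# Repeating the whole sequence: list multiplication or np.tile
from_list = np.array([1, 2, 3] * 3)
tiled = np.tile([1, 2, 3], 3)
print(from_list)   # [1 2 3 1 2 3 1 2 3]
print(tiled)       # [1 2 3 1 2 3 1 2 3]

# Repeating each element in turn: np.repeat
repeated = np.repeat([1, 2, 3], 3)
print(repeated)    # [1 1 1 2 2 2 3 3 3]
```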

### Combining Arrays

In [None]:
p = np.ones([2, 3], int)
p

Use `vstack` to stack arrays in sequence vertically (row wise).

In [None]:
np.vstack([p, 2*p])

Use `hstack` to stack arrays in sequence horizontally (column wise).

In [None]:
np.hstack([p, 2*p])
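
To check which axis grows in each case, the sketch below prints the resulting shapes and compares them with `np.concatenate` using an explicit `axis`. The names `block`, `tall` and `wide` are illustrative.

```python
import numpy as np

block = np.ones([2, 3], int)

tall = np.vstack([block, 2 * block])   # stacks along axis 0 (rows)
wide = np.hstack([block, 2 * block])   # stacks along axis 1 (columns)
print(tall.shape)                      # (4, 3)
print(wide.shape)                      # (2, 6)

# For 2-D inputs, np.concatenate with an explicit axis gives the same result
print(np.array_equal(tall, np.concatenate([block, 2 * block], axis=0)))  # True
print(np.array_equal(wide, np.concatenate([block, 2 * block], axis=1)))  # True
```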

## Operations
Use `+`, `-`, `*`, `/` and `**` to perform element-wise addition, subtraction, multiplication, division and power.

In [None]:
x

In [None]:
y

In [None]:
print(x + y) # elementwise addition     [1 2 3] + [4 5 6] = [5  7  9]
print(x - y) # elementwise subtraction  [1 2 3] - [4 5 6] = [-3 -3 -3]

In [None]:
print(x * y) # elementwise multiplication  [1 2 3] * [4 5 6] = [4  10  18]
print(x / y) # elementwise division        [1 2 3] / [4 5 6] = [0.25  0.4  0.5]

In [None]:
x = x/3
x

In [None]:
print(x**2) # elementwise power; x was divided by 3 above, so [1/3 2/3 1] ** 2 is about [0.111 0.444 1.]

In [None]:
x

Use `.dtype` to see the data type of the elements in the array.

In [None]:
x.dtype
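
The data type can change as a result of an operation. The following sketch, which uses the illustrative names `ints` and `thirds`, shows how true division turns an integer array into a float array.

```python
import numpy as np

ints = np.array([1, 2, 3])
print(ints.dtype)                 # an integer type; the exact width is platform-dependent

# True division (/) always produces floats, even for integer inputs
thirds = ints / 3
print(thirds.dtype)               # float64

# Floor division (//) keeps the integer dtype
print((ints // 2).dtype == ints.dtype)   # True

# astype returns a converted copy; the fractional part is truncated
print(thirds.astype(int))         # [0 0 1]
```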

## Math Functions
NumPy has many built-in math functions that can be performed on arrays.

In [None]:
a = np.array([-4, -2, 1, 3, 5])

In [None]:
a.sum()

In [None]:
a.max()

In [None]:
a.min()

In [None]:
a.mean()

In [None]:
a.std()

`argmax` and `argmin` return the index of the maximum and minimum values in the array.

In [None]:
a.argmax()

In [None]:
a.argmin()
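
On a multidimensional array, these functions also accept an `axis` argument that selects the axis to reduce over. A minimal sketch, assuming a small 2-D array named `table` (an illustrative name):

```python
import numpy as np

table = np.array([[7, 8, 9], [10, 11, 12]])

print(table.sum())            # 57, over all elements
print(table.sum(axis=0))      # [17 19 21], one sum per column
print(table.sum(axis=1))      # [24 33], one sum per row
print(table.mean(axis=1))     # [ 8. 11.]
print(table.argmax())         # 5, index into the flattened array
print(table.argmax(axis=0))   # [1 1 1], row index of each column's maximum
```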

## Indexing / Slicing

In [None]:
s = np.arange(13)**2
s

Use bracket notation to get the value at a specific index. Remember that indexing starts at 0.

In [None]:
s[0], s[4], s[-1]

Use `:` to indicate a range. `array[start:stop]`


Leaving `start` or `stop` empty will default to the beginning/end of the array.

In [None]:
s[1:5]

Use negatives to count from the back.

In [None]:
s[-4:]

A second `:` can be used to indicate step-size.

Here we are starting at the 5th element from the end and counting backwards by 2 until the beginning of the array is reached.

In [None]:
s[-5::-2]
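
The sign of the step sets the direction of traversal. The sketch below compares a positive and a negative step from the same starting position, using an array of squares named `squares` (an illustrative name).

```python
import numpy as np

squares = np.arange(13) ** 2     # [0 1 4 9 ... 121 144]

print(squares[-5])               # 64, the 5th element from the end

# Positive step: move forward from that element to the end of the array
print(squares[-5::2])            # [ 64 100 144]

# Negative step: move backwards from that element to the beginning
print(squares[-5::-2])           # [64 36 16  4  0]

# A step of -1 with no start or stop reverses the whole array
print(squares[::-1][:3])         # [144 121 100]
```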

Let's look at a multidimensional array.

In [None]:
r = np.arange(36)
r.resize((6, 6))
r

We can get a specific value by using the comma notation: `array[row, column]`

In [None]:
r[2, 2]

And use `:` to select a range of rows or columns.

In [None]:
r[3, 3:6]

Here we are selecting all the rows up to (and not including) row 2, and all the columns up to (and not including) the last column.

In [None]:
r[:2, :-1]

This is a slice of the last row, and only every other element.

In [None]:
r[-1, ::2]
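
A few more combinations of the `array[row, column]` notation are sketched below, on a 6\*6 array named `grid` (an illustrative name). A bare `:` selects everything along that axis.

```python
import numpy as np

grid = np.arange(36).reshape(6, 6)

print(grid[1])            # [ 6  7  8  9 10 11], the whole second row
print(grid[:, 1])         # [ 1  7 13 19 25 31], the whole second column

# Ranges on both axes select a rectangular block
sub = grid[1:3, 2:4]
print(sub)                # rows: [8 9] and [14 15]
print(sub.shape)          # (2, 2)
```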

We can also perform conditional indexing. Here we are selecting values from the array that are greater than 30.

In [None]:
r[r > 30]

Here we are setting every value in the array that is greater than 30 to 30.

In [None]:
r[r > 30] = 30
r
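
Conditional indexing works because the comparison itself produces a boolean array, which is then used as a mask. The sketch below makes that intermediate step visible. The names `grid`, `mask` and `capped` are illustrative, and `np.minimum` is shown as one alternative that leaves the original array unmodified.

```python
import numpy as np

grid = np.arange(36).reshape(6, 6)

mask = grid > 30                 # boolean array with the same shape as grid
print(mask.dtype, mask.shape)    # bool (6, 6)
print(mask.sum())                # 5 elements satisfy the condition (31..35)
print(grid[mask])                # [31 32 33 34 35]

# np.minimum caps the values without modifying grid
capped = np.minimum(grid, 30)
print(capped.max(), grid.max())  # 30 35
```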

## Copying Data
Be careful with copying and modifying arrays in NumPy!


`r2` is a slice of `r`

In [None]:
r2 = r[:3,:3]
r2

Set this slice's values to zero (`[:]` selects the entire array).

In [None]:
r2[:] = 0
r2

`r` has also been changed!

In [None]:
r

To avoid this, use `r.copy()` to create a copy that will not affect the original array.

In [None]:
r_copy = r.copy()
r_copy

Now when `r_copy` is modified, `r` will not be changed.

In [None]:
r_copy[:] = 10
print(r_copy, '\n')
print(r)
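
One way to check whether two arrays refer to the same underlying data is `np.shares_memory`. The following is a minimal sketch with the illustrative names `original`, `view` and `duplicate`.

```python
import numpy as np

original = np.arange(36).reshape(6, 6)

view = original[:3, :3]              # basic slicing returns a view
duplicate = original[:3, :3].copy()  # an independent copy of the same block

print(np.shares_memory(original, view))        # True
print(np.shares_memory(original, duplicate))   # False

duplicate[:] = -1
print(original[0, 0])            # 0   the copy is independent
view[:] = -1
print(original[0, 0])            # -1  writing through the view changes original
```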

- The code in this notebook is modified from various sources. All code is for educational purposes only and released under the CC1.0.