# NumPy

*Note: This notebook is edited from [Robin Yuen's tutorial on Python](https://github.com/Robinysh/ASC19-Deep-Learning-Tutorial/blob/master/python_tutorial.ipynb).*

NumPy is a Python library for scientific computing. It provides simple syntax and high-performance array objects. If you have used MATLAB, you will find the syntax familiar.

### How does NumPy provide such high performance?

Locality of reference

- Lists: an array of pointers to objects. (Everything in Python is an object!)

- NumPy arrays: like C arrays, densely packed values of a homogeneous type.

- Vectorized operations are supported.

NumPy operations are implemented in C

- Compiled C loops run much faster than interpreted Python loops.

- No pointer indirection or per-element dynamic type checking, unlike Python lists.
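
The points above can be made concrete with a small sketch. It inspects how a NumPy array stores its data and checks that one vectorized expression gives the same result as an explicit Python loop. The size `n` and the explicit `dtype=np.int64` are assumptions chosen for illustration.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)  # dtype fixed explicitly for this example

# A NumPy array stores raw values back to back: 8 bytes per int64 element.
print(np_arr.dtype, np_arr.itemsize, np_arr.nbytes)  # int64 8 8000
print(np_arr.flags['C_CONTIGUOUS'])                  # True

# A list stores pointers; every element is a separate Python object.
print(type(py_list[0]), sys.getsizeof(py_list[0]))

# One vectorized expression replaces an explicit Python loop.
loop_result = [v * 2 for v in py_list]
vec_result = np_arr * 2
print(vec_result.tolist() == loop_result)  # True
```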




## Array Declaration

In [None]:
import numpy as np # Convention

a = np.array([1, 2, 3])   # Create a rank 1 array
print('a shape:', a.shape)            # Prints "a shape: (3,)"
print('a:', a)                  # Prints "a: [1 2 3]"

b = np.array([[1,2,3],[4,5,6]])    # Create a rank 2 array
print('b shape:', b.shape)                     # Prints "b shape: (2, 3)"
print('b:', b)                     # Prints both rows, [1 2 3] and [4 5 6]

## Accessing Arrays

### Accessing Elements

In [None]:
a = np.array([1, 2, 3])
b = np.array([[1,2,3],[4,5,6]])

print(a[0], a[1], a[2])   # Prints "1 2 3"
a[0] = 5                  # Change an element of the array
print(a)                  # Prints "[5 2 3]"

print(b[0, 0], b[0, 1], b[1, 0])   # Prints "1 2 4"
# Also prints "1 2 4", but slower: b[0, 0] fetches the element in one step,
# while b[0][0] first builds the row b[0] and then indexes into it.
print(b[0][0], b[0][1], b[1][0])

### Slicing

In [None]:
# Create the following rank 2 array with shape (3, 4)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print(a)
row_r1 = a[1, :]    # Rank 1 view of the second row of a
col_r2 = a[:, 1:3]  # Rank 2 view of columns 1 and 2 of a
print(row_r1, row_r1.shape)  # Prints "[5 6 7 8] (4,)"
print(col_r2, col_r2.shape)  # Prints "[[ 2 3]
                             #          [ 6 7]
                             #          [10 11]] (3, 2)"
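
The comments above call the slices views. A short sketch of what that means in practice: writing through a slice changes the original array, while `.copy()` gives an independent array. It also contrasts an integer index, which drops an axis, with a slice, which keeps it. The names `row_r2` and `col_copy` are introduced here for illustration only.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

row_r1 = a[1, :]     # integer index drops the axis: shape (4,)
row_r2 = a[1:2, :]   # slice keeps the axis: shape (1, 4)
print(row_r1.shape, row_r2.shape)  # (4,) (1, 4)

# A slice is a view: writing through it changes the original array.
row_r1[0] = 50
print(a[1, 0])  # 50

# Use .copy() when an independent array is needed.
col_copy = a[:, 1:3].copy()
col_copy[0, 0] = -1
print(a[0, 1])  # 2 (unchanged)
```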

In [None]:
# Indexing methods that work for lists also work for NumPy arrays.
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
b = np.array([1,2,3,4,5,6,7,8,9])
print(b[::-1])  # Prints "[9 8 7 6 5 4 3 2 1]"

### Integer Array Indexing

In [None]:
a = np.array([[1,2], [3, 4], [5, 6]])
print(a[[0, 2], [1, 0]])  # Gives [a[0,1],a[2,0]] which is [2 5]
print(a[[0, 1, 2], [0, 1, 0]])  # Prints "[1 4 5]"
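
To spell out how the two index lists pair up, the sketch below builds the same result element by element. It also shows that, unlike a slice, integer array indexing returns a copy. The names `rows`, `cols`, `picked`, and `manual` are illustrative.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])
rows = [0, 2]
cols = [1, 0]

picked = a[rows, cols]                  # pairs up (0, 1) and (2, 0)
manual = np.array([a[0, 1], a[2, 0]])   # the same elements, picked one by one
print(picked, manual)  # [2 5] [2 5]

# The result is a copy, so modifying it leaves `a` untouched.
picked[0] = 99
print(a[0, 1])  # 2
```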

## Array Maths
By default NumPy arrays carry out elementwise operations. That means most of the time **you don't need a for loop to iterate over NumPy arrays**.
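
As a minimal illustration of that claim, the sketch below computes the same expression twice, once with an explicit loop and once as a single array expression. The arrays and the expression `x * y + 1.0` are arbitrary choices for the example.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([10.0, 20.0, 30.0, 40.0])

# Explicit loop: one Python-level iteration per element.
out_loop = np.empty_like(x)
for i in range(len(x)):
    out_loop[i] = x[i] * y[i] + 1.0

# Vectorized: the same computation as one elementwise expression.
out_vec = x * y + 1.0

print(np.array_equal(out_loop, out_vec))  # True
print(out_vec.tolist())                   # [11.0, 41.0, 91.0, 161.0]
```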

### Elementwise operations

In [None]:
x = np.array([[1.,2.],[3.,4.]])
y = np.array([[5.,6.],[7.,8.]])

# Elementwise sum;
# [[ 6.0  8.0]
#  [10.0 12.0]]
print(x + y)
# Elementwise difference;
# [[-4.0 -4.0]
#  [-4.0 -4.0]]
print(x - y)
# Elementwise product;
# [[ 5.0 12.0]
#  [21.0 32.0]]
print(x * y)
# Elementwise square root;
# [[ 1.          1.41421356]
#  [ 1.73205081  2.        ]]
print(np.sqrt(x))

### Array Maths with Integer Array Indexing

In [None]:
a = np.array([[1,2], [3, 4], [5, 6]])
# Create an array of indices
b = np.array([0, 1, 0])
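print(a)
# Select one element from each row of a using the indices in b
print(a[np.arange(3), b])   # Prints "[1 4 5]"
print(a[[0,1,2], [0,1,0]])  # Same selection with explicit index lists
# Mutate one element from each row of a using the indices in b
a[np.arange(3), b] += 10
print(a)  # Prints "[[11  2]
          #          [ 3 14]
          #          [15  6]]"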

### Boolean Array and Indexing

In [None]:
a = np.array([[1,2], [3, 4], [5, 6]])
bool_idx = (a > 2)   # Find the elements of a that are bigger than 2;
                     # this returns a numpy array of Booleans of the same
                     # shape as a, where each slot of bool_idx tells
                     # whether that element of a is > 2.
print(a)
print(bool_idx)      # Prints "[[False False]
                     #          [ True  True]
                     #          [ True  True]]"

# We use boolean array indexing to construct a rank 1 array
# consisting of the elements of a corresponding to the True values
# of bool_idx
print(a[bool_idx])  # Prints "[3 4 5 6]"

# We can do all of the above in a single concise statement:
print(a[a > 2])     # Prints "[3 4 5 6]"
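
Boolean masks can also be counted, combined, and used on the left-hand side of an assignment. The sketch below shows these three uses. The name `clipped` and the second condition `a % 2 == 0` are assumptions added for illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])
mask = a > 2

# True counts as 1, so summing a mask counts the matching elements.
print(mask.sum())  # 4

# Assigning through a mask changes only the selected elements.
clipped = a.copy()
clipped[mask] = 0
print(clipped.tolist())  # [[1, 2], [0, 0], [0, 0]]

# Combine conditions with & and |; the parentheses are required.
print(a[(a > 2) & (a % 2 == 0)])  # [4 6]
```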

### Matrix Products

In [None]:
x = np.array([[1.,2.],[3.,4.]])
y = np.array([[5.,6.],[7.,8.]])
v = np.array([9,10])
w = np.array([11, 12])

# Inner product of vectors; all three are equivalent and print "219"
print(v.dot(w))
print(np.dot(v, w))
print(v @ w) # @ is the matrix multiplication operator

# Matrix / vector product; all three produce the rank 1 array [29. 67.]
print(x.dot(v))
print(np.dot(x, v))
print(x @ v)

# Matrix / matrix product; all three produce the rank 2 array
# [[19. 22.]
#  [43. 50.]]
print(x.dot(y))
print(np.dot(x, y))
print(x @ y)
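
A common source of confusion is that `*` is the elementwise product, not the matrix product. The sketch below puts the two side by side using the same `x` and `y` as above.

```python
import numpy as np

x = np.array([[1., 2.], [3., 4.]])
y = np.array([[5., 6.], [7., 8.]])

elementwise = x * y   # multiplies matching entries
matmul = x @ y        # rows of x times columns of y

print(elementwise.tolist())  # [[5.0, 12.0], [21.0, 32.0]]
print(matmul.tolist())       # [[19.0, 22.0], [43.0, 50.0]]
```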

### sum

In [None]:
x = np.array([[1,2],[3,4]])

print(np.sum(x))  # Compute sum of all elements; prints "10"
# axis is the dimension that the operation collapses:
# axis=0 sums down the rows, giving one value per column.
print(np.sum(x, axis=0))  # Compute sum of each column; prints "[4 6]".
print(x.sum(axis=0))  # Equivalent to np.sum(x, axis=0)

print(x)    # Prints "[[1 2]
            #          [3 4]]"
print(x.T)  # Prints "[[1 3]
            #          [2 4]]"
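
To round out the `axis` argument, the sketch below sums along the other axis and shows `keepdims`, which keeps the collapsed axis with length 1. It also checks that summing the transpose along axis 0 matches summing the original along axis 1.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])

col_sums = np.sum(x, axis=0)                 # one value per column
row_sums = np.sum(x, axis=1)                 # one value per row
kept = np.sum(x, axis=1, keepdims=True)      # collapsed axis kept with length 1

print(col_sums)      # [4 6]
print(row_sums)      # [3 7]
print(kept.shape)    # (2, 1)
print(np.array_equal(x.T.sum(axis=0), row_sums))  # True
```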

### arange

Works like Python's built-in `range`, but returns a NumPy array.

In [None]:
print(np.arange(1,10,2)) # Prints [1 3 5 7 9]
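
A brief comparison with `range`, as a sketch: the values match for integer arguments, and `arange` additionally accepts a non-integer step. `np.linspace` is shown as a related function that takes a number of points instead of a step; it is not part of the original cell.

```python
import numpy as np

from_arange = np.arange(1, 10, 2)
from_range = list(range(1, 10, 2))
print(from_arange.tolist() == from_range)  # True

# Unlike range, arange accepts a non-integer step.
quarters = np.arange(0, 1, 0.25)
print(quarters.tolist())  # [0.0, 0.25, 0.5, 0.75]

# linspace takes the number of points and includes the end point.
points = np.linspace(0, 1, 5)
print(points.tolist())    # [0.0, 0.25, 0.5, 0.75, 1.0]
```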

### flatten

Flattens a multidimensional array into a rank 1 array, in C (row-major) order by default.

In [None]:
x = np.array([[1,2],[3,4]])
print(x)            # Prints "[[1 2]
                    #          [3 4]]"
print(x.flatten())  # Prints "[1 2 3 4]"
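
To show what "C order" means, the sketch below flattens the same array in C (row by row) and in Fortran (column by column) order. It also checks that `flatten` returns a copy, so changing the result does not affect the original.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])

flat_c = x.flatten()            # C order: row by row
flat_f = x.flatten(order='F')   # Fortran order: column by column
print(flat_c)  # [1 2 3 4]
print(flat_f)  # [1 3 2 4]

# flatten returns a copy, so the original array is unaffected.
flat_c[0] = 100
print(x[0, 0])  # 1
```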

### einsum

`einsum` evaluates expressions written in Einstein summation notation. It is useful for manipulating tensors, for example in physics.

In [None]:
a = np.arange(25).reshape(5,5)
b = np.arange(5)
np.einsum('ii', a)       # 60; add all diagonal entries, a_ii
np.einsum('ii->i', a)    # array([ 0,  6, 12, 18, 24]); get all diagonal entries, a_ii, without summing
np.einsum('ij,j', a, b)  # array([ 30,  80, 130, 180, 230]); a_ij b_j, matrix vector multiplication
a = np.arange(60.).reshape(3,4,5)
b = np.arange(24.).reshape(4,3,2)
np.einsum('ijk,jil->kl', a, b)  # a_ijk b_jil -> c_kl, rank 3 tensor contraction
#array([[ 4400.,  4730.],
#       [ 4532.,  4874.],
#       [ 4664.,  5018.],
#       [ 4796.,  5162.],
#       [ 4928.,  5306.]])
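
One way to read each `einsum` string is to compare it with a more familiar NumPy call. The sketch below checks the four expressions above against `np.trace`, `np.diag`, `@`, and `np.tensordot`. The names `a3` and `b3` are used here for the rank 3 arrays to avoid reusing `a` and `b`.

```python
import numpy as np

a = np.arange(25).reshape(5, 5)
b = np.arange(5)

# 'ii': repeated index with no output index, so the diagonal is summed.
print(np.einsum('ii', a) == np.trace(a))                   # True
# 'ii->i': the index is kept in the output, so nothing is summed.
print(np.array_equal(np.einsum('ii->i', a), np.diag(a)))   # True
# 'ij,j': j is summed over, leaving i; this is a matrix-vector product.
print(np.array_equal(np.einsum('ij,j', a, b), a @ b))      # True

a3 = np.arange(60.).reshape(3, 4, 5)
b3 = np.arange(24.).reshape(4, 3, 2)
# Sum over i and j: axis 1 of a3 pairs with axis 0 of b3, axis 0 with axis 1.
via_einsum = np.einsum('ijk,jil->kl', a3, b3)
via_tensordot = np.tensordot(a3, b3, axes=([1, 0], [0, 1]))
print(np.allclose(via_einsum, via_tensordot))              # True
```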

## Array Broadcasting
Broadcasting lets NumPy carry out elementwise operations on arrays of different shapes. For example, when a row vector is added to a column vector, each vector is conceptually copied along its length-1 axis until both are 2D arrays of the same shape, and the elementwise addition is then carried out.

Dimensions of two arrays are compatible when:
1. they are equal, or
1. one of them is 1

Check out the [NumPy Manual](https://docs.scipy.org/doc/numpy-1.13.0/user/basics.broadcasting.html#general-broadcasting-rules) for more examples.
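
The compatibility rules above can be sketched as a small function. `broadcast_shape` is a hypothetical helper written for this tutorial, not a NumPy function. It assumes the standard NumPy convention that shapes are aligned from the trailing dimension and that missing leading dimensions count as 1.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: broadcast result shape, or None if incompatible."""
    ndim = max(len(shape_a), len(shape_b))
    # Align shapes on the trailing dimension by padding the front with 1s.
    padded_a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    padded_b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    for dim_a, dim_b in zip(padded_a, padded_b):
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            return None  # neither equal nor 1: not compatible
    return tuple(result)

print(broadcast_shape((4, 1), (1, 3)))  # (4, 3)
print(broadcast_shape((4, 1), (3,)))    # (4, 3)
print(broadcast_shape((2, 3), (4,)))    # None

# Cross-check against what NumPy actually produces.
actual = (np.zeros((4, 1)) + np.zeros((1, 3))).shape
print(actual)  # (4, 3)
```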

![Broadcasting](./images/broadcasting.png)

In [None]:
a = np.array([[0], [10], [20], [30]])
b = np.array([[0, 1, 2]])
print(a)
print(b)
print(a+b)
print(a*b)


Indexing with `None` inserts a new axis of length 1 at that position, which determines the axis along which the array is expanded during broadcasting. This can be more convenient than using `reshape()` or `transpose()`.

In [None]:
import numpy as np
a = np.array([1,2,3])
b = np.array([4,5,6,7])
print(a[None, :])
print(b[:, None])
a[None, :] - b[:, None]
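
To make the shapes in the cell above explicit, the sketch below prints them and checks the result against the equivalent `reshape()` version. `np.newaxis` is an alias for `None`; the names `row`, `col`, and `diff` are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6, 7])

print(np.newaxis is None)  # True

row = a[None, :]   # shape (1, 3)
col = b[:, None]   # shape (4, 1)
diff = row - col   # broadcast to shape (4, 3); diff[i, j] = a[j] - b[i]
print(row.shape, col.shape, diff.shape)  # (1, 3) (4, 1) (4, 3)

# The same result written with reshape().
diff_reshape = a.reshape(1, 3) - b.reshape(4, 1)
print(np.array_equal(diff, diff_reshape))  # True
```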

Broadcasting also applies when setting array elements:

In [None]:
a = np.ones((4, 5))
a[0] = 2  # we assign an array of dimension 0 (a scalar) to an array of dimension 1 (the row a[0])
print(a) #array([[ 2.,  2.,  2.,  2.,  2.], 
         #       [ 1.,  1.,  1.,  1.,  1.],
         #       [ 1.,  1.,  1.,  1.,  1.],
         #       [ 1.,  1.,  1.,  1.,  1.]])
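
The same mechanism works for larger targets. In the sketch below, a rank 1 array is broadcast across two rows and a scalar is broadcast down a column. The values assigned are arbitrary choices for the example.

```python
import numpy as np

a = np.ones((4, 5))
a[0] = 2                 # scalar broadcast across row 0
a[1:3] = np.arange(5)    # shape (5,) broadcast to the (2, 5) block of rows 1 and 2
a[:, 0] = -1             # scalar broadcast down column 0

print(a[0].tolist())  # [-1.0, 2.0, 2.0, 2.0, 2.0]
print(a[1].tolist())  # [-1.0, 1.0, 2.0, 3.0, 4.0]
print(a[3].tolist())  # [-1.0, 1.0, 1.0, 1.0, 1.0]
```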