## Lecture 2 - Introduction to NumPy

### Basics of using [NumPy](https://numpy.org/doc/stable/).

We will use the packages `numpy` and `time` in this notebook.
> If you are not using Google Colab (and haven't used NumPy before), you might have to install `numpy` the very first time; `time` is part of the Python standard library and needs no installation. To install a package, use one of the following terminal commands from your home directory.

`py -m pip install <package name>` (Windows)

`sudo pip install <package name>` (Linux or Unix-based)

In [9]:
import numpy as np
import time

---

### NumPy arrays

The main object that you work with in NumPy is an `ndarray`, constructed by typing `np.array(the_list)`.

If the items in the list are numerical &ndash; `int` or `float` &ndash; then the `ndarray` object is much like a vector &ndash; also called a _tensor of order 1_ (confusingly, some call this a "rank 1 tensor", though this is not rank in the usual sense).

For any integer $d\ge 1$, you can have a NumPy array that is a tensor of order $d$. You may also read it described as a "$d$-dimensional array."

One of the attributes of a NumPy array is its _shape_. When it is order 1, the shape has the form `(n,)`, where `n` is the number of items (coordinates) in that vector.

In [10]:
v = np.array([-1,1,1])

# Here is the shape of v
v.shape

(3,)

In [6]:
type(v)

numpy.ndarray
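To make the idea of an order-$d$ array concrete, here is a small sketch of an order-3 array. The helpers `np.arange` and `reshape`, and the attribute `ndim`, are not introduced in this lecture; they are used here only as a convenient way to build and inspect the example.

```python
import numpy as np

# An order-3 array: 2 blocks, each a 3 x 4 matrix (values chosen only for illustration)
T = np.arange(24).reshape(2, 3, 4)

print(T.ndim)       # number of axes, i.e. the order d
print(T.shape)      # length along each axis
print(T[1, 2, 3])   # one index per axis, separated by commas
```

The shape of an order-$d$ array is a tuple with $d$ entries, just as `(n,)` has one entry for a vector.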

An array made from a list of lists, each "inside" list having numeric type items, is a tensor of order 2, i.e., a matrix. (The inside lists must all have the same length.)

If `A` is such an object, then to refer to the entry in row `i` and column `j`, use `A[i,j]`.

Below, we make two matrices, `A` and `B`, and another vector `u`.  See how we can do linear combinations as in linear algebra.

In [11]:
u = np.array([1,1,0])
A = np.array([[1,2,3],[4,5,6]])
B = np.array([[1,0],[1,-1],[1,1]])

In [4]:
v + 2*u

array([1, 3, 1])

In [5]:
2*A

array([[ 2,  4,  6],
       [ 8, 10, 12]])

In [6]:
# the shape of a 2d array is (m,n) when it is an m x n matrix
A.shape

(2, 3)

Multiplying two matrices, or a matrix times a vector, is easy. To multiply matrices, use the `@` operator (available since Python 3.5; for arrays it is equivalent to `np.matmul()`). This works for a matrix times a vector too. For matrices and vectors (not higher-order tensors) you could also use `np.dot()`. When a matrix is given first and a vector second, `np.dot()` takes the dot product of each row of the matrix with the vector.

In [7]:
# products with arrays
A@B

array([[ 6,  1],
       [15,  1]])

In [12]:
A@v

array([4, 7])
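As a quick check of the claims about `@`, `np.matmul()`, and `np.dot()`, the sketch below recomputes the products above in all three ways. The comparison helper `np.array_equal` is not part of the lecture; it is assumed here only to confirm that the results agree.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])
B = np.array([[1, 0], [1, -1], [1, 1]])
v = np.array([-1, 1, 1])

# For 2d arrays, all three give the usual matrix product
same_as_matmul = np.array_equal(A @ B, np.matmul(A, B))
same_as_dot = np.array_equal(A @ B, np.dot(A, B))

# Matrix first, vector second: dot product of each row of A with v
Av = np.dot(A, v)

print(same_as_matmul, same_as_dot, Av)
```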

#### Indexing and slicing

Indexing a 1d array is the same as for a list. In a matrix (2d array), separate the row and column indices by a comma. So, `A[0,1]` is the entry in the top row at column index 1.

In [17]:
print(v[0])
print(A[0,1])

-1
2


Slicing of a 1d array is also the same as for lists. For a 2d array, you can use slicing in each of the index positions (for rows and for columns). 

For example, `A[:,0]` gives all entries in the first column &ndash; the colon goes through all rows, and you just take index 0 item from each row.

In [19]:
print(A)
A[:,0]

[[1 2 3]
 [4 5 6]]


array([1, 4])

Say we have a larger matrix, but want a submatrix.

In [29]:
M = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]])

# To get the top-right 2x2 submatrix
M[:2, -2:] # <- first two rows, last two columns

array([[3, 4],
       [7, 8]])

In [30]:
# With arrays, you can even get non-consecutive slices
M[:,[0,2]] # <- get entries with column indices 0 and 2

array([[ 1,  3],
       [ 5,  7],
       [ 9, 11],
       [13, 15]])
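A few more slices of the same matrix, as a sketch. The variable names are made up for illustration; the step syntax `::2` works as it does for lists.

```python
import numpy as np

M = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])

bottom_left = M[-2:, :2]   # last two rows, first two columns
every_other = M[::2, ::2]  # rows 0 and 2, columns 0 and 2
first_and_last = M[[0, 3], :]  # non-consecutive rows: indices 0 and 3

print(bottom_left)
print(every_other)
print(first_and_last)
```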

The transpose of a matrix is obtained simply by typing `.T` after the matrix.

In [22]:
A.T

array([[1, 4],
       [2, 5],
       [3, 6]])

---

### Linear algebra

#### All zero matrix and the identity matrix
Since they are needed so often, there is a function to construct a tensor of your chosen shape filled with zeros, and also a function to construct an $n\times n$ identity matrix.

In [23]:
zerovector = np.zeros(5)
zeromatrix = np.zeros((4,4))
print(zerovector, ',')
print(zeromatrix)

[0. 0. 0. 0. 0.] ,
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]


In [24]:
# quickly make a diagonal matrix
np.diag([3,2,1])

array([[3, 0, 0],
       [0, 2, 0],
       [0, 0, 1]])

In [26]:
np.identity(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

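Instead of getting a submatrix, you may want to "zero out" (1) every entry above the diagonal of a square matrix `A`, (2) every entry below the diagonal, or (3) every entry off the diagonal (both above and below):
> (1) `np.tril(A)`; &nbsp;&nbsp;&nbsp;&nbsp; (2) `np.triu(A)`; &nbsp;&nbsp;&nbsp;&nbsp; or (3) `np.diag(np.diag(A))` (on a 2d array, `np.diag(A)` alone returns only the diagonal entries as a 1d array)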

In [31]:
np.tril(M)

array([[ 1,  0,  0,  0],
       [ 5,  6,  0,  0],
       [ 9, 10, 11,  0],
       [13, 14, 15, 16]])
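For comparison, here is a sketch of the other two cases on the same matrix. Note that `np.diag` behaves differently depending on its input: given a 2d array it extracts the diagonal as a 1d array, and given a 1d array it builds a diagonal matrix, so applying it twice keeps only the diagonal of a square matrix.

```python
import numpy as np

M = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])

upper = np.triu(M)                  # zeros below the diagonal
diag_entries = np.diag(M)           # 1d array of the diagonal entries
diag_only = np.diag(diag_entries)   # square matrix with zeros off the diagonal

print(upper)
print(diag_entries)
print(diag_only)
```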

> NumPy has a [Linear Algebra](https://numpy.org/doc/stable/reference/routines.linalg.html#module-numpy.linalg) submodule with a lot of functionality for doing linear algebra. For example: the determinant of a matrix, `np.linalg.det()`; a matrix inverse, `np.linalg.inv()`; and eigenvalues and eigenvectors, `np.linalg.eig()`.

In [34]:
np.linalg.det(M), np.linalg.det(A[:,:2])

(np.float64(0.0), np.float64(-2.9999999999999996))
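The inverse and eigenvalue functions can be tried in the same way. The sketch below uses a small symmetric matrix `S`, chosen here only for illustration, and `np.allclose` (not covered in the lecture) to compare floating point results up to rounding error.

```python
import numpy as np

S = np.array([[2.0, 1.0], [1.0, 2.0]])  # small invertible matrix, chosen for illustration

S_inv = np.linalg.inv(S)
eigenvalues, eigenvectors = np.linalg.eig(S)

# S times its inverse should be (numerically) the identity
print(np.allclose(S @ S_inv, np.identity(2)))

# Column k of `eigenvectors` pairs with eigenvalues[k]: S x = lambda x
print(np.allclose(S @ eigenvectors, eigenvectors * eigenvalues))
```

Comparing with `np.allclose` rather than `==` matters here, since results such as the `-2.9999999999999996` above are only accurate up to rounding.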

#### Dealing with Errors
Say we want to solve a linear system $A\textbf{x} = \textbf{b}$. If $A$ is a square, invertible matrix, NumPy handles it easily with the `solve` function in the `linalg` submodule.

In [37]:
M = A@A.T
M

array([[14, 32],
       [32, 77]])

In [38]:
np.linalg.solve(M, np.array([-2,1]))

array([-3.44444444,  1.44444444])

What about solving a system when the coefficient matrix is not square?

In [39]:
# try this, read the error message
A = np.array([[1,2,3],[1,4,-1]])
b = np.array([1,-5])
np.linalg.solve(A, b)

LinAlgError: Last 2 dimensions of the array must be square
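One way to deal with such an error is to catch it with `try`/`except`. The sketch below does that and, as a fallback, calls `np.linalg.lstsq`, which returns a least-squares solution. The fallback is an assumption made for this example, not necessarily the approach the lecture has in mind.

```python
import numpy as np

A = np.array([[1, 2, 3], [1, 4, -1]])
b = np.array([1, -5])

try:
    x = np.linalg.solve(A, b)
except np.linalg.LinAlgError as err:
    print(f"solve failed: {err}")
    # Fallback (an assumption of this sketch): least-squares solution
    x, residuals, rank, singular_values = np.linalg.lstsq(A, b, rcond=None)

# This system has more unknowns than equations, so an exact solution exists
print(x)
print(np.allclose(A @ x, b))
```

Because this system is underdetermined, it has infinitely many solutions; `lstsq` returns the one of smallest norm.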

---

### Broadcasting and universal functions

Say that you have a 1d array and you want a new array whose entries are the square roots of the entries of the current one. How do you do it?
> BTW, `np.sqrt()` takes the square root of a number.

In [42]:
# example array to use
f = np.array([3*n+1 for n in range(1,100)])

In [44]:
# Could use a for loop
sq_f = []
for x in f:
  sq_f.append(np.sqrt(x))
sq_f = np.array(sq_f)

But NumPy has a better way: **universal functions** and **broadcasting**.

The function `np.sqrt()` can accept an (nd)array and will take the square roots of all entries in the array.

In [47]:
# this one line will do it
sq_f = np.sqrt(f)

Many functions in NumPy work this way &ndash; each is called a `ufunc` (universal function).

Examples of ufuncs: `np.abs()`, `np.maximum()`, `np.minimum()`, `np.exp()`, `np.log()`.

In [55]:
np.maximum(f,100 - 2*f)

array([ 92,  86,  80,  74,  68,  62,  56,  50,  44,  38,  34,  37,  40,
        43,  46,  49,  52,  55,  58,  61,  64,  67,  70,  73,  76,  79,
        82,  85,  88,  91,  94,  97, 100, 103, 106, 109, 112, 115, 118,
       121, 124, 127, 130, 133, 136, 139, 142, 145, 148, 151, 154, 157,
       160, 163, 166, 169, 172, 175, 178, 181, 184, 187, 190, 193, 196,
       199, 202, 205, 208, 211, 214, 217, 220, 223, 226, 229, 232, 235,
       238, 241, 244, 247, 250, 253, 256, 259, 262, 265, 268, 271, 274,
       277, 280, 283, 286, 289, 292, 295, 298])
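The other ufuncs listed above work the same way. Here is a short sketch on a small array `g`, whose values are made up for illustration.

```python
import numpy as np

g = np.array([-2.0, -1.0, 0.5, 1.0, 3.0])

abs_g = np.abs(g)                  # entrywise absolute value
capped = np.minimum(abs_g, 1.0)    # entrywise minimum with the scalar 1.0
roundtrip = np.log(np.exp(g))      # log undoes exp, entry by entry

print(abs_g)
print(capped)
print(np.allclose(roundtrip, g))
```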

In a minute, we will see that universal functions not only require fewer lines of code, but can also dramatically improve runtimes.

Some other functions to use with arrays:
> `np.sum()`, `np.max()`, `np.min()`

These are not ufuncs, but they perform quickly on NumPy arrays (for related reasons).  
&#9888; However, they are not quick on lists! Use the built-in `sum()`, `max()`, and `min()` on a list.

In [59]:
A

array([[ 1,  2,  3],
       [ 1,  4, -1]])

In [64]:
# sum over all entries; then sum down each column (axis=0)
np.sum(A), np.sum(A, axis=0)

(np.int64(10), array([2, 6, 2]))
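The `axis` argument works the same way for `np.max()` and `np.min()`, and `axis=1` reduces along each row instead of down each column. A brief sketch with the same matrix:

```python
import numpy as np

A = np.array([[1, 2, 3], [1, 4, -1]])

row_sums = np.sum(A, axis=1)   # one sum per row
col_max = np.max(A, axis=0)    # largest entry in each column
overall_min = np.min(A)        # smallest entry in the whole array

print(row_sums, col_max, overall_min)
```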

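These efficient ufuncs rely on something called _broadcasting_: when an operation combines an array with a scalar (or with a smaller array of compatible shape), NumPy treats the smaller operand as if it were stretched to the larger shape. The basic arithmetic operators work on NumPy arrays in the same way &ndash; efficiently carrying out element-wise operations.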

In [67]:
# add 3 to every entry 
A + 3

array([[4, 5, 6],
       [4, 7, 2]])

In [68]:
# multiply every entry by -1
-1*A

array([[-1, -2, -3],
       [-1, -4,  1]])

In [69]:
# square every entry
A**2

array([[ 1,  4,  9],
       [ 1, 16,  1]])

You can also easily multiply in entry-wise fashion. (Make sure the arrays have the same shape.)

In [70]:
# multiply two arrays
B = np.array([[-1,2,0],[1,1,-1]])
A*B

array([[-1,  4,  0],
       [ 1,  4,  1]])

In [71]:
A.shape, B.shape

((2, 3), (2, 3))
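Entrywise operations also work for some pairs of arrays with different shapes, which is where broadcasting goes beyond the scalar examples above. In this sketch, the arrays `r` and `c` are made up for illustration: a shape `(3,)` array is combined with every row of a `(2, 3)` array, and a shape `(2, 1)` array with every column.

```python
import numpy as np

A = np.array([[1, 2, 3], [1, 4, -1]])   # shape (2, 3)
r = np.array([10, 20, 30])              # shape (3,)
c = np.array([[100], [200]])            # shape (2, 1)

A_plus_r = A + r   # r is added to every row of A
A_plus_c = A + c   # c is added to every column of A

print(A_plus_r)
print(A_plus_c)
```

Shapes that cannot be matched up this way, such as `(2, 3)` with `(2,)`, raise a `ValueError`.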

Write one line that computes the norm (length) of a vector `v`. There is a `linalg` function for this, but use ufuncs and broadcasting &ndash; that's what it does.
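One possible answer, offered as a sketch rather than the only solution: square every entry, sum, and take the square root. The last line compares against `np.linalg.norm` using `np.isclose`, which is not covered in the lecture.

```python
import numpy as np

v = np.array([-1, 1, 1])

# Euclidean norm: square each entry, add them up, take the square root
norm_v = np.sqrt(np.sum(v**2))

print(norm_v)
print(np.isclose(norm_v, np.linalg.norm(v)))
```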

#### Efficiency of broadcasting

We'll check the difference between writing our own for loop and using a ufunc.

In [76]:
id_matrix = np.identity(1000)
exp_matrix = np.zeros((1000,1000))
# one entry at a time in our own for loop
start = time.time()
for i in range(1000):
  for j in range(1000):
    exp_matrix[i,j] = np.exp(id_matrix[i,j])
end = time.time()
print(f"Ran in {end - start} seconds.")

Ran in 3.024575710296631 seconds.


In [77]:
id_matrix = np.identity(1000)
exp_matrix = np.zeros((1000,1000))
# now we use the ufunc np.exp
start = time.time()
exp_matrix = np.exp(id_matrix)
end = time.time()
print(f"Ran in {end - start} seconds.")

Ran in 0.011218786239624023 seconds.
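A variation on the same comparison, as a sketch: it uses `time.perf_counter()`, which is intended for measuring short durations, and also checks that both approaches produce the same matrix. The size `n = 200` is an arbitrary choice to keep the loop quick; actual timings will vary by machine.

```python
import time
import numpy as np

n = 200  # smaller than the 1000 used above so the loop finishes quickly
id_matrix = np.identity(n)

# one entry at a time in our own for loop
start = time.perf_counter()
loop_result = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        loop_result[i, j] = np.exp(id_matrix[i, j])
loop_seconds = time.perf_counter() - start

# the ufunc applied to the whole array at once
start = time.perf_counter()
ufunc_result = np.exp(id_matrix)
ufunc_seconds = time.perf_counter() - start

print(np.allclose(loop_result, ufunc_result))
print(f"loop: {loop_seconds:.4f} s, ufunc: {ufunc_seconds:.4f} s")
```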
