# Numpy Review

In this review, we are going to refresh our memories about the Numpy package. Numpy (numerical Python) is the basic engine that turns Python into a tool for data analysis. Anything "data sciency" that you do in Python will rely on Numpy at some point, either explicitly (when calling Numpy functions directly) or implicitly (when, e.g., using Pandas).

The point of Numpy is to make working with data in Python easier and faster. Here, we are going to remind ourselves of basic Numpy functionality.

## Python lists

First, let's look at a basic Python list:

In [32]:
a_list = [2, 4, 6, 8]

Just to warm up, let's get some items from our list via *indexing*:

Get the first number from the list (the "zeroth" number in Pythonese):
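
As a minimal sketch of that lookup (the list is redefined here so the snippet runs on its own, and the name `first` is just for illustration):

```python
a_list = [2, 4, 6, 8]

# Index 0 refers to the first item in the list.
first = a_list[0]
print(first)  # 2
```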

Get the last two numbers:
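
One way to do this, sketched with a negative index and a slice (the name `last_two` is just for illustration):

```python
a_list = [2, 4, 6, 8]

# A negative start counts back from the end of the list;
# leaving the stop empty means "run to the end".
last_two = a_list[-2:]
print(last_two)  # [6, 8]
```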

Now let's make a *nested* Python list:

In [43]:
a_nested_list = [[2, 4], [3, -1], [-2, 1]]

Remember that a Python list can hold data of different types, lengths, etc., but this list is special; it is a *list of lists* all of the same length.

Let's have a look:

In [39]:
print(a_nested_list)

[[2, 4], [3, -1], [-2, 1]]


How would we get the first entry in the second list? 

We could do it in two steps... first, get the second list:

In [40]:
sec_list = a_nested_list[1]

And then get the first entry:

In [41]:
my_num = sec_list[0]
print(my_num)

3


Conveniently, we can just do this in one go:

In [42]:
my_num = a_nested_list[1][0]
print(my_num)

3


This does the same thing without having to invoke the intermediate variable `sec_list`. 

While this list-of-lists construct might seem a little abstract, there is actually a nice way to wrap our heads around it, which is to think of it as a *matrix*.

> Unfortunately, no one can be told what the Matrix is. You have to see it for yourself. *- Morpheus*

Here is an example of a matrix:

![A matrix](./images/mailboxes.png)

A ***matrix*** is a two-dimensional (2D) arrangement of things and, for our purposes, the things are data of one form or another (numbers, strings, timestamps, etc.).

These mailboxes are numbered sequentially, because there are other mailboxes in other matrices and that's just how the USPS rolls. But, notice that, for *this* matrix of mailboxes, there is another way in which we could uniquely refer to each mailbox. Specifically, we could uniquely specify each mailbox by the *row* it is in and the *column* it is in.

For example, the open mailbox with the key in the door is in the 2nd row and the 4th column or, in terms of Python indexes, the mailbox is at location [1, 3]. 

So, we can think of a matrix as an arrangement of data that has a built-in ***spatial coordinate system*** used to refer to the items of data.

Here's another Python list of lists:

In [46]:
another_nested_list = [[3.3, 2.3, 2.2], [1.2, 7.8, 8.7], [4.8, 2.2, 6.5],
                       [1.5, 7.5, 9.5], [5.9, 1.6, 7.7]]

As far as Python is concerned, this is just a list that happens to contain 5 lists, each of length 3:

In [48]:
print(f'Another nested Python list: {another_nested_list}')

Another nested Python list: [[3.3, 2.3, 2.2], [1.2, 7.8, 8.7], [4.8, 2.2, 6.5], [1.5, 7.5, 9.5], [5.9, 1.6, 7.7]]


But it makes sense for our human brains to think about it as a 2D arrangement of data, like this: 

| row # | col 0 | col 1 | col 2 |
| ---- | ---- | ---- | ---- |
| 0 | 3.3 | 2.3 | 2.2 |
| 1 | 1.2 | 7.8 | 8.7 |
| 2 | 4.8 | 2.2 | 6.5 |
| 3 | 1.5 | 7.5 | 9.5 |
| 4 | 5.9 | 1.6 | 7.7 |

Now we can think of the Python indexes used to access the data as spatial ***row*** and ***column*** coordinates. For example:

In [49]:
another_nested_list[2][1]

2.2

fetches the data value in the third row (row index 2) and the second column (column index 1). 

Even though, in Python terms, `another_nested_list` is just a list of lists that all happen to be of the same length, it's very helpful for our human brains to map data like this onto a matrix and think of the indexes as coordinates.
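
To see where the list-of-lists picture starts to strain, here is a small sketch that pulls out one row and one column. The names `row_2` and `col_1` are just for illustration. Getting a row is a single index, but getting a column means visiting every inner list:

```python
another_nested_list = [[3.3, 2.3, 2.2], [1.2, 7.8, 8.7], [4.8, 2.2, 6.5],
                       [1.5, 7.5, 9.5], [5.9, 1.6, 7.7]]

# A row is easy: it is simply one of the inner lists.
row_2 = another_nested_list[2]
print(row_2)  # [4.8, 2.2, 6.5]

# A column has to be assembled by taking one item from each row.
col_1 = [row[1] for row in another_nested_list]
print(col_1)  # [2.3, 7.8, 2.2, 7.5, 1.6]
```

This asymmetry between rows and columns is one of the things a true 2D array type removes.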

## Numpy

Numpy is a big and powerful package, but you can think of its most basic function as making this matrix-like way of thinking about data explicit, as opposed to just a cute way of thinking about lists of lists.

To use numpy, we first import it. Traditionally, it is imported under the name "`np`".

In [6]:
import numpy as np

### Numpy arrays 

#### Making numpy arrays 

In [17]:
print(f'A python list: {a_list}')

a_numpy_thing = np.array(a_list)

print(f'A numpy thing: {a_numpy_thing}')

A python list: [2, 4, 6, 8]
A numpy thing: [2 4 6 8]


In [25]:
a = np.array([[1, 2, 3], [4, 5, 6]])
a

array([[1, 2, 3],
       [4, 5, 6]])
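
Unlike a list of lists, an array knows its own dimensions. As a quick sketch, a few standard array attributes report them:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

print(a.ndim)   # 2       (number of dimensions)
print(a.shape)  # (2, 3)  (2 rows, 3 columns)
print(a.size)   # 6       (total number of elements)
```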

In [21]:
a_vec = np.ones((2,1))

a_vec

array([[1.],
       [1.]])

In [23]:
a_vec.shape

(2, 1)
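
The tuple passed to `np.ones` is the desired shape, given as (rows, columns). As a small sketch, `np.zeros` takes a shape in the same way, and both produce floating point values by default (the variable names here are just for illustration):

```python
import numpy as np

ones_col = np.ones((2, 1))    # 2 rows, 1 column, filled with 1.0
zeros_mat = np.zeros((2, 3))  # 2 rows, 3 columns, filled with 0.0

print(ones_col.shape)   # (2, 1)
print(zeros_mat.shape)  # (2, 3)
print(ones_col.dtype)   # float64
```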

In [31]:
a_vec = np.array([1, 3, 5])
a_vec.shape

(3,)

In [29]:
a_vec = a_vec[:, np.newaxis]
a_vec

array([[1],
       [3],
       [5]])
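
Indexing with `np.newaxis` adds an axis of length 1, which turns a 1D array of shape `(3,)` into a 2D column of shape `(3, 1)`. As a sketch, here is the same result obtained with `reshape`, plus the row-shaped variant (the names `col_a`, `col_b`, and `row_a` are just for illustration):

```python
import numpy as np

a_vec = np.array([1, 3, 5])

col_a = a_vec[:, np.newaxis]  # new axis second: shape (3, 1)
col_b = a_vec.reshape(-1, 1)  # -1 asks NumPy to infer the number of rows
row_a = a_vec[np.newaxis, :]  # new axis first: shape (1, 3)

print(col_a.shape)                   # (3, 1)
print(row_a.shape)                   # (1, 3)
print(np.array_equal(col_a, col_b))  # True
```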

#### "vectorized" operations 
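
As a minimal sketch of what "vectorized" means here: arithmetic on an array is applied to every element at once, with no explicit loop. The variable names below are just for illustration.

```python
import numpy as np

a_list = [2, 4, 6, 8]
a_arr = np.array(a_list)

# With a plain list, we have to visit each item ourselves.
doubled_list = [x * 2 for x in a_list]

# With an array, the operation is applied element by element.
doubled_arr = a_arr * 2
summed_arr = a_arr + a_arr

print(doubled_list)          # [4, 8, 12, 16]
print(doubled_arr.tolist())  # [4, 8, 12, 16]
print(summed_arr.tolist())   # [4, 8, 12, 16]

# Careful: multiplying a *list* by 2 repeats it instead.
print(a_list * 2)            # [2, 4, 6, 8, 2, 4, 6, 8]
```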

#### Indexing cells 
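
As a short sketch, a single cell of a 2D array can be fetched with one pair of brackets holding the row and column indexes. The chained list-style form also works on arrays.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

print(a[1, 0])    # 4  (row index 1, column index 0)
print(a[1][0])    # 4  (list-style chained indexing gives the same cell)
print(a[-1, -1])  # 6  (negative indexes count from the end, as with lists)
```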

#### Indexing rows and columns
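
As a sketch, a bare colon means "everything along this axis", so a whole row or a whole column can be pulled out in one step (the names `first_row` and `second_col` are just for illustration):

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

first_row = a[0, :]   # row index 0, all columns (same as a[0])
second_col = a[:, 1]  # all rows, column index 1

print(first_row.tolist())   # [1, 2, 3]
print(second_col.tolist())  # [2, 5]
print(second_col.shape)     # (2,)  (the result is a 1D array)
```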

#### Indexing subsets ("slicing")
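
As a sketch, slices work along each axis just as they do for lists: `start:stop` includes `start` and excludes `stop`. The variable names below are just for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

# Rows 0 up to (not including) 2, columns 1 up to (not including) 3.
sub = a[0:2, 1:3]
print(sub.tolist())  # [[2, 3], [5, 6]]

# Omitted bounds default to the start or end of that axis.
last_two_cols = a[:, -2:]
print(last_two_cols.tolist())  # [[2, 3], [5, 6]]

# A basic slice is a view onto the same data, so take a .copy()
# when an independent array is wanted.
safe = a[:, :2].copy()
safe[0, 0] = 99
print(a[0, 0])  # 1  (the original array is unchanged)
```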