## Python and NumPy Foundations

You can skip this section if you are already comfortable with the basics of Jupyter notebooks and Python; it is here in case you are less familiar with Python or want a review.  This document is adapted from the [Python Numpy tutorial](https://cs231n.github.io/python-numpy-tutorial/) for CS231n at Stanford.  I've tried to focus on just what we'll need for this class; the CS231n tutorial covers more topics, but in less depth.

### Jupyter notebooks

This document is a Jupyter notebook.  Notebooks comprise a series of cells, each of which is either a Markdown cell or a Code cell (actually there are a couple of other types of cells, but we won't use them).  Markdown cells contain text or mathematics, and code cells contain Python code.

You can run a cell by clicking on it so that your cursor is in the cell, and then pressing the `shift` and `enter` (or `return`) keys on your keyboard, or by clicking the "Run" arrow in the toolbar at the top of the screen.  For a Markdown cell, this will just make the text look nice.  For a code cell, this will execute the code.  When a code cell has successfully run, a number will appear in square brackets to the left of the cell (the numbers show the order in which cells were run).

### Basic data types

We will work with four basic data types in this class: integers (type `int`), real numbers (type `float`), booleans (True/False, type `bool`), and text strings (type `str`).  These are briefly illustrated below:

**Integers:** (`int`)

In [2]:
a = 3

# The type of an integer is 'int'
print(type(a))

# Arithmetic operations work, and the result may be a different type:
print(a/2)
print(type(a/2))

<class 'int'>
1.5
<class 'float'>


**Real Numbers:** (`float`)

In [3]:
a = 3.0

# The type of a real number is 'float'. This stands for floating point.
# See https://en.wikipedia.org/wiki/Floating-point_arithmetic
print(type(a))

# Adding works
print(a + 1.7)

<class 'float'>
4.7


**Booleans:** (`bool`)

In [4]:
a = True
b = False

# The type of a boolean is 'bool'
print(type(a))

# Are both a and b True?  No, result is False
print(a and b)

# Is either a or b True?  Yes, result is True
print(a or b)

# What is the logical opposite of a?  `a` is True, so `not a` is False
print(not a)

<class 'bool'>
False
True
False


**Text:** (`str`)

In [5]:
a = "I love cats!"
b = "Dogs are great too!"

# The type of a text string is 'str'
print(type(a))

# "adding" strings concatenates them
print(a + " " + b)

# To concatenate with another variable type, you have to convert it to a string with str().
num_cats = 2
print("I have " + str(num_cats) + " cats =^.^=")

<class 'str'>
I love cats! Dogs are great too!
I have 2 cats =^.^=
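
As a small aside, here is a sketch of an alternative to concatenating with `str()`: an f-string, which converts the value inside the curly braces for you.  The variable names `message_concat` and `message_fstring` are just for illustration; either approach is fine for this class.

```python
num_cats = 2

# Concatenation requires an explicit str() conversion
message_concat = "I have " + str(num_cats) + " cats"

# An f-string converts the value inside {} to text for us
message_fstring = f"I have {num_cats} cats"

print(message_concat)
print(message_fstring)

# Both approaches build exactly the same string
print(message_concat == message_fstring)
```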


### Containers

We often want to keep track of multiple values at once.  There are several types of container objects that can do this for us:

 * tuples: vectors where (for the most part) you can't change the values in the vector once it is created.  Created using round parentheses, `()`.
 * lists: vectors where you can modify the values and add new entries.  Created using square brackets, `[]`.
 * dictionaries: like a list, where each element has a name.  Created using curly brackets, `{}`.

There are also many other types of containers in Python, but these are the basic ones we'll work with.  We will also see Numpy arrays (discussed in a separate section below) and Pandas dataframes (discussed in a separate document later).

We'll see more about indexing containers in the Numpy section below, but here are the basics:
 * For tuples and lists, indexing starts at 0, so the first element of a container is accessed with `a[0]` and the second with `a[1]`.
 * You can access multiple elements with a `:`; for example, `a[0:2]` pulls out entries of `a` starting at entry 0 and going up to *but not including* entry 2.
 * A dictionary is indexed using the names.

**Tuples:** Can't change, created with round parentheses `()`

In [6]:
cats = ("Benedict", "Cosmos", "Mocha")

print(cats[0])
print(cats[1])
print(cats[0:2])

# This should throw an error since a tuple can't be modified:
cats[0] = "Nimbus"

Benedict
Cosmos
('Benedict', 'Cosmos')


TypeError: 'tuple' object does not support item assignment

**Lists:** Can change, created with square brackets `[]`

In [8]:
cats = ["Benedict", "Cosmos", "Mocha"]

print(cats[0])
print(cats[1])
print(cats[0:2])

# This will change the first entry of the list to "Nimbus"
cats[0] = "Nimbus"
print(cats)

Benedict
Cosmos
['Benedict', 'Cosmos']
['Nimbus', 'Cosmos', 'Mocha']


**Dictionaries:** Can change, elements are named, created with curly brackets `{}`

In [9]:
cats = {
    "Benedict": "grey tuxedo",
    "Cosmos": "black",
    "Mocha": "brown"
}

# Access an element of a dictionary by name
print(cats["Mocha"])

# Modify the value of an entry or add a new entry by indexing with the name
cats["Mocha"] = "tortoise shell"
cats["Nimbus"] = "white"
print(cats["Mocha"])
print(cats["Nimbus"])

brown
tortoise shell
white
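
Since a dictionary is indexed by name, it helps to know what happens when a name is missing.  The sketch below is an illustration that goes slightly beyond the example above: it uses `in` to check for a name, `len` to count entries, and shows that looking up a missing name raises a `KeyError`.  The variable names are made up for this example.

```python
cats = {
    "Benedict": "grey tuxedo",
    "Cosmos": "black",
    "Mocha": "brown"
}

# `in` checks whether a name (key) is present in the dictionary
has_mocha = "Mocha" in cats
has_nimbus = "Nimbus" in cats
print(has_mocha)
print(has_nimbus)

# len() counts the entries
num_entries = len(cats)
print(num_entries)

# Looking up a name that is not in the dictionary raises a KeyError
try:
    cats["Nimbus"]
    lookup_failed = False
except KeyError:
    lookup_failed = True
print(lookup_failed)
```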


### Functions in Python

A function is a re-usable piece of code.  Here is an example of a function that calculates $f(x, y) = 3x - 2y + 4$:

In [10]:
def f(x, y):
    '''
    Compute 3x - 2y + 4
    
    Arguments:
        x: a number
        y: a number
    
    Returns:
        3x - 2y + 4
    '''
    return 3*x - 2*y + 4

Here is a breakdown of the elements of this code:
 * `def` stands for `define`: we are going to define a new function
 * `f` is the name of the function.  We get to decide what we want to call the function.
 * `x, y` are the names of the *arguments* of the function.  This particular function has two arguments, `x` and `y`.  Each time we call the function, we can provide different values for these arguments.
 * The following text is a *docstring* for the function.  The first line of a docstring should describe what the function does in a sentence or two; the next part describes what the arguments to the function are; and the last part says what the function returns.
 * This function has only one line of actual code, which calculates the desired result and returns it.  Returning the result means that the code that calls this function will get to see what the result is.

We can use code like the following to call the function.  By default, output from only the last line in a code cell will be displayed.  To see the results of two different calculations, we must explicitly `print` them.

In [11]:
a = f(4, 2)
print("a = " + str(a))
b = f(0, -3)
print("b = " + str(b))

a = 12
b = 10
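
Here is a short sketch of what returning a value makes possible: the result can be stored, reused in later calculations, or computed by passing the arguments by name.  The function `f` is repeated (with a shortened docstring) so that the example runs on its own.

```python
def f(x, y):
    '''Compute 3x - 2y + 4'''
    return 3*x - 2*y + 4

# Positional arguments: x = 4, y = 2
a = f(4, 2)

# Keyword arguments: the values are matched by name, so the order doesn't matter
b = f(y=2, x=4)

# The returned values can be used in further calculations
c = f(4, 2) + f(0, -3)

print("a = " + str(a))
print("b = " + str(b))
print("c = " + str(c))
```

Here `a` and `b` are both 12, and `c` is 12 + 10 = 22.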


### Numpy

Numpy is a library for mathematical computing in Python.  We'll give a brief overview of the methods we'll use most in this class for creating and computing with Numpy arrays.

#### Importing Numpy

Import the Numpy library with the following command:

In [12]:
import numpy as np

Here is a breakdown of the code above:

 * `import` imports a library so that we can use the functionality it provides
 * `numpy` is the name of the library we want to import
 * `as np` means that below, we can access functions in numpy using `np.function_name` instead of `numpy.function_name`.  Three precious characters of typing saved!  It's standard to `import numpy as np`; you should stick with this convention so that other people can read your code more easily.

#### The shape of an array

The code below creates two numpy arrays representing a row vector and a matrix (I don't think I'll ever ask you to create arrays using this manual approach; this is just for illustration):

$$ a = \begin{bmatrix} 1 & 2 \end{bmatrix} \qquad b = \begin{bmatrix} 3 & 4 & 5 \\ 6 & 7 & 8 \end{bmatrix}$$ 

In [13]:
a = np.array([[1, 2]])
b = np.array([[3, 4, 5],
              [6, 7, 8]])
print(a)
print(b)

[[1 2]]
[[3 4 5]
 [6 7 8]]


The shape of an array is a tuple with its dimensions:

In [14]:
# a has 1 row and 2 columns, so its shape is (1, 2)
print(a.shape)

# b has 2 rows and 3 columns, so its shape is (2, 3)
print(b.shape)

(1, 2)
(2, 3)


Arrays can also have more than two dimensions.  Here is an example of a 3-dimensional array obtained by stacking three $2 \times 2$ matrices next to each other:

In [15]:
c = np.array([[[1, 2],
               [3, 4]],
              [[5, 6],
               [7, 8]],
              [[9, 10],
               [11, 12]]])
print(c)
print(c.shape)

[[[ 1  2]
  [ 3  4]]

 [[ 5  6]
  [ 7  8]]

 [[ 9 10]
  [11 12]]]
(3, 2, 2)


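Arrays can also have just one dimension, with a shape like `(3,)`.  A one-dimensional array is neither a row vector nor a column vector, which can lead to surprising results, so we will avoid them.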
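
To make the trouble with one-dimensional arrays concrete, here is a minimal sketch.  It relies on the transpose (`.T`) and on broadcasting, both of which are covered later in this document; the variable names are made up for this illustration.

```python
import numpy as np

v = np.array([1, 2, 3])          # one-dimensional, shape (3,)
row = np.array([[1, 2, 3]])      # two-dimensional row vector, shape (1, 3)

print(v.shape)
print(row.shape)

# Transposing a one-dimensional array does nothing to its shape
print(v.T.shape)

# Transposing a row vector gives a column vector of shape (3, 1)
col = row.T
print(col.shape)

# Adding a one-dimensional array to a column vector does not raise an error;
# it quietly produces a 3 x 3 matrix
surprise = v + col
print(surprise.shape)
```

Keeping every vector two-dimensional, with shape `(1, n)` or `(n, 1)`, makes mistakes like the last one easier to spot.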

#### Reshaping Arrays

An array can be "reshaped", as long as the new dimensions you provide have the right amount of space for the entries in the original array.  The `reshape` method returns the reshaped array rather than changing the shape of the original in place, so you need to assign its return value to a variable to keep the result.  Here's an example:

In [16]:
b = np.array([[3, 4, 5],
              [6, 7, 8]])

# recall b has shape (2, 3), with 6 total elements:
print("the original shape of b is " + str(b.shape))

# we can set the shape of b to (1, 6) since that also has 6 elements:
b = b.reshape((1, 6))
print("after setting shape to (1, 6), we have:")
print(b)
print(b.shape)

# we can also set the shape of b to (1, 2, 3) since that has space for 6 elements:
b = b.reshape((1, 2, 3))
print("after setting shape to (1, 2, 3), we have:")
print(b)
print(b.shape)

# but we can't set the shape of b to (5, 10) since that has space for 50 elements, not 6:
b = b.reshape((5, 10))

the original shape of b is (2, 3)
after setting shape to (1, 6), we have:
[[3 4 5 6 7 8]]
(1, 6)
after setting shape to (1, 2, 3), we have:
[[[3 4 5]
  [6 7 8]]]
(1, 2, 3)


ValueError: cannot reshape array of size 6 into shape (5,10)
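
The sketch below isolates the point about assignment: calling `reshape` without storing its return value leaves the original array's shape unchanged.  The name `b_flat` is made up for this illustration.

```python
import numpy as np

b = np.array([[3, 4, 5],
              [6, 7, 8]])

# Calling reshape without assigning the result leaves b's shape unchanged
b.reshape((1, 6))
print(b.shape)

# Assigning the result to a new name keeps both versions available
b_flat = b.reshape((1, 6))
print(b_flat.shape)
print(b.shape)

# The total number of entries is the same before and after
print(b.size == b_flat.size)
```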

#### Creating Arrays

Here are a few functions for creating numpy arrays:

`np.zeros((2, 3))`: Create an array of 0s with shape (2, 3)

In [17]:
a = np.zeros((2, 3))
print(a)

[[0. 0. 0.]
 [0. 0. 0.]]


`np.random.standard_normal((2, 3))`: Create an array of shape (2, 3) where the entries are sampled independently from a Normal(0, 1) distribution.

In [18]:
a = np.random.standard_normal((2, 3))
print(a)

[[ 1.41655483 -0.20843947  1.22443222]
 [ 0.0440023  -2.84832004 -0.11162346]]


`np.random.random((2, 3))`: Create an array of shape (2, 3) where the entries are sampled independently from a Uniform(0, 1) distribution (any real number between 0 and 1 is equally likely).

In [19]:
a = np.random.random((2, 3))
print(a)

[[0.08555475 0.92183595 0.77323926]
 [0.73416495 0.73012579 0.76983056]]


If you run the above cell multiple times, you will get different numbers in the array.  However, the "random" numbers numpy generates are not truly random.  They are generated with a deterministic algorithm that creates numbers that look random.  This algorithm starts from a "seed", which determines the starting state of the random number generation.  If you set a seed, you will get the same numbers every time you run the code.  This can be helpful when you want to be sure your work is reproducible.

In [20]:
np.random.seed(89663)
print(np.random.random((2, 3)))

[[0.00489252 0.87210551 0.89042912]
 [0.60619467 0.32693521 0.89276379]]
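
As a quick check of the reproducibility claim, the sketch below sets the same seed twice and compares the resulting arrays.  The variable names are made up for this illustration.

```python
import numpy as np

# Setting the same seed before each draw gives identical numbers
np.random.seed(89663)
first_draw = np.random.random((2, 3))

np.random.seed(89663)
second_draw = np.random.random((2, 3))

print(np.array_equal(first_draw, second_draw))

# Without resetting the seed, the next draw continues on to new numbers
third_draw = np.random.random((2, 3))
print(np.array_equal(first_draw, third_draw))
```

The first comparison prints `True` and the second prints `False`: the seed fixes the whole sequence of numbers, not just a single array.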


#### Array Indexing

You can access and modify the elements of an array using square brackets.  You will need to specify the same number of indices as the dimension of the array.  Things to remember:

 * indexing starts at 0;
 * notation like `1:3` will select entries starting in position 1 and going up to but not including position 3;
 * a `:` by itself will select all entries along that dimension.

Here are some examples:

In [21]:
b = np.array([[3, 4, 5],
              [6, 7, 8]])

# b has shape (2, 3)
print("the shape of b is " + str(b.shape))

# the "first" element of the array, the number 3, is at position [0, 0]
print("the first element of b is " + str(b[0, 0]))

# the : notation goes up to but not including the second index you provide
print("the first two columns of b are " + str(b[0:2, 0:2]))

# a : by itself selects all values along that axis
print("the second row of b is " + str(b[1, :]))

the shape of b is (2, 3)
the first element of b is 3
the first two columns of b are [[3 4]
 [6 7]]
the second row of b is [6 7 8]


In [22]:
c = np.array([[[1, 2],
               [3, 4]],
              [[5, 6],
               [7, 8]],
              [[9, 10],
               [11, 12]]])

# c has shape 3, 2, 2; when indexing, we need to provide indices for all 3 dimensions!
c.shape

# where is the number 2 in this array?
c[0, 0, 1]

2
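
The examples above only read values out of an array.  The same square-bracket notation can be used on the left side of `=` to modify entries, as in this minimal sketch:

```python
import numpy as np

b = np.array([[3, 4, 5],
              [6, 7, 8]])

# Replace a single entry: row 0, column 2
b[0, 2] = 50
print(b)

# Replace every entry in the second row
b[1, :] = 0
print(b)

# Replace a block: rows 0 and 1, columns 0 and 1 (each range stops before 2)
b[0:2, 0:2] = -1
print(b)
```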

### Mathematical Operations for Arrays

#### Elementwise operations

The operations `+`, `-`, `*`, and `/`, as well as functions like `np.sqrt`, `np.exp`, and `np.log` are applied separately to each element of arrays.

If we are working with two arrays of different shapes, the smaller one will be *broadcast* to a shape matching the larger, if their shapes are compatible.  You can think of this as stacking multiple copies of the smaller array next to each other until the shape matches the shape of the larger array.

We'll illustrate with the following examples:

In [23]:
a = np.array([[1, 2], [3, 4]])
print("The shape of a is " + str(a.shape))

b = np.array([[5, 6], [7, 8]])
print("The shape of b is " + str(b.shape))

c = np.array([[9, 10]])
print("The shape of c is " + str(c.shape))

d = np.array([[13]])
print("The shape of d is " + str(d.shape))

The shape of a is (2, 2)
The shape of b is (2, 2)
The shape of c is (1, 2)
The shape of d is (1, 1)


In [24]:
print("a + b = " + str(a + b))
print("a + c = " + str(a + c))
print("a + d = " + str(a + d))

a + b = [[ 6  8]
 [10 12]]
a + c = [[10 12]
 [12 14]]
a + d = [[14 15]
 [16 17]]


In [25]:
print(np.exp(a))

[[ 2.71828183  7.3890561 ]
 [20.08553692 54.59815003]]
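
To connect broadcasting with the "stacking copies" picture, here is a sketch that does the stacking by hand with `np.tile` (a numpy function not otherwise used in this document) and confirms the result matches.  It also shows what happens when two shapes are not compatible.  The names `c_stacked`, `e`, and `shapes_compatible` are made up for this illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])   # shape (2, 2)
c = np.array([[9, 10]])          # shape (1, 2)

# Explicitly stack two copies of c on top of each other to get shape (2, 2)
c_stacked = np.tile(c, (2, 1))
print(c_stacked)

# Broadcasting gives the same answer without the manual stacking
print(np.array_equal(a + c, a + c_stacked))

# Shapes (2, 2) and (1, 3) are not compatible, so adding them raises an error
e = np.array([[1, 2, 3]])
try:
    a + e
    shapes_compatible = True
except ValueError:
    shapes_compatible = False
print(shapes_compatible)
```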


#### Row and column sums, means, and standard deviations

The function `sum` will compute row and/or column sums of an array, `mean` will calculate row and/or column means, and `std` will calculate row and/or column standard deviations.  Some things to keep in mind:

 * The `axis` argument says which dimension of the array to sum over, take the mean of, or find the standard deviation of.  For example, if I'm working with a matrix then `axis = 0` says I want to add up values across different rows.
 * By default, the result will have a lower dimension (shorter shape) than the original array.  This is almost never what you want, and you can override the default by providing `keepdims = True`

Here are some examples:

In [26]:
b = np.array([[3, 4, 5],
              [6, 7, 8]])

# Sum across the rows (axis = 0) to get a total for each column
col_sums = b.sum(axis = 0)
print("column sums = " + str(col_sums))

# Take the mean of values across all rows (axis = 0) to get a mean for each column
col_means = b.mean(axis = 0)
print("column means = " + str(col_means))

# Sum across the columns (axis = 1) to get a total for each row
row_sums = b.sum(axis = 1)
print("row sums = " + str(row_sums))

# Sum across both the rows and the columns (axis = (0, 1)) to get a total sum for the whole matrix
total_sum = b.sum(axis = (0, 1))
print("row and column sum = " + str(total_sum))

column sums = [ 9 11 13]
column means = [4.5 5.5 6.5]
row sums = [12 21]
row and column sum = 33
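
The examples above all use the default behavior, where the result loses a dimension.  Here is a sketch of what `keepdims = True` changes, along with an example of `std`.  The names `col_sums_2d`, `row_means`, `centered`, and `col_stds` are made up for this illustration.

```python
import numpy as np

b = np.array([[3, 4, 5],
              [6, 7, 8]])

# Default: the axis we summed over is dropped, leaving a one-dimensional result
col_sums = b.sum(axis=0)
print(col_sums.shape)

# keepdims=True keeps that axis, with length 1
col_sums_2d = b.sum(axis=0, keepdims=True)
print(col_sums_2d.shape)

# Row means with shape (2, 1) instead of (2,)
row_means = b.mean(axis=1, keepdims=True)
print(row_means.shape)

# Because row_means has shape (2, 1), it broadcasts against b's shape (2, 3)
centered = b - row_means
print(centered)

# Standard deviation of the values in each column
col_stds = b.std(axis=0, keepdims=True)
print(col_stds)
```

Keeping the dimension means the result can be combined with the original array through broadcasting, as in the `centered` calculation.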


#### Matrix Operations

The most common matrix operations we'll need are the transpose (`.T`) and the matrix product (`np.dot`):

In [27]:
a = np.array([[1, 2], [3, 4]])
b = np.array([[3, 4, 5],
              [6, 7, 8]])

print("The shape of a is " + str(a.shape) + " and the shape of b is " + str(b.shape))

The shape of a is (2, 2) and the shape of b is (2, 3)


In [28]:
# Here's the transpose of b
b.T

array([[3, 6],
       [4, 7],
       [5, 8]])

In [29]:
# Here's the matrix product of a and b
np.dot(a, b)

array([[15, 18, 21],
       [33, 40, 47]])

In [30]:
# Note that the matrix product requires the dimensions to match up!
np.dot(b, a)

ValueError: shapes (2,3) and (2,2) not aligned: 3 (dim 1) != 2 (dim 0)
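
One way to make the dimensions line up in a case like this, sketched below, is to transpose `b` first.  Whether that is the product you actually want depends on the problem; this only illustrates the shape rule.  The `@` operator used at the end is standard Python/numpy syntax for the matrix product, though it is not used elsewhere in this document.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])       # shape (2, 2)
b = np.array([[3, 4, 5],
              [6, 7, 8]])            # shape (2, 3)

# b.T has shape (3, 2), so its 2 columns line up with the 2 rows of a
product = np.dot(b.T, a)
print(product.shape)
print(product)

# For two-dimensional arrays, the @ operator computes the same matrix product
print(np.array_equal(product, b.T @ a))
```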

#### Logical Comparisons and Type Conversions

Suppose I have the array

$$b = \begin{bmatrix} 3 & 4 & 5 \\ 6 & 7 & 8 \end{bmatrix}$$ 

I can check whether each element is at least as large as 7 as follows.  Note that the result has the same shape as the original array `b`, but now every element is a boolean:

In [32]:
b = np.array([[3, 4, 5],
              [6, 7, 8]])
result = b >= 7
print(result)

[[False False False]
 [False  True  True]]


Often, we would like to convert the results of this back to numbers (0 if False, 1 if True).  We can do that with the `astype` method:

In [35]:
result = result.astype(float)
print(result)

[[0. 0. 0.]
 [0. 1. 1.]]
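
One reason the conversion to numbers is handy, sketched below under the assumption that we want to count entries: summing the 0/1 values counts how many entries passed the comparison, and averaging them gives the proportion.  The names `indicator`, `count`, and `proportion` are made up for this illustration.

```python
import numpy as np

b = np.array([[3, 4, 5],
              [6, 7, 8]])

# 1.0 where the entry is at least 7, 0.0 elsewhere
indicator = (b >= 7).astype(float)

# Summing the indicators counts the entries that are at least 7
count = indicator.sum()
print(count)

# Averaging the indicators gives the proportion of such entries
proportion = indicator.mean()
print(proportion)
```

Here two of the six entries are at least 7, so the count is 2.0 and the proportion is one third.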
