# Python: a Primer

Python is an object-oriented programming language. This means that **everything in python is an object**. Objects have (1) a class (aka "type"), (2) properties, and (3) methods. Let's look at some simple object types in python and their associated methods:

### Booleans and comparisons

Booleans (logicals) are a simple object type in python. They are binary because they can be only `True` or `False`.

In [None]:
True

In [None]:
False

We can verify that this is a `bool` object with the `type()` function. It can be called on any object in Python and returns the class of that object.

In [None]:
type(True)

### Logical operations

Python makes logical operations easy and lets us determine whether some logical condition is met. For example, the `and` operator evaluates to `True` only if both its left and right sides are `True`.

In [None]:
# And (both left and right are True)
True and True

In [None]:
# And (one side isn't true)
True and False

There is also the `or` operator, which evaluates to `True` if at least one side is `True`.

In [None]:
# Or (only one side has to be True)
True or False

Finally, there is the `not` operator, which negates a boolean's value:

In [None]:
not True

Logical operators can be easily combined to make more complex statements. They also follow an order of operations, just like mathematical expressions. The order is:

1. `()`
2. `not`
3. `and`
4. `or`

In [None]:
not False and True

In [None]:
not False or not True

In [None]:
not False and False

In [None]:
not (False and False)

In [None]:
not False or not (True or not True)
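
To make the precedence rules concrete, here is a minimal sketch that writes one expression twice: once as typed, and once with the parentheses Python applies implicitly. The variable names `implicit` and `explicit` are illustrative only.

```python
# `not` binds tightest, then `and`, then `or`.
implicit = not True or True and False
explicit = (not True) or (True and False)

print(implicit, explicit)  # False False

# Parentheses override the default order.
print(not (True or True) and False)    # (not True) and False -> False
print(not ((True or True) and False))  # not (True and False) -> True
```

When an expression mixes several operators, adding the parentheses yourself costs nothing and makes the intended grouping obvious to the next reader.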

**Challenge question**: What is the output for the following (without running it yourself):

```python
False and True or True and (not False or False)
```

### Numerics and mathematics

There are three main types of numerical objects in python:

- `int` -- includes whole numbers
- `float` -- includes decimals
- `complex` -- includes imaginary numbers

#### `int`

Let's explore the `int` type first. We can create an instance of `int` by simply typing any whole number into the code block:

In [None]:
1

We can also verify this is an `int` object with the `type()` function.

In [None]:
type(1)

#### `float`

A `float` is a numerical object that has a decimal point. For example:

In [None]:
1.01

In [None]:
type(1.01)

#### `complex`

A `complex` object is a number that has an imaginary part. For example:

In [None]:
2 + 2j  # j is the square root of -1

In [None]:
type(2+2j)

#### Numerical methods: Math operations

We can perform simple mathematical operations with numerical objects.

In [None]:
# Addition
1 + 1

In [None]:
# Subtraction
4 - 2

In [None]:
# Multiplication 
2 * 3

In [None]:
# Division
16 / 3

In [None]:
# Floored Division
16 // 3

In [None]:
# Modulo (remainder)
16 % 3

In [None]:
# Exponentiation
2**5

In [None]:
# Negation
-1

In [None]:
# Absolute value
abs(-1)

**Challenge question**: What is the `type` of `22 / 2`?

#### Order of operations

Python obeys PEMDAS (parentheses, exponent, multiplication, division, addition, subtraction) to determine the order in which to evaluate a mathematical operation. For example:

In [None]:
3 + 5 * 2  # It is not 16 because the multiplication comes first

**Challenge question:** What is the result of this operation (without running it yourself): 

```python
2 + 3 * (2 + 25 / 5 ** 2) 
```

#### Logical comparisons of numerics

We can use comparison operators in python to check the relationship between any two numerics. 

In [None]:
# Greater-than
9 > 8

In [None]:
# Less-than
8 < 10

In [None]:
# less-than or equal-to
2 <= 2

In [None]:
# Greater-than or equal-to
10 >= 12

In [None]:
# Equal to
1 == 2

In [None]:
# Not equal to
1 != 2

#### Complex comparisons

We can add in arithmetic to perform more mathematically complex comparisons:

In [None]:
5 ** 2 + 1 == 52 / 2

**Challenge:** What is the result of this statement? (Without running it yourself)

```python
1j**2 == -1**(5/(2+3))
```

In [None]:
import numpy as np
print(np.sqrt(1j))
print(1j**2)

### Strings

Strings hold text data, such as names or addresses. They are constructed by using quotations (double or single):

In [None]:
'Hello world!'  # Single quotes

In [None]:
"Hello world!"  # Double quotes

In [None]:
type("Hello world!")

#### String methods

There are several basic methods for string objects. Many more exist and they will be covered later in the course. For a list of string methods please see the W3 schools guide [here](https://www.w3schools.com/python/python_ref_string.asp).

In [None]:
# Print
print("Hello world!")

In [None]:
# Concatenate
"Hello " + "world!"

In [None]:
# Upper-case
"Hello world!".upper()

#### Logical comparisons with strings

Just like numerics, strings can be used in logical comparisons:

In [None]:
"Hello" == "Hello"

In [None]:
"Hello" == "World"

In [None]:
"Hello" != "World"

### Type conversion

Some types of objects in Python can be converted to other types. This is necessary when performing certain operations, such as combining `str` and `int` objects to make a phrase such as the following:

```python
"I am " + 26 + " years old!"
```

If we attempt to run this code, we should see an error because the `+` operator works between two strings or between two numerics, but not between a string and a numeric.


In [None]:
"I am " + 26 + " years old!"

How can we interpret this error? When looking at an error in `python`, you can usually skip right to the last line, in this case: `TypeError: can only concatenate str (not "int") to str`. This line indicates a `TypeError` which arises when an operation is performed on incompatible object types. The text of the error says `can only concatenate str (not "int") to str`, indicating that the user has attempted to `concatenate` (`+`) a `str` with an `int` object, which is not allowed. 

To understand how to fix this, let's look at the ways Python handles type conversion:

1. `int` to `float`

In [None]:
# Let's look at the int 1
type(1)

In [None]:
# Convert int: 1 to a float.
float(1)

In [None]:
# Confirm that float(1) is a float
type(float(1))

2. `bool` to `int`

In [None]:
# Let's look at True
type(True)

In [None]:
# Convert True to int
int(True)

**Challenge:** What error results from `2 ** 2 / int(False)`, and why does this operation produce that error?

3. `str` to `int`

In [None]:
# Look at the type of "5"
type("5")

In [None]:
# Convert "5" to int
int("5")

In [None]:
# 1 is not the same as "1"
1 == "1"

In [None]:
# int("1") is equivalent to 1
1 == int("1")

**Challenge:** Modify the code from earlier so that it doesn't produce a `TypeError`:

```python
"I am " + 26 + " years old!"
```

#### Cross-type comparison

Additionally, there is no requirement that logical comparisons involve only one data type. For example:

In [None]:
# 1 is not equivalent to "Hello world!"
1 != "Hello world!"

In [None]:
# Both sides generate booleans which can be compared using "and"
"Hello" != "world" and 1 < 2

### Variables

Variables are names (aka 'aliases' or 'references') given to an object in python. Any object in python can be assigned a variable. Rather than calling the object directly, you can use the variable name instead. This enables complicated code to be written and understood by humans.

To create a variable, use the `=` sign:

In [None]:
a = 1

Now that we have created the variable `a` to hold the integer `1`, we can perform operations on `a` directly.

In [None]:
# Use a for arithmetic
a + 2

In [None]:
# Use a for logical comparisons
a != "Hello world!"

In python, any variable can reference any object, including the results of computations.

In [None]:
result_1 = 1 + 2 < 3  # Variable to reference numeric comparison
result_2 = "Hello " + "world" == "Hello world"  # Variable to reference string comparison

In [None]:
result_1 or result_2

#### `is` and `==`

In Python, there are two operators for testing equivalence:

1. `==` (equivalent values)
2. `is` (identical objects)

While the distinction is subtle, it is crucial to remember that `is` tests whether two names refer to literally the same object, whereas `==` only tests whether two objects are equal to each other.


For example, we can assign the integer `257` to the variable `a`, and then assign `a` to `b`. Both `a` and `b` refer to the same `int` object holding the value `257`. Therefore, they are both equal and identical.

In [None]:
a = 257
b = a

In [None]:
a == b

In [None]:
a is b

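Conversely, if we assign `257` to `a` and to `b` separately, we typically see that they do not refer to the same object. Whether two equal integers created separately share one object is an implementation detail of the interpreter, so the result of `is` here can depend on how the code is run: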

In [None]:
a = 257
b = 257

In [None]:
a == b

In [None]:
a is b
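
Lists make the difference between `==` and `is` easier to see, because every list written with `[]` is a new object. The sketch below uses illustrative names (`first`, `alias`, `twin`) that are not part of the notebook.

```python
first = [1, 2, 3]
alias = first      # a second name for the same list object
twin = [1, 2, 3]   # a separate list that happens to hold equal values

print(first == alias, first is alias)  # True True
print(first == twin, first is twin)    # True False
```

In everyday code, `==` is the right choice for comparing values. `is` is mostly reserved for checks against singletons such as `None`.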

**EXTREME Challenge question**:

What happens when I repeat the above example using `256` instead of `257`? Why does the result change? 

*Hint*: See [this article](https://codeburst.io/the-unseen-pitfalls-of-python-7ca57f021d08) for additional guidance.

### Intermediate Python

Now that we have discussed simple Python objects, let's explore the wide world of complex objects. These objects provide powerful methods for the storage and manipulation of data. They are essential tools for the data scientist to wield.


#### Lists

Lists are a Python object type which can store an arbitrary number of objects of any type.

In [None]:
# List of strings
words = ["Hello", "World"]
words

In [None]:
# List of numbers
numbers = [1, 2, 3]
numbers

In [None]:
# List of booleans
bools = [True, False, False]
bools

In [None]:
# Mixed list
mix = [1, True, "Hello"]
mix

##### List methods

Lists have a wide variety of methods associated with them. For a more exhaustive reference, please refer to the W3 schools guide [here](https://www.w3schools.com/python/python_ref_list.asp).

For now, we will discuss:
1. Construction
2. Indices
3. Appending
4. Length
5. Sort

##### Construction

Lists can be constructed using the `list()` function:

In [None]:
# Just like other Python objects, lists have a constructor function
my_list = list()
my_list

More commonly, lists are defined by using `[]` and providing the objects to include:

In [None]:
my_list = [1, 2, 3, 'a', 'b']
my_list

In [None]:
# Lists can even contain lists
lst_list = [1, 2, 3, [4, 5, 6]]
print(lst_list)

##### Indices

Lists hold data and have a specific order. To access the objects in a list, one can use the object's index. **NOTE**: unlike `R`, indices start at `0` in python.

In [None]:
my_list = [1, 2, 3, 'a', 'b']

In [None]:
# Retrieve first element from list
my_list[0]

In [None]:
# Retrieve fourth element from list
my_list[3]

In [None]:
# Retrieve the last element from list
my_list[-1]

Multiple elements can be retrieved at once using a slice (`start:stop:step`). The slice selects the elements from index `start` up to, but not including, index `stop`. If `start` is left blank, the slice begins at the first element; if `stop` is left blank, it runs through the last element. The `step` is the interval between selected indices and defaults to `1` if not specified.

In [None]:
# Retrieve the values from the 2nd through the 4th element (the stop index, 4, is excluded)
my_list[1:4]

In [None]:
# Retrieve all values from the 2nd element to the end of the list
my_list[1:]

In [None]:
# Retrieve all values up to, but not including, the 4th element
my_list[:3]

In [None]:
# Step allows you to specify the intervals
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
numbers[2:9:2]  # From the 3rd element up to (but not including) the 10th, in steps of 2

In [None]:
# You can also use steps to help reverse the list order
numbers[9:1:-1]  
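
As a compact summary of the slice rules, the following sketch applies each variant to one short list. The list `letters` and the result names are illustrative.

```python
letters = ["a", "b", "c", "d", "e", "f"]

first_three = letters[:3]     # start omitted: begin at index 0
from_fourth = letters[3:]     # stop omitted: run through the last element
every_other = letters[::2]    # both omitted: whole list, in steps of 2
reversed_all = letters[::-1]  # negative step: whole list, back to front

print(first_three)   # ['a', 'b', 'c']
print(from_fourth)   # ['d', 'e', 'f']
print(every_other)   # ['a', 'c', 'e']
print(reversed_all)  # ['f', 'e', 'd', 'c', 'b', 'a']
```

A slice always returns a new list, so the original `letters` is left unchanged.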

In [None]:
# You can also get elements from within nested lists
nested_list = [1, 2, 3, ["a", "b", "c"]]
nested_list[3]

In [None]:
nested_list[3][0]

In [None]:
lst = ['a', 'b', 4, [True, 29, "Hello world!"], False, 1, int(False), bool("True")]
lst[::-1][4][::-1][0]

##### Appending

New elements can be added to the end of a list using the `append()` method:

In [None]:
my_list = [1, 2, 3, 'a', 'b']
my_list

In [None]:
# Append 1 to a list
my_list.append(1)
my_list

In [None]:
# NOTE: append() modifies the list object in place; no re-assignment is needed.
my_list.append(2)
my_list.append("a")
my_list

In [None]:
# NOTE: References to the my_list object will also be modified!
my_list = [1, 2, 3, 'a', 'b']
my_list_2 = my_list

my_list.append("Added :)")
my_list

In [None]:
my_list_2
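
If you want a second list that does not change when the first one does, one option is to copy it. The sketch below contrasts a plain assignment with the list `.copy()` method; the variable names are illustrative.

```python
original = [1, 2, 3]
alias = original               # same object under a second name
independent = original.copy()  # new list holding the same elements

original.append(4)

print(alias)        # [1, 2, 3, 4]
print(independent)  # [1, 2, 3]
```

Note that `.copy()` makes a shallow copy: any nested lists inside the list are still shared between the original and the copy.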

##### Length

It may be helpful to know the length of a list. You can get that information with the `len()` function:

In [None]:
len(my_list)

##### Sorting

It may be helpful to sort the elements of a list. This can be accomplished with the `sort()` method, which sorts the list in place.

In [None]:
my_list2 = [1, 2, 5, 3]
my_list2.sort()
print(my_list2)
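
Because `.sort()` changes the list in place, it is worth contrasting it with the built-in `sorted()` function, which leaves the original alone. A minimal sketch with an illustrative list `scores`:

```python
scores = [3, 1, 2]

ordered = sorted(scores)   # returns a new sorted list; scores is unchanged
print(scores, ordered)     # [3, 1, 2] [1, 2, 3]

returned = scores.sort()   # sorts scores in place and returns None
print(scores, returned)    # [1, 2, 3] None

scores.sort(reverse=True)  # descending order
print(scores)              # [3, 2, 1]
```

A common mistake is writing `scores = scores.sort()`, which replaces the list with `None`.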

##### in
Sometimes, you want to test whether a value exists within a list. To do this, we use the `in` operator. 

In [None]:
my_lst = ["Hello world!", 200, int("1"), [True, False, [1, 0, bool(0)]], 2**4, "This is a list!", 2+2j]
my_lst

In [None]:
# 200 is inside of my_lst?
200 in my_lst

In [None]:
# "Hello list!" is inside of my_lst?
"Hello list!" in my_lst

##### Range

`range()` allows you to easily construct a list-like object containing the values from one number up to (but not including) another.

It follows the pattern `range(start, stop, step)`.

In [None]:
range(0, 100)

In [None]:
range(10, 20, 3)

In [None]:
20 in range(0, 100)

Ranges can be easily converted to a `list`. **Note** they are not inclusive of the 'end' value in the range. 

In [None]:
list(range(0, 10, 1))

In [None]:
list(range(20, 30, 2))

In [None]:
list(range(10))
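
The sketch below collects a few `range()` cases in one place, including a negative step, to show how the excluded stop value plays out. The result names are illustrative.

```python
by_threes = list(range(10, 20, 3))
countdown = list(range(10, 0, -2))

print(by_threes)             # [10, 13, 16, 19]
print(countdown)             # [10, 8, 6, 4, 2]
print(len(range(0, 100)))    # 100
print(100 in range(0, 100))  # False, because the stop value is excluded
```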

##### String - List Conversion

In python, strings are actually similar to a list of letters. You can access individual values using list access notation:

In [None]:
my_str = "Hello world!"
my_str[4]

In [None]:
my_str[3::2]

Additionally, strings containing a *separator* can be broken up into a list of strings using `.split()`. This is useful, for example, when trying to parse the text of a document.

In [None]:
sentence = "It was a dark and stormy night."
sentence.split(sep=" ")

You can also rejoin a list into a string using `.join()`. 

Notice that `.join()` is a method belonging to objects of the `str` class. 

In [None]:
words = ['It', 'was', 'a', 'dark', 'and', 'stormy', 'night.']
" ".join(words)

Any arbitrary `str` can use the `.join()` method!

In [None]:
"(-_-)".join(words)

#### Dictionaries

Dictionaries are one of the core data types of the Python language. Unlike lists, the values in a dictionary are accessed using keys rather than numerical positions. (Since Python 3.7, dictionaries also preserve the order in which keys were inserted.) This workshop will not describe dictionaries in detail, but you can refer to the W3 schools guide [here](https://www.w3schools.com/python/python_dictionaries.asp) for more info.

In [None]:
# Create a dict using key-value pairs between {}
my_dict = {
    'hello': 1
}
my_dict

In [None]:
# Access the value in a dict using keys
my_dict['hello']

In [None]:
# Dictionaries can have numerical, string, and boolean keys. They can hold any number of arbitrary object types.
my_dict = {
    'hello': 1,
    'world': True,
    123: [1, 2, ["Hello world"]],
    True: {
        "New": True
    }
}
my_dict

In [None]:
my_dict[True]
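
As a brief sketch of everyday dictionary use, the example below adds a key, overwrites a value, and checks for membership. The dictionary `ages` and its contents are made up for illustration.

```python
ages = {"alice": 30, "kevin": 41}

ages["sara"] = 25    # add a new key-value pair
ages["alice"] = 31   # overwrite the value for an existing key

print("kevin" in ages)  # True: `in` checks the keys, not the values
print(list(ages))       # ['alice', 'kevin', 'sara'] (insertion order, Python 3.7+)
print(ages["alice"])    # 31
```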

### Modules and Packages

#### Modules

Modules are python script files (`*.py`) which are imported into python. Typically, they contain functions and/or classes which are useful for your code. We can import modules like so:

In [None]:
import builtins

`builtins` is a generic python module which contains many core functions such as `print()`. If we check the `type()` of `builtins`, we see that it is a `module` object:

In [None]:
type(builtins)

If we import `builtins` as a module, we can then use the functions within that module, like so:

In [None]:
builtins.print("Hello world!")

It might get inconvenient to keep typing `builtins` every time we want to use the `builtins.print()` function. Instead, we can import `builtins` under an alias which is easier to type:

In [None]:
import builtins as btns

btns.print("Hello world!")

Sometimes we only want to use a small number of functions from a module. So, instead of importing the whole module, we might instead just import those functions directly using `from ... import ...`:

In [None]:
from builtins import print

print("Hello world!")

Finally, we can even give an imported function an alias:

In [None]:
from builtins import print as pnt

pnt("Hello world!")
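
The same four import styles work for any module. As a sketch, here they are applied to the standard-library `math` module; the aliases `m` and `fl` are arbitrary choices for illustration.

```python
import math                   # import the whole module
import math as m              # the same module under a shorter alias
from math import sqrt         # import a single function
from math import floor as fl  # import a function under an alias

print(math.sqrt(16))  # 4.0
print(m is math)      # True: both names refer to one module object
print(sqrt(9))        # 3.0
print(fl(3.7))        # 3
```

Short aliases save typing, but widely recognized ones (such as `np` for `numpy`) are easier for other readers to follow than ad hoc ones.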

#### Packages and CLI usage in Jupyter

Packages are collections of modules typically based on a shared purpose. They are typically installed using a package manager such as `pip` from the command line, like so:

```shell
pip install <name_of_package>
```

However, we are in a notebook and not on the command line! How do we install packages? Fortunately, Jupyter Notebook allows us to run arbitrary command-line commands by placing the `!` symbol at the beginning of a line. For example, just as on the CLI, you can write "Hello world!":

In [None]:
!echo "Hello world!"

This is equivalent to opening command prompt (windows) or terminal (macOS) and typing:

```shell
echo "Hello world!"
```

This capability is also very useful when you need to install python packages using `pip`, which is typically done from the command line. Instead, we can install packages from within Jupyter like so:

```
!pip install numpy
```

However, the preferred way to install packages is using the Jupyter **magic** command `%pip install numpy`:

In [None]:
%pip install numpy

#### Numpy arrays

`numpy` is a python package that provides complex data types for performing mathematical operations. In particular, numpy provides the `array` data type which is similar to the `matrix` in R. 

Before arrays can be constructed it is necessary to install the numpy library in Python (if you don't already have it) and load it into Python:

In [None]:
import numpy as np

In [None]:
type(np)

Just like other objects, modules have properties and methods. We typically load modules into Python because we want to use the functions they contain. As a reminder, you can access an object's methods using the `<object>.<method>()` notation. For the `numpy` module, the function we are most interested in is `array()`, which we can use to construct an `array` object.

Arrays are similar to lists, except that they are specifically designed for holding only one type of data, typically numerical data.

In [None]:
# Create a 1-dimensional (1d) array holding the values 1, 2, and 3
np.array([1, 2, 3])

We can create 2-dimensional arrays by adding lists of lists:

In [None]:
# Construct a 2d array
np.array([
    [1, 2, 3],
    [4, 5, 6]
])

We can even create a 3-dimensional array (and beyond) using lists of lists of lists (etc).

In [None]:
# Construct a 3d array
np.array([
    [
        [1, 2, 3],
        [4, 5, 6]
    ],
    [
        [7, 8, 9],
        [10, 11, 12]
    ]
])

##### Numpy array methods

Many methods are available for `array` objects. An exhaustive reference is available [here](https://numpy.org/doc/stable/reference/index.html). For now, we will discuss a few key methods:

1. Creation
2. Shape and dimensions
3. Accessing elements
4. Setting elements
5. any / all 
6. Mathematical operations

##### Creation

Numpy arrays can be created in multiple ways. The simplest involves the use of lists (shown above):

In [None]:
my_arr = np.array([
    [True, False, False],
    [False, True, False]
])
my_arr

In [None]:
type(my_arr)

Arrays can also be created using the `arange()` function. It creates a sequential `array` that, like `range()`, runs from a start value (default `0`) up to, but not including, the stop value:

In [None]:
# Create a 1d integer array from 0 through 9 (the stop value, 10, is excluded)
my_arr = np.arange(10)
my_arr

In [None]:
# Create a 1d integer array from 10 through 19
my_arr = np.arange(10, 20)
my_arr

In [None]:
# Create a 1d integer array from 10 through 95 in steps of 5
my_arr = np.arange(10, 100, 5)
my_arr

In [None]:
# Create a 1d float array from 0.0 through 9.0
my_arr = np.arange(10.0)
my_arr

##### Shape and dimensions

`numpy` arrays have a number of dimensions and a shape. Note that these are properties, not methods. They are accessed using this pattern: `<object>.<property_name>` as follows:

In [None]:
# Construct 2d array
my_arr = np.array([
    [True, False, False],
    [False, True, False]
])

In [None]:
# Get number of dimensions property
my_arr.ndim

**Note**: The `shape` of any `array` follows the format:

```python
(dimN, dimN-1, ..., dim3, dim2, dim1)
```

Here `dim1` corresponds to the innermost brackets (dimension \#1), `dim2` to the next innermost, and so on up to `dimN`, the outermost brackets. The value for each `dim` is the number of elements along that dimension. For example:

In [None]:
# Get the shape property (number of rows (ie dim 2), number of columns (ie dim 1))
my_arr.shape

Let's examine this in action with a 3D array:

In [None]:
# Construct a 3d array
arr_3d = np.array([
    [
        [1, 2, 3],
        [4, 5, 6]
    ],
    [
        [7, 8, 9],
        [10, 11, 12]
    ]
])

# Get the shape (number of 2d arrays (aka "stacks"), number of rows, number of columns)
arr_3d.shape
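
One way to check your reading of a shape is to count the brackets by hand. In the sketch below (with an illustrative nested list `nested`), the numbers reported by `shape` match the lengths found with `len()` at each level of nesting, from outermost to innermost.

```python
import numpy as np

nested = [
    [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
    [[13, 14, 15, 16], [17, 18, 19, 20], [21, 22, 23, 24]],
]
arr = np.array(nested)

print(arr.ndim)   # 3
print(arr.shape)  # (2, 3, 4): 2 stacks, 3 rows, 4 columns

# The same numbers, read off the nested list from the outside in
print(len(nested), len(nested[0]), len(nested[0][0]))  # 2 3 4

print(arr.size)   # 24 elements in total (2 * 3 * 4)
```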

Finally, the shape of an array can be altered using the `reshape()` method. This is particularly useful for quickly constructing arrays of a desired shape:

In [None]:
my_arr = np.arange(15)
my_arr = my_arr.reshape((5, 3))  # Note that this does NOT overwrite the my_arr object until you re-assign using '='

In [None]:
my_arr

The above can be simplified in 1 line of code:

In [None]:
my_arr = np.arange(15).reshape((5, 3))
my_arr

**Challenge problem:** Create an array with all the even numbers from 2 through 13 in 2D format with two rows and three columns.

##### Accessing elements

Elements can be accessed using several approaches:

1. Numerical
2. Logical

For the **Numerical** approach, supply one index per dimension, in the same order as the array's shape:

```python
my_array[dimN, dimN-1, ..., dim3, dim2, dim1]
```

Here the value for each `dim*` is the index of the element you want to access within that dimension. For example:

In [None]:
# For a 1D array, similar to list
my_arr = np.array([3, 8, 1, 5])
my_arr[1]  # Get the 2nd element of the first (and only) dimension

Now in a 2D array with `dim2` (rows) and `dim1` (cols):

In [None]:
# For a 2D array, the pattern is array[dim2_index, dim1_index]
my_arr = np.array([
    [5, 7, 4, 6],
    [2, 1, 9, 8]
])
my_arr[0, 1]  # First element in dim 2 (row 1) and second element in dim 1 (column 2)

Now in a 3D array with `dim3` (stacks), `dim2` (rows), and `dim1` (cols):

In [None]:
# For an n-dimensional array, the pattern is the same: array[dimN_index, dimN-1_index, dimN-2_index..., dim1_index]
my_arr = np.arange(125).reshape((5, 5, 5))
print(my_arr)

In [None]:
my_arr[2, 3, 1]  # 3rd element in dim 3 (stack 3), 4th element in dim 2 (row 4), 2nd element in dim 1 (column 2)

We can also use slice notation in order to get elements within a dimension! For example, if we wanted to get the 2nd column from every row in the first stack of `my_arr`:

In [None]:
my_arr[0, :, 1]

**Challenge problem**: Write a statement which accesses every other column, from the last two rows, of the last stack in `my_arr`.

For the **Logical** approach to accessing data, we can use a boolean array to extract the element(s) of interest:

In [None]:
num_arr = np.array([1, 2, 3])
bool_arr = np.array([False, False, True])
num_arr[bool_arr]  #  We access the element of num_arr for which bool_arr is True

This approach is extremely powerful when you can use logical operations to create a boolean array:

In [None]:
# Create a 2D matrix
dataset = np.array([
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10]
])
print(dataset)

In [None]:
# Create a boolean matrix for this dataset to test where values are greater than 3
bools = dataset > 3
print(bools)

In [None]:
# Extract the value(s) which satisfy this logical operation
dataset[bools]

We can also use `np.equal()` to construct a boolean array based on equivalence with a value:

In [None]:
# Create a boolean matrix for this dataset to test where values are equal to 5 using np.equal()
bools = np.equal(dataset, 5)
print(bools)

In [None]:
# Extract the value(s) which satisfy this logical operation
dataset[bools]

In [None]:
# Create a boolean matrix for this dataset to test where values are > 8 or < 3
bools = np.logical_or(dataset > 8, dataset < 3)
print(bools)
# Subset the data using these booleans
dataset[bools]

Finally, we can use the **where** approach, which is a hybrid of these two methods. `where()` returns the numerical indices at which a logical condition was met. For example:

In [None]:
dataset

In [None]:
# Find the numerical indices for values in the dataset > 6
indices = np.where(dataset > 6)
print(indices)
# Subset the data using these indices
dataset[indices]
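
The output of `np.where()` can look cryptic at first. For a 2D array it is a tuple of two arrays: the row indices of the matches, and the column indices of the matches. The sketch below unpacks them into the illustrative names `rows` and `cols`.

```python
import numpy as np

data = np.array([
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10]
])

rows, cols = np.where(data > 8)
print(rows)  # [1 1]  -> both matches are in the second row
print(cols)  # [3 4]  -> in the fourth and fifth columns

# Indexing with the index arrays selects the same values as the boolean mask
print(data[rows, cols])  # [ 9 10]
print(data[data > 8])    # [ 9 10]
```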

##### Setting elements

Just as you can access elements of an array, you can also set them. This can be done with integer and logical indexing. 

Here is an example with simple integer indexing:

In [None]:
# Create a 2D matrix
dataset = np.array([
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10]
])
print(dataset)

In [None]:
# Change row 2, column 5 to the value 100
dataset[1, 4] = 100
dataset

You can also use logical indexing to set array values:

In [None]:
# Set every value > 3 to 0
dataset = np.array([
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10]
])
dataset[dataset > 3] = 0 
dataset

And, finally, you can use the `where()` method:

In [None]:
# Set all values < 7 to -1
dataset = np.array([
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10]
])
dataset[np.where(dataset < 7)] = -1
dataset

##### Any / All

`any()` and `all()` are two methods which determine whether an array satisfies a logical condition. `any()` is `True` if any element in the array satisfies the condition. `all()` is `True` if all elements of the array satisfy the condition. Examples:

In [None]:
dataset = np.array([
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10]
])

In [None]:
# Any values equal to 0?
np.any(dataset == 0)

In [None]:
# All values NOT equal to 0?
np.all(dataset != 0)

##### Mathematical methods

Arrays have a large number of built-in mathematical methods. Examples include `sum()` and `mean()`. They can also be used for multi-dimensional algebraic operations, such as matrix multiplication and dot products. Here are a small number of examples:

In [None]:
my_data = np.arange(9).reshape((3,3))
my_data

In [None]:
# Multiplication by scalar
my_data * 3

In [None]:
# Addition by vector
my_vector = np.array([5, 10, 20])
my_data + my_vector

In [None]:
# Sum of values
my_data.sum()

In [None]:
# Mean of each row -- "axis" specifies the dimension to average over (axis=1 averages across the columns)
my_data.mean(axis=1)

In [None]:
# Max of each column (axis=0 takes the max down the rows)
my_data.max(axis=0)
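
A helpful way to think about `axis` is that it names the dimension that gets collapsed. The sketch below uses a small illustrative array, `table`, whose row and column sums are easy to check by hand.

```python
import numpy as np

table = np.array([
    [1, 2, 3],
    [4, 5, 6]
])  # shape (2, 3): 2 rows, 3 columns

col_sums = table.sum(axis=0)  # collapses the rows: one value per column
row_sums = table.sum(axis=1)  # collapses the columns: one value per row

print(col_sums)  # [5 7 9]
print(row_sums)  # [ 6 15]
print(col_sums.shape, row_sums.shape)  # (3,) (2,)
```

The shape of the result is the original shape with the chosen axis removed, which is a quick check that you picked the axis you intended.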

In [None]:
# Transposition
my_data.transpose()

In [None]:
# Make new dataset
my_data2 = np.arange(100, 109).reshape((3,3))  # 3x3 matrix of the values 100 through 108

# Compute dot product
dot_prod = np.dot(my_data, my_data2)
dot_prod

In [None]:
# Compute the pearson correlation of two 1d arrays
arr1 = np.array([1, 5, 6, 6, 7, 10])
arr2 = np.array([3, 3, 4, 3, 6, 9])
np.corrcoef(arr1, arr2)  # Correlation is ~.809

**Random**: Randomness is very useful in advanced mathematics and statistics. `numpy` has built-in functions for generating random numbers. For example, we can easily pick a random integer from 1 up to (but not including) 100:

In [None]:
np.random.randint(low=1, high=100)

We can also generate a random integer array based on a supplied size parameter:

In [None]:
np.random.randint(low=1, high=100, size=(5, 3))

We can also generate an array of random floats between 0 and 1 by supplying a shape:

In [None]:
np.random.rand(4, 4)

### Flow control & Functions

We will also briefly discuss control flow and functions in python. While these are useful techniques for python programming, they are not necessary for most typical data science activities in python. These are the topics which we will now summarize:

1. If...elif...else
2. Loops
3. Function definitions

#### If...elif...else

These statements indicate code blocks that will only be executed given that a logical condition is met.

##### If statements

`if` statements in python create a logic gate, such that some code will only execute if a logical condition is met. See an example here:

In [None]:
a = 1
b = 1

if a == b:
    # Execute this code only if a == b is True
    print("a is equal to b!")

The above example shows an `if` statement. The code in this statement only executes when the condition (`a == b`) is `True`. **Challenge:** Can you modify the above block so that the code will not execute?

##### If...else statements

`else` statements are executed if no previous conditions are satisfied. In other words, only if none of the preceding `if` or `elif` conditions are met will the `else` block execute.

In [None]:
a = 1
b = 2

if a == b:
    print("a is equal to b!")
else:
    print("a is NOT equal to b!")

In [None]:
a = [1, 2, 3]
cond = a[1] == 2
if cond:
    print("Yes")
else:
    print("No")

##### If...elif...else statements

`elif` is a keyword that means "else if". Its condition is tested only if the preceding logical conditions were not satisfied.

In [None]:
grade = 78

if grade > 90:
    # Only executes if grade > 90
    letter_grade = "A"
elif grade > 80:
    # Only executes if grade > 80 and grade <= 90
    letter_grade = "B"
elif grade > 70:
    # Only executes if grade > 70 and grade <= 80
    letter_grade = "C"
elif grade >= 60:
    # Only executes if grade >= 60 and grade <= 70
    letter_grade = "D"
else:
    # Only executes if grade < 60
    letter_grade = "F"
    
print("Student earned a grade of " + letter_grade)

In the above example, each logical condition is tested in sequence. Only when a condition is not met is the next one tested. If a student has a grade of `68`, then every `elif` statement will be tested. If the student had a `96`, then no `elif` statements would have been tested.
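
Because only the first matching branch runs, the order of the conditions matters. The sketch below deliberately puts the broadest condition first to show what goes wrong; the variable name `wrong_order` is illustrative.

```python
grade = 96

# Broadest condition first: 96 >= 60 is True, so the chain stops here
# and none of the later conditions are ever tested.
if grade >= 60:
    wrong_order = "D"
elif grade > 70:
    wrong_order = "C"
elif grade > 80:
    wrong_order = "B"
elif grade > 90:
    wrong_order = "A"
else:
    wrong_order = "F"

print(wrong_order)  # D
```

Ordering the conditions from strictest to broadest, as in the grading example above, avoids this problem.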

In [None]:
trees = ['pine', 'spruce', 'fir', 'oak', 'cherry']

if 'pi' + ' ne' in trees:
    print("Pine Trees!")
elif ",".join(['fir', 'oak']) == "fir, oak":
    print("Fir Trees!")
elif "".join(['ry', 'cher'][::-1]) in trees:
    print("Cherry Trees!")
else:
    print("No trees!")

In [None]:
",".join(['fir', 'oak'])

#### Loops

Loops allow for python code to be applied to every element of an iterable object, such as a list.

##### For loops

For loops are a type of finite loop in python (as opposed to `while` loops which we will not discuss here). A for loop iterates over an iterable object, such as a `list` or `tuple`. For every element of the object, code will be executed in succession. Here is an example:

In [None]:
# Loop through a list of 1 through 10
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

for number in numbers:
    print(number)

As the loop iterates, it assigns each element of `numbers` to the variable `number` and then runs the code within the loop. For example, we can add 10 to `number`:

In [None]:
for number in numbers:
    print(number + 10)

Loops can be a relatively convenient way to add numbers to a list using the `.append()` method. For example:

In [None]:
new_numbers = list()
for number in numbers:
    new_numbers.append(number + 10)
    
new_numbers

##### Combining loops and if / else

In [None]:
# Combining loops and if...else
grades = [85, 98, 45, 73]

# Loop over list of grades and print letter grade
for grade in grades:
    if grade > 90:
        # Only executes if grade > 90
        letter_grade = "A"
    elif grade > 80:
        # Only executes if grade > 80 and grade <= 90
        letter_grade = "B"
    elif grade > 70:
        # Only executes if grade > 70 and grade <= 80
        letter_grade = "C"
    elif grade >= 60:
        # Only executes if grade >= 60 and grade <= 70
        letter_grade = "D"
    else:
        # Only executes if grade < 60
        letter_grade = "F"

    print("Student earned a grade of " + letter_grade)


##### Integer indices instead of direct for loops

Rather than using the list of grades directly, it may be useful to use the numerical indices of list elements. For example:

In [None]:
# Loop through a list of letters
letters = ["a", "b", "c", "d"]

for letter in letters:
    print(letter)

In [None]:
range(len(letters))

In [None]:
for i in range(len(letters)):
    letter = letters[i]
    print(letter)

While this may seem more complicated, there are many situations in which this is necessary! For example:

In [None]:
students = ['alice', 'kevin', 'sara', 'tim']
grades = [85, 98, 45, 73]

for i in range(len(grades)):
    
    grade = grades[i]
    student = students[i]
    
    if grade > 90:
        # Only executes if grade > 90
        letter_grade = "A"
    elif grade > 80:
        # Only executes if grade > 80 and grade <= 90
        letter_grade = "B"
    elif grade > 70:
        # Only executes if grade > 70 and grade <= 80
        letter_grade = "C"
    elif grade >= 60:
        # Only executes if grade >= 60 and grade <= 70
        letter_grade = "D"
    else:
        # Only executes if grade < 60
        letter_grade = "F"

    print(student + " earned a grade of " + letter_grade)
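
Index-based loops are one way to walk through two lists together. Python's built-in `zip()` and `enumerate()` functions are common alternatives, sketched here with the same student data. The names `paired` and `indexed` are illustrative.

```python
students = ['alice', 'kevin', 'sara', 'tim']
grades = [85, 98, 45, 73]

# zip() pairs up the elements that share a position
for student, grade in zip(students, grades):
    print(student + " scored " + str(grade))

# enumerate() yields each index together with its element
for i, student in enumerate(students):
    print(str(i) + ": " + student)

paired = list(zip(students, grades))
indexed = list(enumerate(students))
print(paired[0])   # ('alice', 85)
print(indexed[3])  # (3, 'tim')
```

Both functions avoid the manual `letters[i]` lookups, which removes one source of off-by-one mistakes.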


##### List comprehension

Explicit for loops are often a slow coding pattern in Python for numerical work, and there is frequently a faster alternative (for example, the `numpy` array operations shown earlier). However, a closely related construct is widely used in data science: **list comprehensions**.

**List comprehension** is a *pythonic* coding pattern used for performing an action on every element of a list. It is usually somewhat faster than the equivalent for loop that calls `.append()`, and it takes fewer lines.

It usually takes the following form:

```python
[ modify_value(value) for value in values ]
```
This will take every element of `values` and modify it using the `modify_value()` function, returning a list of modified values of the same order and length as `values`.

For example we can simply return the `number` for every `number` in `range(1, 11)` like so:

In [None]:
# Build a list containing each number from 1 through 10
[number for number in range(1, 11)]

We could also modify these numbers:

In [None]:
[number + 10 for number in numbers]  # Returns a value

We can even add conditionals! Such as if...else statements:

In [None]:
[-number if number % 2 == 0 else number for number in numbers]  # Returns a value
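
A comprehension can also filter elements by placing an `if` at the end. The sketch below shows that form next to the equivalent for loop; `evens` and `evens_loop` are illustrative names.

```python
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# A trailing `if` filters: only the even numbers are kept
evens = [number for number in numbers if number % 2 == 0]

# The equivalent for loop
evens_loop = []
for number in numbers:
    if number % 2 == 0:
        evens_loop.append(number)

print(evens)                # [2, 4, 6, 8, 10]
print(evens == evens_loop)  # True
```

Note the difference in position: an `if...else` before the `for` transforms every element, while an `if` after the `for` decides which elements are included at all.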

#### Functions

Functions are objects in python which take an input, perform computations, and return an output. Functions have arguments that help the function operate correctly. For example, we can define a function, `square_it()` which finds the square of any number:

In [None]:
def square_it(x):
    print(x ** 2)
    
type(square_it)

In [None]:
square_it(5)  # Gets 5 ** 2

Functions do not have to return a value. Because `square_it()` only prints its result, it doesn't return anything. Any variable that references the output of `square_it()` will be `None`, Python's object for representing the absence of a value.

In [None]:
result = square_it(5)

In [None]:
print(result)

Interestingly, `None` is itself an object in Python, with its own type. This allows you to easily reference it, which can make solving certain coding problems easier.

In [None]:
type(result)  # NoneType objects

Functions can also return a value with the `return` statement. This is more common in python programming than simply printing the value:

In [None]:
def square_it(x):
    return x ** 2  # Return a value
    
result = square_it(5)

In [None]:
print(result)
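
Functions that return a value combine naturally with the earlier material. As a sketch, the grading logic from the `if...elif...else` section can be wrapped in a function and applied to a whole list with a list comprehension. The function name `to_letter_grade` and the variable `letter_grades` are hypothetical and chosen only for this example.

```python
def to_letter_grade(grade):
    """Hypothetical helper: map a numeric grade to a letter grade."""
    if grade > 90:
        return "A"
    elif grade > 80:
        return "B"
    elif grade > 70:
        return "C"
    elif grade >= 60:
        return "D"
    else:
        return "F"


grades = [85, 98, 45, 73]
letter_grades = [to_letter_grade(grade) for grade in grades]
print(letter_grades)  # ['B', 'A', 'F', 'C']
```

Putting the logic in a function means it is written once and can be reused anywhere a grade needs converting.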