# Good Python

This notebook shows some of the most basic Python concepts that we'll be
using over and over again throughout this course.

## Declaring objects

We _declare_ a variable by assigning an object to a name with the `=` operator.

In [None]:
x = 'LEARNING PYTHON'  # Store string in variable x
y = 2.0  # Store float in variable y

Every object has a type, which is the class it belongs to.

In [None]:
type(x)

In [None]:
type(y)
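
As a small sketch of how types can be checked in code, the built-in `isinstance` tests whether an object belongs to a given class. The names `title` and `value` below are illustrative stand-ins for `x` and `y`.

```python
title = 'LEARNING PYTHON'  # A string, like x above
value = 2.0                # A float, like y above

# type() returns the class; __name__ gives its name as a string
print(type(title).__name__)  # str
print(type(value).__name__)  # float

# isinstance() checks whether an object belongs to a class
print(isinstance(value, float))  # True
print(isinstance(value, str))    # False
```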

All objects have _special_ attributes and methods depending on their class.

For example, `.split()` is a string-specific method.

In [None]:
x.split(' ')  # Splits the string on ' ' and returns a list

In contrast, we cannot access `.split()` for `y` because `y` is a float, not a string!

In [None]:
y.split('.')  # Raises an AttributeError instead of returning [2, 0]!
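
One way to check ahead of time whether an object offers a given method is the built-in `hasattr`. The sketch below uses the illustrative names `text` and `number` in place of `x` and `y`, and catches the error so the cell keeps running.

```python
text = 'LEARNING PYTHON'  # A string, like x above
number = 2.0              # A float, like y above

# hasattr() reports whether an object has an attribute or method with that name
print(hasattr(text, 'split'))    # True
print(hasattr(number, 'split'))  # False

# Calling a missing method raises an AttributeError, which we can catch
try:
    number.split('.')
except AttributeError as error:
    message = str(error)
    print(f'Caught an error: {message}')
```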

### f-strings
_f-strings_ (formatted string literals) let us embed variables and expressions directly inside a string by wrapping them in curly braces `{}`; the _f_ stands for _formatted_.

In [None]:
# Declare some variables according to how you feel regarding Python
custom_1, custom_2, custom_3 = 'Python', 'greatest', 'better'

# Stick them in an f-string
f_string = f'{custom_1} is the {custom_2} language ever! It is much {custom_3} than R.'

# Print string
print(f_string)
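
The braces can hold more than plain variables. As a short sketch with made-up values (`language`, `score` and `students` are illustrative), any expression can go inside them, and an optional format specifier after `:` controls how the value is rendered.

```python
language = 'Python'
score = 9.456
students = 1234567

# Any expression can go inside the braces
print(f'{language.upper()} has {len(language)} letters.')

# A format specifier after ":" controls how the value is rendered
rounded = f'{score:.2f}'   # Two decimal places
grouped = f'{students:,}'  # Thousands separator
print(f'Score: {rounded}, students: {grouped}')
```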

## Data structures

Lists and tuples are the basic containers we typically use to build more complex
data structures such as NumPy arrays, pandas DataFrames and even TensorFlow tensors.

Lists and tuples are ordered sequences that can hold any kind of object within themselves.

> Remember that Python is zero-indexed! In other words, if you want to access the
**third** element of a list, you need to use index **2** (because we
count from zero: 0, 1, **2**, 3).

In [None]:
# Declare a list
my_list = [1, 2.0, '3']  # Not necessarily homogeneous!

# We can append new values to the list
my_list.append('4')  # Add 4 (as a string) to the list

# We can access the i-th element from array-like objects with []
print(my_list[0])  # First element

# We can remove the i-th element from the list
my_list.pop(2)  # Remove the THIRD element from the list

# Print resulting list
my_list
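
To make zero-indexing concrete, here is a small sketch on a separate, illustrative list called `letters`. It also shows negative indexes and slices, which follow the same counting rule.

```python
letters = ['a', 'b', 'c', 'd', 'e']

first = letters[0]   # Index 0 is the first element
third = letters[2]   # Index 2 is the third element
last = letters[-1]   # Negative indexes count from the end

# A slice [start:stop] includes start but excludes stop
middle = letters[1:4]  # Elements at indexes 1, 2 and 3

print(first, third, last, middle)
```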

Tuples, on the other hand, are basically immutable lists. They serve as a
similar data structure, but you cannot append, remove or update any of the
elements within them.

In [None]:
# Declare a tuple
my_tuple = (42, 43, 44, 45)

# We cannot update any of its items
my_tuple[0] = 123  # Raises a TypeError: we cannot modify the first value
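
If a "modified" tuple is needed, a common workaround is to build a new one. The sketch below uses an illustrative tuple named `point` and catches the error so the cell keeps running.

```python
point = (42, 43, 44, 45)

# Item assignment on a tuple raises a TypeError
try:
    point[0] = 123
except TypeError as error:
    print(f'Caught an error: {error}')

# To "change" a tuple, build a new one from the pieces we want to keep
updated = (123,) + point[1:]
print(updated)  # The original tuple is left untouched
```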

## Calling functions

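Python has several built-in functions (not to be confused with methods: a function
is called on its own, as in `type(x)`, while a method is called on an object, as in `x.split()`).

We programmers say we _pass_ an _argument_ to a _function_.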

```python
my_function(my_argument)  # Pass `my_argument` to function `my_function`
```

For example, we can pass `my_tuple` to the built-in function `sum`, which expects
an iterable (such as a list or tuple) of numeric elements (integers or floats).

In [None]:
# Sum all the elements in my_tuple
print(sum(my_tuple))

# We cannot sum all the elements in my_list because it contains a string (raises a TypeError)
print(sum(my_list))
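
One possible way around this, sketched below, is to convert every element to a number before summing. The list `mixed` is an illustrative copy of what `my_list` holds at this point, and the conversion only works because the string `'4'` happens to look like a number.

```python
mixed = [1, 2.0, '4']  # Same kind of mixed list as my_list

# sum() cannot add a number and a string, so it raises a TypeError
try:
    sum(mixed)
except TypeError as error:
    print(f'Caught an error: {error}')

# Converting every element to a float first makes the sum well defined
total = sum(float(item) for item in mixed)
print(total)  # 7.0
```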

## Importing external libraries

Python has tons of useful built-in functions, but we can extend its functionality
by importing external modules.

> **NOTE**: Imports should ALWAYS go at the very beginning of your script! We are
making an exception here, of course.

In [None]:
# Import numpy (library) and alias it 'np'
import numpy as np

# Use numpy's sum function
np.sum(my_tuple)
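
The `import` statement comes in a few common forms. The sketch below uses modules from Python's standard library (`math` and `statistics`) so that it runs without installing anything; the alias `stats` is an arbitrary choice made here.

```python
# Import a whole module and access its contents with a dot
import math
print(math.sqrt(16))  # 4.0

# Import a module under a shorter alias
import statistics as stats
print(stats.mean([42, 43, 44, 45]))  # 43.5

# Import a single name directly from a module
from math import floor
print(floor(2.7))  # 2
```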

## For loops
A `for` loop executes a block of code once for each element of an iterable object.

1. We can iterate over the values of an iterable object:

In [None]:
my_string = 'GOOD PYTHON'

# Iterate over each character and print it
for char in my_string:
    print(char)

2. We can iterate over the indexes of an iterable object:

In [None]:
for i in range(len(my_string)):
    print(my_string[i])  # Same thing as before but we're using [i]
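
When both the index and the value are needed, the built-in `enumerate` yields them together. A minimal sketch, using an illustrative string `word` and collecting the pairs in a list so we can inspect them:

```python
word = 'GOOD'

pairs = []
for index, char in enumerate(word):
    pairs.append((index, char))  # Keep the (index, value) pair
    print(index, char)
```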

We can also update the elements of a list inside a loop. For example, we can cast
each element of `my_list` to a string.

In [None]:
# Print current version of my_list
print(my_list)

# Iterate over each element and cast it to a string
for i, v in enumerate(my_list):
    my_list[i] = str(v)  # v is the current value, i is its index

# View result
my_list

## List comprehension

A list comprehension is a concise way to build a list, often in a single line of code.

Python developers use it a lot to make their code more readable.

In [None]:
# Declare [1, 2, 3, 4, ..., 98, 99, 100]
my_new_list = [i + 1 for i in range(100)]
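
To see what a comprehension replaces, the sketch below builds the same list twice, first with an explicit loop and then with a comprehension, and adds an optional `if` filter. All names are illustrative.

```python
# Explicit loop: start with an empty list and append one item at a time
squares_loop = []
for i in range(5):
    squares_loop.append(i ** 2)

# Equivalent list comprehension
squares = [i ** 2 for i in range(5)]

# A trailing "if" keeps only the items that satisfy a condition
evens = [i for i in range(10) if i % 2 == 0]

print(squares_loop, squares, evens)
```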

## Custom functions

We can declare our own functions using `def`.

Ideally, we should always add a docstring (instructions on how to use our function)
and hint the expected types of each argument.

In [None]:
# Custom function that calculates the Neyman statistic
def neyman(array1, array2):
    """
    Calculate the Neyman statistic.

    Parameters
    ----------
    array1 : array-like
        Observed outcomes of the treatment group.
    array2 : array-like
        Observed outcomes of the control group.

    Returns
    -------
    float
        The Neyman statistic.
    """

    # Calculate means
    mean_array1 = np.mean(array1)
    mean_array2 = np.mean(array2)

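    # Calculate the denominator (standard error of the difference in means)
    # Note: np.var defaults to ddof=0 (population variance);
    # pass ddof=1 to use the sample variance instead
    sdev = np.sqrt(
        np.var(array1) / len(array1)
        + np.var(array2) / len(array2)
    )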

    # Return the actual statistic
    return (mean_array1 - mean_array2) / sdev

Let's test out our function.

In [None]:
neyman(
    [75, 120, 80, 65],  # Yearly income of treatment group
    [65, 44, 49, 50, 28]  # Yearly income of control group
)
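
The function above documents its argument types in the docstring. As a sketch of what type hints in the signature could look like, here is a simpler, hypothetical helper called `mean_difference` (it is not part of the course code) that only computes the numerator of the statistic.

```python
from typing import Sequence


def mean_difference(treatment: Sequence[float], control: Sequence[float]) -> float:
    """Return the mean of `treatment` minus the mean of `control`."""
    mean_treatment = sum(treatment) / len(treatment)
    mean_control = sum(control) / len(control)
    return mean_treatment - mean_control


difference = mean_difference(
    [75, 120, 80, 65],     # Yearly income of treatment group
    [65, 44, 49, 50, 28],  # Yearly income of control group
)
print(difference)  # Roughly 37.8
```

Type hints are not enforced when the code runs; they serve as documentation for readers and for tools such as editors and type checkers.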

Best practice: We should always factor out our custom functions and place them in their own
little script (a `.py` file, also called a _module_).

> Check out the script `./functions/stats/functions.py`, which contains the same
`neyman` function except it's called `neyman_stat`.

We will use `os.path.join` to build the path (this way, it will work on any OS).

- On Windows, `os.path.join('.', 'functions', 'stats')` returns `'.\functions\stats'`.
- On macOS and Linux, `os.path.join('.', 'functions', 'stats')` returns `'./functions/stats'`.
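
The behavior described in the list above can be checked with a short sketch. `os.sep` holds the separator of the current operating system, so the comparison below should hold wherever the code runs; the variable names are illustrative.

```python
import os

# Join the parts with the separator of the current operating system
module_dir = os.path.join('.', 'functions', 'stats')
print(module_dir)

# os.sep is '\\' on Windows and '/' on macOS and Linux
expected = os.sep.join(['.', 'functions', 'stats'])
print(module_dir == expected)  # True
```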


In [None]:
# Import os & sys (would normally go at the top!)
import os
import sys

# Append the folder to the list of paths Python searches when importing modules
sys.path.append(
    os.path.join(
        '.', 'functions', 'stats'
    )
)

# Import function from functions.py file
from functions import neyman_stat

# Use it again
neyman_stat(
    [75, 120, 80, 65],  # Yearly income of treatment group
    [65, 44, 49, 50, 28]  # Yearly income of control group
)