# Modules and Packages

Writing your own modules and packages is a key skill for making your code reusable. You can also move code out of Jupyter notebooks and into separate modules, making the notebooks more readable and better organized.

We'll practice this here by writing functions to calculate the mean, variance, and covariance of lists of numbers.

## Implementing our functions

Write a function, `my_mean`, to calculate the mean of a list of numbers,

$\frac{1}{N}\sum_{i=1}^N (x_i)$

Note that in this notation the numbers are indexed from _1_ to _N_, whereas Python lists are indexed from _0_ to _N-1_. Don't import any Python modules; write your code from scratch.

In [None]:
%%writefile my_stats/averages.py

def my_mean(numbers):
    N = len(numbers)
    total = 0
    for num in numbers:
        total += num
    mean = total / N
    return mean

In [None]:
list_x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
assert my_mean(list_x) == 5.5

Next we'll write a covariance function, and write a variance function as a special case of that.

$\frac{1}{N}\sum_{i=1}^N(x_i - \bar{x})(y_i - \bar{y})$

(This is the formula for the population covariance, which is different from the sample covariance. Don't worry if you don't know the difference between the two; it's not important for this exercise.)

This only works if `list_x` and `list_y` have the same length. We'll use an `assert` statement for now to enforce this, but there are more graceful ways of handling that, as we'll see in the next lecture.
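
As a preview, one common alternative to `assert` is to raise an exception explicitly. The sketch below is only an assumption about what such handling might look like, not necessarily the approach taken in the next lecture; the name `checked_cov` is hypothetical, and it uses the built-in `sum` and `zip` for brevity.

```python
def checked_cov(list_x, list_y):
    """Population covariance that raises ValueError on bad input (hypothetical helper)."""
    if len(list_x) != len(list_y):
        raise ValueError(
            f"lists must have the same length, got {len(list_x)} and {len(list_y)}"
        )
    N = len(list_x)
    if N == 0:
        raise ValueError("lists must not be empty")
    xbar = sum(list_x) / N
    ybar = sum(list_y) / N
    total = 0
    for x, y in zip(list_x, list_y):
        total += (x - xbar) * (y - ybar)
    return total / N


# Mismatched lengths produce a descriptive error the caller can catch.
try:
    checked_cov([1, 2, 3], [1, 2])
except ValueError as err:
    message = str(err)
```

Unlike `assert` statements, which Python skips when run with the `-O` flag, an explicit `raise` always executes and lets the caller decide how to recover.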

In [None]:
%%writefile my_stats/variance.py

# my_mean is defined in a sibling module of the same package,
# so it must be imported before my_cov can use it.
from .averages import my_mean

def my_cov(list_x, list_y):
    assert len(list_x) == len(list_y)
    N = len(list_x)
    xbar = my_mean(list_x)
    ybar = my_mean(list_y)
    total = 0
    for i in range(N):
        total += ((list_x[i] - xbar) * (list_y[i] - ybar))
    cov = total / N
    return cov

In [None]:
list_y = [3, 6, 4, 3, 8, 0, 6, 3, 7, 4]
assert round(my_cov(list_x, list_y), 4) == 0.6

Note that the purpose of functions is to make code reusable. If you didn't use your `my_mean` function to get this to work, go back and do that now.

Finally, note that the variance formula, 

$\frac{1}{N}\sum_{i=1}^N(x_i - \bar{x})^2$

is a special case of the covariance formula. Write a variance function that uses your `my_cov` function to get the result.
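
To see why, set $y_i = x_i$ (and therefore $\bar{y} = \bar{x}$) in the covariance formula:

$\frac{1}{N}\sum_{i=1}^N(x_i - \bar{x})(x_i - \bar{x}) = \frac{1}{N}\sum_{i=1}^N(x_i - \bar{x})^2$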

In [None]:
%%writefile -a my_stats/variance.py

def my_var(list_x):
    var = my_cov(list_x, list_x)
    return var

In [None]:
assert my_var(list_x) == 8.25

## Exporting our functions to modules

Now we can use the Jupyter `%%writefile` magic command to export our functions into modules. There is a folder called `my_stats` in this directory, which we will use as our package. It already contains the necessary `__init__.py` file that signals to Python that this directory behaves as a package.

Add the command `%%writefile my_stats/<module_name>.py` as the first line of your function definition cells above, so that each file is written inside the package folder. Put your `my_mean` function in a module called `averages.py`. Put the other two in a module called `variance.py`.

Note that you will need to add code to the `my_cov` cell in order to import the `my_mean` function that it uses, since these functions will now be defined in two different files. Importing functions from other files in the same directory can be tricky. Try this on your own to see where you get stuck, and if you need help refer to [this question](https://stackoverflow.com/questions/43865291/import-function-from-a-file-in-the-same-folder) on StackOverflow.
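
If you do get stuck, the following self-contained sketch shows one way an import between two modules in the same package can work. It builds a throwaway package in a temporary directory; the package name `demo_stats` and the simplified function bodies are assumptions made for illustration, not part of this exercise's files.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package (hypothetical name: demo_stats) in a temp directory.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_stats"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "averages.py").write_text(
    "def my_mean(numbers):\n"
    "    return sum(numbers) / len(numbers)\n"
)
# The leading dot makes this a relative import, resolved inside the same package.
(pkg / "variance.py").write_text(
    "from .averages import my_mean\n"
    "\n"
    "def my_var(numbers):\n"
    "    xbar = my_mean(numbers)\n"
    "    return sum((x - xbar) ** 2 for x in numbers) / len(numbers)\n"
)

# Make the temp directory importable, then import through the package name.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
from demo_stats.variance import my_var

result = my_var([1, 2, 3, 4])
```

The relative form `from .averages import ...` only works when the module is imported as part of its package (here, `demo_stats.variance`), not when the file is run directly as a script.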

You will also need to add a parameter (the `-a` flag) to the `%%writefile` command in the `my_var` cell in order to append it to the file, instead of overwriting `my_cov`.
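
The difference between overwriting and appending can be illustrated with Python's own file modes. This is an analogy in plain Python rather than a description of how the magic command is implemented; the scratch file name `variance_demo.py` is hypothetical.

```python
import tempfile
from pathlib import Path

# Hypothetical scratch file standing in for my_stats/variance.py.
path = Path(tempfile.mkdtemp()) / "variance_demo.py"

# Mode "w" truncates the file before writing, so earlier contents are lost.
with open(path, "w") as f:
    f.write("def my_cov(list_x, list_y):\n    ...\n")

# Mode "a" keeps the existing contents and writes at the end of the file.
with open(path, "a") as f:
    f.write("\ndef my_var(list_x):\n    ...\n")

contents = path.read_text()
```

After both writes, `contents` holds the `my_cov` stub followed by the `my_var` stub; had the second write used `"w"`, only `my_var` would remain.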

Run the next two cells to clear your Python workspace, import the functions from your new modules, and pass the tests. (`%reset` asks for confirmation before clearing; answer `y`.)

In [None]:
%reset

In [None]:
from my_stats.averages import my_mean
from my_stats.variance import my_cov, my_var

list_x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
list_y = [3, 6, 4, 3, 8, 0, 6, 3, 7, 4]
assert my_mean(list_x) == 5.5
assert round(my_cov(list_x, list_y), 4) == 0.6
assert my_var(list_x) == 8.25