# Python modules, packages and paths

Last year, you mostly developed and executed your code in Jupyter Notebooks. While Notebooks are incredibly useful for examining the outputs of a Python project, they are less useful for developing lower-level functions that you might want to import into numerous projects.

A "module" in Python is a single file with the extension .py, containing functions and variables that you might want to ```import``` into another script (a text file containing Python commands), an interactive Python session, or a Notebook.

To demonstrate, we'll follow the example [given in the Python documentation](https://docs.python.org/3/tutorial/modules.html) on the Fibonacci sequence (follow the link for much more information on how modules work in Python).

**To practice your new-found bash skills, carry out all the exercises in this worksheet using a bash terminal (e.g. on Noteable)**

## Modules: refresher

Create a folder to work in called something like ```fibo_project``` and then within it, create a file called ```fibo.py``` with the following contents:

```
# Fibonacci numbers module

def fib(n):    # write Fibonacci series up to n
    a, b = 0, 1
    while a < n:
        print(a, end=' ')
        a, b = b, a+b
    print()

def fib2(n):   # return Fibonacci series up to n
    result = []
    a, b = 0, 1
    while a < n:
        result.append(a)
        a, b = b, a+b
    return result
```

These two functions calculate the Fibonacci series up to some value, and then either print the values or return a list. E.g.:

```
fib(10)
```

```
0 1 1 2 3 5 8
```

```
fib2(10)
```

```
[0, 1, 1, 2, 3, 5, 8]
```

We could practice importing our ```fibo.py``` module from a Jupyter Notebook in the same directory, but this time, we'll use an interactive Python shell. From a terminal, **navigate to the folder where you stored the ```fibo.py``` file** and start an interactive Python session by typing:

```python```

You should see a prompt like this:

<img src="figures/python_prompt.png" alt="Python prompt" style="display:block;margin-left:auto;margin-right:auto;width:60%"/>

You should now be able to import the ```fibo.py``` module using the module name, excluding the ```.py``` extension:

```import fibo```

From there, try running either of the functions using, e.g.:

```fibo.fib(10)```

To exit the Python session, type ```exit()```.
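
The different forms of the import statement can be compared side by side. The following is a minimal sketch rather than part of the worksheet steps: so that it runs anywhere, it writes a throwaway copy of `fibo.py` (containing only `fib2`) to a temporary directory and makes that directory visible to Python. The names `demo_dir` and `fb` are illustrative choices.

```python
import sys
import tempfile
from pathlib import Path

FIBO_SOURCE = """\
def fib2(n):
    result = []
    a, b = 0, 1
    while a < n:
        result.append(a)
        a, b = b, a+b
    return result
"""

# Assumption for this demo only: a throwaway folder stands in for fibo_project/.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "fibo.py").write_text(FIBO_SOURCE)
sys.path.insert(0, str(demo_dir))

import fibo                # functions are reached as fibo.<name>
import fibo as fb          # the same module under a shorter alias
from fibo import fib2      # one function, placed directly in the current namespace

print(fibo.fib2(10))       # [0, 1, 1, 2, 3, 5, 8]
print(fb.fib2(10))         # [0, 1, 1, 2, 3, 5, 8]
print(fib2(10))            # [0, 1, 1, 2, 3, 5, 8]
print(fibo.__name__)       # fibo (the file name without .py)
```

In your own session, launched from `fibo_project/`, only the three import lines and the calls are needed.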

### Packages

Packages are multiple Python modules (i.e., Python files) grouped together in a directory. Let's extend our above example to see how Python packages work. Within your ```fibo_project/``` directory, create a new directory called ```fibo_pack``` containing the following module files:

```
fibo_pack/
    - __init__.py
    - fibo_mod1.py
    - fibo_mod2.py
```

Into ```fibo_mod1.py``` paste the function ```fib```, and into ```fibo_mod2.py``` paste the function ```fib2```.

The ```__init__.py``` file is a special file that tells Python that this ```fibo_pack``` directory should be treated as a Python package. You can leave this file blank for now. With a blank ```__init__.py```, each ```.py``` file in the directory can be imported as a module of the package, but none of them is loaded until you import it explicitly.

Because Python now thinks this directory is a package, we can now import the modules found in it. From the ```fibo_project/``` directory, launch Python again and try the following:

```
from fibo_pack import fibo_mod1
fibo_mod1.fib(10)
```

or:

```
from fibo_pack import fibo_mod2
fibo_mod2.fib2(10)
```

You should see the same outputs as before. 

So now, you can organise your code into directories, and, using ```__init__.py```, make your code importable as a Python package.
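
The package layout above can also be built and tested from Python itself. This is a sketch under stated assumptions: a temporary directory (the hypothetical `project` variable) stands in for `fibo_project/`, and inserting it into `sys.path` stands in for launching Python from that folder.

```python
import sys
import tempfile
from pathlib import Path

FIB_SOURCE = """\
def fib(n):
    a, b = 0, 1
    while a < n:
        print(a, end=' ')
        a, b = b, a+b
    print()
"""

FIB2_SOURCE = """\
def fib2(n):
    result = []
    a, b = 0, 1
    while a < n:
        result.append(a)
        a, b = b, a+b
    return result
"""

# Assumption for this demo only: a throwaway folder stands in for fibo_project/.
project = Path(tempfile.mkdtemp())
pack = project / "fibo_pack"
pack.mkdir()

(pack / "__init__.py").write_text("")            # blank file marks the package
(pack / "fibo_mod1.py").write_text(FIB_SOURCE)
(pack / "fibo_mod2.py").write_text(FIB2_SOURCE)

sys.path.insert(0, str(project))                 # like launching Python from fibo_project/

from fibo_pack import fibo_mod1, fibo_mod2

print(sorted(p.name for p in pack.glob("*.py")))
fibo_mod1.fib(10)                                # prints the series up to 10
print(fibo_mod2.fib2(10))                        # [0, 1, 1, 2, 3, 5, 8]
print(fibo_mod2.__name__)                        # fibo_pack.fibo_mod2
```

Note that the full name of each module includes the package name, which is why two packages can each contain a module with the same file name without clashing.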

### The Python path

In the terminal, navigate up one level from the ```fibo_project/``` folder you've created, and launch a new interactive Python session (```python```). Try to import the ```fibo.py``` module again (```import fibo```). What happened?

You should have found that Python returned an error message saying something like:

```ModuleNotFoundError: No module named 'fibo'```

Python couldn't find the module file ```fibo.py```.

Python searches a list of directories, in order, to find modules to import. This list is called the *path*. To see the path for your current session, open Python and run the following:

```
import sys
sys.path
```

```sys.path``` is just a list like any other Python list. This particular list contains directory names. It should look something like this:

```
['/opt/conda/bin',
 '/opt/conda/lib/python37.zip',
 '/opt/conda/lib/python3.7',
 '/opt/conda/lib/python3.7/lib-dynload',
 '',
 '/opt/conda/lib/python3.7/site-packages',
 '/srv/notebook_extensions/tree-page-graphics',
 '/opt/conda/lib/python3.7/site-packages/IPython/extensions',
 '/home/jovyan/.ipython']
```

Most of these path entries point to the Python installation on your machine. The entry ```''``` means "current directory". It's because this entry is in your ```sys.path``` that Python knows to look in the current directory to find modules. When you navigated away from your ```fibo_project``` directory and launched Python from the parent directory, Python couldn't find the module file any more, because the ```fibo_project``` directory wasn't in your path.
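
Because the entries are searched in order, it can help to print them one per line with their position. A minimal sketch (the variable names and the labels are illustrative; the entries you see will depend on your installation and on how Python was started):

```python
import os
import sys

# Python tries these entries from top to bottom when it looks for a module file.
for position, entry in enumerate(sys.path):
    shown = entry if entry else "'' (the current directory)"
    kind = "directory" if os.path.isdir(entry or os.getcwd()) else "not a directory"
    print(f"{position}: {shown} [{kind}]")
```

Entries labelled "not a directory" are not necessarily a problem: zip archives are allowed on the path, and entries that do not exist are simply skipped.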

Because ```sys.path``` is just a list, we can fix this issue by adding another list element, specifying the location of our module file. Try:

```sys.path.append('/path/to/your/fibo_project')```

Replace ```/path/to/your``` with the appropriate location on your system. The change applies only to the current Python session.

Try importing the ```fibo.py``` module now. If you've set your path correctly, it should import as normal.

Modifying your path is a useful way to re-use modules across multiple Python projects. To get Python to import a module from another directory, just add that directory to the path that is used when you execute your current project.
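
The whole sequence (failed import, path modification, successful import) can be reproduced in one go. This is a sketch under stated assumptions: a temporary directory stands in for your `fibo_project/` folder, and the module is given the hypothetical name `fibo_elsewhere` so that it cannot clash with a `fibo` module you have already imported.

```python
import sys
import tempfile
from pathlib import Path

FIB2_SOURCE = """\
def fib2(n):
    result = []
    a, b = 0, 1
    while a < n:
        result.append(a)
        a, b = b, a+b
    return result
"""

# Assumption for this demo only: a throwaway folder that is NOT on sys.path yet.
project = Path(tempfile.mkdtemp())
(project / "fibo_elsewhere.py").write_text(FIB2_SOURCE)

try:
    import fibo_elsewhere
    found_before = True
except ModuleNotFoundError as err:
    found_before = False
    print("Before:", err)          # No module named 'fibo_elsewhere'

sys.path.append(str(project))      # tell Python where the module lives

import fibo_elsewhere
print("After:", fibo_elsewhere.fib2(10))
print("Loaded from:", fibo_elsewhere.__file__)
```

The `__file__` attribute is a handy way to confirm which copy of a module Python actually picked up when the same file name exists in more than one directory on the path.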

See below for some more formal ways of using paths.

### *Extra 1: Running Python scripts*

We have just demonstrated one way to run Python: use a text editor to develop some functions, and then import and run them from a ```python``` session. IDEs like Spyder and VS Code have both of these elements (text editor and terminal) incorporated within them.

Let's quickly look at an alternative way of running these commands, using a *script*. A script is just a list of Python commands that are executed in order. 

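Create another file in the ```fibo_project``` directory called ```fibo_script.py```, and paste in the following text. Note the ```print```: unlike an interactive session, a script does not display a returned value unless you print it.

```
import fibo

print(fibo.fib2(10))
```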

Now, you can run this script non-interactively from the command line using:

```python fibo_script.py```

And you should see the output printed to the screen.

A script is a good way to start developing Python functions, before you turn them into modules, or to keep track of a set of commands that you might otherwise run in an interactive session.
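
A single file can serve as both a module and a script. The Python tutorial linked at the top of this worksheet describes the usual idiom: commands placed under `if __name__ == "__main__":` run only when the file is executed directly, not when it is imported. A minimal sketch (the function name `main` is an illustrative choice, not something Python requires):

```python
def fib2(n):
    """Return the Fibonacci series up to n as a list."""
    result = []
    a, b = 0, 1
    while a < n:
        result.append(a)
        a, b = b, a+b
    return result


def main():
    # Commands that should only run when the file is used as a script.
    print(fib2(10))


if __name__ == "__main__":
    # True when run as "python this_file.py", False when the file is imported.
    main()
```

Saved as a ```.py``` file, this prints the series when run from the command line, while importing it from another file gives access to `fib2` without printing anything.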

### *Extra 2: Modifying \__init__\.py*

Let's go back to your ```fibo_pack``` package and have a closer look at ```__init__.py```.

```__init__.py``` is a Python script that is executed when a package is initialised (i.e., the first time it is imported in a session). You can see this behaviour by adding a print statement to ```__init__.py```:

```
print("Initialising fibo_pack...")
```

Try running the import statements again (e.g. ```from fibo_pack import fibo_mod1```), and you should see your message.

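We can also use ```__init__.py``` to modify the way that our modules are imported. Let's try importing our whole package at the top level (rather than grabbing one of the modules within the package). Start a fresh Python session first, because a module that you have already imported from the package stays attached to it for the rest of the session: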

```import fibo_pack```

Now try running one of the functions from one of the modules using:

```
fibo_pack.fibo_mod1.fib(10)
```

You'll see an error message like this:

```AttributeError: module 'fibo_pack' has no attribute 'fibo_mod1'```

This happens because the empty ```__init__.py``` file doesn't automatically load our modules into the namespace of ```fibo_pack```. If we want to be able to import and use our package in this way, we need to explicitly load the modules in ```__init__.py```. For example, add the following line to your ```__init__.py```:

```
from . import fibo_mod1
```

This line says "from the current package (the directory containing this ```__init__.py```), import the fibo_mod1 module".

Let's try importing our package again (**you'll probably need to exit and restart your Python session**). The following should now work:

```
import fibo_pack
fibo_pack.fibo_mod1.fib(10)
```
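
The difference between the two versions of ```__init__.py``` can be seen side by side without restarting Python. This is a sketch under stated assumptions: it builds two throwaway packages with the hypothetical names `pack_empty` and `pack_loaded` in a temporary directory, identical apart from the contents of `__init__.py`.

```python
import sys
import tempfile
from pathlib import Path

FIB2_SOURCE = """\
def fib2(n):
    result = []
    a, b = 0, 1
    while a < n:
        result.append(a)
        a, b = b, a+b
    return result
"""

# Assumption for this demo only: two throwaway packages in a temporary folder.
root = Path(tempfile.mkdtemp())
for name, init_source in [
    ("pack_empty", ""),                            # blank __init__.py
    ("pack_loaded", "from . import fibo_mod2\n"),  # loads the module on import
]:
    (root / name).mkdir()
    (root / name / "__init__.py").write_text(init_source)
    (root / name / "fibo_mod2.py").write_text(FIB2_SOURCE)

sys.path.insert(0, str(root))

import pack_empty
import pack_loaded

print(hasattr(pack_empty, "fibo_mod2"))    # False: nothing loaded the module
print(hasattr(pack_loaded, "fibo_mod2"))   # True: __init__.py imported it
print(pack_loaded.fibo_mod2.fib2(10))      # [0, 1, 1, 2, 3, 5, 8]
```

Using two differently named packages sidesteps the need to restart, since Python only runs a package's `__init__.py` the first time that package is imported in a session.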


### *Extra 3: Paths and environments*

This section demonstrates advanced conda/pip usage, and is only for the more adventurous!

A common way to manage your path in Python is to use the same method you might use to install other packages. Two common package managers are [pip](https://pypi.org/project/pip/) and [conda](https://docs.conda.io/en/latest/). Conda is the package manager for the Anaconda distribution, so those of you who have downloaded Anaconda will already have conda installed. 

It is considered best practice to set up a Python [virtual environment](https://realpython.com/python-virtual-environments-a-primer/) for each type of project you are working on. A Python virtual environment is essentially a set of Python packages. When you have that environment activated, you have access to those packages, but no others. You might have different environments depending on the type of project you are building. For example, a data science environment might have Pandas, Scikit-learn and similar packages installed in it, whereas a numerical modelling environment might have libraries that compile parts of the Python code, such as Numba. Ring-fencing these sets of packages in different environments guards against potential problems, such as conflicting package dependencies (e.g., if two packages themselves depend on different versions of the same package).

The instructions here are for Conda. Similar instructions for Pip are [here](https://packaging.python.org/en/latest/guides/installing-using-pip-and-virtual-environments/); once you have an environment set up, check out the "editable" option as described [here](https://pip-python3.readthedocs.io/en/latest/reference/pip_install.html#options).

You can create a Conda environment by typing the following (as outlined in Conda's documentation on [managing environments](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#)):

```
conda create --name myenv package1 package2 ...
```

where ```myenv``` is the name of your environment, and ```package1``` and ```package2``` are the names of packages you'd like to include in the new environment (e.g., ```numpy```, ```pandas```, etc.).

Activate your new environment using:

```
conda activate myenv
```

You can *add your own package* to the environment as follows. First, install the ```conda-build``` package (run ```conda install conda-build``` with your environment activated). Then navigate to your project directory and type:

```
conda develop .
```

As usual, the period here says "this directory".

This will add your directory to your environment, but in a state where you can edit the contents. So in this way, you've added your package to your path by modifying the virtual environment that you're working in.
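
One way to check whether a module can be found, without actually importing it, is to ask the import machinery directly. A minimal sketch using the standard library's `importlib.util.find_spec`; the helper name `is_importable` and the deliberately non-existent module name are illustrative:

```python
import importlib.util


def is_importable(module_name):
    """Return True if Python can currently find a top-level module with this name."""
    return importlib.util.find_spec(module_name) is not None


print(is_importable("json"))                   # True: part of the standard library
print(is_importable("no_such_module_12345"))   # False: hypothetical name that should not exist
```

With your environment activated, and assuming you ran the command from your ```fibo_project/``` directory, `is_importable("fibo")` should then return `True` whichever directory you launched Python from.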

### *Extra resources:*
    
Python docs: https://docs.python.org/3/tutorial/modules.html#modules

Good blog explaining modules, packages, paths: https://www.devdungeon.com/content/python-import-syspath-and-pythonpath-tutorial

More details on structuring your project: https://www.devdungeon.com/content/python-import-syspath-and-pythonpath-tutorial