# Python Imports



## Basics

**Modules**
- A *module* is a file with a `.py` extension
- Modules can also be run as scripts

**Packages**
- *Packages* are collections of modules in a hierarchical structure
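- A regular package needs an `__init__.py` file for Python to recognise it as such (since Python 3.3 a directory without one can still be imported as a *namespace package*)
- A package can have *subpackages* (which need their own `__init__.py` files)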

**Python Standard Library**
- Modules that come automatically with Python
- Includes utilities for interacting with the interpreter itself (e.g. `sys`)
- Full list [here](https://docs.python.org/3/library/index.html)

**Importing Syntax**
- `import sys`: import the whole package/module
- `from sys import argv, exit`: import specific objects
- `import numpy as np`: import under an alias
- `from scipy.stats import norm`: chain subpackages/modules with dots
- `from sys import *`: import all objects in the module
    - *Highly discouraged:* Leads to confusion over what is imported and from where
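
As a quick check of the forms above, here is a minimal sketch that uses only standard library modules (instead of `numpy` and `scipy`) so it runs anywhere. The alias `coll` is an arbitrary name chosen for this illustration.

```python
# Each import form from the list above, using only the standard library.
import math                        # whole module; names are accessed as math.<name>
from math import pi, sqrt          # specific objects, bound directly in this namespace
import collections as coll         # module imported under an alias
from urllib.parse import urlparse  # dotted path: module `parse` inside package `urllib`

print(math.floor(2.7))             # 2
print(sqrt(16))                    # 4.0
print(round(pi, 2))                # 3.14
print(coll.Counter("aab")["a"])    # 2
print(urlparse("https://docs.python.org/3/library/index.html").netloc)  # docs.python.org
```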

## The Search Path

When importing, Python searches for the module in a list of directories in the following order:
1. The current directory (if the interpreter is being run interactively) or the directory containing the script being run
2. The list of directories in the `PYTHONPATH` environment variable
3. The installation-dependent default directories (the standard library and `site-packages`, including those of an active virtual environment)

**Important:** Python stops looking as soon as it finds a module with the desired name. So, if there is a module in the current directory called `numpy`, Python will find and import that, not the installed `numpy` package.
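
The "first match wins" behaviour can be reproduced without touching any real package. The sketch below is only an illustration: the module name `shadow_demo` and the helper `first_match_wins` are made up, and two temporary directories stand in for "the current directory" and "the installed location".

```python
import importlib
import sys
import tempfile
from pathlib import Path


def first_match_wins():
    """Put two same-named modules on sys.path and report which one gets imported."""
    with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
        # Two different files that share the (hypothetical) module name shadow_demo.
        Path(first, "shadow_demo.py").write_text("WHERE = 'first'\n")
        Path(second, "shadow_demo.py").write_text("WHERE = 'second'\n")

        saved_path = list(sys.path)
        sys.path[:0] = [first, second]  # `first` is searched before `second`
        importlib.invalidate_caches()   # make sure the new files are noticed
        try:
            module = importlib.import_module("shadow_demo")
            return module.WHERE
        finally:
            # Undo the changes so the demo leaves no trace behind.
            sys.path[:] = saved_path
            sys.modules.pop("shadow_demo", None)


print(first_match_wins())  # first
```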

We can access the search path through `sys.path`.

In [1]:
# the path.py script prints the paths in sys.path
!python example1/path.py

/Users/c.leonard/P/python-playground/env/bin/python: can't open file '/Users/c.leonard/P/python-playground/imports/example1/path.py': [Errno 2] No such file or directory


Note that the first path is the directory containing the script.
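
The contents of `path.py` are not shown in this notebook. A minimal sketch of what such a script might look like, assuming it simply prints each entry of `sys.path` through a `print_path()` function (the name used in a later cell):

```python
import sys


def print_path():
    """Print every entry of sys.path on its own line, in search order."""
    for entry in sys.path:
        print(entry)


if __name__ == "__main__":
    # Only runs when the file is executed as a script, not when it is imported.
    print_path()
```

An empty string in `sys.path` shows up as a blank line in this output.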

In [2]:
# by default PYTHONPATH is empty
!echo $PYTHONPATH




In [3]:
# but we can add to it (separate paths with :s)
!export PYTHONPATH=/Users/c.leonard/P$PYTHONPATH; echo $PYTHONPATH

/Users/c.leonard/P


In [4]:
# note that the PYTHONPATH is searched after the directory of the script
!export PYTHONPATH=/Users/c.leonard/P$PYTHONPATH; python example1/path.py

/Users/c.leonard/P/python-playground/env/bin/python: can't open file '/Users/c.leonard/P/python-playground/imports/example1/path.py': [Errno 2] No such file or directory
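
The effect of `PYTHONPATH` can also be checked from Python itself. The sketch below uses only the standard library: it starts a child interpreter with `PYTHONPATH` set to a temporary directory and returns the child's `sys.path`. The helper name `child_sys_path` is made up for this illustration.

```python
import json
import os
import subprocess
import sys
import tempfile


def child_sys_path(extra_dir):
    """Return sys.path as seen by a child interpreter started with PYTHONPATH=extra_dir."""
    env = dict(os.environ, PYTHONPATH=extra_dir)
    result = subprocess.run(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


with tempfile.TemporaryDirectory() as extra:
    paths = child_sys_path(extra)
    # The temporary directory should appear near the front, after the first entry.
    print(paths[:3])
```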


In [5]:
# if we import the module, the first entry of sys.path is the current directory, not the module's directory
from abs_pkg import path

path.print_path()

/Users/c.leonard/P/python-playground/imports
/opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python310.zip
/opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10
/opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/lib-dynload

/Users/c.leonard/P/python-playground/env/lib/python3.10/site-packages


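(The blank line in the output is the empty string `''`, which Python treats as the current working directory. It appears to be inserted by IPython at start-up, just before `site-packages`, rather than coming from the `JUPYTER_PATH` environment variable.)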

In [6]:
# we can add to the path
import sys

sys.path.insert(0, "/Users/c.leonard/P")
path.print_path()

/Users/c.leonard/P
/Users/c.leonard/P/python-playground/imports
/opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python310.zip
/opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10
/opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/lib-dynload

/Users/c.leonard/P/python-playground/env/lib/python3.10/site-packages


## Importing Your Own Files

### Absolute Imports

When developing, we often want to import objects from elsewhere in our project. An absolute import spells out the full dotted path to the module, starting from a directory on `sys.path`.

The `abs_pkg` directory has the following structure:
```
abs_pkg
├── __init__.py
├── path.py
├── start.py
├── subpkg_a
│   ├── __init__.py
│   └── mod_a.py
└── subpkg_b
    ├── __init__.py
    └── mod_b.py
```

The modules import each other as follows:

In [7]:
# abs_pkg/subpkg_a/mod_a.py
!cat abs_pkg/subpkg_a/mod_a.py

def simple_a():
    print("This is simple A")


In [8]:
# abs_pkg/subpkg_b/mod_b.py
!cat abs_pkg/subpkg_b/mod_b.py

from subpkg_a.mod_a import simple_a

def simple_b():
    print("This is simple B")
    simple_a()


In [9]:
# abs_pkg/start.py
!cat abs_pkg/start.py

from subpkg_b import mod_b

mod_b.simple_b()


We can run `start.py` as a script with no problems:

In [10]:
!python abs_pkg/start.py

This is simple B
This is simple A


But `mod_b.py` gives us an error:

In [11]:
!python abs_pkg/subpkg_b/mod_b.py

Traceback (most recent call last):
  File "/Users/c.leonard/P/python-playground/imports/abs_pkg/subpkg_b/mod_b.py", line 1, in <module>
    from subpkg_a.mod_a import simple_a
ModuleNotFoundError: No module named 'subpkg_a'


When we run `start.py`, the first entry of `sys.path` is the `abs_pkg` directory (the directory containing the script). When `start.py` imports `mod_b`, Python therefore finds `subpkg_a.mod_a` on the path. But when we run `mod_b.py` directly, the first entry is `abs_pkg/subpkg_b`, which doesn't contain `subpkg_a`, so the import fails.

**Remember:** imports in Python are always relative to something (in this case, the entries of `sys.path`).
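
To reproduce this behaviour without the original files, the sketch below recreates the `abs_pkg` layout in a temporary directory and runs both scripts in a child interpreter. The file contents mirror the cells above; the names `ABS_FILES`, `run_script` and `run_abs_demo` are made up for this illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# File contents mirroring the abs_pkg cells above (package root = temp directory).
ABS_FILES = {
    "start.py": "from subpkg_b import mod_b\n\nmod_b.simple_b()\n",
    "subpkg_a/__init__.py": "",
    "subpkg_a/mod_a.py": "def simple_a():\n    print('This is simple A')\n",
    "subpkg_b/__init__.py": "",
    "subpkg_b/mod_b.py": (
        "from subpkg_a.mod_a import simple_a\n\n"
        "def simple_b():\n"
        "    print('This is simple B')\n"
        "    simple_a()\n"
    ),
}


def run_script(root, relative_path):
    """Run one file as the top-level script and capture its output."""
    return subprocess.run(
        [sys.executable, str(Path(root, relative_path))],
        capture_output=True,
        text=True,
    )


def run_abs_demo():
    with tempfile.TemporaryDirectory() as root:
        for name, source in ABS_FILES.items():
            target = Path(root, name)
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(source)
        # sys.path[0] is the package root here, so subpkg_a is found.
        works = run_script(root, "start.py")
        # sys.path[0] is subpkg_b here, so subpkg_a is not found.
        fails = run_script(root, "subpkg_b/mod_b.py")
        return works, fails


works, fails = run_abs_demo()
print(works.returncode, works.stdout)
print(fails.returncode, fails.stderr.splitlines()[-1])
```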

### Relative Imports

Alternatively, we can import objects from elsewhere in the package by specifying a relative path. We prefix the module path with `.` to refer to the current package, `..` to refer to the parent package, etc.

The `rel_pkg` directory has the same structure as `abs_pkg`, but with relative imports.

In [20]:
# subpkg_b/mod_b.py
!cat rel_pkg/subpkg_b/mod_b.py

from ..subpkg_a.mod_a import simple_a

def simple_b():
    print("This is simple B")
    simple_a()


In [24]:
# start.py
!cat rel_pkg/start.py

from .subpkg_b import mod_b

mod_b.simple_b()


But now we get an error when we run `start.py`:

In [41]:
!python rel_pkg/start.py

Traceback (most recent call last):
  File "/Users/c.leonard/P/python-playground/imports/rel_pkg/start.py", line 1, in <module>
    from .subpkg_b import mod_b
ImportError: attempted relative import with no known parent package


However, we can import `start.py` with no issue:

In [43]:
from rel_pkg import start

The issue comes from how Python loads scripts.

## Modules vs Scripts

There are two ways to load a Python file:
1. As the *top-level script*
    - A file is loaded as the top-level script if you run it directly, e.g. via `python rel_pkg/start.py`
    - There can only be one top-level script at a time
    - Its `__name__` is set to `"__main__"`
2. As a module
    - A file is loaded as a module if it's imported, either in the top-level script or as part of another module
    - Its `__name__` is set to the file name, preceded by the names of (sub)packages above it, e.g. `rel_pkg.start`
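
The difference is easy to observe with a one-line file that prints its own `__name__`. In this sketch the file `name_demo.py` and the helper `names_seen` are hypothetical; the file is first run as a script and then imported, each time in a fresh child interpreter.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def names_seen(directory):
    """Run name_demo.py as a script, then import it; return both __name__ values."""
    Path(directory, "name_demo.py").write_text("print(__name__)\n")
    as_script = subprocess.run(
        [sys.executable, "name_demo.py"],
        cwd=directory, capture_output=True, text=True, check=True,
    )
    as_module = subprocess.run(
        [sys.executable, "-c", "import name_demo"],
        cwd=directory, capture_output=True, text=True, check=True,
    )
    return as_script.stdout.strip(), as_module.stdout.strip()


with tempfile.TemporaryDirectory() as tmp:
    print(names_seen(tmp))  # ('__main__', 'name_demo')
```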

In [44]:
# in Jupyter the interactive interpreter is the top-level script
__name__

'__main__'

In [52]:
# here mod_a.py is loaded as a module
from abs_pkg.subpkg_a import mod_a

mod_a.__name__

'abs_pkg.subpkg_a.mod_a'

When you import a module, its `__name__` is the dotted path used to reach it, starting from the top-level package found on `sys.path`.

In [48]:
# abs_pkg/name.py
!cat abs_pkg/name.py

from subpkg_a import mod_a

print(mod_a.__name__)


In [49]:
!python abs_pkg/name.py

subpkg_a.mod_a


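Note that `mod_a` now has a different name from when we imported it in the notebook. Running `name.py` as a script puts the `abs_pkg` directory itself at the front of `sys.path`, so `subpkg_a` is found as a top-level package and `abs_pkg` never appears in the name.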

**If a module's name has no dots, it is not considered to be part of a package regardless of where the file actually is on disk.**

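Relative imports use `__name__` (more precisely `__package__`, which is normally derived from it) to determine where a file is in the package. If `__name__` contains no dots (or not enough dots for the number of leading `.`s), then relative imports fail. In particular, *a file run directly as a script can't do relative imports*.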
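
The standard library exposes this name arithmetic as `importlib.util.resolve_name(name, package)`. The sketch below uses it to show how the leading dots are resolved against a package name. It only illustrates the string resolution step, not the whole import machinery, and the helper `can_resolve` is made up for this example.

```python
from importlib.util import resolve_name

# mod_b.py imported as rel_pkg.subpkg_b.mod_b has __package__ == "rel_pkg.subpkg_b",
# so ".." climbs one level up to "rel_pkg".
print(resolve_name("..subpkg_a.mod_a", "rel_pkg.subpkg_b"))  # rel_pkg.subpkg_a.mod_a

# start.py imported as rel_pkg.start has __package__ == "rel_pkg".
print(resolve_name(".subpkg_b", "rel_pkg"))  # rel_pkg.subpkg_b


def can_resolve(name, package):
    """Return True if the relative name can be resolved against the package."""
    try:
        resolve_name(name, package)
    except (ImportError, ValueError):  # ValueError on Python versions before 3.9
        return False
    return True


# A top-level script has no parent package, so there is nothing to resolve against.
print(can_resolve(".subpkg_b", ""))  # False

# Not enough levels: ".." from the top-level package "subpkg_b" goes beyond the top.
print(can_resolve("..subpkg_a.mod_a", "subpkg_b"))  # False
```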

## Executing Modules as Scripts

You can execute modules as scripts using `python -m package.module`. Note that you must omit the `.py` suffix and separate package levels with `.` instead of `/`.

This has a few consequences:
- The first path in `sys.path` is set to the current directory, not the directory containing the module
- The `__name__`s of all files are set relative to the current directory
- Relative imports for the top-level script are determined using `__package__`, which is set relative to the current directory

In [54]:
# note that the first path is the current directory
!python -m abs_pkg.path

/Users/c.leonard/P/python-playground/imports
/opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python310.zip
/opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10
/opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/lib-dynload
/Users/c.leonard/P/python-playground/env/lib/python3.10/site-packages


In [65]:
# the __name__ of mod_a is relative to the current directory
!python -m rel_pkg.name

rel_pkg.subpkg_a.mod_a


In [57]:
# relative imports now work from the top-level script
!python -m rel_pkg.start

This is simple B
This is simple A


In [66]:
# the same holds for a module inside a subpackage (no output, since mod_b.py only defines a function)
!python -m rel_pkg.subpkg_b.mod_b
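
The contrast between the two ways of running `start.py` can be reproduced end to end. The sketch below rebuilds the `rel_pkg` layout in a temporary directory, with file contents mirroring the cells above, and runs `start.py` first as a plain script and then with `-m`. The names `REL_FILES` and `run_both_ways` are made up for this illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# File contents mirroring the rel_pkg cells above.
REL_FILES = {
    "rel_pkg/__init__.py": "",
    "rel_pkg/start.py": "from .subpkg_b import mod_b\n\nmod_b.simple_b()\n",
    "rel_pkg/subpkg_a/__init__.py": "",
    "rel_pkg/subpkg_a/mod_a.py": "def simple_a():\n    print('This is simple A')\n",
    "rel_pkg/subpkg_b/__init__.py": "",
    "rel_pkg/subpkg_b/mod_b.py": (
        "from ..subpkg_a.mod_a import simple_a\n\n"
        "def simple_b():\n"
        "    print('This is simple B')\n"
        "    simple_a()\n"
    ),
}


def run_both_ways():
    """Run rel_pkg/start.py as a script and as a module from the same directory."""
    with tempfile.TemporaryDirectory() as root:
        for name, source in REL_FILES.items():
            target = Path(root, name)
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(source)
        # Plain script: __name__ is "__main__", so the relative import has no parent package.
        as_script = subprocess.run(
            [sys.executable, "rel_pkg/start.py"],
            cwd=root, capture_output=True, text=True,
        )
        # With -m: __package__ is "rel_pkg", so the relative import resolves.
        as_module = subprocess.run(
            [sys.executable, "-m", "rel_pkg.start"],
            cwd=root, capture_output=True, text=True,
        )
        return as_script, as_module


as_script, as_module = run_both_ways()
print(as_script.returncode, as_script.stderr.splitlines()[-1])
print(as_module.returncode, as_module.stdout)
```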

## Conventions

- Imports should be grouped (alphabetically within each group) in the following order, with a blank line between groups
    1. Standard library imports
    2. Related third party imports
    3. Local application/library specific imports
- Absolute imports are recommended over relative imports
- See [PEP 8](https://peps.python.org/pep-0008/#imports) for details

## References

- [Complete Guide to Imports in Python](https://www.pythonforthelab.com/blog/complete-guide-to-imports-in-python-absolute-relative-and-more/)
- [Python Standard Library](https://docs.python.org/3/library/index.html)
- [StackOverflow - Relative Imports](https://stackoverflow.com/questions/14132789/relative-imports-for-the-billionth-time/14132912#14132912)
- [PEP 8](https://peps.python.org/pep-0008/)