# Imports within a project. How do they work?


Many, many, *many* of you reading this will have worked on Python packages in the past and been burned by Python's... "obtuse" approach to importing from local files. This particular complaint is one I've heard a lot from budding programmers and was also one of my own pain points as I was learning (and still is to some extent!). However, it actually follows a few simple rules and what you can (and can't) do can be puzzled out fairly easily if you know what you're doing.

## Rule #1 The Python Path is King

Whenever you import a package, Python looks through all the "paths" it has listed and goes to each location to check if there is a matching package in that directory. Most of these directories are not important (for you - Python still needs them!), but there are two key locations which you should know about: ``site-packages``, and the directory your code is run from (the folder containing the script you launched or, in an interactive session or notebook, the current working directory).

If you don't know where these are, you can easily check using Python's built-in ``sys`` module.

```Python
import sys

for i, p in enumerate(sys.path):
    print(f'#{i+1}', p)
```

```
#1 C:\Users\jderrick\GitHub\ScriptTipFriday\local python imports
#2 c:\users\jderrick\appdata\local\programs\python\python38\python38.zip
#3 c:\users\jderrick\appdata\local\programs\python\python38\DLLs
#4 c:\users\jderrick\appdata\local\programs\python\python38\lib
#5 c:\users\jderrick\appdata\local\programs\python\python38
#6 
#7 c:\users\jderrick\appdata\local\programs\python\python38\lib\site-packages
#8 c:\users\jderrick\appdata\local\programs\python\python38\lib\site-packages\win32
#9 c:\users\jderrick\appdata\local\programs\python\python38\lib\site-packages\win32\lib
#10 c:\users\jderrick\appdata\local\programs\python\python38\lib\site-packages\Pythonwin
```


Of all these outputs, the two we care about are #1 and #7. #1 is the location in which this Jupyter notebook is running. #7 is where all my installed Python packages can be found. If I were to import ``numpy``, say, Python would check and find it in ``site-packages``.
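If you'd rather ask Python directly where it would find something, ``importlib.util.find_spec`` reports the file a top-level name resolves to, or ``None`` if nothing on the path matches. Here's a minimal sketch of that lookup; ``hidden_module_demo`` is a made-up module name, written to a temporary directory that is deliberately *not* on the path.

```python
import importlib.util
import tempfile
from pathlib import Path

# A standard-library module lives in one of the directories on sys.path.
spec_json = importlib.util.find_spec("json")
print("json found at:", spec_json.origin)

with tempfile.TemporaryDirectory() as tmp:
    # A perfectly valid module file, but its directory is not on sys.path...
    (Path(tmp) / "hidden_module_demo.py").write_text("VALUE = 42\n")
    # ...so as far as the import system is concerned it does not exist.
    spec_hidden = importlib.util.find_spec("hidden_module_demo")

print("hidden_module_demo found:", spec_hidden)
```

The second lookup comes back empty even though the file is sitting right there on disk, which is the whole point: no path entry, no module.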

> Now ``sys.path`` is actually a list and one which you can alter dynamically whilst the program is running. As you can imagine, that is a recipe for disaster if you don't know what you're doing. So best to avoid it unless you are sure you know what you're about.

So why am I going on about this? Well, the Python path is King. From Python's perspective anything anywhere else *does not exist*. If I do an import I need to make sure it is relative to one of these paths. 

For example, if I have the following file structure:

```
.
├── my_script.py
└── src/
    └── script_package/
        ├── __init__.py
        ├── __main__.py
        └── script.py
```

And in my script (aptly named) ``my_script.py`` I want to import a function from ``script.py`` called ``func``. The import statement will look like this.

```Python
from src.script_package.script import func
```

So far, so sensible. When I run ``my_script.py``, the first entry on the Python path is the directory containing ``my_script.py``, so Python can see there's a directory called ``src``. The dot notation then tells it to go into that directory, look for a directory called ``script_package``, and so on until it reaches a file from which it can import the function ``func``. OK, so let's make it a little more complex.
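Before moving on, here is the simple case as something you can actually run. This is only a sketch: it rebuilds the file structure above in a temporary directory (leaving out ``__main__.py``, which isn't needed here), and the body of ``func`` is invented purely for illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    package_dir = root / "src" / "script_package"
    package_dir.mkdir(parents=True)

    (package_dir / "__init__.py").write_text("")
    # Hypothetical body for func; the article never says what it does.
    (package_dir / "script.py").write_text(
        "def func():\n    return 'hello from func'\n"
    )
    (root / "my_script.py").write_text(
        "from src.script_package.script import func\nprint(func())\n"
    )

    # Running the script puts its own directory first on sys.path,
    # so 'src' is visible as the start of the dotted import.
    result = subprocess.run(
        [sys.executable, "my_script.py"],
        cwd=root,
        capture_output=True,
        text=True,
    )

print(result.stdout.strip())
```

The script runs in a separate process so that its Python path is set up exactly as it would be if you launched ``my_script.py`` from a terminal.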


### A little more complex...

Let's say ``script_package`` is intended to become a Python package and be installed into ``site-packages``. That means all the imports inside ``script_package`` need to work as if the Python path sat just one level above ``script_package`` rather than two, as it does now.

![package-site](./media/package-sight.PNG)


OK so, if ``__init__.py`` wanted to import ``func`` from ``script.py``, how would you do it? Unfortunately, no single absolute import satisfies both paths at the same time: ``from src.script_package.script import func`` only works from two levels up, and ``from script_package.script import func`` only works from one level up. This is a serious problem for our development! We need both to work. Well, there *is* something we can do. We can use relative imports.

## Rule #2: Relative imports let you break Rule #1 in a very specific and limited way...

Python gets around this problem by letting you use relative imports. Currently, if you wanted to import ``func`` from ``script.py`` into the ``__init__.py`` sitting next to it, whilst running the code from two levels up, you would have to write the import relative to the level the code is run from.


Your ``__init__.py`` would look like this.

```Python
from src.script_package.script import func
```

And this is not flexible, as we've already discussed. Instead Python will (begrudgingly) let you use *relative imports*. In other words, it will look at your file and say "OK I'll look at directories relative to this file, *if I must*" which makes imports so much more flexible. Relative imports make our ``__init__.py`` look like this.

```Python
from .script import func
```

The ``.`` tells Python to look in the directory this file is in and the syntax matches the fairly universal relative path syntax where ``.`` represents the current directory. 
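To convince yourself that the relative import really does satisfy both layouts, you can try something like the sketch below. It rebuilds the package in a temporary directory and imports it twice in fresh Python processes, once with the path two levels above ``script_package`` and once with it one level above. The helper ``run_snippet`` and the body of ``func`` are my own inventions for the demo.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_snippet(code, cwd):
    """Run a one-line program with `cwd` as the first entry on sys.path."""
    return subprocess.run(
        [sys.executable, "-c", code], cwd=cwd, capture_output=True, text=True
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    package_dir = root / "src" / "script_package"
    package_dir.mkdir(parents=True)
    (package_dir / "script.py").write_text(
        "def func():\n    return 'hello from func'\n"
    )
    # The relative import from the article, resolved against the package itself.
    (package_dir / "__init__.py").write_text("from .script import func\n")

    # Development layout: the path sits two levels above script_package.
    from_root = run_snippet(
        "from src.script_package import func; print(func())", root
    )
    # Installed-style layout: the path sits one level above script_package.
    from_src = run_snippet(
        "from script_package import func; print(func())", root / "src"
    )

print(from_root.stdout.strip())
print(from_src.stdout.strip())
```

The same ``__init__.py`` works in both cases because the import inside it never mentions ``src`` at all.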

You can even extend this to ``..`` to look one level *up* (and ``...`` for two levels up, and so on). However, if this climbs to or beyond the level of the Python path, it will fail with ``ImportError: attempted relative import beyond top-level package``, regardless of whether the destination exists or not. Note this means it fails if the import exceeds the Python path *or matches it*. If that sounds a bit confusing, I've made a nice diagram that should make things a bit clearer.

![relative imports going too far](./media/relative_too_far.png)

Why does it behave like this? I don't know man, I didn't do it.
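Whatever the reason, the behaviour is easy to reproduce. In this sketch, ``reach_up.py`` is a hypothetical module inside ``script_package`` that uses ``..`` to climb one level. Imported with the path two levels up, ``..`` lands on ``src`` and all is well. Imported with the path one level up, ``..`` lands on the path directory itself and Python refuses.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_snippet(code, cwd):
    """Run a one-line program with `cwd` as the first entry on sys.path."""
    return subprocess.run(
        [sys.executable, "-c", code], cwd=cwd, capture_output=True, text=True
    )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    package_dir = root / "src" / "script_package"
    package_dir.mkdir(parents=True)
    (package_dir / "__init__.py").write_text("")
    # Hypothetical module that reaches one level above script_package.
    (package_dir / "reach_up.py").write_text("from .. import script_package\n")

    # Path two levels up: '..' lands on 'src', which is still below the path.
    ok = run_snippet("import src.script_package.reach_up", root)
    # Path one level up: '..' lands on the path directory itself, so it fails.
    too_far = run_snippet("import script_package.reach_up", root / "src")

print("two levels up, exit code:", ok.returncode)
print("one level up, last line of the error:")
print(too_far.stderr.strip().splitlines()[-1])
```

Same file, same import statement, and whether it works depends entirely on where the Python path sits.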

## Rule #3 You can always hack it together with the Python path

Remember how I said you can manipulate the Python path dynamically? Well, you can always get sick of all the above and just add various directories to the path until it works. This can be a spectacularly bad way of doing things so I do not recommend it, but it is a tool available to you. Sometimes you might find you have to do it (or something like it) to get around some strange requirement in the spec. You should recognise it for what it is (a filthy hack) but I say let the person without dirty hands throw the first stone.
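For completeness, the hack looks roughly like this. It's a sketch rather than a recommendation: ``path_hack_demo`` is a made-up module written to a temporary directory, and the ``try``/``finally`` is there so the path gets put back the way we found it.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # A module living somewhere Python would never normally look.
    (Path(tmp) / "path_hack_demo.py").write_text("ANSWER = 42\n")

    # The hack: shove the directory onto the front of the path.
    sys.path.insert(0, tmp)
    try:
        # The file was created while Python was running, so refresh the
        # import system's directory caches before importing it.
        importlib.invalidate_caches()
        module = importlib.import_module("path_hack_demo")
        answer = module.ANSWER
    finally:
        # Clean up after ourselves so later imports are not affected.
        sys.path.remove(tmp)

print(answer)
```

If you do end up doing this in anger, tidying up afterwards like this at least limits how far the mess can spread.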

## Rule #4 Caveats & things to watch

There are a couple of things to watch out for with imports that can cause problems, but aren't strictly rules for imports.

### Caveat #1 - Circular Imports

If you encounter a circular import you really have two options open to you. One is the "right" way to handle it and the other is a filthy hack, but it's a lot easier. Let's set up an example.

We have two files, ``functions.py`` and ``classes.py``.

In ``functions.py`` it says

```Python
from .classes import KEYS
import random

def random_key():
    # random.choice needs a sequence and KEYS is a set, so convert it first
    return random.choice(sorted(KEYS))
```

In ``classes.py`` it says

```Python
from .functions import random_key
from dataclasses import dataclass


KEYS = {'a', 'b', 'c'}

@dataclass
class KeyHolder:
    name: str
    _key: str = ''
    
    @property
    def key(self):
        if not self._key:
            self._key = random_key()
        return self._key
```

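Hopefully the problem here is clear. If we were to run a script that imported the class ``KeyHolder``, for example, the script would import ``classes.py``, which would import ``functions.py``, which would try to import ``KEYS`` from ``classes.py``, which hasn't finished loading yet because it is still waiting on ``functions.py``... Python refuses to chase its own tail, raises an ``ImportError`` and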

![AAAAAA](./media/aaaa_bird.jpg)

Quite. So how to fix this?
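If you want to see the failure for yourself before fixing it, the sketch below writes the two files into a throwaway package and tries the import in a fresh Python process. The package name ``keys_pkg`` is made up for the demo, and ``random_key`` sorts ``KEYS`` into a list because ``random.choice`` needs a sequence rather than a set.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

FUNCTIONS_PY = textwrap.dedent("""\
    from .classes import KEYS
    import random

    def random_key():
        return random.choice(sorted(KEYS))
""")

CLASSES_PY = textwrap.dedent("""\
    from .functions import random_key
    from dataclasses import dataclass

    KEYS = {'a', 'b', 'c'}

    @dataclass
    class KeyHolder:
        name: str
        _key: str = ''

        @property
        def key(self):
            if not self._key:
                self._key = random_key()
            return self._key
""")

with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / "keys_pkg"  # hypothetical package name
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")
    (package_dir / "functions.py").write_text(FUNCTIONS_PY)
    (package_dir / "classes.py").write_text(CLASSES_PY)

    # Import in a separate process so the failure does not affect this one.
    result = subprocess.run(
        [sys.executable, "-c", "from keys_pkg.classes import KeyHolder"],
        cwd=tmp,
        capture_output=True,
        text=True,
    )

# The last line of the traceback names the import that could not be completed.
print(result.stderr.strip().splitlines()[-1])
```

On recent Python versions the message even says the failure is most likely due to a circular import, which is about as helpful as an error gets.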

#### The "right" way

Refactor your code my guy! You messed up and now you have to pay for it. In the example it's pretty straightforward. If we move the ``KEYS`` constant to ``functions.py`` we're good. Of course your situation may be more complex, but that just means there's no easy way out where I can give you some simple instructions to follow.
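For the example at hand, the refactor might look something like the sketch below, with ``KEYS`` living in ``functions.py`` so that the imports only flow one way. As assumptions of mine: the package is called ``keys_pkg`` purely for the demo, and ``KEYS`` has become a tuple so that ``random.choice`` can index into it.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# functions.py now owns KEYS and imports nothing from classes.py.
FUNCTIONS_PY = textwrap.dedent("""\
    import random

    KEYS = ('a', 'b', 'c')

    def random_key():
        return random.choice(KEYS)
""")

# classes.py still imports from functions.py, but nothing imports back.
CLASSES_PY = textwrap.dedent("""\
    from .functions import random_key
    from dataclasses import dataclass

    @dataclass
    class KeyHolder:
        name: str
        _key: str = ''

        @property
        def key(self):
            if not self._key:
                self._key = random_key()
            return self._key
""")

with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / "keys_pkg"  # hypothetical package name
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")
    (package_dir / "functions.py").write_text(FUNCTIONS_PY)
    (package_dir / "classes.py").write_text(CLASSES_PY)

    result = subprocess.run(
        [
            sys.executable,
            "-c",
            "from keys_pkg.classes import KeyHolder; print(KeyHolder('demo').key)",
        ],
        cwd=tmp,
        capture_output=True,
        text=True,
    )

# Prints whichever of the three keys was picked at random.
print(result.stdout.strip())
```

With the cycle gone, the import order stops mattering and the property works as intended.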

#### The "terrible" way

The quick and awful way is to make use of the fact that when Python imports a file it runs everything at the top level of that file. But the bodies of functions and methods aren't run at that point. They are *defined* but not executed until something calls them. In other words, if we were to alter ``classes.py`` to read this instead...

```Python
from dataclasses import dataclass


KEYS = {'a', 'b', 'c'}

@dataclass
class KeyHolder:
    name: str
    _key: str = ''
    
    @property
    def key(self):
        from .functions import random_key
        if not self._key:
            self._key = random_key()
        return self._key
```

it will work. We moved the import into the function itself so that it isn't executed when the file is imported, only when the ``key`` property is accessed, thus breaking the circular import.

As someone who has had to do this before, don't. Refactoring is the solution!

### Caveat #2 - Name Clashes

Be careful with your names! This is pretty obvious if you think about it but it's an easy trap to fall into. If you make a file called ``numpy.py`` whilst you also have NumPy installed, your file will usually be found first (its directory comes earlier on the Python path than ``site-packages``), so ``import numpy`` picks up your file instead of the real thing and you are going to have a bad day. If you're working with PyAnsys, don't have a file called ``ansys.py``, because that exists in PyAnsys already somewhere. Try to keep your file and directory names distinct, short, memorable and not clashing with anything else or you'll end up like the screaming bird.

![aaaaa birds](./media/aaaa_bird2.jpg)

and no one wants that.
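If you'd like to watch a clash happen without breaking your real NumPy install, here is a small sketch using the standard library instead. It drops a local ``colorsys.py`` (the name of a genuine standard-library module) into a temporary directory and imports ``colorsys`` from there in a fresh Python process.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # A local file that happens to share its name with a standard-library module.
    (Path(tmp) / "colorsys.py").write_text("# my own colour helpers\n")

    # With the temporary directory first on the path, the local file wins,
    # so the real module's rgb_to_hsv function is nowhere to be found.
    result = subprocess.run(
        [
            sys.executable,
            "-c",
            "import colorsys; print(hasattr(colorsys, 'rgb_to_hsv'))",
        ],
        cwd=tmp,
        capture_output=True,
        text=True,
    )

print("real colorsys was imported:", result.stdout.strip())
```

Nothing crashes at import time, which is what makes this one so sneaky. The error only turns up later, when you try to use something the real module has and your file doesn't.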

## Finally... Import Playground

If you check this notebook out on my GitHub repo (found [here](https://github.com/jgd10/ScriptTipFriday)), you can also find a simple package I wrote for the purposes of this article. It has enough depth and interlacing functions that you can try out almost every example here yourself (although I couldn't bring myself to recreate the circular imports). Go play! 