# Using your own modules

When you import a module, Python searches a list of directories on your computer. If a module with the requested name is found in one of those directories, Python will import it. The directories that are searched can be viewed within a Python environment using `sys.path`.

In [None]:
import sys

print(sys.path)

If you run the above code, you will see the locations on your computer that Python searches for packages when running this Jupyter notebook. The first entry in the list is typically the location of whichever script you are running (i.e., the directory of this Jupyter notebook in our case). If you put your own modules in any of the listed locations, you will be able to import them into another script.
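As a rough illustration of how `sys.path` controls what can be imported, the sketch below writes a tiny module into a temporary directory and tries to import it before and after adding that directory to `sys.path`. The module name `path_demo` and its `GREETING` variable are made up for this example and are not part of the course files.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical module, created only for this demonstration.
    (Path(tmp) / "path_demo.py").write_text("GREETING = 'hello from path_demo'\n")

    # The temporary directory is not on sys.path yet, so the import fails.
    try:
        import path_demo
        found_before = True
    except ModuleNotFoundError:
        found_before = False

    # Once the directory is listed in sys.path, the same import succeeds.
    sys.path.append(tmp)
    importlib.invalidate_caches()
    try:
        import path_demo
        found_after = True
        greeting = path_demo.GREETING
    finally:
        # Leave sys.path as we found it.
        sys.path.remove(tmp)

print(found_before, found_after, greeting)
```

Editing `sys.path` at runtime like this is handy for experiments, but for real projects it is usually tidier to keep your modules next to the script that imports them.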

## What is a module?

A module can be any Python script or directory. Because modules can be either files or directories, the hierarchy of directories containing files is usually described by calling the directories packages and the files modules. You can read about packages and modules in the Python documentation [here](https://docs.python.org/3/tutorial/modules.html#packages).

To illustrate that any directory can be imported as a module (even though a directory is only useful if it contains some kind of usable file), we can import the empty directory called "empty/" that was included with these notebooks.

In [4]:
import empty

print(empty)

<module 'empty' (<_frozen_importlib_external.NamespaceLoader object at 0x7ff7ef56fb10>)>


However, we can't import directories or scripts that don't exist. Python checks if the file or directory exists and, if it does, it is imported.

In [5]:
import non_existant

print(non_existant)

ModuleNotFoundError: No module named 'non_existant'

As you might have guessed, there's not much point in importing empty directories. We can't do anything with it now that we have imported it. This is just to illustrate that Python simply treats any directory or Python script as an importable object.
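If you would like to check whether Python can find a module without actually importing it, one option is `importlib.util.find_spec()`, which returns `None` when no top-level module with that name is found. The helper name `is_importable` below is just an illustrative choice, and `json` stands in for any module that ships with Python.

```python
import importlib.util


def is_importable(name: str) -> bool:
    """Return True if Python can locate a top-level module called `name`."""
    # find_spec() searches sys.path without executing the module's code.
    return importlib.util.find_spec(name) is not None


print(is_importable("json"))          # part of the standard library
print(is_importable("non_existant"))  # no such file or directory
```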

## Importing local scripts as modules - scripts

Now that we have an idea of what Python can and can't import, let's do something useful with it. Let's import the script "a.py" that was included with these notebooks on Canvas. To import a script that is in the same directory as the script you are running, you can simply `import <script name>` (without the ".py" extension).

Before we import the script, let's take a look at its contents. If you open the script, you'll see the following:

```python
print("Module a is being loaded...")

def say_hi():
	print("hi from a.say_hi()!!!")
```

So the script "a.py" contains a print command and a function definition. If we were to just run that script in our terminal, the `print()` command would produce the message in our terminal and the function would be defined, but because the function is never called we wouldn't see its message. Let's see what happens when we import the module into this Jupyter notebook.

In [1]:
import a

Module a is being loaded...


The `print()` command in the module script ran and we see the output printed above. The same would happen if you created a script which imported the module "a" and ran that script in your terminal. When a script is imported, all of its top-level code (assignments, `def` statements, function calls, etc.) is executed. The names defined by that code are then available as attributes of the module in the script which imported it. That means that we now have access to the function `say_hi()` which was defined in the script "a.py".

In [2]:
a.say_hi()

hi from a.say_hi()!!!
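If you don't have "a.py" to hand, the behaviour can be reproduced with a throwaway copy. The sketch below assumes a stand-in module called `a_demo` (a made-up name) written to a temporary directory, and captures what is printed when it is imported twice. Python keeps imported modules in `sys.modules`, so the top-level code only runs on the first import.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

# Stand-in for "a.py"; the module name a_demo is hypothetical.
MODULE_SOURCE = '''print("Module a_demo is being loaded...")

def say_hi():
    print("hi from a_demo.say_hi()!!!")
'''

with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "a_demo.py").write_text(MODULE_SOURCE)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    captured = io.StringIO()
    try:
        with contextlib.redirect_stdout(captured):
            import a_demo  # first import: the top-level print() runs
            import a_demo  # second import: reuses the cached module
            a_demo.say_hi()
    finally:
        sys.path.remove(tmp)

lines = captured.getvalue().splitlines()
print(lines)
```

The loading message appears once even though the module is imported twice, which is also why changes to a module file are not picked up until the kernel is restarted.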


## Importing local scripts as modules - directories

What if we want to use directories to organize our scripts? Perhaps we have a few different module scripts, each containing a different set of related functions. In that case we might create a directory like the "my_module" directory included with these Jupyter notebooks. That directory contains three script files, each of which would hold definitions for a different purpose. For example, "my_module/file_handling.py" might contain functions for reading and writing files.

Using multiple scripts to organize our different project components is a great way to make the development of a complicated project simpler. There is no minimum size for Python scripts so you are free to break up your code into as many or as few files as you find most useful. In the example directory, "file_handling.py" or something like it seems like a great place to collect all of the code related to file handling. If we ever want to add or modify code relating to that activity, we know just where to find it. The names of the other two scripts are questionable. "functions.py" is pretty generic, and since an iterator is just one kind of function, "iterators.py" is not much more specific. It is less clear what we would plan to store in each of those files. While the names could be improved, they'll suffice for our purposes today.

Let's start by importing just one of the scripts, "my_module/file_handling.py" so that we can access the function defined within. First, let's consider the contents of the script.

```python
def say_hi():
	print("hi from my_module.file_handling.say_hi()!!!")

```

It looks a lot like the "a.py" script. Let's import it and call the function to see that we have imported it correctly. To import scripts that are inside of a directory (or, you could say, modules within a package), we simply use an import statement that mirrors that path, with dots in place of slashes.

In [4]:
import my_module.file_handling

my_module.file_handling.say_hi()

hi from my_module.file_handling.say_hi()!!!


As you can see, we have now imported the `file_handling` module within the `my_module` package and we are able to call the function `say_hi()`. If we would prefer to not have to refer to `file_handling.say_hi()` as an attribute of `my_module` then we can use `from Y import X` to just import the `file_handling` module.

In [6]:
from my_module import file_handling
file_handling.say_hi()

hi from my_module.file_handling.say_hi()!!!


We could also import the `say_hi()` function directly using the same syntax, but simply referring to the path from which the function can be imported.

In [7]:
from my_module.file_handling import say_hi
say_hi()

hi from my_module.file_handling.say_hi()!!!


Let's import another of the modules in the `my_module` package, the "functions.py" module. That file looks like this:

```python
def say_hi():
	print("hi from my_module.functions.say_hi()!!!")
```

We can import it in exactly the same way.

In [8]:
from my_module import functions
functions.say_hi()

hi from my_module.functions.say_hi()!!!


As you can see, both of these modules have a function with the same name. If we keep the imports such that each function is retained as an attribute of its containing module, then that situation is sustainable (if a strange design choice).

As in the other importing notebook, to really follow along with the following examples, you will need to restart the kernel of your notebook. You can do that by clicking the circular arrow at the top of the window. I will prompt you whenever you need to do it again within this document.

RESTART KERNEL

In [9]:
from my_module import file_handling
from my_module import functions

file_handling.say_hi()

hi from my_module.file_handling.say_hi()!!!


In [10]:
functions.say_hi()

hi from my_module.functions.say_hi()!!!


However, if we were to import both functions directly, their names would collide.

In [11]:
from my_module.file_handling import say_hi
from my_module.functions import say_hi

say_hi()

hi from my_module.functions.say_hi()!!!


As you can see, only the second function that we imported is running. That's because when the second import command executed, it overwrote the first function.
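One way to keep both functions available when their names collide is to give each an alias with `as`. The sketch below uses `sqrt` from the standard library modules `math` and `cmath` as a stand-in for the two `say_hi()` functions, so that it runs without the course files. The alias names `real_sqrt` and `complex_sqrt` are arbitrary choices for this example.

```python
import cmath

# Importing the same name twice: the second import rebinds `sqrt`.
from math import sqrt
from cmath import sqrt

print(sqrt is cmath.sqrt)  # the name now refers to cmath's version

# Aliases give each function its own name, so both stay usable.
from math import sqrt as real_sqrt
from cmath import sqrt as complex_sqrt

print(real_sqrt(4))
print(complex_sqrt(-4))
```

The same pattern would apply to the example package, e.g. importing each `say_hi` under a different alias, although keeping the functions as attributes of their modules is often the clearer choice.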

Finally, the above examples don't do anything useful. Let's take a look at one more module just to see that we can import something worth using. There is a third script in the "my_module/" dir which includes the iterator function we wrote in class together to iterate over the digits of an integer. Let's import that and use the function we wrote.

In [14]:
from my_module import iterators

for i in iterators.iter_int(54321):
    print(i)

50000
4000
300
20
1


In addition to running that function, we can also call `help()` to see how to use the function, just as if we had defined the function within our current script.

In [15]:
help(iterators.iter_int)

Help on function iter_int in module my_module.iterators:

iter_int(num: int) -> int
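The source of `iter_int` is not shown in this notebook, so the following is only a guess at an implementation that is consistent with the output above. It assumes a non-negative integer as input. The docstring and the `Iterator[int]` return annotation are additions made for this sketch; a docstring is what `help()` would display beneath the signature.

```python
from typing import Iterator


def iter_int(num: int) -> Iterator[int]:
    """Yield the place value of each digit of a non-negative integer.

    Values are produced from the most significant digit to the least,
    e.g. 54321 gives 50000, 4000, 300, 20, 1.
    """
    digits = str(num)
    for position, digit in enumerate(digits):
        # The exponent counts how many digits follow the current one.
        yield int(digit) * 10 ** (len(digits) - position - 1)


for value in iter_int(54321):
    print(value)
```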



## Writing scripts to be either run or imported

As we saw when importing `a`, the `print()` command in the script was run. Consider a situation in which you wrote a script that did a number of time consuming things, but also had some useful function definitions that you wanted to use elsewhere. You might want to be able to import the functions without executing the whole script. This is a common thing to want to do with Python scripts. In fact there are many situations in which you might want to be able to import a script without executing it.

We can do that by taking advantage of one of the special attributes that Python assigns to scripts and modules when they are executed or imported. Specifically, Python assigns each script a name according to how it is run or imported. The name of each script is stored in the special attribute "\_\_name\_\_". Special attributes like "\_\_name\_\_" are often referred to as "dunder" attributes as they have a double underscore on each side.

Generally, double underscores are used in Python to indicate attributes that are used by the Python language directly as opposed to variables that you define. We'll talk more about dunder attributes and methods when we start working with making our own classes in a later week.

The way we can use the "\_\_name\_\_" attribute is by checking if the script is the main script being executed or if it has been imported. Whenever a script is directly executed, it will be assigned the name "\_\_main\_\_". However, when a script is imported, it is given a different name: its module name, which is normally the file name without the ".py" extension. Try it out yourself by making a script containing only the command `print(__name__)` and either executing the script or importing it into another script.
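The experiment described above can also be automated. The sketch below writes a one-line script to a temporary directory, runs it as a separate Python process, and then imports it. The file name "show_name.py" is made up for this example.

```python
import importlib
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical one-line script that reports its own name.
    script = Path(tmp) / "show_name.py"
    script.write_text("print(__name__)\n")

    # Case 1: execute the script directly in a new Python process.
    completed = subprocess.run(
        [sys.executable, str(script)],
        capture_output=True,
        text=True,
        check=True,
    )
    name_when_run = completed.stdout.strip()

    # Case 2: import the same file as a module.
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import show_name  # this also prints the module's name
        name_when_imported = show_name.__name__
    finally:
        sys.path.remove(tmp)

print(name_when_run)
print(name_when_imported)
```

The first value is the name the script received when it was executed, and the second is the name it received when it was imported.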

Knowing that "\_\_name\_\_" will be "\_\_main\_\_" only when a script is executed, we can simply put any code we only want to run when executing a script behind an `if` check.

In [18]:
# Imagine this is a script.

def something():
    print("look at me go!!!")
    
if __name__ == "__main__":
    something()

If you run the above block in this notebook, it will run the function. However, if you copy that block into a script and then import it into another script, the function will not be run.

## \_\_init\_\_.py

The way the "my_module/" dir is set up means that if we want to use any of the modules within, we need to import them using their full path. i.e., we can't do the following:

RESTART KERNEL

In [1]:
import my_module

my_module.file_handling.say_hi()

AttributeError: module 'my_module' has no attribute 'file_handling'

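The error message may look wrong, since the "my_module/" directory does contain a file called "file_handling.py". However, the message is accurate: until the `file_handling` module has been imported, the `my_module` package object has no attribute with that name. If we import the module explicitly, the call works.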

In [2]:
import my_module.file_handling

my_module.file_handling.say_hi()

hi from my_module.file_handling.say_hi()!!!


The reason for this behaviour is that when you import a package consisting of solely a directory containing some modules, the package is imported without Python then recursing through the sub-packages and modules inside the package. Instead, when you set up a package as we have set up `my_module`, you need to specify the modules within `my_module` that you want. We can see exactly what is imported from `my_module` using the `dir()` command.

RESTART KERNEL

In [2]:
import my_module

# dir() without inputs prints everything in our current namespace
print(dir())

['In', 'Out', '_', '_1', '__', '___', '__builtin__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', '_dh', '_i', '_i1', '_i2', '_ih', '_ii', '_iii', '_oh', 'exit', 'get_ipython', 'my_module', 'open', 'quit']


We can see that `my_module` has been added to our namespace along with a lot of other things that we won't worry about here. What about the namespace of the `my_module` module?

In [3]:
print(dir(my_module))

['__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__']


That doesn't include any of the modules contained within `my_module`. They have not been imported. What if we import two of them directly?

In [1]:
import my_module.functions
import my_module.file_handling

print(dir(my_module))

['__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'file_handling', 'functions']


Importing the two modules directly loads them in our script as part of the `my_module` namespace.
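The same effect can be checked programmatically with `hasattr()`. The sketch below builds a throwaway package without an \_\_init\_\_.py file in a temporary directory; the names `demo_pkg` and `greetings` are invented for this example and are not part of the course files.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical package: a directory holding one module, no __init__.py.
    package_dir = Path(tmp) / "demo_pkg"
    package_dir.mkdir()
    (package_dir / "greetings.py").write_text(
        "def say_hi():\n    return 'hi from demo_pkg.greetings'\n"
    )

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import demo_pkg
        # Importing the package alone does not load the module inside it.
        before = hasattr(demo_pkg, "greetings")

        import demo_pkg.greetings
        # Importing the submodule attaches it to the package as an attribute.
        after = hasattr(demo_pkg, "greetings")
    finally:
        sys.path.remove(tmp)

print(before, after)
```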

What if we want to be able to just import the top-level package and then have access to all of the modules within? That's what \_\_init\_\_.py files are for. When you import a package (i.e., a directory), any \_\_init\_\_.py file within it is executed, meaning you can write code there to control how the import of your package behaves.

For our example module, let's just set it up so that importing `my_module` also makes available the modules within. To make that change, you need to add an \_\_init\_\_.py file within the "my_module/" dir. Make that file and add the following code to the file before proceeding.

```python
import my_module.functions
import my_module.file_handling
import my_module.iterators
```

You should then see the modules within `my_module` loaded in the `my_module` namespace upon import.

RESTART KERNEL

In [1]:
import my_module
print(dir(my_module))

['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'file_handling', 'functions', 'iterators', 'my_module']


We can now run the functions in `my_module` without having to import everything individually within our script.

In [2]:
my_module.functions.say_hi()

hi from my_module.functions.say_hi()!!!
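The whole \_\_init\_\_.py workflow can be rehearsed without touching the notebook files. The sketch below creates a temporary package named `init_demo_pkg` (a made-up name) whose \_\_init\_\_.py uses a relative import, `from . import functions`. That is an alternative spelling, not the one used above, and it has the advantage of not repeating the package name inside the package.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical package with one module and an __init__.py.
    package_dir = Path(tmp) / "init_demo_pkg"
    package_dir.mkdir()
    (package_dir / "functions.py").write_text(
        "def say_hi():\n    return 'hi from init_demo_pkg.functions'\n"
    )
    # The leading dot means "this package", whatever it is called.
    (package_dir / "__init__.py").write_text("from . import functions\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # Only the top-level package is imported here; __init__.py
        # takes care of loading the module inside it.
        import init_demo_pkg
        message = init_demo_pkg.functions.say_hi()
    finally:
        sys.path.remove(tmp)

print(message)
```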
