<img src="../../images/banners/python-modular.png" width="600"/>

# <img src="../../images/logos/python.png" width="23"/> Python Modules


## <img src="../../images/logos/toc.png" width="20"/> Table of Contents 
* [Python Modules: Overview](#python_modules:_overview)
* [The Module Search Path](#the_module_search_path)
* [The `import` Statement](#the_`import`_statement)
    * [`import <module_name>`](#`import_<module_name>`)
    * [`from <module_name> import <name(s)>`](#`from_<module_name>_import_<name(s)>`)
    * [`from <module_name> import <name> as <alt_name>`](#`from_<module_name>_import_<name>_as_<alt_name>`)
    * [`import <module_name> as <alt_name>`](#`import_<module_name>_as_<alt_name>`)
* [The `dir()` Function](#the_`dir()`_function)
* [Executing a Module as a Script](#executing_a_module_as_a_script)
* [Reloading a Module](#reloading_a_module)
* [Conclusion](#conclusion)

---

This section explores Python **modules** and Python **packages**, two mechanisms that facilitate **modular programming**.

**Modular programming** refers to the process of breaking a large, unwieldy programming task into separate, smaller, more manageable subtasks or **modules**. Individual modules can then be cobbled together like building blocks to create a larger application.

There are several advantages to **modularizing** code in a large application:
- **Simplicity:** Rather than focusing on the entire problem at hand, a module typically focuses on one relatively small portion of the problem. If you’re working on a single module, you’ll have a smaller problem domain to wrap your head around. This makes development easier and less error-prone.

- **Maintainability:** Modules are typically designed so that they enforce logical boundaries between different problem domains. If modules are written in a way that minimizes interdependency, there is decreased likelihood that modifications to a single module will have an impact on other parts of the program. (You may even be able to make changes to a module without having any knowledge of the application outside that module.) This makes it more viable for a team of many programmers to work collaboratively on a large application.

- **Reusability:** Functionality defined in a single module can be easily reused (through an appropriately defined interface) by other parts of the application. This eliminates the need to duplicate code.

- **Scoping:** Modules typically define a separate [**namespace**](https://realpython.com/python-namespaces-scope/), which helps avoid collisions between identifiers in different areas of a program. (One of the tenets in the [Zen of Python](https://www.python.org/dev/peps/pep-0020) is *Namespaces are one honking great idea—let’s do more of those!*)

**Functions**, **modules** and **packages** are all constructs in Python that promote code modularization.

<a class="anchor" id="python_modules:_overview"></a>

## Python Modules: Overview

There are actually three different ways to define a **module** in Python:
1. A module can be written in Python itself.
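2. A module can be written in **C** and loaded dynamically at run-time as an extension module. The `re` ([**regular expression**](https://realpython.com/regex-python/)) module is a common illustration, although in CPython `re` itself is a Python file that hands the actual matching off to a C engine.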
3. A **built-in** module is intrinsically contained in the interpreter, like the [`itertools` module](https://realpython.com/python-itertools/).


A module’s contents are accessed the same way in all three cases: with the `import` statement.
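
As a rough illustration of the three cases, the sketch below asks the import machinery where a module comes from. The helper name `module_kind` is made up for this example, and the labels printed for each module depend on how your interpreter was built, so treat the output as informational only:

```python
import importlib.machinery
import importlib.util


def module_kind(name):
    """Return a rough label for how the named module is implemented."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "not found"
    origin = spec.origin or ""
    if origin == "built-in":
        # Compiled directly into the interpreter
        return "built-in"
    if origin.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES)):
        # Compiled extension loaded dynamically (.so, .pyd, ...)
        return "C extension"
    if origin.endswith(tuple(importlib.machinery.SOURCE_SUFFIXES)):
        # Ordinary .py file
        return "Python source"
    return "other"


for name in ("sys", "json", "itertools", "re"):
    print(name, "->", module_kind(name))
```

Whatever label a module gets, the `import` statement used to load it is the same.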

Here, the focus will mostly be on modules that are written in Python. The cool thing about modules written in Python is that they are exceedingly straightforward to build. All you need to do is create a file that contains legitimate Python code and then give the file a name with a `.py` extension. That’s it! No special syntax or voodoo is necessary.

For example, suppose you have created a file called `mod.py` containing the following:

> `mod.py`
> ```python
> s = "If Comrade Napoleon says it, it must be right."
> a = [100, 200, 300]
> 
> def foo(arg):
>     print(f'arg = {arg}')
> 
> class Foo:
>     pass
> ```

Several objects are defined in `mod.py`:

- `s` (a string)
- `a` (a list)
- `foo()` (a function)
- `Foo` (a class)


Assuming `mod.py` is in an appropriate location, which you will learn more about shortly, these objects can be accessed by **importing** the module as follows:

In [1]:
import mod
print(mod.s)

If Comrade Napoleon says it, it must be right.


In [2]:
mod.a

[100, 200, 300]

In [3]:
mod.foo(['quux', 'corge', 'grault'])

arg = ['quux', 'corge', 'grault']


In [4]:
x = mod.Foo()
x

<mod.Foo at 0x7fc3302653d0>

Similarly, you can import the `re` module and access its attributes and methods:

In [5]:
import re
dir(re)[:5]

['A', 'ASCII', 'DEBUG', 'DOTALL', 'I']

<a class="anchor" id="the_module_search_path"></a>

## The Module Search Path

Continuing with the above example, let’s take a look at what happens when Python executes the statement:

> ```python
> import mod
> ```

When the interpreter executes the above `import` statement, it searches for `mod.py` in a list of directories assembled from the following sources:

- The directory from which the input script was run or the **current directory** if the interpreter is being run interactively
- The list of directories contained in the [`PYTHONPATH`](https://docs.python.org/3/using/cmdline.html#envvar-PYTHONPATH) environment variable, if it is set. (The format for `PYTHONPATH` is OS-dependent but should mimic the [`PATH`](https://realpython.com/add-python-to-path/) environment variable.)
- An installation-dependent list of directories configured at the time Python is installed


The resulting search path is accessible in the Python variable `sys.path`, which is obtained from a module named `sys`:

In [6]:
import sys
sys.path

['/Users/ali/PERSONAL_DIR/github/pytopia/content/Python/Python/03. Modular Programming',
 '/Users/ali/opt/anaconda3/envs/py38/lib/python38.zip',
 '/Users/ali/opt/anaconda3/envs/py38/lib/python3.8',
 '/Users/ali/opt/anaconda3/envs/py38/lib/python3.8/lib-dynload',
 '',
 '/Users/ali/opt/anaconda3/envs/py38/lib/python3.8/site-packages',
 '/Users/ali/opt/anaconda3/envs/py38/lib/python3.8/site-packages/IPython/extensions',
 '/Users/ali/.ipython']

Thus, to ensure your module is found, you need to do one of the following:

- Put `mod.py` in the directory where the input script is located or the **current directory**, if interactive
- Modify the `PYTHONPATH` environment variable to contain the directory where `mod.py` is located before starting the interpreter
    - **Or:** Put `mod.py` in one of the directories already contained in the `PYTHONPATH` variable
- Put `mod.py` in one of the installation-dependent directories, which you may or may not have write-access to, depending on the OS


There is actually one additional option: you can put the module file in any directory of your choice and then modify `sys.path` at run-time so that it contains that directory. For example, in this case, you could put `mod.py` in directory `/Users/ali` and then issue the following statements:

In [7]:
sys.path.append(r'/Users/ali')
sys.path

['/Users/ali/PERSONAL_DIR/github/pytopia/content/Python/Python/03. Modular Programming',
 '/Users/ali/opt/anaconda3/envs/py38/lib/python38.zip',
 '/Users/ali/opt/anaconda3/envs/py38/lib/python3.8',
 '/Users/ali/opt/anaconda3/envs/py38/lib/python3.8/lib-dynload',
 '',
 '/Users/ali/opt/anaconda3/envs/py38/lib/python3.8/site-packages',
 '/Users/ali/opt/anaconda3/envs/py38/lib/python3.8/site-packages/IPython/extensions',
 '/Users/ali/.ipython',
 '/Users/ali']

Once a module has been imported, you can determine the location where it was found with the module’s `__file__` attribute:

In [8]:
import mod
mod.__file__

'/Users/ali/PERSONAL_DIR/github/pytopia/content/Python/Python/03. Modular Programming/mod.py'

In [9]:
import re
re.__file__

'/Users/ali/opt/anaconda3/envs/py38/lib/python3.8/re.py'

The directory portion of `__file__` should be one of the directories in `sys.path`.
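
The run-time approach can be tried end to end without touching your real directories. The following is a minimal sketch that assumes a throwaway temporary directory and a hypothetical module name, `searchpath_demo`, neither of which is part of the examples above:

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway directory holding a tiny module (hypothetical name).
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "searchpath_demo.py"), "w") as fh:
    fh.write("greeting = 'hello from searchpath_demo'\n")

# The directory is not on the search path yet, so add it at run-time.
sys.path.append(module_dir)
importlib.invalidate_caches()  # make sure the freshly written file is noticed

import searchpath_demo

print(searchpath_demo.greeting)

# The directory portion of __file__ is the directory that was just added.
found_in = os.path.dirname(searchpath_demo.__file__)
print(os.path.realpath(found_in) == os.path.realpath(module_dir))
```

Appending places the new directory at the end of `sys.path`, so a module with the same name found earlier in the list would still take precedence.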

<a class="anchor" id="the_`import`_statement"></a>

## The `import` Statement

**Module** contents are made available to the caller with the `import` statement. The `import` statement takes many different forms, shown below.

<a class="anchor" id="`import_<module_name>`"></a>

### `import <module_name>`

The simplest form is the one already shown above:

```python
import <module_name>
```

Note that this *does not* make the module contents *directly* accessible to the caller. Each module has its own **private symbol table**, which serves as the global symbol table for all objects defined *in the module*. Thus, a module creates a separate **namespace**, as already noted.

> A **symbol table** is a data structure constructed and maintained by the Python compiler that holds the essential information about each identifier (symbol) found in the program's source code, such as its type, value, scope level, and position.

The statement `import <module_name>` only places `<module_name>` in the caller’s symbol table. The *objects* that are defined in the module *remain in the module’s private symbol table*.

From the caller, objects in the module are only accessible when prefixed with `<module_name>` via **dot notation**, as illustrated below.

After the following `import` statement, `mod` is placed into the local symbol table. Thus, `mod` has meaning in the caller’s local context:

In [10]:
import mod

But `s` and `foo` remain in the module’s private symbol table and are not meaningful in the local context:

In [11]:
s

NameError: name 's' is not defined

In [12]:
foo('quux')

NameError: name 'foo' is not defined

To be accessed in the local context, names of objects defined in the module must be prefixed by `mod`:

In [13]:
mod.s

'If Comrade Napoleon says it, it must be right.'

In [14]:
mod.foo('quux')

arg = quux
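
One way to see the module's separate namespace for yourself is to look at the dictionary behind the module object. This is a small sketch using the standard `math` module instead of `mod`, so it runs anywhere; the variable name `module_namespace` is only for illustration:

```python
import math

# vars(module) returns the dictionary that backs the module's namespace.
module_namespace = vars(math)

# The name 'pi' lives in the module's namespace...
print("pi" in module_namespace)

# ...and dot notation is simply a lookup in that namespace.
print(math.pi is module_namespace["pi"])

# Only the module name itself was bound by the import statement.
print("math" in dir(), "pi" in dir())
```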


Several comma-separated modules may be specified in a single `import` statement:

```python
import <module_name>[, <module_name> ...]
```

In [15]:
import re, math, json

<a class="anchor" id="`from_<module_name>_import_<name(s)>`"></a>

### `from <module_name> import <name(s)>`

An alternate form of the `import` statement allows individual objects from the module to be imported *directly into the caller’s symbol table*:

```python
from <module_name> import <name(s)>
```

Following execution of the above statement, `<name(s)>` can be referenced in the caller’s environment without the `<module_name>` prefix:

In [16]:
from mod import s, foo

In [17]:
s

'If Comrade Napoleon says it, it must be right.'

In [18]:
foo('quux')

arg = quux


In [19]:
from mod import Foo
x = Foo()
x

<mod.Foo at 0x7fc3603233a0>

Because this form of `import` places the object names directly into the caller’s symbol table, any objects that already exist with the same name will be *overwritten*:

In [20]:
a = ['foo', 'bar', 'baz']
a

['foo', 'bar', 'baz']

In [21]:
from mod import a
a

[100, 200, 300]

It is even possible to indiscriminately `import` everything from a module in one fell swoop, using the form `from <module_name> import *`.

This will place the names of *all* objects from `<module_name>` into the local symbol table, with the exception of any that begin with the underscore (`_`) character.

For example:

In [23]:
from mod import *
s

'If Comrade Napoleon says it, it must be right.'

In [24]:
a

[100, 200, 300]

In [25]:
foo

<function mod.foo(arg)>

In [26]:
Foo

mod.Foo

> **Note:** **This isn’t recommended.** It’s a bit dangerous because you are entering names into the local symbol table *en masse*. Unless you know them all well and can be confident there won’t be a conflict, you have a decent chance of overwriting an existing name inadvertently. However, this syntax is quite handy when you are just mucking around with the interactive interpreter, for testing or discovery purposes, because it quickly gives you access to everything a module has to offer without a lot of typing.
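
The underscore rule can be checked in isolation. The sketch below builds a throwaway module in memory (the name `star_demo` and its attributes are invented for this example) and runs the wildcard import inside a separate dictionary, so nothing in your own namespace is overwritten:

```python
import sys
import types

# Build a throwaway module in memory and register it so import can find it.
star_demo = types.ModuleType("star_demo")
star_demo.public_value = 1
star_demo._private_value = 2
sys.modules["star_demo"] = star_demo

# Run the wildcard import in an isolated namespace.
namespace = {}
exec("from star_demo import *", namespace)

print("public_value" in namespace)    # True
print("_private_value" in namespace)  # False
```

Running the import inside `exec()` with its own dictionary is only a convenience for the demonstration; in ordinary code the statement would appear at module level.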

<a class="anchor" id="`from_<module_name>_import_<name>_as_<alt_name>`"></a>

### `from <module_name> import <name> as <alt_name>`

It is also possible to `import` individual objects but enter them into the local symbol table with alternate names:

```python
from <module_name> import <name> as <alt_name>[, <name> as <alt_name> …]
```

This makes it possible to place names directly into the local symbol table but avoid conflicts with previously existing names:

In [27]:
s = 'foo'
a = ['foo', 'bar', 'baz']

In [28]:
from mod import s as string, a as alist

In [29]:
s

'foo'

In [30]:
string

'If Comrade Napoleon says it, it must be right.'

In [31]:
a

['foo', 'bar', 'baz']

In [32]:
alist

[100, 200, 300]

<a class="anchor" id="`import_<module_name>_as_<alt_name>`"></a>

### `import <module_name> as <alt_name>`

You can also import an entire module under an alternate name:

```python
import <module_name> as <alt_name>
```

In [33]:
import mod as my_module
my_module.a

[100, 200, 300]

In [34]:
my_module.foo('qux')

arg = qux


Module contents can be imported from within a [function definition](https://realpython.com/defining-your-own-python-function/). In that case, the `import` does not occur until the function is *called*:

In [35]:
def bar():
    from mod import foo
    foo('corge')

In [36]:
bar()

arg = corge
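
To confirm that the import really is deferred, you can inspect `sys.modules`, the dictionary of modules already loaded in the session. This sketch uses the standard `colorsys` module purely because it is rarely imported by anything else; the function name `hue_of_red` is hypothetical:

```python
import sys


def hue_of_red():
    # The import statement runs only when the function body executes.
    import colorsys
    return colorsys.rgb_to_hsv(1.0, 0.0, 0.0)[0]


sys.modules.pop("colorsys", None)  # start from a clean slate for the demo
loaded_before = "colorsys" in sys.modules

hue_of_red()
loaded_after = "colorsys" in sys.modules

print(loaded_before, loaded_after)  # False True
```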


However, **Python 3** does not allow the indiscriminate `import *` syntax from within a function:

In [37]:
def bar():
    from mod import *

SyntaxError: import * only allowed at module level (<ipython-input-37-86f331738b46>, line 1)

Lastly, a [`try` statement with an `except ImportError`](https://realpython.com/python-exceptions/) clause can be used to guard against unsuccessful `import` attempts:

In [38]:
try:
    # Non-existent module
    import baz
except ImportError:
    print('Module not found')

Module not found


In [39]:
try:
    # Existing module, but non-existent object
    from mod import baz
except ImportError:
    print('Object not found in module')

Object not found in module


<a class="anchor" id="the_`dir()`_function"></a>

## The `dir()` Function

The built-in function `dir()` returns a list of defined names in a namespace. Without arguments, it produces an alphabetically sorted list of names in the current **local symbol table**:

In [40]:
# show only the first 10 names, since the full dir() output is long
dir()[:10]

['Foo', 'In', 'Out', '_', '_13', '_17', '_19', '_2', '_20', '_21']

In [41]:
'qux' in dir()

False

In [42]:
qux = [1, 2, 3, 4, 5]
'qux' in dir()

True

In [43]:
class Bar():
    pass

In [44]:
x = Bar()
'Bar' in dir(), 'x' in dir()

(True, True)

Note how the first call to `dir()` above lists several names that are automatically defined and already in the namespace when the interpreter starts. As new names are defined (`qux`, `Bar`, `x`), they appear on subsequent invocations of `dir()`.

This can be useful for identifying what exactly has been added to the namespace by an import statement:

In [45]:
dir()[:10]

['Bar', 'Foo', 'In', 'Out', '_', '_13', '_17', '_19', '_2', '_20']

In [46]:
import mod
'mod' in dir()

True

In [47]:
mod.s

'If Comrade Napoleon says it, it must be right.'

In [48]:
mod.foo([1, 2, 3])

arg = [1, 2, 3]


In [49]:
from mod import a, Foo
'a' in dir(), 'Foo' in dir()

(True, True)

In [50]:
a

[100, 200, 300]

In [51]:
x = Foo()
x

<mod.Foo at 0x7fc330242580>

In [52]:
from mod import s as string
'string' in dir()

True

In [53]:
string

'If Comrade Napoleon says it, it must be right.'
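
Instead of checking names one at a time, you can compare `dir()` before and after an import. This is a minimal sketch run in a fresh namespace, using names from the standard `math` module; the helper variables `before` and `added` are introduced only for the example:

```python
# Snapshot the namespace, import, then see which names are new.
before = set(dir())

from math import pi, sqrt

# 'before' itself was bound after the snapshot, so exclude it.
added = set(dir()) - before - {"before"}
print(sorted(added))  # ['pi', 'sqrt']
```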

When given an argument that is the name of a module, `dir()` lists the names defined in the module:

In [54]:
import mod
dir(mod)

['Foo',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'a',
 'foo',
 's']

In [55]:
from mod import *

for attr in dir(mod):
    print(attr in dir())

True
True
False
True
False
True
True
True
True
True
True
True


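> Note how `__cached__` and `__file__` are not imported. In fact, `import *` skips every name that begins with an underscore; the other dunder names show `True` only because the interactive session's own namespace already defines them.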

<a class="anchor" id="executing_a_module_as_a_script"></a>

## Executing a Module as a Script

Any `.py` file that contains a **module** is essentially also a Python **script**, and there isn’t any reason it can’t be executed like one.

Here again is `mod.py` as it was defined above:

> ***mod.py***

> ```python
> s = "If Comrade Napoleon says it, it must be right."
> a = [100, 200, 300]
> 
> def foo(arg):
>     print(f'arg = {arg}')
> 
> class Foo:
>     pass
> ```

This can be run as a script:

``` bash
$ /Users/john/Documents> python mod.py
```

There are no errors, so it apparently worked. Granted, it’s not very interesting. As it is written, it only *defines* objects. It doesn’t *do* anything with them, and it doesn’t generate any output.

Let’s modify the above Python module so it does generate some output when run as a script:

> ***mod2.py***

> ```python
> s = "If Comrade Napoleon says it, it must be right."
> a = [100, 200, 300]
> 
> def foo(arg):
>     print(f'arg = {arg}')
> 
> class Foo:
>     pass
> 
> print(s)
> print(a)
> foo('quux')
> x = Foo()
> print(x)
> ```

Now it should be a little more interesting:

```bash
$ /Users/john/Documents> python mod2.py
If Comrade Napoleon says it, it must be right.
[100, 200, 300]
arg = quux
<__main__.Foo object at 0x02F101D0>
```

Unfortunately, now it also generates output when imported as a module:

In [56]:
import mod2

If Comrade Napoleon says it, it must be right.
[100, 200, 300]
arg = quux
<mod2.Foo object at 0x7fc3384f28b0>


This is probably not what you want. It isn’t usual for a module to generate output when it is imported.

Wouldn’t it be nice if you could distinguish between when the file is loaded as a module and when it is run as a standalone script?

Ask and ye shall receive.

When a `.py` file is imported as a module, Python sets the special **dunder** variable [`__name__`](https://realpython.com/python-main-function/) to the name of the module. However, if a file is run as a standalone script, `__name__` is (creatively) set to the string `'__main__'`. Using this fact, you can discern which is the case at run-time and alter behavior accordingly:

> ***mod3.py***

> ```python
> s = "If Comrade Napoleon says it, it must be right."
> a = [100, 200, 300]
> 
> def foo(arg):
>     print(f'arg = {arg}')
> 
> class Foo:
>     pass
> 
> if __name__ == '__main__':
>     print('Executing as standalone script')
>     print(s)
>     print(a)
>     foo('quux')
>     x = Foo()
>     print(x)
> ```

Now, if you run as a script, you get output:

```bash
C:\Users\john\Documents>python mod3.py
Executing as standalone script
If Comrade Napoleon says it, it must be right.
[100, 200, 300]
arg = quux
<__main__.Foo object at 0x03450690>
```

But if you import as a module, you don’t:

In [57]:
import mod3
mod3.foo('grault')

arg = grault
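
Both situations can be reproduced from a single Python session. The sketch below writes a small file to a temporary directory (the file name `name_demo.py` and the variable `mode` are invented for this example), loads it once the way `import` would, and runs it once the way `python name_demo.py` would, using the standard `runpy` module:

```python
import importlib.util
import os
import runpy
import tempfile

# A one-line module that records how it was loaded.
source = "mode = 'script' if __name__ == '__main__' else 'module'\n"
path = os.path.join(tempfile.mkdtemp(), "name_demo.py")
with open(path, "w") as fh:
    fh.write(source)

# Load the file the way `import name_demo` would.
spec = importlib.util.spec_from_file_location("name_demo", path)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
print(module.__name__, module.mode)  # name_demo module

# Run the same file the way `python name_demo.py` would.
script_globals = runpy.run_path(path, run_name="__main__")
print(script_globals["__name__"], script_globals["mode"])  # __main__ script
```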


Modules are often designed with the capability to run as a standalone script for purposes of testing the functionality that is contained within the module. This is referred to as **[unit testing](https://realpython.com/python-testing/).** For example, suppose you have created a module `fact.py` containing a **factorial** function, as follows:

> ***fact.py***

> ```python
> def fact(n):
>     # n <= 1 also covers fact(0), which would otherwise recurse without end
>     return 1 if n <= 1 else n * fact(n-1)
> 
> if __name__ == '__main__':
>     import sys
>     if len(sys.argv) > 1:
>         print(fact(int(sys.argv[1])))
> ```

The file can be treated as a module, and the `fact()` function imported:

In [58]:
from fact import fact
fact(6)

720

But it can also be run as a standalone by passing an integer argument on the command-line for testing:

```bash
$ /Users/john/Documents> python fact.py 6
720
```

<a class="anchor" id="reloading_a_module"></a>

## Reloading a Module

For reasons of efficiency, a module is only loaded once per interpreter session. That is fine for function and class definitions, which typically make up the bulk of a module’s contents. But a module can contain executable statements as well, usually for initialization. Be aware that these statements will only be executed the *first time* a module is imported.

Consider the following file `mod4.py`:

> ***mod4.py***

> ```python
> a = [100, 200, 300]
> print('a =', a)
> ```

In [59]:
import mod4

a = [100, 200, 300]


In [60]:
import mod4

In [61]:
import mod4

In [62]:
mod4.a

[100, 200, 300]

The `print()` call is not executed on subsequent imports. (For that matter, neither is the assignment statement, but as the final display of the value of `mod4.a` shows, that doesn’t matter. Once the assignment is made, it sticks.)

If you make a change to a module and need to reload it, you need to either restart the interpreter or use a function called `reload()` from module `importlib`:

In [63]:
import mod4

In [64]:
import mod4

In [65]:
import importlib
mod4 = importlib.reload(mod4)

a = [100, 200, 300]


In [66]:
mod4.a

[100, 200, 300]
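
The cells above reload an unchanged file, so the value stays the same. To see a reload pick up an actual edit, here is a self-contained sketch that writes a hypothetical module, `reload_demo`, to a temporary directory, changes it on disk, and compares a repeated `import` with `importlib.reload()`:

```python
import importlib
import os
import sys
import tempfile

module_dir = tempfile.mkdtemp()
module_path = os.path.join(module_dir, "reload_demo.py")


def write_module(value):
    """(Re)write the module file so that it defines a single name."""
    with open(module_path, "w") as fh:
        fh.write(f"value = {value}\n")


write_module(1)
sys.path.insert(0, module_dir)
importlib.invalidate_caches()

import reload_demo
first = reload_demo.value

write_module(4300)   # edit the file on disk
import reload_demo   # no effect: the module is already loaded
after_plain_import = reload_demo.value

importlib.reload(reload_demo)  # re-executes the module's code
after_reload = reload_demo.value

print(first, after_plain_import, after_reload)  # 1 1 4300
```

Names that were copied out earlier with `from reload_demo import value` would still hold the old object after the reload; only attribute access through the module sees the new value.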

### Autoreload

You can also use an IPython extension to reload modules before executing user code. `autoreload` reloads modules automatically before entering the execution of code typed at the IPython prompt.

This makes the following workflow possible, for example:

```python
>>> %load_ext autoreload
>>> %autoreload 2
>>> from foo import some_function
>>> some_function()
# open foo.py in an editor and change some_function to return 43
>>> some_function()
43
```

The module was reloaded without an explicit reload call, and the object imported with `from foo import ...` was also updated.

**Usage:**
The following magic commands are provided:

- `%autoreload`: Reload all modules (except those excluded by `%aimport`) automatically now.
- `%autoreload 0`: Disable automatic reloading.
- `%autoreload 1`: Reload all modules imported with `%aimport` every time before executing the Python code typed.
- `%autoreload 2`: Reload all modules (except those excluded by `%aimport`) every time before executing the Python code typed.
- `%aimport`: List modules which are to be automatically imported or not to be imported.
- `%aimport foo`: Import module `foo` and mark it to be autoreloaded for `%autoreload 1`.
- `%aimport -foo`: Mark module `foo` to not be autoreloaded.

Let's write a foo module with a function `f` that returns 42:

In [67]:
%%writefile foo.py
# write a function that returns 42
def f():
    return 42

Overwriting foo.py


In [68]:
from foo import f

In [69]:
f()

42

Using `autoreload`, we can reload the `foo` module automatically every time we change `foo.py`, without restarting the kernel or using `importlib`:

In [70]:
# autoreload foo module
%load_ext autoreload
%autoreload 1
%aimport foo

Let's change the `f` function return value to 43:

In [71]:
%%writefile foo.py
# Change the f function in foo module to return 43
def f():
    return 43

Overwriting foo.py


The module is automatically reloaded and `f` now returns 43:

In [72]:
f()

43

<a class="anchor" id="conclusion"></a>

## <img src="../../images/logos/checkmark.png" width="20"/> Conclusion 

In this tutorial, you covered the following topics:

- How to create a Python **module**
- Locations where the Python interpreter searches for a module
- How to obtain access to the objects defined in a module with the `import` statement
- How to create a module that is executable as a standalone script

This will hopefully allow you to better understand how to gain access to the functionality available in the many third-party and built-in modules available in Python.

In the next section, we will learn how packages allow for a hierarchical structuring of the module namespace using dot notation.