<h1>Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Imports" data-toc-modified-id="Imports-1">Imports</a></span><ul class="toc-item"><li><span><a href="#Debugging" data-toc-modified-id="Debugging-1.1">Debugging</a></span></li><li><span><a href="#Namespaces" data-toc-modified-id="Namespaces-1.2">Namespaces</a></span></li><li><span><a href="#Star-imports" data-toc-modified-id="Star-imports-1.3">Star imports</a></span></li><li><span><a href="#Selective-imports" data-toc-modified-id="Selective-imports-1.4">Selective imports</a></span></li></ul></li><li><span><a href="#Methods-revisited" data-toc-modified-id="Methods-revisited-2">Methods revisited</a></span></li></ul></div>

In [1]:
import os
import sys
sys.path.insert(0, os.path.abspath(os.path.join('.', 'examples')))

# Modules

In the [previous lesson](functions.ipynb) we learned how to define our own [functions](extras/glossary.md#function). I mentioned the possibility of re-using functions in a different file from the file in which the functions are defined. This is the main thing that we will learn about in this lesson.

You may reasonably ask why we would want to do this. If we are writing a program that requires one or more custom-defined functions, why not just define them in the same file that contains the program in which they are used? After all, this simpler one-file structure is the way that the [fun_facts.py](examples/fun_facts.py) example program is organized, and that program was written by a well-respected expert programmer.

For a simple program like *fun_facts.py*, a single file is fine, and is even a good idea. Multiple files are unnecessary for simple, short programs. But for more complex programs, separating functions off from the rest of the program has some advantages:

* **Clarity**. The functions file shows the algorithmic workings of each component of the program, while the main program file shows how those components are combined to create the overall program structure. We know which file to go to if we want to change either the overall structure or the details of the program.
* **Portability**. A file that contains only functions can be re-used by a different program. The alternative, copying and pasting the function definition into every program file that needs it, is prone to mistakes and makes our programs inflexible. This is the '[don't repeat yourself](https://en.wikipedia.org/wiki/Don't_repeat_yourself)' (DRY) principle.

A Python file that contains an actual program that will 'do something' when run is often termed a [script](extras/glossary.md#script), because it is providing the Python [interpreter](extras/glossary.md#interpreter) with line-by-line instructions telling it what to do, like the script for an actor. A file that instead contains only function definitions (and possibly also some variables) intended to be used elsewhere is often termed a [module](extras/glossary.md#module). A basic structure for a simple multi-file Python program is to have one module file providing some functions, along with a script file (the 'main' file) that gets those functions and actually does something with them.

Let's see how this works, using the [initials.py](examples/initials.py) file that we saw in the previous lesson. Take a look at the example program [ids.py](examples/ids.py). This program makes use of the `get_initials()` function that is defined in *initials.py*.

## Imports

We can almost already create a multi-file program given what we have learned so far, and there is very little in *ids.py* that is unfamiliar. We know how to use functions, and we know how to use additional [control statements](extras/glossary.md#control) to determine what actions are taken when, how often, and so on.

The only new ingredient is how to 'get' the contents of one file into another. This is fairly easy. The [keyword](extras/glossary.md#keyword) `import` runs another Python file and then makes its contents (i.e. any functions or variables in it) available in the current file. The [syntax](extras/glossary.md#syntax) for importing the contents of a file is simply to write `import` followed by the name of the file, without the *.py* file extension.

So this is how we can import the contents of the *initials.py* file into another Python file:

In [2]:
import initials

We can see this command on [line 12](examples/ids.py#L12) of *ids.py*.

### Debugging

Note that in order for an `import` statement to work, the file that we are importing must be located somewhere that Python looks for modules. For a module file that we have written ourselves, the simplest such place is the directory in which our main program file is saved. So if you are writing your own example program in the Spyder editor to test the example commands as we go along, make sure that your copy of *initials.py* is located in the same directory in which you have saved your example program.

If we try to import a module file that does not exist, or that is not in any of the directories that Python searches, we get an error:

In [3]:
import nonexistent_module

ModuleNotFoundError: No module named 'nonexistent_module'
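
To see how much the location of the file matters, here is a minimal sketch that does not rely on any of this lesson's example files. It writes a tiny module into a temporary directory and tries to import it before and after telling Python to search that directory. The module name `demo_greeting` and its contents are hypothetical, invented only for this illustration, and the sketch uses a few things (`tempfile`, `with open`, `try`/`except`) that we have not covered in detail.

```python
import os
import sys
import tempfile

# Create a throwaway directory containing a one-line module file.
# 'demo_greeting' is a made-up name used only for this sketch.
module_dir = tempfile.mkdtemp()
module_path = os.path.join(module_dir, 'demo_greeting.py')
with open(module_path, 'w') as module_file:
    module_file.write("print('demo_greeting has been run')\n")

# The new directory is not yet one of the places Python searches,
# so this first attempt fails.
try:
    import demo_greeting
    found_before = True
except ModuleNotFoundError:
    found_before = False

print(found_before)

# Add the directory to the search list, in the same way as the
# first cell of this lesson does for the 'examples' directory.
sys.path.insert(0, module_dir)

# Now the import works. Importing runs the file, so its print() is executed.
import demo_greeting
```

The first attempt prints `False`, and the second import runs the file, which prints its message. Nothing about the file itself changed in between; only the list of directories that Python searches did.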

Note also that module filenames should not contain any spaces. Otherwise when we try to import them, Python sees something that looks like two or more separate names, and this is not valid Python [syntax](extras/glossary.md#syntax):

In [4]:
import badly named module

SyntaxError: invalid syntax (<ipython-input-4-5524f6874f05>, line 1)
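
If you are unsure whether a filename will work as a module name, one rough check is the string method `isidentifier()`, which reports whether a string is a valid Python name. This is only a sketch of the idea. The name `badly_named_module` is hypothetical, and the check is not perfect (it also says yes to reserved keywords such as `class`, which would not work as module names either).

```python
# Candidate module names, written without the .py extension.
candidates = ['initials', 'badly named module', 'badly_named_module']

# Record whether each candidate is a valid Python name.
checks = {}
for candidate in candidates:
    checks[candidate] = candidate.isidentifier()
    print(candidate, checks[candidate])
```

The name containing spaces is the only one of the three that fails the check. Replacing the spaces with underscores is the usual fix.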

### Namespaces

I promised that after [importing](extras/glossary.md#import) our *initials.py* file, the function that it defines would be available for us to use. This doesn't seem to be the case. *initials.py* contains a function called `get_initials()`, but this function still doesn't seem to be available:

In [5]:
get_initials('Mildred Bonk')

NameError: name 'get_initials' is not defined

There is just one more missing ingredient (I really promise this time, only one more). Instead of just taking everything from the imported file and putting it all into individual variables that we can use in the normal way, Python's `import` puts all the imported contents into their own '[namespace](extras/glossary.md#namespace)'. A namespace is a bit like a directory for variables. It stores multiple variables all under the same name, for neatness of organization. We can access an individual variable or function within a namespace by writing first the name of the namespace (which here is simply the name of the imported module), followed by a dot `.`, and then the name of the thing that we want, like this:

In [6]:
initials.get_initials('Mildred Bonk')

'mb'

This is what we see on [lines 20 and 21](examples/ids.py#L20) of *ids.py*, where the `get_initials()` function from *initials.py* is first used.
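
The same pattern applies to any module, not just ones that we write ourselves. As a small illustration that does not depend on this lesson's example files, here is the same dot notation used with `math`, a module that comes with Python (it is not otherwise part of this lesson):

```python
import math

# A function from the math namespace.
root = math.sqrt(16)
print(root)  # 4.0

# A variable from the math namespace.
print(math.pi)
```

Both the function `sqrt()` and the variable `pi` live in the `math` namespace, so both are reached by writing `math.` first.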

Isn't this just an annoying extra complexity, another chance to get something wrong when we write our programs? Like many such things, Python's use of namespaces appears an annoyance at first but is really a blessing in disguise once we start writing more complex programs.

Imagine that you have a very long program containing a lot of variables. And then you decide that you want to import into this program a very useful but also very long [module](extras/glossary.md#module) that provides some great functions that you need. If `import` simply dumped everything from both files together in the same workspace, then you would need to first check carefully and make sure that none of the names of variables or functions in one file were the same as those in the other, because if they were, the names would 'clash' and one would overwrite the other. By keeping imported things in a separate namespace, such accidental overwrites are avoided. It is entirely possible to import a module (for example called *my_module.py*) containing a function or variable called `x` and also to have something called `x` in your main program. The former will be available as `my_module.x` whereas the latter will be available simply as `x`.

Likewise, imagine that you need to import functions from more than one module. Again, if these modules unfortunately happen to contain functions with the same name, they would overwrite each other if simply dumped into the main workspace. But thanks to namespaces, clashing function names are totally fine; one function can be available as, for example, `module_a.useful_function()` and the other as `module_b.useful_function()`.
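
Here is a short sketch of both situations, using two modules that come with Python rather than our own files. `math` and `cmath` (which works with complex numbers) each provide a function called `sqrt()` and a variable called `pi`, and the sketch also defines its own, deliberately crude, variable called `pi`.

```python
import math
import cmath

# Our own variable, with the same name as one inside math.
pi = 3

print(pi)       # 3
print(math.pi)  # 3.141592653589793

# Two different functions with the same name, kept apart by their namespaces.
real_root = math.sqrt(4)
complex_root = cmath.sqrt(-4)

print(real_root)  # 2.0
print(complex_root)
```

None of these names interfere with each other. Our own `pi` is untouched by the import, and each `sqrt()` is reached through the name of the module it came from.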

### Star imports

If we really want to, it is possible to bypass `import`'s creation of separate namespaces for imported modules. The [keyword](extras/glossary.md#keyword) `from`, together with the `*` symbol, imports everything from a module into the main workspace of the current program:

In [7]:
from initials import *

And now the contents of *initials.py* are available in the normal way without prepending `initials.`:

In [8]:
get_initials('Mildred Bonk')

'mb'

This is sometimes termed a 'star import', because of the use of the 'star' symbol `*` (you may also see it called a '[wildcard](extras/glossary.md#wildcard)'). In many contexts in programming, the `*` symbol stands for 'everything' or 'anything'. So the command above says 'import everything from *initials.py* (and dump it all into the main workspace)'.

The star import shortcut is there if you really need it, but the general consensus among Python users is that it is not a good idea. It erases all the benefits that namespaces bring for the robustness and clarity of our program. My advice is to reserve it only for very short programs that only import one module, and whose purpose is simply to demonstrate the use of that one module.

You will sometimes see star imports used in online examples or documentation, to help keep an example short ([here](https://plotnine.readthedocs.io/en/stable/generated/plotnine.geoms.geom_bar.html#examples) is one example). This is fine for examples and demonstrations, but don't copy it into your own programs.
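
To make the danger concrete, here is a small sketch of names being silently overwritten by star imports. It uses the `math` and `cmath` modules that come with Python, both of which contain a `pi` and a `sqrt()`. The variable `pi = 3` stands in for something of our own that we would rather not lose.

```python
# Our own variable.
pi = 3

# This quietly replaces our pi with the one from math.
from math import *
print(pi)  # 3.141592653589793

# This quietly replaces math's sqrt() with the one from cmath.
from cmath import *
root = sqrt(4)
print(root)  # (2+0j)
```

Neither overwrite produces an error or a warning. We only find out when `pi` no longer has the value we gave it, or when `sqrt()` starts returning complex numbers.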

### Selective imports

A better use of the `from` keyword is to select just one thing that we would like to import from a module. For example, we can import just the `get_initials()` function from *initials.py* (admittedly, this is somewhat redundant here, since `get_initials()` is the only thing in *initials.py* anyway):

In [9]:
from initials import get_initials

A 'selective import' like this also makes the individual function `get_initials()` available directly without having to use a namespace.

In [10]:
get_initials('Mildred Bonk')

'mb'

But note that there is a big difference in clarity when compared to the star import. A selective `import` statement at least says explicitly *what* we are importing from the module. So if we begin our main program with `from initials import get_initials`, we and our collaborators can at least be sure when reading the remainder of the program that `get_initials()` refers to something that we have imported from *initials.py* (and that we haven't imported anything else from there). By contrast, with the star import no part of our program explicitly states which functions or variables are coming from where.
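
As a quick check of the claim that nothing else comes along for the ride, here is a sketch using the `math` module that comes with Python. It imports only `sqrt()`, then tries to use `floor()`, which also lives in `math` but was not imported. The `try`/`except` is there only so that the example keeps running after the error.

```python
from math import sqrt

# The function that we asked for is available directly.
print(sqrt(16))  # 4.0

# A function that we did not ask for is not available.
try:
    floor(2.5)
    floor_available = True
except NameError:
    floor_available = False

print(floor_available)  # False
```

Had we written `from math import *` instead, `floor()` would have been available too, along with everything else in `math`, whether we wanted it or not.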

## Methods revisited

We have in fact already met [namespaces](extras/glossary.md#namespace) in a slightly different guise. We have learned about [methods](extras/glossary.md#method): functions that are 'attached' to only one [type](extras/glossary.md#type) of variable. We focused mainly on [string](extras/glossary.md#string) methods, because strings have lots of methods available to them. Each [variable](extras/glossary.md) in Python has its own namespace, and in that namespace are stored links to the methods available for variables of that type.

Recall how this works:

In [11]:
name = 'Mildred Bonk'

name.upper()

'MILDRED BONK'

This is the same 'dot' notation that we just used for getting something from an imported module's namespace, because the underlying mechanism is essentially the same.

Likewise, just as we used the `dir()` function to find out what methods a variable has available to it, we can also use `dir()` to find out the contents of an imported module's namespace (and again, we can ignore for now the 'special' contents surrounded by double underscores `__ __`):

In [12]:
dir(initials)

['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'get_initials']

When we have imported a new module, for example one that we have installed or downloaded from the internet, or have been sent by a colleague, it is a good idea to go to the console and use `dir()` to see what things it has made available to us.
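
For modules with a lot of contents, the 'special' names can get in the way. One possible approach, shown here as a sketch with the `math` module that comes with Python, is to loop over the result of `dir()` and keep only the names that do not begin with a double underscore:

```python
import math

# Collect the names in the module's namespace, skipping the 'special' ones.
public_names = []
for name in dir(math):
    if not name.startswith('__'):
        public_names.append(name)

print(public_names)
```

The same loop works for any imported module, for example `dir(initials)` in place of `dir(math)`.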

## Special names

By now you will of course be burning with curiosity about those things that keep appearing surrounded by double underscores `__ __`. We have encountered them twice: once when we used `dir()` to see a list of [methods](extras/glossary.md#method) available for a variable, and once again just now when we used `dir()` to see a list of the contents of an imported [module](extras/glossary.md#module).