# Covered here

- [Overview](#Overview)
- [Imports](#Imports)
    - [Recommended import conventions](#Recommended-import-conventions)
    - [Import syntax](#Import-syntax)
    - [Circular imports](#Circular-imports)
- [The import namespace](#The-import-namespace)
    - [Searching, interaction & hierarchy](#Searching,-interaction-&-hierarchy)
    - [Intra-package references](#Intra-package-references)
    - [Imports within `__init__.py`](#Imports-within-__init__.py)
- [Use of `if __name__ == '__main__'`](#Use-of-if-__name__-==-"__main__")

# Resources & references

* Python docs: [Modules](https://docs.python.org/3/tutorial/modules.html); [The import system](https://docs.python.org/3/reference/import.html)
    * Section 6.4 - [Packages](https://docs.python.org/3/tutorial/modules.html#packages)
* tutorialspoint.com: [Python Modules](https://www.tutorialspoint.com/python/python_modules.htm)
* [StackOverflow: Python importing errors and correct importing guidance](https://stackoverflow.com/questions/44252183/python-importing-errors-and-correct-importing-guidance)
* See also: Chapter 10, "Modules & Packages", of _Python Cookbook_ (Beazley/Jones)

# Overview

A module is a `.py` file containing Python definitions and statements, including any names it imports from other modules. The file name is the module name with the suffix `.py` appended. **Within a module, the module’s name (as a string) is available as the value of the global variable `__name__`.**  To use a module's contents from another file or from the interpreter, you must import it.
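
As a quick illustration of `__name__`, here is a minimal sketch using the standard library's `json` package (chosen only as a convenient example; any importable module would do):

```python
import json
import json.decoder

# Every module object carries its own dotted name in __name__.
print(json.__name__)          # json
print(json.decoder.__name__)  # json.decoder

# Code executed directly as a script lives in a module named "__main__",
# so this line prints "__main__" when the file is run directly.
print(__name__)
```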

Some modules are from the [Python standard library](https://docs.python.org/3/library/), and some are from [external libraries](https://pypi.python.org/pypi).  It is customary to refer to available open-source Python libraries as _batteries_ or _packages_.  **A package is a collection of modules in a folder.**  The main purpose of packages is to help organize modules and provide a naming hierarchy.  You can think of packages as the directories on a file system and modules as files within directories.  NumPy, for instance, is a package containing modules for scientific computing.

# Imports

## Recommended import conventions

Don't use wildcards, ever.

```python
# No
from collections import *
```

Imports should usually be on separate lines.

```python
# No
import os, sys

# Yes
import os
import sys
```

Imports should be grouped in the following order:

1. Standard library imports
2. Related third party imports
3. Local application/library specific imports

You should put a blank line between each group of imports.  Within each grouping, **imports should be sorted lexicographically, ignoring case, according to each module's full package path.**

```python
import collections
import warnings

import pandas as pd
import scipy.stats as scs
```

## Import syntax

* Use `import x` for importing packages and modules. 
* Use `from x import y` where `x` is the package prefix and **`y` is the module name with no prefix**. 
* Use `from x import y as z` if two modules named `y` are to be imported or if `y` is an inconveniently long name.

Without knowing any other information, the `import` convention rather than the `from` convention is generally preferable.

Using `from modu import func` is a way to pinpoint the function you want to import and put it in the global namespace. While much less harmful than `import *` because it shows explicitly what is imported in the global namespace, **it still potentially creates ambiguity.**  (For instance, consider `np.sqrt` versus `math.sqrt`.  If you call `sqrt(4)`, which module is `sqrt` from?)
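
The ambiguity can be sketched with two standard-library modules. This example substitutes `cmath` for NumPy (an assumption made only so the snippet runs without third-party packages); the effect is the same with `np.sqrt`:

```python
import cmath
import math

from math import sqrt
from cmath import sqrt  # silently rebinds the bare name `sqrt`

# The bare name now refers to cmath.sqrt, which returns a complex number.
print(sqrt(4))             # (2+0j)
print(sqrt is cmath.sqrt)  # True

# Qualified names stay unambiguous regardless of import order.
print(math.sqrt(4))   # 2.0
print(cmath.sqrt(4))  # (2+0j)
```

Whichever `from ... import sqrt` runs last wins, and nothing at the call site tells the reader which one that was.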

One other important point:

> Note that when using `from package import item`, the `item` can be either:
> - a submodule (or subpackage) of the `package`, or 
> - some other name defined in the package, like a **function, class or variable**. 
> 
> The `import` statement first tests whether the item is defined in the package; if not, it assumes it is a module and attempts to load it. If it fails to find it, an `ImportError` exception is raised.
> 
> **Contrarily, when using syntax like `import item.subitem.subsubitem`, each item except for the last must be a package; the last item can be a module or a package but can’t be a class or function or variable defined in the previous item.**
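
The difference between the two forms can be checked with a small sketch. It uses `json` from the standard library as a stand-in package and `importlib.import_module` as a programmatic equivalent of the `import a.b` statement:

```python
import importlib

# `from package import item`: the item may be a submodule or any other name.
from json import decoder  # a submodule of the json package
from json import dumps    # a function defined in the package

# `import a.b`: the last component must be a module or a package.
decoder_module = importlib.import_module("json.decoder")  # like `import json.decoder`

failure = None
try:
    importlib.import_module("json.dumps")  # like `import json.dumps`
except ImportError as exc:
    # ModuleNotFoundError is a subclass of ImportError.
    failure = exc
    print(type(exc).__name__)
```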

Reading:
* SO - [Importing modules in Python - best practice](https://stackoverflow.com/questions/9916878/importing-modules-in-python-best-practice/29193752#29193752) and [Properly importing modules in Python](https://stackoverflow.com/questions/896112/properly-importing-modules-in-python)
* Fredrik Lundh - [Importing Python Modules](http://effbot.org/zone/import-confusion.htm)
* Hitchhiker's Guide to Python - [Modules](http://docs.python-guide.org/en/latest/writing/structure/#modules)

## Circular imports

Be careful with [circular imports](http://effbot.org/zone/import-confusion.htm#circular-imports).  A circular dependency occurs when two or more modules depend on each other, so that each module is defined in terms of the other.

First, it is necessary to know what occurs when we use `import x`, either interactively or within a script:

| Import | Result |
| :----- | :----- |
| `import x` | imports the module `x`, and creates a reference to that module in the current namespace. |
| `from x import *` | imports the module `x`, and creates references in the current namespace to all public objects defined by that module (that is, everything that doesn’t have a name starting with "\_"). |
| `from x import a, b, c` | imports the module `x`, and creates references in the current namespace to the given objects. |

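Circular imports cause trouble because **a module's top-level code is executed when it is first imported**, so a module that is still in the middle of executing can be imported again before all of its names exist.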

This is all best explained with a simplified example.  Consider a package called `circimpt` with the following structure, to begin with:

```
circimpt/
|-- __init__.py
    |-- from .mod_a import *
    |-- from .mod_b import *
|-- mod_a.py
    |-- def a(x):
            print(x)
|-- mod_b.py
    |-- def b(x):
            print(x)
```

There are no dependencies here.  The imports within `__init__.py` allow us to use `from circimpt import a` rather than `from circimpt.mod_a import a`.

Now consider this modified structure (**only changes noted**):

```
|-- mod_a.py
    |-- from .mod_b import b
    |-- def a(x):
            b(x)
```

This is also fine.  There is a one-way dependency here.  When `mod_a` is imported, its first statement, `from .mod_b import b`, runs before `a` is defined.  Recall that this **imports the module `mod_b` and creates a reference to `b` in the current namespace**.  Because `mod_b` depends on nothing else, the import completes without trouble.

Alright, now for another modification--this time, one that will give us an issue:

```
|-- mod_a.py
    |-- from .mod_b import b
    |-- def a(x):
            b(x)
|-- mod_b.py
    |-- from .mod_a import a
    |-- def b(x):
            a(x)
```

Here, if we were to run `import circimpt`, we would get an `ImportError`.  Note that this will occur on *any* import involving `circimpt`.  Pay attention to the traceback; it shows us:
- Importing the package name jumps to the `__init__.py` file, where we have our wildcard syntax.
- Everything from `mod_a` is imported, which starts by running its `from .mod_b import b` statement.
- That jumps us to `mod_b`, which has a `from .mod_a import a` statement.
- However, `a` is not defined yet because we "jumped out of" `mod_a` before reaching its `def a`.  `mod_a` itself already exists in `sys.modules`, but only as a partially initialized module.

Note that if we got rid of the `from module import *` statements in `__init__`, this would only *partially* solve the problem.  We'd no longer have an issue on `import circimpt`, but `from circimpt.mod_a import a` would still raise an `ImportError`, as would `from circimpt import mod_a`, because both execute `mod_a` and hit the same cycle.  (Even if the imports succeeded, calling `a` would end in a `RecursionError`, since `a` and `b` call each other without end.)  What is required in this case, when two functions reference each other, is a fundamental refactoring of code.
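
The failing layout can be reproduced without touching your own project. The sketch below writes the three files into a temporary directory under the hypothetical package name `circimpt_demo` (an assumption, chosen to avoid clashing with anything real) and then attempts the import:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package name used only for this demonstration.
PACKAGE = "circimpt_demo"
FILES = {
    "__init__.py": "from .mod_a import *\nfrom .mod_b import *\n",
    "mod_a.py": "from .mod_b import b\n\ndef a(x):\n    b(x)\n",
    "mod_b.py": "from .mod_a import a\n\ndef b(x):\n    a(x)\n",
}

error = None
with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / PACKAGE
    package_dir.mkdir()
    for filename, source in FILES.items():
        (package_dir / filename).write_text(source)

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        importlib.import_module(PACKAGE)
    except ImportError as exc:
        error = exc
    finally:
        sys.path.remove(tmp)
        # Drop any leftover entries so the demo leaves no trace in the cache.
        for name in [n for n in sys.modules if n.split(".")[0] == PACKAGE]:
            del sys.modules[name]

print(type(error).__name__, "-", error)
```

The exact wording of the message varies between Python versions, but it names `a` as the object that could not be imported from `mod_a`.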

This, on the other hand, would be fine:

```
circimpt/
|-- __init__.py  # Blank
|-- mod_a.py
    |-- from circimpt import mod_b
    |-- def a(x):
            mod_b.b(x)
|-- mod_b.py
    |-- from circimpt import mod_a
    |-- def b(x):
            print(x)
    |-- def c(x):
            mod_a.a(x)
```

Why?
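- If we were to run `import circimpt.mod_a`, this first runs the `from circimpt import mod_b` in `mod_a`.
- Executing `mod_b` in turn evaluates its `from circimpt import mod_a`.  At that moment `mod_a` is only a partially initialized ("empty") module, but **that is okay because we're binding the module object itself rather than importing `a` directly.**  The attribute lookup `mod_a.a` is deferred until `c` is called, by which time `mod_a` has finished executing.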
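
A runnable sketch of this working layout follows. It assumes Python 3.5 or later and uses the hypothetical package name `circimpt_ok`. As a small departure from the listing above, the functions return their argument instead of printing it so the result can be inspected:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package name used only for this demonstration.
PACKAGE = "circimpt_ok"
FILES = {
    "__init__.py": "",
    "mod_a.py": (
        f"from {PACKAGE} import mod_b\n\n"
        "def a(x):\n    return mod_b.b(x)\n"
    ),
    "mod_b.py": (
        f"from {PACKAGE} import mod_a\n\n"
        "def b(x):\n    return x\n\n"
        "def c(x):\n    return mod_a.a(x)\n"
    ),
}

with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / PACKAGE
    package_dir.mkdir()
    for filename, source in FILES.items():
        (package_dir / filename).write_text(source)

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        mod_a = importlib.import_module(f"{PACKAGE}.mod_a")
        mod_b = importlib.import_module(f"{PACKAGE}.mod_b")
        # Attribute lookups happen only now, after both modules finished loading.
        result_a = mod_a.a("via a")  # a -> b
        result_c = mod_b.c("via c")  # c -> a -> b
    finally:
        sys.path.remove(tmp)
        for name in [n for n in sys.modules if n.split(".")[0] == PACKAGE]:
            del sys.modules[name]

print(result_a)  # via a
print(result_c)  # via c
```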

# The import namespace

The `import` statement binds the results of the import to a name in the local scope.

## Searching, interaction & hierarchy

> *This section sacrifices some technical detail; for example, it skips over [finders and loaders](https://docs.python.org/3/reference/import.html#finders-and-loaders) and `sys.meta_path`.*

When a module is imported, Python searches for the module and, if it is found, creates and initializes a [module object](https://docs.python.org/3/library/types.html#types.ModuleType) (an instance of `types.ModuleType`). If the named module cannot be found, a `ModuleNotFoundError` is raised.

In what order does Python search for a module?  (See also the [module search path](https://docs.python.org/3/tutorial/modules.html#the-module-search-path).)
1. `sys.modules` - a dictionary that maps module names to modules that have already been loaded.  This mapping serves as a cache of all modules that have been previously imported, including the intermediate paths. So if `foo.bar.baz` was previously imported, `sys.modules` will contain entries for `foo`, `foo.bar`, and `foo.bar.baz`. Each key will have as its value the corresponding module object. [[source](https://docs.python.org/3/reference/import.html#the-module-cache)]
2. `sys.path` - a list of locations that is searched when the module is not in the cache.  It contains:
    - the directory containing the script being run, or the current working directory in an interactive session, where it is denoted by an empty string among the other entries in `sys.path`. [[source](https://docs.python.org/3/reference/import.html#path-entry-finders)]
    - other locations from `PYTHONPATH` (a list of directory names, with the same syntax as the shell variable `PATH`).
    - installation-dependent defaults, such as the `site-packages` directory.
    
This means that **you can place a top-level package in any path specified in `sys.path` and subsequently access it no matter what your current directory is.**

In [1]:
import sys
import pandas as pd

print('length:',len(sys.modules), '\n')
print('modules:')
with pd.option_context('display.max_rows', 10):
    print(pd.Series(sys.modules))

length: 1563 

modules:
IPython                     <module 'IPython...
IPython.core                <module 'IPython...
IPython.core.alias          <module 'IPython...
IPython.core.application    <module 'IPython...
IPython.core.autocall       <module 'IPython...
                                   ...         
zmq.sugar.version           <module 'zmq.sug...
zmq.utils                   <module 'zmq.uti...
zmq.utils.constant_names    <module 'zmq.uti...
zmq.utils.jsonapi           <module 'zmq.uti...
zmq.utils.strtypes          <module 'zmq.uti...
dtype: object
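
The caching of intermediate paths described earlier can also be checked directly. This is a small sketch that uses `json.decoder` from the standard library as the example submodule:

```python
import importlib
import sys

import json.decoder  # importing a submodule also imports its parent package

# Both the parent package and the submodule are cached under their dotted names.
print("json" in sys.modules)          # True
print("json.decoder" in sys.modules)  # True

# A repeated import is served from the cache: the same module object comes back.
again = importlib.import_module("json.decoder")
print(again is sys.modules["json.decoder"])  # True
```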


In [2]:
sys.path # save top-level package to any of these,
         # or you'll need to cd to it to import it or its contents

['',
 '/Applications/anaconda3/lib/python36.zip',
 '/Applications/anaconda3/lib/python3.6',
 '/Applications/anaconda3/lib/python3.6/lib-dynload',
 '/Applications/anaconda3/lib/python3.6/site-packages',
 '/Applications/anaconda3/lib/python3.6/site-packages/aeosa',
 '/Applications/anaconda3/lib/python3.6/site-packages/IPython/extensions',
 '/Users/brad/.ipython']

You can modify `sys.path` using standard list operations:

```python
sys.path.append('/ufs/brad/lib/python')
```
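
To see the effect end to end, the following sketch creates a throwaway module in a temporary directory, adds that directory to `sys.path`, and imports the module by name. The module name `sys_path_demo` and its `GREETING` constant are hypothetical, invented for the illustration:

```python
import importlib
import sys
import tempfile
from pathlib import Path

MODULE = "sys_path_demo"  # hypothetical module name

with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / f"{MODULE}.py").write_text("GREETING = 'hello from a custom path'\n")

    # Before this append, the directory is not searched and the import would fail.
    sys.path.append(tmp)
    importlib.invalidate_caches()
    try:
        demo = importlib.import_module(MODULE)
        greeting = demo.GREETING
    finally:
        # Undo the changes so the demo leaves the interpreter as it found it.
        sys.path.remove(tmp)
        sys.modules.pop(MODULE, None)

print(greeting)  # hello from a custom path
```

Changes made to `sys.path` this way last only for the current interpreter session.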

## Intra-package references

Consider the following setup and the simplistic “starting point” in which the `__init__.py` files are all empty, except for a docstring in the top-level `__init__.py` file.  

```
sound/
      __init__.py
      effects/
              __init__.py
              echo.py
              surround.py
              reverse.py
      filters/
              __init__.py
              equalizer.py
              ...
      formats/
              __init__.py
              wavread.py
              wavwrite.py
```

In [3]:
import os
os.getcwd()

'/Users/brad/Scripts/python'

In [4]:
# Because this directory is not a fixed entry in sys.path, we'd need to
#     cd to the package's parent directory if we haven't already
'sound' in os.listdir(os.getcwd())

True

In [5]:
import sound

In [6]:
help(sound)

Help on package sound:

NAME
    sound - This initializes the sound package and serves as a docstring.

PACKAGE CONTENTS
    effects (package)
    egg
    filters (package)
    formats (package)
    notebook
    nutmeg
    vinegar2

SUBMODULES
    echo

FILE
    /Users/brad/Scripts/python/sound/__init__.py




When packages are structured into subpackages (as with `sound`), you can use **absolute imports** to refer to submodules of sibling packages. For example, if the module `sound.effects.surround` needs to use the `echo` module in the `sound.effects` package, it can use `from sound.effects import echo`.

You can also write **relative imports**, with the `from module import name` form of import statement. These imports use leading dots to indicate the current and parent packages involved in the relative import. **From the `surround` module** for example, you might use:

```python
# surround.py
from . import echo
from .. import formats
from ..filters import equalizer
```
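
One way to see how the leading dots are interpreted is `importlib.util.resolve_name`, which turns a relative name into an absolute one given the importing module's package. The sketch below assumes the importing module is `sound/effects/surround.py`, whose package is `sound.effects`; no files need to exist for the resolution itself:

```python
from importlib.util import resolve_name

# Inside sound/effects/surround.py, __package__ is "sound.effects".
package = "sound.effects"

# One dot means the current package; two dots mean its parent.
print(resolve_name(".echo", package))                # sound.effects.echo
print(resolve_name("..formats", package))            # sound.formats
print(resolve_name("..filters.equalizer", package))  # sound.filters.equalizer
```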

(A reminder on package structure:)

```
sound/
      __init__.py
      effects/
              __init__.py
              echo.py
              surround.py
              reverse.py
      filters/
              __init__.py
              equalizer.py
              ...
      formats/
              __init__.py
              wavread.py
              wavwrite.py
```

More on relative imports here: https://www.python.org/dev/peps/pep-0328/#guido-s-decision.

The jury is out on whether relative imports should be used at all.

> Do not use relative names in imports. Even if the module is in the same package, use the full package name. This helps prevent unintentionally importing a package twice. [Google Style Guide]

> Absolute imports are recommended, as they are usually more readable and tend to be better behaved. ... However, explicit relative imports are an acceptable alternative to absolute imports, especially when dealing with complex package layouts where using absolute imports would be unnecessarily verbose. [PEP 8]

> Also: https://softwareengineering.stackexchange.com/a/159505

## Imports within `__init__.py`

The purpose of placing import statements in `__init__.py` files is to let users import a name from a shorter path than the one where it is actually defined, "jumping" over intermediate levels.  This works because when a [regular package](https://docs.python.org/3/reference/import.html#regular-packages) is imported, its `__init__.py` file is implicitly executed, and the objects it defines are bound to names in the package’s namespace.

This is best illustrated through examples.  Consider when you import:

In [7]:
from pandas import Series

In reality, `Series` is a class within `pandas`’ `core` package and within the `series` module:

In [8]:
import inspect
inspect.getsourcefile(Series).partition('site-packages/')[-1]

'pandas/core/series.py'

So how does the import “jump” several levels?  You need to place additional imports in the top-level `__init__.py` file.

If in the `__init__.py` for `sound` we placed:

```python
from sound.effects import echo
```

Then we could use:

```python
from sound import echo
```
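
The same mechanism can be tried out in isolation. The sketch below builds a stripped-down package in a temporary directory under the hypothetical name `sound_demo`, with a made-up `apply` function in `echo.py`, and confirms that the short and the full import paths yield the same module object:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package mirroring part of the `sound` layout.
PACKAGE = "sound_demo"
FILES = {
    "__init__.py": f"from {PACKAGE}.effects import echo\n",
    "effects/__init__.py": "",
    "effects/echo.py": "def apply(signal):\n    return signal + signal\n",
}

with tempfile.TemporaryDirectory() as tmp:
    for relative_path, source in FILES.items():
        target = Path(tmp) / PACKAGE / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(source)

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # The short path works only because of the import inside __init__.py.
        from sound_demo import echo as short_path
        from sound_demo.effects import echo as full_path

        same_object = short_path is full_path
        echoed = short_path.apply("ab")
    finally:
        sys.path.remove(tmp)
        for name in [n for n in sys.modules if n.split(".")[0] == PACKAGE]:
            del sys.modules[name]

print(same_object)  # True
print(echoed)       # abab
```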

# Use of `if __name__ == "__main__"`

Say you have a module:

```python
# module1.py
def func1():
    pass
f = func1()
```

When you `import module1`, **all module-level code is executed immediately at the time it is imported.**  The function definition is not "dangerous": the function is created, but its internal code will not be executed until the function is called.  _However_, `f = func1()` is also run at import time, a potentially unwanted side effect.

The solution is to always put startup code in a function (conventionally called `main`) and execute that function only when the module is run as a script, not when it is imported from another module:

```python
class UsefulClass:
    """This class might be useful to other modules."""
    pass

def main():
    """creates a useful class and does something with it for our module."""
    useful = UsefulClass()
    print(useful)

if __name__ == '__main__':
    main()
```
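
The effect of the guard can be observed with the standard library's `runpy` module, which executes a file under a chosen `__name__`. In this sketch the file name `guard_demo.py` and the `ran_main` flag are hypothetical, and running the file under a non-`__main__` name only approximates a real import:

```python
import runpy
import tempfile
from pathlib import Path

SOURCE = '''
ran_main = False

def main():
    global ran_main
    ran_main = True

if __name__ == "__main__":
    main()
'''

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "guard_demo.py"
    script.write_text(SOURCE)

    # Executed as a script: __name__ is "__main__", so main() runs.
    as_script = runpy.run_path(str(script), run_name="__main__")
    # Executed under its module name, as on import: the guard skips main().
    as_import = runpy.run_path(str(script), run_name="guard_demo")

print(as_script["ran_main"])  # True
print(as_import["ran_main"])  # False
```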

From the docs:

> The `__main__` module is a special case relative to Python’s import system. As noted elsewhere, the `__main__` module is **directly initialized at interpreter startup**, much like `sys` and `builtins`. However, unlike those two, it doesn’t strictly qualify as a built-in module. This is because the manner in which `__main__` is initialized depends on the flags and other options with which the interpreter is invoked.