# 2. Python Packages

Suppose you have developed a very large application that includes many modules. As the number of modules grows, it becomes difficult to keep track of them all if they are dumped into one location. This is particularly so if they have similar names or functionality. You might wish for a means of grouping and organizing them.

Packages allow for a hierarchical structuring of the module namespace using dot notation. In the same way that modules help avoid collisions between global variable names, packages help avoid collisions between module names.

Creating a package is quite straightforward, since it makes use of the operating system’s inherent hierarchical file structure. Consider the following arrangement:

```text
pkg
├── mod1.py
└── mod2.py
```

In mod1.py

```python
def mod1_func():
    print('This is [mod1] function()')


class Mod1Class:
    pass


```

In mod2.py

```python

def mod2_func():
    print('This is [mod2] function()')


class Mod2Class:
    pass

```

## 2.1 Import package

Given this structure, if the pkg directory resides in a location where Python can find it (one of the directories listed in `sys.path`), you can refer to the two modules with dot notation (`pkg.mod1`, `pkg.mod2`) and import them with the syntax you are already familiar with:

```text
# several modules can be listed in a single import statement, e.g. import pkg.mod1, pkg.mod2
import <module_name>[, <module_name> ...]
```


In [7]:
from learning_python.Lesson04_Module_Package.src import pkg

pkg.mod1.mod1_func()

pkg.mod2.mod2_func()

This is [mod1] function()
This is [mod2] function()


In [8]:
pkg.mod1

<module 'learning_python.Lesson04_Module_Package.src.pkg.mod1' from '/home/pliu/git/Learning_Python/learning_python/Lesson04_Module_Package/src/pkg/mod1.py'>


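Note that `import pkg` by itself doesn’t do much: it only binds the name pkg and does not automatically import the modules inside the package. The attribute access `pkg.mod1` works only if the submodule has already been imported, for example by the package's `__init__.py` or by an earlier import in the same session. Even then, we have to give the complete path of the module to use the objects inside it.

To import the modules or their contents directly, you need to use a more specific import: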

In [9]:
from learning_python.Lesson04_Module_Package.src.pkg.mod1 import mod1_func
from learning_python.Lesson04_Module_Package.src.pkg.mod2 import mod2_func

mod1_func()
mod2_func()

This is [mod1] function()
This is [mod2] function()
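
The difference between importing a package and importing one of its modules can be checked in isolation. The following is a minimal sketch, not the lesson's own code: it builds a throwaway package in a temporary directory (the name `demo_pkg_a` is hypothetical), and the function returns its message instead of printing it so the result is easy to inspect.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (hypothetical name: demo_pkg_a).
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg_a"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")  # empty: imports nothing by itself
(pkg_dir / "mod1.py").write_text(
    "def mod1_func():\n"
    "    return 'This is [mod1] function()'\n"
)

# Make the parent directory importable, as if it were listed in sys.path.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_pkg_a

# Importing the package alone does not load its submodules.
loaded_before = hasattr(demo_pkg_a, "mod1")

import demo_pkg_a.mod1

# Once the submodule is imported, it becomes an attribute of the package.
loaded_after = hasattr(demo_pkg_a, "mod1")
message = demo_pkg_a.mod1.mod1_func()

print(loaded_before, loaded_after, message)
```

Under these assumptions, the first check is `False` and the second is `True`, because a submodule is attached to its package only after it has actually been imported.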


## 2.2 Package Initialization

If a file named `__init__.py` is present in a package directory, it is executed the first time the package or a module in the package is imported. This can be used for execution of package initialization code, such as initialization of package-level data.

For example, suppose the following `__init__.py` file has been added to the pkg folder:

```python
print(f'Invoking __init__.py for {__name__}')
pkg_data = ['data', 'of', 'package']
```

Now when the package is imported, the global list pkg_data is initialized:

In [11]:
from learning_python.Lesson04_Module_Package.src import pkg
# pkg was already imported earlier in this session, so we need to reload it to run the new __init__.py.
# Remember that package or module code is loaded only once, on the first import.
# If the code changes, you need to reload it to see the new version.
import importlib
importlib.reload(pkg)
pkg.pkg_data

Invoking __init__.py for learning_python.Lesson04_Module_Package.src.pkg


['data', 'of', 'package']
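
The "loaded once, re-run on reload" behaviour can be observed without editing any files. The sketch below assumes a throwaway package (hypothetical name `demo_init_pkg`) whose `__init__.py` defines `pkg_data`. It mutates the list, imports the package again, and then reloads it.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway package whose __init__.py defines package-level data.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_init_pkg"  # hypothetical name
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("pkg_data = ['data', 'of', 'package']\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_init_pkg

# Change the package-level list in memory.
demo_init_pkg.pkg_data.append("changed")

# A second import statement reuses the cached module: __init__.py is not re-run.
import demo_init_pkg

after_second_import = list(demo_init_pkg.pkg_data)

# reload() executes __init__.py again, so pkg_data is rebound to a fresh list.
importlib.reload(demo_init_pkg)
after_reload = list(demo_init_pkg.pkg_data)

print(after_second_import)
print(after_reload)
```

The in-memory change survives the second `import` but is gone after `importlib.reload`, which is why the notebook cell above needs the reload to pick up the new `__init__.py`.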

### 2.2.1 Use `__init__.py` to share a package-level variable

A module in the package can access the global variable defined in `__init__.py` by importing it. Add the following function to mod1.py:

```python
def show_global_var():
    from learning_python.Lesson04_Module_Package.src.pkg import pkg_data
    print(f"The global var imported from pkg: {pkg_data}")
```

In [12]:
from learning_python.Lesson04_Module_Package.src.pkg import mod1
import importlib
importlib.reload(mod1)
mod1.show_global_var()


The global var imported from pkg: ['data', 'of', 'package']
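
The same access can also be written with a relative import, which avoids repeating the full package path. This is a sketch under assumptions: the package name `demo_data_pkg` is hypothetical, and the function returns the message instead of printing it.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway package: __init__.py holds the data, mod1 reads it.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_data_pkg"  # hypothetical name
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("pkg_data = ['data', 'of', 'package']\n")
(pkg_dir / "mod1.py").write_text(
    "from . import pkg_data  # '.' is the package that contains this module\n"
    "\n"
    "\n"
    "def show_global_var():\n"
    "    return f'The global var imported from pkg: {pkg_data}'\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Importing mod1 runs __init__.py first, so pkg_data exists when mod1 needs it.
from demo_data_pkg import mod1

shown = mod1.show_global_var()
print(shown)
```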


### 2.2.2 Use `__init__.py` to import modules

`__init__.py` can also be used to import modules from a package automatically. For example, earlier you saw that the statement `import pkg` only places the name pkg in the caller’s local symbol table and doesn’t import any modules. But if `__init__.py` in the pkg directory contains the following:

```python
from learning_python.Lesson04_Module_Package.src.pkg import mod1, mod2
```

then when you execute import pkg, modules mod1 and mod2 are imported automatically:

In [13]:
from learning_python.Lesson04_Module_Package.src import pkg

pkg.mod1.mod1_func()
pkg.mod2.mod2_func()

This is [mod1] function()
This is [mod2] function()
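
To see the effect in isolation, the sketch below creates a throwaway package (hypothetical name `demo_auto_pkg`) whose `__init__.py` imports both modules. It uses the relative form `from . import mod1, mod2`, which is an assumption made here to keep the example independent of the lesson's directory layout.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway package whose __init__.py pulls in its own modules.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_auto_pkg"  # hypothetical name
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("from . import mod1, mod2\n")
(pkg_dir / "mod1.py").write_text(
    "def mod1_func():\n    return 'This is [mod1] function()'\n"
)
(pkg_dir / "mod2.py").write_text(
    "def mod2_func():\n    return 'This is [mod2] function()'\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# A plain package import is now enough to reach both modules.
import demo_auto_pkg

first = demo_auto_pkg.mod1.mod1_func()
second = demo_auto_pkg.mod2.mod2_func()
print(first)
print(second)
```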


## 2.3 Use wildcard to import package

Before we start, let's add some new modules to the package. The package should now look like this:
```text
pkg
├── __init__.py
├── mod1.py
├── mod2.py
├── mod3.py
├── mod4.py
└── mod5.py
```

You have already seen that when a wildcard (`*`) is used for a module, all objects from the module are imported into the local symbol table, except those whose names begin with an underscore (`_`).

We can also use a wildcard (`*`) with a package. Let's try it:

In [1]:
# restart the Jupyter kernel first to clear the namespace
# check the namespace before the import
dir()

['In',
 'Out',
 '_',
 '__',
 '___',
 '__builtin__',
 '__builtins__',
 '__doc__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 '_dh',
 '_i',
 '_i1',
 '_ih',
 '_ii',
 '_iii',
 '_oh',
 'exit',
 'get_ipython',
 'pydev_jupyter_utils',
 'quit',
 'remove_imported_pydev_package',
 'sys']

In [2]:
# now let's check what the wildcard import adds to our namespace
from learning_python.Lesson04_Module_Package.src.pkg import *
dir()

Invoking __init__.py for learning_python.Lesson04_Module_Package.src.pkg


['In',
 'Out',
 '_',
 '_1',
 '__',
 '___',
 '__builtin__',
 '__builtins__',
 '__doc__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 '_dh',
 '_i',
 '_i1',
 '_i2',
 '_ih',
 '_ii',
 '_iii',
 '_oh',
 '_pydevd_bundle',
 'exit',
 'get_ipython',
 'learning_python',
 'pydev_jupyter_utils',
 'pydev_jupyter_vars',
 'quit',
 'remove_imported_pydev_package',
 'sys']

Hmph. Not much. You might have expected (assuming you had any expectations at all) that Python would dive into the package directory, find all the modules it could, and import them all. But as you can see, by default that is not what happens.

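In fact, Python follows this convention:
1. It checks whether the `__init__.py` file in the package directory defines a list named `__all__`.
2. If it does, that list is taken to be the names (typically modules) that should be imported when the statement `from <package_name> import *` is encountered.
3. If it does not, no submodules are imported automatically; only the public names already defined in the package namespace by `__init__.py` are imported.

So if we want the wildcard to pick up the modules, we need to add the code below to `__init__.py`: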

```python
__all__ = [
    'mod1',
    'mod2',
    'mod3',
    'mod4',
    'mod5',
]
```

In [1]:
from learning_python.Lesson04_Module_Package.src.pkg import *
dir()

Invoking __init__.py for learning_python.Lesson04_Module_Package.src.pkg


['In',
 'Out',
 '_',
 '__',
 '___',
 '__builtin__',
 '__builtins__',
 '__doc__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 '_dh',
 '_i',
 '_i1',
 '_ih',
 '_ii',
 '_iii',
 '_oh',
 'exit',
 'get_ipython',
 'mod1',
 'mod2',
 'mod3',
 'mod4',
 'mod5',
 'quit']

Now you can see that ['mod1', 'mod2', 'mod3', 'mod4', 'mod5'] are in our namespace.

Using `import *` still isn’t considered terrific form, any more for packages than for modules. But this facility at least gives the creator of the package some control over what happens when `import *` is specified. (In fact, leaving `__all__` undefined keeps `import *` from pulling in any submodules automatically. As you have seen, the default behavior for packages is to import none of their modules.)

By the way, `__all__` can be defined in a module as well and serves the same purpose: to control what is imported with `import *`. For example, modify mod1.py as follows:

```python
__all__ = ['mod1_func']
```

Now when we do `from pkg.mod1 import *`, it will only import what is listed in `__all__`: here mod1_func, but not Mod1Class.

In summary,

`__all__` is used by both packages and modules to control what is imported when `import *` is specified. But the default behavior differs:

- For a package, when `__all__` is not defined, `import *` does not import any submodules; it only brings in the public names defined in the package's `__init__.py`.
- For a module, when `__all__` is not defined, `import *` imports everything except names starting with an underscore.
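
The wildcard rules for packages and modules can be compared side by side. The sketch below is illustrative only: the package names `wild_with_all` and `wild_without_all` and the helper `star_import` are hypothetical. The helper runs a wildcard import inside a fresh namespace and reports the public names that appear there.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())


def make_pkg(name, init_source):
    """Create a throwaway package with two modules and the given __init__.py."""
    pkg_dir = root / name
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text(init_source)
    # mod1 restricts its own wildcard export with __all__.
    (pkg_dir / "mod1.py").write_text(
        "__all__ = ['mod1_func']\n\n"
        "def mod1_func():\n    return 'mod1'\n\n"
        "class Mod1Class:\n    pass\n"
    )
    # mod2 defines no __all__, so every public name is exported.
    (pkg_dir / "mod2.py").write_text("def mod2_func():\n    return 'mod2'\n")


make_pkg("wild_with_all", "__all__ = ['mod1', 'mod2']\n")
make_pkg("wild_without_all", "pkg_data = ['data']\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()


def star_import(target):
    """Run 'from <target> import *' in an empty namespace; return public names."""
    namespace = {}
    exec(f"from {target} import *", namespace)
    return sorted(name for name in namespace if not name.startswith("_"))


pkg_with_all = star_import("wild_with_all")        # submodules listed in __all__
pkg_without_all = star_import("wild_without_all")  # only names from __init__.py
mod_with_all = star_import("wild_with_all.mod1")   # only names listed in __all__
mod_without_all = star_import("wild_with_all.mod2")  # every public name

print(pkg_with_all, pkg_without_all, mod_with_all, mod_without_all)
```

In this setup, the package without `__all__` contributes only `pkg_data`, the name its `__init__.py` defines, and none of its modules.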

## 2.4 Subpackages

Packages can contain nested subpackages to arbitrary depth.

Importing still works the same as shown previously. The syntax is similar, but additional dot notation is used to separate the package name from the subpackage name.

In addition, a module in one subpackage can reference objects elsewhere in the package hierarchy, such as a sibling subpackage or the parent package (in the event that it contains some functionality that you need). For example, suppose you want to import and execute the function mod1_func() (defined in module mod1) from within module mod6, assumed here to live in a subpackage directly under pkg, which is what the relative import below implies. You can use either:
- an absolute import: `from learning_python.Lesson04_Module_Package.src.pkg.mod1 import mod1_func`
- a relative import: `from ..mod1 import mod1_func`. Just as in a file system, `.` refers to the current package or subpackage, while `..` refers to the package one level up.
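
A relative import across package levels can be tried out with a small throwaway layout. In this sketch, the names `demo_tree_pkg`, `sub_pkg`, and `call_mod1` are hypothetical. `mod6` sits one level below `mod1`, so `..` reaches the package that contains `mod1`.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Layout (all names hypothetical):
# demo_tree_pkg/
#     __init__.py
#     mod1.py
#     sub_pkg/
#         __init__.py
#         mod6.py
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_tree_pkg"
sub_dir = pkg_dir / "sub_pkg"
sub_dir.mkdir(parents=True)
(pkg_dir / "__init__.py").write_text("")
(sub_dir / "__init__.py").write_text("")
(pkg_dir / "mod1.py").write_text(
    "def mod1_func():\n    return 'This is [mod1] function()'\n"
)
(sub_dir / "mod6.py").write_text(
    "from ..mod1 import mod1_func  # '..' is the parent package of sub_pkg\n"
    "\n"
    "\n"
    "def call_mod1():\n"
    "    return mod1_func()\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Dot notation walks from the package into the subpackage and its module.
from demo_tree_pkg.sub_pkg import mod6

result = mod6.call_mod1()
print(result)
```

Relative imports only work when the module is loaded as part of a package. Running mod6.py directly as a script would fail with an ImportError, because Python then has no parent package to resolve `..` against.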