# Packages

## Introduction

- In Python, packages are specialized modules that can contain other modules and packages (sub-packages). Put simply, a package is a module with the extra ability to hold modules and packages under it.

- Ordinary modules do not have a `__path__` attribute. If a module has a `__path__` attribute, it is a package. This is the difference between a package and a module.

- Packages and modules in Python do not have to be backed by the file system, but in most applications they are: `.py` files are modules and directories are packages. Not every directory is treated as a regular package, though. A directory that contains an `__init__.py` file is a regular package. (Since Python 3.3, a directory without `__init__.py` can still be imported as a namespace package, but these notes deal with regular packages.)
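As a quick check of the `__path__` rule, the sketch below inspects two standard-library objects: `json`, which CPython ships as a package, and `json.decoder`, a plain module inside it. The variable names are only illustrative.

```python
import json
import json.decoder
import types

# A package and a module are instances of the same type.
both_are_modules = isinstance(json, types.ModuleType) and isinstance(
    json.decoder, types.ModuleType
)

# Only the package carries a __path__ attribute.
json_has_path = hasattr(json, "__path__")
decoder_has_path = hasattr(json.decoder, "__path__")

print(both_are_modules, json_has_path, decoder_has_path)  # True True False
```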

- Packages usually represent a hierarchy of modules and packages. Consider this example.

 <img src="..\_static\packages.png" alt="Package Structure" width="1200" height = "400"/>

 Here we have a package named pack1, which contains two modules, module1a and module1b, and one sub-package, pack1_1. Because a package is a module object, it has a namespace, available through its `__dict__` attribute, so pack1 also acts as a namespace. The packages and modules under pack1 become attributes of the package once they are imported: `pack1.__dict__` then maps their labels to the corresponding objects. To access module1a we use dot notation, i.e. `pack1.module1a`. To access module1_1a we write `pack1.pack1_1.module1_1a`. Dot notation therefore reaches any imported package or module under a package, because each one is an attribute of its parent, all the way up to the top-level package (pack1 here).

- Suppose the top-level program contains the statement `import pack1.pack1_1.module1_1a`. Python then performs these steps:

  1. It first imports `pack1` and stores it in `sys.modules` under the label `pack1`, which references the pack1 package.
  2. Then it imports `pack1.pack1_1` and stores it in `sys.modules` under the label `pack1.pack1_1`, which references the sub-package pack1_1.
  3. Finally it imports `pack1.pack1_1.module1_1a` and stores it in `sys.modules` under the label `pack1.pack1_1.module1_1a`, which references module1_1a.

  All three labels are stored in `sys.modules`, but only the `pack1` label is added to the global namespace. The top-level object is all we need, because the inner packages and modules are reachable through it as attributes.
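The steps above can be observed with a small sketch. It assumes a throwaway copy of the pack1 layout created in a temporary directory (empty files, not the demo code linked from these notes), and it runs the import through `exec` with an explicit dictionary so that the names bound by the import are easy to list.

```python
import sys
import tempfile
from pathlib import Path

# Build a throwaway copy of the pack1 layout (all files are empty).
root = Path(tempfile.mkdtemp())
for rel in (
    "pack1/__init__.py",
    "pack1/module1a.py",
    "pack1/module1b.py",
    "pack1/pack1_1/__init__.py",
    "pack1/pack1_1/module1_1a.py",
):
    target = root / rel
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text("")
sys.path.insert(0, str(root))

# Run the import inside an explicit namespace so we can see what it binds.
namespace = {}
exec("import pack1.pack1_1.module1_1a", namespace)

# Labels added to sys.modules: one per step.
labels = sorted(name for name in sys.modules if name.split(".")[0] == "pack1")
print(labels)  # ['pack1', 'pack1.pack1_1', 'pack1.pack1_1.module1_1a']

# Names bound in the importing namespace: only the top-level package.
bound = sorted(name for name in namespace if not name.startswith("__"))
print(bound)  # ['pack1']

# The inner module is reached through attributes of the top-level object.
inner = namespace["pack1"].pack1_1.module1_1a
print(inner is sys.modules["pack1.pack1_1.module1_1a"])  # True
```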

- Most packages in Python are just directories, and packages are modules. Modules contain code, and that code is executed when the module is loaded. So where do we write the code for a package, given that it is a directory and not a file? As we already saw, a directory needs a file called `__init__.py` to be a regular package. This `__init__.py` tells Python that the directory is a package rather than an ordinary directory, and it is also where the package's code goes. When Python imports the package for the first time, it executes the code in `__init__.py`, stores the resulting package object (an object of type module) in `sys.modules` and binds it in the importing module's globals. Every variable and function defined in `__init__.py` becomes an attribute of that object. In other words, whatever ends up in the namespace of `__init__.py` is stored on the package object.
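A minimal sketch of this behaviour, assuming a hypothetical package named `greeter_demo` that is written to a temporary directory just for the illustration:

```python
import sys
import tempfile
from pathlib import Path

# Hypothetical package whose __init__.py defines a variable and a function.
root = Path(tempfile.mkdtemp())
package_dir = root / "greeter_demo"
package_dir.mkdir()
(package_dir / "__init__.py").write_text(
    "DEFAULT_NAME = 'world'\n"
    "\n"
    "def hello(name=DEFAULT_NAME):\n"
    "    return f'hello {name}'\n"
)
sys.path.insert(0, str(root))

# Importing the package executes __init__.py.
import greeter_demo

# Names defined in __init__.py are now attributes of the package object.
default_name = greeter_demo.DEFAULT_NAME
message = greeter_demo.hello()
init_file = Path(greeter_demo.__file__).name

print(default_name, message, init_file)  # world hello world __init__.py
```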

- Every package has three main attributes: `__file__`, `__path__` and `__package__`.

  `__file__` : The location of the module's or package's code in the file system (`__init__.py` for a package, `module.py` for a module).

  `__package__` : The name of the package the module belongs to. (For a package, `__package__` is the package's own name; for a module inside a package, it is the name of the containing package; for a top-level module, it is the empty string.)

  `__path__` : The location of the package in the file system, stored as a list of directories. (Ordinary modules do not have this attribute.)

  For example, consider the package structure above:

  `module1a.__file__` -> `/app/pack1/module1a.py`

  `pack1_1.__file__` -> `/app/pack1/pack1_1/__init__.py`

  `module1_1a.__package__` -> `pack1.pack1_1`

  `pack1_1.__path__` -> `['/app/pack1/pack1_1']`
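These attributes can be inspected directly. The sketch below assumes a throwaway copy of the pack1 layout in a temporary directory plus a hypothetical top-level module called `toplevel_demo`, so the printed locations will point into that temporary directory and not into `/app`.

```python
import sys
import tempfile
from pathlib import Path

# Throwaway layout: part of the pack1 tree plus a hypothetical top-level module.
root = Path(tempfile.mkdtemp())
for rel in (
    "toplevel_demo.py",
    "pack1/__init__.py",
    "pack1/module1a.py",
    "pack1/pack1_1/__init__.py",
    "pack1/pack1_1/module1_1a.py",
):
    target = root / rel
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text("")
sys.path.insert(0, str(root))

import toplevel_demo
import pack1.module1a
import pack1.pack1_1.module1_1a

pack1_1 = pack1.pack1_1
module1_1a = pack1.pack1_1.module1_1a

# __file__: the file holding the code (__init__.py for a package).
print(pack1.module1a.__file__)
print(pack1_1.__file__)

# __package__: containing package, own name for a package, '' at top level.
print(repr(module1_1a.__package__))     # 'pack1.pack1_1'
print(repr(pack1_1.__package__))        # 'pack1.pack1_1'
print(repr(toplevel_demo.__package__))  # ''

# __path__: only packages have it, and it is list-like.
print(list(pack1_1.__path__))
print(hasattr(module1_1a, "__path__"))  # False
```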

In [1]:
# Now let's see how sub-packages and modules inside a package get imported.

# First import only pack1 and check whether its inner packages and modules get loaded.

import sys

import pack1

'pack1' in sys.modules

Running pack1...


True

In [2]:
# pack1 was loaded into sys.modules and its __init__.py code was executed.

# Now let's check whether the inner modules and packages were loaded.

# If they were loaded, they must be in pack1.__dict__

'module1a' in pack1.__dict__

False

In [3]:
# From the output we can say that `import pack1` makes Python find pack1 (here in the current directory), create the package object
# by running its __init__.py and store that object in sys.modules. It does not load the inner modules and packages.

# Now let's load an inner module of pack1

import pack1.module1a

# Now let's check whether module1a is in pack1.__dict__

'module1a' in pack1.__dict__

Running Module1a ....


True

In [4]:
# So module1a got imported.

# But the label module1a is not in globals

'module1a' in globals()

False

In [5]:
# This is because Python stores module1a in sys.modules under the label pack1.module1a and attaches the same reference
# to the pack1 module as the attribute module1a. So we can use pack1 to access module1a.

# Let's check whether module1a is in sys.modules

'module1a' in sys.modules

False

In [6]:
# The label module1a is not in sys.modules because the module is stored as pack1.module1a

'pack1.module1a' in sys.modules

True

In [1]:
# To put the label module1a in globals, we need to import module1a like this

from pack1 import module1a



Running pack1...
Running Module1a ....


In [2]:
# Now let's check whether module1a is in globals

'module1a' in globals()

True

In [None]:
# Now let's check whether it is in sys.modules

import sys

'module1a' in sys.modules

# The label module1a never appears in sys.modules, however the module is imported. With `from pack1 import module1a`, Python first imports
# pack1, then imports module1a under the label pack1.module1a, and finally binds the module object to the name module1a in the
# importing namespace. That is why this label ends up in globals.

# Inner modules and packages are imported in the same way.

False

In [9]:
import pack1.pack1_1.module1_1a

 Running Module1_1a ....


In [10]:
'module1_1a' in pack1.pack1_1.__dict__

True

**Note** : This file contains sample code for the implementation of packages -> [Package_Implementation_code](../_static/Packages_Implementation_Demo_Code.zip)

## Need of Packages

- Packages let us break the code up into smaller chunks, which makes our code:
 
  1. Easier to write
  2. Easier to test and debug
  3. Easier to read and understand
  4. Easier to document

- The chunks we split the code into can then be stitched back together in a way that hides the inner implementation from users, which makes it easier for users to write, test and debug their own code.


## Structuring the Packages

- Generally, we have sub-packages inside a main package. To import from them we usually write `import <mainpackage>.<subpackage1>.<subpackage1_1>`. But doing so exposes every detail of how the package is structured, and it is tedious for users to chain that many dots whenever they want a particular attribute or function from subpackage1_1.

- So instead of making users write these dotted imports, we use relative imports: we import the sub-packages and modules in the `__init__.py` file of the package that contains them. We generally write `from .subpackage1_1 import *` in the `__init__.py` file of subpackage1 (and we also import the attributes and functions defined in the modules of subpackage1 itself, i.e. `from .module1 import *`).

- Suppose you are importing a module that is one level up, meaning it sits just outside the current directory (i.e. the module and this directory are in the same parent directory). To import that module we use the syntax `from ..module1 import *`. (Here `.` means the current package, `..` means one level up and `...` means two levels up.)
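The following sketch shows both kinds of relative import working together. The layout and the names `helpers`, `shout` and `greet` are hypothetical, chosen only for the illustration, and the files are written to a temporary directory.

```python
import sys
import tempfile
from pathlib import Path

# Hypothetical layout, written to a temporary directory:
#   mainpackage/__init__.py              re-exports subpackage1
#   mainpackage/helpers.py               shared helper, one level above module1
#   mainpackage/subpackage1/__init__.py  re-exports module1
#   mainpackage/subpackage1/module1.py   reaches the helper through ".."
files = {
    "mainpackage/__init__.py": "from .subpackage1 import *\n",
    "mainpackage/helpers.py": "def shout(text):\n    return text.upper() + '!'\n",
    "mainpackage/subpackage1/__init__.py": "from .module1 import *\n",
    "mainpackage/subpackage1/module1.py": (
        "from ..helpers import shout\n"
        "\n"
        "def greet(name):\n"
        "    return shout('hello ' + name)\n"
    ),
}
root = Path(tempfile.mkdtemp())
for rel, source in files.items():
    target = root / rel
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(source)
sys.path.insert(0, str(root))

import mainpackage

# Users call greet() on the top-level package without spelling out the nesting.
greeting = mainpackage.greet("ada")
print(greeting)  # HELLO ADA!
```

The full dotted path `mainpackage.subpackage1.module1.greet` still works; the `__init__.py` imports only add a shorter route to the same function object.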

- Generally we avoid `import *`, because it can silently replace labels that already exist in the importing namespace with new objects. Also, a module's author may want `import *` to expose only some of its functions and attributes. To control this we use a special attribute called `__all__`, which is a list of names. Only the names in this list are imported when someone runs `from module import *`. Suppose a module contains the functions boolean(), boolean_helper_1() and boolean_helper_2(). If only boolean() should be imported by `from module import *`, write `__all__ = ['boolean']` in the module. `__all__` applies only to `import *`; names left out of it can still be imported explicitly.
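Here is a small sketch of the `__all__` example. The module name `boolean_demo` and the function bodies are assumptions made up for the illustration; the star import runs through `exec` with an explicit dictionary so that the imported names can be listed.

```python
import sys
import tempfile
from pathlib import Path

# Hypothetical module with one public function and two helpers.
root = Path(tempfile.mkdtemp())
(root / "boolean_demo.py").write_text(
    "__all__ = ['boolean']\n"
    "\n"
    "def boolean(value):\n"
    "    return boolean_helper_1(value) and boolean_helper_2(value)\n"
    "\n"
    "def boolean_helper_1(value):\n"
    "    return value is not None\n"
    "\n"
    "def boolean_helper_2(value):\n"
    "    return bool(value)\n"
)
sys.path.insert(0, str(root))

# Star-import into an explicit namespace so the imported names can be listed.
namespace = {}
exec("from boolean_demo import *", namespace)
star_names = sorted(name for name in namespace if not name.startswith("__"))
print(star_names)  # ['boolean']

# __all__ only filters "import *"; an explicit import still reaches a helper.
from boolean_demo import boolean_helper_1

print(boolean_helper_1(0))  # True
```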

**Note** : When you run `import <package1>.<subpackage1> as subpackage`, Python imports package1 and package1.subpackage1 into `sys.modules` and then binds the package1.subpackage1 object in the global namespace under the label subpackage.
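This can be checked with the standard library, where `xml.etree` is a sub-package of `xml`. The alias `etree_pkg` is an arbitrary name picked for this sketch.

```python
import sys

# Run the aliased import in an explicit namespace to see which name it binds.
namespace = {}
exec("import xml.etree as etree_pkg", namespace)

bound = sorted(name for name in namespace if not name.startswith("__"))
print(bound)  # ['etree_pkg']

# Both the parent package and the sub-package are in sys.modules.
print("xml" in sys.modules, "xml.etree" in sys.modules)  # True True

# The alias refers to the sub-package object itself, not to the parent.
print(namespace["etree_pkg"] is sys.modules["xml.etree"])  # True
```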