# Explore Python Modules

This notebook lets you explore the internals of Python modules. It complements [this](http://bit.ly/2ZrvaD7) exploratory project.

First, let's set things up.

In [1]:
import sys
sys.path.append('share_data')

Now, let's import a leaf module and give it the name *a*.

In [2]:
import a.a as a

This module contains the standard attributes that a module usually has, plus its exported symbols.

In [3]:
dir(a)

['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'get_val',
 'globs',
 'set_val',
 'x']

Let's look at some of the attributes.

The first one is `__spec__`. It is used by the two-step import process: the spec object is created when one of the module finders locates the module. The finder populates the spec with the information the module loader needs, and it also decides which loader will be used for the import.

In [4]:
a.__spec__

ModuleSpec(name='a.a', loader=<_frozen_importlib_external.SourceFileLoader object at 0x10c605c18>, origin='share_data/a/a.py')
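
The two steps can also be driven by hand with `importlib.util`. The following is only a sketch: it uses the standard library module `colorsys` as a stand-in (an assumption made for illustration, since `share_data` may not be available), and it builds a private copy of the module that is never registered in `sys.modules`.

```python
import importlib.util

# Step 1: ask the finders for a spec. Nothing is executed yet.
spec = importlib.util.find_spec("colorsys")
print(spec.name, type(spec.loader).__name__)

# Step 2: create an empty module from the spec and let the loader run its code.
fresh = importlib.util.module_from_spec(spec)
spec.loader.exec_module(fresh)

# The hand-made copy is a different object from the regularly imported module.
import colorsys
print(fresh is colorsys)  # False
```

A regular `import` statement performs the same two steps and additionally stores the result in `sys.modules`.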

Since this module is loaded by `SourceFileLoader`, it needs the name and location of its source file. Note that the `__file__` attribute is not always present: for example, built-in modules such as `sys`, which are compiled into the interpreter, do not have it.

In [5]:
a.__file__

'share_data/a/a.py'

In [6]:
a.__loader__

<_frozen_importlib_external.SourceFileLoader at 0x10c605c18>

The `__name__` attribute identifies the module to the Python import system. Note that this name does not depend on how we refer to the module.

In [7]:
a.__name__

'a.a'
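
The same behavior can be checked with any module. As a small sketch, here the standard library submodule `json.decoder` is bound to the local alias `d` (both chosen only for illustration):

```python
import json.decoder as d

# The alias is just a local variable; the module keeps its fully qualified name.
print(d.__name__)       # json.decoder
print(d.__spec__.name)  # json.decoder
```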

Now let's import the parent of the `a.a` module and call it `b`.

In [8]:
import a as b

Note that this module has the same standard attributes, plus `__path__`, which is present because it is a package.

In [9]:
dir(b)

['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 'a']

The spec looks almost the same as that of a leaf module, but take a look at `submodule_search_locations`: it indicates where to look for this package's submodules, for example when resolving relative imports.

In [10]:
b.__spec__

ModuleSpec(name='a', loader=<_frozen_importlib_external.SourceFileLoader object at 0x10c605ac8>, origin='share_data/a/__init__.py', submodule_search_locations=['share_data/a'])
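
The difference between a package and a leaf module can be seen without `share_data` as well. A minimal sketch, assuming the standard library package `json` and its submodule `json.decoder` as examples:

```python
import importlib.util
import json
import json.decoder

pkg_spec = importlib.util.find_spec("json")
leaf_spec = importlib.util.find_spec("json.decoder")

# A package spec carries the directories searched for its submodules.
print(pkg_spec.submodule_search_locations)
# A leaf module has no submodules, so the field is None.
print(leaf_spec.submodule_search_locations)  # None

# Only the package gets a __path__ attribute.
print(hasattr(json, "__path__"), hasattr(json.decoder, "__path__"))  # True False
```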

Again, see that even though we call this module `b`, Python knows its real name.

In [11]:
b.__name__

'a'

Now let's take a look at which modules Python knows about. Because we are in a Jupyter notebook environment, Python has already imported everybody and their brother. Look towards the end of this list to see the modules we just imported.

In [12]:
sys.modules.keys()
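
Because the full list is long, it can be easier to filter it. A small sketch, again using `json` as a stand-in for the modules imported above:

```python
import sys
import json.decoder

# Importing a submodule registers both the parent package and the submodule.
print("json" in sys.modules)          # True
print("json.decoder" in sys.modules)  # True

# The dictionary value is the very module object bound by the import statement.
print(sys.modules["json.decoder"] is json.decoder)  # True

# Show only the entries that belong to the json package.
json_entries = sorted(n for n in sys.modules if n == "json" or n.startswith("json."))
print(json_entries)
```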



Now let's look at how Python treats modules. The `id` function below returns a value that uniquely identifies a Python object for as long as that object is alive. It is very similar to a pointer in C or C++ (in CPython it is the object's memory address). If two variables have the same id, they refer to exactly the same memory location, and therefore to exactly the same object.

Let's look at the IDs of modules `a` and `a.a`.
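
As a quick illustration of identity versus equality, here is a sketch with plain lists (the variable names are made up for this example):

```python
first = [1, 2, 3]
second = first       # a second name for the same list object
clone = list(first)  # an equal, but distinct, list object

print(id(first) == id(second), first is second)  # True True
print(first == clone, first is clone)            # True False
```

The `is` operator compares identities, so `x is y` is equivalent to `id(x) == id(y)` while both objects are alive.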

In [13]:
id(sys.modules['a.a'])

4502465496

In [14]:
id(sys.modules['a'])

4502464376

Now let's look at our notebook variables to see if they point to the same location.

In [15]:
id(a)

4502465496

In [16]:
id(b)

4502464376

Note that the parent module, which we call `b`, has a reference to its child module, which we call `a`. Both the internal reference inside the `b` module and our notebook reference refer to exactly the same object.

In [17]:
id(b.a)

4502465496

Indeed, changing the state of the internal `a` module also changes the state of our notebook's `a` module.

In [18]:
b.a.get_val()

10

In [19]:
a.set_val(1345)

In [20]:
b.a.get_val()

1345
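
To reproduce this experiment without the `share_data` directory, the sketch below builds a throwaway package in a temporary directory. The names `demo_pkg` and `leaf`, and the body of the leaf module, are assumptions made for illustration; the actual contents of `a/a.py` are not shown in this notebook.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create demo_pkg/__init__.py and demo_pkg/leaf.py (hypothetical names).
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "leaf.py").write_text(
    "x = 10\n"
    "def get_val():\n"
    "    return x\n"
    "def set_val(v):\n"
    "    global x\n"
    "    x = v\n"
)
sys.path.insert(0, str(root))
importlib.invalidate_caches()  # make sure the finders notice the new directory

import demo_pkg.leaf as leaf
import demo_pkg as parent

# Both references point at the same module object, so state is shared.
leaf.set_val(1345)
print(parent.leaf is leaf)     # True
print(parent.leaf.get_val())   # 1345
```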

Now let's import the `a.a` leaf module again and give it a different name.

In [21]:
import a.a as c

See that the name in the `__spec__` object is still the real name of the module.

In [22]:
c.__spec__

ModuleSpec(name='a.a', loader=<_frozen_importlib_external.SourceFileLoader object at 0x10c605c18>, origin='share_data/a/a.py')

In [23]:
c.__name__

'a.a'

So is the id of the actual module object. In fact, importing the same module again just returns the reference from the `sys.modules` dictionary we saw above.

In [24]:
id(c)

4502465496
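
The caching behavior holds regardless of how the import is spelled. A minimal sketch, using `json` as an example module:

```python
import importlib
import sys

import json as j1
import json as j2
j3 = importlib.import_module("json")

# Every import of an already loaded module returns the cached object.
print(j1 is j2 is j3 is sys.modules["json"])  # True
```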

## Loading built-in extension modules

As mentioned above, modules don't always come from `.py` source files. Let's look at another type of module, for example a built-in extension module like `math`.

In [25]:
import math

Let's look at the spec. Notice that the loader is `ExtensionFileLoader` and the origin is a `.so` file rather than a `.py` file, because this module is stored in a binary shared library.

In [26]:
math.__spec__

ModuleSpec(name='math', loader=<_frozen_importlib_external.ExtensionFileLoader object at 0x10ac7dcc0>, origin='/Users/vlad/.pyenv/versions/3.7.2/lib/python3.7/lib-dynload/math.cpython-37m-darwin.so')
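
How `math` is provided depends on how the interpreter was built: on some builds it is a shared library loaded by `ExtensionFileLoader`, as above, while on others it is compiled into the interpreter itself. The sketch below inspects both cases; the helper `loader_name` is a hypothetical name introduced only for this example.

```python
import math
import sys

def loader_name(module):
    """Return the loader's class name (the loader may be a class or an instance)."""
    loader = module.__spec__.loader
    return loader.__name__ if isinstance(loader, type) else type(loader).__name__

# sys is always compiled into the interpreter, so it has no __file__.
print(loader_name(sys), hasattr(sys, "__file__"))  # BuiltinImporter False

# For math the answer is build dependent, so no fixed output is shown here.
math_is_builtin = "math" in sys.builtin_module_names
print(loader_name(math), getattr(math, "__file__", None), math_is_builtin)
```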