Running `python folder_name` (or `python archive.zip`) makes Python look for a `__main__.py` file
inside that folder or zip file and, if found, run it.
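
A minimal sketch of this behaviour, assuming a throwaway folder named `demo_app` and an archive named `demo_app.zip` (both names are made up for the illustration). It launches a child interpreter on each, which should be equivalent to typing `python demo_app` and `python demo_app.zip` in a shell.

```python
import os
import subprocess
import sys
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical app folder that contains only a __main__.py
    app_dir = os.path.join(tmp, "demo_app")
    os.mkdir(app_dir)
    main_file = os.path.join(app_dir, "__main__.py")
    with open(main_file, "w") as f:
        f.write('print("running from __main__.py")\n')

    # The same entry point packed into a zip archive
    zip_path = os.path.join(tmp, "demo_app.zip")
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.write(main_file, arcname="__main__.py")

    # Run the folder and the archive with the current interpreter
    dir_run = subprocess.run([sys.executable, app_dir],
                             capture_output=True, text=True)
    zip_run = subprocess.run([sys.executable, zip_path],
                             capture_output=True, text=True)

print(dir_run.stdout.strip())
print(zip_run.stdout.strip())
```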

PEP 302

# What is a Module

## Namespaces
A namespace holds references to all the symbols we have defined and to the objects in memory that they point to.

In [None]:
# To check the namespace
globals() # Dict containing all that info

In [None]:
def my_func():
    print("hello_world!")
# We can check that our function exists inside the namespace

globals().get('my_func') # it returns the reference to the object in memory

# We could also call the function through that reference
f = globals().get('my_func')

print(f is my_func) # both names point to the same function object
f()

In [None]:
# When we are in the global scope both locals and globals return the same dict
print(locals() is globals())

True


In [None]:
# however, inside a local scope we see a different story
a = 100
print(globals()['a'])

def my_func():
    a = 3
    print(f"I am the local value for a: {locals()['a']}")
    print(f"I am the global value for a inside the function: {globals()['a']}")

my_func()

100
I am the local value for a: 3
I am the global value for a inside the function: 100


## What happens when we import a module?

It is important to note that when running the `import` statement Python:
1. Imports at run time, i.e. while our code is running. This differs from compiled languages,
where modules are compiled and linked at compile time. Note that in both cases
(compiled and interpreted) the system needs to know where those code files **exist**.

In [None]:
# Let's check how python finds modules.
import sys

# where is python installed
print(f"Python is installed in {sys.prefix}")

# Where are the C binaries located
print(f"The C Binaries are located in {sys.exec_prefix}")

Python is installed in /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8
The C Binaries are located in /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8


These two properties show that Python and its modules are in the same folder. However, if we were to create a virtual environment (venv), both properties would point to the virtual environment's directory instead, while `sys.base_prefix` and `sys.base_exec_prefix` would keep pointing to the base installation.   
So now, where does Python look for imports?

In [None]:
# Python looks for imports in
print(f"I look for imports in {sys.path}")

I look for imports in ['/Users/crestrepo/Personal/Estudio/Udemy/Python_Deep_Dive/Modules, Packages and Namespaces', '/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python38.zip', '/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8', '/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/lib-dynload', '', '/Users/crestrepo/Library/Python/3.8/lib/python/site-packages', '/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/site-packages']


In [None]:
# When we import a module and print it we see
# <module {module_name} from {place_imported}>
# NOTE: Usually built-in modules are written in C.
import math
print(math)

<module 'math' from '/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/lib-dynload/math.cpython-38-darwin.so'>


In [None]:
# This means that we now have an object module called math
print(type(math))

<class 'module'>


In [None]:
# This means that math(our module) is in our namespace.
print(globals()['math'])

# So our namespace (our dict) now holds a reference to the imported module object.
print(globals()['math'] is math)

<module 'math' from '/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/lib-dynload/math.cpython-38-darwin.so'>
True


Python goes through all the paths contained in `sys.path` and tries to find the module's location.   
At a high level, Python does the following:
1. Checks the `sys.modules` cache to see if the module has already been imported. If so, it simply uses the reference stored there; otherwise it moves to step 2.
2. Creates a new module object (`types.ModuleType`).
3. Loads the source code from the file.
4. Adds an entry to `sys.modules` with the module name as key and the newly created object as value.
5. Compiles and executes the source code.

It is important to remember that if the module gets imported again, Python does not change or load the module again; it simply looks in the cache and uses that.
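
The steps above can be sketched with the helpers in `importlib.util`. This is a simplified illustration, not what the interpreter literally runs: the function name `simple_import` is made up, it only handles top-level modules, and the loader performs "load source", "compile" and "execute" together inside `exec_module`.

```python
import importlib.util
import sys


def simple_import(name):
    """Simplified sketch of the import steps (top-level modules only)."""
    # Step 1: check the cache first
    if name in sys.modules:
        return sys.modules[name]

    # Ask the finders where the module lives
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ModuleNotFoundError(f"No module named {name!r}")

    # Step 2: create a new, empty module object
    module = importlib.util.module_from_spec(spec)

    # Step 4: register it in the cache before running its code
    sys.modules[name] = module

    # Steps 3 and 5: the loader reads, compiles and executes the code
    try:
        spec.loader.exec_module(module)
    except BaseException:
        # Do not leave a half-initialised module in the cache
        sys.modules.pop(name, None)
        raise
    return module


colorsys = simple_import("colorsys")
# A second call returns the cached object instead of loading again
print(colorsys is simple_import("colorsys"))
```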

In [None]:
# id's stay the same
import fractions
print(id(fractions))
import fractions
print(id(fractions))
# Both id's are the same

4481924448
4481924448


Modules usually don't get reloaded when a second import of them is run; the id in memory stays the same.
- There is a global cache that maintains the loaded module for further use.
- We can think of it as a singleton object.
- So modules get loaded into memory once and stored in the system cache (`sys.modules`); the globals dict only holds a reference to them.

In [None]:
# To check the module in the system cache we use
import sys
sys.modules  # all loaded modules: maps each module name to its module object

# it is a dict
print(type(sys.modules))
print(sys.modules['math'])
print(id(sys.modules['math'])) 

<class 'dict'>
<module 'math' from '/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/lib-dynload/math.cpython-38-darwin.so'>
4463955456


When Python imports a module, it gets loaded into `sys.modules` and then referenced in the globals. That way, when a module is imported again,
the first place Python looks is `sys.modules`, to check whether it has to load it at all.

In [None]:
# we can check metadata of the module
print(math.__dict__['__spec__']) # module spec

import fractions
print(fractions.__dict__['__spec__'])

# so fractions imports a .py file since it is written in python
print(fractions.__dict__['__file__']) # we can check where it imported it from.

# If we were to call globals() inside the module we would get the same dict as its __dict__

ModuleSpec(name='math', loader=<_frozen_importlib_external.ExtensionFileLoader object at 0x10a12b3a0>, origin='/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/lib-dynload/math.cpython-38-darwin.so')
ModuleSpec(name='fractions', loader=<_frozen_importlib_external.SourceFileLoader object at 0x10b24f520>, origin='/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/fractions.py')
/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/fractions.py


In [None]:
import fractions
# We can delete the entry from the globals dict and Python will no longer find the name.
print(id(fractions))
del globals()['fractions']

import fractions # however, importing again only looks in the cache and binds the same object again.
print(id(fractions))
f = fractions.Fraction()

4417724064
4417724064


In [None]:
# To further test that Python looks in the sys cache we can "Hack" it and create our module
sys.modules['test'] = lambda: 'Testing module caching'

import test
print(test())
print(test)

Testing module caching
<function <lambda> at 0x108b17940>


Notes:
- `__main__` is the module where our program starts.
- `types.ModuleType` is the type of every module object.
- We can create modules from this type.

In [None]:
import types
print(isinstance(math, types.ModuleType))

True


In [None]:
mod = types.ModuleType('test', 'This is a test')
print(mod)
print(isinstance(mod, types.ModuleType))
print(mod.__dict__)

# We can add attributes
mod.hello = lambda: 'hello!'

print(mod.hello())

# we can use
hello_func = getattr(mod, 'hello')
print(hello_func()) 
# which is the same as
hello_func2 = mod.__dict__['hello']
print(hello_func2())

<module 'test'>
True
{'__name__': 'test', '__doc__': 'This is a test', '__package__': None, '__loader__': None, '__spec__': None}
hello!
hello!
hello!


## Imports and Importlib

When searching for the module to load, Python relies on some helpers:
- finders: locate the module, since it can be in a number of different places. When importing,
Python goes through the different finders and asks each one whether it knows the module.
- loaders: retrieve the module's code and execute it once a finder has located it.
- finder + loader == importer (an object that does both jobs).
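
To make the two roles concrete, here is a minimal sketch of an object that is both a finder and a loader (commonly called an importer). The class `InMemoryImporter` and the module name `in_memory_demo` are hypothetical; the sketch serves modules from a dict of source strings instead of from files.

```python
import importlib.abc
import importlib.util
import sys


class InMemoryImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Hypothetical importer: finds and loads modules kept in a dict."""

    def __init__(self, sources):
        self.sources = sources  # {module_name: source_code}

    # Finder half: say whether we know the module
    def find_spec(self, fullname, path=None, target=None):
        if fullname in self.sources:
            return importlib.util.spec_from_loader(fullname, self)
        return None  # let the next finder in sys.meta_path try

    # Loader half: create and execute the module
    def create_module(self, spec):
        return None  # fall back to the default module creation

    def exec_module(self, module):
        exec(self.sources[module.__name__], module.__dict__)


importer = InMemoryImporter({"in_memory_demo": "answer = 42\n"})
sys.meta_path.append(importer)

import in_memory_demo

print(in_memory_demo.answer)
print(in_memory_demo.__spec__.loader is importer)
```

Appending to `sys.meta_path` means the standard finders are asked first, so this importer only answers for names nobody else can find.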

Note: `.pth` files exist to add directories to `sys.path` without hardcoding those paths in our code.
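
As a rough illustration of how a `.pth` file works, the sketch below uses two temporary directories: one stands in for a `site-packages` folder and the other for a folder we want to be importable. The file name `demo_paths.pth` is made up. `site.addsitedir` processes the `.pth` files of the directory it is given, similar to what happens for `site-packages` at interpreter startup.

```python
import os
import site
import sys
import tempfile

site_dir = tempfile.mkdtemp()   # stands in for a site-packages directory
extra_dir = tempfile.mkdtemp()  # directory we want Python to search

# A .pth file lists one path per line
with open(os.path.join(site_dir, "demo_paths.pth"), "w") as pth_file:
    pth_file.write(extra_dir + "\n")

# Add site_dir to sys.path and process the .pth files found in it
site.addsitedir(site_dir)

# Check whether the listed directory ended up in sys.path
pth_applied = any(
    os.path.isdir(p) and os.path.samefile(p, extra_dir) for p in sys.path
)
print(pth_applied)
```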

We can import using either:
- `import` statement
- `importlib.import_module` function

In [None]:
# So our modules usually have some specifications
import sys
import math
# we can see that it has the name, loader and origin of the module.
math.__spec__

ModuleSpec(name='math', loader=<_frozen_importlib_external.ExtensionFileLoader object at 0x1063f13d0>, origin='/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/lib-dynload/math.cpython-38-darwin.so')

In [None]:
# To look at the registered finders/importers we can use sys.meta_path
print(sys.meta_path)

[<class '_frozen_importlib.BuiltinImporter'>, <class '_frozen_importlib.FrozenImporter'>, <class '_frozen_importlib_external.PathFinder'>, <six._SixMetaPathImporter object at 0x10721b190>, <pkg_resources.extern.VendorImporter object at 0x108476dc0>, <pkg_resources._vendor.six._SixMetaPathImporter object at 0x108491be0>]


In [None]:
# This gives us the spec of the module
import importlib.util
importlib.util.find_spec('decimal')

ModuleSpec(name='decimal', loader=<_frozen_importlib_external.SourceFileLoader object at 0x1071bfbb0>, origin='/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/decimal.py')

In [None]:
with open('module1.py', 'w') as code_file:
    code_file.write('print("Running module1.py")\n')
    code_file.write('a=100\n')

# see if the finders know where it is.
import importlib.util
print(importlib.util.find_spec('module1'))

# now let's write another module outside of sys.path
import os
import sys

with open(os.path.join(os.environ['HOME'], "Desktop", "mo2.py"), 'w') as code_file:
    code_file.write('print("Running mo2.py")\n')
    code_file.write('a=100\n')
# it cannot be found since its folder is not in sys.path
print(importlib.util.find_spec('mo2'))
# we append the folder containing the module to sys.path
sys.path.append(os.path.join(os.environ['HOME'], "Desktop"))
# it is now able to find it.
print(importlib.util.find_spec('mo2'))

ModuleSpec(name='module1', loader=<_frozen_importlib_external.SourceFileLoader object at 0x108d7e940>, origin='/Users/crestrepo/Personal/Estudio/Udemy/Python_Deep_Dive/Modules, Packages and Namespaces/module1.py')
None
ModuleSpec(name='mo2', loader=<_frozen_importlib_external.SourceFileLoader object at 0x108ccb310>, origin='/Users/crestrepo/Desktop/mo2.py')


In [None]:
import importlib
# import_module adds the module to sys.modules
importlib.import_module('math')
# it exists
print('math' in sys.modules)
# but we cannot refer to the module by name since it does not
# exist in our global namespace
print('math' in globals())

# these two work the same way.
math2 = sys.modules['math']
math2 = importlib.import_module('math')

print('math2' in globals())

True
False
True


## Import Variants

In [None]:
import sys

# We can verify that a library like cmath is not already in the sys.modules
print('cmath' in sys.modules)

# We can check too that it is not in our global namespace
print('cmath' in globals())

# We can also see what is imported by default; there is a lot of stuff,
# though, most of it coming from Jupyter

# for name in sorted(sys.modules):
#   print(name)

False
False


### Basic Import

The import works as follows:
- check whether the module already exists in `sys.modules`.
  - if it does, just use the cached reference.
  - if it doesn't, load it and insert the reference `sys.modules -> "module_name": <module object>`
- add `module_name` to our code's **global namespace**, referencing the same object.
  - if it already exists in our global namespace, just replace that reference.

### Import module as

The import works as follows:
- check whether the module already exists in `sys.modules`.
  - if it does, just use the cached reference.
  - if it doesn't, load it and insert the reference `sys.modules -> "module_name": <module object>`
- add the alias given in the `as` clause to our code's **global namespace**, referencing the same object from `sys.modules`.
  - if it already exists in our global namespace, just replace that reference.
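
A short sketch of the aliasing behaviour, using `textwrap` purely as an example module and `tw` as an arbitrary alias. It assumes the code runs at the top level of a notebook cell or script, so that `globals()` is the namespace the import binds into.

```python
import sys
import textwrap as tw

# The alias is bound in our namespace...
print('tw' in globals())
# ...but the module's own name is not (unless it was imported elsewhere)
print('textwrap' in globals())
# sys.modules is still keyed by the real module name
print(sys.modules['textwrap'] is tw)
```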

### From module import ...

The import works as follows:
- check whether the module already exists in `sys.modules`.
  - if it does, just use the cached reference.
  - if it doesn't, load it and insert the reference `sys.modules -> "module_name": <module object>`
- add `module_func/object` to our code's **global namespace**, referencing the `<module.function/object object>`.
  - if it already exists in our global namespace, just replace that reference.
  - the module itself is not in our namespace, only the reference to the specified function/object.

In [None]:
# does not exist in sys.modules
print('cmath' in sys.modules)

# import statement
from cmath import exp

# loaded cmath in sys.modules
print('cmath' in sys.modules)
# did not load cmath in globals
print('cmath' in globals())
# created reference to exp in globals
print('exp' in globals())

# how to reference it
cmath = sys.modules['cmath']

False
True
False
True


### From module import ... as ...

The import works as follows:
- check whether the module already exists in `sys.modules`.
  - if it does, just use the cached reference.
  - if it doesn't, load it and insert the reference `sys.modules -> "module_name": <module object>`
- add the symbol given in the `as` clause to our code's **global namespace**, referencing the `<module.function/object object>`.
  - if it already exists in our global namespace, just replace that reference.
  - the module itself is not in our namespace, only the reference to the specified function/object.
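
For illustration, the sketch below imports `cmath.sqrt` under the arbitrary alias `csqrt`. Only the alias is bound in our namespace, and it refers to the very same function object that lives in the cached module.

```python
import sys
from cmath import sqrt as csqrt

# The module was loaded and cached
print('cmath' in sys.modules)
# The alias points at the same object as cmath.sqrt
print(csqrt is sys.modules['cmath'].sqrt)
# The complex square root of -1
print(csqrt(-1))
```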

### From module import *

The import works as follows:
- check whether the module already exists in `sys.modules`.
  - if it does, just use the cached reference.
  - if it doesn't, load it and insert the reference `sys.modules -> "module_name": <module object>`
- add all public `module_func/object` symbols (names not starting with an underscore, or the names listed in the module's `__all__`) to our code's **global namespace**.
  - if a symbol already exists in our global namespace, just replace that reference.
  - all those symbols **but** the module name get defined.

This variant can lead to bugs in our code: if two modules define objects/functions with the same symbol (variable name), the later `import *` silently overwrites the earlier one.

In [None]:
from cmath import *

# all references are loaded
for key in globals():
  pass
  # print(key)

# but if we were to do this
globals()['sin']
from math import *
# now points to math.sin and not cmath.sin
globals()['sin']

<function math.sin>

In every case the module had to be loaded into memory and referenced in `sys.modules`. So no import variant is more efficient at loading than another; it all comes down to how we reference what we want in our global namespace.
- In reality, for modules we only affect which symbols we place in our code's namespace.
- The only efficiency difference is at call time: `module.function/object` has to look up the `module` first and then the `function/object`, while a direct reference only needs the lookup for the `function/object` (since these are essentially dictionary lookups, there is not much difference).

In the end, don't choose an import style for efficiency; choose it for readability.

In [None]:
# let's check some efficiency
from time import perf_counter
from collections import namedtuple

Timings = namedtuple('Timings', 'timing_1m timing_2, abs_diff rel_diff_perc')

def compare_timings(timing1, timing2):
  rel_diff = (timing2 - timing1) / timing1 * 100

  timings = Timings(round(timing1, 1),
                    round(timing2, 1),
                    round(timing2 - timing1, 1),
                    round(rel_diff, 2))
  return timings

In [None]:
# let's do some test with iterations
test_repeats = 10_000_000

In [None]:
import math

start = perf_counter()
for _ in range(test_repeats):
  math.sqrt(2)
finish = perf_counter()

elapsed_fully_qualified = finish - start
print(f'elapsed: {elapsed_fully_qualified}')

elapsed: 1.4263942390000466


In [None]:
from math import sqrt

start = perf_counter()
for _ in range(test_repeats):
  sqrt(2)
finish = perf_counter()

elapsed_direct_symbol = finish - start
print(f'elapsed: {elapsed_direct_symbol}')

elapsed: 1.0784087579997959


In [None]:
import math

def func():
  math.sqrt(2)

start = perf_counter()
for _ in range(test_repeats):
  func()
finish = perf_counter()

elapsed_func_symbol = finish - start
print(f'elapsed: {elapsed_func_symbol}')

elapsed: 2.227085728999782


In [None]:
from math import sqrt

def func():
  sqrt(2)

start = perf_counter()
for _ in range(test_repeats):
  func()
finish = perf_counter()

elapsed_nested_func_direct_symbol = finish - start
print(f'elapsed: {elapsed_nested_func_direct_symbol}')

elapsed: 1.8495281240000168


In [None]:
def func():
  import math
  math.sqrt(2)

start = perf_counter()
for _ in range(test_repeats):
  func()
finish = perf_counter()

elapsed_func_direct_symbol = finish - start
print(f'elapsed: {elapsed_func_direct_symbol}')

elapsed: 3.5456399209999745


In [None]:
def func():
  from math import sqrt
  sqrt(2)

start = perf_counter()
for _ in range(test_repeats):
  func()
finish = perf_counter()

elapsed_func_direct_symbol_2 = finish - start
print(f'elapsed: {elapsed_func_direct_symbol_2}')

elapsed: 10.702530687000035


In [None]:
compare_timings(elapsed_fully_qualified, elapsed_direct_symbol)

Timings(timing_1m=1.4, timing_2=1.1, abs_diff=-0.3, rel_diff_perc=-24.4)

In [None]:
compare_timings(elapsed_func_symbol, elapsed_func_direct_symbol)

Timings(timing_1m=1.8, timing_2=1.8, abs_diff=0.0, rel_diff_perc=2.24)
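
The same comparison can be written more compactly with the standard-library `timeit` module. This is only a sketch: the iteration count is reduced to keep it quick, and the numbers will vary between machines and Python versions.

```python
import timeit

runs = 200_000  # far fewer iterations than the cells above

# Fully qualified lookup: module first, then attribute
qualified = timeit.timeit('math.sqrt(2)', setup='import math', number=runs)
# Direct symbol: a single name lookup
direct = timeit.timeit('sqrt(2)', setup='from math import sqrt', number=runs)

print(f'math.sqrt: {qualified:.4f}s')
print(f'sqrt:      {direct:.4f}s')
print(f'relative difference: {(direct - qualified) / qualified * 100:.1f}%')
```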

## Reloading Modules

In [None]:
# let's create a file with a single function on our local system

import os

def create_module_file(module_name, **kwargs):
  """Create a module file named <module_name>.py
  Module has a single function (print_values) that will print
  out the supplied (stringified) kwargs
  """

  module_file_name = f"{module_name}.py"
  module_rel_file_path = module_file_name
  module_abs_file_path = os.path.abspath(module_rel_file_path)

  with open(module_abs_file_path, 'w') as f:
    f.write(f"# {module_name}.py\n\n")
    f.write(f"print('running {module_file_name}...')\n\n")
    f.write(f"def print_values():\n")
    for key, value in kwargs.items():
      f.write(f"\tprint('{str(key)}', '{str(value)}')\n")

In [None]:
create_module_file('test', k1=10, k2='python')

In [None]:
# we can now import it
import test
test.print_values()

k1 10
k2 python


In [None]:
# Now let's create a new one that overwrites the current test.
create_module_file('test', k1=10, k2='python', k3='cheese')

In [None]:
# since it is already in sys.modules, importing again just returns the cached
# module, even though the file on disk was updated.
import test

test.print_values()

k1 10
k2 python
k3 cheese
k4 2


In [None]:
# We can force a fresh load by removing the cached entry
del sys.modules['test']

# and re-importing so the reference gets updated
import test

test.print_values()

# This is not a good approach since only the code that runs AFTER this import
# gets the updated test; any earlier reference still points to the old module object.

running test.py...
k1 10
k2 python
k3 cheese


In [None]:
# Now let's create another one that overwrites the current test.
create_module_file('test', k1=10, k2='python', k3='cheese', k4=2)

In [None]:
import importlib
# reload keeps the same memory address: it UPDATES the existing module object,
# so any previous references to the module see the new code.
importlib.reload(test)

# This is limited, however, when instead of the module we hold a direct
# reference to a function inside it (from test import print_values): that name
# in our globals() keeps pointing to the old function object, since only the
# module object in sys.modules gets updated.

running test.py...


<module 'test' from '/content/test.py'>
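
The limitation described in the comments above can be reproduced with a small throwaway module. The module name `reload_demo` and its `version` function are hypothetical, and bytecode writing is switched off (for the rest of the session) so that a stale `.pyc` file cannot interfere with the demo.

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True  # avoid stale .pyc files in this demo
module_dir = tempfile.mkdtemp()
sys.path.insert(0, module_dir)
module_path = os.path.join(module_dir, "reload_demo.py")

with open(module_path, "w") as f:
    f.write("def version():\n    return 'old'\n")
importlib.invalidate_caches()  # make sure the finders notice the new file

import reload_demo
from reload_demo import version  # direct reference to the function object

# Overwrite the module on disk with new code
with open(module_path, "w") as f:
    f.write("def version():\n    return 'new and longer'\n")

importlib.reload(reload_demo)

print(reload_demo.version())  # looked up through the reloaded module
print(version())              # still the function object from before
```

The first call goes through the module object, which was updated in place, so it sees the new code. The directly imported name was bound before the reload and keeps returning `'old'` until it is imported again.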

## Recap

When a module is imported:
1. The system cache `sys.modules` is checked first -> if the module is cached, the cached reference is returned.
2. If not found, the **finders** in `sys.meta_path` are used to locate it.
3. Once located, the code is retrieved by a **loader**, which the `finder` returns as part of the `ModuleSpec`.
4. An empty module object is created and a reference to it is added to `sys.modules` (the reference is added before the code is compiled and executed, so that circular imports find the cached entry instead of loading the module again).
5. Module is compiled.
6. Module is executed.

Module Finders

Found in `sys.meta_path`.
- `_frozen_importlib.BuiltinImporter` -> finds built-ins, such as math.
- `_frozen_importlib.FrozenImporter` -> finds frozen modules (self-contained Python apps).
- `_frozen_importlib_external.PathFinder` -> finds file-based modules, searching `sys.path` and package `__path__`.
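
To see which of these finders answers for a given module, we can ask each entry of `sys.meta_path` in order. A minimal sketch, with `which_finder` being a made-up helper name; the output for file-based modules depends on the local installation.

```python
import sys


def which_finder(name):
    """Return (finder, spec) for the first finder that locates `name`."""
    for finder in sys.meta_path:
        find_spec = getattr(finder, "find_spec", None)
        if find_spec is None:  # skip finders without a find_spec method
            continue
        spec = find_spec(name, None)
        if spec is not None:
            return finder, spec
    return None, None


for name in ("sys", "fractions"):
    finder, spec = which_finder(name)
    print(name, "->", finder, spec.origin)
```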

Built in module properties

In [6]:
# built-in
import math

# Type of module
print(type(math))

# Specs of module
print(math.__spec__)

# Name of module
print(math.__name__)

# Package?, should be empty
print(math.__package__)

# File (Built-ins don't have that)
# math.__file__

<class 'module'>
ModuleSpec(name='math', loader=<class '_frozen_importlib.BuiltinImporter'>, origin='built-in')
math



Standard Library

In [7]:
# standard library (written in Python)
import fractions

# Type of module
print(type(fractions))

# Specs of module
print(fractions.__spec__)

# Name of module
print(fractions.__name__)

# Package?, should be empty
print(fractions.__package__)

# File (modules loaded from source files have one)
print(fractions.__file__)

<class 'module'>
ModuleSpec(name='fractions', loader=<_frozen_importlib_external.SourceFileLoader object at 0x7f21b168b510>, origin='/usr/lib/python3.7/fractions.py')
fractions

/usr/lib/python3.7/fractions.py
