# Lecture 25 Module Coding Basics

In this lecture, we will continue studying modules in Python. We will explain in more detail how the `import` and `from` statements work, and we will cover other related topics.

The subsections in the lecture include:
- [Module Creation](#section1)
- [The `import` Statement](#section2)
- [The `from` Statement](#section3)
- [Module Namespaces](#section4)
- [Reloading Modules](#section5)
- [Modules Usage Modes: `__name__` and `__main__`](#section6)

## Module Creation <a id="section1"/>

As we explained in the previous lecture, every Python file with code is referred to as a ***module***.

To create modules, we don’t need to write special syntax to tell Python that we are making a module. We can simply use any text editor to type Python code into a text file, and save it with a `.py` extension; any such file is automatically considered a Python module. 

For example, I have created a simple file called `my_module.py` that is saved in the same directory as this Jupyter notebook. The module does not do anything useful; it just defines a few names and prints a few statements, but it will help us understand concepts about modules. The code inside `my_module.py` is shown below.

<img style="float: left; height:280px;" src="images/pic1.jpg">

Module names follow the same rules as other variable names in Python: they can contain only letters, digits, and underscores, and they cannot begin with a digit. Module names also cannot be Python reserved keywords (e.g., a file named `if.py` cannot be imported with `import if`).
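
As a minimal sketch of the naming rules above, the following code checks whether a string could be used as a module name in a plain `import` statement. The helper `is_importable_module_name` is a hypothetical name introduced only for this illustration; it relies on the standard `str.isidentifier` method and the `keyword` module.

```python
import keyword


def is_importable_module_name(name):
    """Return True if `name` could appear in a plain `import name` statement."""
    # A usable module name must be a valid identifier and must not be a reserved keyword.
    return name.isidentifier() and not keyword.iskeyword(name)


for candidate in ["my_module", "module_no_2", "if", "2fast", "my-module"]:
    print(candidate, is_importable_module_name(candidate))
```

Only the first two candidates pass the check: `if` is a reserved keyword, `2fast` begins with a digit, and `my-module` contains a hyphen.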

## The `import` Statement <a id="section2"/>

Python programs can use the module file we have created by running an `import` or `from` statement. These statements find, compile, and run a module file’s code. The main difference is that `import` fetches the module as a whole, while `from` fetches specific names out of the module.

Let's import `my_module`. Python executes the statements in the module file one after another, from the top of the file to the bottom. For this module, the two print statements at the top level of the file are executed. The print statements inside the two functions (`main_report` and `sub_report`) are not executed; they will be executed only when the functions are called.

In [1]:
import my_module

I am inside my_module
The value of the variable X is: 3


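Note that the `import` statement takes the module name without the `.py` extension. Writing `import my_module.py` raises a `ModuleNotFoundError`, because Python interprets the dot as package syntax and looks for a submodule named `py` inside a package named `my_module`.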
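
The effect of adding the `.py` extension can be reproduced with a standard library module. This is only an illustrative sketch: it uses `importlib.import_module`, which performs the same lookup as the `import` statement, and the built-in `math` module stands in for `my_module`.

```python
import importlib

try:
    # Equivalent to writing `import math.py`: Python looks for a submodule
    # named `py` inside a package named `math`, which does not exist.
    importlib.import_module("math.py")
except ModuleNotFoundError as error:
    failed = True
    print(type(error).__name__, error)
else:
    failed = False
```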

When the module is imported, a new module object is created. The module object is shown below, where Python mapped the module name to an external filename by adding a directory path from the module search path to the file, and a `.py` extension at the end. 

In [2]:
# The name my_module refers to the loaded module object
my_module

<module 'my_module' from 'C:\\Users\\Alex\\Documents\\Codes\\My Folder 2020\\Libraries\\Module 7\\my_module.py'>

Overall, the name `my_module` serves two different purposes: 
1. It identifies the external file `my_module.py` that needs to be loaded.
2. After the module is loaded, it becomes a reference to the module object.

During importing, all the names assigned at the top level of the module become attributes of the module object. In this example, the variables `X` and `Y` and the functions `sub_report` and `main_report` become attributes of the module, and we can access them by using the `object.attribute` syntax (a.k.a. *qualification*). 

In [3]:
my_module.X

3

In [4]:
my_module.Y

5

In [5]:
my_module.sub_report()

The value of the variable Z is: 8
I am a function named sub_report


## The `from` Statement <a id="section3"/>

The `from` statement fetches specific names from the module and allows us to use the names directly (without the need for `module_object.attribute`). This way, we can use the names in the module with less typing.

In [6]:
from my_module import X
X

3

The `from` statement in effect copies the names out of the module into another scope; in this case, into the scope of this Jupyter notebook, where the `from` statement appears.

When we run a `from` statement, internally Python first imports the entire module file as usual, then copies the specific names out of the module, and finally discards the name that refers to the module object (the module itself stays loaded in memory, and the file is not deleted). This is similar to the following code:

```python
import my_module 
X = my_module.X
del my_module
```
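
One consequence of this copying is that the imported name and the module attribute are independent: rebinding the copied name does not change the module. The sketch below illustrates this under stated assumptions; it writes a throwaway module to a temporary folder, and the module name `demo_from_mod` is hypothetical and used only for this example.

```python
import importlib
import pathlib
import sys
import tempfile

# Create a throwaway module file (hypothetical name) in a temporary folder.
tmp_dir = tempfile.mkdtemp()
pathlib.Path(tmp_dir, "demo_from_mod.py").write_text("X = 3\n")
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

from demo_from_mod import X

print(X)                                   # 3
print("demo_from_mod" in globals())        # False: the module name itself was not bound
print("demo_from_mod" in sys.modules)      # True: the module is still loaded

X = 99                                     # rebinds only the copied name
print(sys.modules["demo_from_mod"].X)      # 3: the module attribute is unchanged
```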

With `from`, we can also import several names at the same time, separated by commas.

In [7]:
from my_module import X, Y, sub_report

In [8]:
sub_report()

The value of the variable Z is: 8
I am a function named sub_report


Another alternative is to use a `*` instead of specific names, which fetches all names assigned at the top level of the referenced module (by default, all names that do not begin with an underscore). The following code fetches all four names in our module: `X`, `Y`, `sub_report`, and `main_report`. Note again that the names `Z` and `U` are not defined at the top level of the module but inside the functions, and therefore they cannot be fetched with the `from` statement or accessed as module attributes.

In [10]:
from my_module import *
main_report()

The value of the variable U is: 10
I am a function named main_report
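
To see exactly which names a star import copies, we can run it inside a fresh namespace and inspect the result. The following is a sketch under stated assumptions: the module `demo_star_mod` and its contents are hypothetical, created in a temporary folder only for this illustration. By default, names that begin with an underscore are skipped (a module can also control the list explicitly by defining `__all__`).

```python
import importlib
import pathlib
import sys
import tempfile

# Hypothetical throwaway module with two public names and one underscore-prefixed name.
source = (
    "X = 3\n"
    "_hidden = 7\n"
    "def report():\n"
    "    return 'I am a function named report'\n"
)
tmp_dir = tempfile.mkdtemp()
pathlib.Path(tmp_dir, "demo_star_mod.py").write_text(source)
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

# Run the star import in an empty namespace so we can list what was copied.
namespace = {}
exec("from demo_star_mod import *", namespace)
copied = sorted(name for name in namespace if not name.startswith("__"))
print(copied)  # ['X', 'report']
```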


One problem with using `from module import *` is that it can silently overwrite variables that happen to have the same name as existing variables in our scope.

In the following example, we have a variable `X = 15`, which was overwritten by the variable `X` with the same name in `my_module` which has the value 3. The way this variable was overwritten may not be obvious (e.g., in large modules with many variables we cannot remember and keep track of all variable names).

In [12]:
X = 15
from my_module import *
print(X)

3


On the other hand, if we use `import`, all names will be defined only within the scope of the module, and the names will not collide with other names in our programs.

In [13]:
X = 15
import my_module
# The print statements this time were not displayed (we will explain later why)

In [15]:
print(X)
print(my_module.X)

15
3


Therefore, programmers need to be careful when using the `from` statement (especially with `*`), and the plain `import` statement should generally be preferred. However, `from` requires less typing, and it is still very commonly used.

### When Using `import` is Required

When the same name of a variable or function is defined in two different modules, and we need to use both of the names at the same time, then we must use the `import` statement. 

For instance, let's assume that another module file named `module_no_2.py` also contains a variable `X` and a function `main_report`.

<img style="float: left; height:220px;" src="images/pic2.png">

Using `import`, we can access the two different variables `X`, because including the name of the enclosing module makes the two names unique.

In [1]:
import my_module # the first import of a module executes its code
import module_no_2 # also a first import, so it is executed too; repeated imports are not executed again

I am inside my_module
The value of the variable X is: 3
I am inside module_no_2
The value of the variable X is: 22


In [2]:
print(my_module.X)
print(module_no_2.X)

3
22


The same holds for the function `main_report` which appears in both modules.

In [17]:
my_module.main_report()
module_no_2.main_report()

The value of the variable U is: 10
I am a function named main_report
The value of the variable Y is: 15
I am a function named main_report


In this case, the `from` statement does not work as intended, because the name `X` can refer to only one object at a time in a given scope: the second `from` silently replaces the value fetched by the first.

In [18]:
# Only one variable name X can exist at one time
from my_module import X
from module_no_2 import X
print(X)

22


Another way to resolve the name clash is to use the `as` extension to `from`/`import`, which allows us to import a name under another name that is used as a synonym.

In [19]:
from my_module import X as X1
from module_no_2 import X as X2
print(X1)
print(X2)

3
22
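
The same renaming works with any modules that share a name. As a small sketch using only the standard library, both `math` and `cmath` define a function named `sqrt`; the aliases `real_sqrt`, `complex_sqrt`, and `m` are names chosen just for this example.

```python
from math import sqrt as real_sqrt       # square root for real numbers
from cmath import sqrt as complex_sqrt   # square root for complex numbers
import math as m                         # `as` also works with a whole module

print(real_sqrt(16))      # 4.0
print(complex_sqrt(-16))  # 4j
print(m.sqrt(16))         # 4.0
```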


## Module Namespaces <a id="section4"/>

Modules can be understood as just places where collections of names are defined that we want to make visible to the rest of our code. These collections of names live in the module's namespace and represent the attributes of the module object.

To list the names in the namespace of the `my_module` object, we can use the built-in `dir` function. The list includes the names we assigned in the module file: `X`, `Y`, `main_report`, and `sub_report`. However, Python also adds some names to the module’s namespace for us; for instance, `__file__` gives the path to the file the module was loaded from, and `__name__` gives the module name.

In [20]:
dir(my_module)

['X',
 'Y',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'main_report',
 'sub_report']

Internally, the module namespaces created by imports are stored as dictionary objects. Module namespaces can also be accessed through the built-in `__dict__` attribute associated with module objects, where the names are dictionary keys.

In [22]:
my_module.__dict__.keys()

dict_keys(['__name__', '__doc__', '__package__', '__loader__', '__spec__', '__file__', '__cached__', '__builtins__', 'X', 'Y', 'sub_report', 'main_report'])

In [24]:
my_module.__dict__['__file__']

'C:\\Users\\Alex\\Documents\\Codes\\My Folder 2020\\Libraries\\Module 7\\my_module.py'

In [26]:
my_module.__dict__['__name__']

'my_module'
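
The different ways of reaching a module's namespace all lead to the same objects. The sketch below uses the standard `math` module in place of `my_module`, so that it runs without any extra files.

```python
import math

namespace = vars(math)                   # vars() returns the module's __dict__
print(namespace is math.__dict__)        # True
print(namespace["pi"] == math.pi)        # True: dictionary key and attribute agree
print(getattr(math, "pi") == math.pi)    # True: getattr is qualification by name
print("pi" in dir(math))                 # True
```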

## Reloading Modules <a id="section5"/>

As we have seen, when we `import` a module, the code is executed only once when the module is imported the first time. Subsequent imports use the already loaded module object without reloading or rerunning the file’s code.
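
The run-once behavior can be checked by counting how many times a module's top-level `print` statement runs. This is a sketch under stated assumptions: the module `demo_once_mod` is a hypothetical throwaway file written to a temporary folder. Loaded modules are kept in the `sys.modules` dictionary, which is where later imports find them.

```python
import contextlib
import importlib
import io
import pathlib
import sys
import tempfile

# Hypothetical throwaway module whose top-level code prints a message.
tmp_dir = tempfile.mkdtemp()
pathlib.Path(tmp_dir, "demo_once_mod.py").write_text('print("running demo_once_mod")\n')
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    import demo_once_mod   # first import: the file's code is executed
    import demo_once_mod   # second import: the loaded module object is reused

run_count = captured.getvalue().count("running demo_once_mod")
print(run_count)                                       # 1
print(sys.modules["demo_once_mod"] is demo_once_mod)   # True
```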

To force a module’s code to be reloaded and rerun, you need to ask Python to do so explicitly by calling the `reload` function from the standard `importlib` module. The `reload` function reruns a module file’s code and overwrites its existing namespace, rather than deleting the module object and re-creating it. Also, `reload` returns the module object, which the notebook displays as the output of the cell.

In [21]:
import my_module

In [22]:
from importlib import reload  # the older imp module is deprecated and was removed in Python 3.12
reload(my_module)

I am inside my_module
The value of the variable X is: 3


<module 'my_module' from 'C:\\Users\\Alex\\Documents\\Codes\\My Folder 2020\\Libraries\\Module 7\\my_module.py'>

Reloading is useful when we edit a module file and want to pick up the changes without restarting the session. In Jupyter notebooks, an alternative is to restart the kernel and import the module again, which runs the updated file without using `reload`.
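
The edit-and-reload cycle can be sketched as follows. The module `demo_reload_mod` is a hypothetical throwaway file created in a temporary folder for this illustration, and the "edit" is simulated by rewriting the file from code. Note that a name copied earlier with `from` keeps its old value after the reload.

```python
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True   # avoid cached bytecode so the edited source is always re-read

# Hypothetical throwaway module file in a temporary folder.
tmp_dir = tempfile.mkdtemp()
module_path = pathlib.Path(tmp_dir, "demo_reload_mod.py")
module_path.write_text("X = 3\n")
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

import demo_reload_mod
from demo_reload_mod import X as copied_x
before = demo_reload_mod.X

module_path.write_text("X = 4242\n")                 # simulate editing the file on disk
same_object = importlib.reload(demo_reload_mod)      # rerun the file in the existing module object

print(before, demo_reload_mod.X)         # 3 4242
print(same_object is demo_reload_mod)    # True: the module object was updated in place
print(copied_x)                          # 3: the earlier copy is not updated
```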

## Modules Usage Modes: `__name__` and `__main__` <a id="section6"/>

We mentioned before that each module has a built-in attribute called `__name__`, which
Python assigns automatically to all module objects. The attribute is assigned as follows:
- If the file is being imported by using the `import` statement, `__name__` is set to the module’s name.
- If the file is being run as a top-level program file, `__name__` is set to the string `__main__`.

Let's check it with an example. The module file `module_no_3` is shown below; its first line prints the `__name__` attribute so that we can confirm the rules above.

<img style="float: left; height:180px;" src="images/pic3.jpg">

As expected, `__name__` is set to `module_no_3` when the module is imported, and to `__main__` when it is run directly.

In [27]:
# The module is imported
import module_no_3

Print the built-in attribute name of the module: module_no_3
The value of the variable X is: 1


In [30]:
# The module is run by passing it as a command to the Python interpreter
!python module_no_3.py

Print the built-in attribute name of the module: __main__
The value of the variable X is: 1


The advantage is that a module can use the `__name__` attribute in the test `if __name__ == '__main__'` to determine whether it is being run or imported. The most common use is self-test code written at the bottom of a file under the `__name__` test.

For instance, the file `module_no_3a` is similar to the file `module_no_3`, only that it includes several lines of code at the bottom, which test whether the function `CelsiusToFahrenheit` outputs expected values. When run as a command in the cell, the `if __name__ == '__main__'` is True, and the lines that test the outputs of the `CelsiusToFahrenheit` are run. Conversely, when the module file is imported, the various variables and functions are imported, but the `if __name__ == '__main__'` is False, and the lines that test the outputs of the `CelsiusToFahrenheit` are not run. 

<img style="float: left; height:280px;" src="images/pic4.jpg">

In [38]:
!python module_no_3a.py

Print the built-in attribute name of the module: __main__
The value of the variable X is: 1
--------------------
Self-testing
100 degrees Fahrenheit is 37.77777777777778 degrees Celsius
32 degrees Fahrenheit is 0.0 degrees Celsius
0 degrees Fahrenheit is -17.77777777777778 degrees Celsius


In [39]:
import module_no_3a

Print the built-in attribute name of the module: module_no_3a
The value of the variable X is: 1


The above code allows us to test the logic in our code without having to retype everything in a notebook cell or at the interactive command line each time we edit the file. Besides, the output of the self-test code will not appear every time this file is imported from another file.

Functions defined in files with the `__name__` test can be run as standalone functions, and they can also be reused in other programs.
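
As a minimal sketch of this pattern, a file with a reusable function and a self-test block might look like the code below. The function `fahrenheit_to_celsius` is a hypothetical example written for this illustration; it is not the code from the module files shown in the images.

```python
def fahrenheit_to_celsius(degrees_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (degrees_f - 32) * 5 / 9


if __name__ == "__main__":
    # Self-test code: runs only when the file is executed directly,
    # and is skipped when the file is imported as a module.
    for degrees_f in (212, 32):
        print(degrees_f, "degrees Fahrenheit is",
              fahrenheit_to_celsius(degrees_f), "degrees Celsius")
```

If this code were saved in a file and imported from another program, only the function definition would run, and the function could then be called as an attribute of the module.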