# Python Scripts, Modules and Packages

There are two ways a Python source file can be executed:
- as a main program/script
- as a module/package

When a Python file is executed as a script, its `__name__` variable is set to the string `'__main__'`. When it is imported as a module, `__name__` is set to the module's name.

In [3]:
__name__

'__main__'
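
A common idiom that builds on this is to keep a script's logic in a function and call it only when `__name__` is `'__main__'`. The following is a minimal sketch; `main` and the text it returns are illustrative names, not part of this notebook's files.

```python
def main():
    # Hypothetical entry point: the real work of the script would go here.
    return "running as a script"


# This branch only runs when the file is executed directly,
# not when it is imported from another file.
if __name__ == "__main__":
    print(main())
```

Importing a file written this way defines `main` without calling it, so the same file can act as both a script and a module.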

## Modules and Packages

Large Python programs are typically organised into modules and packages. A module or a package is a way of wrapping up your code into nice organisational units. This has several benefits:
- Logic in modules/packages can be reused in other scripts or programs. 
- Modules/packages in a program allow for better code design and separation of responsibilities. 

### Modules
Any Python source file (`.py`) can be used as a module. Modules are loaded with the `import` statement.

### Import priority

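When you run an import statement, Python searches the directories listed in `sys.path`, in order. The first places it looks are:

1. The directory of the running script (or your current working directory in an interactive session or notebook).
2. Directories listed in your `PYTHONPATH` environment variable.

After these come the standard library and `site-packages` directories of your Python installation.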
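
To check where Python would find a module without actually importing it, one option is `importlib.util.find_spec`. The sketch below assumes the standard library module `json` is available; `no_such_module_for_demo` is a made-up name that is assumed not to exist on your machine.

```python
import importlib.util
import sys

# The first entry of sys.path is searched first.
print(sys.path[0])

# find_spec returns a spec describing where the module lives, without importing it.
found = importlib.util.find_spec("json")
print(found.origin)  # path to the file that "import json" would load

# For a top-level name that cannot be found, find_spec returns None.
missing = importlib.util.find_spec("no_such_module_for_demo")
print(missing)
```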

In [4]:
import os # This is a file called: os.py that lives somewhere on my computer
import sys

In [5]:
# get current working directory.
os.getcwd() # First an import statement will look here

'c:\\Users\\GarethDavies\\OneDrive - Kubrick Group\\cohorts\\DE\\de28\\Python\\4-Production\\1-ModulesAndPackages'

In [6]:
# sys.path lists the directories Python searches when importing (it includes PYTHONPATH entries)
sys.path

['c:\\Users\\GarethDavies\\OneDrive - Kubrick Group\\cohorts\\DE\\de28\\Python\\4-Production\\1-ModulesAndPackages',
 'c:\\Users\\GarethDavies\\anaconda3\\python39.zip',
 'c:\\Users\\GarethDavies\\anaconda3\\DLLs',
 'c:\\Users\\GarethDavies\\anaconda3\\lib',
 'c:\\Users\\GarethDavies\\anaconda3',
 '',
 'c:\\Users\\GarethDavies\\anaconda3\\lib\\site-packages',
 'c:\\Users\\GarethDavies\\anaconda3\\lib\\site-packages\\locket-0.2.1-py3.9.egg',
 'c:\\Users\\GarethDavies\\anaconda3\\lib\\site-packages\\win32',
 'c:\\Users\\GarethDavies\\anaconda3\\lib\\site-packages\\win32\\lib',
 'c:\\Users\\GarethDavies\\anaconda3\\lib\\site-packages\\Pythonwin',
 'c:\\Users\\GarethDavies\\anaconda3\\lib\\site-packages\\IPython\\extensions',
 'C:\\Users\\GarethDavies\\.ipython']

#### Creating our own custom module


In [None]:
# Save the following content into a file called hello.py
# 
import pandas as pd

my_list = [1,2,3]
my_series = pd.Series(my_list, index = ['a','b','c'])

def make_df(cols, rows):
    data = {c:[str(c)+str(r) for r in rows] for c in cols}
    return pd.DataFrame(data)

my_df = make_df('abc','1234')

if __name__ == '__main__':
    print(f'__name__ is {__name__}' )
    print('This file is running as a script')
else:
    print(f'__name__ is {__name__}' )
    print('This file is running as a module')

Run the above file as a script with `python hello.py` from the command line and check the output. Then run `import hello` from a Python session and compare the two outputs.

#### Loading our module

In [7]:
import hello

__name__ is hello
This file is running as a module


In [8]:
dir(hello)

['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'make_df',
 'my_df',
 'my_list',
 'my_series',
 'pd']

In [9]:
# Getting variables from hello
hello.my_list

[1, 2, 3]

In [11]:
# Calling functions from hello
new_df = hello.make_df("abc", "1234")
new_df

    a   b   c
0  a1  b1  c1
1  a2  b2  c2
2  a3  b3  c3
3  a4  b4  c4


In [12]:
import hello # Calling import for the second time in the file won't reload the module

To reload an already imported module, we can use the `reload` function from `importlib`.

In [13]:
from importlib import reload

In [15]:
reload(hello) # When you import a module, all code in the module will be run

    a   b   c
0  a1  b1  c1
1  a2  b2  c2
2  a3  b3  c3
3  a4  b4  c4
__name__ is hello
This file is running as a module


<module 'hello' from 'c:\\Users\\GarethDavies\\OneDrive - Kubrick Group\\cohorts\\DE\\de28\\Python\\4-Production\\1-ModulesAndPackages\\hello.py'>
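
The difference between a repeated `import` and `reload` can be seen in a small self-contained experiment. This sketch writes a throwaway module called `reload_demo` (a hypothetical name) into a temporary directory, edits it on disk, and then compares the two approaches.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway module in a temporary directory and make it importable.
module_dir = Path(tempfile.mkdtemp())
module_file = module_dir / "reload_demo.py"
module_file.write_text("VALUE = 1\n")
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()

import reload_demo
first_value = reload_demo.VALUE

# Change the file on disk, as if we had edited the module.
module_file.write_text("VALUE = 200\n")

import reload_demo  # no effect: the module is already cached in sys.modules
after_second_import = reload_demo.VALUE

importlib.reload(reload_demo)  # re-runs the module's code from the file
after_reload = reload_demo.VALUE

print(first_value, after_second_import, after_reload)  # 1 1 200
```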

#### Import specific names from the module

`from <module> import <symbol1>, <symbol2>`

In [17]:
# Just import my_list and make_df, 
# the other variables and functions won't exist in this file's namespace
from hello import my_list, make_df

In [19]:
# Now my_list and make_df are directly accessible in this namespace
# Note: I don't have to use the hello. syntax
my_list

[1, 2, 3]

In [22]:
make_df("abc", "123")

    a   b   c
0  a1  b1  c1
1  a2  b2  c2
2  a3  b3  c3


#### Import all names (Aside)
You can import all elements from a module by using the following syntax:
`from <module> import *`

By default, this imports every public name (any name that does not start with an underscore) into the current namespace. You can control which names are imported by defining an `__all__` variable within the module. For example, if `test_module.py` contains `__all__ = ['my_variable']`, then only `my_variable` is imported when you run `from test_module import *`.

**Note**: While this is possible, it is considered bad practice to import all objects into the current namespace, because it can silently overwrite existing names and makes it hard to tell where a name came from.
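
The effect of `__all__` can be checked with a small experiment. The sketch below writes a hypothetical module `star_demo` to a temporary directory and runs the star import inside a separate dictionary namespace (via `exec`), so the names it brings in can be listed without cluttering the current namespace.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Write a throwaway module that restricts its star-import names with __all__.
module_dir = Path(tempfile.mkdtemp())
(module_dir / "star_demo.py").write_text(
    "__all__ = ['my_variable']\n"
    "my_variable = 1\n"
    "other_variable = 2\n"
)
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()

# Run the star import in its own namespace so we can inspect the result.
namespace = {}
exec("from star_demo import *", namespace)

# Ignore the dunder entries that exec adds (such as __builtins__).
imported_names = sorted(name for name in namespace if not name.startswith("__"))
print(imported_names)  # ['my_variable']
```

`other_variable` is still reachable as `star_demo.other_variable` after a plain `import star_demo`; `__all__` only affects the star form.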


## Packages
Packages allow a collection of modules and subpackages to be grouped together under a common package name. 

A package is defined by creating a directory with the same name as the package and adding an `__init__.py` file to that directory. When we import a package, its `__init__.py` file is run.

In the directory, it is possible to have additional modules or subpackages. 
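
As a rough illustration of that layout, the sketch below builds a tiny package in a temporary directory and imports it. The names `demo_package` and `demo_module` are hypothetical stand-ins, not the `my_package` used in the cells that follow.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build the package: a directory containing __init__.py plus one module.
root = Path(tempfile.mkdtemp())
package_dir = root / "demo_package"
package_dir.mkdir()
(package_dir / "__init__.py").write_text("package_var = 2\n")
(package_dir / "demo_module.py").write_text(
    "my_var = 1\n"
    "\n"
    "def my_func():\n"
    "    return 'called my_func'\n"
)

# Make the parent directory importable.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_package                     # runs demo_package/__init__.py
import demo_package.demo_module as dm   # runs demo_package/demo_module.py

print(demo_package.package_var)  # 2
print(dm.my_var)                 # 1
print(dm.my_func())              # called my_func
```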

In [28]:
import my_package # The package must be imported before it can be reloaded
reload(my_package)

<module 'my_package' from 'c:\\Users\\GarethDavies\\OneDrive - Kubrick Group\\cohorts\\DE\\de28\\Python\\4-Production\\1-ModulesAndPackages\\my_package\\__init__.py'>

In [29]:
# Variables and functions in the __init__.py are available in the my_package namespace
my_package.package_var

2

In [25]:
# To access variables and functions in my_module,
# we need to import it with the '.' syntax
import my_package.my_module as mm

In [26]:
mm.my_var

1

In [27]:
mm.my_func()

Calling from my_func in my_package.my_module


#### Concept Check

Create a test package with the following structure

- test_package
    - `__init__.py`
    - `module1.py`
    - subpackage1
        - `__init__.py`
        - `module2.py`
        - `module3.py`
    - subpackage2
        - `__init__.py`
        - `module4.py`

Store the following code in each `moduleX.py` file, replacing `X` with the module's number.

```
varX = X
def my_funcX():
    print('my_package\\moduleX')  # backslash escaped so '\m' is not an invalid escape sequence
```

Include a package-level variable in the top-level `__init__.py`: `my_pack_var = 0`
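
If you would rather script the setup than create the files by hand, something along these lines could work. It is only a sketch: it builds the tree in a temporary directory (an assumption made so nothing in your working directory is touched), and the `layout` dictionary is a helper invented for this example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical description of the tree: package directory -> modules inside it.
layout = {
    "test_package": ["module1"],
    "test_package/subpackage1": ["module2", "module3"],
    "test_package/subpackage2": ["module4"],
}

root = Path(tempfile.mkdtemp())
for package_path, modules in layout.items():
    directory = root / package_path
    directory.mkdir(parents=True)
    # Only the top-level __init__.py holds the package-level variable.
    init_source = "my_pack_var = 0\n" if package_path == "test_package" else ""
    (directory / "__init__.py").write_text(init_source)
    for module_name in modules:
        x = module_name[-1]  # the module's number, e.g. "4"
        (directory / f"{module_name}.py").write_text(
            f"var{x} = {x}\n"
            f"def my_func{x}():\n"
            f"    print(r'my_package\\module{x}')\n"
        )

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import test_package
import test_package.subpackage2.module4 as m4

print(test_package.my_pack_var)
m4.my_func4()
print(m4.var4)
```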


In [1]:
import test_package
test_package.my_pack_var

0

In [2]:
import test_package.module1 as m1

m1.my_func1()
m1.var1

my_package\module1


1

In [4]:
import test_package.subpackage1.module2 as m2
import test_package.subpackage1.module3 as m3

m2.my_func2()
print(m2.var2)

m3.my_func3()
print(m3.var3)


my_package\module2
2
my_package\module3
3


In [6]:
import test_package.subpackage2.module4 as m4


m4.my_func4()
print(m4.var4)

my_package\module4
4
