# Modules and Packages Basics

* One of the key features of Python is that the actual core language is fairly small. This is an intentional design feature to maintain simplicity. Much of the powerful functionality comes through external modules and packages.

* A module is simply a .py file whose code can be imported into another .py file. A package is a collection of modules.

The best online resource is the official docs:
https://docs.python.org/3/tutorial/modules.html#packages



## Modules basics
### Create and import modules
The first time a module is loaded into a running Python program, it is initialized by executing the code in the module once. If another module in your code imports the same module again, it is not loaded a second time, so module-level variables act as a "singleton": they are initialized only once.  
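
The load-once behaviour can be checked directly. The following is a minimal sketch rather than part of the original notebook: it writes a throwaway module (the name `counter_demo` is made up for illustration) to a temporary directory and imports it twice.

```python
import importlib
import os
import sys
import tempfile

# Write a throwaway module to a temporary directory.
# The module name "counter_demo" is hypothetical, chosen only for this demo.
tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, "counter_demo.py"), "w") as f:
    f.write("print('initializing counter_demo')\nvalues = []\n")

sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

first = importlib.import_module("counter_demo")   # runs the module body
second = importlib.import_module("counter_demo")  # served from the sys.modules cache

first.values.append(1)
print(first is second)                 # both names refer to one module object
print(second.values)                   # the appended value is visible through either name
print("counter_demo" in sys.modules)   # the cache entry that makes this work
```

The initialization message is printed only once, because the second import finds the module already stored in `sys.modules`.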

A general rule of thumb is that `from <module> import *` is OK for interactive analysis within IPython, but you should avoid using it within scripts.
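
One reason for this rule is that a star import can silently replace names you already rely on. The sketch below is an illustration under one assumption: the star import is run inside a scratch dictionary (via `exec`) purely so that the demonstration does not pollute the current session.

```python
import builtins

# Run the star import in an isolated namespace instead of the current one.
scratch = {}
exec("from math import *", scratch)

# Names that now hide a built-in of the same name.
shadowed = sorted(
    name for name in scratch
    if name != "__builtins__" and hasattr(builtins, name)
)
print(shadowed)

# The built-in pow keeps integers as integers; math.pow always returns a float.
print(type(pow(2, 3)))
print(type(scratch["pow"](2, 3)))
```

After `from math import *`, a later call to `pow` would quietly use `math.pow`, which is exactly the kind of surprise that explicit imports avoid.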




In [1]:
%%writefile file1.py
def myfunc(x):
    return [num for num in range(x) if num%2==0]
list1 = myfunc(11)

Writing file1.py


file1.py is going to be used as a module.

In [3]:
%%writefile file2.py
import file1 #We can also use: from file1 import myfunc
file1.list1.append(12)
print(file1.list1)

Overwriting file2.py


In [4]:
! python file2.py

[0, 2, 4, 6, 8, 10, 12]


In [5]:
import file1
print(file1.list1)

[0, 2, 4, 6, 8, 10]


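The above cell shows that we never altered file1.py; we just appended a number to the list *after* it was brought into file2. (file2.py ran in its own Python process, so the change lived only in that process's copy of the module.) The following example shows how to import a function from a module without importing the module name itself.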

In [7]:
%%writefile myModule.py
def func():  # func takes no arguments, but the parentheses are still required.
    return [num for num in range(10) if num%2==0]

Writing myModule.py


In [8]:
%%writefile myApp.py
from myModule import func

print(func())  # The call also needs the parentheses, even with no arguments.

Writing myApp.py


In [9]:
!python myApp.py

[0, 2, 4, 6, 8]


### Passing command line arguments
Python's `sys` module gives you access to command line arguments when calling scripts.

In [5]:
%%writefile file3.py
import sys
import file1
num = int(sys.argv[1])

#extra code
print(type(sys))
print(sys)
print(sys.argv[0])
print(sys.argv[1])
#extra code

print(file1.myfunc(num))

Overwriting file3.py


Note that we selected the second item in the list of arguments with `sys.argv[1]`.<br>
This is because `sys.argv` is a list that always starts with the name of the script being run.<br>

In [6]:
! python file3.py 27

<class 'module'>
<module 'sys' (built-in)>
file3.py
27
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26]
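
As written, file3.py raises an `IndexError` when it is called without an argument and a `ValueError` when the argument is not an integer. A minimal sketch of a more defensive version follows; the helper name `parse_limit` and the default of 10 are assumptions made for illustration.

```python
import sys


def parse_limit(argv, default=10):
    """Return the integer in argv[1], or `default` if it is missing or invalid."""
    if len(argv) < 2:
        return default          # no argument was supplied
    try:
        return int(argv[1])
    except ValueError:
        return default          # the argument was not an integer


# Simulated argument lists; in a real script you would pass sys.argv.
print(parse_limit(["file3.py", "27"]))
print(parse_limit(["file3.py"]))
print(parse_limit(["file3.py", "abc"]))
```

Passing the argument list in as a parameter, instead of reading `sys.argv` inside the function, makes the helper easy to try out in a notebook.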


### Exploring built-in modules
Two functions come in handy when exploring modules in Python: <code>dir</code> and <code>help</code>. (1) The <code>dir</code> function lists the names (functions, constants, and so on) defined in a module. (2) Once we find the function we want to use, the <code>help</code> function tells us more about it. **In Jupyter, press Shift+Tab to see the same information. In VS Code, right-clicking may offer something similar.** 

In [15]:
import math
print(dir(math))

['__doc__', '__loader__', '__name__', '__package__', '__spec__', 'acos', 'acosh', 'asin', 'asinh', 'atan', 'atan2', 'atanh', 'ceil', 'copysign', 'cos', 'cosh', 'degrees', 'e', 'erf', 'erfc', 'exp', 'expm1', 'fabs', 'factorial', 'floor', 'fmod', 'frexp', 'fsum', 'gamma', 'gcd', 'hypot', 'inf', 'isclose', 'isfinite', 'isinf', 'isnan', 'ldexp', 'lgamma', 'log', 'log10', 'log1p', 'log2', 'modf', 'nan', 'pi', 'pow', 'radians', 'sin', 'sinh', 'sqrt', 'tan', 'tanh', 'tau', 'trunc']


In [16]:
help(math.ceil)

Help on built-in function ceil in module math:

ceil(...)
    ceil(x)
    
    Return the ceiling of x as an Integral.
    This is the smallest integer >= x.
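
The output of `dir` can be long, so it is often filtered. The following sketch (variable names are arbitrary) drops the double-underscore names and separates functions from constants; it also reads the docstring that `help` displays.

```python
import math

# Keep only the public names (those without a leading underscore).
public_names = [name for name in dir(math) if not name.startswith("_")]

# Split them into callables (functions) and plain values (constants).
functions = [name for name in public_names if callable(getattr(math, name))]
constants = [name for name in public_names if not callable(getattr(math, name))]

print(constants)
print(len(functions), "functions")

# help() is built on docstrings, which are also available directly.
print(math.ceil.__doc__)
```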



## Packages basics
Packages are namespaces that can themselves contain multiple packages and modules. They are simply directories, but with a twist.

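Each regular package in Python is a directory that contains a special file called **\__init\__.py**. This file can be empty; it indicates that the directory containing it is a Python package, so it can be imported the same way a module can be imported. (Since Python 3.3, a directory without this file can still be imported as a namespace package, but including the file remains the usual practice.)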

If we create a directory called foo, which marks the package name, we can then create a module inside that package called bar. We also must not forget to add the **\__init\__.py** file inside the foo directory.

To use the module bar, we can import it in two ways:

### Create and import packages

In [None]:
import foo.bar

In [None]:
# OR could do it this way
from foo import bar

In the first method, we must use the foo prefix whenever we access the module bar. In the second method, we don't, because we import the module into our own module's namespace.
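
The two import styles can be tried end to end with the sketch below. It builds the foo package in a temporary directory; the `greet` function inside bar is hypothetical and exists only to give the module something to call.

```python
import importlib
import os
import sys
import tempfile

# Build foo/__init__.py and foo/bar.py in a temporary directory.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "foo")
os.mkdir(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()  # an empty file is enough
with open(os.path.join(pkg_dir, "bar.py"), "w") as f:
    f.write("def greet():\n    return 'hello from foo.bar'\n")

# Make the directory that contains foo importable.
sys.path.insert(0, root)
importlib.invalidate_caches()

import foo.bar           # method 1: access through the foo prefix
from foo import bar      # method 2: bar is bound directly

print(foo.bar.greet())
print(bar.greet())
print(bar is foo.bar)    # both names refer to the same module object
```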

The **\__init\__.py** file can also decide which modules the package exports as its API, while keeping other modules internal, by defining the **\__all\__** variable, which lists the names that `from foo import *` brings in, like so:

In [None]:
# foo/__init__.py

__all__ = ["bar"]
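
The effect of `__all__` can be seen with a small experiment. This is a sketch under stated assumptions: the package name `demo_pkg` and the module name `internal` are made up, and the star import runs inside a scratch dictionary so that the current namespace stays clean.

```python
import importlib
import os
import sys
import tempfile

# Build a package with two modules, only one of which is listed in __all__.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write('__all__ = ["bar"]\n')
for module_name in ("bar", "internal"):
    with open(os.path.join(pkg_dir, module_name + ".py"), "w") as f:
        f.write("NAME = {!r}\n".format(module_name))

sys.path.insert(0, root)
importlib.invalidate_caches()

# Star-import into an isolated namespace and see what arrived.
scratch = {}
exec("from demo_pkg import *", scratch)
print("bar" in scratch)        # listed in __all__, so it is imported
print("internal" in scratch)   # not listed, so the star import skips it

# __all__ only affects star imports; an explicit import still works.
import demo_pkg.internal
print(demo_pkg.internal.NAME)
```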

Check a Flask example from the following folder: C:\Users\ljyan\Desktop\courseNotes\web\Flask\06-Larger-Flask-Applications\01-Using-Blueprints\myproject\__init__.py. There we have a different story. 

### `__main__` and `__name__`
When running your script, Python first initializes some hidden special variables. `__name__` is one of them. If we run the code directly with `python script_name.py`, or run a cell in Jupyter with Shift+Enter, then `__name__` is initialized as `'__main__'`. If the code runs because it is imported, then `__name__` is initialized as the module name.  

**My comments**. Every module, whether it is the one being run or one being imported, has its own `__name__`. The value assigned to each `__name__` depends on whether that module is imported or run directly. **The key is that there are many `__name__`'s to be assigned**. 

Python has no main() as in C/C++; the code at indentation level 0 is automatically the code that runs. That is, it plays the role of main().

**Be careful with the quotes: write `if __name__ == '__main__':` and not `if __name__ == __main__:`.** Without the quotes, `__main__` is treated as a variable name that has not been defined.

**Example 1** 

In [21]:
%%writefile printExample.py
if __name__ == '__main__':
    print("hello")
else:
    print("printExample module's name: {}".format(__name__))

Overwriting printExample.py


In [22]:
# The .py extension is needed in the file name below.
!python printExample.py
# When importing the module, however, .py is not needed.

hello


In [23]:
%%writefile secondFile
import printExample

Writing secondFile


In [24]:
!python secondFile

printExample module's name: printExample


In [25]:
%%writefile secondFile
#Now change the secondFile to be:
import printExample
print("SecondFile module's name: {}".format(__name__))

Overwriting secondFile


In [33]:
!python secondFile
# Next we explore why __name__ and __main__ are useful

printExample module's name: printExample
SecondFile module's name: __main__


**Example 2**

In [27]:
%%writefile printExample.py
def main():
    print("printExample module's name: {}".format(__name__))
if __name__ == '__main__':
    main()

Overwriting printExample.py


In [28]:
!python printExample.py

printExample module's name: __main__


In [29]:
%%writefile secondFile
import printExample
print("SecondFile module's name: {}".format(__name__))

Overwriting secondFile


In [30]:
!python secondFile

SecondFile module's name: __main__


In the code above, when printExample is imported, its `__name__` is set to the module name (i.e., printExample here) rather than `__main__`, so the main() in printExample.py is not run. 

**Example 3** 

In [135]:
%%writefile printExample.py
if __name__ == '__main__':
    print("Run directly!")
else:
    print("Run indirectly")


Overwriting printExample.py


In [136]:
!python printExample.py

Run directly!


In [140]:
%%writefile secondFile
import printExample
print("SecondFile module's name: {}".format(__name__))

Overwriting secondFile


In [138]:
!python secondFile

Run indirectly
SecondFile module's name: __main__


**Example 4** 

In [141]:
%%writefile printExample.py
print("This will always be run")

def main():
    print("printExample module's name: {}".format(__name__))
if __name__ == '__main__':
    main()

Overwriting printExample.py


In [142]:
%%writefile secondFile
import printExample
print("SecondFile module's name: {}".format(__name__))

Overwriting secondFile


In [108]:
!python secondFile

This will always be run
SecondFile module's name: __main__


The main() of printExample is not run when we run secondFile, because printExample is imported there and thus its `__name__` is not set to `__main__`. However, if we really want to run the main() in printExample, change secondFile to:


In [31]:
%%writefile secondFile
import printExample
printExample.main()
print("SecondFile module's name: {}".format(__name__))

Overwriting secondFile


In [32]:
!python secondFile

printExample module's name: printExample
SecondFile module's name: __main__


# Modules and Packages: Advanced

## Where are the packages?

### Where to find packages to install?
PyPI  
The Python Package Index is the main repository for third-party Python packages (hundreds of thousands of packages and growing). 
The advantage of being on PyPI is the ease of installation using `pip install <package_name>`.    
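
Once a package has been installed with pip, its version can be looked up from Python. The sketch below assumes Python 3.8 or later, where `importlib.metadata` is part of the standard library; the helper name `installed_version` is made up for illustration.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# "pip" is usually present, but that depends on the environment.
print(installed_version("pip"))
# A name that is not installed yields None instead of an exception.
print(installed_version("surely-not-a-real-package-xyz-123"))
```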

### Where are the packages installed?

#### Method 1
https://stackoverflow.com/questions/122327/how-do-i-find-the-location-of-my-python-site-packages-directory

There are two types of site-packages directories, global and per user.  
(1) Global site-packages ("dist-packages") directories are listed in sys.path when you run `python -m site`.  
(2) The per-user site-packages directory (PEP 370) is where Python installs your local packages; it is printed by `python -m site --user-site`.

In [7]:
import site
!python -m site 

sys.path = [
    'C:\\Users\\ljyan\\Desktop\\courseNotes\\python\\pythonFundamentals',
    'C:\\Users\\ljyan\\Anaconda3\\python36.zip',
    'C:\\Users\\ljyan\\Anaconda3\\DLLs',
    'C:\\Users\\ljyan\\Anaconda3\\lib',
    'C:\\Users\\ljyan\\Anaconda3',
    'C:\\Users\\ljyan\\AppData\\Roaming\\Python\\Python36\\site-packages',
    'C:\\Users\\ljyan\\Anaconda3\\lib\\site-packages',
    'C:\\Users\\ljyan\\Anaconda3\\lib\\site-packages\\win32',
    'C:\\Users\\ljyan\\Anaconda3\\lib\\site-packages\\win32\\lib',
    'C:\\Users\\ljyan\\Anaconda3\\lib\\site-packages\\Pythonwin',
]
USER_BASE: 'C:\\Users\\ljyan\\AppData\\Roaming\\Python' (exists)
USER_SITE: 'C:\\Users\\ljyan\\AppData\\Roaming\\Python\\Python36\\site-packages' (exists)
ENABLE_USER_SITE: True


In [21]:
!python -m site --user-site

C:\Users\ljyan\AppData\Roaming\Python\Python36\site-packages
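
The same locations can be read from inside Python without shelling out. This is a minimal sketch using the standard `site` and `sysconfig` modules; the paths it prints depend entirely on the local installation.

```python
import site
import sysconfig

# Per-user site-packages directory (the one shown by `python -m site --user-site`).
user_site = site.getusersitepackages()

# site-packages directory of the current environment, where pure-Python packages go.
env_site = sysconfig.get_paths()["purelib"]

print(user_site)
print(env_site)
```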


#### Method 2

In [None]:
import os
help(os)

In [22]:
import sys
sys.path

['',
 'C:\\Users\\ljyan\\Anaconda3\\python36.zip',
 'C:\\Users\\ljyan\\Anaconda3\\DLLs',
 'C:\\Users\\ljyan\\Anaconda3\\lib',
 'C:\\Users\\ljyan\\Anaconda3',
 'C:\\Users\\ljyan\\AppData\\Roaming\\Python\\Python36\\site-packages',
 'C:\\Users\\ljyan\\Anaconda3\\lib\\site-packages',
 'C:\\Users\\ljyan\\Anaconda3\\lib\\site-packages\\win32',
 'C:\\Users\\ljyan\\Anaconda3\\lib\\site-packages\\win32\\lib',
 'C:\\Users\\ljyan\\Anaconda3\\lib\\site-packages\\Pythonwin',
 'C:\\Users\\ljyan\\Anaconda3\\lib\\site-packages\\IPython\\extensions',
 'C:\\Users\\ljyan\\.ipython']

### How does Python look for modules?
When the Python interpreter executes an import statement, it looks for modules on a search path. A default value for the path is configured into the Python binary when the interpreter is built. You can determine the path by importing the sys module and printing the value of sys.path (introduced earlier).    
Within a script, it is possible to adjust the search path by modifying sys.path, which is just a Python list. Generally speaking, you will want to put your path at the front of the list using insert:  

import sys
sys.path.insert(0, '/my/path/python/packages')
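
Because `sys.path` is shared by the whole process, it can be useful to undo the change afterwards. The following sketch wraps the insert in a context manager; the name `prepended_path` is hypothetical, and the directory is the placeholder path from the snippet above.

```python
import contextlib
import sys


@contextlib.contextmanager
def prepended_path(path):
    """Temporarily put `path` at the front of sys.path."""
    sys.path.insert(0, path)
    try:
        yield
    finally:
        # Remove our entry again, even if the body raised an exception.
        try:
            sys.path.remove(path)
        except ValueError:
            pass  # something else already removed it


with prepended_path("/my/path/python/packages"):
    print(sys.path[0])   # imports inside this block search the new path first

print("/my/path/python/packages" in sys.path)
```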

### Are Python libraries compiled? 
Python can execute both functions written in Python (interpreted) and compiled functions. There are whole API docs about writing compiled code that integrates with Python. Cython is one of the easier tools for doing this.

Libraries can be **any combination**: pure Python, Python plus interfaces to compiled code, or all compiled. The interpreted files end with .py; the compiled parts are usually .so, .pyd, or .dll files (depending on the operating system). It's easy to install pure Python code: just download it, unzip if needed, and put it in the right directory. **Mixed code requires a compilation step (and hence a C compiler, etc.), or downloading a version with binaries.**  

Typically, developers get the code working in Python and then rewrite the speed-sensitive portions in C. Or they find some external library of working C or Fortran code and link to that.  

NumPy and SciPy are mixed. They have lots of Python code and core compiled portions, and they use external libraries. The C code can be extraordinarily hard to read.  

As a NumPy user, you should first try to get as much clarity and performance as possible with Python code. Most of the optimization questions on Stack Overflow discuss ways of making use of the compiled functionality of NumPy, that is, all the operations that work on whole arrays. It's only when you can't express your operations in efficient NumPy code that you need to resort to a tool like Cython or Numba.

In general, if you have to iterate extensively, then you are using low-level operations. Either replace the loops with array operations, or rewrite the loop in Cython.

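**Important point**. From the above, we know that we cannot always find a Python source file for everything. For a pure Python module, you can find the source by looking at `themodule.__file__`. For a compiled module, `__file__` points to a binary such as a .so file, so the source cannot be viewed, and modules built into the interpreter, such as `sys`, have no `__file__` at all. On Linux under Python 2, for example, the datetime module was written in C, so `datetime.__file__` pointed to a .so file (there was no `datetime.__file__` on Windows). In Python 3, `datetime.__file__` points to a .py file, and the C implementation lives in the `_datetime` accelerator module. 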
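
To see where a given module comes from without importing it by hand, the standard `importlib.util.find_spec` function can be queried. This is a minimal sketch; the helper name `describe` is made up, and the paths it prints depend on the local installation.

```python
import importlib.util
import sys


def describe(module_name):
    """Return where the import system would load `module_name` from."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return "not found"
    return spec.origin  # a file path, or a label such as "built-in"


for name in ("json", "sys"):
    print(name, "->", describe(name))

# A module compiled into the interpreter has no source file to point to.
print(hasattr(sys, "__file__"))
```

Here `json` resolves to a .py file that can be opened and read, while `sys` is reported as built-in and has no `__file__` attribute.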