# Module

As your code grows more complex, it is useful to collect it in an external file. Here we store all the functions from our notebook in a single file, *grm.py*. Nothing new happens; we just copy all code cells from our notebook into a new file. Below, we can have a look at an *html* version of the file, which we created using *PyCharm*'s exporting capabilities.

In [2]:
from IPython.core.display import HTML, display
display(HTML('material/images/grm.html')) 



As our module is already quite complex, it is better to study its structure using *PyCharm*, which provides tools to quickly grasp how a module is organized. So, check it out.

## Namespace and Scope

Now that our *Python* code grows more and more complex, we need to discuss the concept of a ***Namespace***. Roughly speaking, a name in *Python* is a mapping to an object, and in *Python* pretty much everything is an object: lists, dictionaries, functions, classes, etc. Think of a namespace as a dictionary, where the keys are the names and the values are the objects themselves.

In [29]:
display(HTML('material/images/namespace1.html')) 
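
The dictionary analogy can be taken quite literally. The following is a minimal sketch, not part of *grm.py*: the module `demo` and the names `alpha` and `beta` are made up for illustration. It shows that a module's namespace is stored in an ordinary dictionary.

```python
import types

# Create an empty, throwaway module (hypothetical, for illustration only).
demo = types.ModuleType('demo')

# Binding a name on the module adds a key to its namespace dictionary ...
demo.alpha = [1, 2, 3]
namespace = vars(demo)               # the same object as demo.__dict__

print(isinstance(namespace, dict))   # True
print(namespace['alpha'] is demo.alpha)

# ... and adding a key to the dictionary creates a new name in the module.
namespace['beta'] = 'hello'
print(demo.beta)
```

The keys are the names, the values are the objects, and both views stay in sync because they are the same dictionary.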

Now, the tricky part is that we have multiple independent namespaces in Python, and the same name can be reused in different namespaces (only the objects are unique), for example:

In [30]:
display(HTML('material/images/namespace2.html')) 
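
A small sketch using two standard library modules illustrates the point. Both *math* and *cmath* define the name `sqrt`, but each module has its own namespace, so the two names refer to different objects.

```python
import cmath
import math

# The same name lives in two independent namespaces ...
print('sqrt' in vars(math) and 'sqrt' in vars(cmath))   # True

# ... and maps to a different object in each of them.
print(math.sqrt is cmath.sqrt)                          # False

# The objects also behave differently.
print(math.sqrt(4))      # 2.0
print(cmath.sqrt(-4))    # 2j
```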

The ***Scope*** in Python defines the “hierarchy level” in which we search namespaces for certain “name-to-object” mappings. 

In [31]:
i = 1
 
def foo():
    i = 5
    print(i, 'in foo()')
print(i, 'global')
 
foo()

(1, 'global')
(5, 'in foo()')


What rules are applied to resolve conflicting scopes? Python searches the namespaces in the following order, known as the LEGB rule, and stops at the first match:

1. Local: names defined inside the current function or class method.
2. Enclosed: names defined in an enclosing function, e.g., if a function is wrapped inside another function.
3. Global: names defined at the uppermost level of the executing script itself.
4. Built-in: special names that Python reserves for itself.
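
The four levels can be sketched in a few lines. The function names below are made up for illustration; the example only shows which binding of `x` is found in each situation.

```python
x = 'global'


def outer():
    x = 'enclosed'

    def inner_with_local():
        x = 'local'
        return x          # found in the local namespace

    def inner_without_local():
        return x          # no local x, so the enclosing function is searched

    return inner_with_local(), inner_without_local()


def no_local_or_enclosed():
    return x              # neither local nor enclosed, so the next level up


print(outer())                   # ('local', 'enclosed')
print(no_local_or_enclosed())    # global

# len is not defined anywhere in this script, so it is found last,
# in the built-in namespace.
print(len('abc'))                # 3
```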

This introduction draws heavily on a couple of very useful online tutorials: [Python Course](http://www.python-course.eu/), [Beginners Guide to Namespaces](http://spartanideas.msu.edu/2014/05/12/a-beginners-guide-to-pythons-namespaces-scope-resolution-and-the-legb-rule/), and [Guide to Python Namespaces](http://bytebaker.com/2008/07/30/python-namespaces/).

## Interacting with the Module

Let us import a couple of the standard libraries to get started.

In [32]:
# Unix-style Pathname Pattern Expansion
import glob

# Operating System Interfaces
import os

# System-specific Parameters and Functions
import sys

Now we turn to our very own module. We just have to import it like any other library first. How does *Python* know where to look for our module?

Whenever the interpreter encounters an import statement, it first searches for a built-in module (e.g. os, sys) of the same name. If unsuccessful, the interpreter searches the list of directories given by the variable *sys.path* ... 

In [33]:
# Calling print() with a single argument produces the
# same output under Python 2 and Python 3.
print('\n Search Path:')
for dir_ in sys.path:
    print(' ' + dir_)


 Search Path:
 material/module
 material/modules
 material/modules
 
 /usr/local/lib/python2.7/dist-packages/pip-1.5.6-py2.7.egg
 /usr/lib/pymodules/python2.7
 /usr/lib/python2.7/dist-packages
 /usr/lib/python2.7
 /usr/lib/python2.7/plat-x86_64-linux-gnu
 /usr/lib/python2.7/lib-tk
 /usr/lib/python2.7/lib-old
 /usr/lib/python2.7/lib-dynload
 /usr/local/lib/python2.7/dist-packages
 /usr/lib/python2.7/dist-packages/PILcompat
 /usr/lib/python2.7/dist-packages/gtk-2.0
 /usr/lib/python2.7/dist-packages/ubuntu-sso-client
 /usr/lib/python2.7/dist-packages/wx-3.0-gtk2
 /usr/local/lib/python2.7/dist-packages/IPython/extensions
 /home/peisenha/.ipython


and the current working directory, which shows up as the empty entry in the list above. However, our module is in the **modules** subdirectory, so we need to add it to the search path manually.

In [34]:
sys.path.insert(0, 'material/module')

Please see [here](https://docs.python.org/2/tutorial/modules.html) for additional information.
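
The effect of extending the search path can be tried out without touching *grm.py*. The sketch below is only an illustration under stated assumptions: the module name `toy_module` and its function `greet()` are made up, and the module is written to a temporary directory that is removed again at the end.

```python
import os
import shutil
import sys
import tempfile

# Write a tiny module into a fresh temporary directory
# (toy_module and greet() are hypothetical names).
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, 'toy_module.py'), 'w') as handle:
    handle.write('def greet():\n    return "hello from toy_module"\n')

# The directory is not on the search path yet, so we add it at the front.
sys.path.insert(0, module_dir)

import toy_module

message = toy_module.greet()
print(message)

# Undo the changes: restore the search path and delete the directory.
sys.path.remove(module_dir)
shutil.rmtree(module_dir)
```

Inserting at position 0 gives the new directory priority over all other entries, which matters if a module of the same name exists elsewhere on the path.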

In [35]:
%ls -l

total 212
-rw-rw-r-- 1 peisenha peisenha 108000 Aug 23 14:47 dataset.grm.txt
-rw-rw-r-- 1 peisenha peisenha  78243 Aug 23 14:47 lecture.ipynb
drwxrwxr-x 5 peisenha peisenha   4096 Aug 23 14:24 material/
-rw-rw-r-- 1 peisenha peisenha    534 Aug 23 14:47 results.grm.txt


Returning to *grm.py*:

In [36]:
# Import grm.py file
import grm

# Process initialization file
init_dict = grm.process('material/msc/init.ini')

# Simulate dataset
grm.simulate(init_dict)

# Estimate model
rslt = grm.estimate(init_dict)

# Inspect results
grm.inspect(rslt, init_dict)

# Output results to terminal
%cat results.grm.txt


 softEcon: Generalized Roy Model
 -------------------------------

 Average Treatment Effects

     ATE       -0.40

     TT        -0.46

     TUT       -0.29


 Parameters

     Start    Finish

      0.50      0.50
      0.20      0.20
      0.52      0.52
      0.52      0.52
      0.10      0.10
      0.12      0.12
      0.30      0.30
      0.52      0.52
      0.20      0.20
      0.12      0.12
      0.25      0.25
      0.35      0.35
      0.02      0.02
      0.02      0.02
      0.10      0.10
      0.12      0.12


Given our work on the notebook version, we had a very clear idea about the names defined in the module. In cases where you don't:

In [37]:
# The built-in function dir() returns the names 
# defined in a module. These include functions as
# well as imported modules and special attributes.
print('\n Names: \n')
for name in dir(grm):
    print(' ' + name)



 Names: 

 __builtins__
 __doc__
 __file__
 __name__
 __package__
 _add_auxiliary
 _check_integrity_process
 _check_integrity_simulate
 _distribute_arguments
 _distribute_parameters
 _get_start
 _load_data
 _max_interface
 _negative_log_likelihood
 _process_bene
 _process_cases
 _process_not_bene
 _transform_start
 _write_out
 codecs
 copy
 estimate
 inspect
 minimize
 norm
 np
 process
 shlex
 simulate


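What is the deal with all the leading and trailing underscores? In short, a single leading underscore marks a name as internal to the module by convention, while names with two leading and two trailing underscores are reserved for *Python* itself. For details, let us check out the [Style Guide for Python Code](https://www.python.org/dev/peps/pep-0008/).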
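
The leading underscore is more than decoration: names that start with an underscore are skipped by `from module import *`, unless the module defines `__all__`. The following sketch uses a made-up in-memory module `toy_grm` with one public and one internal function, not the real *grm.py*.

```python
import sys
import types

# Source of a hypothetical module with one public and one internal function.
source = '''
def estimate():
    return _helper() + 1

def _helper():
    return 41
'''

# Build the module in memory and register it so that it can be imported.
toy = types.ModuleType('toy_grm')
exec(source, toy.__dict__)
sys.modules['toy_grm'] = toy

# Run the star import in a fresh namespace to see which names arrive.
namespace = {}
exec('from toy_grm import *', namespace)

print('estimate' in namespace)    # True
print('_helper' in namespace)     # False
print(namespace['estimate']())    # 42

# The internal function is still reachable through the module itself.
print(toy._helper())              # 41

del sys.modules['toy_grm']
```

The underscore is a convention, not access control: nothing prevents a caller from using `toy._helper()` directly.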

There are many ways to import a module and then work with it.

In [38]:
# Import the grm module under the alias gr.
# All its names stay in the module's own
# namespace and are accessed via gr.
#
import grm as gr

init_dict = gr.process('material/msc/init.ini')

# Imports only the process() function
# directly into our namespace.
#
from grm import process

init_dict = process('material/msc/init.ini')

try:
    
    # simulate() was not imported by name,
    # so this call raises a NameError.
    data = simulate(init_dict)

except NameError:
    
    pass

# Imports all public objects directly
# into our namespace.
#
from grm import *

init_dict = process('material/msc/init.ini')

data = simulate(init_dict)
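
To see what each import style actually adds to a namespace, the statements can be run in a fresh dictionary. This is a rough sketch that uses the standard library module *math* in place of *grm*; the helper `names_after()` is made up for illustration.

```python
def names_after(statement):
    """Run an import statement in an empty namespace and list the new names."""
    namespace = {}
    exec(statement, namespace)
    # exec() adds '__builtins__' on its own, so we leave it out.
    return sorted(name for name in namespace if name != '__builtins__')


# Only the alias is added; everything else stays behind the prefix.
print(names_after('import math as m'))         # ['m']

# Only the requested name is added.
print(names_after('from math import sqrt'))    # ['sqrt']

# All public names of the module are added at once.
star_names = names_after('from math import *')
print('sqrt' in star_names and 'pi' in star_names)   # True
```

The star import is convenient in interactive sessions, but it makes it hard to tell where a name comes from and can silently overwrite names that were already defined.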

## Cleanup

In [39]:
# Create list of all files generated by the module
files = glob.glob('*.grm.*')

# Remove files
for file_ in files:
    os.remove(file_)
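
The pattern `*.grm.*` only matches files with `.grm.` in their name, so the notebook itself is left alone. The sketch below replays the cleanup in a temporary directory, using the file names from the directory listing earlier, so that it can be run without deleting anything of value.

```python
import glob
import os
import shutil
import tempfile

# Set up a scratch directory with empty stand-ins for the files.
workdir = tempfile.mkdtemp()
for name in ['dataset.grm.txt', 'results.grm.txt', 'lecture.ipynb']:
    open(os.path.join(workdir, name), 'w').close()

# Remove only the files generated by the module.
for path in glob.glob(os.path.join(workdir, '*.grm.*')):
    os.remove(path)

remaining = sorted(os.listdir(workdir))
print(remaining)    # ['lecture.ipynb']

# Remove the scratch directory itself.
shutil.rmtree(workdir)
```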

## Additional Resources

* [Tutorial on Python Modules and Packages](https://docs.python.org/2/tutorial/modules.html)

**Formatting**

In [1]:
import urllib
from IPython.core.display import HTML
HTML(urllib.urlopen('http://bit.ly/1OKmNHN').read())