# Introduction to Functions and Modules
#### Doug Ollerenshaw
#### dougo@alleninstitute.org
#### 7/5/2016

## Goals
  * Review function definitions
  * Explain the Python path
  * Show how to put a simple function into an external file and import it
  * Introduce the interactive debugger
  * Practice making functions with an exercise

## Overview
By itself, Python has relatively few built-in functions available (https://docs.python.org/2/library/functions.html). This keeps the namespace (the list of defined functions and variables) small and lightweight by default, allowing the user to import only what they need.

There is a very active development community creating and maintaining specialized libraries. As a Python user, you also have the ability to create your own libraries of functions for everyday tasks. 

## External modules
Python comes with a large number of handy standard libraries that can be imported and used:  
     https://docs.python.org/2.7/library/  
    
In addition, the Anaconda Python distribution comes packaged with a selection of commonly used packages:  
    https://docs.continuum.io/anaconda/pkg-docs  
    
One commonly used package, which we'll return to in detail later, is NumPy:  
    http://www.numpy.org/

In [1]:
import numpy

Type "numpy." and then press Tab to see the available names (functions, constants, and submodules). Alternatively, call dir(numpy) to print all of them.

In [2]:
numpy.pi

3.141592653589793

In [3]:
# this will list every name (functions, constants, submodules) defined in the numpy library
dir(numpy)

['ALLOW_THREADS',
 'BUFSIZE',
 'CLIP',
 'DataSource',
 'ERR_CALL',
 'ERR_DEFAULT',
 'ERR_IGNORE',
 'ERR_LOG',
 'ERR_PRINT',
 'ERR_RAISE',
 'ERR_WARN',
 'FLOATING_POINT_SUPPORT',
 'FPE_DIVIDEBYZERO',
 'FPE_INVALID',
 'FPE_OVERFLOW',
 'FPE_UNDERFLOW',
 'False_',
 'Inf',
 'Infinity',
 'MAXDIMS',
 'MachAr',
 'NAN',
 'NINF',
 'NZERO',
 'NaN',
 'PINF',
 'PZERO',
 'PackageLoader',
 'RAISE',
 'SHIFT_DIVIDEBYZERO',
 'SHIFT_INVALID',
 'SHIFT_OVERFLOW',
 'SHIFT_UNDERFLOW',
 'ScalarType',
 'Tester',
 'True_',
 'UFUNC_BUFSIZE_DEFAULT',
 'UFUNC_PYVALS_NAME',
 'WRAP',
 '_NoValue',
 '__NUMPY_SETUP__',
 '__all__',
 '__builtins__',
 '__config__',
 '__doc__',
 '__file__',
 '__git_revision__',
 '__mkl_version__',
 '__name__',
 '__package__',
 '__path__',
 '__version__',
 '_import_tools',
 '_mat',
 'abs',
 'absolute',
 'absolute_import',
 'add',
 'add_docstring',
 'add_newdoc',
 'add_newdoc_ufunc',
 'add_newdocs',
 'alen',
 'all',
 'allclose',
 'alltrue',
 'alterdot',
 'amax',
 'amin',
 'angle',
 'any',
 '
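
The full listing is long, and many entries are constants or private names rather than functions. Below is a minimal sketch of filtering a `dir()` listing down to public callables. It uses the standard-library `math` module as a stand-in so that it runs without NumPy, and `public_callables` is a hypothetical helper name, not part of any library.

```python
import math

def public_callables(module):
    """Return the names in a module that are public (no leading underscore) and callable."""
    return [name for name in dir(module)
            if not name.startswith('_') and callable(getattr(module, name))]

function_names = public_callables(math)
print(len(function_names))   # how many public functions the module offers
print(function_names[:5])    # a short preview instead of the whole list
```

The same helper should work on `numpy` or any other imported module, since it relies only on `dir()`, `getattr()`, and `callable()`.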

In [4]:
#this will display all variables currently in the global namespace
dir()

['In',
 'Out',
 '_',
 '_2',
 '_3',
 '__',
 '___',
 '__builtin__',
 '__builtins__',
 '__doc__',
 '__name__',
 '__package__',
 '_dh',
 '_i',
 '_i1',
 '_i2',
 '_i3',
 '_i4',
 '_ih',
 '_ii',
 '_iii',
 '_oh',
 '_sh',
 'exit',
 'get_ipython',
 'numpy',
 'quit']

#### Another option is to import the specific functions that you need from a library

In [7]:
from numpy import cos,pi

In [8]:
dir()

['In',
 'Out',
 '_',
 '_2',
 '_3',
 '_4',
 '_5',
 '_6',
 '__',
 '___',
 '__builtin__',
 '__builtins__',
 '__doc__',
 '__name__',
 '__package__',
 '_dh',
 '_i',
 '_i1',
 '_i2',
 '_i3',
 '_i4',
 '_i5',
 '_i6',
 '_i7',
 '_i8',
 '_ih',
 '_ii',
 '_iii',
 '_oh',
 '_sh',
 'cos',
 'exit',
 'get_ipython',
 'numpy',
 'pi',
 'quit']

In [9]:
cos(pi)

-1.0

#### Finally, it is possible to use the * notation to import every name from a library
AVOID THIS UNLESS ABSOLUTELY NECESSARY - it opens you up to the possibility of namespace conflicts. For instance, if you had already defined a variable or function with a name that matches something in the imported library, the library's version would silently replace yours. It is introduced here because you are likely to come across it in help documentation, so it is important to understand what is happening. 

In [10]:
from numpy import *

In [11]:
e

2.718281828459045
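
The overwrite risk of a star import can be sketched in a few lines. This is an illustration under stated assumptions: it uses the standard-library `math` module in place of NumPy so it runs anywhere, and it performs the star import inside a separate dictionary (the hypothetical `namespace`) so that your real session is left untouched.

```python
import math

# Simulate a global namespace in which the user has already defined `e`
namespace = {'e': 'my experiment label'}
print(namespace['e'])

# Run a star import inside that namespace; math.e replaces the user's value without warning
exec('from math import *', namespace)
print(namespace['e'])

overwritten = namespace['e'] == math.e
print(overwritten)
```

No error or warning is raised when the name is replaced, which is why this kind of bug can be hard to spot.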

In [14]:
#note how full the global namespace is now.
dir()

['ALLOW_THREADS',
 'BUFSIZE',
 'CLIP',
 'DataSource',
 'ERR_CALL',
 'ERR_DEFAULT',
 'ERR_IGNORE',
 'ERR_LOG',
 'ERR_PRINT',
 'ERR_RAISE',
 'ERR_WARN',
 'FLOATING_POINT_SUPPORT',
 'FPE_DIVIDEBYZERO',
 'FPE_INVALID',
 'FPE_OVERFLOW',
 'FPE_UNDERFLOW',
 'False_',
 'In',
 'Inf',
 'Infinity',
 'MAXDIMS',
 'MachAr',
 'NAN',
 'NINF',
 'NZERO',
 'NaN',
 'Out',
 'PINF',
 'PZERO',
 'PackageLoader',
 'RAISE',
 'SHIFT_DIVIDEBYZERO',
 'SHIFT_INVALID',
 'SHIFT_OVERFLOW',
 'SHIFT_UNDERFLOW',
 'ScalarType',
 'True_',
 'UFUNC_BUFSIZE_DEFAULT',
 'UFUNC_PYVALS_NAME',
 'WRAP',
 '_',
 '_11',
 '_12',
 '_13',
 '_2',
 '_3',
 '_4',
 '_5',
 '_6',
 '_8',
 '_9',
 '__',
 '___',
 '__builtin__',
 '__builtins__',
 '__doc__',
 '__name__',
 '__package__',
 '__version__',
 '_dh',
 '_i',
 '_i1',
 '_i10',
 '_i11',
 '_i12',
 '_i13',
 '_i14',
 '_i2',
 '_i3',
 '_i4',
 '_i5',
 '_i6',
 '_i7',
 '_i8',
 '_i9',
 '_ih',
 '_ii',
 '_iii',
 '_oh',
 '_sh',
 'absolute',
 'add',
 'add_docstring',
 'add_newdoc',
 'add_newdoc_ufunc',
 'ad

# Making your own library
As you work more in Python, you'll find yourself writing functions that you want to reuse regularly. Putting these functions into your own importable module saves you from duplicating effort in the future.

## A simple function
We'll define a simple function with two regular arguments and one keyword argument.

In [15]:
def cumulative_sum(lower_bound,upper_bound,upper_inclusive=True):
    '''
    THIS FUNCTION WAS DEFINED IN THE NOTEBOOK
    returns sum of all integers between lower_bound and upper_bound
    
    Includes upper_bound in sum by default. Exclude upper bound by setting:
    
        upper_inclusive = False        
    '''
    result = 0
    for number in range(lower_bound,upper_bound+int(upper_inclusive)):
        result += number

    return result
                

### Use the function

In [16]:
print cumulative_sum(0,100)

5050
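
A short sketch of how the keyword argument changes the result. The function body is repeated from the cell above only so that this block runs on its own, and the built-in `sum()` is used as an independent cross-check.

```python
def cumulative_sum(lower_bound, upper_bound, upper_inclusive=True):
    """Same logic as the notebook's function, repeated so this block is self-contained."""
    result = 0
    for number in range(lower_bound, upper_bound + int(upper_inclusive)):
        result += number
    return result

inclusive = cumulative_sum(0, 100)                         # 0 + 1 + ... + 100
exclusive = cumulative_sum(0, 100, upper_inclusive=False)  # 0 + 1 + ... + 99

# Cross-check against the built-in sum() over the same ranges
assert inclusive == sum(range(0, 101))
assert exclusive == sum(range(0, 100))

print(inclusive - exclusive)  # the difference is the upper bound itself: 100
```

Passing the keyword by name (`upper_inclusive=False`) rather than by position keeps the call readable for anyone who has not seen the function definition.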


### Use IPython's ? syntax (or the built-in help function) to display the docstring

In [17]:
cumulative_sum?
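
The `?` syntax only works in IPython and Jupyter. As a hedged sketch of the plain-Python equivalents, the block below reads a docstring directly; `add_one` is a hypothetical function defined here purely for illustration.

```python
import inspect

def add_one(value):
    """Return value plus one."""
    return value + 1

# The raw docstring is stored on the function object itself
print(add_one.__doc__)

# inspect.getdoc() returns the same text with indentation cleaned up
print(inspect.getdoc(add_one))

# In an interactive session, help(add_one) displays the signature and docstring together
```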

If you're likely to use a function repeatedly, it doesn't make sense to have to declare it each time you need it.  
Instead, you can store your functions in an external module and import them when you need them.

## We're going to pause a minute and make sure that everyone has a suitable text editor
Whitespace in Python has meaning, so we have to stick to a convention for how it is defined. Most people use four spaces as their standard indent. Hitting the space key four times would be tedious, so it's easier to hit the tab key, which means you need a text editor that lets you configure the tab key to insert four spaces. Here are a few editors that are very easy to set up and use:

On PC, use Notepad++: https://notepad-plus-plus.org/download/v6.9.2.html  
Installed in under 1 minute on my PC  
After installing, go to Settings=>Preferences=>Tab Settings, then click "Replace by space"

On Mac, use TextWrangler: http://www.barebones.com/products/textwrangler/download.html  
Installed in about a minute  
After installing, go to TextWrangler=>Preferences=>Editor Defaults, then click "Auto-expand tabs"

A really nice cross-platform editor is SublimeText: https://www.sublimetext.com/3  
SublimeText is free to try, but $70 to buy. It'll remind you every so often that you haven't paid yet, but otherwise it's fully functional in free mode.  
After installing, click on the "spaces" button in the lower right corner, then click "indent using spaces" and make sure that "tab width: 4" is selected.  

Here's a very comprehensive list of available editors if you want to explore:  
https://wiki.python.org/moin/PythonEditors

## Create an external module - then import it
In the current working directory, we've already created a file called "my_functions.py" and put our function definition there.  Open that file up in your text editor and take a look at it.  

We can then import that function and use it in the same way that the function declared above was used.

In [18]:
# Importing the module gives you access to all of the functions defined in that module
import my_functions

#### Now, if you type "my_functions." (including the trailing dot), then hit tab, you'll see the function in the list of available names

In [None]:
my_functions.

In [19]:
print my_functions.cumsum(0,100)

5050


#### You can also assign the module to a new variable name while importing
This is especially useful if you'll be using the module repeatedly and don't want to type out the full name every time

In [1]:
import my_functions as mf
print mf.cumsum(0,100)

5050


#### Note that we still have access to the docstring for help
It's good practice to write a brief docstring when creating new functions to help you and others interpret them later

In [21]:
mf.cumsum?

## Add another function to our module

In [8]:
def print_something(input_string):
    print input_string

In [2]:
# import new function and use it here
mf.print_something('hello_world')

hello_world
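
One caveat when adding functions to a module that has already been imported: Python keeps the first imported version in memory, so the new function is not visible until the kernel is restarted or the module is reloaded. The sketch below illustrates reloading under stated assumptions: the module name `demo_functions` and its contents are hypothetical, and the file is written to a temporary folder so that nothing real is overwritten.

```python
import os
import sys
import tempfile

try:
    from importlib import reload  # Python 3.4 and later
except ImportError:
    pass  # Python 2: reload() is a built-in function

# Hypothetical module in a temporary folder, created only for this demonstration
module_dir = tempfile.mkdtemp()
module_path = os.path.join(module_dir, 'demo_functions.py')
sys.path.append(module_dir)
sys.dont_write_bytecode = True  # avoid stale cached bytecode during the demo

with open(module_path, 'w') as f:
    f.write("def first():\n    return 'first'\n")

import demo_functions as df
print(hasattr(df, 'second'))  # the module does not define second() yet

# Edit the file on disk, as you would in a text editor
with open(module_path, 'a') as f:
    f.write("\ndef second():\n    return 'second'\n")

# Re-read the file so the new function becomes available
df = reload(df)
print(df.second())
```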


#### Non-standard libraries:
There are a number of additional libraries that are commonly used for scientific computing and data visualization. Many of the common ones are included as part of the Anaconda distribution (e.g. NumPy, SciPy, Pandas, Matplotlib, which will be covered later). Others can be found through the Python Package Index (https://pypi.python.org/pypi) and installed at the command line with pip (e.g. 'pip install allensdk').

In [4]:
import allensdk

In [6]:
from allensdk.api.queries.cell_types_api import CellTypesApi
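
If a package has not been installed, the import raises an `ImportError`. A minimal sketch of checking for a package before relying on it follows; `is_importable` is a hypothetical helper name, and the printed pip hint assumes the package is published on the Python Package Index under the same name.

```python
import importlib

def is_importable(module_name):
    """Return True if module_name can be imported in the current environment."""
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False

for name in ['json', 'allensdk']:
    if is_importable(name):
        print(name + ' is available')
    else:
        print(name + ' is missing; try: pip install ' + name)
```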

## A brief aside on the Python path  

The custom library we've been working on was saved in the working directory of this notebook (i.e., the external hard drive you were given). Code running from another directory wouldn't be able to import it.  

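Instead, libraries should be saved somewhere on your Python search path, which is the list of folders where Python looks for importable modules. That list is built from the current directory, any folders named in the PYTHONPATH environment variable, and installation defaults such as 'site-packages'.  

The 'path' attribute of the 'sys' module lets you view your current search path: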

In [7]:
import sys
sys.path

['',
 'C:\\Anaconda2\\lib\\site-packages\\thunder_python-1.0.0-py2.7.egg',
 'C:\\Anaconda2\\lib\\site-packages\\toolbox-0.1-py2.7.egg',
 'C:\\Users\\dougo\\Dropbox\\PythonCode',
 'C:\\Users\\dougo\\Dropbox\\PythonCode\\imaging_behavior_master',
 'C:\\Users\\dougo\\Dropbox\\PythonCode\\toolbox_master',
 'C:\\Users\\dougo\\Dropbox\\PythonCode\\CorticalMapping_master',
 'C:\\Users\\dougo\\Dropbox\\PythonCode\\isee_engine_master',
 'C:\\Users\\dougo\\Dropbox\\PythonCode\\zro',
 'C:\\Anaconda2\\python27.zip',
 'C:\\Anaconda2\\DLLs',
 'C:\\Anaconda2\\lib',
 'C:\\Anaconda2\\lib\\plat-win',
 'C:\\Anaconda2\\lib\\lib-tk',
 'C:\\Anaconda2',
 'c:\\anaconda2\\lib\\site-packages\\sphinx-1.4.1-py2.7.egg',
 'c:\\anaconda2\\lib\\site-packages\\setuptools-20.7.0-py2.7.egg',
 'C:\\Anaconda2\\lib\\site-packages',
 'C:\\Anaconda2\\lib\\site-packages\\win32',
 'C:\\Anaconda2\\lib\\site-packages\\win32\\lib',
 'C:\\Anaconda2\\lib\\site-packages\\Pythonwin',
 'C:\\Anaconda2\\lib\\site-packages\\IPython\\exte

Your 'site-packages' folder is where packages will be installed by default. Try moving the "my_functions.py" file into your 'site-packages' folder and importing again.
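
If you are unsure where that folder lives on your machine, the standard library can report it. This is a small sketch, assuming a standard Python installation; the exact paths printed will differ from machine to machine.

```python
import os
import sysconfig

# Folder where third-party pure-Python packages are installed for this interpreter
site_packages = sysconfig.get_paths()['purelib']
print(site_packages)

# An imported module written in Python records the file it was loaded from
print(os.__file__)
```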

Alternatively, you can add a location to your Python path using the sys module. If you use a cloud storage service such as Dropbox or Google Drive, you can make a folder that will be shared across the machines you work on.  

First, make a folder somewhere on your drive, then append that folder's location to sys.path.

In [None]:
sys.path.append('/users/dougo/dropbox/python_code')

Now, move a copy of 'my_functions.py' into that folder and give it a new name, then try to import it.

In [None]:
import my_functions_2 as mf2

It's important to note that any paths added using 'sys.path.append' will only remain on the search path for the duration of that session. We'll return to this in the exercises below.
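
The whole sequence can be sketched end to end. This is an illustration under stated assumptions: a temporary folder stands in for your own code folder, and the module name `my_path_demo` and its `add` function are hypothetical, written here only so the example is self-contained.

```python
import importlib
import os
import sys
import tempfile

# A temporary folder stands in for your own code folder (e.g. one inside Dropbox)
code_dir = tempfile.mkdtemp()

# Hypothetical module, created only for this demonstration
with open(os.path.join(code_dir, 'my_path_demo.py'), 'w') as f:
    f.write("def add(a, b):\n    return a + b\n")

# Before the folder is on sys.path, the import fails
try:
    importlib.import_module('my_path_demo')
    found_before = True
except ImportError:
    found_before = False
print(found_before)

# After appending the folder, the same import succeeds
sys.path.append(code_dir)
demo = importlib.import_module('my_path_demo')
print(demo.add(2, 3))
```

Restarting the kernel discards the appended entry, so the import would fail again until the folder is appended once more or added permanently.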

## Exercises

### Exercise 1:
  * Write a function that takes two input arguments, then returns their sum
  * Save it in an external library in your current working directory
  * Import it
  * Ensure that it works as expected after importing

In [None]:
#introduce imp module

### Exercise 2:
  * Move your library to a folder that is not on the PYTHONPATH by default
  * Permanently add the folder to your PYTHONPATH (see links below)
  * To ensure that the path has been successfully added to your PYTHONPATH, open a new notebook and import and use your function  
  
Instructions for Windows: http://stackoverflow.com/questions/3701646/how-to-add-to-the-pythonpath-in-windows-7  
Instructions for Mac: http://stackoverflow.com/questions/3387695/add-to-python-path-mac-os-x
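
After following either set of instructions, one way to confirm that the change took effect is to inspect the environment variable from a freshly started session. This is a minimal sketch; the variable may be unset on your machine, in which case the list is simply empty.

```python
import os
import sys

# PYTHONPATH is an environment variable; fall back to '' if it is not set
raw_value = os.environ.get('PYTHONPATH', '')

# Entries are separated by ';' on Windows and ':' on Mac/Linux (os.pathsep)
pythonpath_entries = [entry for entry in raw_value.split(os.pathsep) if entry]
print(pythonpath_entries)

# Each entry should usually also appear in sys.path of a new session
# (relative paths may be stored in a different form, so a mismatch is not always an error)
for entry in pythonpath_entries:
    print('%s on sys.path: %s' % (entry, entry in sys.path))
```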


In [None]:
# don't permanently add to path, use sys.path.append

In [None]:
import sys
sys.path