<center>
<h1>Python Application Building and Version Control</h1>
<img src="http://i.imgur.com/91PUPZA.png" width=20%>
</center>

<center>
AY 250, Spring 2013; Josh Bloom
</center>

<h2>Important reminders</h2>
<h3>Contact</h3>
- email us: `ucbpythonclass+seminar@gmail.com`
- Piazza: https://piazza.com/berkeley/fall2013/ay250/home
<h3>Keeping up to Date</h3>
 Keep updated with the git repository:

```bash
git clone http://github.com/python-seminar.git
# and keep updated by pulling often
git pull
```
*We’ll talk more about git later today*

<h3>Help</h3>
 Monday help sessions are 10am - noon, Evans 481


<center><h1>Outline</h1></center>

* Managing Packages - `pip`, `setup.py`, `virtualenv`, `conda`, ...
* Command Line Parsing - `argparse`
* Building Modules & Packages
* Breakout
* Version Control - `git` 
* Debugging & Testing - `pdb`, `ipy`, `%debug`, `pylint`, `pep8`, `nose`
* Distribution - `distutils2`

<h1>Getting Packages - `pip`</h1>

* `pip` is a package manager for Python, similar to apt-get for Ubuntu or MacPorts/Homebrew for OSX 
* `easy_install` is the outdated version - still works, but is being phased out
* These are all run from the command line (not within Python) and are automatically associated with your Python installation
* Downloads packages from the official PyPI - the Python Package Index
* You may have to install `pip` first using: `easy_install pip`

<h3>Enthought/Continuum Package Managers</h3>

* The package manager for Anaconda is `conda`; `enpkg` is the manager for Enthought

**In general, you should try `conda` before `pip`, but the interactions between these two package managers shouldn't be too painful.**

## Installing a package ##

One of these *should* work for you (try them in this order) on the command line:

     conda install <pkg>  # or enpkg install simplejson
     pip install <pkg>
     sudo pip install <pkg>
     easy_install <pkg>

In [None]:
!conda install simplejson

In [None]:
!pip install simplejson
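
Once an install finishes, a quick check from within Python confirms that the package is importable and shows which copy Python picked up. The following is a minimal sketch: the helper name `describe_package` is made up for illustration, and it inspects the standard-library `json` package so that it runs even where `simplejson` is not installed.

```python
import importlib


def describe_package(name):
    """Return (version, location) for an importable package, or None."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    # Not every package defines __version__, and built-in modules have no __file__
    version = getattr(module, "__version__", "unknown")
    location = getattr(module, "__file__", "built-in")
    return version, location


print(describe_package("json"))
print(describe_package("no_such_package_xyz"))  # hypothetical name, prints None
```

Swapping in `"simplejson"` after the install should report the path it was installed to.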

## upgrading ##

    conda upgrade <pkg>
    pip install --upgrade <pkg>
    sudo pip install --upgrade <pkg>


In [None]:
!pip install --upgrade simplejson

## uninstalling ##

    conda remove <pkg>
    pip uninstall <pkg>
    sudo pip uninstall <pkg>

In [None]:
!conda remove simplejson

In [None]:
!pip uninstall simplejson

Both `pip` and `conda` have a pretty rich command set and useful interface:

    pip --help
    pip install --help
    pip search sator
    
    http://docs.continuum.io/conda/index.html
    conda install ~/redis-py-2.7.2-py27_0.tar.bz2
    conda install matplotlib=1.2

In [None]:
!conda --help

<center>What if you don’t have superuser privileges?   Maybe on a department computer?  You can install packages to your own folder, and include them by modifying your `.bashrc` or `.profile` file.</center>

    pip install <pkg> --target=<my_choice>

In [None]:
!pip install simplejson --target=/tmp/

In [None]:
!ls /tmp/simple*

[EGG files are like .jar files: self-contained packages with code and metadata. Have a look at http://mrtopf.de/blog/en/a-small-introduction-to-python-eggs/]

Now, you can have Python know about your special installation directory by modifying your `PYTHONPATH` environment variable in your .bashrc, .cshrc, or .tcshrc file:
```bash
#BASH Style: 
# point at the --target directory itself, not the package folder inside it
export PYTHONPATH=/tmp:$PYTHONPATH
#CSH Style:
setenv PYTHONPATH /path/to/my_choice:$PYTHONPATH
```
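
To see the effect of `PYTHONPATH` without editing any shell startup file, the variable can be set for a single child process. This is only a sketch under stated assumptions: a temporary directory stands in for `<my_choice>`, and `mymodule.py` is a hypothetical one-line module written there for the demonstration.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for the directory passed to `pip install --target=...`
target = tempfile.mkdtemp()
with open(os.path.join(target, "mymodule.py"), "w") as handle:  # hypothetical module
    handle.write("GREETING = 'hello from my_choice'\n")

# Prepend the directory to PYTHONPATH for the child process only
env = dict(os.environ)
existing = env.get("PYTHONPATH")
env["PYTHONPATH"] = target if not existing else target + os.pathsep + existing

# The child interpreter can import the module because of PYTHONPATH
output = subprocess.check_output(
    [sys.executable, "-c", "import mymodule; print(mymodule.GREETING)"],
    env=env)
print(output.decode().strip())   # hello from my_choice
```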

# Getting and Installing Packages with `setup.py` #

Sometimes `conda` and `pip` cannot find a codebase you're trying to install. In this case you'll need to do it yourself using a tarball and a `setup.py` file. This is the most straightforward way to get packages: download them from the developer’s website and hope that they’ve followed the standard conventions.

There is a standard Python package distribution scheme using `distutils2` and `setup.py` files...more on that later.

Basic workflow of installing a package with `setup.py`:
```bash
$ cd [folder with package and setup.py file]
$ sudo python setup.py install
   # [ progress report ... ]
   # Finished processing dependencies for [package]
   # [if you want more info, there are several options to modify]
$ python setup.py --help install
```


To install into a custom directory (e.g., if you don't have sudo):
```bash
# {-- on unix --}
$ python setup.py install --home <my_choice>

# {-- on windows --}
$ python setup.py install --prefix "my_choice"
```

# Managing Packages - `virtualenv`/`conda` environments #

* Open Source software is constantly changing - how do you protect working code against future updates?
* Or, what if there is a beta release of a package you want to try, but you don’t want to fully commit yet?
* `virtualenv` and `conda create -n` each create a local, self-contained, and totally separate Python installation.
* Use it to create a local Python ecosystem, separate from your computer’s main system, so that you can do what you want in one without affecting the other.

# `virtualenv` #

Installing:

In [None]:
!pip install --upgrade virtualenv

Creating a new environment:

In [None]:
!virtualenv --no-setuptools LocalPython

In [None]:
!virtualenv --no-setuptools --system-site-packages Test1

During a shell session, you can source this environment so that it runs as the default:

```bash
$ source LocalPython/bin/activate
(LocalPython)$
#[ pip and python commands now point to new environment ]
(LocalPython)$ which python
LocalPython/bin/python
```

We can get out of the environment:

```bash
(LocalPython)$ deactivate
```
To remove the environment, just delete its directory:

```bash
rm -r LocalPython
```
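
From inside Python it is possible to check which installation is actually running. The sketch below is a best-effort check, not an official API: the helper name `in_virtual_env` is made up, and it relies on `sys.real_prefix` (set by older `virtualenv` releases) or `sys.base_prefix` (set by newer tools). It does not detect `conda` environments.

```python
import sys


def in_virtual_env():
    """Best-effort check for an active virtualenv-style environment."""
    # Older virtualenv releases set sys.real_prefix; newer tools set
    # sys.base_prefix. Outside an environment both match sys.prefix.
    base_prefix = getattr(sys, "real_prefix", None) or getattr(
        sys, "base_prefix", sys.prefix)
    return base_prefix != sys.prefix


print(sys.executable)      # the interpreter actually running
print(sys.prefix)          # root of the active installation
print(in_virtual_env())
```

After `source LocalPython/bin/activate`, `sys.executable` should point inside `LocalPython/bin/`.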

# conda -n #
http://www.continuum.io/blog/conda

In [None]:
!conda info

In [None]:
!conda info -e

In [None]:
!conda search numpy

```bash
Py4DS> conda create -n numpy15 numpy=1.5.1

Package plan for creating environment at /Users/jbloom/anaconda/envs/numpy15:

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    numpy-1.5.1                |           py27_4         2.3 MB

The following packages will be linked:

    package                    |            build
    ---------------------------|-----------------
    numpy-1.5.1                |           py27_4
    python-2.7.5               |                2
    readline-6.2               |                1
    sqlite-3.7.13              |                1
    tk-8.5.13                  |                1
    zlib-1.2.7                 |                1

Proceed ([y]/n)? y

Fetching packages ...
numpy-1.5.1-py27_4.tar.bz2 100% |####################| Time: 0:00:02 816.17 kB/s
Extracting packages ...
[      COMPLETE      ] |##################################################| 100%
Linking packages ...
[      COMPLETE      ] |##################################################| 100%
#
# To activate this environment, use:
# $ source activate numpy15
#
# To deactivate this environment, use:
# $ source deactivate
```

In [None]:
!conda info -e

In [None]:
!ls /Users/jbloom/anaconda/envs/numpy15/bin

We could make this environment the default if we want to:
```bash
export PATH=~/anaconda/envs/numpy15/bin:$PATH
```
And if we want to remove that environment:
```bash
conda remove -n numpy15 --all
```

<center><h1> Command Line Parsing</h1></center>

<center>`python myawesomeprogram.py -o option1 -p parameter2 -Q -R`</center>
<p>
 **Goal**: build a command-line 'standalone' codebase in Python, w/ CL options & keywords
 
 **Solution**: argparse, which has been built in to Python 2.7 & above (if you don’t have it, you can get it with `pip install argparse`)
 
* Allows for user-friendly command line interfaces: the program declares which arguments it accepts, and `argparse` works out how to parse them from the command line.

* Also automatically generates help & usage messages and issues errors when invalid arguments are provided.

(Note: `optparse` is deprecated in favor of `argparse`.)


In [None]:
import argparse

# Setting up a parser #


* First step for `argparse`: create parser object & tell it what arguments to expect. 
* It can then be used to process the command line arguments at runtime
* Parser class: `ArgumentParser`. Takes several arguments to set up the description used in the help text for the program & other global behaviors 
   
 <p>
See  http://www.doughellmann.com/PyMOTW/argparse/
</p>
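
As a small illustration of the automatically generated help text, the sketch below builds a parser and asks for its help message as a string. The `-o` option and its help string are placeholders invented for this example; `prog` is set explicitly so the usage line shows the program name from the command above.

```python
import argparse

parser = argparse.ArgumentParser(
    prog='myawesomeprogram.py',
    description='Sample Application')
parser.add_argument('-o', dest='option', help='an example option')  # hypothetical

# format_help() returns the text that -h / --help would print
help_text = parser.format_help()
print(help_text)
```

Running the script with `-h` prints the same text and exits.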

In [None]:
!cat myfile.py

In [None]:
parser = argparse.ArgumentParser(description='Sample Application')
print "hi"

# Defining Arguments & Parsing

* Arguments can trigger different actions, specified by the action argument to add_argument(). 
* Several supported actions (next slide).
* Once all of the arguments are defined, you can parse the command line by passing a sequence of argument strings to parse_args(). 
* By default, arguments are taken from `sys.argv[1:]`, but you can also pass your own list.
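
Passing your own list is handy for trying a parser out inside a notebook, where `sys.argv` belongs to the notebook process. A minimal sketch, with the `-n` and `-v` options invented for illustration:

```python
import argparse

parser = argparse.ArgumentParser(description='Sample Application')
parser.add_argument('-n', dest='count', default='10', help='a value')  # hypothetical
parser.add_argument('-v', action='store_true', dest='verbose')         # hypothetical

# Pass an explicit list instead of reading sys.argv[1:]
results = parser.parse_args(['-n', '200', '-v'])
print(results.count)      # 200 (stored as a string; no type= was given)
print(results.verbose)    # True

# With an empty list, the defaults apply
defaults = parser.parse_args([])
print(defaults.count)     # 10
print(defaults.verbose)   # False
```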

In [None]:
%%file argparse_action.py
import argparse
parser = argparse.ArgumentParser(description='Sample Application')
parser.add_argument('required_arg_1', help='This positional argument is required')
parser.add_argument('required_arg_2', help='This positional argument is also required')
parser.add_argument('-s', action='store', dest='simple_value',
                    help='Store a simple value')
parser.add_argument('-c', action='store_const', dest='constant_value',
                    const='value-to-store',
                    help='Store a constant value')
parser.add_argument('-t', action='store_true', default=False,
                    dest='boolean_switch',
                    help='Set a switch to true')
parser.add_argument('-a', action='append', dest='collection',
                    default=[],
                    help='Add repeated values to a list',
                    )
parser.add_argument('-A', action='append_const', dest='const_collection',
                    const='value-1-to-append',
                    default=[],
                    help='Add different values to list')
parser.add_argument('-B', action='append_const', dest='const_collection',
                    const='value-2-to-append',
                    help='Add different values to list')
parser.add_argument('--version', action='version', version='%(prog)s 1.0')

results = parser.parse_args()
print 'required_args    =', results.required_arg_1, results.required_arg_2
print 'simple_value     =', results.simple_value
print 'constant_value   =', results.constant_value
print 'boolean_switch   =', results.boolean_switch
print 'collection       =', results.collection
print 'const_collection =', results.const_collection

* store: Save the value, after optionally converting it to a different type (default)
* store_const: Save the value as defined as part of the argument specification, rather than a value that comes from the arguments being parsed
* store_true/store_false: Save the appropriate boolean value
* append: Save the value to a list.  Multiple values are saved if the argument is repeated
* append_const: Save a value defined in the argument specification to a list
* version: Prints version details about the program and then exits
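
The "optionally converting it to a different type" part of `store` is controlled by the `type` argument to `add_argument()`. The sketch below, which uses made-up option names, combines `type` with `store` and `append`, and also shows what happens on invalid input: `argparse` prints a usage message to stderr and raises `SystemExit`.

```python
import argparse

parser = argparse.ArgumentParser(description='Sample Application')
# Hypothetical options, for illustration only
parser.add_argument('-n', action='store', type=int, dest='npoints', default=100)
parser.add_argument('-a', action='append', type=float, dest='values', default=[])

results = parser.parse_args(['-n', '200', '-a', '1.5', '-a', '2'])
print(results.npoints)    # 200
print(results.values)     # [1.5, 2.0]

# 'many' cannot be converted to int, so parsing fails and exits
try:
    parser.parse_args(['-n', 'many'])
except SystemExit as exit_info:
    exit_code = exit_info.code
print(exit_code)          # 2
```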

<center><h2>Modules and Packages</h2></center>

* As code gets more involved, it becomes unwieldy & unnatural to keep everything in the same file, or even the same folder

* Functions written for one project are often useful in another

* Useful to break up code into modules and packages  - used like ‘package.module’

* **Module**: file containing defined functions & variables. **It must have a .py extension.**

* **Package**: a properly-organized folder containing modules (packages like NumPy are well-developed examples; you can also make your own)
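
The module/package distinction can be seen on something already installed. This sketch uses the standard-library `json` package (chosen only because it ships with Python) and its `decoder` module:

```python
import json
import json.decoder
import os

# A package is a directory of modules
print(os.path.basename(os.path.dirname(json.__file__)))   # json
print(hasattr(json, '__path__'))            # True: packages have __path__

# A module is a single file inside that directory
print(os.path.basename(json.decoder.__file__))
print(hasattr(json.decoder, '__path__'))    # False: plain modules do not

# Used like 'package.module'
decoder = json.decoder.JSONDecoder()
print(decoder.decode('[1, 2, 3]'))          # [1, 2, 3]
```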

In [None]:
!ls /Users/jbloom/Dev/Anaconda/lib/python2.7/site-packages

In [None]:
!ls /Users/jbloom/Dev/Anaconda/lib/python2.7/site-packages/numpy

<p><h2>Modules: Setting up your path</h2></p>
`PYTHONPATH`
Augment the default search path for module files. The format is the same as the shell’s PATH: one or more directory pathnames separated by os.pathsep (e.g. colons on Unix or semicolons on Windows). Non-existent directories are silently ignored.

In addition to normal directories, individual PYTHONPATH entries may refer to zipfiles containing pure Python modules (in either source or compiled form). Extension modules cannot be imported from zipfiles.
The default search path is installation dependent, but generally begins with prefix/lib/pythonversion. It is always appended to PYTHONPATH.

An additional directory (the one containing the script being run, or the current directory in an interactive session) is inserted in the search path in front of PYTHONPATH. The search path can be manipulated from within a Python program as the variable sys.path.
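
The separator rule is easy to check with `os.pathsep`. A short sketch, using placeholder directory names that need not exist:

```python
import os

# Build a search-path string the same way the shell's PATH is written
entries = ['/path/to/your/code', '/path/to/more/code']   # placeholder directories
pythonpath = os.pathsep.join(entries)
print(pythonpath)

# Split the current PYTHONPATH (if any) back into a list of directories
current = os.environ.get('PYTHONPATH', '')
print([p for p in current.split(os.pathsep) if p])
```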

Add to your .bashrc, .cshrc, or .tcshrc file:
```bash
#BASH Style: 
export PYTHONPATH=/path/to/your/code
#CSH Style: 
setenv PYTHONPATH /path/to/your/code
```

<p><h3>Modules: More Path Stuff</h3></p>

In [None]:
import sys
# Get a list of all paths python is looking at with sys.path
print sys.path[-4:]   # only look at the last 4 entries to save space
# Can append to this list:
# sys.path.append("/new/software/path/")

Paths appended this way are not preserved after Python exits. 
For a lasting change, set the `PYTHONPATH` environment variable described in the previous slide.
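
Both points can be checked in a few lines. In this sketch a temporary directory stands in for `/new/software/path/`, and `pathdemo.py` is a hypothetical module created just for the demonstration:

```python
import os
import subprocess
import sys
import tempfile

new_path = tempfile.mkdtemp()           # stand-in for /new/software/path/
with open(os.path.join(new_path, 'pathdemo.py'), 'w') as handle:  # hypothetical module
    handle.write('ANSWER = 42\n')

sys.path.append(new_path)
import pathdemo                         # found because new_path is now on sys.path
print(pathdemo.ANSWER)                  # 42

# A fresh interpreter builds its own sys.path, without the appended entry
check = 'import sys; print(%r in sys.path)' % new_path
output = subprocess.check_output([sys.executable, '-c', check])
print(output.decode().strip())          # False
```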

<p><h2> Packages</h2></p>

* If path is set correctly, code can be broken up into reasonable folders and imported as necessary, either by importing entire modules (.py files) or functions/classes within the modules.

* Put an `__init__.py` file in each folder you want to be able to import from.

* Code in `__init__.py` is run when the package, or any module or subpackage inside it, is imported.  Usually `__init__.py` is an empty file.
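
The points above can be tried end to end by building a throwaway package on disk. This is only a sketch: the package name `demopkg`, its module `tools.py`, and the function `double` are all invented for the example, and a temporary directory is used so nothing in your real path is touched.

```python
import os
import sys
import tempfile

root = tempfile.mkdtemp()
package_dir = os.path.join(root, 'demopkg')        # hypothetical package name
os.mkdir(package_dir)

# __init__.py marks the folder as a package; its code runs on first import
with open(os.path.join(package_dir, '__init__.py'), 'w') as handle:
    handle.write("INITIALIZED = True\n")
with open(os.path.join(package_dir, 'tools.py'), 'w') as handle:
    handle.write("def double(x):\n    return 2 * x\n")

# Make the folder that contains the package visible to Python
sys.path.insert(0, root)

from demopkg.tools import double    # importing a module runs __init__.py first
import demopkg

print(double(21))                   # 42
print(demopkg.INITIALIZED)          # True
```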

In [None]:
!ls /Users/jbloom/Classes/python-seminar/Breakouts/01_Versioning_Application_Building

# Breakout! #

* Go to the breakout folder in: `../Breakouts/01_Versioning_Application_Building/`

* Work on the file `breakout1.py`.  Do not move or modify the files in the other folders, though you will need to use them.  (You may add files to these directories if necessary.)

* Build up a command line parser which allows the user to specify:
 - how many datapoints to generate
 - whether to plot with a filled in histogram or an outlined one
 - the title of the plot
 - And then have the plot be generated.

* We want to be able to run a command like:

```bash
python breakout1.py -t -n 200 -T "My Awesome Title"
```

In [None]:
ls

In [None]:
%run breakout1_solution.py -t

In [None]:
%run breakout1_solution.py -n 200 -T "My Awesome Title"

In [None]:
%run breakout1_solution.py -n 2000 -T "2000!"