In [1]:
%matplotlib inline

In [2]:
from __future__ import print_function, division
import pandas as pd
import numpy as np

## Virtual environments

A system-wide installation of Python and its packages is not always what you need. Scientific software usually develops faster than operating systems are released, so the versions shipped with the OS (on both Mac and Linux) tend to lag behind. On the Mac the situation is further complicated by the inclusion of both 32-bit and 64-bit Python libraries.

The solution is to create an isolated (almost) self-contained environment that has all the packages you need for a particular project.

### Aside: package management

The program that tracks the list and state of the packages installed on a system is called a package manager.

* `apt-get`, `aptitude` - Debian-based Linux distributions
* `brew` (Homebrew) - Mac

These tools install, upgrade and uninstall software system-wide, resolving dependencies on the fly.
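
To make "resolving dependencies" concrete, here is a minimal sketch of how an install order can be derived from a dependency table. The function name `install_order` and the table contents are illustrative assumptions; real package managers also handle version constraints and conflicts, which this toy ignores.

```python
def install_order(package, depends_on, _seen=None):
    """Return packages ordered so that every dependency precedes its dependent."""
    seen = set() if _seen is None else _seen
    order = []
    if package in seen:
        return order  # already scheduled (also guards against cycles)
    seen.add(package)
    for dep in depends_on.get(package, []):
        order.extend(install_order(dep, depends_on, seen))
    order.append(package)  # the package itself comes after its dependencies
    return order


# Hypothetical dependency table, for illustration only.
depends_on = {
    "pandas": ["numpy", "python-dateutil"],
    "python-dateutil": ["six"],
    "numpy": [],
    "six": [],
}

print(install_order("pandas", depends_on))
```

The walk is depth-first, so a package is appended only after everything it needs has been appended.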

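However, Python programs and packages are also distributed via `pip`. `pip` is cross-platform, while the tools above are platform-specific. Mixing the two is risky: each keeps its own record of what is installed, so one can overwrite or break files that the other manages.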

The current best practice is to install system-level stuff with your OS package manager and to use `pip` only inside virtualenvs. The one exception is installing the `virtualenv` package itself.

### `virtualenv`

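This is the original, "pythonic" way to create isolated environments. It originated in Web development. Since Python 3.3 the standard library includes the `venv` module, which offers similar functionality; `virtualenv` itself remains a separate package.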
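
As a rough sketch, an environment can also be created from Python itself with the standard-library `venv` module (Python 3.3 and later). The helper name `make_env` and the directory name `demo-env` are made up for this example.

```python
import os
import tempfile
import venv


def make_env(parent_dir, name="demo-env"):
    """Create a virtual environment called `name` under `parent_dir`."""
    env_dir = os.path.join(parent_dir, name)
    # with_pip=False keeps the sketch fast; pass True to bootstrap pip as well.
    venv.create(env_dir, with_pip=False)
    return env_dir


env_dir = make_env(tempfile.mkdtemp())
# Every environment created by venv contains a pyvenv.cfg marker file.
print(os.path.exists(os.path.join(env_dir, "pyvenv.cfg")))
```

On the command line the same thing is usually done with `python3 -m venv demo-env`.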

### `conda`

This tool is developed and maintained by Continuum Analytics, the company behind the Anaconda Python distribution. Since we are using Anaconda anyway, we'll stick with this.

#### create a new environment

    $ conda create --name pydata3 python=3 pandas
    $ conda create --name pydata python=2 numpy matplotlib
    
#### list environments

    $ conda info --envs
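
If you want the same list inside Python, one option is to parse the JSON that conda prints when `--json` is added to the command. The sketch below assumes that output contains an `envs` key holding a list of environment paths; the helper `env_names` and the sample paths are invented for illustration.

```python
import json
import os


def env_names(conda_json):
    """Map environment names to paths, given `conda info --envs --json` output."""
    paths = json.loads(conda_json)["envs"]
    # The environment name is taken to be the last component of its path.
    return {os.path.basename(p.rstrip("/\\")): p for p in paths}


# Made-up sample of the assumed JSON structure; real paths depend on your setup.
sample = '{"envs": ["/home/me/anaconda/envs/pydata", "/home/me/anaconda/envs/pydata3"]}'

print(sorted(env_names(sample)))
```

In practice the JSON text would come from running the command, for example through `subprocess.check_output`.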

#### activate environment

    $ source activate pydata3
    (pydata3)$
    
#### once inside an environment you can use both `conda` and `pip`

    (pydata3)$ conda install numpy
    (pydata3)$ pip install ipython

#### deactivate environment

    (pydata3)$ source deactivate
    $

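Whatever you install while an environment is active is only available inside that environment. Note how the output of the `which python` command changes once an environment is active. In conda 4.4 and later, the preferred commands are `conda activate` and `conda deactivate`.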
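
The same check can be made from within Python. The following is a small sketch; the helper name `describe_interpreter` is made up, and it relies on conda setting the `CONDA_DEFAULT_ENV` variable when an environment is activated.

```python
import os
import sys


def describe_interpreter():
    """Collect a few facts that identify the active Python environment."""
    return {
        "executable": sys.executable,  # path of the running interpreter
        "prefix": sys.prefix,          # root directory of the environment
        # Set by conda on activation; None when no conda environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


for key, value in describe_interpreter().items():
    print(key, "=", value)
```

Running this before and after activating an environment should show different `executable` and `prefix` values.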

### `docker`

Docker combines arbitrary programs and packages (not only Python) into a portable image that can be run as a container on another host computer. A container behaves almost like a full-blown virtual machine, but it shares the host's kernel instead of booting its own operating system.

## Git

Git is a distributed source control program. It tracks changes to a project and stores them as a graph of commits, which lets you go back to the state of the code at any point in its history.
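
The graph idea can be illustrated with a toy model in which every commit records its parents. This is only a sketch of the concept, not how `git` stores data; the commit ids, messages and the `history` helper are invented.

```python
# Each commit id maps to (parent ids, message); a merge commit has two parents.
commits = {
    "a1": ([], "initial import"),
    "b2": (["a1"], "add parser"),
    "c3": (["a1"], "fix typo on a branch"),
    "d4": (["b2", "c3"], "merge branch"),
}


def history(commit_id, commits):
    """Return every commit reachable from `commit_id` by following parent links."""
    seen, stack, result = set(), [commit_id], []
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # a shared ancestor is visited only once
        seen.add(current)
        result.append(current)
        stack.extend(commits[current][0])
    return result


print(history("b2", commits))
```

Going "back in time" then amounts to picking a commit and reconstructing the project from it and its ancestors.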

>If it's not in `git` it doesn't exist.
(anonymous)

GitHub is a hosting service where many OSS projects keep their central repositories.

What's relevant to us is that `git` and `pip` together make it easy to install stuff straight from GitHub, here from the `rewrite` branch of the repository:

    $ pip install git+https://github.com/has2k1/ggplot.git@rewrite
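
The argument passed to `pip` has a simple anatomy: the `git+` prefix, the repository URL, and an optional `@` followed by a branch, tag or commit. A minimal sketch that assembles it (the helper name `pip_git_requirement` is hypothetical):

```python
def pip_git_requirement(repo_url, ref=None):
    """Build the `git+URL[@ref]` string that pip accepts for Git repositories."""
    requirement = "git+" + repo_url
    if ref:
        requirement += "@" + ref  # branch, tag, or commit hash
    return requirement


print(pip_git_requirement("https://github.com/has2k1/ggplot.git", ref="rewrite"))
```

Leaving out `ref` installs from the repository's default branch.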
    
An alternative way to do the same if you intend to play with the source:

    $ git clone https://github.com/has2k1/ggplot.git
    $ pip install -e ggplot
    
This tells `pip` to install the package from the existing `ggplot` directory in editable mode (`-e`), so changes you make to the source take effect without reinstalling.