# Table of Contents
* [Learning Objectives:](#Learning-Objectives:)
* [Conventions](#Conventions)
* [Setting up your environment](#Setting-up-your-environment)
	* [Working with `conda`](#Working-with-conda)
		* [Upgrade Anaconda](#Upgrade-Anaconda)
		* [Creating environments](#Creating-environments)
		* [Environment files](#Environment-files)
	* [Additional Modules](#Additional-Modules)
* [Finding Packages](#Finding-Packages)
* [PyData Stack](#PyData-Stack)
* [Python 3](#Python-3)
* [The Python Interpreter](#The-Python-Interpreter)
	* [Jupyter Notebooks](#Jupyter-Notebooks)


# Learning Objectives:

After completion of this module, learners should be able to:

* use `conda` to manage Python environments & packages on their computer
* distinguish & use distinct ways of executing Python code (e.g., command-line scripts, interactive shell, IDE, notebook)

# Setting up your environment

For this course we highly recommend installing the free [Anaconda](http://continuum.io/anaconda) Python distribution. Anaconda has [several hundred packages](http://docs.continuum.io/anaconda/pkg-docs) regularly used in science, math, engineering, and data analysis applications. Installed packages and environments can be maintained using the open source `conda` binary package manager.

Once Anaconda has been installed, the following tutorial on `conda` can be performed in a command-line terminal. Use `Terminal` or [iTerm](https://www.iterm2.com/) on Mac OS X. On Windows, use the `cmd` prompt or the `Anaconda Prompt`. Note that while Windows PowerShell may work, it is not officially supported.

In the terminal, the Python version banner also reports the Anaconda version.

<img src='img/python-version.png'>

## Working with `conda`

`conda` is a package and environment manager that lets you experiment with code that depends on different versions of libraries. It contributes significantly to solving the problem of reproducibility in computational science. `conda` runs on Linux, Mac OS, and Windows, and it is language-agnostic, so it can manage projects written in many programming languages.

For more examples of what `conda` can do for you, please consult
+ Christine Doig's post [Conda for Data Science](http://continuum.io/blog/conda-data-science).
+ the [`conda` documentation](http://conda.pydata.org/docs/index.html)

When your terminal starts you will be in the `root` conda environment. The command `conda env list` (or `conda info -e`) shows the names of the available environments.

```bash
% conda env list
# conda environments:
#
root                  *  /Users/albert/Applications/anaconda3

```
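The same listing can be read from Python. The following is a minimal sketch, assuming that `conda env list --json` prints a JSON object whose `envs` key holds the environment paths; the helper names `environment_names` and `list_environments` are hypothetical and chosen only for illustration.

```python
import json
import subprocess


def environment_names(conda_json):
    """Map environment names to paths, given `conda env list --json` text."""
    prefixes = json.loads(conda_json)["envs"]
    # Use the last path component as the name. For the root install this is
    # the install folder name (e.g. "anaconda3"), not the label "root".
    return {
        prefix.replace("\\", "/").rstrip("/").split("/")[-1]: prefix
        for prefix in prefixes
    }


def list_environments():
    """Run conda and parse its output; requires `conda` on the PATH."""
    output = subprocess.check_output(
        ["conda", "env", "list", "--json"], universal_newlines=True
    )
    return environment_names(output)


# Sample text shaped like the assumed JSON output, so the parser can be
# tried without conda being installed.
sample = (
    '{"envs": ["/Users/albert/Applications/anaconda3", '
    '"/Users/albert/Applications/anaconda3/envs/coursework"]}'
)
print(environment_names(sample))
```

Calling `list_environments()` in a terminal session where `conda` is available would return the same kind of dictionary for your own machine.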

The command `conda list` will show the packages and versions installed in the active environment. There are about 150 packages installed by default, and their versions were determined on the date of the Anaconda release.

```bash
% conda list
# packages in environment at /Users/albert/Applications/anaconda3:
#
You are using pip version 7.0.3, however version 7.1.2 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
_license                  1.1                      py34_0  
abstract-rendering        0.5.1                np19py34_0  
alabaster                 0.7.3                    py34_0  
anaconda                  2.3.0                np19py34_0  
appscript                 1.0.1                    py34_0  
argcomplete               0.8.9                    py34_0  
astropy                   1.0.3                np19py34_0  
babel                     1.3                      py34_0  
...
xlsxwriter                0.7.3                    py34_0  
xlwings                   0.3.5                    py34_0  
xlwt                      1.0.0                    py34_0  
xz                        5.0.5                         0  
yaml                      0.1.6                         0  
zeromq                    4.0.5                         0  
zlib                      1.2.8                         0 
```
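If you want to work with this listing programmatically, the text can be split into fields. The sketch below assumes the three-column layout (name, version, build) shown above; other `conda` versions may print additional columns, so treat `parse_conda_list` as a hypothetical helper rather than a general parser.

```python
def parse_conda_list(text):
    """Return {package: (version, build)} from `conda list` text output."""
    packages = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep only three-column "name version build" rows. This skips blank
        # lines, '#' comment lines, pip notices, and the "..." placeholder.
        if len(fields) != 3 or line.startswith("#"):
            continue
        packages[fields[0]] = (fields[1], fields[2])
    return packages


# A few lines copied from the listing above.
sample_listing = """\
# packages in environment at /Users/albert/Applications/anaconda3:
#
You are using pip version 7.0.3, however version 7.1.2 is available.
astropy                   1.0.3                np19py34_0
babel                     1.3                      py34_0
...
xz                        5.0.5                         0
"""

for name, (version, build) in sorted(parse_conda_list(sample_listing).items()):
    print(name, version, build)
```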

Individual Python packages can be installed or upgraded by running `conda install <package-name>`.

### Upgrade Anaconda

`conda` can be used to upgrade the Anaconda distribution. New package versions will be downloaded and installed into the root environment, but the major and minor Python version will not change: updating Anaconda will not upgrade from Python 2.7 to Python 3 or from Python 3.4 to Python 3.5. While there are ways of upgrading Python in the `root` environment, the best option may be to create a separate `conda` environment.

```
% conda update conda
% conda update anaconda
```

### Creating environments

This course has been verified to work with Python 3.4+.  You can create a `conda` environment in the terminal and specify particular versions of packages:
```bash
% conda create -n coursework python=3.5 jupyter
```

The preceding command will download any required libraries or binaries. In this case, requesting Python 3.5 with the `jupyter` package will also install `ipython` and several other packages that are required to run the Jupyter Notebook server. These packages are downloaded to a directory called `envs/coursework` in your Anaconda install path and will only be available to the `coursework` conda environment.

To use this new environment, you would type

**Linux and Mac OS X**
```bash
% source activate coursework
(coursework)% # The prompt changes to show that the new environment is active
```

**Windows**
```
C:\Users\albert> activate coursework
Activating environment "C:\Anaconda3\envs\coursework"...

[coursework] C:\Users\albert>
```

Once the environment is active, you have access to all the dependencies required for your project, and the shell prompt shows the environment's name. To deactivate this environment and return to your previous default environment, you would type

**Linux and Mac OS X**
```bash
(coursework)% source deactivate
%  # The prompt returns to normal once the environment is deactivated
```

**Windows**
```
[coursework] C:\Users\albert> deactivate
Deactivating environment "C:\Anaconda3\envs\coursework"...

C:\Users\albert>
```
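From inside Python you can check which environment is in use. This is a small sketch, assuming that conda's activation step exports the `CONDA_DEFAULT_ENV` environment variable; `active_environment` is a hypothetical helper name.

```python
import os
import sys


def active_environment():
    """Describe the interpreter and conda environment running this code."""
    return {
        # Directory of the Python installation in use. Inside a conda
        # environment this is the environment's own folder.
        "prefix": sys.prefix,
        # Assumption: conda sets this variable on activation. The value is
        # None when the variable is not set.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


print(active_environment())
```

After activating `coursework`, the `prefix` entry should point into the `envs/coursework` directory described above.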

Other useful `conda` commands include `conda info`, `conda search`, `conda list`, and so on.

### Environment files

It is possible to create a `conda` environment specified in a `yaml` file, which is useful for sharing work with others. Files can be shared over email, GitHub, etc.

The file below named `environment.yml` specifies a `conda` environment called `pandas-tutorial-2`.

```bash
% cat environment.yml
name: pandas-tutorial-2
dependencies:
  - python=3.4
  - pandas
  - bokeh
  - blaze
  - jupyter
  - numpy
```

To create the environment from the `environment.yml` file in the working directory, run:

```bash
% conda env create 
```

The current `conda` environment can be exported to a yaml file and shared.

```bash
% conda env export > freeze.yml
```

A specific yaml file can be used to create an environment by using the `-f` flag.

```bash
% conda env create -f freeze.yml
```
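An environment file with this simple structure can also be generated from Python. The sketch below only reproduces the two keys used in the `environment.yml` example (`name` and `dependencies`); `render_environment_yml` is a hypothetical helper, not part of `conda`.

```python
def render_environment_yml(name, dependencies):
    """Return the text of a minimal conda environment file."""
    lines = ["name: {}".format(name), "dependencies:"]
    # Each dependency becomes one indented list item, e.g. "  - python=3.4".
    lines.extend("  - {}".format(dep) for dep in dependencies)
    return "\n".join(lines) + "\n"


text = render_environment_yml(
    "pandas-tutorial-2",
    ["python=3.4", "pandas", "bokeh", "blaze", "jupyter", "numpy"],
)
print(text)
```

Writing `text` to a file named `environment.yml` would give a file equivalent to the one shown earlier, which `conda env create` can then read.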

## Additional Modules

During the various modules of this course, we may use some of the following packages that might be worth installing now, in advance:

```
% conda install -y numpy
% conda install -y pandas
% conda install -y matplotlib
% conda install -y bokeh
% conda install -y scipy
% conda install -y scikit-learn
% conda install -y xlrd
```

Many of the [PyPI](https://pypi.python.org/pypi) packages available using `pip` are not directly available from `conda`. Packages installed using `pip` will only be installed in the current environment.

We'll need one such `pip`-only package for this course.

```
% pip install memory_profiler
```
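To verify that these installs succeeded, you can ask Python whether each module can be found in the current environment. This is a minimal sketch using the standard library's `importlib.util.find_spec`; the `MODULES` list and the `missing_modules` helper are illustrative names. Note that the import name can differ from the package name (scikit-learn is imported as `sklearn`).

```python
import importlib.util

# Import names for the packages listed above.
MODULES = [
    "numpy", "pandas", "matplotlib", "bokeh",
    "scipy", "sklearn", "xlrd", "memory_profiler",
]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Missing:", missing_modules(MODULES))
```

An empty list means every module is available; any name printed can be installed with `conda install` or `pip install` as shown above.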

# Finding Packages

One of the great things about Python is the wide variety and large number of high-quality packages that are available in the ecosystem. A great way to solve a problem is to find a suitable package that brings you most of the way, then build some custom code using that package.

As mentioned above, [PyPI](https://pypi.python.org/pypi) is a great resource. You can also search for ``conda`` packages with: ``conda search <package_name>``

Here are some mailing lists that provide updates on the ecosystem:

- [Python Announce ML](https://mail.python.org/mailman/listinfo/python-announce-list)
- [Python Weekly ML](http://www.pythonweekly.com/)

Further, there are some really great scientific Python conferences:

- [SciPy](http://conference.scipy.org/)
- [PyData](http://pydata.org/)

# PyData Stack

<img src="img/PyData_Stack.png">

# Python 3

This tutorial (and Python in general) runs more smoothly under Python 3. There is still legacy Python 2 code in the wild and you may well encounter it, but we will use Python 3 in this tutorial.

When working in a notebook, to execute a cell while advancing the cursor to the next cell beyond, press  &lt;`Shift`&gt;&lt;`Enter`&gt; while the cursor is active in the cell. To execute the cell *without* advancing to the next cell, press &lt;`Control`&gt;&lt;`Enter`&gt; instead.

Whether you're running on Python 2 or Python 3, please install the [Python-Future](http://python-future.org/futurize.html) module in your environment:
```
% conda install -y future
```

# The Python Interpreter

The Python language is designed to be explicit, simple, and readable. Python code is usually run through an *interpreter* that reads and executes the code's intent in a single pass. Traditionally, programs are run through a *compiler* to generate an *executable* file (or *object code*); in practice, what that means is an extra step between writing code and running/executing it. Programming languages that run through interpreters are favored by many because they encourage an experimental approach to programming that is more difficult to achieve with compiler-based workflows.

There are many great tools one can use to experiment with Python:
* [IPython](http://ipython.org), the interactive Python environment (*included with Anaconda*)
    * command line: ``ipython``
    * QT Console: ``ipython qtconsole``
* [Spyder](https://code.google.com/p/spyderlib/), a Python IDE (*included with Anaconda*)
* [Sublime Text](http://www.sublimetext.com/), a cross-platform programmer's text editor
* [PyCharm](https://www.jetbrains.com/pycharm/), another Python IDE (free community edition)
* [PyDev](http://pydev.org/), a plugin for Python development in the Eclipse IDE

## Jupyter Notebooks

For this course, we are going to work with Python in a [Jupyter notebook](https://jupyter.org/) (formerly called an IPython notebook). Jupyter is a web application for creating and sharing documents that contain code, equations, and visualizations. Notebooks can be created in languages popular in data science, such as Python, R, Julia, Scala, and many others.

To start your notebook session:

1. Launch your terminal
2. Activate your conda environment
3. `cd` to the directory where the course was extracted
4. Run `jupyter notebook`

Let's experiment in the next few cells. Remember, to execute a cell while advancing the cursor to the next cell beyond, press ``<Shift><Enter>`` while the cursor is active in the cell. To execute the cell *without* advancing to the next cell, press ``<Control><Enter>`` instead.

In [None]:
# This command, well, prints "Hello, world!" as is commonly done to learn programming.
print("Hello, world!")

In [None]:
print("Goodbye")

In [None]:
import this  # This command loads a module that prints the Zen of Python.

In [None]:
import antigravity # This command opens a new browser page or tab with an xkcd comic.

In [None]:
# This command loads the "sys" module into the Python session
import sys

In [None]:
# This command prints the "version" attribute (a string) of the "sys" module
print(sys.version) 

The last cell above shows you what version number of Python you are currently running. The version number has the form
``MAJOR.MINOR.MICRO``, e.g. ``1.5.2`` means *major* version 1, *minor* version 5, and *micro* version 2. The meanings of these revision levels are as follows:
* Major version: Significant change to language structure (grammar and reserved words) and implementation.
* Minor version: Attempts to be forward- and backward-compatible, but introduces new *Standard Library* modules or updates to existing modules.
    * forward-/backward-compatibility is not guaranteed
    * at least 2 minor versions advance notice is given of language changes
* Micro version: Only bug fixes; no language or API changes.
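
The `MAJOR.MINOR.MICRO` scheme maps naturally onto a tuple of integers, which makes version checks simple. The following sketch uses a hypothetical helper, `parse_version`, that handles only plain three-part version strings like the example above (no suffixes such as `rc1`).

```python
import sys


def parse_version(text):
    """Split 'MAJOR.MINOR.MICRO' into a tuple of three integers."""
    major, minor, micro = (int(part) for part in text.split("."))
    return (major, minor, micro)


print(parse_version("1.5.2"))  # (1, 5, 2)

# Tuples compare element by element, so newer versions compare as greater.
print(parse_version("3.5.0") > parse_version("3.4.3"))  # True

# sys.version_info supports the same comparison for the running interpreter.
print(sys.version_info >= (3, 4))
```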

We can use a few more commands to extract more information about the ``sys`` module just loaded into the Python session.

In [None]:
help(sys)

In [None]:
print(sys.flags)
print(sys.version_info)

In [None]:
sys.version_info.major