One of the major hurdles in getting up and running with Python is the complexity of the installation
process, as there are not only multiple versions of Python to consider, but packages must
be downloaded and installed to the right location, and any dependencies between them must
also be managed.

Fortunately, several Python distributions have been put together that work as a one-stop installation of both Python and many of the extended libraries.

# Installing the Anaconda Distribution

The Anaconda distribution is cross-platform and comes with Python 2.7 or 3.6, as well as other tools and libraries such as Jupyter, Spyder, IPython, NumPy,
SciPy, pandas, seaborn and matplotlib already configured.

The fastest way to get up and running with Python is to download the Anaconda Python distribution
from [https://www.anaconda.com/download](https://www.anaconda.com/download), and follow the installation instructions.

The default installation path for the Python 3 Anaconda distribution on Windows is:

```
    C:\Users\<username>\AppData\Local\Continuum\Anaconda3
```
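To confirm which Python installation a session is actually using, the interpreter can report its own location and version. The following is a minimal sketch using only the standard `sys` module; the path printed depends on the machine, and on a default Windows Anaconda installation it would be expected to sit under the directory shown above.

```python
import sys

# Full path of the Python interpreter that is running this code
print(sys.executable)

# Version of that interpreter as a (major, minor, micro) tuple
version = tuple(sys.version_info[:3])
print(version)
```

If the path points somewhere other than the Anaconda directory, a different Python installation is probably first on the system path.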

# Jupyter Notebooks

There are several ways of using Python for data analysis. Spyder is a full integrated development environment (IDE), similar to RStudio. Another tool is the Jupyter Notebook, which we will be using here. Like R Notebooks and R Markdown, Jupyter Notebooks are designed for easy integration of text and programming. They aim to provide a more interactive workflow for Python programming, analysis and reporting. While an R Notebook is a script that gets rendered into an output file in one step, a Jupyter Notebook contains cells that can each be executed and rendered interactively.

## Starting Jupyter

Once installed, look for and run an application called Jupyter Notebook, which on Windows is located in:

```
    Start menu --> All programs --> Anaconda3 --> Jupyter Notebook 
```

Alternatively, open up a command line console and type:

```
    jupyter notebook
```

After about a minute, you should see the Jupyter application appear. The figure shows an
example of what Jupyter looks like on Mac. The appearance on Windows and Linux operating
systems will be slightly different.

![](img/fig/jupyter1.png)

You should see three main tabs in Jupyter on start-up:

* **Files** Your file directory

* **Running** Lists all of the notebooks currently running

* **Clusters** For using IPython in parallel with your cluster (beyond the scope of this course)

## Opening a new Jupyter Notebook

To open a new Jupyter Notebook, click the New drop-down menu on the Files tab and select "Python 3" under the Notebooks heading.

![](img/fig/jupyter2.png)

This will open a blank Notebook with an IPython console (using Python 3) running underneath it.

![](img/fig/jupyter3.png)

The IPython console is used to input and execute Python code interactively. Outputs, errors
and warning messages are directly shown in the same window. A command which has been
input into the console is executed by simply pressing `Shift+Enter`.

## The IPython Kernel

The IPython kernel is the processing back-end that Jupyter notebooks use to execute Python code. The IPython kernel is a Python interpreter designed for interactive use, and adds many helpful features such as:

* Coloured input and output lines, as well as error messages for added clarity.
* Aliases to useful operating system calls for directory navigation e.g. pwd, cd, ls and mkdir.
* 'Magic' functions provide many other helpful shortcuts. Type `%quickref` for a full list. A few are covered later in this course.
* Tab completion of variable names, imports, class attributes and functions as well as file names in the current working directory.
* Object exploration through the use of `?` and `??` (see below).
* Input/Output history referencing and searching.

A detailed introduction and overview of these features can also be found by entering a single
question mark `?` within IPython. Key features will be introduced gradually throughout this
course with examples.

## Cell Based Execution

It is common when working interactively to only want to run a section of the code, rather than the entire script. To aid in this, Jupyter uses a concept of 'code cells', which can be executed individually in the IPython console. Each block of code in a notebook is a code cell, and can be executed in one of two ways:

* Press `Ctrl+Enter` to execute all the code in the current active cell.

* Press `Shift+Enter` to execute all the code in the current active cell and advance the cursor to the next cell.

Using `Shift+Enter` multiple times to incrementally step through the code and investigate the results in the IPython console is a common workflow when prototyping and doing exploratory work.

![](img/fig/jupyter4.png)

## Command and Edit Modes

Jupyter notebook is a modal editor which means that the keyboard does different things depending on which mode the notebook is in. There are two modes: **edit mode** and **command mode.** To go from command mode to edit mode, press `Enter`. To return to command mode from edit mode, press `Esc`.

Edit mode is indicated by a green cell border and left sidebar, and a prompt showing in the editor area:

![](img/fig/CellModeEdit.png)

When a cell is in edit mode, you can type into the cell, like a normal text editor. Enter edit mode by pressing `Enter` or using the mouse to click on a cell's editor area.

Command mode is indicated by a grey cell border and a blue sidebar:

![](img/fig/CellModeCommand.png)

When you are in command mode, you are able to edit the notebook as a whole, but not type into individual cells. Most importantly, in command mode, the keyboard is mapped to a set of shortcuts that let you perform notebook and cell actions efficiently. For example, if you are in command mode and you press `c`, you will copy the current cell - no modifier is needed.

**Warning: Cells in Command Mode** 

Don't try to type into a cell in command mode; unexpected things will happen!

A summary of useful shortcuts is included at the end of this section. A full list is always available by going to:

```
    Help --> Keyboard Shortcuts
```

**Tip: Keyboard Shortcuts** 

You can also access the Keyboard Shortcuts list by pressing `h` in command mode.

## Markdown Text Cells

The default cells are IPython console cells, though a cell can be changed to contain Markdown text. To change the type of cell from code to Markdown, go to the top menu bar:

```
    Cell > Cell Type > Markdown
```

Or press `m` while in Command Mode with the cell selected.

Markdown text cells support plain text, Markdown and HTML. For information about Markdown and HTML:

* **Markdown** - [https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet)
* **HTML** - [https://web.stanford.edu/group/csp/cs21/htmlcheatsheet.pdf](https://web.stanford.edu/group/csp/cs21/htmlcheatsheet.pdf)

## Useful Keyboard Shortcuts

All of the cell manipulation commands also have keyboard shortcuts. A list of the commonly used ones is below; to view a full list of shortcuts, either go to **Help > Keyboard Shortcuts**, or in Command Mode, press **H**.

Command Mode (press Esc to enable)

* **Enter** enter edit mode
* **H** show keyboard shortcuts
* **Shift + Enter** run cell, select below
* **Ctrl + Enter** run cell
* **Alt + Enter** run cell, insert below
* **Y** to code
* **M** to markdown
* **A/B** insert cell above or below
* **X** cut selected cell
* **C** copy selected cell
* **V** paste cell
* **D, D** (i.e. **D** then **D** again) delete cell

Edit Mode (press Enter to enable)

* **Tab** code completion or indent
* **Ctrl + Z** undo

# Modules, Packages, and Libraries

## Modules

A module in Python is simply a `.py` file that contains additional code to define functions, classes or variables. Modules provide a way to logically group related code, making it easier to understand and use.

In Python terms, a module is just another Python object with attributes that you can bind and reference, though in this case the attributes are the various functions, classes or variables defined in the `.py` file. To use them within our namespace we use the `import` statement in one of three ways:

    import module_name
    import module_name as alias
    from module_name import object1, object2, object3

The method used to import will affect how the objects imported are used. Some examples are
shown below.

In [1]:

    # method 1
    import math
    math.sin(math.pi/3)

    # method 2
    import math as m
    m.sin(m.pi/3)

    # method 3
    from math import sin, pi
    sin(pi/3)

    0.8660254037844386

There is also a fourth way of importing things with a * instead of specifying each object individually.
This imports all the functions, classes and variables contained within the module.

In [2]:

    from math import *
    log(sin(pi/3))

    -0.14384103622589053

However, this method of importing objects is frowned upon as it is less clear what is imported and will also override variables already defined if their names clash.
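The name clash mentioned here can be seen in a short sketch. The variable `e` below is a hypothetical name chosen for illustration; it happens to collide with the constant `e` defined in the `math` module.

```python
# A variable we defined ourselves (hypothetical example name)
e = "my own variable"

# The star import rebinds every public name in math, including e,
# without any warning
from math import *

# e is now Euler's number, not the string assigned above
print(e)
```

Importing the module under a name (`import math`) and writing `math.e` avoids this, because our own `e` is left untouched.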

A module is loaded only once, regardless of the number of times it is imported. This prevents
the module execution from happening over and over again if multiple imports occur in a single
script.
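A small sketch of this behaviour, using only the standard library: importing `math` a second time under an alias does not create a second copy, it only binds another name to the module object that Python has already cached in `sys.modules`.

```python
import sys
import math
import math as m  # second import: no re-execution, just another name

# Both names refer to the same module object
print(m is math)  # True

# The cached module is stored in the sys.modules dictionary
print(sys.modules["math"] is math)  # True
```

When a module's source file has been edited during a session, `importlib.reload(module)` can be used to force it to be executed again.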

## Packages and Libraries

A package is a collection of modules, similar to packages in R. A collection of packages is usually referred to as a library. There is no formal distinction between a package and a library so sometimes package and library are used interchangeably.

Packages and libraries can be made available in the same manner as above, simply using the package or library name instead of the `module_name` in the `import` statements.

It is also possible to selectively import only certain modules in a package or library. To access a module (or sub-package), we must use the `.` operator. For example, to extract a module named `foo` from a package named `mypackage`, as the alias `mpf`, we use the following code:

```python
import mypackage.foo as mpf
```

Then to use any functions from this module, we simply call the module alias, the `.`, and then the function name, i.e., for function `func1` we have the following:

```python
x = mpf.func1()
```
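The same pattern can be tried with a real package from the standard library. This sketch imports the `path` module from the `os` package under the alias `osp`; the file path `data/results.csv` is a made-up example and does not need to exist.

```python
# Import the path module from the os package under a short alias
import os.path as osp

# Call functions from the module through the alias and the . operator
filename = osp.basename("data/results.csv")  # hypothetical path
print(filename)  # results.csv

# Split the file name into its stem and extension
stem, extension = osp.splitext(filename)
print(stem, extension)  # results .csv
```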

### The Python Standard Library

Python comes with many useful modules and packages as standard. Details on using the Python Standard Library can
be found in the documentation at [https://docs.python.org/3.6/library/](https://docs.python.org/3.6/library/), with additional examples available at this fantastic site [http://pymotw.com/2/contents.html](http://pymotw.com/2/contents.html).

**Exercise**

1. Open up a new Jupyter Notebook and print "Hello Python world!" on the console.
2. Add a markdown cell and include a note that this is the notebook for today's workshop.
3. Import the **numpy** library using the alias np.

Extension:

4. Draw a (pseudo) random number from a uniform distribution on the interval [2, 5].

In [2]:

    "Hello Python world!"

    import numpy as np

    import random
    random.uniform(2,5)

    2.8139219124386594
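The solution above draws the random number with the standard library's `random` module. As an alternative sketch, the draw can also be made with the NumPy library imported in step 3, assuming NumPy is installed (it is included with Anaconda). The value printed will differ on every run.

```python
import numpy as np

# Draw one pseudo-random number from a uniform distribution
# between 2 (inclusive) and 5 (exclusive)
x = np.random.uniform(low=2, high=5)
print(x)
```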

## Installing Additional Packages

If you have the Anaconda Python distribution, this comes with many third-party Python packages
already installed. However, you may come across additional packages that you wish to install.

Unlike R's `install.packages`, the Python language itself has no built-in function to manage packages, but there is a central repository for Python packages known as the Python Package Index or PyPI. Several command line utilities
have been developed to look up and install Python packages from PyPI.

The recommended tool for installing additional Python packages is pip. Python 3.4 and later come with pip by default.

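You can search for packages by browsing the PyPI website at [https://pypi.org](https://pypi.org). The `pip search <package name>` command is no longer supported by PyPI.

Packages are then installed from a command line console with:

```
pip install <package name>
```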

Other helpful pip commands are `pip list`, which lists all the packages installed by pip, and
`pip uninstall <package name>` to remove an installed package.
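Before installing a package, it can be useful to check from within Python whether it is already available. The following is a minimal sketch using `importlib.util.find_spec` from the standard library; the helper name `is_installed` and the package name `not_a_real_package_xyz` are hypothetical, chosen only for illustration.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package or module can be imported."""
    # find_spec returns None when no module of that name can be found
    return importlib.util.find_spec(package_name) is not None


# json is part of the standard library, so it is always available
print(is_installed("json"))

# A made-up name that should not exist on any system
print(is_installed("not_a_real_package_xyz"))
```

Note that the name used in `import` is not always the same as the name used with `pip install`, so this check should be given the import name.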

# The Help System

In Python, to find out information about a function we can use the online documentation for the package that it is in. However, we can also find out more about functions by using Jupyter's inbuilt help system.

The `?` can be used to display help on a function. Jupyter allows the `?` to be placed before or after the function in question. For example, to find out information about the `np.floor` function we can use the following:

```
import numpy as np
?np.floor
```

![](img/fig/HelpFile.png)
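The `?` syntax is a feature of IPython and Jupyter rather than of Python itself. Outside a notebook, similar information is available through Python's built-in `help` function and the `__doc__` attribute, as in this short sketch, which uses `math.floor` from the standard library in place of `np.floor`.

```python
import math

# Print the full help page for a function
help(math.floor)

# The documentation string alone is stored in the __doc__ attribute
doc = math.floor.__doc__
print(doc)
```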