# Getting Started

This tutorial is based on [Jupyter Notebook](http://jupyter.org/). Jupyter Notebook is a web-based Python development environment that lets you combine documentation (Markdown), code, and its results in a single document. The idea is similar to [Mathematica](http://www.wolfram.com/mathematica/).

## Installation

[Anaconda](https://www.anaconda.com/download) is a free Python distribution that includes the most common Python packages for data analysis out-of-the-box. Prebuilt packages for the different platforms make it simple to use.

> TASK: Install Anaconda on your machine.
>
> As an alternative, you can use the deployed version directly: [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/JKU-ICG/python-visualization-tutorial/master)

We use different frameworks/libraries in this tutorial:
 * [NumPy](http://www.numpy.org/) and [Pandas](http://pandas.pydata.org/) for data manipulation
 * [Matplotlib](http://matplotlib.org/) and [Seaborn](http://stanford.edu/~mwaskom/software/seaborn/) for visualization
 * You may also use [Plotly](https://plot.ly/python/) or [Altair](https://altair-viz.github.io/)
 * [Scikit-learn](http://scikit-learn.org) for simple machine learning

Everything except `Plotly` and `Altair` is included in Anaconda by default.
This repository contains a `requirements.txt` listing all dependencies. You can install a missing package manually with
```
conda install plotly
```
or all from the file with:
```
conda install --yes --file requirements.txt
```

Some packages require [another channel](https://conda.io/docs/user-guide/tasks/manage-channels.html) such as [conda-forge](https://conda-forge.org/#about):
```
conda install -c conda-forge altair vega_datasets
```
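
After installation, it can be handy to check which of the packages listed above are visible to your Python environment. The following is a minimal sketch, not part of the original tutorial: the helper name `find_missing` is hypothetical, and the list contains the import names of the packages (Scikit-learn is imported as `sklearn`).

```python
import importlib.util

# Import names of the packages used in this tutorial
# (assumption: Scikit-learn is imported under the name `sklearn`)
packages = ['numpy', 'pandas', 'matplotlib', 'seaborn', 'sklearn', 'plotly', 'altair']

def find_missing(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = find_missing(packages)
print('missing packages:', missing if missing else 'none')
```

Any name reported as missing can then be installed with one of the `conda install` commands shown above.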

## Deployment

Deploying Jupyter Notebooks is easy. [mybinder.org](http://mybinder.org) is a free service that turns a GitHub repository into a collection of interactive notebooks that are accessible online.

**Hint:** If you have an `index.ipynb` notebook inside a directory, this will be used by default.

## Usage

Launch Jupyter Notebook by opening a command line, navigating to the desired directory, and executing:

```bash
jupyter notebook
```

This opens your web browser with the Jupyter development environment in the current working directory. We use this interactive tutorial as a starting point: clone it, navigate to it, and launch Jupyter Notebook.

> TASK: Clone this tutorial repository and launch Jupyter Notebook inside it.

```bash
git clone https://github.com/JKU-ICG/python-visualization-tutorial.git
cd python-visualization-tutorial
jupyter notebook
```

### First Steps

A Jupyter Notebook consists of individual cells. There are two major cell types: Code and Markdown.

Useful keyboard shortcuts: 
* **Enter**: enter edit mode of the selected cell
* **Shift-Enter**: run cell, select below
* **Ctrl-Enter**: run cell
* **Alt-Enter**: run cell, insert a new cell below

Getting Help: 

* **Tab**: code completion or indent
* **Shift-Tab**: show the tooltip for a function, e.g., its argument list
* `function?`: query the Python docstring for the given function
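
The information behind these help features is also available from plain Python. The sketch below uses a hypothetical function `area` to show where the docstring and the argument list come from; it relies only on the standard `inspect` module.

```python
import inspect

def area(width, height):
    """Return the area of a rectangle."""
    return width * height

# The docstring is stored on the function object itself
print(inspect.getdoc(area))

# The argument list is available as the function's signature
print(inspect.signature(area))
```

Calling the built-in `help(area)` prints both pieces of information together and works outside of Jupyter as well.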


### Import Conventions
The Python community has adopted some naming conventions for common libraries:
```
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
```

In [1]:
#import a package that we use later on
import numpy as np

In [2]:
#test np.ar -> tab
a = np.array([1,2,3,4])
#test np.array -> shift-tab or np.array?
np.array?

### Interactive Python basics

Python is a dynamically typed language. The value of the last expression in a cell is displayed automatically. Individual values can also be printed using the `print(...)` function. Variables are declared simply by assigning to them. Functions are first-class objects, so Python can be used to program in a functional style. Some simple examples:


In [3]:
1+2

3

In [4]:
3+4
10/2

5.0

In [5]:
print(5+2)
3+2

7


5

In [6]:
a = 5+2
b = 9
a/b

0.7777777777777778

In [7]:
def sum(a,b): #indentation matters in Python! (note: this shadows the built-in sum)
    return a+b
sum(4,4)

8

In [8]:
def sub(arg1,arg2):
    return arg1-arg2
def calc(f, a, b):
    return f(a,b)
#functions are first-class objects, e.g., they can be passed as an argument to another function
print('sum ', calc(sum, a, b))
print('sub', calc(sub, a, b))

sum  16
sub -2
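
The same idea extends to anonymous functions and to the built-in higher-order functions. The following is a small, self-contained sketch of the functional style mentioned above; it redefines `calc` so that it runs on its own, and the sample values are chosen for illustration only.

```python
from functools import reduce

def calc(f, a, b):
    return f(a, b)

# an anonymous function (lambda) can be passed just like a named one
product = calc(lambda x, y: x * y, 7, 9)

numbers = [1, 2, 3, 4]
# map applies a function to every element
squares = list(map(lambda x: x * x, numbers))
# filter keeps the elements for which the function returns True
evens = list(filter(lambda x: x % 2 == 0, numbers))
# reduce folds the list into a single value
total = reduce(lambda acc, x: acc + x, numbers, 0)

print(product, squares, evens, total)  # 63 [1, 4, 9, 16] [2, 4] 10
```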


In [9]:
#list
arr = [1,2,3,4]
#maps aka dictionaries/dicts
dictionary = { 'a': 'Alpha', 'b': 'Beta'}

#list and dict transformation using comprehensions
arr2 = [x * 2 for x in arr]
dict2 = {k : v.upper() for k,v in dictionary.items()}

print(arr2)
print(dict2)

[2, 4, 6, 8]
{'a': 'ALPHA', 'b': 'BETA'}
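
Comprehensions can also filter elements and build dictionaries from other collections. The sketch below repeats the two sample collections so that it runs on its own; the variable names `even_doubled`, `inverted`, and `indexed` are chosen for illustration.

```python
arr = [1, 2, 3, 4]
dictionary = {'a': 'Alpha', 'b': 'Beta'}

# keep only the even values, then double them
even_doubled = [x * 2 for x in arr if x % 2 == 0]

# swap keys and values
inverted = {v: k for k, v in dictionary.items()}

# map each index to its value
indexed = {i: x for i, x in enumerate(arr)}

print(even_doubled)  # [4, 8]
print(inverted)      # {'Alpha': 'a', 'Beta': 'b'}
print(indexed)       # {0: 1, 1: 2, 2: 3, 3: 4}
```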


In [10]:
if a < 5:
    print('small')
else:
    print('large')

large


In [11]:
c = 'small' if a < 5 else 'large'
c

'large'

In [12]:
#what else: generators, iterators, classes, tuples, ...
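
As a brief taste of the concepts listed in the last cell, here is a minimal sketch of tuples, generators, iterators, and classes. The names `countdown` and `Vector` are hypothetical and serve only as illustration.

```python
# tuple: an immutable sequence that can be unpacked into variables
point = (3, 4)
x, y = point

# generator: a function that produces values lazily, one at a time
def countdown(n):
    while n > 0:
        yield n
        n -= 1

# iterator: next() fetches a single value, list() consumes the rest
it = iter(countdown(3))
first = next(it)
rest = list(it)

# class: bundles data and behaviour
class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def length(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5

v = Vector(x, y)
print(first, rest, v.length())  # 3 [2, 1] 5.0
```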

## Next

[Data Manipulation with Numpy and Pandas](02_DataManipulation.ipynb)