### Preliminaries

1. Install Anaconda
  - https://www.continuum.io/downloads
2. Download data:
  - http://swcarpentry.github.io/python-novice-gapminder/files/python-novice-gapminder-data.zip
3. Start Jupyter
    - windows: search 'jupyter'
    - mac / linux: 
        - in terminal `cd ~/Downloads/`
        - run `jupyter notebook`

### Exercises
- The way you learn is by writing code


- Most of the exercises in this notebook are from a Software Carpentry course
  - http://swcarpentry.github.io/python-novice-gapminder/
  - https://github.com/katyhuff/2016-07-11-scipy/blob/gh-pages/python/00-python-intro-w-solutions.ipynb
  - More exercises there!


- Beginner exercises:
  - 46 Simple Python Exercises: http://www.ling.gu.se/~lager/python_exercises.html 
  - http://www.practicepython.org/


- Intermediate/Advanced exercises:
  - Python programming exercises: https://github.com/zhiwehu/Python-programming-exercises
  - Google Python Exercises (python 2): https://developers.google.com/edu/python/exercises/basic

### References

- Google Python Class (with Lecture Videos) - https://developers.google.com/edu/python/
- Software Carpentry - http://swcarpentry.github.io/
  - Current Python Lessons: http://swcarpentry.github.io/python-novice-inflammation/
  - New Python Lessons: http://swcarpentry.github.io/python-novice-gapminder/
- Python Crash Course for Scientists - http://nbviewer.jupyter.org/gist/rpmuller/5920182


- Awesome Python - https://github.com/vinta/awesome-python
- Anaconda Python Distribution - https://www.continuum.io/downloads
- Pandas Documentation - http://pandas.pydata.org/pandas-docs/stable/
- Python Documentation - https://docs.python.org/3/


- Pycon 2016 Videos - https://www.youtube.com/channel/UCwTD5zJbsQGJN75MwbykYNw/videos
- Scipy 2016 Videos - https://www.youtube.com/playlist?list=PLYx7XA2nY5Gf37zYZMw6OqGFRPjB1jCy6


- r/learnpython - https://www.reddit.com/r/learnpython/

### Why Python?

- Widely used --> Lots of community support
- Research shows it's easier to learn than other languages
- Language is far less important than learning programming concepts
- Easy to install everything you need with `anaconda`

### Python as a calculator

- addition
    - python prompt in terminal
    - python prompts in jupyter
        - Shift + Enter to run cell
        - `+` button to add more cells below
  
  
- expressions    
- assigning names to values (width, height, area) with `=`
- changing the value of a name --> variable
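
A minimal sketch of the points above. The names `width`, `height`, and `area` come from the outline; the numbers are made up for illustration.

```python
# Expressions are evaluated like on a calculator.
total = 2 + 3 * 4          # multiplication happens before addition

# Assign names to values with `=`.
width = 3
height = 4
area = width * height

# Changing the value of a name does not recompute values derived from it.
width = 10
print(total, area, width)  # area still holds the value computed from the old width
```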

Exercise:
- Which is a better variable name, m, min, or minutes?

```python
ts = m * 60 + s
tot_sec = min * 60 + sec
total_seconds = minutes * 60 + seconds
```

Exercise:
- To convert meters to feet you can use the following approximate formula:

$feet = meters \times 3.3$

- Fill in the blanks
- print the value of `feet`

In [None]:
feet = ____ * ____

### Text data / Strings

- store text data in a variable using `''`
- special name for text data: **string**
- can combine strings with `+`
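
As a small illustration of the bullets above (the names used here are made up), strings can be stored in variables and joined with `+`:

```python
# Strings can be written with single or double quotes.
first_name = 'Ada'
last_name = "Lovelace"

# `+` joins (concatenates) strings; spaces must be added explicitly.
full_name = first_name + ' ' + last_name
print(full_name)
```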

### Lists 1: Introduction

- we can store multiple values in a single variable using a list
- creating a list
  - syntax: `name = [...]`

In [None]:
days_of_the_week = ["Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday"]

Exercise:
- create a list called `people` and fill it in with the names of 4 people around you
- print the value of `people`

- we pull individual elements out of the list
  - syntax: `name[#]`
  - python uses 0-based indexing so the first element is element 0
  - grabbing elements backwards


```python
["Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday"]
     0        1        2          3          4          5          6
    -7       -6       -5         -4         -3         -2         -1

```
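
The index diagram above can be tried out directly. This sketch only uses the `days_of_the_week` list defined earlier:

```python
days_of_the_week = ["Sunday", "Monday", "Tuesday", "Wednesday",
                    "Thursday", "Friday", "Saturday"]

first_day = days_of_the_week[0]     # index 0 is the first element
last_day = days_of_the_week[-1]     # index -1 is the last element
also_last = days_of_the_week[6]     # the same element, counted from the front
print(first_day, last_day, also_last)
```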


Exercise:
- copy the code you wrote in the cell above defining the value of `people` into a new cell below this one
- print the names of the people starting from the end of the list to the beginning with positive values
- do the same thing using negative values

- we can extract a section of a list by *slicing* it
  - syntax: `name[start:stop]`

`days_of_the_week[2:5]`

```python
["Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday"]
     0        1        2          3          4          5          6
                       2 ........ 3 ........ 4 ........ X          
```
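
A short sketch of the slice pictured above. The `stop` index is not included in the result, which is what the `X` in the diagram marks:

```python
days_of_the_week = ["Sunday", "Monday", "Tuesday", "Wednesday",
                    "Thursday", "Friday", "Saturday"]

# Elements 2, 3 and 4 are returned; element 5 is where the slice stops.
midweek = days_of_the_week[2:5]
print(midweek)
```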

Exercise:
- What does days_of_the_week[2:] (without a value after the colon) do?
- What does days_of_the_week[:3] (without a value before the colon) do?
- What does days_of_the_week[:] (just a colon) do?

### Types

- every variable has a type
- python types
  - `int`: 1, 2, 3
  - `float`: 2.2, -24.0
  - `str`: 'hello', '', '234'
  - `list`: ['hello', '', '234']

    
- find out the type of a variable using the `type()` function
  
  
- a variable's type determines the actions the program can perform on it
  - an int **can** be divided by an int
  - a string **cannot** be divided by an int
  
  
- `int` and `float` can be mixed
- `str` and `int` can't be mixed
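
The rules above can be checked in a cell. This is only a sketch with made-up values; the `try`/`except` is there so the cell keeps running after the expected error:

```python
print(type(3))            # an int
print(type('hello'))      # a str
print(type([1, 2]))       # a list

mixed = 1 + 2.5           # int and float can be mixed
print(mixed)

try:
    'hello' / 2           # a string cannot be divided by an int
except TypeError as error:
    message = str(error)
    print('TypeError:', message)
```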

In [None]:
days_of_the_week = ["Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday"]

Exercise:
- What is the type of `3.4`?

Exercise:
- What is the type of `1 + 3.4`?

Exercise:
- What type of value (integer, floating point number, or character string) would you use to represent each of the following?
  - Number of days since the start of the year.
  - Time elapsed since the start of the year.
  - Serial number of a piece of lab equipment.
  - A lab specimen’s age.
  - Current population of a city.
  - Average population of a city over time.

### Boolean Type
- boolean variables can take on the value of either `True` or `False`
  - `bool`: `True`, `False`, *nothing else*
  

- used to compare items
  - `==` tells you if two values are equal
  - `x < y < z`


- used to check for membership in a collection
  - `in` tells you if an item is within a collection
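
A minimal sketch of comparisons and membership tests, using made-up values:

```python
x, y, z = 1, 5, 10

is_equal = (x == y)          # 1 and 5 are not equal
is_ordered = (x < y < z)     # shorthand for (x < y) and (y < z)

weekend = ['Saturday', 'Sunday']
is_weekend = 'Monday' in weekend   # membership test on a list

print(is_equal, is_ordered, is_weekend)
print(type(is_ordered))      # every comparison produces a bool
```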

### Memory

- variables are stored in memory
- memory is available in all cells
- errors if variable does not exist in memory
- `%whos`: what is in memory
- `%reset`: reset memory -- careful!

### Break 15 mins

### Functions 1: Introduction
- functions are like vending machines
  - you give:
    - money
    - button presses
  - you get:
      - a chocolate bar


- we already used the `type()` function
  - we gave:
      - a variable
  - we got:
      - the type of the variable


- we call the things we give *arguments* or *input*
- we call the things we receive *output*


- jupyter automatically displays the value of the last expression in a cell; `print()` displays values explicitly, anywhere in a cell
- can pass multiple arguments to `print()`


- another useful function `help()`
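
To make the vending-machine picture concrete, here is a small sketch with made-up values: each call passes arguments in and (usually) gets an output back.

```python
kind = type(42)                    # one argument in, one output back
size = len('hello')                # another function: the length of a string

# print() accepts several arguments and separates them with spaces.
print('type:', kind, 'length:', size)

# help() shows the documentation of a function; try it in a notebook cell:
# help(len)
```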

### Libraries
- code written by other people that you can use in your own programs
  - pandas for working with data
  - matplotlib for plotting
  

- list of libraries included with anaconda: https://docs.continuum.io/anaconda/pkg-docs

### Pandas 1: Introduction

- pandas is a library for working with data
- allows you to perform statistics and manipulate data easily


- pandas provides you with several useful data structures
  - we've seen one data structure before: `list`
  - pandas provides `DataFrame` -- analogous to an excel worksheet


- pandas has functions which allow us to read data into `DataFrame`s
  - can specify `index_col`

In [None]:
import pandas
data = pandas.read_csv('gapminder_gdp_oceania.csv')
data

`DataFrame`s are smart. 
  - They come bundled with a lot of useful functions
  - Since these functions are part of the dataframe itself, they are called *methods*.
  

  - `info()`
  - `columns` (an attribute, so no parentheses)
  - `describe()`
  - `head()`
  - `tail()`
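
The sketch below tries these out on a tiny table. The CSV text and its numbers are made up (they are not gapminder data) so that the cell runs without any file; `io.StringIO` just lets `read_csv` read from a string.

```python
import io
import pandas

# Made-up numbers laid out like the gapminder files (NOT real data).
csv_text = """country,gdpPercap_1952,gdpPercap_1957
Country A,100.0,150.0
Country B,200.0,260.0
Country C,300.0,330.0
"""

# index_col makes the country names the row labels.
toy = pandas.read_csv(io.StringIO(csv_text), index_col='country')

toy.info()               # column names, types and memory use
print(toy.columns)       # the column labels (attribute, no parentheses)
print(toy.head())        # the first rows
print(toy.tail())        # the last rows
print(toy.describe())    # summary statistics for each column
```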

In [None]:
# data.  # tab to see what methods (functions) are available

Exercise:
- Read the data in `gapminder_gdp_americas.csv` into a variable called americas and display its summary statistics.

Exercise:
- After reading the data for the Americas, use `help(americas.head)` and `help(americas.tail)` to find out what `DataFrame.head` and `DataFrame.tail` do.


- What method call will display the first three rows of this data?
- What method call will display the last three columns of this data? (Hint: you may need to change your view of the data.)

Exercise:
- As well as the `read_csv` function for reading data from a file, Pandas provides a `to_csv` function to write dataframes to files. Applying what you’ve learned about reading from files, write one of your dataframes to a file called `processed.csv`. You can use help to get information on how to use `to_csv`.

### Pandas 2: Indexing and Selecting Data

In [None]:
# data = pandas.read_csv('gapminder_gdp_europe.csv', index_col='country')
# print(data.loc["Belgium", "gdpPercap_1957"])  # select by label (.ix was removed from pandas)
# print(data.iloc[2, 1])                        # select by position

In [None]:
# Selecting by row
# serbia_gdp_row = data.loc["Serbia", :]
# serbia_gdp_row

In [None]:
# You can call methods on a row (or column) of data
# serbia_gdp_row.max()

In [None]:
# serbia_gdp_row.idxmax()

In [None]:
# Selecting by column
# gdp_1987 = data.loc[:, "gdpPercap_1987"]
# gdp_1987

In [None]:
# gdp_1987.sort_values()

In [None]:
# gdp_1987 > 15000

In [None]:
# gdp_1987[gdp_1987 > 15000].sort_values()

In [None]:
# Selecting by row and column slices
# data.loc['Italy':'Poland', 'gdpPercap_1962':'gdpPercap_1972']
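
Recent pandas versions no longer provide `.ix`; `.loc` selects by label and `.iloc` selects by position. The sketch below shows both on a tiny made-up table (the countries and numbers are invented for illustration, not gapminder data):

```python
import io
import pandas

# Made-up numbers laid out like the gapminder files (NOT real data).
csv_text = """country,gdpPercap_1952,gdpPercap_1957
Country A,100.0,150.0
Country B,200.0,260.0
Country C,300.0,330.0
"""
toy = pandas.read_csv(io.StringIO(csv_text), index_col='country')

by_label = toy.loc['Country B', 'gdpPercap_1957']   # row label, column label
by_position = toy.iloc[1, 1]                        # same cell by position
one_row = toy.loc['Country B', :]                   # a whole row
one_column = toy.loc[:, 'gdpPercap_1952']           # a whole column
large = one_column[one_column > 150]                # keep rows where the test is True

# Label slices include BOTH end points, unlike list slices.
block = toy.loc['Country A':'Country B', 'gdpPercap_1952':'gdpPercap_1957']
print(by_label, by_position)
print(block)
```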

Exercise:
- Explain what each line in the following short program does: what is in first, second, etc.?

```python
first = pandas.read_csv('data/gapminder_gdp_all.csv', index_col='country')
second = first[first['continent'] == 'Americas']
third = second.drop('Puerto Rico')
fourth = third.drop('continent', axis = 1)
fourth.to_csv('result.csv')
```

Exercise:
- Assume Pandas has been imported and the Gapminder GDP data for Europe has been loaded. Write an expression to select each of the following:
  - GDP per capita for all countries in 1982.
  - GDP per capita for Denmark for all years.
  - GDP per capita for all countries for years after 1985.
  - GDP per capita for each country in 2007 as a multiple of GDP per capita for that country in 1952.

### Plotting

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt

In [None]:
data = pandas.read_csv('gapminder_gdp_europe.csv', index_col='country')

In [None]:
data.loc['Belgium'].plot()
plt.xticks(rotation=90)

In [None]:
plt.style.use('ggplot')

In [None]:
data.loc['Belgium'].plot()
plt.xticks(rotation=90)

Rotate a DataFrame using `T`

In [None]:
data.T

In [None]:
data.T[['Belgium','Austria']]

In [None]:
data.T[['Belgium','Poland']].plot()
plt.xticks(rotation=90)
plt.ylabel('GDP per capita')

Exercise:
- Fill in the blanks below to plot the mean GDP per capita over time for all the countries in Europe. 

```python
data_europe = pandas.read_csv('gapminder_gdp_europe.csv', index_col='country')
data_europe.____.plot(label='mean')
plt.legend(loc='best')
```

- Copy the line of code from an earlier cell to rotate the xticks 90 degrees
- Rerun the plot

Exercise:
- This short program creates a plot showing the correlation between GDP and life expectancy for 2007, normalizing marker size by population:

```python
data_all = pandas.read_csv('gapminder_all.csv')
data_all.plot(kind='scatter', x='gdpPercap_2007', y='lifeExp_2007',
              s=data_all['pop_2007']/1e6)
```

- Run the code
- Using online help and other resources, explain what each argument to plot does.

### How I use Python
- Financial time series example

### Break 15 mins

### Functions 2: Built-in Functions
- `len`


- type conversion
  - `str`, `int`, `float`
  

- `min`, `max`, `round`


- `help`


- Jupyter autocomplete
  - names of variables / functions
  - function parameters
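
A quick tour of the built-in functions listed above, as a sketch with made-up values:

```python
name_length = len('helium')        # number of characters in a string

# Type conversion: turn text into numbers and numbers into text.
as_int = int('42') + 1
as_float = float('2.5')
as_text = str(7) + ' days'

smallest = min(3, 1, 2)
largest = max(3, 1, 2)
rounded = round(3.14159, 2)        # round to 2 decimal places

print(name_length, as_int, as_float, as_text, smallest, largest, rounded)
```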

Exercise:
- Which of the following will print 2.0? Note: there may be more than one right answer.

```python
first = 1.0
second = "1"
third = "1.1"
first + float(second)
float(second) + float(third)
first + int(third)
first + int(float(third))
int(first) + int(float(third))
2.0 * second
```

### Lists 2: More Lists!
- using len
- modifying items in lists


- lists are smart like `DataFrame`s: they have methods too
  - `append`
  - type `list.` and press Tab to see the other methods
  
  
- del
- out of bounds error


- empty list
- +

In [None]:
pressures = [0.273, 0.275, 0.277, 0.275, 0.276]
pressures
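
The operations listed above, sketched on the `pressures` list. The replacement values (`0.265`, `0.280`) and the `readings` list are made up for illustration:

```python
pressures = [0.273, 0.275, 0.277, 0.275, 0.276]

count = len(pressures)       # number of items in the list
pressures[0] = 0.265         # modify an item in place
pressures.append(0.280)      # add an item to the end
del pressures[1]             # remove the item at index 1

try:
    pressures[99]            # there is no element 99
except IndexError as error:
    print('IndexError:', error)

readings = []                        # an empty list
readings = readings + [1, 2] + [3]   # `+` joins lists into a new list
print(pressures, readings)
```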

Exercise:
- Slicing works on strings too
- What does the following program print?

```python
element = 'lithium'
print(element[0:20])
print(element[-1:3])
```

Exercise:
- If ‘low’ and ‘high’ are both non-negative integers, how long is the list values[low:high]?

Exercise:
- What do these two programs print? In simple terms, explain the difference between sorted(letters) and letters.sort().

```python
# Program A
letters = list('gold')
result = sorted(letters)
print('letters is', letters, 'and result is', result)
# Program B
letters = list('gold')
result = letters.sort()
print('letters is', letters, 'and result is', result)
```

### None

- `None` in python indicates that there is **no value**
- This is different from `False`
- Here the function `sort()` sorts the list in place and returns `None`
  - `sort()` did its work
  - if you don't specify the return value in python, python sets the return value as `None`
  - here sort had nothing to return, so the result is `None`
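
The same thing happens with any function that has no `return` statement. A minimal sketch, where `say_hello` is a made-up function used only for illustration:

```python
def say_hello():
    print('hello')           # does its work, but returns nothing explicitly

result = say_hello()

print(result is None)        # the call produced None
print(result == False)       # None is not the same as False
```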

### Errors
- Syntax errors (found before the code runs)
  - mismatched `''`
  - extra `=`
  - mismatched `()`
- Runtime errors (`NameError`, found while the code runs)
  - mistyped function name
  - missing name

### Loops
- repeating actions
- for loop syntax:
  - for keyword
  - loop variable -- can be **anything**
  - list
  - colon
  - indented code -- can be several lines
  

- errors
  - missing colon
  - indentation

In [None]:
ages = [18, 24, 33, 20]

In [None]:
# print(ages[0])
# print(ages[1])
# print(ages[2])
# print(ages[3])

In [None]:
# for age in [18, 24, 33, 20]:
#     print(age)

In [None]:
# for age in ages:
#     print(age)

In [None]:
# primes = [2, 3, 5]
# for p in primes:
#     squared = p ** 2
#     cubed = p ** 3
#     print(p, squared, cubed)

- range: for loop using `range(start, stop)`
- accumulator pattern: using an external variable `total` when summing squares of numbers
  - remember 0 indexing

In [None]:
total = 0
for n in range(0,5):
    square = n**2
    total = total + square
#     print(square)
total

Exercise: Classifying Errors
- Is an indentation error a syntax error or a runtime error?

Exercise: Practice Accumulating
- Fill in the blanks in each of the programs below to produce the indicated result.
```python
# Total length of the strings in the list: ["red", "green", "blue"] => 12
total = 0
for word in ["red", "green", "blue"]:
    ____ = ____ + len(word)
print(total)
```

```python
# List of word lengths: ["red", "green", "blue"] => [3, 5, 4]
lengths = ____
for word in ["red", "green", "blue"]:
    lengths.____(____)
print(lengths)
```

```python
# Concatenate all words: ["red", "green", "blue"] => "redgreenblue"
words = ["red", "green", "blue"]
result = ____
for ____ in ____:
    ____
print(result)
```

In [None]:
# Create acronym: ["red", "green", "blue"] => "RGB"
# hint: use str.upper()

Exercise:
- Reorder and properly indent the lines of code below so that they print an array with the cumulative sum of data. The result should be [1, 3, 5, 10].

```python
cumulative += [sum]
for number in data:
cumulative = []
sum += number
sum = 0
print(cumulative)
data = [1,2,2,5]
```

Exercise: Identifying Item Errors
- Read the code below and try to identify what the errors are without running it.
- Run the code, and read the error message. What type of error is it?
- Fix the error.

```python
seasons = ['Spring', 'Summer', 'Fall', 'Winter']
print('My favorite season is ', seasons[4])
```

### Functions 3: Writing Functions

- assign several lines of code to a name
- split up large programs so they are more readable
- write once, use many times


- syntax:
  - `def`
  - name
  - `()` parameters: 0 or more
  - `:`
  - indented code
  - optionally `return` some value(s)


- examples:
  - print a number
  - print sum of two numbers
  - return sum of two numbers
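
The three examples could look like the sketch below. The function names are made up; the point is the difference between printing a value and returning it.

```python
def print_number(number):
    print('The number is', number)

def print_sum(a, b):
    print(a + b)             # shows the sum, but gives nothing back

def add(a, b):
    return a + b             # hands the sum back to the caller

print_number(3)
total = add(2, 5)            # the returned value can be stored and reused
nothing = print_sum(2, 5)    # prints the sum; the call itself evaluates to None
print(total, nothing)
```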

Exercise:
- write a function called `maximum_plus_one` which takes two numbers as arguments and returns the larger number plus 1

Exercise: Creating your own library
- create a new file in the same folder as this notebook called `mylibrary.py`
- open the file by clicking on it in jupyter
- copy the following function to the file and save it

```python
def greeting():
    print('Well hellllllo!')
```

- import your library into jupyter using the standard import syntax

```python
import mylibrary
```

- call your greeting function in a cell below your import

### Conditionals

`if` statements
- control program execution

In [None]:
mass = 3.54
if mass > 3.0:
    print(mass, 'is large')

In [None]:
masses = [3.54, 2.07, 9.22, 1.86, 1.71]
for m in masses:
    if m > 3.0:
        print(m, 'is large')

In [None]:
masses = [3.54, 2.07, 9.22, 1.86, 1.71]
for m in masses:
    if m > 3.0:
        print(m, 'is large')
    else:
        print(m, 'is small')

In [None]:
masses = [3.54, 2.07, 9.22, 1.86, 1.71]
for m in masses:
    if m > 9.0:
        print(m, 'is HUGE')
    elif m > 3.0:
        print(m, 'is large')
    else:
        print(m, 'is small')

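Conditions are tested in order, and only the first branch whose condition is true runs. In the cell below a grade of 85 prints `grade is C` because `grade >= 70` is checked first; testing the highest threshold first would give the intended result.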

In [None]:
grade = 85
if grade >= 70:
    print('grade is C')
elif grade >= 80:
    print('grade is B')
elif grade >= 90:
    print('grade is A')

Variables can change during the loop depending on circumstances

In [None]:
velocity = 10.0
for i in range(5): # execute the loop 5 times
    print(i, ':', velocity)
    if velocity > 20.0:
        print('moving too fast')
        velocity = velocity - 5.0
    else:
        print('moving too slow')
        velocity = velocity + 10.0
print('final velocity:', velocity)

Compound conditions

In [None]:
mass     = [ 3.54,  2.07,  9.22,  1.86,  1.71]
velocity = [10.00, 20.00, 30.00, 25.00, 20.00]

i = 0
for i in range(0, 5):
    print(mass[i])
    if mass[i] > 5 and velocity[i] > 20:
        print("Fast heavy object.  Duck!")
    elif mass[i] > 2 and mass[i] <= 5 and velocity[i] <= 20:
        print("Normal traffic")
    elif mass[i] <= 2 and velocity[i] <= 20:
        print("Slow light object.  Ignore it")
    else:
        print("Whoa!  Something is up with the data.  Check it")

Exercise:
- For each number of cookies `n` in the list `cookies`
  - print `No cookies left` if `n` is zero
  - print `# cookies left` if `n` is less than or equal to 20
  - print `So many cookies!` if `n` is greater than 20

In [None]:
cookies = [11,8,32,0,20,21,199]

### Mutability / Changeability

- mutable (*changeable*) types
  - `list`
  - `dict` (dictionary)
  
- immutable (*unchangeable*) types
  - `int`
  - `str` (string)
  - `float`
  - `tuple`
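
A minimal sketch of the difference, using made-up values: a list item can be replaced in place, while trying the same on a string raises a `TypeError`.

```python
letters = ['a', 'b', 'c']
letters[0] = 'z'                 # lists are mutable: the item is replaced in place

word = 'abc'
try:
    word[0] = 'z'                # strings are immutable: this is not allowed
except TypeError as error:
    print('TypeError:', error)

word = 'z' + word[1:]            # build a new string and reuse the name instead
print(letters, word)
```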

Exercise:
- Fill in the comment at the end of each line showing the value of the variable after this line is executed
- In simple terms, what do the last three lines of this program do?

In [None]:
lowest = 1.0       # lowest:
highest = 3.0      # highest:
temp = lowest      # temp:
lowest = highest   # lowest:
highest = temp     # highest:

Exercise:
- Fill in the comment at the end of each line showing the value of the variable after this line is executed

In [None]:
initial = "left"   # position:
position = initial # position:
initial = "right"  # position:

Exercise:
- What do these two programs print? In simple terms, explain the difference between `new = old` and `new = old[:]`.

```python
# Program A
old = list('gold')
new = old      # simple assignment
new[0] = 'D'
print('new is', new, 'and old is', old)
# Program B
old = list('gold')
new = old[:]   # assigning a slice
new[0] = 'D'
print('new is', new, 'and old is', old)
```

### Dictionaries
- data structure
- lookup table
- ask for an item by name, retrieve its stored value
  - the lookup names are called *keys*
  - the value associated with the key is called its *value*
- dictionary syntax:
  - use curly braces `{}`
  - keys are separated from values by a colon `:`
  - each key, value pair is separated by a comma `,`


- get a value
- add a key
- more info: https://docs.python.org/3/tutorial/datastructures.html

In [None]:
my_suitcase = {
    'shirts' : 7,
    'socks' : 20,
    'shoes' : 2,
    'sandals' : 1
}
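
Getting a value and adding a key, sketched on the suitcase dictionary. The `'hats'` entry and the new shirt count are made up for illustration:

```python
my_suitcase = {
    'shirts': 7,
    'socks': 20,
    'shoes': 2,
    'sandals': 1,
}

sock_count = my_suitcase['socks']       # get a value by its key
my_suitcase['hats'] = 3                 # assigning to a new key adds it
my_suitcase['shirts'] = 5               # assigning to an existing key replaces its value
has_sandals = 'sandals' in my_suitcase  # `in` checks the keys

print(sock_count, has_sandals, my_suitcase)
```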

### Requesting data from online sources
- `requests` library
- list of public APIs: https://github.com/toddmotto/public-apis


- IP geolocation
  - https://ipinfo.io/developers/specific-fields
  - api endpoint: http://ipinfo.io/json
  
  
- Foreign Exchange Rates API
  - docs: http://fixer.io/
  - api endpoint: 

In [None]:
import requests
# r = requests.get('http://ipinfo.io/json')
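
A hedged sketch of how a request is typically turned into Python data. `fetch_json` is a made-up helper name, and the keys in the result depend on the service, so none are assumed here. The actual call is commented out because it needs network access.

```python
import requests

def fetch_json(url):
    """Download `url` and return its JSON body as Python data (usually a dict)."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()      # raise an error for 4xx/5xx responses
    return response.json()           # parse the JSON text into dicts and lists

# Needs network access, so it is left commented out:
# info = fetch_json('http://ipinfo.io/json')
# print(info.keys())                 # look at the available keys first
```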

Exercise:
- Find the api endpoint for the latest foreign exchange rates on http://fixer.io
- use the requests library to get data from the endpoint
- use dictionary lookups to determine the latest rate for the Swedish Krona (SEK)

### List Comprehensions
- compact syntax for making lists


- start with square brackets
- add for loop syntax
- to the left of this add body


- calculate squares
- only consider squares greater than 20

In [None]:
numbers = [1,2,3,4,5,6]
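
The two bullet examples above could be written as follows; this is one possible sketch, and the variable names are made up:

```python
numbers = [1, 2, 3, 4, 5, 6]

# [body  for-loop]: the body on the left is evaluated once per item.
squares = [n ** 2 for n in numbers]

# An optional `if` at the end keeps only the items that pass the test.
large_squares = [n ** 2 for n in numbers if n ** 2 > 20]

print(squares)
print(large_squares)
```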

### Programming Style
- choosing good variable names
- choosing good function names
- using language-specific conventions
  - PEP8
- commenting


- be careful when constructing complex list comprehensions

### Assertions

In [None]:
def print_name_age_times(name, age):
    assert isinstance(name, str)
    assert age > 0
    name = name + ' '
    print(name * age)

def can_ride_rollercoaster(height, min_height=1):
    assert isinstance(height, int)
    return height >= min_height
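
A short sketch of what happens when an assertion passes and when it fails. The function is repeated so the cell runs on its own, and the input values are made up:

```python
def can_ride_rollercoaster(height, min_height=1):
    assert isinstance(height, int)
    return height >= min_height

allowed = can_ride_rollercoaster(2)      # the assertion passes, the check runs

try:
    can_ride_rollercoaster('tall')       # not an int, so the assertion fails
    failed = False
except AssertionError:
    failed = True
    print('AssertionError: height must be an int')

print(allowed, failed)
```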

### Debugging

Allows you to examine program execution while it's running

```python
import pdb; pdb.set_trace()

# `Tracer` is deprecated in newer IPython versions; `set_trace` is its replacement
from IPython.core.debugger import set_trace
set_trace() # this one triggers the debugger
```

Example of `print_name_age_times` with conditions.

### Recursion
- a function that calls itself
- must specify the terminal case (or else function won't know when to stop)

In [None]:
def sum_iterative(numbers):
    total = 0
    for num in numbers:
        total = total + num
    return total

a = [1, 2, 3, 4]
sum_iterative(a)

```
sum_iterative([1, 2, 3, 4])
= 1
= 3
= 6
= 10
```

In [None]:
def sum_recursive(numbers):
    
    # If we haven't sliced off all the items in our list
    if len(numbers) > 0:
        
        # Slice off the first one
        first_number = numbers[0]
        
        # Prepare the remainder of the list for further slicing
        remaining_numbers = numbers[1:]
        
        # Return sum of the first number + the unfinished calculation
        return first_number + sum_recursive(remaining_numbers)

    # If we have sliced off all the items in our list
    else:
        return 0

a = [1, 2, 3, 4]
sum_recursive(a)

```
sum_recursive([1, 2, 3, 4])
= 1 + sum_recursive([2, 3, 4])
= 1 + 2 + sum_recursive([3, 4])
= 1 + 2 + 3 + sum_recursive([4])
= 1 + 2 + 3 + 4 + sum_recursive([])
= 1 + 2 + 3 + 4 + 0
= 1 + 2 + 3 + 4
= 1 + 2 + 7
= 1 + 9
= 10
```
