# Outline day 1 afternoon 
# _python crash course_

1. Data types
  * Variables
  * Numbers
  * Strings
  * Regular expressions
2. Exercises 03: data types
3. Data containers
  * Lists
  * Tuples
  * Sets
  * Dictionaries
4. Exercises 04: Data containers
5. Control structures
  * Logical operators
  * If statements
  * For loops
  * While loops
6. Exercises 05: control structures
7. Functions and libraries
  * custom functions
  * the python standard library
  * numpy
  * matplotlib

By the end of this lecture you will
* have a basic understanding of how programming works
* know how to read and write files and deal with file names
* be able to write your own functions and code snippets to deal with more complicated tasks
* know where to find third party libraries that solve your problems for you

# 1. Data types

## Variables

A variable holds a value.  
Variables are known throughout the whole document below the point where they were defined.

##### Example

In [162]:
a = 10
print(a)

10


You can change the value of a variable at any point. Use ```=``` to assign values to variables.

In [163]:
b = 100
print(b)
b = 50
print(b)
b = b/2
print(b)

100
50
25.0


##### Naming rules

- Variable names can only contain letters, numbers, and underscores. They can start with a letter or an underscore, but not with a number.
- Spaces are not allowed in variable names, so we use underscores instead. For example, use ```student_name``` instead of "student name".
- You cannot use [Python keywords](http://docs.python.org/3/reference/lexical_analysis.html#keywords) as variable names (more on that later).
- Variable names should be descriptive without being too long. For example, ```MPI_rooms``` is better than both ```rooms``` (too vague) and ```number_of_rooms_at_the_MPI``` (too long).
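
The naming rules above can also be checked by Python itself. The following is a minimal sketch that uses the string method ```isidentifier()``` and the ```keyword``` module from the standard library; the candidate names are made up for illustration only.

```python
import keyword

# Hypothetical candidate names, chosen only to illustrate the rules above
candidates = ['student_name', '_hidden', '2nd_place', 'student name', 'for']

results = {}
for name in candidates:
    # A usable name must be a valid identifier and must not be a keyword
    is_valid = name.isidentifier() and not keyword.iskeyword(name)
    results[name] = is_valid
    print('{!r}: {}'.format(name, is_valid))
```

The first two names pass the check; the other three fail because they start with a number, contain a space, or are a keyword.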

##### NameError 

An error you will almost certainly encounter at some point:

In [164]:
MPI_rooms = 83
print(MPI_roms)

NameError: name 'MPI_roms' is not defined

The anatomy of an error message:
* The type of the error (NameError)
* The file that was responsible for the error (ipython-input...)
* the line in which the error occurred
* the cause of the error

Make sure to spell variables right and always assign values to them before you use them!
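
If you want to look at the type and the cause of an error without stopping the program, you can catch it. This is only a sketch: it uses ```try``` and ```except```, which are not covered in this section, and the names ```error_type``` and ```error_cause``` are chosen here for illustration.

```python
MPI_rooms = 83

try:
    # Misspelled on purpose: 'MPI_roms' was never defined
    print(MPI_roms)
except NameError as error:
    # The type of the error and its cause, as in the anatomy above
    error_type = type(error).__name__
    error_cause = str(error)
    print('type: ', error_type)
    print('cause:', error_cause)
```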

## Strings

* Strings are sequences of characters (letters and symbols); they are mostly used to store text.  
* We declare strings by using either double or single quotes, then writing text and finally 'closing' the string again with the same kind of quote.
* Use a backslash ```\``` at the end of a line to continue a string on the next line of code. The backslash does not insert a line break into the string itself.
* Use triple quotes for multiline strings.

##### Example 

In [None]:
quote = "Linus Torvalds once said, \
    'Any program is only as good as it is useful.'"

In [None]:

multiline_string = '''This is a string where I 
can comfortably write on multiple lines
without worrying about using the escape character "\\" as in
the previous example. 
As you'll see, the original string formatting is preserved.
'''

print(multiline_string)

##### Combining strings (concatenation) 

The plus sign ```+``` combines two strings into one. We call this **concatenation**.

In [None]:
first_name = 'jana'
last_name = 'lasser'
full_name = first_name + ' ' + last_name
print(full_name)

There is no limit to the number of strings you can combine. You can also create strings from numbers to print nicer looking result messages to the output:

In [None]:
a = 10
b = 3
c = a*b
print('if you multiply ' + str(a) + ' by ' + str(b) + \
          ', the result is ' + str(c))

##### Some string formatting 

String concatenation is a bit clumsy sometimes. A very powerful and more elegant tool is string formatting.

In [None]:
string_template = 'The result of the calculation of {calc} is {res}'
print("String Template: ", string_template)

print(string_template.format(calc='(3*4)+2', res=(3*4)+2))

There's much more than that!
For further information about *String formatting*, see the official online documentation about the [`string`](https://docs.python.org/3/library/string.html) module.
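
As a small taste of what else is possible, the sketch below repeats the example with an f-string (available from Python 3.6 on) and adds a format specification that limits the number of decimals. The variable names are chosen here for illustration.

```python
calc = '(3*4)+2'
res = (3*4)+2

# Named placeholders with str.format(), as in the example above
with_format = 'The result of the calculation of {calc} is {res}'.format(calc=calc, res=res)

# The same sentence as an f-string (requires Python 3.6 or newer)
with_fstring = f'The result of the calculation of {calc} is {res}'

# A format specification after the colon controls the number of decimals
rounded = 'One third is roughly {:.3f}'.format(1/3)

print(with_format)
print(with_fstring)
print(rounded)
```

Both variants produce the same sentence, so which one you use is mostly a matter of taste and of the Python version you have to support.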

##### Whitespace

* "Whitespace" refers to characters that the computer is aware of but that are invisible to readers, such as spaces, tabs and newlines.
* Tabs and newlines are represented by special character combinations.
* The two-character combination "\t" makes a tab appear in a string whereas "\n" starts a new line.

In [None]:
#tabs
print('Hello world!')
print('\tHello world!')
print('Hello \tworld!')

In [None]:
#leading newline
print('\nHello world!')

In [None]:
#trailing newline
print('Hello world!\n')

In [None]:
#line break in the middle of a string
print('Hello \nworld!')

##### Stripping whitespace 

Whitespace can be annoying when working with strings. That's why there are methods to get rid of it:

In [None]:
name = ' jana '

print('-' + name.lstrip() + '-')
print('-' + name.rstrip() + '-')
print('-' + name.strip() + '-')

## Numbers

##### Integers

You can do all of the basic operations with integers, and everything should behave as you expect. Addition and subtraction use the standard plus and minus symbols. Multiplication uses the asterisk, and division uses a forward slash. Exponents use two asterisks.

In [None]:
print(10+3)

In [None]:
print(10-3)

In [None]:
print(10*3)

In [None]:
print(10/3)

In [None]:
print(10**3)

As with every calculator, you can use parentheses (only round ones!) to change the order of operations:

In [None]:
standard_order = 2+3*4
changed_order = (2+3)*4
print(standard_order)
print(changed_order)

##### Floating point numbers

As you might have noticed, the division operator returns a float even when both operands are integers. **This only happens in Python 3, not in Python 2!**  
You can reproduce integer division without a remainder by manually casting the result to an integer:

In [None]:
a = 10
b = 3
c = a/b
d = int(a/b)
print(c)
print(type(c))
print(d)
print(type(d))

**HINT:** you can ask what 'type' a variable is by using ```type(variable)```
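
Python also has dedicated operators for this: ```//``` (floor division) and ```%``` (modulo, the remainder). The sketch below shows them next to the ```int()``` cast from above; note that the two approaches differ for negative results.

```python
a = 10
b = 3

quotient = a // b    # floor division: an integer for integer inputs
remainder = a % b    # modulo: what is left over after floor division

print(quotient, type(quotient))
print(remainder)

# For negative results, // rounds down while int() cuts off the decimals
floored = -10 // 3
truncated = int(-10 / 3)
print(floored)
print(truncated)
```

Here ```floored``` is -4 while ```truncated``` is -3, so the two are only interchangeable for non-negative results.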

## Comments

Comments allow you to write in English, within your program. In Python, any line that starts with a pound (#) symbol is ignored by the Python interpreter.

In [None]:
# This line is a comment.
#this 
#is 
#not

print("This line is not a comment, it is code.")

##### What makes a good comment?
- It is short and to the point, but a complete thought. Most comments should be written in complete sentences.
- It explains your thinking, so that when you return to the code later you will understand how you were approaching the problem.
- It explains your thinking, so that others who work with your code will understand your overall approach to a problem.
- It explains particularly difficult sections of code in detail.

##### When should you write a comment?
- When you have to think about code before writing it.
- When you are likely to forget later exactly how you were approaching a problem.
- When there is more than one way to solve a problem.
- When others are unlikely to anticipate your way of thinking about a problem.

# 2. Exercises 03: data types

1. **Strings**
  1. Choose a famous scientist (from Göttingen, of course), store their first name, last name and a quote from them in variables.
  2. Use concatenation to make a sentence with the person and the quote, and print the sentence.
  3. Use ```strip()``` to remove leading and trailing whitespace from the quote and print it.
2. **Numbers**
  1. Write a program that prints out the results of a calculation for each basic operation.
  2. Format the output so it is meaningful.
  3. Investigate long decimals and numerical accuracy in Python. 
  4. Find an example where limited numerical accuracy can mess with your results.
  5. How can you avoid/fix this behaviour?
3. **Comments**
  1. Write a program that uses every new concept you have learned in this notebook so far.
  2. Write comments that explain each section of your program.
  3. For each new thing the program does, write a line of output that explains what happened.

# 3. Data containers

## Lists

* lists are collections of items, stored in a variable
* there are no restrictions on what can be stored in a list
* lists are declared using square brackets
* items are separated with commas

##### Example 

In [None]:
data_science = ['reproducible', 'transparent', 'open', 'scalable']
for trait in data_science:
    print("data science is {}".format(trait))

##### List index

You can access a specific item in a list using its **index** (i.e. position) in the list.  
**NOTE:** the list index starts with zero.

In [None]:
first_trait = data_science[0]
print(first_trait)

You can also use negative indices to traverse the list from end to start.

In [None]:
last_trait = data_science[-1]
print(last_trait)

##### Index error 

In [None]:
data_science[4]

If you try to access an index that lies outside the length of the list, you will get an **IndexError**.  
**HINT:** you can check the number of elements in a list using the ```len()``` function.

In [None]:
num_elements = len(data_science)
print('data science has {} traits'.format(num_elements))

## Accessing list elements

##### Looping over lists 

This is one of the most important concepts related to lists. You can have a list with a million items in it, and in three lines of code you can write a sentence for each of those million items.  

We use a loop to access all the elements in a list. A loop is a block of code that repeats itself until it runs out of items to work with, or until a certain condition is met. In this case, our loop will run once for every item in our list.

In [None]:
for trait in data_science:
    print(trait)

What happened?
* ```for``` is a keyword in python that is used to start a _loop_
* ```trait``` is the loop variable; in each run of the loop it holds the next item of the list
* after the loop has finished, ```trait``` still exists and holds the last item
* the loop runs four times, once for each of the four items in the list, and then finishes.  

The ```for``` keyword is also an example of a control structure (more on that later).  
**NOTE:** The colon at the end of the first line tells Python that the loop header ends there and that an indented block follows.  

##### Inside and outside loops

Python uses indentation to decide which parts of the code are inside and outside a loop. The common convention (PEP 8) is four spaces per indentation level; whatever you use, keep it consistent so your code stays clearly readable!

In [None]:
for trait in data_science:
    print('important trait:')
    print(trait + '\n')

print("It's time to change the example list...")

In [None]:
cool_tools = ['jupyter', 'git', 'xenodo', 'bokeh']

##### Enumerating 

While looping through a list, you might be interested in the index of the current item. There are two neat ways of achieving that:

In [None]:
#access the index of a certain item in the list
print(data_science.index('open'))

In [None]:
#access the index of every item in the list
for index, tool in enumerate(cool_tools):
    print('{}. cool tool is {}'.format(index, tool))

We can also modify the values the loop gives us before using them:

In [None]:
print("actually zero doesn't make much sense for an index...\n" )
for index, tool in enumerate(cool_tools):
    print('{}. cool tool is {}'.format(index + 1, tool))

##### Iteration keywords 

You can use two additional keywords _inside_ loops such as ```for``` and ```while``` loops:  
* **break** will immediately _end_ the loop and exit it
* **continue** will _skip_ to the next iteration step
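
A minimal sketch of both keywords, reusing the ```cool_tools``` list from above. It relies on an ```if``` statement, which is only introduced in the section on control structures; the names ```skipped_git``` and ```before_break``` are chosen here for illustration.

```python
cool_tools = ['jupyter', 'git', 'xenodo', 'bokeh']

# continue: skip one item but keep looping
skipped_git = []
for tool in cool_tools:
    if tool == 'git':
        continue
    skipped_git.append(tool)
print(skipped_git)

# break: stop the loop as soon as 'xenodo' shows up
before_break = []
for tool in cool_tools:
    if tool == 'xenodo':
        break
    before_break.append(tool)
print(before_break)
```

The first loop visits every item but leaves out 'git'; the second loop never reaches 'xenodo' or anything after it.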

##### Slicing lists 

A very powerful concept for accessing lists is slicing. It allows you to access any subset of the items in the list. For this we access the list using
* the start index
* the stop index
* the step size (optional, 1 by default)  

```[start : stop (: step)]```

In [191]:
numbers = [1, 2, 3, 4, 5, 6, 7]
#items in the middle of the list using only start and stop
print(numbers[2:5])

#all even numbers 
print(numbers[1:-1:2])

#first two items
print(numbers[0:2])

#last four items (notice: omitting 'stop' means 'until list end')
print(numbers[-4:])

[3, 4, 5]
[2, 4, 6]
[1, 2]
[4, 5, 6, 7]


##### Copying lists

Note that you can assign list slices to variables and you can make a copy of the whole list like this:

In [198]:
#assign a new variable to a slice
strings = ['here', 'there', 'and', 'all', 'over']
first_two = strings[0:2]
print(first_two)

#copy the list using the slicing syntax
copied_strings = strings[:]
print(copied_strings)

['here', 'there']
['here', 'there', 'and', 'all', 'over']
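
Why bother with a copy? Assigning a list to a second variable does not copy it; both names then refer to the same list. The sketch below contrasts the two cases (the name ```alias``` is chosen here for illustration).

```python
strings = ['here', 'there', 'and', 'all', 'over']

alias = strings               # a second name for the same list
copied_strings = strings[:]   # a new list with the same items

# change the original list
strings[0] = 'HERE'

print(alias[0])            # the alias sees the change
print(copied_strings[0])   # the copy does not
print(alias is strings, copied_strings is strings)
```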


## List operations

##### Modifying elements 
You can change the value of any list element by accessing it via its index:

In [None]:
print(cool_tools)
cool_tools[2] = 'scipy'
cool_tools[3] = 'numpy'
print(cool_tools)

##### Checking for existence 

If you don't know whether a certain item is in a list, you can use the **in** keyword to check:

In [None]:
print('numpy' in cool_tools)
print('xenodo' in cool_tools)

##### Adding items to a list 

If you want to add items to a list you can use
* **insert()** to add an item at a certain position
* **append()** to add an item to the end of the list
* **extend()** to append another list to your list

In [None]:
string_list = ['a', 'b', 'c']
number_list = [1, 2, 3]

In [None]:
#insert
string_list.insert(2,'z')
print(string_list)

In [None]:
#append
string_list.append('y')
print(string_list)

In [None]:
#extend
string_list.extend(number_list)
print(string_list)

##### Removing items from a list 

If you want to remove items from a list, you can use
* **remove()** to remove a certain item by value
* **del** to remove an item at a certain index
* **pop()** to remove and return the last item in the list

In [180]:
cool_tools = ['jupyter', 'git', 'xenodo', 'bokeh']
cool_tools.remove('jupyter')
print(cool_tools)

['git', 'xenodo', 'bokeh']


In [181]:
del cool_tools[1]
print(cool_tools)

['git', 'bokeh']


In [182]:
last_item = cool_tools.pop()
print(last_item)

bokeh


##### Empty lists 

Sometimes it can be useful to initialize an empty list that can be filled with items later on.

In [177]:
letters = ['a', 'b', 'c', 'd']
#here we initialize an empty list
alphabet = []

#we fill the new list in the loop
for letter in letters:
    alphabet.append(letter + letter)
    
print(alphabet)

['aa', 'bb', 'cc', 'dd']


##### Sorting and reversing lists 

We can sort lists depending on their content. Strings will be sorted alphabetically by default, numbers numerically. We have two options:  
* **sorted()** returns a new, sorted list and keeps the original order of the elements
* **sort()** sorts the list in place and thereby changes the order of the elements

In [173]:
#using sorted()

letters = ['b','a','z','c','y']
#Print the letters in alphabetical order but keep the original order
print('Letters in alphabetical order')
for letter in sorted(letters):
    print(letter)
    
print('\nLetters in reverse alphabetical order')    
#Print the letters in reverse alphabetical order but keep original order
for letter in sorted(letters, reverse=True):
    print(letter)
    
print('\nOriginal list order')    
#Show that the original order is preserved
for letter in letters:
    print(letter)

Letters in alphabetical order
a
b
c
y
z

Letters in reverse alphabetical order
z
y
c
b
a

Original list order
b
a
z
c
y


In [175]:
numbers = [10, 2, 5, 3]

#reverse() reverses the order of the list and is permanent
numbers.reverse()
print(numbers)

[3, 5, 2, 10]


In [201]:
#increasing order using sort()
numbers.sort()
print(numbers)

#decreasing order
numbers.sort(reverse=True)
print(numbers)

[2, 3, 5, 10]
[10, 5, 3, 2]


##### Functionality for numerical lists

For lists containing only numbers, we have some special helper functions:
* **range(start, stop, step)** helps us create large lists of numbers
* **min()** returns the smallest item in a list
* **max()** returns the largest item in a list
* **sum()** returns the sum of all items in the list

In [206]:
#the range() function to print the first ten odd numbers:
for number in range(1,21,2):
    print(number)

1
3
5
7
9
11
13
15
17
19


To turn this into a list, we can use the **list()** function

In [208]:
#create a list of the first 15 numbers. NOTE: starts with 0!
numbers = list(range(15))
print(numbers)

#print min, max and sum
print('the minimum is: {}\nthe maximum is: {}\nthe sum is: {}'\
     .format(min(numbers), max(numbers), sum(numbers)))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
the minimum is: 0
the maximum is: 14
the sum is: 105


## List comprehensions

Creating more complicated lists manually is tedious. If we want to create a list of the first ten square numbers with a loop, it takes us three lines of code:

In [211]:
#make an empty list to store the squares
squares = []

#loop through the numbers, square them and append them to the list
for number in range(10):
    squares.append(number**2)
    
#make sure the result is correct    
print(squares)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]


We can do that in a more _pythonic_ way, using a _list comprehension_:

In [213]:
squares = [number**2 for number in range(10)]
print(squares)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]


What just happened?
* the part after the **for** keyword initializes a loop that runs ten times and returns the next number every time
* the part before the **for** keyword performs an operation on the number
* the square brackets "channel" the results into a list which is then assigned to the variable ```squares```

##### A few more examples

In [219]:
#dividing every number by 2
halves = [number/2 for number in numbers]
print(halves)

#adding the sum of numbers to every item
sums = [number + sum(numbers) for number in numbers]
print(sums)

#works with strings too
names = ['deb', 'dimitra', 'jana']
narcissists = [name + ' the great' for name in names]
print(narcissists)

[0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0]
[105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119]
['deb the great', 'dimitra the great', 'jana the great']
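
A list comprehension can also filter items by adding a condition after the loop part. This is a short sketch; it uses an ```if``` condition and the modulo operator ```%```, and the names ```evens``` and ```odd_squares``` are chosen here for illustration.

```python
numbers = list(range(15))

# keep only the even numbers
evens = [number for number in numbers if number % 2 == 0]
print(evens)

# square only the odd numbers
odd_squares = [number**2 for number in numbers if number % 2 == 1]
print(odd_squares)
```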


## Tuples

* Basically, tuples are lists that can never be changed (i.e. they are _immutable_). 
* This can be handy in cases where you want to make sure that the values/positions in your container stay consistent.
* Tuples are denoted by round brackets ().

In [247]:
very_important_parameters = (1,2,3)
print(very_important_parameters[0])

1


## Sets

* Sets are **unordered** collections of **unique** objects
* Sets are very efficient for membership testing
* Sets can be used to perform mathematical operations like union and intersection
* Sets are denoted by curly brackets {}. An empty set has to be created with **set()**, because ```{}``` creates an empty dictionary
* Sets can be created from other containers using the **set()** function

In [246]:
#sets contain only unique elements
letters1 = ['a','b','c','d','b','a']
letters1 = set(letters1)
print(letters1)

letters1.add('e')
print(letters1)

letters1.add('a')
print(letters1)

{'a', 'c', 'd', 'b'}
{'a', 'c', 'd', 'b', 'e'}
{'a', 'c', 'd', 'b', 'e'}


In [244]:
#membership testing
'b' in letters1

True

In [242]:
#operations on sets:
letters2 = {'c','d','e','f','g','h'}

#intersection
print(letters1.intersection(letters2))

#union
print(letters1.union(letters2))

{'c', 'd', 'e'}
{'g', 'b', 'f', 'a', 'd', 'h', 'c', 'e'}
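
The same operations can also be written with operators. The sketch below assumes ```letters1``` holds the five letters from the example above and uses ```sorted()``` only to get a predictable order when printing.

```python
letters1 = {'a', 'b', 'c', 'd', 'e'}
letters2 = {'c', 'd', 'e', 'f', 'g', 'h'}

common = letters1 & letters2     # same as letters1.intersection(letters2)
combined = letters1 | letters2   # same as letters1.union(letters2)

# sorted() turns a set into a list with a predictable order
print(sorted(common))
print(sorted(combined))
```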


## Dictionaries

* Dictionaries are a way to store information that is connected in some way.
* Dictionaries store information in key-value pairs
* Since Python 3.7, dictionaries keep the order in which items were inserted; older versions do not guarantee any particular order
* Syntax: ```{key : value} ```
* Like with indices in lists, we can access values by using the key to index the dictionary

##### Example 

In [259]:
#a dictionary of the word and meaning of different data containers
#in python
concepts = {'list': 'A mutable, ordered collection of values',
            'tuple': 'An immutable, ordered collection of values',
            'set': 'An unordered collection of unique values',
            'dictionary': 'An unordered collection of {key:value} pairs'}

#access the value by using dictionary[key]
for key in concepts.keys():
    meaning = concepts[key]
    print('Word: {}'.format(key))
    print('Meaning: {}\n'.format(meaning))

Word: set
Meaning: An unordered collection of unique values

Word: tuple
Meaning: An immutable, ordered collection of values

Word: dictionary
Meaning: An unordered collection of {key:value} pairs

Word: list
Meaning: A mutable, ordered collection of values



##### Adding and removing items

In [260]:
#add a new entry
concepts['function'] = 'A named set of instructions'

#remove an existing entry
del concepts['tuple']

#print the dictionary
for key in concepts.keys():
    print('{} : {}'.format(key, concepts[key]))

set : An unordered collection of unique values
function : A named set of instructions
dictionary : An unordered collection of {key:value} pairs
list : A mutable, ordered collection of values
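
Two more dictionary methods are worth knowing: ```items()``` yields key and value together, and ```get()``` returns a default value instead of raising a ```KeyError``` when a key is missing. The sketch below uses a shortened version of the ```concepts``` dictionary; the default text is made up for illustration.

```python
concepts = {'list': 'A mutable, ordered collection of values',
            'set': 'An unordered collection of unique values'}

# items() yields key and value together
for word, meaning in concepts.items():
    print('{} : {}'.format(word, meaning))

# concepts['tuple'] would raise a KeyError, get() returns a default instead
missing = concepts.get('tuple', 'no entry found')
print(missing)

# the in keyword checks the keys of a dictionary
has_tuple = 'tuple' in concepts
print(has_tuple)
```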


# 4. Exercises 04: data containers

1. **Lists and loops**
  1. Create a list containing strings that form a sentence
  2. Concatenate all the list items into a single string by looping over the list
  3. Create a list containing integers
  4. Perform a mathematical operation on every item in the list by looping over the list
  5. Print the result of the calculation for every iteration of the loop
  6. Make a list with all integers from zero to ten.
  7. Print all even and then all odd numbers in the list. Print only the first 4 and then the last 4 elements in the list.
  8. Set every second element in the list to zero using a slice.
2. **List operations**
  1. Create a list containing strings using some combination of ```append()```, ```insert()``` and ```extend()```
  2. Create a second empty list. Loop over the first list and multiply each item with its index. Store the results in the new list.
  3. Print the length of the new list and the total number of characters in all strings of the list.  
  HINT: strings themselves can be perceived as lists and have a ```len()```.
  4. Create a list of the first 50 even numbers using ```range()```. Calculate the sum of the numbers once using a for loop and once using the ```sum()``` function.
3. **List comprehensions**
  1. Create a list of the first ten cubes using a list comprehension
  2. Make a list with the first 100 elements of the Fibonacci sequence. What are the problems when trying to use a list comprehension for this?
4. **Tuples and sets**
  1. Try to change the value of an item in a tuple.
  2. Make two sets of ten letters each: five letters should be shared by both sets and five should be unique to each set.
  3. Use set operations to show that the symmetric difference of two sets is equivalent to the union of both relative complements of the sets.
5. **Dictionaries**

# 5. Control structures

# 6. Exercises 05: control structures

# 7. Functions and libraries

# 8. Exercises 06: functions and libraries

TODO:
* regular expressions?
* data containers body
* data containers exercises
* control structures body
* control structures exercises
* functions body
* functions exercises
* link outline to headings
* proofread