# The Python Way
This notebook will give you an idea of what Python is. Any cell in the notebook can be executed by pressing `Shift + Enter`.

## What is Python?
From the excellent [Python tutorial](https://docs.python.org/3/tutorial/): 

"Python is an easy to learn, powerful programming language. It has efficient _high-level data structures_ and a simple but effective approach to object-oriented programming. Python’s _elegant syntax_ and _dynamic typing_, together with its _interpreted nature_, make it an ideal language for scripting and rapid application development in many areas on most platforms."

I would add that Python is indeed easy to learn, but hard to master!

This notebook explains the _italicized concepts_ in the above description.

### 1 Interpreted nature
Python is interpreted. What does that mean? The Python interpreter is a program that reads your code and performs the operations you have programmed. Reading and executing go hand in hand: a statement is executed as soon as the interpreter reaches it, and there is no separate compilation step for you to run. Because types are only checked while the code runs, the interpreter works with whatever values it encounters instead of asking you to declare them up front. Python can also reflect on the operations it is performing.



In [None]:
1 + 1

After pressing `Shift + Enter` in the previous notebook cell, the Python interpreter is passed the string `1 + 1`. It tries to understand what it means, computes the result and returns it when ready. In our case, the Jupyter notebook communicates with the Python kernel, which contains a Python interpreter.
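
As a rough sketch of that read-and-evaluate step, the built-in `eval` function takes a string containing a Python expression and returns its value. This is only an illustration; the notebook kernel does considerably more than call `eval`, and the variable names are made up for this example.

```python
# eval() hands a string containing a Python expression to the interpreter
source = '1 + 1'
result = eval(source)
print(source, '=', result)
```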

### 2 Dynamic typing
A data type can be something such as an integer, float, string or list. In many programming languages types are in your face when writing code; with Python this is not the case.<br>
Yes, Python has types, but the general idea is not to think about them too much. Why so? Python's style of typing is known as 'duck typing'. It goes like this:

_If it walks like a duck._<br>
_If it quacks like a duck._<br>
_It is most likely a duck._<br>

In other words: your Python code should focus on the properties and capabilities of the values you are manipulating and not on their specific types.

This is all nice and philosophical, but can you be more concrete?<br>

Python checks data types on the fly, so errors about mismatching [types](https://docs.python.org/3/library/stdtypes.html) only show up while a program is running.

Python has a really flexible type system. This means it will not often complain about wrong data types, but it might do something other than you expect.

Some examples:

In [None]:
num_repeats = 10  # integer
pattern = 'snake'  # string
print(pattern * num_repeats)  # string * integer!?? It repeats the string

In [None]:
bad_monty_python_movies = None  # did you know Python comes from Monty Python?
bad_monty_python_movies + 1  # raises TypeError: None and an integer cannot be added
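
To make the duck-typing idea a little more tangible, here is a minimal sketch. The helper `describe` is hypothetical and not part of the notebook; it only assumes that its argument supports `len()`, so it accepts lists, strings and dictionaries alike, and the type error for an integer only appears when the call is executed.

```python
def describe(container):
    # Works for any value that supports len(); the exact type is irrelevant
    return 'container with {n} elements'.format(n=len(container))

print(describe([1, 2, 3]))   # a list
print(describe('snake'))     # a string
print(describe({'a': 1}))    # a dictionary

try:
    describe(42)  # an integer has no length
except TypeError as error:
    message = str(error)
    print('TypeError:', message)
```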

### 3 Elegant syntax
Python code can be beautiful in an elegant way. The following code blocks have the same functionality, which one do you like best?

example 1

In [None]:
patients = ['A', 'B']
for patient in patients:
    print('subject {patient}'.format(patient=patient))

In [None]:
patients = ['A', 'B']
for i in range(len(patients)):
    print('subject ' + patients[i])

example 2

In [None]:
number_of_missing_values = 0  # this is assumed to be a positive integer
if number_of_missing_values > 0:
    print('there are missing values')

In [None]:
number_of_missing_values = 0
if number_of_missing_values:
    print('there are missing values')
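
The second version relies on what is often called truthiness: in an `if` statement, zero, `None` and empty containers count as false, and almost everything else counts as true. The sketch below lists a few such values (the variable names are illustrative). Note that a negative number is also truthy, so the two versions above only agree because the count is assumed to be non-negative.

```python
# Values that count as False in an `if` statement
falsy_values = [0, 0.0, '', [], {}, (), None]
# Values that count as True
truthy_values = [1, -1, 'text', [0], {'key': None}]

print([bool(value) for value in falsy_values])
print([bool(value) for value in truthy_values])
```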

example 3

In [None]:
number_of_examples = 2
number_of_examples = number_of_examples + 1
print('The number of examples is ' + str(number_of_examples) + '.')

In [None]:
number_of_examples = 2
number_of_examples += 1
print('The number of examples is {number_of_examples}.'.format(number_of_examples=number_of_examples))

### 4 High-level data structures
'High-level' is an often misunderstood word. In the context of Python it means: data structures that are easy to work with and come with advanced features.<br>
The set of available data structures is limited, but powerful enough to match most needs.<br>
Python relies heavily on its **Dictionary** data type.

### 4.1 Dictionary
Think about a language dictionary. How does it work? There is a keyword like _'hash'_ and a list of meanings, in this case one is _'a mess, jumble, or muddle'_. A Python dictionary is a container of keys and values, and it stores these (key, value) pairs in an _efficient_ manner. The time it takes to get a value out of a dictionary with a key is approximately constant for all keys.

In [None]:
# A dictionary literal is recognisable by the two mustaches
{}
# it can also be defined by the function
dict()

# to put a key, value pair in a dictionary you can write it as a literal
{'key': 'value'}

# you can create a dictionary from such a literal
medicine = {'headache': 'ibuprofen'}

# you can extend an existing dictionary by adding a value, while registering its key
medicine['fever'] = 'aspirin'

Let's write a basic program with dictionaries

In [None]:
sickness = {'A': 'fever', 'B': 'headache'}
for patient in sickness:
    print('patient {patient} needs {medicine}.'.format(patient=patient,
                                                       medicine=medicine[sickness[patient]]))
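
The program above assumes that every sickness has a registered medicine. As a small sketch of what happens otherwise, the example below uses a stand-in dictionary called `remedies` (a hypothetical name) and shows three ways to deal with a missing key: catching the `KeyError`, testing with `in`, and asking for a default with `get`.

```python
remedies = {'headache': 'ibuprofen', 'fever': 'aspirin'}  # stand-in for `medicine`

# Indexing with a missing key raises KeyError
try:
    remedies['cough']
except KeyError:
    print('no remedy registered for cough')

# `in` tests for a key, `get` returns a default instead of raising
has_cough = 'cough' in remedies
fallback = remedies.get('cough', 'rest')
print(has_cough, fallback)
```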

#### 4.1.1 Lazy dictionary keys and values
Python 3 relies more heavily on lazy iterators and views than Python 2 did. One clear difference is the behavior of the `keys` and `values` methods of a dictionary. In Python 2 these returned plain lists; in Python 3 they return view objects.

A view does not copy anything and always reflects the current content of the dictionary. That is a good fit, because a dictionary is really dynamic and the keys it contains can change at any moment.

In [None]:
print('lazy collection of keys:', medicine.keys())

# The result of the keys method is of type 'dict_keys', how fitting.
print(type(medicine.keys()))

# to create a list of keys we have to force it to a list
print('list of keys:', list(medicine.keys()))

print('lazy collection of values:', medicine.values())
print('list of values:', list(medicine.values()))
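
The difference between a view and a list becomes visible when the dictionary changes afterwards. In this sketch, `stock` is a made-up dictionary: the view follows the change, while the list is a snapshot of the moment it was created.

```python
stock = {'apples': 3}  # hypothetical example dictionary
keys_view = stock.keys()
keys_snapshot = list(stock.keys())

stock['pears'] = 5  # change the dictionary afterwards

print('view:', list(keys_view))
print('snapshot:', keys_snapshot)
```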

### 4.2 Nesting
Python data structures can be nested. Here we will show an example for the dictionary.

In [None]:
inception_review = {'summary': 'It is ok. It got hyped a lot.'}
movie_reviews = {'inception': inception_review}
product_reviews = {'movies': movie_reviews}

print(product_reviews['movies']['inception']['summary'])


##### Warning: Keep your data structures as flat as is appropriate
Yes, we can nest these two dictionaries, but do we want to?
Yes, we can put a list of lists in a dictionary of dictionaries, but what does it mean?<br>
Always ask yourself questions like these when nesting data structures. If you are not critical enough, you might end up with a data structure that is hard to understand and slow to process.

Appropriate

In [None]:
personal_data = {'first_name': 'Alice', 'partner': 'Bob', 'credit_card_number': '5105105105105100'}
personal_data_partner = {'first_name': 'Bob', 'credit_card_number': None}

persons = {'Alice': personal_data, 'Bob': personal_data_partner}

Inappropriate

In [None]:
personal_data_partner = {'first_name': 'Bob', 'credit_card_number': None}
personal_data_alice = {'first_name': 'Alice', 'partner': personal_data_partner, 'credit_card_number': '5105105105105100'}
print(personal_data_alice)
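
One reason to prefer the flat layout is that a reference to another person is just a key, so following it is a second lookup. The sketch below is a simplified variant of the 'Appropriate' example; giving Bob a `partner` entry is an assumption added for illustration. A mutual reference like this is easy to express with keys, but awkward with nested copies.

```python
persons = {
    'Alice': {'first_name': 'Alice', 'partner': 'Bob'},
    'Bob': {'first_name': 'Bob', 'partner': 'Alice'},
}

# The partner is stored as a key, so finding the record is a second lookup
partner_of_alice = persons[persons['Alice']['partner']]
print(partner_of_alice['first_name'])
```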

### 4.3 List
The List is the next most powerful data structure in Python, but not the most efficient. It is a **dynamic array**.<br>
It is an array, because data elements can be accessed, inserted and deleted with an index value (starting at 0). And it is dynamic, because it can grow and shrink: items can be inserted and deleted anywhere in the list. In this sense a Python `list` is similar to a JavaScript Array.

**nota bene** The elements in a list have a defined order (from left to right) and can be accessed by position. Since Python 3.7 a dictionary remembers the order in which its keys were inserted, but its elements cannot be accessed by position.

In [None]:
# A list can be written as the literal
[]
# or the function
list()

# elements are ordered from left to right
fruit = ['apples', 'pears', 'oranges']

# first element
print(fruit[0])

# second element
print(fruit[1])

# the odd comparison
print('apples vs. pears:', fruit[0] == fruit[1])

# length of a list
print('fruit length:', len(fruit))
assert len(fruit) == 3  # assert that there are three elements in a list

fruit[4]  # raises IndexError: the list only has the indices 0, 1 and 2

Whenever you need order in your program you will probably use a list or a tuple (up next). A list is dynamic and can be indexed, sliced and cut. You will often iterate over a list, meaning getting each element one-by-one and performing some block of code on it.
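
The 'dynamic' part can be sketched with a few list methods. The fruit names are just example data.

```python
fruit = ['apples', 'pears', 'oranges']

fruit.append('kiwis')        # add at the end
fruit.insert(1, 'bananas')   # insert before index 1
del fruit[0]                 # delete by index
last = fruit.pop()           # remove and return the last element

print(fruit, last)
```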

Let's look at how a for loop can consume a list.

In [None]:
my_top_five_favorite_foods = ['pizza', 'hamburgers', 'fried egg', 'quinoa salad']

for food in my_top_five_favorite_foods:
    print('one of my favorite foods is: {food}'.format(food=food))


for index, food in enumerate(my_top_five_favorite_foods):
    print('favorite food number {index} is {food}'.format(index=index, food=food))

Please note the use of `enumerate` to get the index of an element in a sequence while looping over it. This makes things a lot more elegant.

### Zero indexing!!! Zero indexing!!!
Hmm, what does favorite food number `0` mean? Is it my favorite or my least favorite?<br>
Indexing in Python starts at `0`. This is a major difference between zero-indexed languages like C and Python and one-indexed languages like Matlab and Julia.
Thus it is my most favorite food :)

Please be aware of zero-indexing and try to see if any bugs you have in your code depend on you assuming one-indexing.
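
When a human-friendly ranking is needed, `enumerate` accepts a `start` argument. A minimal sketch, with a shortened variable name for the food list:

```python
foods = ['pizza', 'hamburgers', 'fried egg', 'quinoa salad']

# start=1 only changes the counter, the list itself is still zero-indexed
ranking = ['{rank}. {food}'.format(rank=rank, food=food)
           for rank, food in enumerate(foods, start=1)]
print(ranking)
```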

### 4.3.1 Slicing sequences
A List is a [Sequence](https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range) of elements. It can be sliced! Slicing means cutting the Sequence with a start, an end and a step. Slices follow the same start, stop and step conventions as [Ranges](https://www.pythoncentral.io/pythons-range-function-explained/).



In [None]:
# Let's slice some pizza!
pizza_ingredients = ['pineapple', 'salami', 'cheese', 'mushrooms', 'ham']

# Can I have the pizza without the pineapple please?
start_index = 1
my_pizza = pizza_ingredients[start_index:]  # starting from 1 to the end of the list
print('no pineapple', my_pizza)

# Can I have the pizza without mushrooms and ham?
end_index = 3
my_pizza = pizza_ingredients[:end_index]
print('no mushrooms and ham', my_pizza)

# Can I have only pineapple and cheese?
start_index = 0
end_index = 4
step_size = 2
my_slices = pizza_ingredients[start_index:end_index:step_size]
print('I will eat', my_slices)
print('I will eat', ' and '.join(my_slices))




Look at the last example, why was 'ham' not included in the slice?

Weird eh?<br>
This is due to the combination of zero-indexing and the fact that the end of a slice is exclusive.<br>
If I ask Python to give me the first 4 elements,
it will give me:<br>
elements 0, 1, 2, 3 (4 is **not** included)<br>
If I ask Matlab or Julia I get:<br>
elements 1, 2, 3, 4 (4 is included)

so:

In [None]:
list(range(0, 5, 2))

In [None]:
assert list(range(4)) == [0, 1, 2, 3]
assert len(pizza_ingredients[0:4]) == 4
assert list(range(1,4)) == [1, 2, 3]
assert len(pizza_ingredients[1:4]) == 3
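
Two more slicing features are worth a quick sketch: negative indices count from the end of the list, and the built-in `slice` object lets you store a slice and reuse it. The variable names below are illustrative.

```python
pizza_ingredients = ['pineapple', 'salami', 'cheese', 'mushrooms', 'ham']

last_ingredient = pizza_ingredients[-1]    # counting from the end
last_two = pizza_ingredients[-2:]          # the final two elements
reversed_pizza = pizza_ingredients[::-1]   # a step of -1 walks backwards

# A slice can also be stored as an object and reused
every_other = slice(0, 4, 2)
print(pizza_ingredients[every_other])
```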

### 4.4 Tuple
The tuple is a highly optimized data structure. Once you have created a tuple, you can never go back: it is **immutable**. Mutability will be described later on.

A tuple can only be created and never changed. If you try to alter a tuple, Python raises a `TypeError`.

In [None]:
# A tuple literal with one element is a bit tricky: it demands a ','
(1)  # is not a tuple, because it cannot be distinguished from redundant parentheses
(1,)  # is a tuple literal with one element

# An empty tuple can be written as () or created with the tuple function. It is quite useless :)
tuple()

# Let's create a nested tuple and see if we can alter it after creation
twins = (('harry', 'richard'), ('alice', 'eve'))
twins[0][0] = 'linda'  # Let's try to overwrite harry with linda: raises TypeError
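
Since a tuple cannot be changed, the usual workaround is to build a new one. The sketch below catches the error and then constructs a replacement tuple by concatenation; the name `new_twins` is made up for this example.

```python
twins = (('harry', 'richard'), ('alice', 'eve'))

try:
    twins[0][0] = 'linda'
except TypeError as error:
    print('TypeError:', error)

# Build a new tuple instead of changing the existing one
new_twins = (('linda',) + twins[0][1:],) + twins[1:]
print(new_twins)
```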

## Exercise: Data types and structures
Suppose you are the IT guy at a local school.
What would be a good way to store all the grades for a course for one student (e.g. History)? (Grades are on a scale from 1 to 10.)<br>
Provide an example below.

Most courses have multiple grades and re-exams. How will you store the grades for each course?<br>
_Remember, explicit is better than implicit_

Of course there is more than one subject in high school. How would you store grades across subjects?

The school has many students. How will you store all the grades across students?

## 5 Compare and manipulate data structures

### 5.1 Test if an element is in a container
Comparing the container (data structure) with other containers is not so useful.<br>
Checking whether a certain element exists within a container certainly is!

In [None]:
blocked_countries = ('US', 'AU', 'GB', 'CA')

# Check if country is blocked
country = 'GB'
country_is_blocked = country in blocked_countries
print('{country} is blocked: {is_blocked}'.format(
     country=country, is_blocked=country_is_blocked))

# Check if customer is blocked
ipv6 = '2001:0db8:85a3:0000:0000:8a2e:0370:7334'
blocked_ips = [ipv6]

customer = {'id': '100000521',
            'country': 'NL',
            'ip': ipv6}

customer_is_blocked = customer['country'] in blocked_countries or customer['ip'] in blocked_ips
print('customer {customer_id} is blocked: {is_blocked}'.format(
    customer_id=customer['id'], is_blocked=customer_is_blocked))
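
For membership tests, Python also offers the `set`, a data structure this notebook does not otherwise cover. A set looks elements up by their hash, much like a dictionary does with its keys, so the check does not have to scan the elements one by one. A minimal sketch:

```python
blocked_countries = {'US', 'AU', 'GB', 'CA'}  # a set literal instead of a tuple

gb_blocked = 'GB' in blocked_countries
nl_blocked = 'NL' in blocked_countries
print(gb_blocked, nl_blocked)

# Duplicates collapse, since a set stores each element once
assert len({'US', 'US', 'GB'}) == 2
```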

## Exercise
Generate a list of 10 random integers between 0 and 100 and test whether a number between 0 and 100 is in the list. (hint: check the `random` library [link](https://docs.python.org/3/library/random.html))

### 5.2 Identity and equality
Identity and equality are different things in Python. Twins from a single egg might be almost equal, but they are not one and the same person.<br>
In the same sense, two mutable objects might be equal but they will never be identical, unless there is actually only one underlying object.

Equality is checked with the `==` operator.
Identity is checked with the `is` operator.

In [None]:
transactions = ['gas', 'food', 'rent', 'transport']
copy_of_transactions = transactions[:]  # slicing all elements in a Python list makes a copy of the list

print('equal:', transactions == copy_of_transactions)
print('identical:', transactions is copy_of_transactions)


It is advised to use `is` **only** with singleton values like `None`, `True` and `False`.

In [None]:
age = None

if age is None:
    print('missing value for age')

Using `is` where you actually want to check for equality can lead to subtle bugs.

In [None]:
assert 1 is 1  # happens to pass, but recent Python versions emit a SyntaxWarning for `is` with a literal

In [None]:
assert {'check': True} is {'check': True}  # raises AssertionError: equal content, but two separate objects
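
The case of 'only one underlying object' is worth a sketch of its own. Assigning a list to a second name does not copy it, so a change made through one name is visible through the other. The name `alias` is illustrative.

```python
transactions = ['gas', 'food']
alias = transactions                        # a second name for the same list
copy_of_transactions = list(transactions)   # a new, equal list

alias.append('rent')

print(transactions)           # the change made through `alias` is visible here
print(copy_of_transactions)   # the copy is unaffected
```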

### 5.3 Let's convert between dictionary, list and tuple
There are many ways to switch between a dictionary, list and tuple.

One of the nicest is to `zip` various Sequences and create a dictionary out of them.

In [None]:
ingredients = ['ham', 'bacon', 'spam']
quantities = [1, 3, 2]

zipped = zip(ingredients, quantities)  # this creates an iterator of (ingredient, quantity) pairs

# let's empty the iterator into a tuple
zipped = tuple(zipped)
print('tuple:', zipped)

print('dictionary:', dict(zipped))

Let's convert the keys and values in a dictionary to lists.

In [None]:
shopping_list = dict(zip(ingredients, quantities))

keys = list(shopping_list.keys())
print('keys', keys)

values = list(shopping_list.values())
print('values', values)
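
Keys and values can also be converted together. As a sketch, the `items` method yields (key, value) pairs, which turn into a list of tuples, back into a dictionary, or back into two separate sequences with `zip(*pairs)`.

```python
shopping_list = {'ham': 1, 'bacon': 3, 'spam': 2}

pairs = list(shopping_list.items())   # list of (key, value) tuples
print(pairs)

round_trip = dict(pairs)              # and back to a dictionary
print(round_trip == shopping_list)

# zip(*pairs) splits the pairs back into two tuples
ingredients, quantities = zip(*pairs)
```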

## 6 Advanced
### 6.1 Advanced: Mutability
A mutable data structure can be changed without losing its identity. In Python terms the 'identity' of an object is a unique, constant integer that exists for the length of the object's life. The built-in `id` function returns it.<br>

When you update a dictionary or list you keep the same dictionary or list.<br>
But, when you update a string or tuple, you get a completely new string or tuple.

If an object can be updated without changing its identity it is 'mutable'.


Let's look at a few examples

In [None]:
name = 'John'
print('before', id(name))

name = name + ' Smith'
print('after', id(name))

The large integer is the `id` of the object.<br>
Observe that the `id` is different after you updated the `name` variable with the last name ' Smith'.

In [None]:
fruits = ['apples', 'pears']
print('before', id(fruits))

fruits.append('oranges')
print('after', id(fruits))

Observe that the `id` is **identical** after you updated the `fruits` variable, by appending 'oranges'.
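
The same contrast shows up with the `+=` operator. In this sketch a second name keeps a reference to the original object, so the comparison with `is` is reliable: the list is extended in place, while the tuple is replaced by a new object.

```python
numbers_list = [1, 2]
same_list = numbers_list
numbers_list += [3]      # extends the existing list in place

numbers_tuple = (1, 2)
old_tuple = numbers_tuple
numbers_tuple += (3,)    # builds a new tuple and rebinds the name

print('list kept identity:', numbers_list is same_list)
print('tuple kept identity:', numbers_tuple is old_tuple)
```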

### 6.2 Advanced: Hashable
Hashable data structure types are strongly related to immutable data structure types. Aha, two weird words, interesting! 

**Main point:** To function as a key in a dictionary a data type has to be hashable, which for the built-in types means immutable.

A list such as `fruits` is not hashable. Its magic hash function `__hash__` is not defined and is equal to `None`.

In [None]:
fruits.__hash__ is None  # a list is not hashable

In [None]:
name.__hash__ is not None  # a string is hashable

A string is hashable. Calling its hash function returns a weird number.

**note** you can read more about hash functions [here](https://en.wikipedia.org/wiki/Cryptographic_hash_function). Python's built-in hash is not a cryptographic one, but the general idea is the same.

In [None]:
name.__hash__()

Let's see if this number changes when we transform the string to a new string.

In [None]:
name = 'John'
print('before', name.__hash__())

name = name + ' Smith'
print('after', name.__hash__())

Two immutable objects with the same value or content always have the same hash value. Objects with different content will almost always have different hash values, although collisions are possible in principle.

In [None]:
name = 'John Doe'
same_name = 'John Doe'
different_name = 'John Smith'

assert name.__hash__() == same_name.__hash__()
assert name.__hash__() != different_name.__hash__()
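
In everyday code the built-in `hash` function is the usual way to call `__hash__`. The sketch below repeats the comparison with tuples and shows what happens when a list is hashed; the variable names are illustrative.

```python
point = (1, 2)
same_point = (1, 2)

# hash(x) is the usual way to call x.__hash__()
assert hash(point) == point.__hash__()
assert hash(point) == hash(same_point)

try:
    hash([1, 2])  # a list is mutable and has no hash
except TypeError as error:
    hash_error = str(error)
    print('TypeError:', hash_error)
```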

### 6.3 Advanced: Dictionary keys, immutability and hashable?

So we found out that immutable types cannot be altered after they have been created. This ensures that during their lifetime they will always have the same hash value.

A dictionary is also called a HashMap (e.g. in Java). The key of a dictionary is hashed before use in the internals of the dictionary. Thus only immutable/hashable types can be used as a key in a dictionary.


In [None]:
routing_scheme = {}

domain = 'rabobank.nl'
ip = '10.100.10.1'
routing_scheme[domain] = ip
routing_scheme[ip] = domain

print(routing_scheme)

domain = 'internetbankieren.nl'
ip = (10, 100, 10, 2)
routing_scheme[domain] = ip
routing_scheme[ip] = domain

print(routing_scheme)

domain = 'internalsystems.nl'
ip = [10, 100, 10, 3]
routing_scheme[domain] = ip
routing_scheme[ip] = domain

The moment we try to use a mutable type, in this case a list, as a key in a dictionary, we get a `TypeError` (unhashable type: 'list').
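
A common way around this, sketched below, is to convert the list to a tuple before using it as a key. The list itself is still fine as a value.

```python
routing_scheme = {}

domain = 'internalsystems.nl'
ip = [10, 100, 10, 3]

routing_scheme[domain] = ip          # a list is fine as a value
routing_scheme[tuple(ip)] = domain   # as a key it must be converted first

print(routing_scheme[(10, 100, 10, 3)])
```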

## 7 Wrap-up
This notebook has given you some insights into the nature of Python. These will become more relevant as you start your journey to write more and more Pythonic code. Thank you for your attention.

## Answers Data types and structures
Suppose you are the IT guy at a local school.
What would be a good way to store a grade for a course (e.g. History)? (Grades are on a scale from 1 to 10.)<br>
Provide an example below.

In [None]:
history_grade = 7.5

Most courses have multiple grades and re-exams. How will you store the grades for each course?
Remember, explicit is better than implicit

In [None]:
history_grades = {'exam_first_semester': [4, 7], 'exam_second_semester': [7.5]}

_Does it matter in what order the elements in the above dictionary are written?_<br>
Yes it does matter, but mainly for readability. Try to be alphabetical.<br>
It does not matter for equality: two dictionaries with the same (key, value) pairs are equal regardless of the order in which they were written. Since Python 3.7 a dictionary does remember its insertion order when you iterate over it.


Of course there is more than one subject in high school. How would you store grades across subjects?

In [None]:
biology_grades = {'exam_first_semester': [5, 8], 'exam_second_semester': []}
grades = {'history': history_grades, 'biology': biology_grades}

The school has many students. How will you store all the grades across students?

In this solution we use f-strings (Python >= 3.6).<br>
Take a look at `defaultdict` and see if you can improve the code.

In [None]:
import copy

history_grades = {'exam_first_semester': [], 'exam_second_semester': []}
biology_grades = {'exam_first_semester': [], 'exam_second_semester': []}
grades = {'history': history_grades, 'biology': biology_grades}

batch_of_students = 10
students = [f'2018{10000 + num}' for num in range(batch_of_students)]
student_grades = {student_id: copy.deepcopy(grades) for student_id in students}

student_grades['201810000']['biology']['exam_first_semester'].append(7.5)

In [None]:
import copy

from collections import defaultdict

history_grades = defaultdict(list)
biology_grades = defaultdict(list)
grades = {'history': history_grades, 'biology': biology_grades}

batch_of_students = 10
students = [f'2018{10000 + num}' for num in range(batch_of_students)]
student_grades = {student_id: copy.deepcopy(grades) for student_id in students}

student_grades['201810000']['biology']['exam_first_semester'].append(7.5)
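
One possible further step, offered as a sketch and not as the intended answer, is to nest `defaultdict` at every level so that neither `copy.deepcopy` nor a predefined list of students is needed. The factory functions `new_course` and `new_student` are hypothetical helpers. Keep in mind that merely reading a missing key creates it, which may or may not be what you want.

```python
from collections import defaultdict

def new_course():
    # exam name -> list of grades
    return defaultdict(list)

def new_student():
    # course name -> exams of that course
    return defaultdict(new_course)

student_grades = defaultdict(new_student)
student_grades['201810000']['biology']['exam_first_semester'].append(7.5)

print(student_grades['201810000']['biology']['exam_first_semester'])
```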
