# Python Basics

This is an interactive Jupyter Notebook explaining the very basics of Python that you'll need for the Machine Learning course.

Please make sure you understand everything mentioned here and are able to solve the exercises in the next notebook!

This is only a very brief introduction, mostly intended for people who already know a bit about computer science. If you feel this does not provide enough information for you, please consider looking at these additional references:
- [Official Python Tutorial](https://docs.python.org/3/tutorial/)
- [W3Schools Tutorials](https://www.w3schools.com/python/default.asp)
- [Video Tutorials on YouTube](https://www.youtube.com/playlist?list=PL-osiE80TeTt2d9bfVyTiXJA-UTHn6WwU)
- [Beginners Computer Science & Python Course on Udacity](https://eu.udacity.com/course/intro-to-computer-science--cs101)

For bonus points, also have a look at the official [**Python Code Style Guide**](https://www.python.org/dev/peps/pep-0008/). 

You can execute the code in the cells by clicking into a cell and pressing "shift"+"enter" or clicking the "run" button at the top. Feel free to experiment a bit by changing the code in the examples to see what happens. Don't worry if you get any error messages - consider them hints on how to do better ;-)

Instead of just running through the examples, **it helps to first think about what you would expect the output of a code cell to be and then execute it to see if you were right**. If you weren't, please make sure you understand why and what is happening in the code!

If you see a number in the square brackets next to a cell (e.g. `In [1]`), then the cell was already executed. If the brackets are empty, run the cell to execute the code in it and see the output. If you want to try out something else but not erase the given code, press the "+" button at the top to add a new cell below the currently selected cell. 

You have to execute all the cells starting at the beginning for everything to work as expected. If you change any variable names or their values, be prepared for different outcomes. You can execute cells multiple times and jump back and forth; the number next to the cell tells you the order of execution - if a cell below changes a variable used in the cell above it and you execute the cell above again, the output might be different from before as it uses the updated variable value. 

Should the code get stuck (`In [*]` means the cell is currently executing the code in it; in our examples this should never take more than a few seconds), click the "stop" button at the top or select "Kernel > Interrupt" from the menu bar.

### Using Python as a Calculator

Python understands basic math - just enter an expression and execute the cell to get your result below:

In [1]:
1 + 1

2

In [None]:
# This is a comment. It does not get executed. You can use it to explain your code.
# These are the symbols you can use in calculations (in addition to parentheses "()"):
# +: addition
# -: subtraction
# *: multiplication
# /: division
# **: to the power of
(2 * 5)**3

In [None]:
# regular division returns a decimal number
5 / 2

In [None]:
# with "//" the division result is rounded down to the nearest whole number
5 // 2

In [None]:
# with the modulo operator "%" you get the remainder of a division
5 % 2  # 5 = 2*2 + 1

### Variables

You already know variables from math. There, a placeholder like $x$ can stand for any kind of value. In programming this is similar: you define a name and assign it a value, and then you can work with the value more conveniently by addressing the variable that stores it. This is especially useful if you want to change a value somewhere: since it's assigned to the variable, the new value is used everywhere the variable appears, so you don't have to copy & paste it a hundred times :)
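
As a small illustration of that last point, here is a minimal sketch (the names `price` and `quantity` are made up for this example and are not used elsewhere in the notebook). Changing the value in one place is enough to update every computation that uses the variable:

```python
# hypothetical example values, not used elsewhere in this notebook
price = 2.5
quantity = 4
total = price * quantity
print(total)  # 10.0

# change the value once ...
price = 3.0
# ... and recompute: the new value is picked up automatically
total = price * quantity
print(total)  # 12.0
```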

In [None]:
# define a "width" and a "height" variable by assigning them values with "="
width = 20
height = 5 * 9
# compute something with the variables instead of using the values directly
width * height

In [None]:
# Python only knows variables that you have previously defined;
# if you use a name that you haven't assigned a value yet, you'll get an error
n

In [None]:
# after we state what we mean by "n", we don't get an error anymore
n = 5
n

### "Hello world!"
The standard example in any programming language is to have your code say "Hello world!". In Python this is really simple: you can output anything by just wrapping it in `print( )` like this:

In [None]:
print("Hello world!")

In [None]:
# Exercise: change the print statement below to greet yourself:
print("Hello John!")

In [None]:
# in a Jupyter notebook cell, you only get the output from the last line,
# i.e., this will only give you back the value of height, not width:
width
height  # - what is the output if you switch the two?

In [None]:
# with print you can now output the values of multiple variables, not just the last one
print(width)
print(height)

### Datatypes

Your variables can store different kinds of data, e.g., numbers, text, or even more complex objects. Some basic datatypes are:
- `int`: Integers are whole numbers, i.e., without a decimal point
- `float`: Floating point numbers include the places after the decimal point
- `str`: "String" is just a fancy word for text
- `bool`: A Boolean contains either `True` or `False`, i.e. it's a binary value
- `None`: This symbolizes a missing value
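
A quick sketch of what these types look like in code (the variable names are only for illustration). The built-in `type()` tells you which kind of data a value is:

```python
an_int = 42
a_float = 3.14
a_str = "some text"
a_bool = True
nothing = None

print(type(an_int))   # <class 'int'>
print(type(a_float))  # <class 'float'>
print(type(a_str))    # <class 'str'>
print(type(a_bool))   # <class 'bool'>
print(type(nothing))  # <class 'NoneType'>
```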

#### Numbers

In [None]:
# this is an int
i = 8
# if you're unsure what kind of data a variable contains, check it with "type()"
type(i)

In [None]:
# this is a float
f = 1.6
print(f)  # print output of the variable value
type(f)   # regular output of the variable type

In [None]:
# you can explicitly transform one type of variable into another type,
# this is called casting
print(type(f))  # the f we defined before was a float
g = f           # g now contains the same thing as f
print(g)
print(type(g))
g = int(f)      # now g contains the integer version of f
print(g)
print(type(g))

#### Strings & String Formatting

In [None]:
# strings are created by putting quotes (single ' ' or double " ") around some text:
s = "hello"
s

In [None]:
type(s)

In [None]:
# be careful, some letters have a special meaning, e.g., "\n" means new line:
s = 'C:\some\name'
print(s)

In [None]:
# you can either "escape" these special symbols with an additional "\"
s = 'C:\some\\name'
print(s)

In [None]:
# or put an "r" in front of the string to signal that it is a "raw" string
s = r'C:\some\name'
print(s)

In [None]:
# note that Python internally also escapes some special symbols.
# the regular output gives you the internal representation of the variable,
# while the print statement interprets it the way you would write it out
s

In [None]:
# by the way: it is very bad style to use such short variable names
# (like "s"); usually they should be a bit more descriptive.
# but make sure not to reuse the names of built-ins for your variables,
# i.e., don't call a variable holding a string "str"!
my_string = "this is a variable with a more descriptive name"

In [None]:
# you can not only add numbers, but strings as well
my_string + " " + "AND HERE IS SOMETHING EXTRA"

In [None]:
# but Python will complain if you add variables of different data types
my_string + 4

In [None]:
# you could manually transform (i.e. cast) the number into a string to add them
my_string + str(4)

In [None]:
# a nicer way to do something like this is string formatting
# this allows you to include the values of other variables in a string
# see here for an overview: https://zetcode.com/python/fstring/
formatted_string = f"This is the value of width: {width}"
formatted_string

In [None]:
# this also works for multiple variables
f"These are the values of width: {width} and height: {height}"

In [None]:
# this works with different datatypes
a_float = 5.896
a_str = "FOO\nBAR"
print(a_float)
print(a_str)
print(f"This is the printed output of the float a_float: {a_float} and the str a_str: {a_str}")

In [None]:
# by using repr() you can print the internal representation of the object
print(f"This is the printed output of the float a_float: {repr(a_float)} and the str a_str: {repr(a_str)}")

In [None]:
# with :f, additional float formatting options can be supplied
# by default the floats are shown with 6 places after the decimal point
# but the output can be adapted with .x:
print(f"This is the default output of the float: {a_float:f}")
print(f"This is the float rounded to 2 places after the decimal point: {a_float:.2f}")

In [None]:
# strings also have a bunch of methods that can be called on them to get a modified copy of the string
# for a complete list check the documentation: 
# https://docs.python.org/3/library/stdtypes.html#string-methods
a_str = "This is a sentence. And here we have another sentence, which is longer."
# everything in lower case
print(a_str.lower())
# calling such a method on a string returns a new string, the original variable remains unchanged
print(a_str)
# everything in upper case
print(a_str.upper())
# every word starting with a capital letter
print(a_str.title())
# replace spaces with underscores
print(a_str.replace(" ", "_"))
# split the string at spaces to get a list of words
print(a_str.split())
# split the string at the word "is"
print(a_str.split(" is "))  # exercise: what happens if you remove the spaces around "is"? try it!

#### Booleans & None

In [None]:
# booleans can be set to "True" or "False" explicitly
a_bool = True
print(a_bool)
type(a_bool)

In [None]:
# truth values are also created, e.g., when two things are compared
a_bool = 5 >= 4
print(a_bool)
type(a_bool)

In [None]:
# also works with strings
a_bool = "HELLO" == "hello"
a_bool

In [None]:
# bools can also be returned by other methods
a_str = "This is new."
a_bool = a_str.startswith("This")
print(a_bool)
# check if a string only contains letters
a_bool = "Hey There!".isalpha() # exercise: change the string so that the boolean is True
print(a_bool)
# check if a string only contains numbers
a_bool = "110".isnumeric()
print(a_bool)

In [None]:
# None is mostly used as a placeholder
unknown_var = None
print(unknown_var)
type(unknown_var)

In [None]:
# the preferred way to check if a variable is set to None is to use the keyword "is", not "=="
a_bool = unknown_var is None
a_bool

### Collections
More complex datatypes allow us to store multiple values at a time. These include:
- `tuple`: A tuple contains multiple elements separated by `,`. Usually, these elements are grouped by `()`. A tuple is an immutable datatype, i.e., once it's been defined, the values in a tuple can't be changed anymore.
- `list`: A list is very similar to a tuple but with `[]` surrounding the items and the elements in a list can be changed after its creation.
- `set`: In a set, every item occurs only once and the elements are not in a specific order.
- `dict`: A dictionary (similar to a hash table in other programming languages) contains a mapping between keys and values. While the values can be arbitrarily complex datatypes, the keys have to be something "hashable", like an int, string, or tuple.

Check the [official documentation](https://docs.python.org/3/tutorial/datastructures.html) for more details, e.g., functions you can call on lists etc.
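
To make the differences concrete, here is a minimal sketch that creates one collection of each kind (the example values are made up):

```python
a_tuple = (1, 2, 2)            # immutable, keeps order and duplicates
a_list = [1, 2, 2]             # mutable, keeps order and duplicates
a_set = {1, 2, 2}              # the duplicate 2 is dropped
a_dict = {"one": 1, "two": 2}  # maps keys to values

a_list[0] = 10                   # fine: lists can be changed
print(a_list)                    # [10, 2, 2]
print(len(a_tuple), len(a_set))  # 3 2
print(a_dict["two"])             # 2
```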

#### Lists

In [None]:
# a list can contain arbitrary elements, e.g., numbers, strings, or previously defined variables
a_list = ["hello", 14, 0.054, a_bool]
a_list

In [None]:
# by indexing the list you can get specific elements from it
# note: Python starts counting at 0!
a_list[0]  # get the first element

In [None]:
a_list[1]  # get the second element

In [None]:
a_list[-1] # get the last element

In [None]:
a_list[-2] # what will this return?

In [None]:
# with : you can specify a range of elements to be selected
a_list[1:] # get all elements starting with the second one until the end of the list

In [None]:
a_list[1:3] # the end point is exclusive, i.e. this gives you a_list[1] and a_list[2]

In [None]:
a_list[:2] # up to, but not including the 3rd element

In [None]:
a_list[0:4:2] # select every other element from the first to the last (shorthand: [::2])

In [None]:
# check how many elements a list contains
len(a_list)

In [None]:
# add an element to a list
a_list.append(15)
a_list  # the list was changed in place

In [None]:
# add a list to the list (also in place)
a_list.extend([16, 17, 18])
a_list

In [None]:
# lists can also be added together, but this is not done in place but returns a new list
a_list + [19, 20]

In [None]:
# the original a_list is still without the two new elements
a_list

In [None]:
# the variable needs to be explicitly updated to contain the new elements
a_list += [19, 20]  # += is shorthand for a_list = a_list + [19, 20]
a_list

In [None]:
# you can also change elements in the list by selectively replacing them
a_list[0] = "HELLO"
a_list

In [None]:
# multiple values can be replaced by a new list
a_list[-3:] = [0, 0, 0]
a_list

In [None]:
# check if an element is in a list with the keyword "in"
print(14 in a_list) # bool
# if the element is in the list, you can also ask for its index (the first occurrence if it's in there multiple times)
print(a_list.index(14)) # int: --> a_list[a_list.index(14)] gives you 14!

In [None]:
# a lot of operations on lists also work on strings.
# basically you can think of a string as a list of characters
a_str = "oh, hello world!"
# for example, the "in" check works
print("hello" in a_str)
# index gives you the starting position of the substring in the string
i = a_str.index("hello")
print(i)
# just like list indexing, you can also select parts of a string
print(a_str[i:i+len("hello")]) # what is happening here?

In [None]:
# count how often an element occurs in a list
print(['a', 'b', 'r', 'a', 'c', 'a', 'd', 'a', 'b', 'r', 'a'].count("a"))
# count how often a substring occurs in a string
print("abracadabrababababa".count("aba"))

#### Tuples

In [None]:
# tuples are really similar to lists but written without [] (usually with () for clarity)
a_tuple = (1, 2, "hey")
a_tuple

In [None]:
# you can also select elements
a_tuple[1:]

In [None]:
# but you can't change them!
a_tuple[0] = 15

In [None]:
# Python supports assigning values to multiple variables at once
var0 = 8, "hey"        # var0 is a tuple with both values
print("var0:", var0)
var1, var2 = var0  # the 2 values are unpacked into the two variables
print(f"var1: {var1},   var2: {var2}")
# this is really handy for switching the values of 2 variables, as you don't need a temp variable
var1, var2 = var2, var1 + 1
print(f"var1: {var1}, var2: {var2}")

#### Sets

In [None]:
# the defining characteristic of sets is that they contain every element only once
# and the elements don't have a guaranteed order
# you can create a set from a list or with {}
a_set = set(['HELLO', 14, 0.054, True, 15, 16, 17, 0, 0, 0])
a_set  # the duplicate 0s are removed

In [None]:
# intersections, unions, and differences can be computed on sets (remember math?)
b_set = {16, 17, 18}
print(a_set.union(b_set))        # all elements from a + b
print(a_set.intersection(b_set)) # all elements in both a and b
print(a_set.difference(b_set))   # all elements in a, but not b
print(a_set)  # these operations always return new sets; the original a_set is unchanged

#### Dictionaries

In [None]:
# dictionaries map from arbitrary (hashable, i.e., "simple") elements (the "key") to anything else (the "value")
# the keys have to be unique!
a_dict = {"key": [1, 2, 4], 15: 16, 4: "ho", (0, "a"): 42}
a_dict

In [None]:
# instead of indexing a dict with positions, you use the keys to get the values
a_dict[4]

In [None]:
a_dict[(0, "a")]

In [None]:
# check if a key is in the dict (does not work for values!)
print(42 in a_dict)
# you'll get an error if you try to get an element for a non-existing key
a_dict[42]

In [None]:
# instead of looking up a key directly with [], you can also use .get()
# this returns None instead of raising an error if the key was not found
# and also allows you to specify a different default value to return instead
print(a_dict.get(4))    # key exists
print(a_dict.get(42))   # key doesn't exist
print(a_dict.get(42, "Key not found"))  # return default value instead

In [None]:
# with a_dict.keys(), .values(), or .items() you get all the keys, values, or key-value-pairs in the dict
print(a_dict.keys())
print(a_dict.values())
print(a_dict.items())

In [None]:
# by accessing a dict with a key, the associated value can also be updated
print(a_dict[4])
a_dict[4] = "ha"
print(a_dict[4])

In [None]:
# just like for the other collections, len() gives you the number of key-value pairs in the dict
len(a_dict)

In [None]:
# you can add new elements by specifying a new key and the associated value
a_dict["new key"] = "hello world!"
len(a_dict)

#### range and zip

`range()` is a shortcut to get a sequence of integers. As we'll see later, it's optimized for iteration (e.g., when used in a for-loop), but right now we'll just use it to create an example list.

`zip()` pairs up the elements of two lists and can be used as a shortcut to create a dictionary.
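
A minimal sketch of what `zip()` produces before it is turned into a dictionary (the example lists are made up): it pairs up the elements at the same positions and stops at the end of the shorter input.

```python
letters = ["a", "b", "c"]
numbers = range(1, 10)  # longer than letters on purpose

pairs = list(zip(letters, numbers))
print(pairs)        # [('a', 1), ('b', 2), ('c', 3)]
print(dict(pairs))  # {'a': 1, 'b': 2, 'c': 3}
```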

In [None]:
# use range to get 10 numbers, from 0 to 9
# if you want the numbers as a list, you need to cast it explicitly
# otherwise it's just something that can be iterated over
numbers_list = list(range(10))
numbers_list

In [None]:
# you can also specify start and (exclusive) end points
list(range(5, 16))

In [None]:
# you can also skip numbers, e.g., take every 5th number
list(range(0, 16, 5))

In [None]:
# zip is a shortcut to create dictionaries by merging two lists
# cast it into a dict to use it as a dictionary
list1 = list(range(6))      # numbers from 0 to 5
list2 = list(range(5, 11))  # numbers from 5 to 10
a_dict = dict(zip(list1, list2))  # mapping from 0 to 5, 1 to 6 etc
a_dict

#### min, max, and sorting

`min()` and `max()` are builtin functions to give you the smallest or largest element from a collection.
`sorted()` returns a sorted list from the given collection. There needs to be a meaningful way to compare the elements in the collection for this to work, i.e., all elements need to be comparable with each other (e.g., all numbers or all strings).
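
A short sketch of which comparisons work (the example values are made up): ints and floats can be compared with each other, and strings are compared character by character, with upper-case letters sorting before lower-case ones.

```python
mixed_numbers = sorted([3, 1.5, -2])
print(mixed_numbers)  # [-2, 1.5, 3]

fruits = sorted(["banana", "Cherry", "apple"])
print(fruits)         # ['Cherry', 'apple', 'banana']

smallest_letter = min("hello")
print(smallest_letter)  # e

# numbers and strings can't be compared with each other:
# sorted([3, "three"]) would raise a TypeError
```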

In [None]:
# the functions do what you would expect...
max(-10, 15, 8.3)

In [None]:
# ...unless the elements can't be compared with each other: this raises a TypeError
max(-10, "hey", 15, 8.3)

In [None]:
# they also work directly on collections
a_list = list(range(11))            # numbers from 0 to 10
print(min(a_list))                  # smallest element in a_list
print(max(a_list))                  # largest element in a_list
print(sorted(a_list))               # list sorted from smallest to largest
print(sorted(a_list, reverse=True)) # list sorted from largest to smallest

In [None]:
# similar for strings (output is always a list though!)
print(sorted("This is a string. HELLO!"))

In [None]:
# sort the words by first splitting the string at whitespace
print(sorted("This is a string. HELLO!".split()))

In [None]:
# create a dictionary mapping from numbers to strings
a_dict = dict(zip(range(5), ["hello", "oh", "world", "John", "john"]))
a_dict

In [None]:
print(max(a_dict))                 # get the largest key
print(max(a_dict.values()))        # get the largest value

In [None]:
print(sorted(a_dict))              # sorted keys
print(sorted(a_dict.values()))     # sorted values

In [None]:
# when calling the min/max/sorted functions with a collection, you can pass another function
# as the "key" argument, which is then called and evaluated on every element from the collection
# and instead of using the elements themselves to determine the largest or smallest element,
# whatever is returned by the given "key" function for each element is used instead
# (more on this later when we discuss functions).
# you have already learned that a_dict.get() gives the value in the dict for a specific key.
# we can use this here to sort the keys in a dictionary based on their values:
print(max(a_dict, key=a_dict.get))     # get the key associated with the largest value
print(sorted(a_dict, key=a_dict.get))  # keys sorted by values

#### join

`join()` is a shortcut to concatenate strings in a list into a full string with arbitrary text between the given strings. It is the counterpart to `split()`, which creates a list from a string by removing pieces of text in the string to split it.
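
A minimal sketch of this round trip (the example sentence is made up): splitting at a separator and joining with the same separator gives back the original string.

```python
sentence = "the quick brown fox"
words = sentence.split(" ")
print(words)                 # ['the', 'quick', 'brown', 'fox']

rejoined = " ".join(words)
print(rejoined == sentence)  # True

# any other text can be used as the "glue" as well
print("-".join(words))       # the-quick-brown-fox
```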

In [None]:
list_of_strs = ["hello", "oh", "world", "John", "john"]
# join the strings in the list with nothing in between
"".join(list_of_strs)

In [None]:
# join sorted strings with newline and more text
new_str = "\nWORD ".join(sorted(list_of_strs))
# notice how the additional text is only placed in between the elements of the list,
# not at the beginning or end
print(new_str)

### Loops & Conditionals

So far we have only created and manipulated individual variables. Now we're taking a step towards actual programming by using loops (`for`, `while`) and conditionals (`if`/`else`).

As you have already noticed, multiple lines of code get executed sequentially from top to bottom. Loops can be used as shortcuts to execute the same code multiple times. Conditional statements add additional flexibility by allowing us to execute some code only if certain conditions are met.

The syntax of loops and conditional statements is really simple, you just need to make sure you don't forget the `:` at the end of the statement and then indent the following code by 4 spaces.

`for`-loops are used to iterate over a predefined list of elements and in each iteration you have access to the current element.
```python
for XXX in YYY:
    # do something with the current XXX
# whatever you write outside of the indentation is executed after the loop is finished.
```

`while`-loops are less common; they are used if you don't know in advance when you'll be done with whatever you're computing in the loop.
```python
while ZZZ:
    # you land and stay inside here if ZZZ evaluates to True.
    # make sure that at some point in the code you change something so that ZZZ
    # will eventually evaluate to False so you can exit the loop again and finish.
```

Conditional statements always need an `if` part (first check), then they can contain multiple `elif` ("else if") statements with other checks (if the first check has failed), and finally they can have an `else` statement, which catches all other cases. The `elif` and `else` statements are optional, but `else` always has to come last.
```python
if AAA:
    # do something if AAA evaluates to True
elif BBB:
    # if AAA was False but BBB is True, do something here
elif CCC:
    # if AAA and BBB were False but CCC is True, do something here
else:
    # if AAA, BBB, and CCC were False, you end up here
```
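
Putting the three templates together, here is a small runnable sketch (the temperature values and labels are made up for illustration):

```python
temperatures = [-3, 12, 25, 31]
for temp in temperatures:
    # exactly one of the four branches runs for each element
    if temp < 0:
        label = "freezing"
    elif temp < 20:
        label = "mild"
    elif temp < 30:
        label = "warm"
    else:
        label = "hot"
    print(temp, "is", label)

# a while loop: keep halving the value until it drops below 1
value = 10
steps = 0
while value >= 1:
    value = value / 2
    steps += 1
print(steps)  # 4
```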

#### `for` loops

In [None]:
a_list = ["hello", 14, 0.054, False]
# iterate over the elements in the list with a for loop
for elem in a_list:
    # just print out the element
    print(elem)

In [None]:
a_dict = {1: "hello", 2: 14, 3: 0.054, 4: False}
# when you iterate over a dict, you iterate over the keys
for key in a_dict:
    # print out the key and its associated value
    print(f"key: {key}; value: {a_dict[key]}")

In [None]:
# the range() function is really handy if you want to execute some code X times
fib_numbers = [0, 1]
for i in range(10):
    fib_numbers.append(fib_numbers[-2] + fib_numbers[-1])
# do you know what we did here?
print(fib_numbers)

In [None]:
# another handy shortcut for for-loops is the enumerate() function
# with it you get both the element of a list as well as the index of this element in the list
for i, elem in enumerate(a_list):
    # i contains the index of elem
    print(f"Element {i}: {elem}")

#### `while` loops

In [None]:
# you can do the same thing with while loops as with for loops
# e.g. iterate through a list
i = 0
# you stop if i is larger than the last index of the list
while i < len(a_list):
    print(f"Element {i}: {a_list[i]}")
    # don't forget to increase i, otherwise you would print the same line until the end of time!
    i += 1  # shortcut for  i = i + 1

#### `if` (/`elif`/`else`) statements

In [None]:
# exercise: set i to different values and see what happens
i = 42
if i < 10:
    print(f"{i} is less than 10")
elif i <= 100:
    print(f"{i} is between 10 and 100")
else:
    print(f"{i} is greater than 100")

In [None]:
# with if/elif/else blocks, only one of the code pieces is executed
# (the first one where the condition is satisfied).
# if you have multiple if (instead of elif) statements below each other (without additional indentation),
# they all get checked sequentially and possibly executed.
i = 4
if i < 10:
    print(f"{i} is less than 10")
if i < 100:
    print(f"{i} is less than 100")

#### `and`, `or`, `not`, and nothing

As we have seen with `while` loops and `if` statements, booleans (i.e. anything that evaluates to `True` or `False`) critically determine the flow of our code. 

With `and` and `or` we can construct even more complex expressions by checking for multiple conditions at once. 

`not` is a keyword to invert the value of a boolean.

Furthermore, the checks in `while` loops and `if` statements also work on variables directly, e.g., an empty list (`[]`), an empty string (`""`), the number `0`, and `None` all implicitly evaluate to `False`.
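
The built-in `bool()` makes this implicit conversion visible. A minimal sketch with a few made-up example values:

```python
values = [[], "", 0, None, [0], "no", -1]
for value in values:
    # repr() shows the value the way you would write it in code
    print(repr(value), "->", bool(value))
# the first four values give False, the last three give True:
# a list containing 0 is not empty, "no" is a non-empty string, and -1 is not 0
```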

In [None]:
i = 24
# "AAA and BBB" is only True if both AAA and BBB evaluate to True
if i >= 10 and i <= 100:
    print(f"{i} is between 10 and 100")

In [None]:
# "AAA or BBB" is True if either AAA, BBB, or both AAA and BBB evaluate to True
if i < 10 or i > 100:
    # no print output means we didn't get inside here
    print(f"{i} is either smaller than 10 or greater than 100")

In [None]:
# "not" can make your code more readable
stop = False
i = 0
while not stop:
    i += 1
    if i > 10:
        stop = True
# what value does i have? (remove the # before the print statement to see if you're right)
# print(i)
# exercise: put the line with i += 1 after the if block - what is the value of i then?
# (careful with indentation! *after*, not *inside* the if block...)

In [None]:
a_list = list(range(10))
# a non-empty list implicitly evaluates to True
while a_list:
    # pop removes and returns the last element
    elem = a_list.pop()
    print("Popped", elem)
print(a_list)

In [None]:
a_list = []
i = 1
while len(a_list) < 10:
    # "i % 2" returns the remainder if you divide the variable i by 2.
    # since the number 0 also evaluates to False, we enter the if block
    # if i is divisible by 2 without a remainder
    if not (i % 2):
        a_list.append(i)
    i += 1
# which numbers does a_list contain and how many?
# print(a_list)
# print(len(a_list))

#### `any` and `all`

`any()` and `all()` are functions that can be applied to a list and work similarly to `or` and `and`, respectively:

`any(a_list)` evaluates to `True` if any of the elements in `a_list` evaluate to `True`

`all(a_list)` evaluates to `True` if all the elements evaluate to `True`, i.e., the list does not contain e.g. `0`, `False`, `None`, `""`, etc.
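
A minimal sketch with made-up values, including the edge case of an empty list:

```python
values = [3, 0, 7]
print(any(values))  # True: at least one element is truthy
print(all(values))  # False: the 0 is falsy

checks = [3 > 0, 0 >= 0, 7 < 10]  # a list of booleans
print(all(checks))  # True

# edge case: an empty list contains no falsy element, but also no truthy one
print(all([]), any([]))  # True False
```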

In [None]:
a_list = [10, "hey", -0.2, True]
print("original list all:", all(a_list))
print("original list any:", any(a_list))
a_list.append(None)  # exercise: try something else here
print("modified list all:", all(a_list))
print("modified list any:", any(a_list))

#### `break` and `continue`

`break` and `continue` are special keywords that can be used inside loops to stop the computation early (`break`) or skip the computation for an iteration (`continue`).
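
As a small sketch (the numbers are made up), the loop below skips negative numbers with `continue` and stops completely at the first number greater than 10 with `break`:

```python
numbers = [4, 7, -1, 9, 12, 5]
total = 0
for n in numbers:
    if n < 0:
        continue  # skip this element, go on with the next one
    if n > 10:
        break     # leave the loop; the remaining elements are never looked at
    total += n
print(total)  # 20
```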

In [None]:
# before you execute this: what do you expect the output to be?
for i in range(15):
    if i >= 10:
        break
    if i % 2:
        continue
    print(i)
print("Done with the loop!")

### List Comprehensions

List (and dictionary) comprehensions are a really neat way to quickly generate a list or dictionary in a single line. They are usually also a bit faster than appending elements to a list or dictionary in a loop. It might look a bit complicated at first, but once you get it you'll use it all the time!

The key to understanding list comprehensions is to think about what they would look like as ordinary for loops. They always have the structure:
```
[(expression) (for loops and if statements)]
```
Which is the same as
```
for loops and if statements:
    expression
```
i.e., the for loops and if statements are always in the same order, only that normally they are spread out over multiple lines with indentation, while in a list comprehension they are in a single line one after another and without a `:` between them. The expression that is normally inside one or multiple loops is at the beginning of the list comprehension.

Check out the [documentation](https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions) for more complex examples.

In [None]:
# simple example:
a_list1 = [i**2 for i in range(10)]
print(a_list1)

# the same in a for loop
a_list2 = []
for i in range(10):
    a_list2.append(i**2)
print(a_list2)

In [None]:
# you can also include if statements to filter the elements
a_list1 = [i**2 for i in range(10) if i % 2]
print(a_list1)

# the same in a for loop
a_list2 = []
for i in range(10):
    if i % 2:
        a_list2.append(i**2)
print(a_list2)

In [None]:
# or use multiple for loops
[(x, y) for x in [1, 2, 3] for y in [1, 2, 3] if x != y]

In [None]:
list_of_lists = [range(5), range(14), range(7), range(22)]

# a more complicated example that combines multiple for loops and if statements
a_list1 = [x for inside_list in list_of_lists if len(inside_list) > 5 for x in inside_list if x > 10]
print(a_list1)

a_list2 = []
# go through the list of lists
for inside_list in list_of_lists:
    # continue only if the inside list has more than 5 elements
    if len(inside_list) > 5:
        # go through the elements of the inside list
        for x in inside_list:
            # continue only if the element is larger than 10
            if x > 10:
                a_list2.append(x)
print(a_list2)

In [None]:
# dictionary comprehensions work the same, you just use {} instead of []
# and map key to values with k: v
a_dict = {i: i**2 for i in range(10)}
print(a_dict)

### Error Handling

As you've seen before, sometimes our code can throw an error, which stops the computation right then and there and the code exits with this error. Often, this is good, since it shows us that we didn't think of something and should go back and improve our code. Other times though, errors are expected behavior and we don't want them to cause our program to come to a complete halt. Instead, we can wrap code that might throw an error at some point (e.g. a request to a website that can be temporarily unavailable) in a `try`-`except`-block:
```python
try:
    # code that might throw an error
except:
    # the error has occurred! what now?
```
Above is the simplest way of handling all errors that might occur. Usually you want to be more specific, e.g., if you expect a specific type of error to be thrown in the `try` part, then only catch this kind of error. This way you still might notice ways to improve your code since you still get all the unexpected error messages that signal that you missed something.

If you write
```python
except:
    pass
```
the error will simply be ignored and we move on.
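
A minimal sketch of catching only one specific kind of error (the input values are made up): `int()` raises a `ValueError` for text that is not a whole number, so only that error is handled here, and anything unexpected would still stop the code.

```python
raw_values = ["12", "7.5", "abc", "3"]
numbers = []
for raw in raw_values:
    try:
        numbers.append(int(raw))
    except ValueError:
        # only reached if int() could not convert the text
        print(f"skipping {raw!r}")
print(numbers)  # [12, 3]
```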

In [None]:
sentence = "this is a string with many words some words occur many times this this is is is a a a a a"
# count how often each word occurs in the string above
# the dict should contain a mapping with {word: count}
count_dict = {}
for word in sentence.split():
    # for each word in the sentence we want to increase its count by one
    try:
        count_dict[word] += 1
    # you already know that we get a KeyError if we try to access or manipulate
    # the value associated with a key that doesn't exist yet.
    # here we can use this to initialize the count for this word in these cases
    except KeyError:
        count_dict[word] = 1
# we're done counting! create some nice output by printing the words with decreasing frequency
for word in sorted(count_dict, key=count_dict.get, reverse=True):
    # :10 is similar to :.2f: we tell the string how many spaces it should take up
    print(f"{word:10}: {count_dict[word]}")
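
The same counts can also be computed without a `try`-`except`-block. The sketch below swaps in `dict.get()` with a default value of 0 instead of catching the `KeyError`; the variable name `count_dict_get` is only used here to keep it apart from the dict above.

```python
sentence = "this is a string with many words some words occur many times this this is is is a a a a a"
count_dict_get = {}
for word in sentence.split():
    # get() returns the current count, or 0 if the word is not in the dict yet
    count_dict_get[word] = count_dict_get.get(word, 0) + 1
print(count_dict_get["a"], count_dict_get["is"], count_dict_get["this"])  # 6 4 3
```

Which variant to prefer is largely a matter of taste; the version with `try`-`except` makes the "key is missing" case explicit.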

### Functions

Functions are great for organizing your code: they help you avoid copying and pasting frequently used computations and make it easy to reuse code in other projects. A function should always have a single clear purpose and be self-contained.

A function consists of at least two lines: the function signature, which specifies the name of the function and its arguments (i.e., the values you give the function and based on which it computes something), and the function body, which typically ends with a return value, i.e., the result of the computation (a function without a `return` statement returns `None`):
```python
def name_of_function(argument1, argument2, arg_with_default1=10, arg_with_default2="hey"):
    # some code
    return "Hello World!"
```
When you call a function, the order of the arguments matters (or you have to specify the argument names) and you have to provide at least as many values as there are arguments without default values. E.g., the above function could be called with:
```python
new_var = name_of_function(13, 5.7, arg_with_default2="ha")
```
Since we're leaving `arg_with_default1` at its default value 10, we need to specify the name of the following argument to give it a new value. Otherwise the function would just go by the ordering of the arguments and "ha" would be assigned to `arg_with_default1` instead.

`new_var` now contains the result of the function, i.e., whatever was returned (in this case always the string "Hello World!", no matter what you passed as arguments). A function can also return multiple values. In this case, just catch the results with multiple variables:
```python
var1, var2 = function_with_2_return_values()
```
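
As a minimal sketch of this, the made-up function `min_and_max` below returns two values at once. Under the hood, Python packs them into a tuple, which is unpacked again when you assign the result to two variables.

```python
def min_and_max(values):
    # hypothetical example function: returns two results, separated by a comma
    return min(values), max(values)

smallest, largest = min_and_max([3, -1, 7])
print(smallest, largest)  # -1 7
# if you catch the result in a single variable, you get the whole tuple
both = min_and_max([3, -1, 7])
print(both)  # (-1, 7)
```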

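Careful: when you write a function, only access variables in that function that you have either defined in the function itself or passed as arguments. Python does not stop you from reading variables inside a function that were defined at the top level of the notebook or script (so-called global variables), but then the result of the function depends on whatever value this variable happens to have when the function is called. This is easy to get wrong, especially in a notebook where cells can be executed in any order. Keeping functions self-contained also makes your code easier to debug and reuse.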
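
A small sketch of why relying on outside variables is fragile. The names `factor`, `scale_hidden`, and `scale_explicit` are made up for this illustration: the first function depends on a variable from the surrounding code, the second one gets everything it needs passed as arguments.

```python
factor = 2

def scale_hidden(x):
    # uses "factor" from the surrounding code: the result changes whenever it does
    return x * factor

def scale_explicit(x, factor):
    # everything the function needs is passed as an argument
    return x * factor

print(scale_hidden(5))       # 10
factor = 100                 # somewhere else in the code the variable is changed...
print(scale_hidden(5))       # 500
print(scale_explicit(5, 2))  # 10
```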

In [None]:
# a simple function to add 2 numbers
def add_numbers(a, b=10):
    print(f"[add_numbers] a is {a}, b is {b}")
    return a + b

c = add_numbers(5, 9)
print(c)
# without the second argument we just add the default value to the given first value
d = add_numbers(5.5)
print(d)
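
The rules for passing arguments by position or by name can be sketched with a function that has two default arguments. The function `describe` is hypothetical and only serves as an illustration.

```python
def describe(name, greeting="Hello", punctuation="!"):
    # hypothetical function, only used to illustrate argument passing
    return f"{greeting} {name}{punctuation}"

print(describe("Ada"))                        # Hello Ada!
# the second positional value is assigned to greeting
print(describe("Ada", "Hi"))                  # Hi Ada!
# skip greeting by naming the argument we want to change
print(describe("Ada", punctuation="?"))       # Hello Ada?
# if all arguments are named, their order does not matter
print(describe(punctuation=".", name="Ada"))  # Hello Ada.
```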

In [None]:
# not specifying enough arguments produces an error
e = add_numbers()

In [None]:
def square(x):
    return x**2

# functions can be used in many places, e.g., when sorting a dictionary, we have already seen that
# we can sort the keys by the values. With custom functions, we can sort by anything!
a_list = [0, 7, -5, 19, -6, 89, -100]
# sort the list based on the return values of the square function
# i.e., compute square(i) for every i in the list and then use these values to sort the original values
sorted(a_list, key=square)

In [None]:
# functions can also have multiple return statements - the first one that is encountered will be taken
def is_greater_than_10(a_number):
    if a_number > 10:
        # if it's greater than 10, we enter the block and exit the function early
        return True
    # otherwise we end up here and can return something else
    print("[is_greater_than_10] skipped the if block!")
    return False

print("return value for -5:", is_greater_than_10(-5))
print("return value for 10:", is_greater_than_10(10))
print("return value for 11:", is_greater_than_10(11))

#### Lambda functions

Lambda functions are anonymous functions, i.e., they are not defined by name, and instead of the keyword `def`, they are defined with `lambda`. Usually, they are very short (one-liners!) and straightforward, and are therefore often used to define functions that are passed as arguments to other functions such as `sorted()` or `max()`.

For example, in the cell above, we sorted `a_list` by passing the `square()` function to the `key` argument to sort the values in the list based on their squared values. This can be done in a more concise way by using a lambda function instead of defining a separate function (especially if this function is not used anywhere else in the code):

In [None]:
# x is the argument that is passed to the function and what comes after the : is the return value
# (also works with multiple arguments, i.e., could also define, e.g., lambda x, y: x + y)
sorted(a_list, key=lambda x: x**2)
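
Lambda functions are also handy when the elements are more complex than plain numbers. In the sketch below, the list `people` is made-up example data, and the lambda functions pick the second entry (the age) of each tuple as the value to compare.

```python
people = [("Alice", 31), ("Bob", 25), ("Carol", 42)]
# find the tuple with the largest age
oldest = max(people, key=lambda person: person[1])
print(oldest)  # ('Carol', 42)
# sort the tuples by age instead of by name
by_age = sorted(people, key=lambda person: person[1])
print(by_age)  # [('Bob', 25), ('Alice', 31), ('Carol', 42)]
```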

### Classes

Classes, the cornerstone of "object-oriented programming", are very useful to collect a bunch of variables (in this case referred to as "attributes") and functions (here called "methods") that belong together in a meaningful way. After having defined a class, we can then create instances of this class, which are variables that are of the type of the class and on which we can call the methods that we have defined for this class.

In the end, classes are not much different from the data types that you have already encountered. For example, lists and strings also come with pre-defined methods that can only be called on variables of this type (e.g. `a_str.split()`).
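
A quick way to convince yourself of this is to look at the type of a string or a list and to call one of their methods. The variable names below are made up for this sketch.

```python
some_text = "classes are everywhere"
some_numbers = [3, 1, 2]
# the built-in data types are classes as well
print(type(some_text))     # <class 'str'>
print(type(some_numbers))  # <class 'list'>
# split() is a method of strings, sort() is a method of lists
words = some_text.split()
some_numbers.sort()
print(words)         # ['classes', 'are', 'everywhere']
print(some_numbers)  # [1, 2, 3]
```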

In [None]:
# code to define a simple class

class Rectangle():
    
    def __init__(self, side1, side2=10):
        # this function is called automatically when you create a new
        # instance of the class Rectangle. You need to pass the respective 
        # values if no defaults are specified for the arguments.
        # "self" is a special argument that references the object itself and is passed implicitly.
        # when initializing a new object, you save the values that were passed 
        # and thereby create attributes of the object
        self.side1 = side1
        self.side2 = side2
        
    def get_area(self):
        # this is a method that can be called on every instance of the class Rectangle
        # since it only has "self" as an argument, you don't need to pass any values when calling it
        return self.side1 * self.side2

In [None]:
# create an instance of the class defined above
rect = Rectangle(20)
# its type is the class that we defined
print(type(rect))
# we can access the object's attributes
print(rect.side1, rect.side2)
# and call the get_area method on the object
print(rect.get_area())

In [None]:
# you can create as many instances of the class as you want
rect_new = Rectangle(4, 5)
print(rect_new.side1, rect_new.side2)
print(rect_new.get_area())
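
Methods can also take additional arguments besides `self` and change the attributes of the object. The class `Square` below is a hypothetical example for this; it is not used anywhere else in the notebook.

```python
class Square:

    def __init__(self, side):
        self.side = side

    def grow(self, amount):
        # besides "self", this method takes one regular argument
        # and uses it to update an attribute of the object
        self.side = self.side + amount

    def get_area(self):
        return self.side * self.side

sq = Square(2)
# "self" is passed implicitly, we only need to pass the amount
sq.grow(3)
print(sq.side)        # 5
print(sq.get_area())  # 25
```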

### Accessing code defined outside this notebook

The great thing about Python is that a lot of other people have written a lot of very useful code that you can use for your own projects. However, by default, your code only knows the variables and functions that you have defined here yourself (together with some very basic standard functions like `len()` and `max()`). To get access to all the other great code, you need to **import** it, so your code knows how to call on it.

These are the standard ways to import code:
```python
import some_library
# you can now access all the functions, classes, etc. defined in some_library
# by calling some_library.function_that_i_need()

# with "as" you can define a shortcut name on which you can call the functions of the library
import some_library_with_a_very_long_name as long_lib
# so instead of calling "some_library_with_a_very_long_name.function_that_i_need()"
# you can just call "long_lib.function_that_i_need()"

# sometimes you only need a very specific function or class from a library
# then you can also import it directly
from some_library import function_that_i_need
# now instead of calling "some_library.function_that_i_need()"
# you just call "function_that_i_need()" directly, i.e., as if you had defined it in your own code here

# sometimes the functions that you need are in a submodule of the library
# this is accessed with a ".". Also, multiple things that you want to import
# from a (sub)module can be separated by commas:
from some_library.submodule_name import another_function, Some_Class
# you can now call "another_function()" and "Some_Class()" directly

# Never do this:
from some_library import *
# this would import all the functions, classes, variables etc. from some_library, 
# but now it can easily happen that you accidentally call something from the library
# even though you wanted to access something else by the same name, but you didn't
# know that you had imported it with the wildcard *
```
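
The placeholder names above can be replaced with real modules. Here is a short sketch using only modules from Python's standard library (`math`, `statistics`, `collections`, and `os.path`); the shortcut name `stats` and the file name `data/results.csv` are arbitrary choices for this example.

```python
# plain import: access everything via the module name
import math
print(math.sqrt(16))  # 4.0

# import with a shortcut name
import statistics as stats
print(stats.mean([1, 2, 3, 4]))  # 2.5

# import a single class directly
from collections import Counter
print(Counter("banana").most_common(1))  # [('a', 3)]

# import multiple functions from a submodule
from os.path import basename, splitext
print(splitext(basename("data/results.csv")))  # ('results', '.csv')
```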

In [None]:
# import the date class from the datetime module:
from datetime import date

# call a method of the class you have imported
date.today()

#### Importing your own code

If you write some functions or classes that you would like to use in other projects, you can copy them to a text file with a `.py` ending (e.g. `my_script.py`). Put this file in the same folder as your project notebook and then, in the notebook, import the functions or classes that you want to use from your script:
```python
from my_script import my_function
```
Storing general-purpose code outside of the notebook makes it easier to reuse and keeps your notebook readable by reducing the code in it to the essentials.
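
To try this out without creating files by hand, the sketch below writes a tiny `my_script.py` into a temporary folder and then imports from it. Normally you would simply save the file next to your notebook; the temporary folder, the `sys.path` entry, and the body of `my_function` are assumptions made only for this demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# create a temporary folder and write a small script into it
folder = Path(tempfile.mkdtemp())
(folder / "my_script.py").write_text("def my_function(x):\n    return 2 * x\n")

# tell Python to also look for modules in this folder
# (not needed if the file is in the same folder as the notebook)
sys.path.insert(0, str(folder))
importlib.invalidate_caches()

from my_script import my_function
print(my_function(21))  # 42
```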

## You're done!
#### Go check out the exercises!