# Final stretch

 * Today: some minor topics yet to be covered
 * Next week: Git
 * In two weeks: Last lecture, preparation for the exam

The exam is going to take place on **Thursday 19.12.2024** some time between 11:30 and 14:00. More information will follow.

Until then: practice, practice, practice!

# Supertab

In the lecture on 10.12.2024 (in two weeks), we will be testing Supertab in class. **Participation is mandatory**, but you do not have to come physically: you may participate from home. Detailed instructions will follow next week and via email.

# Important Notice 

Soon, you will **no longer be able to use SMS** for two-factor authentication (2FA) on edu-ID. If you're already using an authenticator app, then you're OK. But if you receive an SMS code when logging in, then you must change your authentication method now. Make sure you do this before your exams! More information:

[https://www.uzh.ch/blog/zi/2024/11/22/switch-edu-id-mit-zwei-faktor-authentifizierung-sms-option-faellt-weg/](https://www.uzh.ch/blog/zi/2024/11/22/switch-edu-id-mit-zwei-faktor-authentifizierung-sms-option-faellt-weg/)

## Recap: Recursion
With recursion, **a whole can be broken up into smaller pieces which resemble the whole**.

Here is how one might implement the functionality of getting the sum of a list `numbers` iteratively and recursively:

In [None]:
def iter_sum(numbers):
    res = 0
    for n in numbers:
        res += n
    return res

def rec_sum(numbers):
    if not numbers:
        return 0
    return numbers[0] + rec_sum(numbers[1:])           # the recursive call happens BEFORE the addition (+), which is the last thing that happens

l = [1, 2, 3]
print(iter_sum(l))
print(rec_sum(l))

Know that a *tail-recursive* function has the recursive call as *the very last thing that really happens* in the function. Note that this is **not** the case in `rec_sum`, where the last thing that really happens is the addition using `+`. A tail-recursive solution usually needs to provide an accumulator which is passed down into each recursive call:

In [None]:
def tail_sum(numbers, res=0):
    if not numbers:
        return res
    return tail_sum(numbers[1:], res + numbers[0])     # the recursive call is truly what happens last
print(tail_sum(l))

In both recursive implementations, there are three features common to most recursive code:
 1. one or more base cases
 2. one or more recursive calls
 3. somehow combining *this* data with *the recursive rest*

The most common way you will encounter recursion is in recursive data structures ("files and folders").
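To make the "files and folders" idea concrete, here is a minimal sketch. It assumes a made-up representation that is not from the lecture: a folder is a `dict` mapping names to contents, and a file is just an `int` giving its size. The function name `total_size` is likewise hypothetical.

```python
# Hypothetical representation: a folder is a dict, a file is an int (its size in bytes)
tree = {
    "notes.txt": 120,
    "photos": {
        "cat.png": 2048,
        "trips": {"zurich.jpg": 4096},
    },
    "empty_folder": {},
}

def total_size(node):
    if isinstance(node, int):                # base case: a single file
        return node
    # recursive case: a folder -> combine the sizes of everything inside it
    return sum(total_size(child) for child in node.values())

print(total_size(tree))
```

Note how all three features from the list above show up: a base case (a file), a recursive call (for each child), and a combination step (`sum`).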

## A few words on testing

It is very important that a test only tests **one specific thing**. For example, you were required to implement a test for the "Game Moves" task which ensures that a given implementation raises an exception if the game world dimensions are invalid. You might have come up with a test case like this:

```python
    def test_invalid_rowlength(self):
        state = (
            "#####   ",
            "### o  #",
            "#      #",
            "  "          # invalid row, not the same length as the others!
        )
        with self.assertRaises(Warning):
            move(state, "up")
```

But there's a problem with this test case. An implementation which **does not** check for invalid row dimensions, but **does** check for invalid moves will "correctly" raise a `Warning` here. In this test, the player is running `UP` into a wall, so a `Warning` will be raised anyway! So this test will **pass** even for a buggy implementation that does not check if each row has the same length! 

Here is a test case that does not suffer from this problem:

```python
    def test_invalid_rowlength(self):
        state = (
            "#####   ",
            "### o  #",
            "#      #",
            "  "          # invalid row, not the same length as the others!
        )
        with self.assertRaises(Warning):
            move(state, "down")
```

Now, the only thing wrong when calling `move` is that the world has rows of unequal length; everything else is valid. This way, the specific bug can actually be identified!
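To see the difference between the two tests, here is a minimal sketch of a deliberately buggy `move`. It is not the reference solution of the "Game Moves" task: the name `buggy_move` and its simplified rules (find the player `o`, refuse to walk into `#` or off the board) are assumptions made purely for illustration. It never checks whether all rows have the same length.

```python
def buggy_move(state, direction):
    """Simplified, hypothetical move function WITHOUT a row-length check."""
    offsets = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
    d_row, d_col = offsets[direction]
    # locate the player 'o'
    row = next(r for r, line in enumerate(state) if "o" in line)
    col = state[row].index("o")
    new_row, new_col = row + d_row, col + d_col
    # the only check: is the target cell on the board and free?
    on_board = 0 <= new_row < len(state) and 0 <= new_col < len(state[new_row])
    if not on_board or state[new_row][new_col] == "#":
        raise Warning("invalid move")
    return "moved"                           # the real task would return the new state

state = (
    "#####   ",
    "### o  #",
    "#      #",
    "  ",                                    # invalid row
)

def raises_warning(direction):
    try:
        buggy_move(state, direction)
    except Warning:
        return True
    return False

print(raises_warning("up"))      # a Warning is raised, so the "up" test passes despite the bug
print(raises_warning("down"))    # no Warning is raised, so the "down" test fails and exposes the bug
```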

<p style="height:100px"></p>
<hr>
<p style="height:100px"></p>





## `map` and `filter`

In Python, if you want to *change* each value in a collection, or *filter* a collection, you would typically use a comprehension. For example, you might convert a list of strings to a list of integers like this:

In [None]:
l = ["1", "3", "5"]
[int(n) for n in l]

Or to get rid of all strings which are not numbers, you might do:

In [None]:
l = ["1", "a", "5", "11", "five"]
[n for n in l if n.isdigit()]

However, comprehensions are a fairly Python-specific idea. In most other programming languages, the functionality of comprehensions is covered by two functions called `map` and `filter`. Here is how we could implement these functions ourselves:

`my_map` takes a function and a collection of values. It calls the function for each value in the collection and returns a list with the resulting values. We say that `fun` is *applied* to each value in the collection.

In [None]:
def my_map(fun, collection):
    res = []
    for elem in collection:
        res.append(fun(elem))      # apply fun to elem -> transform
    return res                     # list of transformed values

l = ["1", "3", "5"]
my_map(int, l)                     # equivalent to [int(n) for n in l]

Likewise, `my_filter` takes a function and a collection of values. It calls the function for each value in the collection to determine whether to keep the element or to discard it, and returns a list containing only those values from the original collection for which the function call evaluated as truthy.

In [None]:
def my_filter(fun, collection):
    res = []
    for elem in collection:
        if fun(elem):              # apply fun to elem -> filter
            res.append(elem)
    return res                     # list of values that passed the filter

l = ["1", "a", "5", "11", "five"]
my_filter(str.isdigit, l)          # equivalent to [n for n in l if n.isdigit()]

When comparing `my_filter(str.isdigit, l)` with `[n for n in l if n.isdigit()]`, remember that 

In [None]:
"5".isdigit()

is just syntactic sugar for:

In [None]:
str.isdigit("5")

So to illustrate that it's still `str.isdigit` which is applied to each value, you could rewrite the comprehension as

In [None]:
[n for n in l if str.isdigit(n)]

Python ships with these functions built-in. 

In [None]:
l = ["1", "3", "5"]
map(int, l)

In [None]:
l = ["1", "a", "5", "11", "five"]
filter(str.isdigit, l)

Notice that these are lazy iterators: they compute each value only when it is requested. So to "see" their contents in Jupyter Notebook, we need to convert the result to a collection.

In [None]:
l = ["1", "3", "5"]
list(map(int, l))               # wrap in list(...) to see the result

In [None]:
l = ["1", "a", "5", "11", "five"]
list(filter(str.isdigit, l))    # wrap in list(...) to see the result
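Because the objects returned by `map` and `filter` are iterators, they can only be consumed once. The following sketch illustrates this with `next` and `list`; the variable name `numbers` is only for illustration.

```python
l = ["1", "3", "5"]
numbers = map(int, l)      # nothing has been converted yet

print(next(numbers))       # converts and returns only the first element
print(list(numbers))       # consumes the remaining elements
print(list(numbers))       # the iterator is now exhausted, so this list is empty
```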

As you can see, `map` and `filter` are very simple functions that implement the modification and filtering capabilities provided by comprehensions in Python. The Python designers indeed recommend using comprehensions over the use of these functions. Most other languages, however, don't have comprehensions and simply provide `map` and `filter`. You should get comfortable with the idea.

The most important thing to understand is that the first parameter given to `map` and `filter` is always a function that takes one parameter.
 * In case of `map`, the return value (`5`) of the function call (`int("5")`) replaces the original element (`"5"`) in the resulting collection.
 * In case of `filter`, the return value (`True`) of the function call (`str.isdigit("5")`) determines whether the original item (`"5"`) should be kept in the resulting collection.

Let's see a few more examples:

In [None]:
# Get lengths of strings
words = ["apple", "banana", "cherry", "date"]
list(  map(len, words)         )          # [len(w) for w in words]

In [None]:
# Filter for strings containing only whitespace
l = ["  ", "hello", " ", "world", "\n", "\t"]
list(  filter(str.isspace, l)  )          # [s for s in l if s.isspace()]

In [None]:
# Round floating point numbers
decimals = [3.14159, 2.71828, 1.41421]
list(  map(round, decimals)    )          # [round(n) for n in decimals]

In [None]:
# Filter for truthy values
values = [0, None, False, "", 42, "hello", [], [1,2], {}]
list(  filter(bool, values)    )          # [v for v in values if v]

As you can see in each of those examples, the first parameter given to `map` or `filter` is a function that takes one parameter:

 * `map(len, words)`: `len` takes a collection and returns an integer
 * `filter(str.isspace, l)`: `str.isspace` takes a string and returns a boolean
 * `map(round, decimals)`: `round` takes a number and returns an integer
 * `filter(bool, values)`: `bool` takes anything and returns a boolean

However, what can we do if our transformation function for `map`, or our criteria for `filter` are more complicated than just a plain function? Let's say in the following example, that we want to remove all elements from a collection unless they are a list, tuple or dictionary. Instead of providing a ready-made function to `filter`, we could just implement a new function that we can pass as the first parameter to `filter`:

In [None]:
values = [0, None, False, "", 42, "hello", [], [1,2], {}]

def criteria(it):
    return isinstance(it, (list, tuple, dict))  # returns a boolean
    
print(list(  filter(criteria, values)   ))      # criteria is applied to each element

As you can see, we implemented a function `criteria` which takes one argument. This function is called for each element in the collection to determine whether to keep it or not. The `criteria` function returns True if `it` is one of the desired types.

Note that we cannot simply use `isinstance(it, (list, tuple, dict))` as the first parameter to `filter`, because that's not a function:

```python
# Here, the first parameter to filter is NOT a function
# This wouldn't work because what is *it*?
print(list(  filter(isinstance(it, (list, tuple, dict)), values)   ))
```

So that's why we had to create a function (`criteria`) first, and use it as a parameter to `filter`.

Here is a similar example for `map`. Say we want to add `1` to each number in a collection. Even for something so simple, we would still need to implement a separate function:

In [None]:
values = [1, 3, 5]
def adder(x):
    return x + 1
print(list(  map(adder, values)   ))

Again, we couldn't just use `x+1` as the first parameter:

```python
# Here, the first parameter to map is NOT a function
# This wouldn't work because what is *x*?
print(list(  map(x + 1, values)   ))
```

Having to define a new function and give it a name is a little bit cumbersome, especially if we're only going to use that function as part of a `map` or `filter` call but nowhere else. All we want to tell the `map` function is that "given a value, add one", or `x + 1`, without having to define the `adder` function.

## Lambdas / Anonymous functions

In a situation where we just need some functionality that we could put in a function (like `adder` above), we can make use of a language feature called *anonymous functions*. In Python and some other languages, this is referred to as a *lambda function*. We can fix our non-working example from above by simply adding `lambda x: ` in front of the `x + 1` which we want to pass into the `map` function:

In [None]:
values = [1, 3, 5]
print(list(  map(lambda x: x + 1, values)   ))    # the lambda acts the same as 'adder'

As you can see, it works exactly the same as if we were to pass in the `adder` function. However, instead of implementing `adder` as a separate, *named function*, we simply defined a lambda as the first parameter to `map`. The general syntax for lambdas in Python is:

```python
lambda arguments: expression
```
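A lambda may take any number of arguments, but its body is always a single expression whose value is returned automatically. A few illustrative variations follow; the names `add`, `constant` and `describe` are made up for this sketch and are only assigned so that the lambdas can be called here.

```python
add = lambda a, b: a + b                                   # two arguments
constant = lambda: 42                                      # no arguments at all
describe = lambda n: "even" if n % 2 == 0 else "odd"       # a conditional expression as body

print(add(2, 3))
print(constant())
print(describe(7))
```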

`lambda` is really just another way to define functions. We could define a lambda function and then assign it to a variable. We would then have ended up with a regular function:

In [None]:
adder = lambda x: x + 1       # this is a bit silly
adder(3)

In [None]:
def adder(x):                 # identical
    return x + 1
adder(3)

However, if we do not assign the lambda to a variable, that means we cannot refer to it anywhere else (hence it is "anonymous"). Lambdas are a popular choice when defining ad-hoc functionality that is really only needed in place, such as
 * changing each value when calling `map`
 * filtering criteria when calling `filter`
 * sorting criteria  when calling `sorted` or `.sort()`

Let's look at some more examples.

In [None]:
# Filter list, tuple, dict from a collection 
values = [0, None, False, "", 42, "hello", [], [1,2], {}]
list(  filter(lambda x: isinstance(x, (list, tuple, dict)), values)   )  # instead of defining the `criteria` function

In [None]:
# Transform each string into a 2-tuple containing the length and upper-case version
words = ['hello', 'world', 'python']
list(  map(lambda x: (len(x), x.upper()), words)                      )

In [None]:
# Filter strings containing only whitespace
l = ["  ", "hello", " ", "world", "\n", "\t"]
list(  filter(str.isspace, l)                )    # no lambda needed

In [None]:
# Filter strings NOT containing only whitespace
l = ["  ", "hello", " ", "world", "\n", "\t"]
list(  filter(lambda x: not x.isspace(), l)  )    # lambda to invert the condition

Here's a function which simply compares if two functions produce the same result for a given argument:

In [None]:
def same_result(f, g, it):
    return f(it) == g(it)
print( same_result(len, sum, [1, 1]) )
print( same_result(str.title, str.upper, "world") )
print( same_result(bool, str.isdigit, "1") )

Now that you know about lambdas, you would also be able to use this function like so:

In [None]:
print( same_result(abs, lambda x: x if x > 0 else -x, -100) )

You do not necessarily need to use lambdas, but you must be able to recognize what they do and how they work.

## `sorted` and `.sort`

Another function which works rather similarly to `map` and `filter` is `sorted`. In the simplest case, it just takes a collection and returns a new, sorted list, ordered according to the comparison operators (`__lt__` etc.) of the elements. Note that `sorted` does **not** modify the input collection. Here, it sorts a list of strings alphabetically (because that's how ordering of strings is implemented):

In [None]:
words = ["python", "c", "java", "javascript", "ada"]
sorted(words)

However, `sorted` can also take an optional keyword argument `key`:

In [None]:
words = ["python", "c", "java", "javascript", "ada"]
sorted(words, key=len)

As you can see, `key` is also expected to be a function (just like the first parameter to `map` or `filter`). In the example above, the function is `len`, which takes a collection as an argument and returns a number. `sorted` then uses that number to order the values in the collection. Thus, it sorts the strings by length, instead of alphabetically.

This is another case where `lambda` comes in handy. Say we want to sort students by their grade:

In [None]:
students = [('Alice', 5.75), ('Bob', 3.75), ('Charlie', 5.00)]
sorted(students, key=lambda x: x[1]) # does not modify students but returns a new collection

Some collections also provide a method `.sort`. It works the same, but actually modifies the collection in-place:

In [None]:
students = [('Alice', 5.75), ('Bob', 3.75), ('Charlie', 5.00)]
print(students)
students.sort(key=lambda x: x[1])   # sorts in-place, modifying students
print(students)

It's best to be familiar with `sorted` and `.sort` because it's such a common thing to do.
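Two further details are worth a quick sketch: `key` may return a tuple to sort by several criteria at once, and `reverse=True` flips the order. Also note that `.sort()` returns `None`, so its result should not be assigned. The word list is extended by `"rust"` so that two words share a length; `by_length_then_alpha` and `best_first` are names chosen just for this illustration.

```python
words = ["python", "rust", "c", "java", "javascript", "ada"]
students = [('Alice', 5.75), ('Bob', 3.75), ('Charlie', 5.00)]

# sort by length first, then alphabetically among words of equal length
by_length_then_alpha = sorted(words, key=lambda w: (len(w), w))
print(by_length_then_alpha)

# highest grade first
best_first = sorted(students, key=lambda s: s[1], reverse=True)
print(best_first)

result = words.sort()      # sorts in-place...
print(result)              # ...and returns None
print(words)
```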

## `continue` and `break`

`for` loops usually are intended to run for each element in a collection, and `while` loops usually run until the condition becomes false:

In [None]:
# for loop typically iterates for each element
for i in ["Bob", "Anne", "Alice"]:
    print(i)

# while loop typically iterates until the condition is false
n = 3
while n > 0:
    print(n)
    n -= 1

When using a `for` or `while` loop, you have some additional ways to control the flow of execution.

 * `continue` will end the current iteration of the loop and jump back to the beginning to execute the next iteration (if there is any)
 * `break` will end the iteration entirely

In the following example, where a `for` loop iterates over each element in a list of strings, `continue` is used to jump to the next iteration if the string is essentially empty:

In [None]:
responses = [
    "Great service!",
    "",
    "   ",
    "Food was cold",
    "\n",
    "Will come back again"
]

for feedback in responses:
    if not feedback.strip():          # if, after stripping whitespace, there's nothing left, this is an invalid feedback
        continue                      # go to the next iteration
    print(feedback)

Similarly, the following example uses `break` to completely end the loop early under certain conditions:

In [None]:
responses = [
    "Good service",
    "Waiter was friendly",
    "Found hair in my food!",
    "Nice atmosphere",
    "Enjoyed the music"
]
for feedback in responses:    
    if "hair" in feedback.lower() or "bug" in feedback.lower():
        print(f"ALERT: Critical issue found: {feedback} -- act immediately!")
        break                        # end the for loop right here
    print(feedback)

Note that in many cases, your functionality will be contained in functions. If that's the case, then it's quite likely that you can simply use an early `return` instead of `break`, unless you want to continue with some more code after the `for` loop.

In [None]:
def analyze_feedback(responses):
    for feedback in responses:    
        if "hair" in feedback.lower() or "bug" in feedback.lower():
            print(f"ALERT: Critical issue found: {feedback} -- act immediately!")
            return                   # end the function right here
        print(feedback)
analyze_feedback(responses)

## Data classes

It happens with some frequency that a class you create mostly exists as a simple object to store some data. Here is an example:

In [None]:
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

p1 = Person("Bob", 33)
print(p1)

To make this class behave nicely, we would probably also want to implement `__str__`, `__repr__` and maybe some other methods. However, since this use case is very common, Python provides a helper called `dataclass` to create such classes more easily:

In [None]:
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int

p1 = Person("Bob", 33)
print(p1)

To create a data class, we simply decorate the class with `@dataclass`. Then, we list the desired attributes, in this case `name: str` and `age: int`. For data classes, type annotations are mandatory. As you can see, `@dataclass` automatically creates `__init__`, `__repr__`, `__eq__`, and some other functionality - very convenient! It does not generate a separate `__str__`, but `print` falls back to `__repr__`, so instances still print nicely.

In [None]:
p2 = Person("Bob", 33)
p3 = Person("Alice", 33)
print(p1)          # print uses the automatically implemented __repr__ (no __str__ is generated)
print(p2)
print(p3)
print(p1 == p2)    # automatically implemented __eq__
print(p1 == p3)
print([p1, p3])    # automatically implemented __repr__
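To appreciate what `@dataclass` saves us from writing, here is a rough hand-written counterpart. This is a simplified sketch, not the exact code that `dataclass` generates; the class name `ManualPerson` is made up for this comparison.

```python
class ManualPerson:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):                     # also used by print(), as no __str__ is defined
        return f"ManualPerson(name={self.name!r}, age={self.age!r})"

    def __eq__(self, other):
        if not isinstance(other, ManualPerson):
            return NotImplemented           # let Python try the other operand
        return (self.name, self.age) == (other.name, other.age)

m1 = ManualPerson("Bob", 33)
m2 = ManualPerson("Bob", 33)
print(m1)
print(m1 == m2)
```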

It can even automatically implement other comparison operators, if you provide it with some additional arguments. Here, we say that instances of `Person` are "frozen" (i.e., immutable), and that we also want to automatically implement the comparison functions for ordering as well as `__hash__`:

In [None]:
@dataclass(order=True, frozen=True, unsafe_hash=True)
class Person:
    name: str
    age: int
    
p1 = Person("Bob", 33)
p3 = Person("Alice", 33)
print(hash(p1))
print(p1 > p3)

#p1.name = "Bobby"        # will generate an exception, because Persons are now immutable ("frozen")
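Here is a minimal sketch of what the commented-out line would do. Assigning to an attribute of a frozen data class raises `dataclasses.FrozenInstanceError`; the class name `FrozenPerson` is only used here to avoid clashing with the `Person` defined above.

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(order=True, frozen=True)
class FrozenPerson:
    name: str
    age: int

p = FrozenPerson("Bob", 33)
try:
    p.name = "Bobby"                      # not allowed on a frozen instance
except FrozenInstanceError as e:
    print("cannot modify:", e)

print(p)                                  # unchanged
print({p: "usable as a dictionary key"})  # frozen=True (with the default eq=True) makes instances hashable
```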

You can also provide default values in the data class definition, and you can still implement additional methods as usual:

In [None]:
@dataclass
class Person:
    name: str
    age: int = 0

    def reach_birthday(self):
        self.age += 1
        
p1 = Person("Bob")
p3 = Person("Alice", 33)
print(p1)
p1.reach_birthday()
print(p1)

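Just don't be confused by the syntax: `name` and `age` are NOT class attributes in the usual sense; they describe the instance attributes that the generated `__init__` will set. You *could* add real class attributes, but they must either have no type annotation at all or be annotated with `typing.ClassVar`: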

In [None]:
@dataclass
class Person:
    name: str
    age: int = 0

    example_class_attribute = "Like a Toyota's serial_counter, this attribute only exists once."
        
p1 = Person("Bob")
p3 = Person("Alice", 33)
print(p1.name == p3.name)
print(p1.example_class_attribute == p3.example_class_attribute == Person.example_class_attribute)
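If you do want to annotate a class attribute, the standard library offers `typing.ClassVar`, which tells `@dataclass` not to treat the attribute as a field. A small sketch follows; the class name `AnnotatedPerson` and the attribute `species` are invented for illustration.

```python
from dataclasses import dataclass, fields
from typing import ClassVar

@dataclass
class AnnotatedPerson:
    name: str
    age: int = 0
    species: ClassVar[str] = "Homo sapiens"       # annotated, but still a class attribute

p = AnnotatedPerson("Bob")
print(p)                                            # species is not part of the generated __repr__
print([f.name for f in fields(AnnotatedPerson)])    # only the real fields are listed
print(AnnotatedPerson.species)
```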

You do not need to use data classes, but you should be able to recognize them and know how they work.

## Working with files

Python can read and write files on your machine. When working with files **be careful not to accidentally overwrite or delete your work**. You can destroy your operating system and trash all your files if you're not careful.

Let's say we have some data that we want to save to a file. In the following example, we open a file `cities.csv` in *write mode* (`'w'`). We can then write text to this file. Note that opening a file in write mode **overwrites** `cities.csv` immediately: any existing content is discarded as soon as the file is opened. We decide to store the data using simple CSV (comma-separated values), where each value in the tuple will be separated by a comma `,` and each row by a newline `\n`.

In [None]:
cities = [
    (35.6839,  139.7744, "Tokyo"),
    (40.6943,  -73.9249, "New York"),
    (19.4333,  -99.1333, "Mexico City"),
    (18.9667,   72.8333, "Mumbai"),
    (-23.5504, -46.6339, "Sao Paulo"),
]

with open('cities.csv', 'w') as f:                          # open the file, referencing it as 'f'
    for line in cities:
        f.write(",".join([str(it) for it in line]) + '\n')  # write each row of our data

You can now open `cities.csv` in a text or spreadsheet editor to confirm that the file was indeed created. If you want to add data to a file, instead of overwriting it, you can specify *append mode* (`'a'`):

In [None]:
with open('cities.csv', 'a') as f:                          # append mode 'a' instead of 'w'
    f.write("47.3764,8.5432,Zurich\n")

Reading a file works similarly. Let's read the file back into a data structure using *read mode* (`'r'`):

In [None]:
cities = []

with open('cities.csv', 'r') as f:                              # read mode 'r'
    content = f.read()                                          # read the entire file
    for line in content.splitlines():                           # split on newlines
        row = line.strip().split(',')                           # split on commas
        lat, lon, name = float(row[0]), float(row[1]), row[2]   # convert the numbers from strings where necessary
        cities.append((lat, lon, name))                         # add the resulting tuple to the cities list

print(cities)

A couple of things to note:

 * The `with ... as ...:` statement works with a *context manager* (here, the file object returned by `open`). It ensures that when the `with` block ends, the file is properly closed. There are other context managers in Python, but for now, you only need to know this one for reading and writing files.
 * When reading or writing binary data, you may need to use the `"wb"`, `"ab"` and `"rb"` modes. We did not really discuss binary data. Just be aware of this.
 * Above, we used `f.read()` to read the entire file content and then we split it. You could also use `f.readlines()` or `list(f)` to obtain the lines directly.
 * If you have a very large file, reading all of it at once may not be possible. You can use `f.readline()` to read just one line or `f.read(n)` to read `n` characters (or bytes when in binary mode). You can also move to a different position in the file using `f.seek(...)`.
 * Python ships with many libraries to work with specific file types, such as [CSV](https://docs.python.org/3/library/csv.html), [ZIP files](https://docs.python.org/3/library/zipfile.html) or other archives, [emails](https://docs.python.org/3/library/email.html), [JSON](https://docs.python.org/3/library/json.html), [HTML](https://docs.python.org/3/library/html.parser.html#module-html.parser), [XML](https://docs.python.org/3/library/xml.html), [SQLite databases](https://docs.python.org/3/library/sqlite3.html), and many more. In many of those cases, you can provide `f` directly to the library, and it will deal with actually reading and writing the files.
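As a sketch of the last two points, the following example writes and reads the same kind of data with the built-in `csv` module and iterates over the file row by row instead of loading everything at once. To avoid touching any of your files, it works inside a temporary directory; the file name `cities_demo.csv` is arbitrary.

```python
import csv
import os
import tempfile

cities = [
    (35.6839, 139.7744, "Tokyo"),
    (47.3764, 8.5432, "Zurich"),
]

with tempfile.TemporaryDirectory() as folder:            # removed automatically afterwards
    path = os.path.join(folder, "cities_demo.csv")

    with open(path, "w", newline="") as f:               # newline="" is recommended by the csv docs
        csv.writer(f).writerows(cities)                  # the library does the joining for us

    loaded = []
    with open(path, "r", newline="") as f:
        for row in csv.reader(f):                        # yields one row (a list of strings) at a time
            lat, lon, name = float(row[0]), float(row[1]), row[2]
            loaded.append((lat, lon, name))

print(loaded)
```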

To learn all the more complex aspects of using files in Python, follow [the documentation](https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files).

## Duck typing

You might have heard that Python uses "Duck Typing". The saying goes:

*"If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck."*

The idea here is important to understand: When writing programs in a *pythonic* fashion, type checks are not really performed pre-emptively. Or in other words: **The *type* of an object is less important than its *behavior***.

In the following example, there is no explicit relationship between the three classes `Container`, `Bicycle` and `Goldbar`. The `Container` class simply assumes that all things it contains support the **same behavior**, namely that calling `.value()` on a stored thing will produce a number. A `Container` can store *anything* that exhibits this behavior:

In [None]:
class Container:
    def __init__(self):
        self.storage = []

    def store(self, thing):
        self.storage.append(thing)

    def total_value(self):
        return sum(thing.value() for thing in self.storage)    # Container assumes all things stored support the .value() behavior

class Bicycle:
    def __init__(self, make, model, price):
        self.make = make
        self.model = model
        self.price = price

    def value(self):
        return self.price

class Goldbar:
    def __init__(self, grams):
        self.grams = grams
        
    def value(self):
        from random import randrange
        return self.grams * randrange(50, 70)

container = Container()
bike = Bicycle("Trek", "Outrunner", 650)
bar = Goldbar(20)
container.store(bike)
container.store(bar)
print(container.total_value())

In other words:

*"If it has a `value()` method that produces a number, it must be something that can be stored"*

This allows for a lot of flexibility when implementing "open" programs. It stands in strong contrast to *statically-typed* languages like Java or C++, where the two classes `Bicycle` and `Goldbar` would likely need to inherit from something like `Storable` and where `Container` would only accept objects that inherit from `Storable`.
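The flip side of this flexibility is that a mismatch is only noticed when the missing behavior is actually used. In the following sketch, the class `Rock` is invented for illustration and deliberately lacks a `value()` method; `Container` and `Bicycle` are shortened copies of the classes above so that the example runs on its own.

```python
class Container:
    def __init__(self):
        self.storage = []

    def store(self, thing):
        self.storage.append(thing)

    def total_value(self):
        return sum(thing.value() for thing in self.storage)

class Bicycle:
    def __init__(self, price):
        self.price = price

    def value(self):
        return self.price

class Rock:                       # hypothetical class without a value() method
    pass

container = Container()
container.store(Bicycle(650))
container.store(Rock())           # storing works fine, nobody checks the type

try:
    container.total_value()       # fails only now, when .value() is called on the Rock
except AttributeError as e:
    print("not a duck:", e)
```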

Nonetheless, a Python programmer may decide to also implement a `Storable` class, just to make this relationship explicit. Naturally, this class may be *abstract* and specify that subclasses must implement `value`:

In [None]:
class Container:
    def __init__(self):
        self.storage: list['Storable'] = []

    def store(self, thing: 'Storable'):
        self.storage.append(thing)

    def total_value(self):
        return sum(thing.value() for thing in self.storage)

from abc import ABC, abstractmethod
class Storable(ABC):
    @abstractmethod
    def value(self):
        pass

class Bicycle(Storable):
    def __init__(self, make, model, price):
        self.make = make
        self.model = model
        self.price = price

    def value(self):
        return self.price

class Goldbar(Storable):
    def __init__(self, grams):
        self.grams = grams
        
    def value(self):
        from random import randrange
        return self.grams * randrange(50, 70)

container = Container()
bike = Bicycle("Trek", "Outrunner", 650)
bar = Goldbar(20)
container.store(bike)
container.store(bar)
print(container.total_value())
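What the abstract class buys us is an early error: Python refuses to instantiate `Storable` itself or any subclass that does not implement `value`. A minimal sketch follows; `Brick` is a made-up class that forgets to implement `value`, and `Coin` is a made-up one that does.

```python
from abc import ABC, abstractmethod

class Storable(ABC):
    @abstractmethod
    def value(self):
        pass

class Brick(Storable):            # hypothetical subclass that forgets to implement value()
    pass

class Coin(Storable):             # hypothetical subclass that implements value()
    def value(self):
        return 1

for cls in (Storable, Brick, Coin):
    try:
        cls()
        print(cls.__name__, "can be instantiated")
    except TypeError as e:
        print(cls.__name__, "cannot be instantiated:", e)
```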

## Multiple inheritance

You already know that if a class `Sub` inherits from a class `Super`, then `Sub` inherits ("copy-pastes") all functionality from `Super`. `Sub` can then override certain attributes, or may have to provide some explicitly (those inherited as `@abstractmethod`).

In the following example, `TextFile`, `ImageFile`, and `Folder` all inherit from `File`. They each inherit `__init__` but then override it with their own implementation, each of which calls `super().__init__`.

In [None]:
class File:
    def __init__(self, filename):
        self.filename = filename

class TextFile(File):
    def __init__(self, filename, text):
        super().__init__(filename)
        self.text = text

class ImageFile(File):
    def __init__(self, filename, pixels):
        super().__init__(filename)
        self.pixels = pixels
                         
class Folder(File):
    def __init__(self, filename):
        super().__init__(filename)
        self.content = []

Let's consider the following problem:

Email is specified in [RFC 5322](https://datatracker.ietf.org/doc/html/rfc5322#section-2.1) as a text-only medium. Seriously! An email message may only contain US-ASCII (ANSI.X3-4.1986) characters. However...
  * your email client allows you to attach pictures or PDFs when sending a message
  * you can use umlauts and other non-ASCII characters
  * when you receive a message, it might come as more than just plain text, like colors and styling

How is this possible? Here is what an email looks like when it contains an image attachment:

```
From: sender@example.com
To: recipient@example.com
Subject: =?UTF-8?Q?Gr=C3=BC=C3=9Fe_mit_Uml=C3=A4uten?=
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="boundary123"

--boundary123
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Gr=C3=BC=C3=9Fe aus M=C3=BCnchen! Hier ist ein sch=C3=B6nes Bild f=C3=BCr Sie=

--boundary123
Content-Type: image/gif
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="g.gif"

R0lGODdhAwADAIABAAAAAP///ywAAAAAAwADAAACBAyCoFAAOw=

--boundary123--
```

As you can see, the subject and the message, as well as the image, get encoded as text somehow. The `Content-Transfer-Encoding` fields indicate that the text is encoded using a strategy called `quoted-printable` and the image is encoded as `base64`.

Taking some data and turning it into strings is a common idea called **serialization**. You *serialize* a value when turning it into some kind of string, and you *deserialize* the string to end up with the original data again. This is what your email client does when sending and receiving emails.
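Python's standard library can perform both encodings seen in the email above. The following sketch uses the `base64` and `quopri` modules; the byte string `raw` is arbitrary example data, not the GIF from the message.

```python
import base64
import quopri

# base64: arbitrary bytes -> ASCII-only text -> the same bytes again
raw = bytes([0, 255, 16, 128])
encoded = base64.b64encode(raw).decode("ascii")     # serialize
print(encoded)
print(base64.b64decode(encoded) == raw)             # deserialize and compare with the original

# quoted-printable: decode part of the message body shown above
qp = b"Gr=C3=BC=C3=9Fe aus M=C3=BCnchen!"
print(quopri.decodestring(qp).decode("utf-8"))
```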

In our example, we may define a new interface for classes which support serialization. In Python, we'll do this by defining an abstract class `Serializable`. Here it is with a few type annotations.

Side note: to indicate that `deserialize` returns a `Serializable`, we want to add it as a type hint to the function, i.e. `-> Serializable`. But because this happens *inside* `Serializable`, the name is not really defined yet. Because of this, Python's type hints support using a string instead of a type (a so-called *forward reference*), i.e. `-> 'Serializable'`, which will be resolved once the class is fully defined.

In [None]:
from abc import ABC, abstractmethod

class Serializable(ABC):
    @abstractmethod
    def serialize(self) -> str:                     # serialize turns the object into a string representation
        pass

    @staticmethod                                   # static, because there is no object before deserialization
    @abstractmethod
    def deserialize(data: str) -> 'Serializable':   # deserialize turns a string into an object
        pass

In our example, we may not be able to support serialization for all file types. We can't assume that every `File` is serializable. Also, `Folder`s are probably more difficult to serialize. For now, we'll only support serialization for `TextFile` and `ImageFile`. To do this we:

 1. make them inherit from `Serializable`
 2. implement `serialize` and `deserialize`. We will choose an appropriate way to encode the text or image data as a string.

In [None]:
from abc import ABC, abstractmethod

class Serializable(ABC):
    @abstractmethod
    def serialize(self) -> str:                     # serialize turns the object into a string representation
        pass

    @staticmethod                                   # static, because there is no object before deserialization
    @abstractmethod
    def deserialize(data: str) -> 'Serializable':   # deserialize turns a string into an object
        pass

class File:
    def __init__(self, filename):
        self.filename = filename

class TextFile(File, Serializable):                 # TextFile now inherits from TWO classes, first File, then Serializable
    def __init__(self, filename, text):
        super().__init__(filename)
        self.text = text

    def __str__(self):
        return f"{self.filename}:\n{self.text}"

    def serialize(self):
        return f"{self.filename};{self.text}"       # serialization: it's easy: we just separate the file name and data by a semicolon

    @staticmethod
    def deserialize(data):
        name, text = data.split(';', 1)             # deserialization: split on the FIRST semicolon only, so the text itself may contain semicolons
        return TextFile(name, text)                 # construct the TextFile instance

class ImageFile(File, Serializable):                # ImageFile now inherits from TWO classes, first File, then Serializable
    def __init__(self, filename, pixels):
        super().__init__(filename)
        self.pixels = pixels

    def __str__(self):
        res = '\n'.join(["".join(["░" if cell == 0 else "█" for cell in row]) for row in self.pixels])
        return f"{self.filename}:\n{res}"

    def serialize(self):
        # separate file name by ; rows by , and pixels by .
        return f"{self.filename};" + ",".join(".".join(str(cell) for cell in row) for row in self.pixels)

    @staticmethod
    def deserialize(data):
        name, pixels = data.split(';')              # filename and data
        rows = pixels.split(',')                    # each row of pixels
        image = tuple(tuple(int(p) for p in row.split('.')) for row in rows)   # build real tuples: a generator could only be iterated once
        return ImageFile(name, image)               # construct the ImageFile instance
                         
class Folder(File):
    def __init__(self, filename):
        super().__init__(filename)
        self.content = []

t1 = TextFile("haiku.txt",
"""Indentation woes
Syntax errors multiply
Debug, try again""")

i1 = ImageFile("smile.img", (
    (0, 1, 0, 1, 0),
    (0, 0, 0, 0, 0),
    (1, 0, 0, 0, 1),
    (0, 1, 1, 1, 0),
))

t1_serialized = t1.serialize()                       # t1_serialized is now a string that contains all necessary data to restore t1
print(f"Serialized: {t1_serialized}")
t1_loaded = TextFile.deserialize(t1_serialized)      # we can create an identical TextFile from the data
print(t1_loaded)

i1_serialized = i1.serialize()                       # i1_serialized is now a string that contains all necessary data to restore i1
with open("i1.txt", "w") as f:                       # we can store the serialized representation in a file
    f.write(i1_serialized)
with open("i1.txt", "r") as f:                       # we can load the serialized representation from a file
    i1_from_file = ImageFile.deserialize(f.read())
print(i1_from_file)

This is a practical example, where the additional super-class `Serializable` marks its descendants as supporting specific types of operations, i.e. being serialized into and reconstructed from a string. Unlike Python, many programming languages do not support multiple inheritance, but they usually support "implementing multiple interfaces" (`TextFile` could implement both the `File` and `Serializable` interfaces) or "including multiple mixins" (`TextFile` could inherit from `File` but include `Serializable` as a mixin).
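
Because `Serializable` acts as a marker, other code can check for it with `isinstance`. The sketch below uses reduced versions of the classes above. The helper `serialize_all` is hypothetical and not part of the lecture code.

```python
from abc import ABC, abstractmethod

class Serializable(ABC):
    @abstractmethod
    def serialize(self) -> str:
        pass

class File:
    def __init__(self, filename):
        self.filename = filename

class TextFile(File, Serializable):
    def __init__(self, filename, text):
        super().__init__(filename)
        self.text = text

    def serialize(self):
        return f"{self.filename};{self.text}"

class Folder(File):                      # a File, but NOT Serializable
    pass

def serialize_all(files):                # hypothetical helper
    """Serialize every file that supports it and skip the rest."""
    return [f.serialize() for f in files if isinstance(f, Serializable)]

files = [TextFile("a.txt", "hello"), Folder("docs"), TextFile("b.txt", "world")]
print(serialize_all(files))              # the Folder is skipped
```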



`Serializable` itself only demands that two methods are implemented without providing any functionality on its own. But additional superclasses don't have to be abstract and could also just provide ready-made implementations. In the most trivial case, it could be entirely unrelated functionality:

In [None]:
class DiscoMode:
    def party(self):
        print("🎈🎉 💃 ✨ 🕺 🎈 💃 ✨ 🕺 🎉🎈")
        
class File:
    def __init__(self, filename):
        self.filename = filename
        
class TextFile(File, DiscoMode):
    def __init__(self, filename, text):
        super().__init__(filename)
        self.text = text

    def __str__(self):
        return f"{self.filename}:\n{self.text}"

t1 = TextFile("password.txt", "hunter2")
t1.party()

But an additional superclass could also make some assumptions about the classes which will inherit from it in order to provide ready-made functionality. In the following example, `Rot13` provides a method `encode`, which returns the `self.text` attribute encoded using rot13. `Rot13` does not set `self.text` by itself! It does not provide a constructor at all, just the `encode` method. Any class that has a `self.text` string attribute could inherit from `Rot13` to get this functionality *"mixed in"*. In many languages, this is called a "mixin". Strictly speaking, mixins are a language feature where functionality is added *without inheritance*, so in Python this is not a true mixin, but it illustrates how a mixin could be used in other languages:

In [None]:
class Rot13:
    def encode(self):
        import codecs
        return codecs.encode(self.text, "rot13")   # Rot13 could be mixed in with any class that has a self.text string attribute!

class File:
    def __init__(self, filename):
        self.filename = filename
        
class TextFile(File, Rot13):                       # Rot13 is technically a superclass, but semantically a mixin
    def __init__(self, filename, text):
        super().__init__(filename)
        self.text = text

    def __str__(self):
        return f"{self.filename}:\n{self.text}"

from dataclasses import dataclass
@dataclass
class Book(Rot13):                                 # The Book class has nothing to do with files, but the Rot13 mixin is compatible!
    author: str
    title: str
    text: str
    
t1 = TextFile("password.txt", "hunter2")
print(t1.encode())
b1 = Book("George Orwell", "Animal Farm", "Mr. Jones, of the Manor Farm, had locked the hen-houses...")
print(b1.encode())
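
The assumption that `self.text` exists is not checked anywhere. As a small sketch of what happens when it is violated, the hypothetical class `Point` below inherits from `Rot13` without having a `text` attribute:

```python
import codecs

class Rot13:
    def encode(self):
        return codecs.encode(self.text, "rot13")

class Point(Rot13):                      # hypothetical class WITHOUT a text attribute
    def __init__(self, x, y):
        self.x = x
        self.y = y

try:
    Point(1, 2).encode()                 # Rot13.encode tries to read self.text
except AttributeError as e:
    print(f"Rot13 is not compatible with Point: {e}")
```

The error only shows up when `encode` is actually called, not when the class is defined.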

To summarize, so far we've seen the following examples for how multiple inheritance might be used:

 * Marking classes as supporting a certain interface (like `Serializable`), where the inheriting classes must provide implementations.
 * Adding unrelated functionality (`DiscoMode`) to a class.
 * Adding functionality to certain compatible classes (`Rot13` is compatible with `TextFile` and `Book`, even though there is no semantic "family relationship" between `TextFile` and `Book`).

In the use cases above, there is little risk of problems when inheriting from multiple super-classes (or in other languages: implementing multiple interfaces or including multiple mixins).

But what happens if there is an overlap between attribute names in different super classes? In the following example, `Book` inherits from both `Rot13` and `Screamer`, both of which provide a method called `encode`. When this happens and we call `encode` on a `Book` instance, Python's *Method Resolution Order (MRO)* decides which one is used: the method of whichever superclass is listed *first* in the class definition:

In [None]:
class Rot13:
    def encode(self):
        import codecs
        return codecs.encode(self.text, "rot13")

class Screamer:
    def encode(self):
        return self.text.upper()
        
from dataclasses import dataclass
@dataclass
class Book(Rot13, Screamer):
    author: str
    title: str
    text: str

b1 = Book("George Orwell", "Animal Farm", "Mr. Jones, of the Manor Farm, had locked the hen-houses...")
print(b1.encode())  # calls Rot13.encode, because that was inherited from FIRST!
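
The order in which Python searches the classes can be inspected via the `__mro__` attribute. The sketch below uses simplified stand-ins for `Rot13` and `Screamer` that just return their own name. `LoudBook` is a hypothetical variant with the superclasses swapped.

```python
class Rot13:
    def encode(self):
        return "rot13"

class Screamer:
    def encode(self):
        return "screamer"

class Book(Rot13, Screamer):
    pass

class LoudBook(Screamer, Rot13):         # hypothetical variant: same parents, swapped order
    pass

print([c.__name__ for c in Book.__mro__])        # search order for attributes of Book
print([c.__name__ for c in LoudBook.__mro__])    # search order for attributes of LoudBook
print(Book().encode(), LoudBook().encode())
```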

If we wanted to use `Screamer`'s method, we would have to call the method by naming the class explicitly:

In [None]:
print(Screamer.encode(b1))

The same is true when calling `super()` methods. Below is an example where `House` inherits from both `Taxable` and `Insurable`. There are two problems with this:
 1. both superclasses provide a method called `calculate`, so which one to use when calling `house.calculate()`?
 2. when creating a `House`, we need to call **both** super constructors, but `super().__init__` can only be one of them. Which one?

Here, Python will pick the `Taxable` implementations of `calculate` and `__init__` because `House` inherits from `Taxable` before `Insurable`. We solve the two problems as follows:
 1. If we want the `Insurable` implementation, we must call `Insurable.calculate(house)`
 2. Instead of `super().__init__` we refer to `Taxable.__init__` and `Insurable.__init__` directly. Note that now we have to also provide `self` as an argument!

In [None]:
class Taxable:
    def __init__(self, value, tax_rate):
        self.value = value
        self.tax_rate = tax_rate

    def calculate(self):
        return self.value * self.tax_rate

class Insurable:
    def __init__(self, value, copay):
        self.value = value
        self.copay = copay

    def calculate(self):
        return self.value - self.copay
        
class House(Taxable, Insurable):
    def __init__(self, address, value, tax_rate, copay):
        Taxable.__init__(self, value, tax_rate)               # instead of using super().__init__
        Insurable.__init__(self, value, copay)
        self.address = address

house = House(
    "112 Mercer Street, Princeton, New Jersey 08540",
    1650000,
    0.0214,
    8200
)

print(house.calculate())                                      # same as Taxable.calculate(house)
print(Taxable.calculate(house))
print(Insurable.calculate(house))
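
To see why a single `super().__init__` call is not enough here, consider this reduced sketch of the same classes (without `calculate` and `address`). It assumes, as above, that the superclass constructors do not call `super().__init__` themselves.

```python
class Taxable:
    def __init__(self, value, tax_rate):
        self.value = value
        self.tax_rate = tax_rate

class Insurable:
    def __init__(self, value, copay):
        self.value = value
        self.copay = copay

class House(Taxable, Insurable):
    def __init__(self, value, tax_rate):
        super().__init__(value, tax_rate)    # resolves to Taxable.__init__ only

house = House(1650000, 0.0214)
print(hasattr(house, "tax_rate"))            # set by Taxable.__init__
print(hasattr(house, "copay"))               # Insurable.__init__ never ran
```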

As a general piece of advice, while inheritance can be a fun way to come up with seemingly elegant abstractions, consensus in the developer community is shifting to a preference for flat and simple inheritance structures.

Otherwise, you might end up with something like the Scala collections types (pre Scala 2.13):

<div>
<img src="scala.svg" alt="Scala types" width=800 />
</div>

The Scala developers have since simplified the types a bit:

<div>
<img src="scala2-13.svg" alt="Scala types" width=600/>
</div>

# Summary

#### `map`, `filter`, `sorted`, `.sort`
* `map` takes a function and applies it to each element in a collection. It returns an iterator over the transformed values (use `list(...)` to get a list).
* `filter` takes a function and applies it to each element in a collection to determine whether to keep or discard it. It returns an iterator over the values that were kept.
* `sorted` returns a new, ordered list. A function can optionally be provided as a `key=` argument. The function will be called for each element in the collection. The resulting values will be used to sort the collection.
* `.sort` sorts in-place.
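
A compact sketch of all four, using a made-up list of words:

```python
words = ["pear", "fig", "banana", "kiwi"]                 # hypothetical sample data

lengths = list(map(len, words))                           # apply len to every element
short = list(filter(lambda w: len(w) <= 4, words))        # keep only words with at most 4 letters
by_length = sorted(words, key=len)                        # new list, ordered by length
words.sort()                                              # sorts the original list in-place

print(lengths)
print(short)
print(by_length)
print(words)
```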

#### `lambda`
* Anonymous functions can be provided ad-hoc without creating a named function. This is often done when a function argument must be a function, but it is only used once. The syntax is `lambda arguments: expression`.

#### `continue`, `break`
* `continue` ends the current iteration of a `for` or `while` loop and goes to the beginning of the next (if any) iteration.
* `break` ends the execution of a `for` or `while` loop entirely.
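
A minimal sketch showing both in one loop, using made-up numbers:

```python
numbers = [3, -1, 4, 0, 5]           # hypothetical sample data
seen = []

for n in numbers:
    if n < 0:
        continue                     # skip negative numbers and go on with the next element
    if n == 0:
        break                        # stop the loop entirely at the first zero
    seen.append(n)

print(seen)                          # -1 was skipped, 5 was never reached
```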
  
#### `@dataclass`
* `@dataclass` as imported by `from dataclasses import dataclass` can be used to automatically generate constructors, str/repr and many other functions by annotating a class. The attributes of each instance are defined as type-annotated attributes. These are not to be confused with class attributes in normal classes.
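
As a short reminder of what gets generated, here is a sketch with a hypothetical `Point` class:

```python
from dataclasses import dataclass

@dataclass
class Point:                         # hypothetical example class
    x: int
    y: int = 0                       # type-annotated attribute with a default value

p = Point(3)                         # generated __init__
print(p)                             # generated __repr__
print(p == Point(3, 0))              # generated __eq__ compares attribute values
```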

#### Duck typing
* Duck typing refers to the idea that the type of an object is less important than its implemented behavior. As long as an object can provide the expected methods and/or attributes, it could be considered a "compatible type" without any explicit type declarations.
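
A minimal sketch of the idea. The classes and the helper `chorus` are made up for illustration:

```python
class Duck:
    def speak(self):
        return "Quack"

class Robot:                         # unrelated to Duck, but offers the same method
    def speak(self):
        return "Beep"

def chorus(things):                  # hypothetical helper: only requires a speak() method
    return [t.speak() for t in things]

print(chorus([Duck(), Robot()]))
```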

#### Multiple inheritance
* If a class inherits from multiple classes which provide an attribute with the same name, the one from the superclass listed first will be used.
* To access other super-implementations with the same name, explicitly naming the superclass is necessary.
* When having to call super-methods from multiple parents, `super()` will only ever provide the implementation of the superclass listed first.