# Introduction to Python for Data Science
### Tomasz Rodak
## Lab VII

2024/2025, winter semester

---

## Literature


* [The Python Tutorial](https://docs.python.org/3/tutorial/index.html)
* [Dive Into Python 3](https://diveintopython3.net/index.html)
* [Automate the Boring Stuff with Python](https://automatetheboringstuff.com/)
* [Python 3 documentation](https://docs.python.org/3/index.html)



## Higher order functions

A higher order function is a function that takes another function as an argument or returns a function as a result. In Python, functions are first-class objects, which means that they can be passed around as arguments to other functions, returned as values from other functions, and assigned to variables or stored in data structures.
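As a minimal sketch of both halves of this definition, the helpers `apply_twice` and `make_multiplier` below (hypothetical names, chosen only for illustration) take a function as an argument and return a function as a result, respectively.

```python
def apply_twice(func, x):
    """Take a function as an argument and apply it two times to x."""
    return func(func(x))


def make_multiplier(n):
    """Return a new function that multiplies its argument by n."""
    def multiply(x):
        return n * x
    return multiply


triple = make_multiplier(3)       # a function assigned to a variable
print(apply_twice(triple, 2))     # 18, because 3 * (3 * 2) == 18

# Functions can also be stored in data structures such as lists.
operations = [abs, triple, str]
print([op(-4) for op in operations])   # [4, -12, '-4']
```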

## Built-in higher order functions

These are some of the built-in higher order functions in Python:

* `max()` and `min()` with a key function
* `sorted()` with a key function
* `list.sort()` with a key function
* `filter()` with a predicate function
* `map()` with a transformation function

All of these functions take another function as an argument. In the case of `max()`, `min()`, `sorted()`, and `list.sort()` this argument is optional; in the case of `filter()` and `map()` it is required.

### `max()` and `min()` with a key function

The `max()` and `min()` functions return the largest and the smallest element of an iterable, respectively. If the iterable is empty, a `ValueError` is raised unless the keyword argument `default` is supplied. The `key` argument is a one-argument function that is called on each element of the iterable; the elements are compared by the values it returns, but the original element is the one returned. The default value of the `key` argument is `None`, which means that the elements are compared directly.

Example:

```python
>>> max([1, 2, 3, 4, 5])
5
>>> def neg(x):
...     return -x

>>> max([1, 2, 3, 4, 5], key=neg)
1
>>> max("Lorem ipsum dolor sit amet".split())
'sit'
>>> max("Lorem ipsum dolor sit amet".split(), key=len)
'Lorem'
```
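Two details are easy to miss in the session above: how ties are resolved and what happens with an empty iterable. The sketch below illustrates both. The `default` keyword is part of the built-in `max()` and `min()`, and the variable name `words` is only illustrative.

```python
words = "Lorem ipsum dolor sit amet".split()

# Three words have length 5; max() returns the first one encountered.
longest = max(words, key=len)
print(longest)                    # Lorem

# min() also returns the first element that has the smallest key.
shortest = min(words, key=len)
print(shortest)                   # sit

# An empty iterable raises ValueError unless `default` is supplied.
try:
    max([], key=len)
except ValueError as exc:
    print("ValueError:", exc)

fallback = max([], key=len, default=None)
print(fallback)                   # None
```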





### Exercise 7.1

Create a function called `count_vowels(s: str) -> int` that counts and returns the number of vowels in a given string `s`. The vowels are `a`, `e`, `i`, `o`, and `u`.
Use this `count_vowels()` function in combination with the `max()` function and its `key` argument to identify the word from the list of Debian release names that contains the most vowels.

The list of Debian release names is as follows:
```python
>>> names = ["buzz", "rex", "bo", "hamm", "slink", "potato", "woody", "sarge",
...          "etch", "lenny", "squeeze", "wheezy", "jessie", "stretch", "buster",
...          "bullseye", "bookworm", "trixie", "sid"]
```

Additionally, find all words in the list that have the *fewest* vowels. Note that there may be more than one such word.

---

### `sorted()` and `list.sort()` with a key function

The key function serves the same purpose here as it does for `max()` and `min()`. The `sorted()` function returns a new list with the elements of the iterable sorted in ascending order. The `list.sort()` method sorts the list in place and returns `None`. Both accept an optional `reverse` argument (default `False`); if it is set to `True`, the elements are sorted in descending order. The `key` argument is also optional and defaults to `None`. If `key` is not `None`, the elements are sorted by the values that the key function returns for each element.

Example:

```python
>>> sorted([1, 2, 3, 4, 5])
[1, 2, 3, 4, 5]
>>> sorted([1, 2, 3, 4, 5], key=neg)
[5, 4, 3, 2, 1]
>>> sorted("Lorem ipsum dolor sit amet".split())
['Lorem', 'amet', 'dolor', 'ipsum', 'sit']
>>> sorted("Lorem ipsum dolor sit amet".split(), key=len)
['sit', 'amet', 'Lorem', 'ipsum', 'dolor']
>>> sorted("Lorem ipsum dolor sit amet".split(), key=len, reverse=True)
['Lorem', 'ipsum', 'dolor', 'amet', 'sit']
>>> a = [1, 3, 2, 5, 4]
>>> a.sort()
>>> a
[1, 2, 3, 4, 5]
>>> a.sort(key=neg)
>>> a
[5, 4, 3, 2, 1]
```
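Python's sort is stable, which explains why the three five-letter words above keep their original relative order when sorted by length. The following sketch shows this, together with a key function that returns a tuple to break ties. The helper `length_then_alpha` is a hypothetical name introduced only for this illustration.

```python
words = "Lorem ipsum dolor sit amet".split()

# sorted() builds a new list; the original is left untouched.
by_length = sorted(words, key=len)
print(words)       # ['Lorem', 'ipsum', 'dolor', 'sit', 'amet']
print(by_length)   # ['sit', 'amet', 'Lorem', 'ipsum', 'dolor']


def length_then_alpha(word):
    """Key: compare by length first, then alphabetically ignoring case."""
    return (len(word), word.lower())


print(sorted(words, key=length_then_alpha))
# ['sit', 'amet', 'dolor', 'ipsum', 'Lorem']

# list.sort() modifies the list in place and returns None.
result = words.sort(key=length_then_alpha)
print(result)      # None
print(words)       # ['sit', 'amet', 'dolor', 'ipsum', 'Lorem']
```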



### Exercise 7.2

Use the `sorted()` function with the `key` argument to sort the list of Debian release names by the number of vowels in each name. 

---

### Exercise 7.3

Create a list of Debian release names as provided earlier. Use the `list.sort()` method with the `key` argument to sort the names based on their length (number of characters).

Explain the difference between the `sorted()` function and the `list.sort()` method.

---

### `filter()` with a predicate function

The `filter(function, iterable)` function returns an iterator over the elements of the `iterable` for which the `function` returns a true value. The `function` is called with one argument, the element of the `iterable`. If `function` is `None`, the elements themselves are tested for truth.

Example:

```python
>>> def is_even(x):
...     return x % 2 == 0

>>> list(filter(is_even, range(10)))
[0, 2, 4, 6, 8]
>>> list(filter(bool, [0, 1, 2, 3, 4]))
[1, 2, 3, 4]
```
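Because `filter()` returns an iterator rather than a list, its elements are produced on demand and can be consumed only once. The sketch below illustrates this and also shows `None` passed in place of a function. The predicate `is_positive` is a hypothetical example.

```python
def is_positive(x):
    return x > 0


numbers = [-2, -1, 0, 1, 2]

positives = filter(is_positive, numbers)
print(next(positives))    # 1, elements are produced on demand
print(list(positives))    # [2], the first match was already consumed
print(list(positives))    # [], the iterator is now exhausted

# Passing None instead of a function keeps the truthy elements,
# which gives the same result as filter(bool, ...).
truthy = list(filter(None, ["a", "", "b", None, 0]))
print(truthy)             # ['a', 'b']
```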


### Exercise 7.4

Create a list of all Debian release names that contain the letter `o`. Use the `filter()` function with a predicate function to check if the letter `o` is present in each name.

Next, create a list of all Debian release names that have an even number of characters.

---

### `map()` with a transformation function

The `map(function, iterable)` function returns an iterator over the elements of the `iterable` after applying the `function` to each element. The `function` is called with one argument, the element of the `iterable`.

Example:

```python
>>> def square(x):
...     return x ** 2

>>> list(map(square, range(5)))
[0, 1, 4, 9, 16]

>>> list(map(str.upper, "Lorem ipsum dolor sit amet".split()))
['LOREM', 'IPSUM', 'DOLOR', 'SIT', 'AMET']
```

The `map()` function can be replaced by a list comprehension or a generator expression, which are generally considered more readable. However, the pattern of applying a function to each element of an iterable is so common that `map()` remains widely used. In code that uses multithreading or multiprocessing, the `map()` style may be the preferred or the only convenient option, because methods such as `ThreadPoolExecutor.map()` and `Pool.map()` follow the same calling convention.
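The comparison can be sketched as follows. The example applies the same `square()` function with `map()`, with a list comprehension, and with `concurrent.futures.ThreadPoolExecutor.map()` from the standard library. The choice of two worker threads is arbitrary and serves only as an illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    return x ** 2


numbers = range(5)

# Built-in map() and an equivalent list comprehension.
with_map = list(map(square, numbers))
with_comprehension = [square(x) for x in numbers]

# ThreadPoolExecutor.map() is called like map(), but runs the calls
# in worker threads; results are returned in the order of the inputs.
with ThreadPoolExecutor(max_workers=2) as executor:
    with_threads = list(executor.map(square, numbers))

print(with_map)             # [0, 1, 4, 9, 16]
print(with_comprehension)   # [0, 1, 4, 9, 16]
print(with_threads)         # [0, 1, 4, 9, 16]
```

For a function as cheap as `square()` the thread pool brings no speed-up; the point is only that the call looks the same.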

### Exercise 7.5

Create a list of the lengths of all Debian release names. Use the `map()` function with a transformation function to calculate the length of each name.

Do the same using a list comprehension.

---

### Exercise 7.6

Finish the implementation of the `distant_point(seq)` function.

```python
def distant_point(seq):
    """
    Given a sequence of points represented as tuples (x, y), return the point that is farthest from the origin (0, 0).

    The distance from the origin is calculated using the Euclidean distance formula:
        distance = sqrt(x^2 + y^2)

    Examples:
    >>> distant_point([(1, 1), (2, 3), (3, -3), (4, 4)])
    (4, 4)
    >>> distant_point([(1, 1), (2, 3), (3, -3), (4, 4), (0, 0)])
    (4, 4)
    """
    pass
```

---

### Exercise 7.7

Finish the implementation of the `sort_by_distance(seq, ascending=True)` function.

```python
def sort_by_distance(seq, ascending=True):
    """
    Given a sequence of points represented as tuples (x, y), return the sequence sorted by the distance from the origin (0, 0).

    The distance from the origin is calculated using the Euclidean distance formula:
        distance = sqrt(x^2 + y^2)

    The `ascending` argument determines the sorting order. If `ascending` is True, the sequence is sorted in ascending order, otherwise in descending order.

    Examples:
    >>> sort_by_distance([(1, 1), (2, 2), (3, 3), (4, 4)])
    [(1, 1), (2, 2), (3, 3), (4, 4)]
    >>> sort_by_distance([(1, 1), (-10, 10), (3, 3), (4, 4)])
    [(1, 1), (3, 3), (4, 4), (-10, 10)]
    >>> sort_by_distance([(1, 1), (-10, 10), (3, 3), (4, 4)], ascending=False)
    [(-10, 10), (4, 4), (3, 3), (1, 1)]
    """
    pass
```

---

### Exercise 7.8

Complete the implementation of the `sort_by_neglex(seq, ascending=True)` function.

```python
def sort_by_neglex(seq, ascending=True):
    """
    Sorts a sequence of strings or tuples based on the lexicographic order of the elements, 
    but treating each element as if it were reversed.

    - For strings, the reverse of the string is used for comparison.
    - For tuples, the reverse of the tuple (element order reversed) is used for comparison.

    The `ascending` parameter controls the sorting order:
    - If `ascending=True` (default), the sequence is sorted in ascending order.
    - If `ascending=False`, the sequence is sorted in descending order.

    Examples:
    >>> sort_by_neglex(["Lorem", "ipsum", "dolor", "sit", "amet"])
    ['Lorem', 'ipsum', 'dolor', 'amet', 'sit']
    >>> sort_by_neglex(["Lorem", "ipsum", "dolor", "sit", "amet"], ascending=False)
    ['sit', 'amet', 'dolor', 'ipsum', 'Lorem']
    >>> sort_by_neglex([(1, 2, 3), (3, 2, 1), (2, 3, 1), (1, 3, 2)])
    [(3, 2, 1), (2, 3, 1), (1, 3, 2), (1, 2, 3)]
    """
    pass
```

---

### `functools.reduce()`

The `functools.reduce(function, iterable[, initializer])` function applies the two-argument `function` cumulatively to the items of the `iterable`, from left to right, so as to reduce the iterable to a single value. The `initializer` argument is optional. If it is supplied, it is placed before the items of the `iterable` in the calculation and serves as the result when the `iterable` is empty. If it is not supplied and the `iterable` is empty, a `TypeError` is raised.

[Here](https://docs.python.org/3/library/functools.html#functools.reduce) you can find the documentation of the `functools.reduce()` function and an equivalent pure Python implementation.

Example:

```python
>>> from functools import reduce
>>> def add(x, y):
...     return x + y

>>> reduce(add, [1, 2, 3, 4, 5])
15
>>> reduce(add, [1, 2, 3, 4, 5], 10)
25
```
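To make the left-to-right evaluation order visible, the sketch below uses a function that is not commutative. The helper `append_digit` is a hypothetical name introduced only for this illustration.

```python
from functools import reduce


def append_digit(number, digit):
    """Append one decimal digit to the right of a number."""
    return number * 10 + digit


# Evaluated as append_digit(append_digit(1, 2), 3).
print(reduce(append_digit, [1, 2, 3]))       # 123

# With an initial value, the first call is append_digit(9, 1).
print(reduce(append_digit, [1, 2, 3], 9))    # 9123

# An empty iterable is allowed only when an initial value is given.
print(reduce(append_digit, [], 0))           # 0
try:
    reduce(append_digit, [])
except TypeError as exc:
    print("TypeError:", exc)
```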

### Exercise 7.9

Create a function called `product(seq)` that calculates the product of all elements in a given sequence `seq`. Use the `functools.reduce()` function to implement the `product()` function.

---

### `lambda` functions

A `lambda` function is a small anonymous function defined with the `lambda` keyword. It can take any number of arguments, but can only have one expression. `lambda` functions can be used wherever function objects are required.

Example:

```python
>>> f = lambda x: x ** 2
>>> f(5)
25
>>> g = lambda x, y: x + y
>>> g(3, 4)
7
```
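In practice a `lambda` is rarely assigned to a name; it is usually written directly where a function argument is expected. A minimal sketch, using a made-up list of `(name, count)` pairs:

```python
pairs = [("pear", 3), ("apple", 7), ("fig", 5)]

# Sort by the second item of each pair without defining a named function.
by_count = sorted(pairs, key=lambda pair: pair[1])
print(by_count)    # [('pear', 3), ('fig', 5), ('apple', 7)]

# A lambda can serve as the key of max() in the same way.
largest = max(pairs, key=lambda pair: pair[1])
print(largest)     # ('apple', 7)

# Here a lambda plays the role of the predicate for filter().
frequent = list(filter(lambda pair: pair[1] > 4, pairs))
print(frequent)    # [('apple', 7), ('fig', 5)]
```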

### Exercise 7.10

Use `functools.reduce()` and a `lambda` function to implement the `product()` function from the previous exercise.

---

### Exercise 7.11

Finish the implementation of the following functions. Make use of the `functools.reduce()` function and a `lambda` expression or a named function. Do not use a `for` loop or any built-in function or method that would make the implementation trivial.

```python
def concatenate(seq):
    """
    Given a sequence of strings, return a single string that is the concatenation of all strings in the sequence.

    Examples:
    >>> concatenate(["Lorem", "ipsum", "dolor", "sit", "amet"])
    'Loremipsumdolorsitamet'
    >>> concatenate(["a", "b", "c", "d", "e"])
    'abcde'
    """
    pass

def join(seq, sep=""):
    """
    Given a sequence of strings, return a single string that is the concatenation of all strings in the sequence, separated by the `sep` string.

    Examples:
    >>> join(["Lorem", "ipsum", "dolor", "sit", "amet"])
    'Loremipsumdolorsitamet'
    >>> join(["Lorem", "ipsum", "dolor", "sit", "amet"], " ")
    'Lorem ipsum dolor sit amet'
    >>> join(["a", "b", "c", "d", "e"], "-")
    'a-b-c-d-e'
    """
    pass

def longest_string(seq):
    """
    Given a sequence of strings, return the longest string.

    Examples:
    >>> longest_string(["Lorem", "ipsum", "dolor", "sit", "amet"])
    'Lorem'
    >>> longest_string(["a", "ab", "abc", "abcd", "abcde"])
    'abcde'
    """
    pass

def maximum(seq):
    """
    Given a sequence of comparable elements, return the maximum element.

    Examples:
    >>> maximum([1, 2, 3, 4, 5])
    5
    >>> maximum("Lorem ipsum dolor sit amet".split())
    'sit'
    """
    pass

def flatten(seq):
    """
    Given a sequence of sequences, return a single sequence that is the concatenation of all sequences in the input sequence.

    Examples:
    >>> flatten([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    [1, 2, 3, 4, 5, 6, 7, 8, 9]
    >>> flatten([(1, 2, 3), (4, 5, 6), (7, 8, 9)])
    (1, 2, 3, 4, 5, 6, 7, 8, 9)
    """
    pass
```

---