# Session 17

[![Open and Execute in Google Colaboratory](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/astrojuanlu/ie-mbd-python-data-analysis-i/blob/main/sessions/Session%2017.ipynb)

- How functions encapsulate code for reuse
- User-defined functions
- Lambda functions

## How functions encapsulate code for reuse

As discussed at the very beginning,

> A function can be considered a "black box" that takes some inputs, returns some outputs, and encapsulates all the internal behavior so that it's "invisible".

![Function as a black box](../img/function-black-box.png)

You've been using functions (and methods) all the time. And the powerful thing is that you don't need to understand how they work, what variables they use, or anything of that sort. You pass an input, and receive an output. It's that simple!

By defining your own functions, you can build a small library of reusable behaviours, which forms the basis of any well-structured codebase.

## User-defined functions

To define custom Python functions, beyond those built-in or available from third-party libraries, use the `def` keyword:

In [None]:
def concatenate_strings(input_1, input_2):
    output = input_1 + input_2
    return output

concatenate_strings("This is", " the output")

<div class="alert alert-danger">Notice that the variables <code>input_1</code>, <code>input_2</code>, and <code>output</code> only exist <em>inside</em> the function!</div>

In [None]:
input_1

In [None]:
output
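As a minimal sketch of what the two cells above demonstrate, the snippet below catches the error instead of letting it stop execution. The names `result` and `message` are introduced here only for illustration.

```python
def concatenate_strings(input_1, input_2):
    output = input_1 + input_2
    return output


# The returned value is available through the name we assign it to...
result = concatenate_strings("This is", " the output")
print(result)

# ...but the function's local variable `output` is not visible out here
try:
    output
except NameError as error:
    message = str(error)
    print(message)
```

The only way to get a value out of the function is through its `return` statement.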

### Returning several outputs

Can a function return several outputs? Yes, by returning a sequence (typically a tuple). First, see how sequence unpacking works:

In [None]:
a, b = 1, 2
print(a)
print(b)

Hence, you can leverage that with functions:

In [None]:
def return_1_2():  # No inputs
    return 1, 2

a, b = return_1_2()
print(a)
print(b)
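Strictly speaking, such a function still returns a single object: a tuple that is unpacked on the caller's side. The sketch below uses a hypothetical `min_and_max` helper to show both ways of using the result.

```python
def min_and_max(numbers):
    # The comma builds a tuple, so one object holding two values is returned
    return min(numbers), max(numbers)


# Keep the tuple as a whole...
result = min_and_max([3, 1, 2])
print(type(result))

# ...or unpack it into separate names
lowest, highest = min_and_max([3, 1, 2])
print(lowest, highest)
```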

### Passing parameters

So far, you have been calling functions without naming the arguments, but you can specify their names too. The rule is that, once you name an argument, all the following ones must be named as well:

In [None]:
def add_three_numbers(a, b, c):
    return a + b + c

In [None]:
add_three_numbers(1, 2, 3)

In [None]:
add_three_numbers(1, b=2, c=3)

In [None]:
add_three_numbers(a=1, 2, 3)  # Syntax error!
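Two more consequences of naming arguments, sketched with the same function: named arguments can appear in any order, and naming a parameter that was already filled by position raises a `TypeError` when the call runs. The names `total` and `message` are illustrative only.

```python
def add_three_numbers(a, b, c):
    return a + b + c


# Named arguments can be given in any order
total = add_three_numbers(c=3, a=1, b=2)
print(total)

# `a` already received the positional value 1, so naming it again fails
try:
    add_three_numbers(1, 2, a=3)
except TypeError as error:
    message = str(error)
    print(message)
```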

### Documenting

To document a function, add a string as the first statement of its body, right below the `def` line. This is called the function "docstring".

In [None]:
def add(a, b):
    "Add two numbers"
    return a + b

In [None]:
add?

Usually, multi-line strings are used, and it's good practice to declare the types of the parameters and the return value too:

In [None]:
def add(a: float, b: float) -> float:
    """Add two numbers

    Parameters
    ----------
    a : float
        First number.
    b : float
        Second number.

    """
    return a + b

In [None]:
add?
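The `add?` syntax only works in IPython and Jupyter. As a small sketch of what is available in plain Python, the docstring and the type declarations are stored as attributes of the function, and `help(add)` displays them. Note that the declared types are not checked when the function runs, which the last call illustrates.

```python
def add(a: float, b: float) -> float:
    """Add two numbers."""
    return a + b


# The docstring is stored in the function itself
print(add.__doc__)

# So are the declared types
print(add.__annotations__)

# Python does not enforce them: strings are accepted here without complaint
print(add("ab", "cd"))
```

Checking the declared types requires an external tool, such as a static type checker.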

## Exercises

### 1. Sum and product of numbers

Create a function `sum_sequence` that takes a sequence of numbers (a list, a tuple, a range) and returns the sum of all of them as a number. Don't use the built-in `sum` function in your implementation; use it only to cross-check your result.

Next, create a `prod_sequence` that takes a sequence of numbers and returns the product of all of them as a number. Cross-check your result with [`math.prod`](https://docs.python.org/3.10/library/math.html#math.prod).

### 2. Mean and median

Using your `sum_sequence` function defined above, create a function `mean_sequence` that returns the mean of a sequence of numbers (a list, a tuple, a range). Cross-check its result with `statistics.mean`.

Next, create a `median_sequence` that returns the median of a sequence of numbers. Cross-check its result with `statistics.median`.

## `lambda` (inline) functions

Inline functions are created with the `lambda` keyword. They behave like normal functions, with one limitation: their body must be a single expression, so they cannot contain statements (such as `if` blocks or loops). The value of that expression is what the function returns, without a `return` keyword.

In [None]:
def add(a, b):
    return a + b

add(1, 2.0)

In [None]:
add = lambda a, b: a + b

add(1, 2.0)
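Both definitions produce the same kind of object. The sketch below, which uses the illustrative name `add_inline` to keep both versions side by side, shows one visible difference: a `lambda` has no name of its own, which makes error messages less informative.

```python
def add(a, b):
    return a + b


add_inline = lambda a, b: a + b

# Both are ordinary function objects and give the same result
print(type(add) is type(add_inline))
print(add(1, 2.0) == add_inline(1, 2.0))

# Only the `def` version remembers its name
print(add.__name__)
print(add_inline.__name__)
```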

However, `lambda` functions are not usually stored in variables: they are used when quick, small functions are required and we don't want to bother defining a normal function. For example:

In [None]:
seq = [("Hello", 0), ("World", 1), ("!", 2)]
sorted(seq)

In [None]:
# Works, because tuples, like all sequences, are compared lexicographically
("!", 2) < ("World", 1)
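To make the comparison rule concrete, here is a small sketch: the first elements are compared first, and the second elements only matter when the first ones are equal. This is why the default sort orders `seq` by its strings.

```python
seq = [("Hello", 0), ("World", 1), ("!", 2)]

# Strings are compared character by character, and "!" sorts before letters
print("!" < "Hello" < "World")

# The first elements differ, so the numbers are never looked at
print(("!", 2) < ("World", 1))

# The first elements are equal, so the second elements decide
print(("a", 2) < ("a", 10))

print(sorted(seq))
```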

In [None]:
# If I want to sort by number
# (the parameter is called `item` to avoid confusion with the list `seq`):
sorted(seq, key=lambda item: item[1])

Let's unpack that:

In [None]:
def get_second_element(seq):
    return seq[1]

get_second_element(("Hello", 10))

In [None]:
sorted(seq, key=get_second_element)
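As a further sketch, the example below uses data that is not already ordered by number, so the effect of `key` is easier to see. It also shows `operator.itemgetter` from the standard library as an alternative to writing the `lambda` by hand, and the `reverse` parameter of `sorted`. The variable names are illustrative.

```python
from operator import itemgetter

seq = [("Hello", 2), ("World", 0), ("!", 1)]

# The key function is called on each element, and the results are compared
by_lambda = sorted(seq, key=lambda item: item[1])
print(by_lambda)

# itemgetter(1) builds a function equivalent to `lambda item: item[1]`
by_itemgetter = sorted(seq, key=itemgetter(1))
print(by_itemgetter)

# The same key, from the largest number to the smallest
descending = sorted(seq, key=lambda item: item[1], reverse=True)
print(descending)
```

In all three cases `sorted` returns a new list and leaves `seq` untouched.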

<div class="alert alert-info">To add conditionals and loops to <code>lambda</code> functions, you can use <a href="https://docs.python.org/3/reference/expressions.html">conditional expressions</a> and <a href="https://docs.python.org/3/reference/expressions.html">comprehensions</a> respectively. But it's better to keep <code>lambda</code> functions as short as possible, or they become too difficult to understand and debug.</div>
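A minimal sketch of both techniques, using the hypothetical names `sign` and `squares`. They are stored in variables here only so that they can be called several times.

```python
# A conditional expression plays the role of an if/else statement
sign = lambda x: "negative" if x < 0 else "non-negative"

print(sign(-1))
print(sign(5))

# A list comprehension plays the role of a for loop
squares = lambda numbers: [n**2 for n in numbers]

print(squares(range(4)))
```

Anything longer than this is usually easier to read as a regular `def` function.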

## Exercises

### 3. Find the string with the maximum length

You have a list of strings. Write a function that returns the longest one.

```python
def longest_string(list_of_strings: list[str]) -> str:
    ...

longest_string(["c", "bb", "aaa"]) == "aaa"
```

### 4. Clean tenders data

Read the tenders data into a pandas DataFrame. Create a function `clean_tenders_data` that takes the data in its original form and returns a new DataFrame with the following modifications:

- Instead of the `type` column, a `type_slug` column that contains the `slug` property of `type`
- Instead of the `awarded` column, an `awarded_date` column that contains the `date` property of the first dictionary of every list in the `awarded` column
- Instead of a `purchaser` column, a `purchaser_id` column that contains the `id` of the `purchaser` as an integer
- A `deadline_length_days` column converted to `float`
- The `id` column used as index

In [None]:
TENDERS_DATA_URL = (
    "https://github.com/astrojuanlu/ie-mbd-python-data-analysis-i/"
    "raw/main/data/tenders.es.json"
)