<img src="https://github.com/Center-for-Health-Data-Science/PythonTsunami/blob/spring2022/figures/HeaDS_logo_large_withTitle.png?raw=1" width="300">

<img src="https://github.com/Center-for-Health-Data-Science/PythonTsunami/blob/spring2022/figures/tsunami_logo.PNG?raw=1" width="600">

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Center-for-Health-Data-Science/PythonTsunami/blob/spring2022/Functions/Functions.ipynb)

# Encapsulating Code with Functions


## What is a function?

You can think of functions like 'mini-programs' that take some input and convert it into the desired output: 

**Input -> [Function] -> Output**

You have already used functions such as `print()` and `int()`. We call these functions _built-in_ since they already exist in Python.

We can also define our own functions using the keyword `def`! Here we have a user-defined function:

In [None]:
#define the function
def say_hi():
    print('Hi!')

User-defined functions always start with the keyword `def`, followed by the function name, a pair of parentheses, and a colon. The code inside a function must be indented, just like the code inside a loop. This is how Python knows which code is part of the function, since we don't wrap braces around it.

In [None]:
#execute/call the function
say_hi()

Hi!


You may notice that we use parentheses when we execute the function, even if we don't pass any input to the function.

## Why use functions?

The main motivation behind using functions is to simplify code. They are **references** to the block of code we assign to them:

```python
def say_hi():
  print('Hi!')
```

When we call `say_hi()`, the code inside, `print('Hi!')`, is executed. This is useful because we can put many lines of code that perform a complex task inside a function and execute them repeatedly by calling the function's name.
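
To make the reuse concrete, here is a small sketch. The function `print_banner` and what it prints are made up for illustration; the point is only that several lines are written once and run again on every call.

```python
def print_banner():
    # Three statements bundled under one name (hypothetical example)
    print('*' * 10)
    print('Welcome!')
    print('*' * 10)

# Each call runs all three lines again
print_banner()
print_banner()
```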




## The `return` statement

If you want to keep working with the output of a function, you need to assign that output to a variable name:

```python
my_output_var = function(input1, input2)
```

We will discuss more about inputs later. For now you can see that they go inside the parentheses that follow the function name.

You also need to tell your function what the return value should be by using the `return` statement:

In [None]:
def add_three(my_int):
  return my_int + 3

In [None]:
my_result = add_three(5)
print(my_result)

8


Actually, all functions in Python have a return value, even if you do not specify `return`. In that case, they will return the `None` object.
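
A quick way to check this is to capture the result of a function that has no `return` statement. The sketch below reuses `say_hi` from earlier; the variable name `result` is just for illustration.

```python
def say_hi():
    print('Hi!')

result = say_hi()   # prints Hi! as a side effect
print(result)       # prints None, because say_hi has no return statement
```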

When Python encounters a `return` statement, it exits the function **immediately**, and passes the value on the right hand side to the calling context. Code that follows after the `return` statement is not executed! 

In [None]:
def add_three(my_int):
  return my_int + 3
  print('Yo!')

In [None]:
my_result = add_three(5)
print(my_result)

The same function can incorporate several return statements:

In [None]:
from datetime import datetime

def are_you_working():
    # weekday() counts from Monday = 0, so Thursday is day 3 and Friday is day 4
    if datetime.today().weekday() == 3 or datetime.today().weekday() == 4:
        return 'No, I am learning Python!'
    else:
        return 'Yes, working hard.'

print(are_you_working())


Yes, working hard.


But you can only ever `return` once! In other words, you can never reach both return statements. The first one that is reached will end the function, no matter what comes after.

Then how can you return more than one result? Well, you can only encounter one return statement, but the object you return can be complex, i.e. a list, a dict, a tuple. In the examples above we're returning strings. The example below returns a list:

In [None]:
def fun_with_strings(my_string):
    my_list = [my_string.upper(), my_string.lower(), my_string[::-1]]
    return my_list

In [None]:
fun_with_strings('Hello World!')

['HELLO WORLD!', 'hello world!', '!dlroW olleH']
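
A tuple works the same way and can be unpacked into separate variables at the call site. This is a minimal sketch; `min_and_max` is a hypothetical function, not part of the course material.

```python
def min_and_max(numbers):
    # The comma builds a tuple, so a single return hands back two values
    return min(numbers), max(numbers)

# Unpack the returned tuple into two variables
smallest, largest = min_and_max([4, 1, 7])
print(smallest, largest)  # 1 7
```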


## Exercises 1 - 2

Work in groups for ca 15 minutes on the following two exercises:

### Exercise 1

#### My first function


Write a function that squares the input number. Look at the example of `add_three` for how to handle simple input. We'll talk about input more in depth later.

Function:

In [None]:
def calc_square(my_number):
    pass # placeholder statement where your code goes

### Exercise 2

A common return mistake is returning too early in a loop.
The following function sums all the even numbers in a list.

What is wrong here? Create some test data like `my_list = [1,2,3,4,5,6]` and test the function by calling it on the list: `sum_even_numbers(my_list)`. Do you think the result is correct?
How can you change the code to get the correct result?

In [None]:
def sum_even_numbers(list_of_numbers):
    total = 0
    for number in list_of_numbers:
        if number % 2 == 0:
            total += number
        return total

## Parameters vs Arguments

Some functions work without arguments, like our `say_hi()` defined above. But usually we want to pass in data for the function to work on, as when we call `print()`:

```python
print('Hi!')
```

In the above example, the string 'Hi!' is called an __argument__. It is what we want `print()` to work on. Any value passed to a function is an [argument](https://docs.python.org/3/glossary.html#term-argument).

But then what are __parameters__?

[Parameters](https://docs.python.org/3/glossary.html#term-parameter) are the names that we want to use inside the function. They appear in the function definition. Consider this:


In [None]:
def greet_me(my_name):
    print(f'Hello {my_name}!')

In [None]:
greet_me('Henrike')

Hello Henrike!


_my_name_ is the parameter. It is what the variable is called __inside__ the function. The string 'Henrike' is the argument I pass when I call the function. 

You can also think of the argument as the actual value and the parameter as its alias inside the function.

Some more examples:
https://docs.python.org/3/faq/programming.html#faq-argument-vs-parameter

**Example**: Write a function that prints the sum of two numbers. Use it to add x and y below:

In [None]:
def add(a,b):
    print("a+b: ", a+b)

Data:

In [None]:
x = 3
y = 5

Calling function on data:

In [None]:
add(x, y)

a+b:  8


What are the parameters and what are the arguments in the above example?

### Default arguments

Default arguments are the values that parameters take when no argument is passed for them in the function call. Assign the default value with the `=` operator in the function header. They are also called optional arguments, since you do not need to specify values for them when you call the function.

In [None]:
def add(a=10, b=20):
    return a+b

In [None]:
add()

30

In [None]:
add(a=1,b=10)

11

In [None]:
add(b=10, a=20)

30

Default parameters let callers omit arguments they do not need to change, which avoids errors from missing arguments and makes calls more readable!
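
You can also override just one of the defaults and let the other fall back to its default value. The sketch below repeats the `add` definition from above so that it runs on its own.

```python
def add(a=10, b=20):
    return a + b

print(add(5))    # 25: a is 5, b keeps its default of 20
print(add(b=5))  # 15: a keeps its default of 10, b is 5
```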

### Required arguments

Required arguments must be passed in the function call. When passed by position, they must also be in the right order: if you switch them around, the result might be different.

In [None]:
def divide(a, b):
    return a/b

In [None]:
divide()

TypeError: divide() missing 2 required positional arguments: 'a' and 'b'

In [None]:
divide(20, 10)

2.0

In [None]:
divide(10,20)

0.5



By default, parameters are positional-or-keyword. If you do not specify which argument goes to which parameter, the arguments are assigned in the order the parameters appear in the function definition.

In this example

```python
def divide(a, b):
    return a/b
```

If you call `divide(20,10)` the parameter `a` will take the value 20 and the parameter `b` will take the value 10 since they are specified in that order in the function definition.


You can also specify which parameter an argument should be assigned to by using its name (keyword) during the call:

In [None]:
divide(b=20, a=10)

0.5

Keyword arguments are different from default parameters:

* When you **define** a function and use an **=** you are setting a **default parameter**
* When you **call** a function and use an **=** you are making a **keyword argument**
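
The two uses of `=` can be seen side by side in a small sketch. The function `power` and its parameter names are hypothetical and chosen only for illustration.

```python
def power(base, exponent=2):
    # The = in the definition sets a default parameter
    return base ** exponent

print(power(3))                   # 9: exponent falls back to its default
print(power(3, exponent=3))       # 27: the = in the call makes a keyword argument
print(power(exponent=3, base=2))  # 8: keyword arguments may come in any order
```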

## Exercises 3 - 4

20 mins


### Exercise 3

#### Indicate sign of difference

Write a function that takes two integers or floats as input (you can test for type if you feel up for it) and calculates their difference. It should return "Positive" if their difference is positive, "Negative" if their difference is negative and 'Same' if there is no difference.

Function:

Test data:

In [None]:
x = 10
y = 5

### Exercise 4

#### Calculate sum and differences

Write a function that returns **both** the sum of the first two inputs and the difference between the second and third input: 

Function:

Test data:

In [None]:
a = 10
b = 15
c = 20

## Scope

Variables created in functions are scoped inside that function! They do not exist outside the function.

Example:


In [None]:
def say_hello():
    instructor = 'Henrike'
    return f'Hello {instructor}'

say_hello()

print(instructor) # NameError

NameError: name 'instructor' is not defined

Similarly, reassigning a variable inside a function has no effect on the outside unless you return the new value and assign it there. Consider:

In [None]:
my_int = 10

In [None]:
def my_function(my_int):
    my_int *= 2
    print(my_int)

In [None]:
my_function(my_int)

20


In [None]:
my_int

10

Why is my_int still 10?
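
One way to look at it: the parameter inside the function is a separate, local name, so doubling it only changes the local value. To carry the change outside, return the new value and assign it. A minimal sketch, where the function name `double` and the parameter name `value` are made up:

```python
def double(value):
    value *= 2   # rebinds the local name only
    return value

my_int = 10
double(my_int)           # the returned value is discarded
print(my_int)            # 10

my_int = double(my_int)  # assign the returned value to the outer variable
print(my_int)            # 20
```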

## Using Functions to Wrap Code

As said above, one of the main reasons for using user defined functions is to wrap blocks of code we want to execute many times into one command.

Usually, the process of producing this code format follows three steps:
  1. Write code that works.  (Focus on the specifics and test it)
  2. Wrap it in a function.  (Encapsulate it)
  3. Call the function in your script (Abstract it)
 
 
For example:
We want to take a list of numbers and calculate the sum of squares, that is we square every element of the list and then sum them. Imagine we want to do this calculation for many lists of data.

"Raw" code:
```python
data = [2, 6, 3, 7, 8, 9, 1]
squares = []
for item in data:
    square = item ** 2
    squares.append(square)
sum(squares)
```

Wrapped in a function:

```python
def sum_of_squares(data):
    """Compute the sum of squares."""
    squares = []
    for item in data:
        square = item ** 2
        squares.append(square)
    return sum(squares)
```

We call the function, which is shorter and much less error-prone(!) than copying the code block:
```python
data = [2, 6, 3, 7, 8, 9, 1]
sum_of_squares(data)
```
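The payoff shows when the same calculation runs over several lists. In this sketch the function is repeated so the block runs on its own, and the extra lists in `datasets` are made-up test data.

```python
def sum_of_squares(data):
    """Compute the sum of squares."""
    squares = []
    for item in data:
        squares.append(item ** 2)
    return sum(squares)

# Made-up example lists; the last one is empty on purpose
datasets = [[2, 6, 3, 7, 8, 9, 1], [1, 2, 3], []]

for data in datasets:
    print(sum_of_squares(data))  # prints 244, then 14, then 0
```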

Let's practice this with a few exercises.

### Exercise 5

Work again in groups for ~20 minutes.

Imagine we want to count observations of holiday destinations (and we don't know what pandas and dataframes are). We ask 20 people about their preferred holiday destination. Now, we want to summarize that info into a dictionary. Because we want to do the same for other data later, such as favorite color and food, we write a function.

The function shall do the following:
* take as a required argument the holiday destination (as a string)
* take as an optional argument the dictionary we want to collect the info in. If no dictionary is passed, the function shall create a new one.
* if the destination already exists in the dictionary, count the observations (values) up by one.
* if the destination does not exist yet, create a new key-value pair for it in the dictionary and set the number of observations to 1.
* In the end return the (updated) dictionary

Hints:
* have a look at the section `default arguments` to see how to create an optional argument
* a good default value for empty or non-existing objects is generally `None`
* you can have a look at the conditions notebook for how to test whether an object is "empty"
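
As an illustration of the `None` hint, here is a sketch that uses a list rather than a dictionary. The function `add_item` and its parameter names are hypothetical, and the exercise itself still needs the counting logic.

```python
def add_item(item, collection=None):
    # If the caller passed no list, start a fresh one
    if collection is None:
        collection = []
    collection.append(item)
    return collection

colors = add_item('red')           # no list passed, a new one is created
colors = add_item('blue', colors)  # the existing list is extended
print(colors)  # ['red', 'blue']
```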

## Documenting functions

- Describe what your function does, which parameters it takes, and what it returns.

- Wrap the description in triple quotes (`""" """`). This is called a docstring.

- Place it on the line immediately after the function header.

- Docstrings are essential when writing complex functions.

- There are several formats. For more detail, check out the GitHub repositories of Python packages like [scikit-learn](https://github.com/scikit-learn/scikit-learn/tree/main/sklearn) or [pandas](https://github.com/pandas-dev/pandas/tree/master/pandas), which contain plenty of examples.

Example:

```python
def full_name(first, last):
    """A function that takes as arguments a first and last name and prints a full name"""
    print(f"Your name is {first} {last}")
```

Adding parameter descriptions using the [numpy style](https://numpydoc.readthedocs.io/en/latest/format.html):

```python
def full_name(first, last):
    """A function that takes as arguments a first and last name and prints a full name
    
    Parameters
    ----------
    first : str
        The first name of someone.
    last : str
        The last name of someone.
    """
    print(f"Your name is {first} {last}")
```
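Once a function has a docstring, you can read it back without opening the source, using the built-in `help()` or the `__doc__` attribute. A short sketch, with a shortened docstring for brevity:

```python
def full_name(first, last):
    """Print a full name built from a first and last name."""
    print(f"Your name is {first} {last}")

# The docstring is stored on the function object itself
print(full_name.__doc__)

# help() shows the signature together with the docstring
help(full_name)
```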