<img src="../figures/HeaDS_logo_large_withTitle.png" width="300">

<img src="../figures/tsunami_logo.PNG" width="600">

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Center-for-Health-Data-Science/PythonTsunami/blob/intro/Functions/Functions.ipynb)

# Encapsulating Code with Functions

*prepared by [Katarina Nastou](https://www.cpr.ku.dk/staff/?pure=en/persons/672471) and [Rita Colaço](https://www.cpr.ku.dk/staff/?id=621366&vis=medarbejder)*

## Objectives

* Describe what a function is and why functions are useful
* Explain exactly what the return keyword does and some of the side effects when using it
* Add parameters to functions to output different data
* Understand how scope works in a function
* Add keyword arguments to functions

## What is a function?

Just as we've learned that Python does a lot of work "under the hood" (translating its "syntactic sugar" into more basic operations, compiling that code into bytecode, running the bytecode on an interpreter written in C, and so on), we too can wrap our code into simple functions, thereby making those lines easier to understand, debug, and reuse.  The basic unit of encapsulation in programming is called a [**function**](https://docs.python.org/3/glossary.html#term-function), and it can mostly be thought of as a pipeline:

**Input(s) -> [Some Code] -> Output(s)**

## Why use functions?

The main motivation behind using functions is to stay **DRY** - **D**on't **R**epeat **Y**ourself!  

Functions are essential to programming because they **simplify** it!  Good functions are simple to reason about, provide an **abstraction** of the code's purpose, help **teach** the reader about the tasks the code is carrying out, and are **testable** (i.e. they can be checked to see if they are working properly).

There are three types of functions in Python:

- Built-in functions, such as `help()`, `min()` or `print()`

- User-defined functions (UDFs), created by the users

- Anonymous functions, also called lambda functions because they are declared with the `lambda` keyword instead of the standard `def` keyword

In Python, a function is any object that is **callable** (i.e. you can put parentheses after its name to run it).  The syntax for **calling** a function looks like this:

```python
output = function(input1, input2)
```
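As a quick illustration of the three kinds of functions listed above, the sketch below calls a built-in, a user-defined function and a lambda. The names `double` and `triple` are made up for this example.

```python
# Built-in function: comes with Python
smallest = min(4, 2, 9)

# User-defined function: declared with `def`
def double(x):
    return 2 * x

# Anonymous function: declared with `lambda` (stored in a variable here)
triple = lambda x: 3 * x

print(smallest, double(5), triple(5))   # 2 10 15

# callable() tells you whether an object can be called with parentheses
print(callable(min), callable(double), callable(triple))   # True True True
print(callable(smallest))                                   # False, it is just an int
```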

## Defining functions

Creating a function is done using the same code block style we've seen in for-loops and if-statements:

1. Start with the keyword `def`, followed by the function name.

2. Add parameters within the parentheses of the function. End the line with a colon.

3. Add the statements the function should execute.

4. End the function with a `return` statement if there should be an output.


Example 1:

```python
def say_hi():
    print('Hi!')
```

Example 2:

```python
def say_hi_return():
    return 'Hi!'
```

> The body of a function is indented, as Python uses indentation for grouping statements.

### Example

In [None]:
def say_hi():
    print('Hi!')
    
def say_hi_return():
    return 'Hi!'

In [None]:
say_hi()

In [None]:
say_hi_return()

## The `return` statement

A `return` statement is necessary if you want to continue to work with the result of your function and run other operations on that result.

When Python encounters a `return` statement, it exits the function immediately, and passes the value on the right hand side to the calling context.
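To see the "exits immediately" behaviour in action, here is a minimal sketch. The function `first_negative` is a hypothetical example, not part of the course material.

```python
def first_negative(numbers):
    """Return the first negative number in `numbers`, or None if there is none."""
    for number in numbers:
        if number < 0:
            return number  # the function stops here; the rest of the loop never runs
        print(f"{number} is not negative, still looking")
    return None

result = first_negative([3, 7, -2, -8, 5])
print(result)  # -2
```

Only `3` and `7` trigger the "still looking" message: once `-2` is returned, `-8` and `5` are never inspected.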

What is wrong here?

In [None]:
def say_hi():
    'Hello!'

print(say_hi()) # None

This also doesn't seem to work:

In [None]:
def say_hi():
    print('Hello!')

result = say_hi() 

print(result) # None

In [None]:
def say_hi():
    return 'Hi!'

greeting = say_hi()

print(greeting) # Hi!

### Quiz

This function sums all the even numbers in a list.


```python
def sum_even_numbers(list_of_numbers):
    total = 0
    for number in list_of_numbers:
        if number % 2 == 0:
            total += number
        return total
```

What is wrong here?

A common return mistake is returning too early in a loop.
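One possible fix for the quiz function is to move `return total` out of the loop, so that it runs only after every number has been checked:

```python
def sum_even_numbers(list_of_numbers):
    total = 0
    for number in list_of_numbers:
        if number % 2 == 0:
            total += number
    return total  # runs once, after the loop has seen every number

print(sum_even_numbers([1, 2, 3, 4, 6]))  # 12
```

With the `return` inside the loop, the same call would give `0`, because the function would exit right after looking at the first number.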

## Parameters vs Arguments

* A [parameter](https://docs.python.org/3/glossary.html#term-parameter) is a variable named in a function or method definition.
* An [argument](https://docs.python.org/3/glossary.html#term-argument) is the actual value you pass into that parameter when the function or method is called.
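A small sketch that labels which is which. The function `greet` is a made-up example.

```python
def greet(name, greeting):         # `name` and `greeting` are parameters
    return f"{greeting}, {name}!"

person = "Ada"
message = greet(person, "Hello")   # the values "Ada" and "Hello" are arguments
print(message)  # Hello, Ada!
```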

## Parameters

[Parameters](https://docs.python.org/3/glossary.html#term-parameter) are the placeholders that get assigned when you call the function. You can call your parameters anything!

**Example**: Write a function that returns the sum of two numbers.  Use it to add x and y below:

In [None]:
def add(a, b):
    return a + b

Data:

In [None]:
x = 3
y = 5

Calling function on data:

In [None]:
add(x, y)

## Scope

- Variables created inside a function are local to that function: they cannot be accessed from outside it.
- Arguments are assigned to the named local variables in a function body.

In [None]:
def say_hello():
    instructor = 'Colt'
    return f'Hello {instructor}'

say_hello()

print(instructor) # NameError
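A related case, sketched below under the assumption that a variable with the same name also exists outside the function: the assignment inside the function creates a separate local variable and leaves the outer one untouched.

```python
instructor = 'Colt'          # variable defined outside any function

def say_hello():
    instructor = 'Ada'       # local variable; hides the outer one inside the function
    return f'Hello {instructor}'

print(say_hello())   # Hello Ada
print(instructor)    # Colt, the outer variable was not changed
```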

## Arguments

There are four types of arguments that user-defined functions in Python can take:

- Default arguments
- Required arguments
- Keyword arguments
- Variable number of arguments

### Default arguments

Default arguments are the values that the corresponding parameters take if no argument is passed during the function call. Assign the default value with the `=` operator in the function header. They are also called optional arguments, since there is no need to specify values for them when you call the function.

In [None]:
def add(a=10, b=20):
    return a+b

In [None]:
add() # 30

In [None]:
add(1,10) # 11

Having default parameters allows you to avoid errors from missing arguments and also makes for more readable examples!
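Defaults can also be mixed with parameters that have no default, as long as the parameters with defaults come last in the function header. A minimal sketch, using a hypothetical `power` function:

```python
def power(base, exponent=2):
    """Raise `base` to `exponent`; squares by default."""
    return base ** exponent

print(power(3))      # 9, exponent falls back to its default
print(power(3, 3))   # 27, the default is overridden
```

Writing the header the other way round, `def power(exponent=2, base):`, is a `SyntaxError`.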

### Required arguments

Required arguments are mandatory to pass during the function call, and in precisely the right order. If you switch them around, the result might be different.

In [None]:
def divide(a, b):
    return b/a

In [None]:
divide(10, 20) # 2.0

In [None]:
divide(20, 10) # 0.5

### Keyword arguments

To avoid having to pass arguments in exactly the right order, you can use keyword arguments in your function call. These identify each argument by its parameter name.

When using keyword arguments, the order in the function call does not matter, which provides more flexibility.

In [None]:
def full_name(first, last):
    print(f"Your name is {first} {last}")

In [None]:
full_name(first='Jane', last='Doe')

In [None]:
full_name(last='Doe', first='Jane')

Keyword arguments are different from default parameters:

* When you **define** a function and use an **=** you are setting a **default parameter**
* When you **invoke** a function and use an **=** you are making a **keyword argument**
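The sketch below shows both uses of `=` side by side. `format_name` and its `title` parameter are hypothetical and only serve as illustration.

```python
def format_name(first, last, title=''):   # `=` in the definition: default parameter
    return f"{title} {first} {last}".strip()

print(format_name('Jane', 'Doe'))                    # Jane Doe
print(format_name(last='Doe', first='Jane'))         # Jane Doe (`=` in the call: keyword arguments)
print(format_name('Jane', last='Doe', title='Dr.'))  # Dr. Jane Doe
```

Positional and keyword arguments can be combined in one call, but the positional ones have to come first.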

### Variable number of arguments

In cases where you don't know the exact number of arguments that you want to pass to a function, you can use the syntax `*args`.

The asterisk (`*`) is placed before the name of the variable that collects all positional (non-keyword) arguments. The name `args` is only a convention; you can replace it with any other name you want.


In [None]:
def plus(*args):
    return sum(args)

In [None]:
plus(1,4,5)

In [None]:
def plus(*args):
    total = 0
    for i in args:
        total += i
    return total

In [None]:
plus(20,30,40,50)
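Inside the function, the collected arguments arrive as a tuple, and `*args` can follow regular parameters. A minimal sketch with a made-up function `scale_and_sum`:

```python
def scale_and_sum(factor, *values):
    """Multiply the sum of `values` by `factor`."""
    # `values` is a tuple holding every positional argument after `factor`
    print(type(values).__name__, values)
    return factor * sum(values)

print(scale_and_sum(2, 1, 4, 5))   # 20
print(scale_and_sum(10))           # 0, no extra arguments gives an empty tuple
```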

## Documenting functions

- Describe what your function does, which parameters it takes, and its return values.

- Wrap the description in triple quotes (`""" """`); the result is called a docstring.

- Place it on the line immediately after the function header.

- Essential when writing complex functions.

- There are several formats. For more detail check out some GitHub repositories of Python packages like [scikit-learn](https://github.com/scikit-learn/scikit-learn/tree/main/sklearn) or [pandas](https://github.com/pandas-dev/pandas/tree/master/pandas), with plenty of examples.

Example:

```python
def full_name(first, last):
    """A function that takes as arguments a first and last name and prints a full name"""
    print(f"Your name is {first} {last}")
```

Adding parameter descriptions using the [numpy style](https://numpydoc.readthedocs.io/en/latest/format.html):

```python
def full_name(first, last):
    """A function that takes as arguments a first and last name and prints a full name
    
    Parameters
    ----------
    first : str
        The first name of someone.
    last : str
        The last name of someone.
    """
    print(f"Your name is {first} {last}")
```
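For functions that return a value, the numpy style also has a `Returns` section. The sketch below uses a hypothetical `rectangle_area` function and shows that the docstring is stored on the function itself, where tools such as `help()` can find it.

```python
def rectangle_area(width, height):
    """Compute the area of a rectangle.

    Parameters
    ----------
    width : float
        The width of the rectangle.
    height : float
        The height of the rectangle.

    Returns
    -------
    float
        The area, i.e. width times height.
    """
    return width * height

# The docstring is available as the `__doc__` attribute of the function.
# In an interactive session, help(rectangle_area) displays the same text.
print(rectangle_area.__doc__)
```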

## Breakout rooms

Build functions that carry out the requested task, then use them on the given data. Work in groups for 20 minutes.

### Exercise 1

#### Calculate absolute difference

Write a function that returns the absolute (positive) difference between two numbers.

Function:

In [None]:
def abs_difference(x, y):
    pass # placeholder statement where your code goes

Data: 

In [None]:
a = 25
b = 65

### Exercise 2

#### Calculate squares

Write a function that returns the square root of the sum of squares of two numbers. 

> Hint: You can use `math.sqrt` to calculate the square root.

Function:

In [None]:
from math import sqrt # see modules and classes notebook for more explanations
sqrt(25)

def square_root_of_squares():
    pass # YOUR CODE GOES HERE

Data:

In [None]:
x = 5
y = 4

### Exercise 3

#### Indicate sign of difference

Write a function that takes two numbers and returns "Positive" if their difference is positive, and "Negative" if their difference is negative.

Function:

Data:

In [None]:
x = 10
y = 5

### Exercise 4

#### Calculate sum and differences

Write a function that returns **both** the sum of the first two inputs and the difference between the second and third input: 

Function:

Data:

In [None]:
a = 10
b = 15
c = 20

### Exercise 5

#### Function overloading: Different behaviour for `int` and `str`

Write a function that adds two numbers together if the inputs are both numbers, and concatenates the inputs if they are both strings.

> Hint: You can use built-in functions [`type`](https://docs.python.org/3/library/functions.html#type) or [`isinstance`](https://docs.python.org/3/library/functions.html#isinstance)

Function:

Data:

In [None]:
x = "Hello"
y = "World"

## Using Functions to Wrap Code

In general, we want to make code as abstract as possible, but at the same time we should be specific about what we are trying to accomplish.  These two goals - abstraction and specificity - are often at odds with each other.  Encapsulation solves this problem by allowing us to put specific code inside function definitions, and abstract code in the code that calls it.

Usually, the process of producing this code format follows three steps:
  1. Write code that works.  (Focus on the specifics)
  2. Wrap it in a function.  (Encapsulate it)
  3. Call the function in your script (Abstract it)
 
 
For example:
```python
data = [2, 6, 3, 7, 8, 9, 1]
squares = []
for el in data:
    square = el ** 2
    squares.append(square)
sum(squares)
```

will become:

```python
def sum_of_squares(data):
    """Compute the sum of squares."""
    squares = []
    for el in data:
        square = el ** 2
        squares.append(square)
    return sum(squares)


data = [2, 6, 3, 7, 8, 9, 1]
sum_of_squares(data)
```

Let's practice doing this with various types of loops:

## Breakout rooms

Take the following working Python code and rewrite it so that it uses functions. Work again in groups for 20 minutes.

### Exercise 1

#### Square numbers

The code below squares all of the numbers and removes all of the strings from the list.  Make it into a function, **square_numbers**:

In [None]:
data = [5, "missing", 54, "bad", 3, 6]
good_data = []
good_data_squared = []
idx = 0
while idx < len(data):
    el = data[idx]
    if isinstance(el, int):
        good_data_squared.append(el ** 2)
    idx += 1
good_data_squared

Put the modified code below:

### Exercise 2

#### Calculate the standard deviation

The code below calculates the **standard deviation** of the data.  Put it in a function called **`standard_deviation`** and use the function on the data:

In [None]:
import math

data = [2, 6, 8, 2, 5, 8, 9, 2]

mean = 0
std = 0
for el in data:
    mean += el / len(data)
dev_squareds = []
for el in data:
    dev = (el - mean) ** 2
    dev_squareds.append(dev)
sum_dev_squareds = sum(dev_squareds)
standard_dev = sum_dev_squareds / len(data) * 1.
standard_dev = math.sqrt(standard_dev)
standard_dev

Put the modified code below:

### Exercise 3

#### Bootstrap mean

The code below generates **bootstrap** samples of the data: it draws a random sample of the data (with replacement) **n_boot** times and calculates the mean of each sample, so that many estimates of the mean can be made from a single dataset.  Put it in a function called **bootstrap_means**.

In [None]:
import random

data = [2, 6, 8, 2, 5, 8, 9, 2, 6, 2, 10]
n_boot = 5
means = []
for rep in range(n_boot):
    sample = random.choices(data, k=len(data))
    mean = sum(sample) / len(sample)
    rep = rep * 2
    means.append(mean)
means


Put the modified code below:

### Exercise 4

#### Build a function to modify a file

Based on what we have seen in the previous lectures (Importing data, Conditionals and Loops): 

Build a function that takes the name of a country as a parameter and reads the file 'data/sample.txt'. If the country does not exist in the file, the function should add it as a new line; otherwise, it should print out that the country already exists (you can use a formatted string).

## Warning: Don't use mutable types as default parameters

- Do not use a mutable object, e.g. an empty list, as a default parameter. Use `None` instead and create the object inside the function.

```python
def add_number(x, to=None):
    """add x to a list of numbers. Create a new list if none is passed."""
    
    if to is None:
        to = []
    to.append(x)
    return to

my_numbers = add_number(4)

add_number(9, to=my_numbers)
print(my_numbers)
```

> More prominent examples arise out of the context of classes, e.g. see Ramalho 2015: Fluent Python, p. 236ff.

In [None]:
# example from the earlier section "Using Functions to Wrap Code"

def sum_of_squares(data, squares = []):
    for el in data:
        square = el ** 2
        squares.append(square)
    # replace one statement
    # return sum(squares)
    return squares
 

data = [2, 6, 3]
sum_of_squares(data) 

In [None]:
data = [7, 8, 9, 1]
sum_of_squares(data, squares=[])
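What goes wrong with a mutable default can be seen in the sketch below: the default list is created once, when the function is defined, and is then shared by every call that does not pass its own list. The function names `append_square` and `append_square_safe` are made up for this illustration.

```python
def append_square(x, squares=[]):      # the default list is created only once
    squares.append(x ** 2)
    return squares

first = append_square(2)
second = append_square(3)
print(second)            # [4, 9], the 4 from the first call is still there
print(first is second)   # True, both names point to the same shared list

def append_square_safe(x, squares=None):
    if squares is None:
        squares = []     # a fresh list on every call
    squares.append(x ** 2)
    return squares

print(append_square_safe(2))   # [4]
print(append_square_safe(3))   # [9]
```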

## Extra

### Functions Applied to Functions

Functions that operate on other functions are called "higher-order functions".

```python

def mult_by_five(x):
    return 5 * x

def call(fn, arg):
    """Call fn on arg"""
    return fn(arg)

def squared_call(fn, arg):
    """Call fn on the result of calling fn on arg"""
    return fn(fn(arg))

```

In [None]:
print(call(mult_by_five, 1),
      squared_call(mult_by_five, 1), 
      sep='\n')
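Several built-in functions are higher-order functions too, since they accept a function as an argument. A short sketch with a made-up list of words:

```python
words = ['banana', 'fig', 'cherry']

# `key` expects a function; here the built-in len decides the ordering
by_length = sorted(words, key=len)
longest = max(words, key=len)

# map applies a function (here a lambda) to every element
shouted = list(map(lambda w: w.upper(), words))

print(by_length)  # ['fig', 'banana', 'cherry']
print(longest)    # banana (the first of the two six-letter words)
print(shouted)    # ['BANANA', 'FIG', 'CHERRY']
```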

### Refactoring Code: Improving it without breaking it

By modularizing code, we give ourselves the ability to modify small parts of the code without having to worry about the rest of it--so long as the inputs and outputs don't change, anything that happens in the middle doesn't matter to the code that calls the function!

The exercises above all contain things that can be improved, whether to make them simpler, more readable, or more reliable; any improvement is helpful.  Let's work through them again and make the following improvements:

  1. **Remove Orphan Code**: Often, code that isn't actually used by the function is left sitting there, lost and forgotten.  Deleting those lines will make it easier to see how everything works and improve readability.
  2. **Change variable names to something clearer**: variables like x and y are not helpful.  Make them something that represents the result of the line!
  3. **Reduce the number of steps**: If there are several lines doing something you think is simple, either compress them to a single line or make a new function that represents that action.  If you know that the function you want already exists in another package, then import that package and use it!
  4. **Convert While loops to For loops**: If you see iteration happening, use a for-loop!  If it's a single action, why not make it a comprehension?
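As an illustration of points 3 and 4, here is one possible refactoring of the `sum_of_squares` function from the section on wrapping code. The loop and the temporary list are replaced by a generator expression; the input and output stay the same, so code calling the function does not need to change.

```python
def sum_of_squares(data):
    """Compute the sum of squares."""
    return sum(el ** 2 for el in data)

data = [2, 6, 3, 7, 8, 9, 1]
print(sum_of_squares(data))  # 244
```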

### **Exercises**

Refactor the functions created in the previous section.  Make sure to re-run the code with each change you make to verify that it still works!