<img src="../figures/HeaDS_logo_large_withTitle.png" width="300">

<img src="../figures/tsunami_logo.PNG" width="600">

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Center-for-Health-Data-Science/PythonTsunami/blob/spring2022/Functions/Functions.ipynb)

# Encapsulating Code with Functions

*prepared by [Katarina Nastou](https://www.cpr.ku.dk/staff/?pure=en/persons/672471) and [Rita Colaço](https://www.cpr.ku.dk/staff/?id=621366&vis=medarbejder)*

## Objectives

* Describe what a function is and why functions are useful
* Explain exactly what the return keyword does and some of the side effects when using it
* Add parameters to functions to output different data
* Understand how scope works in a function
* Add keyword arguments to functions

## What is a function?

Just as we've learned that Python does a lot of work "under the hood" (translating its "syntactic sugar" into more basic Python, compiling that code into bytecode, running the bytecode on the standard interpreter, which is itself written in C, and so on), we too can wrap our code into simple functions, thereby making those lines easier to understand, debug, and reuse.  The basic unit of encapsulation in programming is called a [**function**](https://docs.python.org/3/glossary.html#term-function), and it can mostly be thought of as a pipeline:

**Input(s) -> [Some Code] -> Output(s)**

You have already used functions such as `print()` and `int()`. We call these functions _built-in_ since they already exist in Python.

There are three types of functions in Python:

- Built-in functions, such as `help()`, `min()` or `print()`

- User-defined functions (UDFs), created by the users

- Anonymous functions, called lambda functions because they are not declared with the standard `def` keyword
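
As a quick sketch, here are the three kinds side by side. The names `double` and `triple` are made up for this illustration:

```python
# Built-in function: ships with Python
smallest = min(3, 1, 2)

# User-defined function: declared with def (hypothetical example)
def double(x):
    return 2 * x

# Anonymous (lambda) function: a single expression, written without def
triple = lambda x: 3 * x

print(smallest, double(4), triple(4))  # 1 8 12
```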

## Why use functions?

The main motivation behind using functions is to stay **DRY** - **D**on't **R**epeat **Y**ourself!  

Functions are essential to programming because they **simplify** code!  Good functions are simple to reason about, provide an **abstraction** of the code's purpose, help **teach** about the nature of the tasks the code is carrying out, and are **testable** (i.e. they can be checked to see if they are working properly).

In Python, you can recognize a function by the fact that it is **callable**. This means you need to use parentheses after the function name.  When **calling** a function, the following syntax is used:

```python
output = function(input1, input2)
```

## Defining functions

Creating a function is done using the same code block style we've seen in for-loops and if-statements:

1. Start with the keyword `def`, followed by the name of the function.

2. Add parameters within the parentheses after the function name. End the line with a colon.

3. Add statements the function should execute.

4. End the function with a `return` statement if there should be an output.


Example 1:

```python
def say_hi():
    print('Hi!')
```

Example 2:

```python
def say_hi_return():
    return 'Hi!'
```

> The body of a function is indented, as Python uses indentation for grouping statements.

### Example

In [None]:
def say_hi():
    print('Hi!')
    
def say_hi_return():
    return 'Hi!'

In [None]:
say_hi()

In [None]:
say_hi_return()

## The `return` statement

A `return` statement is necessary if you want to keep working with the result of your function and run other operations on that result.

When Python encounters a `return` statement, it exits the function immediately and passes the value on the right-hand side to the calling context. Every function in Python has a return value, even if you do not explicitly state `return`. The default return value is `None`.

Is this what you expected to happen?

In [None]:
def say_hi():
    'Hello!'

my_result = say_hi()

print(my_result) # None
type(my_result)

This prints 'Hello!', but the return value is still `None`.

In [None]:
def say_hi():
    print('Hello!')

result = say_hi() 

print(result) # None

When we use the `return` statement, the object that comes after it becomes the function's return value.

In [None]:
def say_hi():
    return 'Hi!'

greeting = say_hi()

print(greeting) # Hi!

The same function can incorporate several return statements:

In [4]:
from datetime import datetime

def are_you_working():
    #Thursday is day 3, Friday is day 4
    if datetime.today().weekday() == 3 or datetime.today().weekday() == 4:
        return 'No, am learning python!'
    else:
        return 'Yes, working hard.'

print(are_you_working())


Yes, working hard.


But you can only ever `return` once! In other words, you can never reach both return statements. The first one that is reached will end the function, no matter what comes after.
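
A minimal sketch of this behavior, using a made-up function `classify_temperature`: a value of 35 satisfies both conditions, but only the first `return` that is reached takes effect.

```python
def classify_temperature(celsius):
    """Label a temperature; the first return that is reached ends the call."""
    if celsius >= 30:
        return 'hot'
    if celsius >= 15:
        return 'mild'
    return 'cold'  # only reached when neither condition above was true

print(classify_temperature(35))  # hot
print(classify_temperature(20))  # mild
print(classify_temperature(-5))  # cold
```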

Then how can you return more than one result, for instance when you're doing something complex? You can only ever reach one return statement, but the object you return can be a container, e.g. a list, a dict or a tuple. In the example above we're returning strings.

In [24]:
def string_fun(my_string):
    my_list = [my_string.upper(), my_string.lower(), my_string[::-1]]
    return my_list

In [25]:
string_fun('Hello World!')

['HELLO WORLD!', 'hello world!', '!dlroW olleH']
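
Returning a tuple works the same way, and the caller can unpack it into separate variables. A small sketch with a hypothetical function `min_and_max`:

```python
def min_and_max(numbers):
    """Return the smallest and largest value as a tuple."""
    return min(numbers), max(numbers)  # the comma builds a tuple

result = min_and_max([4, 9, 1, 7])
print(result)  # (1, 9)

# A returned tuple can be unpacked into separate variables
lowest, highest = min_and_max([4, 9, 1, 7])
print(lowest, highest)  # 1 9
```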

### Quiz

A common return mistake is returning too early in a loop.
This function is meant to sum all the even numbers in a list.


```python
def sum_even_numbers(list_of_numbers):
    total = 0
    for number in list_of_numbers:
        if number % 2 == 0:
            total += number
        return total
```

What is wrong here? Can you correct the code?

## Parameters vs Arguments

Some functions work without arguments, like our `say_hi()` defined above. But usually we want to pass data into the function that the function should work on. Like when we call `print`:

```python
print('Hi!')
```

In the above example, the string 'Hi!' is called an __argument__. It is what we want `print()` to work on. Any value passed to a function is an [argument](https://docs.python.org/3/glossary.html#term-argument).

But then what are __parameters__?

[Parameters](https://docs.python.org/3/glossary.html#term-parameter) are the names that we want to use inside the function. They appear in the function definition. Consider this:


In [5]:
def greet_me(my_name):
    print('Hello', my_name, '!')

In [6]:
greet_me('Henrike')

Hello Henrike !


_my_name_ is the parameter. It is what the variable is called __inside__ the function. The string 'Henrike' is the argument I pass when I call the function. 

You can also think of the argument as the actual value and the parameter as its alias inside the function.

Some more examples:
https://docs.python.org/3/faq/programming.html#faq-argument-vs-parameter

So parameters are like placeholders. Like other variables, you can call your parameters anything, but it is wise to choose clear names that you can recognize! They can have default values too; more about this later.

**Example**: Write a function that returns the sum of two numbers.  Use it to add x and y below:

In [None]:
def add(a, b):
    return a + b

Data:

In [None]:
x = 3
y = 5

Calling function on data:

In [None]:
add(x, y)

What are the parameters and what are the arguments in the above example?

## Arguments

There are three types of arguments that user-defined functions in Python can take:

- Default arguments
- Required arguments
- Variable number of arguments

### Default arguments

Default arguments are values that the corresponding parameters take if no argument is passed during the function call. Assign the default value with the `=` operator in the function header. They are also called optional arguments, since you do not need to specify values for them when you call the function.

In [None]:
def add(a=10, b=20):
    return a+b

In [None]:
add() # 30

In [None]:
add(1,10) # 11

Having default parameters helps you avoid errors caused by missing arguments and makes function calls shorter and more readable!
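
Required and default parameters can be combined, as long as the parameters with defaults come last in the definition. A minimal sketch with a made-up function `greet`:

```python
def greet(name, greeting='Hello'):
    """Build a greeting; `greeting` is optional because it has a default."""
    return f'{greeting}, {name}!'

print(greet('Henrike'))             # Hello, Henrike!
print(greet('Henrike', 'Welcome'))  # Welcome, Henrike!
```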

### Required arguments

Required arguments are mandatory to pass during the function call, and positional ones must be given in precisely the right order. If you switch them around, the result might be different.

In [1]:
def divide(a, b):
    return a/b

In [2]:
divide()

TypeError: divide() missing 2 required positional arguments: 'a' and 'b'

In [3]:
divide(20, 10)

2.0

In [4]:
divide(10,20)

0.5



By default, arguments are positional-or-keyword. If you do not specify which argument was given to which parameter they will be assigned in the order that was defined in the function definition. 

In this example

```python
def divide(a, b):
    return a/b
```

If you call `divide(20,10)` the parameter `a` will take the value 20 and the parameter `b` will take the value 10 since they are specified in that order in the function definition.


You can also specify which parameter an argument should be assigned to by using its name (keyword) during the call:

In [5]:
divide(b=20, a=10)

0.5

Keyword arguments are different from default parameters:

* When you **define** a function and use an **=** you are setting a **default parameter**
* When you **call** a function and use an **=** you are making a **keyword argument**
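
The two uses of `=` can be seen next to each other in this sketch (the function `power` is a hypothetical example, not part of the course material):

```python
def power(base, exponent=2):      # "=" in the definition: default parameter
    return base ** exponent

print(power(3))                   # 9, exponent falls back to its default
print(power(3, exponent=3))       # 27, "=" in the call: keyword argument
print(power(exponent=3, base=2))  # 8, keyword arguments can come in any order
```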

### Variable number of arguments

In cases where you don't know the exact number of arguments that you want to pass to a function, you can use the syntax `*args`.

The asterisk (`*`) is placed before the variable name that holds the values of all non-keyword arguments. `args` is only a convention; you can replace it with any other name you like.


In [None]:
def plus(*args):
    return sum(args)

In [None]:
plus(1,4,5)

In [None]:
def plus(*args):
    total = 0
    for i in args:
        total += i
    return total

In [None]:
plus(20,30,40,50)
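
Inside the function, the collected arguments arrive as a tuple. A small sketch to check this, using the name `numbers` instead of `args` and a made-up function `describe`:

```python
def describe(*numbers):
    """Report how the extra positional arguments arrive inside the function."""
    return type(numbers).__name__, len(numbers)

print(describe(1, 4, 5))  # ('tuple', 3)
print(describe())         # ('tuple', 0)
```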

## Warning: Don't use mutable types as default parameters

Do __not__ use e.g. an empty list as a default parameter. Instead, use `None`.

Consider the following example:

In [42]:
def add_number(x, to=None):
    """add x to a list of numbers. Create a new list if none is passed."""
    
    if to is None:
        to = []
    to.append(x)
    return to

The above function behaves well:

In [48]:
my_numbers = add_number(4)
my_numbers

[4]

In [49]:
#this adds 9 to the list we passed as an argument.
add_number(9, to=my_numbers)

[4, 9]

In [46]:
my_numbers

[4, 9]

What happens if we use an empty list in the definition instead?

In [54]:
def add_number_wrong(x, to=[]):
    """add x to a list of numbers. Create a new list if none is passed."""
    to.append(x)
    return to

In [55]:
#this behaves as above
my_numbers2 = add_number_wrong(4)
my_numbers2

[4]

In [56]:
#we can add the number 9
add_number_wrong(9, to=my_numbers2)

[4, 9]

But ...

In [59]:
#I haven't passed a list, so to should be empty as per its default. However:
add_number_wrong(5)

[4, 9, 5]

Also:

In [61]:
my_numbers2

[4, 9, 5]

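This is probably not what you expected to happen! Default values are created only once, when the function is defined, so every call that omits `to` appends to the same list, which is also the list that `my_numbers2` refers to.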
> More prominent examples arise out of the context of classes, e.g. see Ramalho 2015: Fluent Python, p. 236ff.
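
One way to watch the shared default grow is to inspect the function's `__defaults__` attribute, which holds the default values created when the `def` statement ran. A minimal sketch with a hypothetical function `append_to`:

```python
def append_to(x, to=[]):
    """Same pitfall as above: the default list is created only once."""
    to.append(x)
    return to

print(append_to.__defaults__)  # ([],)
append_to(1)
append_to(2)
print(append_to.__defaults__)  # ([1, 2],)
```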

## Documenting functions

- Describe what your function does, which parameters it takes, and its return values.

- Use `""" """` to wrap it.

- Place it on the line immediately after the function header.

- Docstrings are essential when writing complex functions.

- There are several formats. For more detail, check out the GitHub repositories of Python packages like [scikit-learn](https://github.com/scikit-learn/scikit-learn/tree/main/sklearn) or [pandas](https://github.com/pandas-dev/pandas/tree/master/pandas), which contain plenty of examples.

Example:

```python
def full_name(first, last):
    """A function that takes as arguments a first and last name and prints a full name"""
    print(f"Your name is {first} {last}")
```

Adding parameter descriptions using the [numpy style](https://numpydoc.readthedocs.io/en/latest/format.html):

```python
def full_name(first, last):
    """A function that takes as arguments a first and last name and prints a full name
    
    Parameters
    ----------
    first : str
        The first name of someone.
    last : str
        The last name of someone.
    """
    print(f"Your name is {first} {last}")
```
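
Once a docstring is in place, Python can show it back to you. A short sketch using the `full_name` function from above:

```python
def full_name(first, last):
    """A function that takes as arguments a first and last name and prints a full name"""
    print(f"Your name is {first} {last}")

# The docstring is stored on the function object itself
print(full_name.__doc__)

# help() prints the function signature followed by the docstring
help(full_name)
```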

## Scope

Variables created in functions are scoped inside that function! They do not exist outside the function.

Example:


In [8]:
def say_hello():
    instructor = 'Colt'
    return f'Hello {instructor}'

say_hello()

print(instructor) # NameError

NameError: name 'instructor' is not defined
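
To use a value that was created inside a function, return it and bind the result to a name outside. A minimal sketch based on the example above (the outer name `who` is made up for illustration):

```python
def say_hello():
    instructor = 'Colt'  # local variable: exists only while the function runs
    return instructor    # returning is how a value leaves the function

who = say_hello()        # bind the returned value to a name outside
print(who)               # Colt

try:
    print(instructor)    # the local name is still unknown out here
except NameError as error:
    print('NameError:', error)
```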

Similarly, anything you do to a variable inside a function and do not return has no consequence on the outside. Consider:

In [17]:
my_int = 10

In [18]:
def my_function(my_int):
    my_int *= 2
    print(my_int)

In [19]:
my_function(my_int)

20


In [20]:
my_int

10

Why is my_int still 10?

## Pass by value and pass by reference

We just said that variables created inside the function are not visible to the 'outside' and that what we do to them inside the function does not propagate outside the function if we do not return it.

__This is only true for immutable objects such as integers, which behave as if they were passed by value!__

In the above example there are two variables called `my_int`. They have the same name but they are __not__ the same variable. One of them exists outside the function, the other exists inside the function. I named them both the same, which Python allows me to do but is generally a very bad idea.

To reiterate, multiplying `my_int` by 2 inside the function has no bearing on the outside world because the two `my_int` variables are different names living in different scopes. Integers are immutable, so `my_int *= 2` cannot change the object 10 itself; it creates a new integer object and binds the inner name to it. In effect, integers behave as if they were passed by value. Consider the following example:

I create an integer object which I call `a` and I assign it the number 3. 

In [25]:
a = 3
a

3

`3` is the object, `a` is the variable (what I call this object).
I now assign `a` to a second variable, which I call `b`. We can easily check that `b` is also 3, because both names now refer to the same value.

In [26]:
b = a
b

3

Now I add 2 to `b`.

In [27]:
b = b + 2
b

5

To nobody's surprise `a` is still 3. This is because `b + 2` created a new integer object and `b` now refers to it, while `a` still refers to 3.

In [24]:
a

3

This behavior looks like __pass by value__.

However, mutable object types like lists behave like __pass by reference__.
Here, I create a list object which I will call `x` and I assign it the values 1, 2 and 3.

In [28]:
x = [1, 2, 3]
x

[1, 2, 3]

I assign my list object to a second variable, just as I did with the integer `a` above:

In [None]:
y = x
y

Now, let's append 10 to `y`.

In [30]:
y.append(10)
y

[1, 2, 3, 10]

Anyone want to guess what happened to `x`?

This behavior looks like __pass by reference__.

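What happened is that `x` and `y` are __not__ two different objects. They are two references to the __same__ object. Let me repeat this: `x` and `y` are both __pointers to the same list object__. If you change `x`, you change `y`. The same was true of `a` and `b` right after `b = a`; the difference is that `b = b + 2` made `b` refer to a new object, whereas `append` modifies the existing list in place.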
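
You can check this yourself with the `is` operator and the built-in `id()` function, which both compare object identity rather than value. A short sketch:

```python
x = [1, 2, 3]
y = x  # no copy is made: y is a second name for the same list

print(x is y)          # True, both names refer to one object
print(id(x) == id(y))  # True

y.append(10)
print(x)               # [1, 2, 3, 10]
```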

This means that if you pass a list into a function, what you do to the list inside the function __propagates to the outside__. This is because the function does not create a new list object; it operates directly on the list object you passed in. It has been given a __reference__, not a __copy__. Consider:

In [33]:
x = [1,2,3]
x

[1, 2, 3]

In [34]:
def my_function(my_list):
    my_list.append(10)
    print(my_list)

In [37]:
my_function(x)

[1, 2, 3, 10]


In [38]:
x

[1, 2, 3, 10]

Compare this to what happened above with my_int.
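
If you do not want the function to touch the original list, one option is to hand it a copy. A minimal sketch, assuming a shallow copy made with `list.copy()` is enough (the function name `append_ten` is made up for this illustration):

```python
def append_ten(my_list):
    my_list.append(10)
    return my_list

x = [1, 2, 3]
result = append_ten(x.copy())  # the function works on a shallow copy

print(result)  # [1, 2, 3, 10]
print(x)       # [1, 2, 3], the original is untouched
```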

## So is Python call by reference or call by value?

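Neither. Python is 'call by object reference': a function always receives a reference to the object you pass in. Immutable objects (strings, tuples, integers) cannot be changed in place, so they behave as if they were passed by value. Mutable objects can be changed in place, so they behave as if they were passed by reference.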
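
The difference between rebinding a name and changing an object in place can be sketched with two hypothetical functions, `rebind` and `mutate`, that receive the same list:

```python
def rebind(my_list):
    my_list = my_list + [10]  # builds a new list and rebinds the local name only
    return my_list

def mutate(my_list):
    my_list.append(10)        # changes the caller's list in place
    return my_list

x = [1, 2, 3]

rebind(x)
print(x)  # [1, 2, 3]

mutate(x)
print(x)  # [1, 2, 3, 10]
```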

Time for coffee.

## Group exercises - part 1

Build functions that carry out the requested task, then use them on the given data. Work in groups for ~20 minutes.

### Exercise 1

#### Calculate absolute difference

Write a function that returns the absolute (positive) difference between two numbers.

Function:

In [None]:
def abs_difference(x, y):
    pass # placeholder statement where your code goes

Data: 

In [8]:
a = 25
b = 65

### Exercise 2

#### Calculate squares

Write a function that returns the square root of the sum of squares of two numbers. 

> Hint: You can use `math.sqrt` to calculate the square root.

Function:

In [18]:
from math import sqrt # see modules and classes notebook for more explanations
sqrt(25)

def square_root_of_squares():
    pass  # YOUR CODE GOES HERE

Data:

In [19]:
x = 5
y = 4

### Exercise 3

#### Indicate sign of difference

Write a function that takes two numbers and returns "Positive" if their difference is positive, and "Negative" if their difference is negative.

Function:

Data:

In [None]:
x = 10
y = 5

### Exercise 4

#### Calculate sum and differences

Write a function that returns **both** the sum of the first two inputs and the difference between the second and third input: 

Function:

Data:

In [None]:
a = 10
b = 15
c = 20

### Exercise 5

#### Function overloading: Different behaviour for `int` and `str`

Write a function that adds two numbers together if the inputs are both numbers, and concatenates the inputs if they are both strings.

> Hint: You can use built-in functions [`type`](https://docs.python.org/3/library/functions.html#type) or [`isinstance`](https://docs.python.org/3/library/functions.html#isinstance) to find out the type of a variable.

Function:

In [None]:
def add_stuff():
    pass
    #your code here

Data:

In [None]:
x = "Hello"
y = "World"

In [None]:
a = 3
b = 5

## Using Functions to Wrap Code

In general, we want to make code as abstract as possible, but at the same time we should be specific about what we are trying to accomplish.  These two goals - abstraction and specificity - are often at odds with each other.  Encapsulation solves this problem by allowing us to put specific code inside function definitions, and abstract code in the code that calls it.

Usually, the process of producing this code format follows three steps:
  1. Write code that works.  (Focus on the specifics)
  2. Wrap it in a function.  (Encapsulate it)
  3. Call the function in your script (Abstract it)
 
 
For example:
```python
data = [2, 6, 3, 7, 8, 9, 1]
squares = []
for el in data:
    square = el ** 2
    squares.append(square)
sum(squares)
```

will become:

```python
def sum_of_squares(data):
    """Compute the sum of squares."""
    squares = []
    for el in data:
        square = el ** 2
        squares.append(square)
    return sum(squares)


data = [2, 6, 3, 7, 8, 9, 1]
sum_of_squares(data)
```

Let's practice doing this with various types of loops:

## Group exercises - part 2

Take the following working Python code and rewrite it so that it uses functions. Work again in groups for ~20 minutes.

### Exercise 1

#### Square numbers

The code below squares all of the numbers and removes all of the strings from the list.  Make it into a function, **square_numbers**:

In [None]:
data = [5, "missing", 54, "bad", 3, 6]
good_data = []
good_data_squared = []
idx = 0
while idx < len(data):
    el = data[idx]
    if isinstance(el, int):
        good_data_squared.append(el ** 2)
    idx += 1
good_data_squared

Put the modified code below:

### Exercise 2

#### Calculate the standard deviation

The code below calculates the **standard deviation** of the data.  Put it in a function called **`standard_deviation`** and use the function on the data:

In [None]:
import math

data = [2, 6, 8, 2, 5, 8, 9, 2]

mean = 0
std = 0
for el in data:
    mean += el / len(data)
dev_squareds = []
for el in data:
    dev = (el - mean) ** 2
    dev_squareds.append(dev)
sum_dev_squareds = sum(dev_squareds)
standard_dev = sum_dev_squareds / len(data) * 1.
standard_dev = math.sqrt(standard_dev)
standard_dev

Put the modified code below:

### Exercise 3

#### Bootstrap mean

The code below generates **bootstrap** samples of the data: it draws a random selection of the data **n_boot** times and calculates the mean of each sample, so that many estimates of the mean can be made from a single dataset.  Put it in a function called **bootstrap_means**.

In [None]:
import random

data = [2, 6, 8, 2, 5, 8, 9, 2, 6, 2, 10]
n_boot = 5
means = []
for rep in range(n_boot):
    sample = random.choices(data, k=len(data))
    mean = sum(sample) / len(sample)
    rep = rep * 2
    means.append(mean)
means


Put the modified code below:

### Exercise 4

#### Build a function to modify a file

Based on what we have seen in the previous lectures (Importing data, Conditionals and Loops): 

Build a function that takes the name of a country as a parameter and reads the file 'data/sample.txt'. If the country is not in the file, the function should add it as a new line; otherwise it should print that the country already exists (you can use a formatted string).

## Extra

### Functions Applied to Functions

Note that you can also pass functions as arguments, like in the example below. `call` is a function that calls the function passed to it (`fn`) on some data passed to it (`arg`).

Functions that operate on other functions are called "higher-order functions".


In [1]:

def mult_by_five(x):
    return 5 * x

def call(fn, arg):
    """Call fn on arg"""
    return fn(arg)

def squared_call(fn, arg):
    """Call fn on the result of calling fn on arg"""
    return fn(fn(arg))

In [2]:
print(call(mult_by_five, 1),
      squared_call(mult_by_five, 1), 
      sep='\n')

5
25
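
Several built-ins are higher-order functions too. As a short sketch, `sorted()` accepts a function through its `key` parameter and `map()` applies a function to every element (the list `words` is made-up sample data):

```python
words = ['banana', 'fig', 'cherry']

# sorted() calls the key function on every element and sorts by the results
by_length = sorted(words, key=len)
print(by_length)  # ['fig', 'banana', 'cherry']

# map() applies a function to each element; list() collects the results
lengths = list(map(len, words))
print(lengths)    # [6, 3, 6]
```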


### Refactoring Code: Improving it without breaking it

By modularizing code, we give ourselves the ability to modify small parts of the code without having to worry about the rest of it. So long as the inputs and outputs don't change, anything that happens in the middle doesn't matter to the code that calls the function!

The exercises above all contain things that can be improved, whether that is to make them simpler, more readable, or more reliable. Any improvements are helpful.  Let's work through them again and make the following improvements:

  1. **Remove Orphan Code**: Often, code that isn't actually used by the function is left sitting there, lost and forgotten.  Deleting those lines will make it easier to see how everything works and improve readability.
  2. **Change variable names to something clearer**: variables like x and y are not helpful.  Make them something that represents the result of the line!
  3. **Reduce the number of steps**: If there are several lines doing something you think is simple, either compress them to a single line or make a new function that represents that action.  If you know that the function you want already exists in another package, then import that package and use it!
  4. **Convert While loops to For loops**: If you see iteration happening, use a for-loop!  If it's a single action, why not make it a comprehension?
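
As an illustration of the last point, here is a sketch of the same toy task (doubling every number, chosen only for this example) written three ways. The inputs and outputs stay the same, so the calling code does not need to change:

```python
def double_all_while(numbers):
    """Original style: manual index handling with a while loop."""
    doubled = []
    idx = 0
    while idx < len(numbers):
        doubled.append(numbers[idx] * 2)
        idx += 1
    return doubled

def double_all_for(numbers):
    """Refactored: a for-loop removes the index bookkeeping."""
    doubled = []
    for number in numbers:
        doubled.append(number * 2)
    return doubled

def double_all(numbers):
    """Refactored further: a list comprehension does it in one line."""
    return [number * 2 for number in numbers]

data = [1, 2, 3]
print(double_all_while(data) == double_all_for(data) == double_all(data))  # True
```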

### **Exercises**

Refactor the functions created in the previous section.  Make sure to re-run the code with each change you make to verify that it still works!