<img src="./intro_images/MIE.PNG" alt="notebook banner image" width="100%" align="left" />

<table style="float:right;">
    <tr>
        <td>                      
            <div style="text-align: right"><a href="https://alandavies.netlify.com" target="_blank">Dr Alan Davies</a></div>
            <div style="text-align: right">Senior Lecturer Health Data Science</div>
            <div style="text-align: right">University of Manchester</div>
         </td>
         <td>
             <img src="./intro_images/alan.PNG" alt="Alan Davies photo" width="30%" />
         </td>
     </tr>
</table>

# 7.0 Functions
****

#### About this Notebook
This notebook introduces creating our own <code>functions</code> that can be used to write modular reusable code. We also introduce the concept of <code>recursion</code> where we can call a function from within itself.

<div class="alert alert-block alert-warning"><b>Learning Objectives:</b> 
<br/> At the end of this notebook you will be able to:
    
- Explore how we can write our own custom functions to carry out specific tasks

- Explore the concept of recursion

</div> 

<a id="top"></a>

<b>Table of contents</b><br>

7.1 [Function comments](#funccomments)

7.2 [Variable scope](#scope)

7.3 [Anonymous functions](#anon)

7.4 [Recursion](#recursion)

We have already been using functions in R. For example <code>print()</code> is a function, as are <code>length()</code> and <code>lapply()</code>. We use functions to make our code more modular and to contain code that we may need to repeat several times. We also use functions to carry out specific tasks, for example converting a temperature between different units. To make a function in R we use the <code>function</code> keyword and assign the result to a function name using <code>&lt;-</code> (as with variables, try to make the name descriptive of what the function does). We can also provide any parameters that we may want to pass into a function. Functions can optionally take input values and return an output.

<div class="alert alert-success">
<b>Note:</b> Parameters are variables that we can pass into a function for the function to process internally. Parameters are optional. Not all functions have parameters.
</div>

In [1]:
my_hello_function <- function()
{
    print("Hello world!!")
}

You will notice when you run the cell above that nothing happens. This is because to run the code contained within a function we first need to <code>call</code> the function. We do this by using the function's name followed by parentheses (round brackets). All the code inside braces <code>{}</code> belongs to the function and will only execute (run) when the function is called.

In [2]:
my_hello_function()

[1] "Hello world!!"


We can pass variables (parameters) to a function so that the values can be used internally by the function. For example we could extend the function to take a string input value and display that message instead of a hard coded one.

In [3]:
display_message <- function(msg)
{
    print(msg)
}

Now we can pass in a custom message as shown below. Essentially we have just created a wrapper function for the <code>print()</code> function. In this case, as we do not do any other processing of the input, there is no advantage to doing this over just using <code>print()</code>.

In [4]:
display_message("Say Hi")
display_message("Say something else")

[1] "Say Hi"
[1] "Say something else"


We could improve this by turning it into a simple logging function that adds the date and time to the message that is passed in as a parameter:

In [7]:
display_logger <- function(msg)
{
    today <- paste0(Sys.time())
    cat(today, ": ", msg)
}

In [8]:
display_logger("Write a log message!")

2022-07-18 14:44:05 :  Write a log message!

We can also pass in multiple values to functions separating them with commas (<code>,</code>).

In [13]:
print_person_data <- function(persons_name, persons_age)
{
    cat("Name:", persons_name)
    cat("\n\rAge: ", persons_age)
}

print_person_data("Dave", 56)

Name: Dave
Age:  56

We can also <code>return</code> or pass back an output from our function, for example the outcome of a calculation that we might want to use later on.

In [14]:
add_numbers <- function(n1, n2)
{
    return(n1 + n2)
}

In [15]:
answer <- add_numbers(5, 2)
print(answer)

[1] 7


<div class="alert alert-success">
<b>Note:</b> Functions can have optional <code>input</code> (parameters) and <code>output</code> (return) values.
</div>

We can also cut out the step above of storing the returned value in a variable, which is unnecessary if we don't need to use the value again. Instead we could just print the output directly.

In [16]:
print(add_numbers(5, 2))

[1] 7
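
Returning to the temperature conversion mentioned at the start of the notebook, the sketch below shows how parameters and return values might be combined. The function names <code>celsius_to_fahrenheit</code> and <code>fahrenheit_to_celsius</code> are hypothetical names chosen for this illustration; the formulas are the standard conversions.

```r
# Hypothetical helper: convert a temperature from Celsius to Fahrenheit
celsius_to_fahrenheit <- function(temp_c)
{
    return(temp_c * 9 / 5 + 32)
}

# Hypothetical helper: convert a temperature from Fahrenheit to Celsius
fahrenheit_to_celsius <- function(temp_f)
{
    return((temp_f - 32) * 5 / 9)
}

# Store the returned value so it can be reused in a second calculation
body_temp_f <- celsius_to_fahrenheit(37)
print(body_temp_f)
print(fahrenheit_to_celsius(body_temp_f))
```

Because each function returns its result rather than printing it, the output of one can be passed straight into the other.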


<div class="alert alert-block alert-info">
<b>Task 1:</b>
<br> 
1. Create a vector called <code>nums</code> with the following values 1, 4, 5, 2, 1, 6<br />
2. Write a function called <code>avg</code> to return the average of these numbers (add up all the numbers and divide by the count)<br>
$$
\frac{x_1 + x_2 + ... + x_n}{n}
$$
</div>

In [29]:
nums <- c(1, 4, 5, 2, 1, 6)

avg <- function(nums)
{
    total <- sum(nums)
    return(total / length(nums))
}

print(avg(nums))

[1] 3.166667


Another useful feature in R is the ability to provide a default value for a function parameter. Let's say we wanted to write a function to output a worker's name and job title. We might have a lot of scientists in the company, so we could set this as the default value.

In [32]:
display_name_title <- function(persons_name, persons_role = "Scientist")
{
    cat(persons_role, persons_name)
}

In [33]:
display_name_title("Alan Smith")

Scientist Alan Smith

This automatically uses <code>Scientist</code> as the default role. This can be overridden by supplying a value, for example:

In [34]:
display_name_title("Paul Gantt", "Manager")

Manager Paul Gantt
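
Arguments can also be supplied by name, in which case their order no longer matters. The sketch below assumes a hypothetical variant of the function above, <code>format_name_title</code>, that returns the text rather than printing it.

```r
# Hypothetical variant of display_name_title that returns the text instead of printing it
format_name_title <- function(persons_name, persons_role = "Scientist")
{
    return(paste(persons_role, persons_name))
}

# Only the name is supplied, so the default role is used
print(format_name_title("Alan Smith"))

# Named arguments can be given in any order
print(format_name_title(persons_role = "Manager", persons_name = "Paul Gantt"))
```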

If we have a variable number of parameters that we want to use we can use the ellipsis (<code>...</code>). Inside the function, <code>list(...)</code> collects all the supplied values into a list. Let's say we had a team whose number of members could differ.

In [35]:
team_players <- function(...)
{
    players <- list(...)
    print(players)
}

In [36]:
team_players("Adam", "David", "Barry", "Steve")

[[1]]
[1] "Adam"

[[2]]
[1] "David"

[[3]]
[1] "Barry"

[[4]]
[1] "Steve"



In [37]:
team_players("Paul", "Stan")

[[1]]
[1] "Paul"

[[2]]
[1] "Stan"
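
As a further illustration, the values captured by <code>...</code> can be processed like any other collection. This is a minimal sketch using a hypothetical function, <code>count_players</code>, which combines the arguments into a vector with <code>c(...)</code> and returns how many were supplied.

```r
# Hypothetical function: report and return the number of players passed in
count_players <- function(...)
{
    players <- c(...)    # combine the supplied arguments into a single vector
    cat("Team of", length(players), ":", paste(players, collapse = ", "), "\n")
    return(length(players))
}

team_size <- count_players("Adam", "David", "Barry", "Steve")
```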



<div class="alert alert-success">
<b>Note:</b> For more than around 3 parameters we would typically use a data structure like a <code>list</code> or <code>vector</code> to store the arguments we want to pass into a function. This keeps the code cleaner than having a massive list of comma-separated parameters.
</div>
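
One way to apply the note above is sketched here. The function <code>describe_patient</code> and the field names in the list are hypothetical, chosen only to show a single list parameter standing in for several separate ones.

```r
# Hypothetical function that takes one list instead of many separate parameters
describe_patient <- function(patient)
{
    return(paste0(patient$name, " (", patient$age, " years, ward ", patient$ward, ")"))
}

# All of the related values travel together in a named list
patient <- list(name = "Dave", age = 56, ward = "B2", weight_kg = 70)
print(describe_patient(patient))
```

Adding another field to the list later does not change the function's parameter list.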

<img src="./intro_images/circ.PNG" width="90%" align="left" />

<div class=accessibility>
<b>Accessibility:</b> The cell above illustrates an image of a circle. It also indicates the diameter and radius of the circle.
</div>

In [38]:
diameter <- 12

circles <- function(d)
{
    c <- pi * d
    r <- d / 2
    a <- pi * r^2
    
    cat("Circumference = ",c)
    cat("\nRadius = ",r)
    cat("\nArea =", a)
}

circles(diameter)

Circumference =  37.69911
Radius =  6
Area = 113.0973

<div class="alert alert-block alert-info">
<b>Task 2:</b>
<br>
Regarding the function above that outputs the circumference, radius and area of a circle given a diameter.<br /> 
1. How could the function be redesigned to be more modular and reusable?<br />
2. Have a go reimplementing this function as several smaller functions that carry out a specific task (i.e. one for circumference, area and radius).
</div>

In [40]:
circle_circumference <- function(d)
{
    return(pi * d)
}

circle_radius <- function(d)
{
    return(d / 2)
}

circle_area <- function(d)
{
    return(pi * (d/2)^2)
}

diameter <- 12
cat("Circumference = ",circle_circumference(diameter))
cat("\nRadius = ",circle_radius(diameter))
cat("\nArea =", circle_area(diameter))

Circumference =  37.69911
Radius =  6
Area = 113.0973
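
A benefit of small single-purpose functions is that they can be reused by one another. The sketch below is one possible variation (not the only reasonable design) in which the area function calls the radius function instead of repeating the calculation, and is then applied to several diameters with <code>sapply()</code>.

```r
circle_radius <- function(d)
{
    return(d / 2)
}

# Reuse circle_radius rather than repeating d / 2
circle_area <- function(d)
{
    r <- circle_radius(d)
    return(pi * r^2)
}

# Apply the same function to several diameters
diameters <- c(2, 12, 20)
areas <- sapply(diameters, circle_area)
print(areas)
```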

<a id="funccomments"></a>
#### 7.1 Function comments

It can be a good idea to provide function level comments to your code to explain what a function does. The level of detail is up to you. Here are two examples. The first is a lightweight approach; the second provides more detail about the usage of the function.

In [41]:
# function to return result of addition of two numbers
add_two_nums <- function(n1, n2)
{
    return(n1 + n2)
}

In [42]:
# ---------------------------------------------------------------------------------
# FUNCTION:     add_two_nums
# INPUT:        int, int
# OUTPUT:       int
# DESCRIPTION:  Function to return result of addition of two numbers
#               
# ---------------------------------------------------------------------------------
add_two_nums <- function(n1, n2)
{
    return(n1 + n2)
}

Of course you don't have to add comments to your functions, but picking a consistent method and using it to document your functions increases the readability of your code, especially for large programs with multiple contributors. This will save people having to read the code to try and work out what the function does. Combining this documentation with clear and descriptive variable and function names is very helpful to aid others (and yourself if you return to the code later) in understanding what your function does and how it is intended to be used. 
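
A third option, shown here only as a sketch, is the roxygen2 comment style that is widely used in R packages. Each documentation line starts with <code>#'</code>, and tags such as <code>@param</code> and <code>@return</code> describe the inputs and output. Outside of a package these lines behave as ordinary comments, so the code runs without roxygen2 installed.

```r
#' Add two numbers
#'
#' @param n1 A numeric value.
#' @param n2 A numeric value.
#' @return The sum of n1 and n2.
add_two_nums <- function(n1, n2)
{
    return(n1 + n2)
}

print(add_two_nums(5, 2))
```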

<div class="alert alert-block alert-info">
<b>Task 3:</b>
<br> 
1. Write a function to calculate Body Mass Index (BMI) $$BMI = w \div h^2 $$ This is the weight in kilograms divided by the height in meters squared. The height and weight should be parameters passed to the function.<br />
2. Using <code>if</code> statements in the function - output the weight classification: less than 18.5 is underweight, between 18.5 and 24.9 is healthy weight and more than 24.9 is overweight.
</div>

In [43]:
calculate_BMI <- function(weight_kg, height_m)
{
    BMI <- weight_kg / height_m^2
    cat("BMI =", BMI)
    if(BMI < 18.5) {
        cat("\nUnderweight")
    }else if(BMI >= 18.5 && BMI <= 24.9){
        cat("\nHealthy weight")
    }else if(BMI > 24.9){
        cat("\nOverweight")
    }
}

calculate_BMI(70, 1.5)

BMI = 31.11111
Overweight

<a id="scope"></a>
#### 7.2 Variable scope

You can think of the code inside a function as self-contained. This means that a variable with the same name inside a function is actually a different variable to one with the same name outside of a function. This is best illustrated with an example.

In [44]:
x <- 10

my_function <- function()
{
    x <- 7    
    cat("x inside function =", x)
}

my_function()
cat("\nx outside function =", x)

x inside function = 7
x outside function = 10

Here we have 2 variables both called <code>x</code>. The version of <code>x</code> outside of the function contains the value 10, whereas the one inside the function contains the value 7. These are 2 separate variables both with the same name. This is termed the <code>scope</code> of the variable. We can see when we output the values that we get 2 different results (7 and 10). A function can read a variable defined outside of it, but assigning with <code>&lt;-</code> inside the function creates (or updates) a local variable only. One way to change a <code>global variable</code> (a variable with <code>global</code> scope) from inside a function is to assign to it using <code>&lt;&lt;-</code>.

In [45]:
x <- 10

my_function <- function()
{
    x <<- x + 5
    cat("x =", x)
}

my_function()

x = 15

Here we use <code>&lt;&lt;-</code> inside the function to tell R that the <code>x</code> being assigned is the same <code>x</code> as the one outside of the function. When we add 5 to the value of <code>x</code> (which is 10) we get 15, and the <code>x</code> outside of the function now also holds 15.
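
The difference between <code>&lt;-</code> and <code>&lt;&lt;-</code> inside a function can be seen side by side in the minimal sketch below. The names <code>counter</code>, <code>increment_local</code> and <code>increment_global</code> are hypothetical and used only for this illustration.

```r
counter <- 0

increment_local <- function()
{
    counter <- counter + 1     # creates a local copy; the outer counter is unchanged
    return(counter)
}

increment_global <- function()
{
    counter <<- counter + 1    # assigns to the counter defined outside the function
    return(counter)
}

local_result <- increment_local()
cat("After increment_local, counter =", counter, "\n")

global_result <- increment_global()
cat("After increment_global, counter =", counter, "\n")
```

Both functions return 1, but only the second call changes the value of <code>counter</code> outside the function.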

<div class="alert alert-block alert-info">
<b>Task 4:</b>
<br> 
1. Try removing the <code>&lt;&lt;-</code> from the code above and passing <code>x</code> into the function as a parameter.<br />
2. Output the value of <code>x</code> inside the function and after calling the function.<br />
3. What do you expect the value of <code>x</code> to be in both cases?
</div>

In [47]:
x <- 10

my_function <- function(x)
{
    x <- x + 5
    cat("x in function =", x)
}

my_function(x)
cat("\nx =", x)

x in function = 15
x = 10

<div class="alert alert-success">
<b>Note:</b> Global variables are useful when you want to share a value with many functions and want to avoid passing it in and out of multiple functions. It is good practice however to use the smallest number of global variables needed as there is a risk they could be altered in unexpected ways if they are being used in multiple places. 
</div>

<a id="anon"></a>
#### 7.3 Anonymous functions

Sometimes you need to write a quick, disposable, one-time function to carry out some task and don't want to declare a complete function. R achieves this with what are known as <code>anonymous functions</code>. Consider writing a function to return the sum of two numbers. We might write a function that looks something like this:

In [48]:
add_numbers <- function(n1, n2)
{
    return(n1 + n2)
}

In [49]:
cat("Result =", add_numbers(2, 5))

Result = 7

We can achieve the same with a throwaway anonymous function, which is useful if we just want to use a function once.

In [65]:
cat("Result =", (function(n1, n2){n1 + n2}) (8, 2))

Result = 10
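
Anonymous functions are most often seen as arguments to other functions. As a sketch, the example below passes anonymous functions to the base R functions <code>sapply()</code> and <code>Filter()</code>; the vector <code>nums</code> reuses the values from Task 1.

```r
nums <- c(1, 4, 5, 2, 1, 6)

# Square every value using an anonymous function
squares <- sapply(nums, function(n) n^2)
print(squares)

# Keep only the even values using an anonymous function as the test
evens <- Filter(function(n) n %% 2 == 0, nums)
print(evens)
```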

<a id="recursion"></a>
#### 7.4 Recursion

Another concept relating to functions is that of <code>recursion</code>. We have seen how we can use <code>iteration</code> in the form of loops and the apply functions to repeat actions. We can also have nested loops and this nesting can be very deep. There is however a limit to this. To overcome this we can use recursion to get a function to call itself over and over. Certain problems lend themselves to recursion and it is a technique often used in algorithm design.

Let's look at a classic problem that can be solved with recursion: the <code>tower of Hanoi</code>. This is a mathematical puzzle where you have 3 pegs and have to move disks from one peg to another, one at a time, such that no larger disk is ever placed on top of a smaller disk. The task is to do this in the minimum number of moves possible. The animation below shows this in action.

<img src="./intro_images/tower.gif" width="500" />

<div class=accessibility>
<b>Accessibility:</b> The illustration above shows the Hanoi tower game animation.
</div>


So if we write a function that calls itself and pass in the number of disks (4) we can see how many moves it takes (15). You can count the moves in the animation to check.

In [66]:
hanoi <- function(n)
{
    if(n == 1){
        return(1)
    }
    return(2 * hanoi(n - 1) + 1)
}

In [67]:
cat("Number of moves for 4 disks =", hanoi(4))

Number of moves for 4 disks = 15

For 4 disks it is actually doing this:<br />
$ = 2 \times hanoi(3) + 1 $ <br />
$ = 2 \times (2 \times hanoi(2) + 1) + 1 $ <br />
$ = 2 \times (2 \times (2 \times hanoi(1) + 1) + 1) + 1 $ <br />
$ = 2 \times (2 \times (2 \times 1 + 1) + 1) + 1 $ <br />
$ = 2 \times (2 \times (3) + 1) + 1 $ <br />
$ = 2 \times (7) + 1 $ <br />
$ = 15 $
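
Unrolling the recursion like this suggests a pattern: the result for $n$ disks is $2^n - 1$ (for 4 disks, $2^4 - 1 = 15$). The sketch below redefines <code>hanoi</code> so that it is self-contained and compares the recursive result with this closed form for a few values.

```r
hanoi <- function(n)
{
    if(n == 1){
        return(1)
    }
    return(2 * hanoi(n - 1) + 1)
}

# Compare the recursive result with 2^n - 1 for 1 to 6 disks
for(n in 1:6)
{
    cat(n, "disks:", hanoi(n), "moves, 2^n - 1 =", 2^n - 1, "\n")
}
```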

The <code>Fibonacci</code> sequence is a number sequence (featured in the book and film The Da Vinci Code) where the next number in the sequence is found by summing the previous 2 numbers in the sequence. It looks like this:<br />
$$0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ... $$
So $(0 + 1 = 1)$ and $(1 + 1 = 2)$ and $(1 + 2 = 3)$ and so on.

<div class="alert alert-block alert-info">
<b>Task 5:</b>
<br> 
Given the information above about the Fibonacci sequence: <br />
Write a function using <code>recursion</code> to return a value of the sequence provided as input to the function.<br />
    <strong>Hint:</strong> You can use a loop when calling the function to output the results and pass the loop counter into the function i.e.<br>
<code>n &lt;- 6
for(i in 0:n)
{
    cat(fib_sequence(i), " ")
}
</code>
</div>

In [70]:
fib_sequence <- function(n)
{
    if(n <= 1){
        return(n)
    }else{
        return(fib_sequence(n-1) + fib_sequence(n-2))
    }
}

n <- 9
for(i in 0:n)
{
    cat(fib_sequence(i), " ")
}

0  1  1  2  3  5  8  13  21  34  
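
For comparison, the same values can be produced with a loop. The recursive version recalculates the same earlier terms many times, whereas the loop below computes each term once. <code>fib_iterative</code> is a hypothetical name, and this is a sketch of one possible loop-based approach rather than a replacement for the recursive solution to the task.

```r
# Hypothetical loop-based equivalent of fib_sequence
fib_iterative <- function(n)
{
    if(n <= 1){
        return(n)
    }
    previous <- 0
    current <- 1
    for(i in 2:n)
    {
        next_value <- previous + current   # sum of the previous 2 numbers
        previous <- current
        current <- next_value
    }
    return(current)
}

for(i in 0:9)
{
    cat(fib_iterative(i), " ")
}
```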

<div class="alert alert-success">
<b>Note:</b> You may wonder why recursion should be used instead of nested loops, especially as you can nest many levels of loops (in the hundreds). Apart from many algorithms making use of recursion, compilers also make use of it to process nested code efficiently. In computer science theory there is also what's known as the <code>Ackermann function</code>. This is a function that grows very large very quickly and becomes difficult to compute without recursive methods.
</div>
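
For the curious, the Ackermann function mentioned in the note can be sketched in a few lines using the standard two-argument definition. The values grow so quickly that only very small inputs are practical; larger ones are likely to exceed R's recursion limits, so the example sticks to small arguments.

```r
# Standard two-argument definition of the Ackermann function
ackermann <- function(m, n)
{
    if(m == 0){
        return(n + 1)
    }
    if(n == 0){
        return(ackermann(m - 1, 1))
    }
    return(ackermann(m - 1, ackermann(m, n - 1)))
}

cat("ackermann(2, 3) =", ackermann(2, 3))
```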

In the next notebook we will explore how we can handle errors in R and test our code using <code>unit tests</code>. Testing helps us to ensure our code works as expected and is reliable.

### Notebook details
<br>
<i>Notebook created by <strong>Dr. Alan Davies</strong>.
<br>
&copy; Alan Davies 2022</i>

## Notes:

In [1]:
# This cell maintains the accessibility of the notebook content.
from IPython.core.display import HTML
def css_styling():
    styles = open("./styles/custom.css", "r").read()
    return HTML(styles)
css_styling()