# __R Session n°6 : tutorial__
M1 MEG UE5 - Claire Vandiedonck
***

## **tutorial on functions**
1. Principle
2. Rules
3. Function results
4. Some examples to better understand
5. Writing your first R functions


---
## Before going further

<div class="alert alert-block alert-danger"><b>Caution:</b> 
Do not work directly on this notebook, so that you do not lose it. Duplicate it, rename the copy (for example by adding your initials), and work on that new copy. To do so, right-click on the file in the left panel and select "Duplicate". Then, still in the left panel, right-click on the copy and select "Rename" to change its name. Finally, open this new version by double-clicking on it. You are ready to start! <br>
<br>
<b>Do not forget to save your notebook regularly</b>: <kbd>Ctrl</kbd> + <kbd>S</kbd>, or click on the 💾 icon at the top left of your notebook, or use the JupyterLab menu "File" then "Save Notebook". You can also save it in HTML format: menu "File" > Export Notebook As > Export notebook as HTML.
</div>

Check your working directory and change it if needed:

In [None]:
#cell 1
getwd()
#setwd('/srv/home/cvandiedonck/megm1_ue5_r/R6') #change with your login!!!

<div class="alert alert-block alert-warning"><b>Warning:</b> you are strongly advised to run the cells in the indicated order. If you want to rerun cells above, you can just restart the kernel to start at 1 again. </div>

## I. Principle
---


### Why write your own R functions?


Writing your own R functions allows an efficient, flexible and rational use of R when you want **to repeat an operation in different situations**.

### Properties

Your own functions have a structure similar to that of native R functions, except that there is no help menu *(we could add one, but it requires another tutorial!)*.

- name
- arguments within parentheses, used to execute the commands
- body of the function: commands executing actions on arguments
- results

### Syntax uses two functions:

1. `function()` followed by `{}`
    - It is important to assign to the function a name not already used in R.
    - You add arguments between `()`: give each of them a name and, optionally, a default value with `=`
    
2. `return()` **inside** the `{}` of the function so that the output of the function can be saved outside the function space. If multiple results, they must be stored in a single output within a list.
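
As a minimal sketch of this syntax, a function definition could look like the one below. The name `add_and_square` and its arguments are hypothetical, chosen only for illustration.

```r
# Hypothetical example showing the four parts: name, arguments, body, result
add_and_square <- function(x, y = 1) {   # y has a default value of 1
    total <- x + y                       # body: commands acting on the arguments
    return(total^2)                      # result made available outside the function
}

add_and_square(2)         # uses the default y = 1, so (2 + 1)^2 = 9
add_and_square(2, y = 3)  # overrides the default, so (2 + 3)^2 = 25
```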

## II. Rules
---

### 1. Do not use the name of a native function

<div class="alert alert-block alert-danger"><b>Caution:</b> 
The name of your own function must not be that of a native R function, otherwise the native function is masked by yours for the rest of your session (or until you remove your function with <code>rm()</code>).
</div>

In the example below, we redefine the existing function `mean()`. As coded, it returns the squared values instead of the mean.

In [None]:
#cell 2
mean <- function(x){
    return(x^2)
}

mean(c(3,4))
rm(mean)

Let's check we got the correct mean() function back:

In [None]:
#cell 3
mean(c(3,4))

### 2. The function space is closed

**The argument names, all the variables created inside the functions and the results exist only within the enclosed function space!**

- All the required objects must be arguments of the function or they must be defined in the body of the function
- There is a risk of unintentionally using an R object that is outside your function: by default, if an object is not defined in your function, R looks for it outside the function


*See example below:*

We first remove all objects in the environment

In [None]:
#cell 4
rm(list = ls())

Let's define a function called `func()` that returns the square of the sum of a variable, passed as an argument, and `a`.

In [None]:
# cell 5
func <- function(x){
    x <- x + a
    return(x^2)
}

Let's try to apply it to the value 2:

In [None]:
# cell 6
func(2)

This returns an error because the object `a` is not found.

But what if `a` already exists in your R environment?

In [None]:
# cell 7
a <- 2
func <- function(x){
    x <- x + a
    return(x^2)
}
func(2)

This time it worked.
You could also apply the function to a vector of length > 1:

In [None]:
# cell 8
func(c(2, 3, 10))

But imagine now we add `a` as a parameter to a function `func2()` to perform an operation with `x` and `a`:

In [None]:
# cell 9
func2 <- function (x,a){
    x <- x + a
    return (x^2)
}

Don't forget that `a` equals 2 in our R environment.

In [None]:
# cell 10
a

So if we apply the function `func2()`, will it use this value of `a` or the one you pass as an argument to the function?
Let's have a look.

In [None]:
# cell 11
func2(2, 3)

So above, the function used 2 for `x` and 3 for `a`. It thus used the value of `a` passed between the `()` and not the value of `a` assigned outside the function.

In [None]:
# cell 12
func2(2, 10)

Similarly, with the two values 2 and 10 above, 10 was used for the second argument instead of the `a` assigned in the R environment.

But what happens if we pass only one argument to the function `func2()`?

In [None]:
# cell 13
func2(2)

It returns an error as the argument `a` is missing and we did not specify a default value for it when defining the function.
The function func2 could not use the value `a=2` present in the workspace.

=> the function space is closed: R first uses the arguments and objects defined in your function.
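
The other side of the closed function space can be sketched as follows: a variable created inside a function does not appear in the R environment afterwards. The names `square_plus_one` and `tmp_inside` are hypothetical and only serve this illustration.

```r
# Hypothetical function that creates a local variable `tmp_inside`
square_plus_one <- function(x) {
    tmp_inside <- x^2        # exists only while the function runs
    return(tmp_inside + 1)
}

res <- square_plus_one(3)
res                          # 10: the returned value is available outside
exists("tmp_inside")         # FALSE: the local variable was not created outside
```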

**Several arguments can be passed to the function:**

They can be of different types:
- numeric
- logical
- factor
- vector
- matrix
- dataframe
- list
- functions

When `...` is specified, it stands for additional arguments that can be passed on to another function.
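
A minimal sketch of `...` forwarding extra arguments to another function is given below. The wrapper name `mean_of_squares` is hypothetical; `mean()` and its `na.rm` argument are the standard base R ones.

```r
# Hypothetical wrapper: extra arguments captured by `...` are forwarded to mean()
mean_of_squares <- function(x, ...) {
    return(mean(x^2, ...))
}

v <- c(1, 2, NA)
mean_of_squares(v)                # NA, because of the missing value
mean_of_squares(v, na.rm = TRUE)  # na.rm is passed on to mean(): (1 + 4) / 2 = 2.5
```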

Arguments are defined by their name or by their order.

In [None]:
# cell 14
func3 <- function (x, a){
    x <- x + 2 * a
    return(x^2)
}

In [None]:
# cell 15
func3(2, 5)

In [None]:
# cell 16
func3(x = 2, a = 5)

In [None]:
# cell 17
func3(a = 5, x = 2)

In [None]:
# cell 18
func3(5, 2)

Thus above we see that when the names of the arguments are not specified, the order set when defining the function is the one used. If you specify the names of the arguments, you may change their order, although this is not recommended because your script will lack clarity.

### 3. Do not hesitate to assign default values to the arguments

In [None]:
# cell 19
func4 <- function(x, a = 4){
    x <- x + a
    return(x^2)
}

In [None]:
# cell 20
func4(2, 5)

In [None]:
# cell 21
func4(2)

In the above example, we passed to the function only one argument. But since we defined a default value to `a` when creating the function, this default value has been used.

## III. Function results
---

- A function can **only return one object.**

- By default **the returned result is the value of the last expression evaluated** in the function body. 

In [None]:
# cell 22
func <- function(x){
    x + 10
    x^2
}
func(2)

Above it returned only the result of the `x^2` computation but not of `x + 10`.

- It is highly recommended to **return the result with the function `return()`.**

In [None]:
# cell 23
func <- function(x){
    temp <- x^2
    return(temp)
}
func(2)

- **If your function creates more than one result, use a `list` to store them and return the list** as in the example below.

In [None]:
# cell 24
func <- function (x) {
    temp1 <- x ^2
    temp2 <- temp1^x
    results <- list(res1 = temp1, res2 = temp2)
    return(results)
}
func(2)
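
Once a list is returned, its elements can be retrieved by name. The sketch below repeats the same logic under a hypothetical name, `powers`, so that it is self-contained.

```r
# Same logic as func() above, under a hypothetical name
powers <- function(x) {
    temp1 <- x^2
    temp2 <- temp1^x
    return(list(res1 = temp1, res2 = temp2))
}

out <- powers(2)
out$res1       # 4
out[["res2"]]  # 16
```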

## IV. Some examples to better understand
---

### 1. Use return() and assign results

Let's again clear all objects and functions from our session.

In [None]:
# cell 25
rm(list = ls())
ls()

We first create a function f1 without using `return()`:

In [None]:
# cell 26
f1 <- function(a, b){
    Op <- a + b
}
f1(a = 6, b = 20)
ls()

The function f1 ran, but its last expression is an assignment to the object "Op", and R returns the value of an assignment invisibly. So, nothing is printed, and "Op" is not listed by `ls()` because it only existed inside the function. To get a visible result, add the function return() inside the function.
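
Strictly speaking, the value of the sum is not lost: R hands back the value of a final assignment invisibly, so it can still be captured. A minimal sketch, using a hypothetical copy of the function named `f1_silent` and the base R function `withVisible()`:

```r
# Hypothetical copy of f1 without return(): the last expression is an assignment
f1_silent <- function(a, b) {
    Op <- a + b
}

f1_silent(6, 20)                        # prints nothing at the console
withVisible(f1_silent(6, 20))$visible   # FALSE: the value is returned invisibly
res <- f1_silent(6, 20)
res                                     # 26
```

Relying on this behaviour makes a script hard to read, which is one more reason to return results explicitly.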

In [None]:
# cell 27
rm(list = ls())
f1 <- function(a, b){
    Op <- a + b
    return(Op)
}
f1(a = 6 ,b = 20)
ls()

However, we do not get the object "Op" itself but the value that was assigned to it. If you want to get the result object in your R environment, you have to assign the result of the function.

In [None]:
# cell 28
rm(list = ls())
f1 <- function(a, b){
    Op <- a + b
    return(Op)
}
res1 <- f1(a = 6, b = 20)
ls()
res1

Now, the object `res1` contains the result of the function!

### 2. Beware of the closed function space

Let's play again with the same function, but we first assign to `a` a value outside of the function.

In [None]:
# cell 29
rm(list = ls())
a <- 27
f1 <- function(a, b){
      Op <- a + b
      return(Op)
}
f1(a = 6, b = 20)
ls()
a

What happened?
- When calling `f1()`, the function used the value `a=6` passed as an argument to the function and not the value `a=27` assigned outside the function.
- When printing `a` after using the function, the printed value is the one assigned to `a` before calling the function. The one inside the function did not overwrite it.

Let's try again, this time with assigning a default value to `a` when creating the function.

In [None]:
# cell 30
rm(list = ls())
a <- 27
f1 <- function(a = 22, b){
      Op <- a + b
      return(Op)
}
f1(a, b = 20)
f1(b = 20)

What happened?
- When f1 was first called, it used the value `a=27` assigned within the R session and not the default one defined in the function.
- When f1 was called the second time, it used the default value `a=22` specified when creating the function.

This second result is probably less surprising than the first one.
Why did f1 not use `a=22` in the first instance? Because in that case, you passed the object `a` from your R session as the value of the argument `a`, which overrides the default. This is highly misleading!

<div class="alert alert-block alert-danger"><b>Caution:</b><br> 
It is thus highly recommended to <b>never use inside your function the name of a variable that exists outside of the function</b>.
</div>
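
One way to follow this advice is sketched below, with hypothetical names (`offset_outside`, `add_offset`): the variable in the R session and the argument of the function no longer share a name, so each call states clearly which value is used.

```r
# Hypothetical names: the outside variable and the argument are named differently
offset_outside <- 27

add_offset <- function(value, offset = 22) {
    return(value + offset)
}

add_offset(20)                           # 42: the default offset = 22 is used
add_offset(20, offset = offset_outside)  # 47: the outside value is passed explicitly
```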

## V. Writing your first R functions
---

The recipe to write your own function is:

- give it a name
- identify the arguments
- write the body of the function
- identify the results to return
- test it!

=> Write a function to compute the body mass index using the height and weight

In [None]:
# cell 31
# w: weight in kg, s: height in m
compute.bmi <- function(w, s){
    z <- w/s^2   # BMI = weight / height^2
    return(z)
}
compute.bmi(65, 1.70)

=> Write a function to compute the coefficient of variation

In [None]:
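# cell 32
# m: mean, s: standard deviation
compute.cv <- function(m, s){
    z <- s/m   # coefficient of variation = standard deviation / mean
    return(z)
}
x <- rnorm(10, mean = 50, sd = 5)   # example sample
compute.cv(mean(x), sd(x))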

<div class="alert alert-block alert-success"><b>Success:</b> Well done! You now know how to create a function and how to avoid classic pitfalls. 
</div>
    

Of course, you can improve your functions by adding warnings if the inputs are not as expected, or by adding some documentation (start with comments!).
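
As a rough sketch of such improvements, the BMI function could check its inputs with the base R functions `stop()` and `warning()`. The name `compute.bmi.checked`, its argument names and the threshold of 3 metres are assumptions made for this illustration only.

```r
# Hypothetical variant of compute.bmi() with basic input checks
compute.bmi.checked <- function(weight_kg, height_m) {
    # stop() aborts with an error when the inputs cannot be used
    if (!is.numeric(weight_kg) || !is.numeric(height_m)) {
        stop("weight_kg and height_m must be numeric")
    }
    # warning() lets the computation continue but flags a suspicious input
    if (any(height_m > 3, na.rm = TRUE)) {
        warning("height_m looks large: is it in metres rather than centimetres?")
    }
    return(weight_kg / height_m^2)
}

compute.bmi.checked(65, 1.70)    # about 22.5
```

The comments at the top of the function and next to each check are a first, lightweight form of documentation.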

***
***

## Useful commands
<div class="alert alert-block alert-info"> 
    
- <kbd>CTRL</kbd>+<kbd>S</kbd> : save notebook<br>    
- <kbd>CTRL</kbd>+<kbd>ENTER</kbd> : Run Cell<br>  
- <kbd>SHIFT</kbd>+<kbd>ENTER</kbd> : Run Cell and Select Next<br>   
- <kbd>ALT</kbd>+<kbd>ENTER</kbd> : Run Cell and Insert Below<br>   
- <kbd>ESC</kbd>+<kbd>y</kbd> : Change to *Code* Cell Type<br>  
- <kbd>ESC</kbd>+<kbd>m</kbd> : Change to *Markdown* Cell Type<br> 
- <kbd>ESC</kbd>+<kbd>r</kbd> : Change to *Raw* Cell Type<br>    
- <kbd>ESC</kbd>+<kbd>a</kbd> : Create Cell Above<br> 
- <kbd>ESC</kbd>+<kbd>b</kbd> : Create Cell Below<br> 

<em>  
To make nice html reports with markdown: <a href="https://dillinger.io/" title="dillinger.io">html visualization tool 1</a> or <a href="https://stackedit.io/app#" title="stackedit.io">html visualization tool 2</a>, <a href="https://www.tablesgenerator.com/markdown_tables" title="tablesgenerator.com">to draw nice tables</a>, and the <a href="https://medium.com/analytics-vidhya/the-ultimate-markdown-guide-for-jupyter-notebook-d5e5abf728fd" title="Ultimate guide">Ultimate guide</a>. <br>
Further reading on JupyterLab notebooks: <a href="https://jupyterlab.readthedocs.io/en/latest/user/notebook.html" title="Jupyter Lab">Jupyter Lab documentation</a>.<br>   
</em>    
 
</div>