# Functions in R

Functions are commands used to transform an object in some way and return an output.

The input to a function is called an "argument". The number of arguments varies between functions.

Functions have the basic syntax: `function(arg1, arg2, arg3)`.
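As a small illustration of this syntax, the calls below use two base R functions. The input values are arbitrary examples chosen for this sketch.

```r
# round() takes a number and, as a second argument, the number of digits to keep
rounded <- round(3.14159, 2)
rounded

# paste() accepts several arguments and combines them into one string
combined <- paste("R", "functions", sep = " ")
combined
```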

## Functions and their arguments

Arguments can be either required or optional. Required arguments are arguments the function needs in order to return an output.

**Required arguments**

In the base function `mean()`, `x` (the object to calculate the mean of) is a required argument. If we try to run the function without it, R throws an error:

In [7]:
mean()

ERROR: Error in mean.default(): argument "x" is missing, with no default


The function needs `x` to work:

In [11]:
numbers <- c(2, 9, 10, 13)
mean(numbers)

**Optional arguments**

Optional arguments are additional arguments that can be given to a function. These arguments can often be seen as "settings" for the function that change some aspect of how it behaves. Optional arguments typically have a default value, which can be found in the documentation of the function.

In the base function `mean()`, `na.rm` is an optional argument. Its default value can be inspected either by looking up the documentation (`?mean`) or by inspecting the source code itself.

The source code of a function can be inspected by typing the function name without parentheses in the console (in RStudio, it is also possible to inspect the source code by placing the cursor on the function name and pressing F2).

(NOTE: `mean()` automatically calls `mean.default()`, which contains the actual source code).

In [10]:
mean.default

As can be seen in the source code, the argument `na.rm` is set to `FALSE`. The documentation describes what the argument does ("a logical value indicating whether NA values should be stripped before the computation proceeds").
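The defaults can also be read without going through the full source. A minimal sketch using the base functions `args()` and `formals()` (neither is mentioned above; they are shown here only as an alternative way to look up defaults):

```r
# Print only the argument list of mean.default(), without the function body
args(mean.default)

# formals() returns the arguments as a list; each element holds the default value
defaults <- formals(mean.default)
defaults$na.rm
defaults$trim
```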

The default value can be changed by passing the argument when calling the function, as long as the input given is valid for the argument. The documentation describes what inputs are valid for the argument. In the case of `na.rm`, a logical value (`TRUE` or `FALSE`) is a valid input.

In [12]:
numbers <- c(2, 9, 10, 13, NA)
mean(numbers) # With default

mean(numbers, na.rm = TRUE) # Default changed

### Specifying arguments

Notice that it is often not necessary to specify the name of the argument. As long as the arguments are given in the right order (the one listed in the documentation), the input values alone are enough. This is why it is not necessary to specify `x` when using `mean()`:

In [14]:
numbers <- c(2, 9, 10, 13)
mean(numbers) # Works

mean(x = numbers) # Also works

When arguments are not named, R assumes that they are specified in order. If the arguments are named, they can be given in any order:

In [16]:
numbers <- c(2, 9, 10, 13, NA)
mean(TRUE, numbers) # Does not work

ERROR: Error in mean.default(TRUE, numbers): 'trim' must be numeric of length one


In [17]:
mean(na.rm = TRUE, x = numbers) # Does work
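The same matching rules apply to other functions. As a sketch, the base function `seq()` (chosen here only for illustration) has the arguments `from`, `to` and `by`:

```r
# Positional matching: values are assigned to from, to and by in that order
by_position <- seq(1, 10, 3)

# Named matching: the order no longer matters
by_name <- seq(by = 3, to = 10, from = 1)

# Compare the two results
identical(by_position, by_name)
```

Naming the arguments makes the call longer but easier to read, since the reader does not have to remember the order given in the documentation.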

## EXERCISE: FUNCTIONS AND THEIR ARGUMENTS

- Inspect the documentation for `str_detect` (from the package `stringr`). Which arguments does the function require, and which are optional?

- The function `head()` is used to return the first 6 rows of a dataframe. Is there a way to make the function return more than 6 rows?

## Creating functions in R

Functions can be created like any other object in R. A function consists of arguments, some code to be executed and a return statement indicating what the function should return.

The code below creates a function for adding 5 to a number:

In [18]:
add5 <- function(x){
    result <- x + 5
    return(result)
}

Running the code does not return any output but makes the function available in the environment:

In [19]:
add5(10)

**Several arguments**

Just like existing functions, your own functions can take any number of arguments. Below, the function is changed to add two input numbers:

In [21]:
add2num <- function(x, y){
    result <- x + y
    return(result)
}

add2num(7, 9)

**Optional arguments in own functions**

Optional arguments can be created by specifying a default value for the argument. Below, the function is changed so that the second number is 10 by default, but it can still be changed when using the function:

In [24]:
add2num <- function(x, y = 10){
    result <- x + y
    return(result)
}

add2num(7) # Using the default value

add2num(7, 8) # Changing the default value

**The return statement**

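The return statement indicates what the function should return. Without a return statement, the function returns the value of its last evaluated expression. In the example below the last expression is an assignment, so the value is returned invisibly and nothing is printed:

In [25]:
add2num <- function(x, y = 10){
    result <- x + y
}

add2num(7) # No return statement - nothing is printed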
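To illustrate what happens when `return()` is left out, here is a minimal sketch. The function names `add_silent` and `add_implicit` are made up for this example. In R, a function without `return()` gives back the value of its last evaluated expression; when that expression is an assignment, the value is returned invisibly, so nothing is printed in the console:

```r
# Last expression is an assignment: the value is returned, but invisibly
add_silent <- function(x, y = 10){
    result <- x + y
}

# Last expression is the value itself: returned and printed as usual
add_implicit <- function(x, y = 10){
    x + y
}

add_silent(7)            # Prints nothing
stored <- add_silent(7)  # The value can still be stored ...
stored                   # ... and printed afterwards
add_implicit(7)          # Prints the value directly
```

Using an explicit `return()` avoids this ambiguity, which is why the examples in this section include one.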

The return statement also marks the end of the function: execution stops when a return statement is reached, so any code following it is never run:

In [26]:
add2num <- function(x, y = 10){
    result <- x + y
    print("This is included in the function")

    return(result)

    # Never reached: execution stops at return() above
    print("This is not included in the function")
}

add2num(7) # Only the first message is printed before the value is returned

[1] "This is included in the function"
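Because execution stops at `return()`, it can also be used to leave a function early. A small sketch, assuming a hypothetical function `safe_divide` that is not part of the original material:

```r
safe_divide <- function(x, y){
    # Leave the function early if the divisor is zero
    if (y == 0){
        return(NA)
    }

    # Only reached when y is not zero
    return(x / y)
}

safe_divide(10, 2)
safe_divide(10, 0)
```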


**Objects inside functions**

Objects created inside a function only exist while the function is running; i.e. the objects are not created outside of the function and do not become part of the accessible environment:

In [32]:
result # Does not exist - only created inside the function

ERROR: Error in eval(expr, envir, enclos): object 'result' not found
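
A self-contained sketch of the same behaviour, using the base function `exists()` to check whether an object is available. The names `add5_demo` and `inside_value` are made up for this example, and the sketch assumes a session where no object called `inside_value` has been created:

```r
add5_demo <- function(x){
    inside_value <- x + 5  # Created inside the function only
    return(inside_value)
}

output <- add5_demo(10)

exists("output")        # TRUE: created outside the function
exists("inside_value")  # FALSE: only existed while the function ran
```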


### Functions and control structures

Control structures like if-else statements can easily be incorporated in a function. In a previous example an if-else statement was used to check whether a number is larger than 10. This could be written as a function instead:

In [28]:
isabove10 <- function(x){
    if (x > 10){
        print("The number is larger than 10!")
    } else {
        print("The number is not larger than 10!")
    }
}

isabove10(12)

[1] "The number is larger than 10!"


Note that this function does not contain a return statement because it simply prints text telling whether or not the number is larger than 10.
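
If the result should be used in later code rather than only shown on screen, the function can return a value instead of printing. A minimal sketch, with the hypothetical name `is_above10`:

```r
is_above10 <- function(x){
    # The comparison itself evaluates to TRUE or FALSE
    return(x > 10)
}

check <- is_above10(12)
check

# The returned value can be used in further code, for example in an if statement
if (is_above10(7)){
    print("Larger than 10")
} else {
    print("Not larger than 10")
}
```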

### Functions in functions

When writing functions, it is possible to incorporate other functions. This makes it possible to create so-called "wrapper functions": functions that use existing functions but with different settings. (For example, `read_csv` from the package `readr` is a wrapper function of `read_delim`).

Here is a simple wrapper function for the `head()` function with a different default:

In [29]:
head10 <- function(x, n = 10){
    head(x, n = n)
}

In [30]:
library(readr)

data <- read_csv("https://github.com/CALDISS-AAU/workshop_R-intro/raw/master/data/ESS2018DK_subset.csv")

head10(data)

Parsed with column specification:
cols(
  idno = col_double(),
  netustm = col_double(),
  ppltrst = col_double(),
  vote = col_character(),
  prtvtddk = col_character(),
  lvpntyr = col_character(),
  tygrtr = col_character(),
  gndr = col_character(),
  yrbrn = col_double(),
  edlvddk = col_character(),
  eduyrs = col_double(),
  wkhct = col_double(),
  wkhtot = col_double(),
  grspnum = col_double(),
  frlgrsp = col_double(),
  inwtm = col_double()
)


idno,netustm,ppltrst,vote,prtvtddk,lvpntyr,tygrtr,gndr,yrbrn,edlvddk,eduyrs,wkhct,wkhtot,grspnum,frlgrsp,inwtm
110,180,8,Yes,Socialdemokratiet - The Social democrats,1968,Never too young,Male,1949,"Kort videregående uddannelse af op til 2-3 års varighed, F.eks. Erhvervsakadem",9,28,28,,,119
705,60,5,Yes,Det Konservative Folkeparti - Conservative People's Party,1976,67,Male,1958,"Kort videregående uddannelse af op til 2-3 års varighed, F.eks. Erhvervsakadem",22,37,45,,,55
1327,240,5,,,"Still in parental home, never left 2 months",,Male,2000,Folkeskole 9.-10. klasse,11,37,37,,,37
3760,300,7,Not eligible to vote,,"Still in parental home, never left 2 months",40,Male,2002,Folkeskole 9.-10. klasse,9,2,2,200.0,,43
4658,90,8,Yes,,1974,50,Female,1956,"Kort videregående uddannelse af op til 2-3 års varighed, F.eks. Erhvervsakadem",4,30,30,,,62
5816,90,7,Yes,SF Socialistisk Folkeparti - Socialist People's Party,1994,60,Male,1974,"Mellemlang videregående uddannelse af 3-4 års varighed. Professionsbachelorer,",35,37,37,37000.0,35000.0,61
7251,300,5,Yes,Dansk Folkeparti - Danish People's Party,1993,40,Female,1975,"Faglig uddannelse (håndværk, handel, landbrug mv.), F.eks. Faglærte, Social-",13,32,34,22000.0,30000.0,68
7887,360,8,Yes,Socialdemokratiet - The Social democrats,1983,55,Male,1958,"Lang videregående uddannelse. Kandidatuddannelser af 5.-6. års varighed, F.eks",25,39,39,36000.0,42000.0,89
9607,540,9,Yes,Alternativet - The Alternative,1982,64,Female,1964,"Mellemlang videregående uddannelse af 3-4 års varighed. Professionsbachelorer,",13,32,34,32000.0,,50
11123,150,7,Yes,Socialdemokratiet - The Social democrats,1994,45,Male,1974,"Faglig uddannelse (håndværk, handel, landbrug mv.), F.eks. Faglærte, Social-",16,37,37,9000.0,,62
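
Another sketch of a wrapper function, this time around the base function `round()`. The name `round2` is made up for this example; the wrapper only changes the default number of digits and passes everything on to `round()`:

```r
round2 <- function(x, digits = 2){
    # Same as round(), but with 2 digits as the default instead of 0
    round(x, digits = digits)
}

round2(3.14159)              # Uses the new default
round2(3.14159, digits = 4)  # The default can still be changed
```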


## EXERCISE: CREATING FUNCTIONS

- Create a wrapper function for the `mean()` function where missing values are removed by default.

- Use the function on a valid variable in the ESS dataset.

### Use cases for creating own functions

Creating your own functions is often not necessary at all when using R but it can come in handy. Here are some examples of possible use cases:

- Creating wrapper functions
- Converting repeated parts of a script to a function
- Creating functions for specific datasets (for example a single function containing various data management tasks for variables in ESS data that are used in several rounds. That way the same function can be used for ESS 2010, 2012, 2014 and so on).
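
As a sketch of the second use case, suppose a script rescales several variables to the range 0 to 1 using the same lines of code each time. The repeated part can be moved into a function. The function name `rescale01` and the example vectors are assumptions made for illustration:

```r
rescale01 <- function(x){
    # Smallest and largest value, ignoring missing values
    rng <- range(x, na.rm = TRUE)
    (x - rng[1]) / (rng[2] - rng[1])
}

# Hypothetical example variables
hours <- c(10, 20, 30, NA)
trust <- c(0, 5, 10)

rescale01(hours)
rescale01(trust)
```

If the rescaling logic needs to change later, it only has to be changed in one place.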