# Functions

**by Serhat Çevikel**

A great resource on why we need functions in R and how to write them is the "Functions" chapter of the book "R for Data Science":

http://r4ds.had.co.nz/functions.html

```
One of the best ways to improve your reach as a data scientist is to write functions. Functions allow you to automate common tasks in a more powerful and general way than copy-and-pasting. Writing a function has three big advantages over using copy-and-paste:

You can give a function an evocative name that makes your code easier to understand.

As requirements change, you only need to update code in one place, instead of many.

You eliminate the chance of making incidental mistakes when you copy and paste (i.e. updating a variable name in one place, but not in another).

...

There are three key steps to creating a new function:

You need to pick a name for the function. Here I’ve used rescale01 because this function rescales a vector to lie between 0 and 1.

You list the inputs, or arguments, to the function inside function. Here we have just one argument. If we had more the call would look like function(x, y, z).

You place the code you have developed in body of the function, a { block that immediately follows function(...).

```
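
The quoted passage mentions `rescale01` without showing it. A minimal sketch of such a function, assuming ordinary min-max rescaling (the body below is an illustration and may differ from the book's exact code), could look like this:

```R
# Sketch: rescale a numeric vector so that its values lie between 0 and 1.
# Assumes x contains at least two distinct finite values.
rescale01 <- function(x)
{
    rng <- range(x, na.rm = TRUE)   # rng[1] is the minimum, rng[2] the maximum
    return((x - rng[1]) / (rng[2] - rng[1]))
}

rescale01(c(0, 5, 10))   # 0.0 0.5 1.0
```

The name says what the function does, the single argument `x` is the input, and the body sits inside the `{` block, matching the three steps listed in the quote.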

We declare a function with the following syntax:

```R
function(arguments) code body
```

However, the main reason to write a function is its reusability. So we had better assign the function to a named object (remember that functions are also objects in R):

```R
function_name <- function(arguments) code body
```

However most of the time our function will be longer than a one-liner. So we should wrap the code body inside a block with curly braces:

```R
function_name <- function(arguments)
{
    code body
    some more code body
}
```

And it is good practice to always end your code block with a return statement for the return value of the function.

```R
function_name <- function(arguments)
{
    code body
    some more code body
    return(some value or object)
}
```

NOTE THAT RETURN ENDS A FUNCTION!


## Function with no parameters

Let's start with a simple one, a one-liner function that accepts no parameters: 

In [None]:
hello_world <- function() return("Hello World")

Let's see what is inside the hello_world() function:

In [None]:
hello_world

And let's call it with "()":

In [None]:
hello_world()

Let's see the challenge of "Jack Torrance" from The Shining:

[![Dull boy](https://img.youtube.com/vi/4lQ_MjU4QHw/0.jpg)](https://www.youtube.com/watch?v=4lQ_MjU4QHw)


He is obsessed with writing this sentence:

"All work and no play makes Jack a dull boy"

thousands of times, let's help him:

In [None]:
dull_boy <- function()
{
    return("All work and no play makes Jack a dull boy")
}

Now let's call it, any number of times we want:

In [None]:
dull_boy()
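
Because the function returns an ordinary character vector, base R's `rep()` can repeat the result as many times as Jack needs. A small sketch, redefining `dull_boy()` so the block runs on its own (the name `pages` is only an illustration):

```R
dull_boy <- function()
{
    return("All work and no play makes Jack a dull boy")
}

# Repeat the returned sentence three times
pages <- rep(dull_boy(), times = 3)
length(pages)   # 3
```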

## Function with a single parameter

This automated and eased Jack's work. But most of the time we need our function to respond to some input and return different values accordingly.

Let's write a function that returns the value $$x^2 + 2x + 1$$

It has one input named x, and the name of the function is plus1squared

**WE SHOULD ALWAYS NAME PARAMETERS!**

In [None]:
plus1squared <- function(x)
{
    return(x^2 + 2*x + 1)
}

Or more concisely:

In [None]:
plus1squared <- function(x)
{
    return((x + 1)^2)
}

Now call it for a single value:

In [None]:
plus1squared(3)

And call it for a vector:

In [None]:
plus1squared(0:4)


Since all the operators/functions we use inside the body are vectorized (^, +, \*), the function can also operate on vectors.
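
To make the vectorized behaviour concrete, the sketch below compares one call on the whole vector with one call per element through `sapply()`. The variable names `vectorized` and `one_by_one` are illustrative only:

```R
plus1squared <- function(x)
{
    return((x + 1)^2)
}

# One call handles the whole vector element by element
vectorized <- plus1squared(0:4)

# The same values, obtained by calling the function once per element
one_by_one <- sapply(0:4, plus1squared)

vectorized                                  # 1 4 9 16 25
isTRUE(all.equal(vectorized, one_by_one))   # TRUE
```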

## Parameters with default values

Now let's not provide a value for the x parameter

In [None]:
plus1squared()

The error message says "with no default": the function needs a value for x so that the body can be evaluated.

We can provide a default value to an argument, that the function may assume in case no value is provided for that argument

In [None]:
plus1squared <- function(x = 1)
{
    return(x^2 + 2*x + 1)
}

Now call it with no values for x:

In [None]:
plus1squared()

And call it with 1

In [None]:
plus1squared(1)

So the function assumed the default value of 1 when we did not provide one
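
One way to check which defaults a function carries is base R's `formals()`, which returns the argument list together with any default values. A short sketch, repeating the definition so that it runs on its own:

```R
plus1squared <- function(x = 1)
{
    return(x^2 + 2*x + 1)
}

# formals() lists the arguments with their default values
formals(plus1squared)$x   # 1

# Omitting x and passing the default explicitly give the same result
identical(plus1squared(), plus1squared(1))   # TRUE
```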

## Functions with multiple line bodies

Let's say we want to do more things inside the function body:

- First calculate the (x+1)^2 value and assign to a variable named "interim"
- Decrement the interim value by three and return it
- Name of function will be square_and_decrement

In [None]:
square_and_decrement <- function(x = 1)
{
    interim <- x^2 + 2*x + 1
    return(interim - 3)
}

In [None]:
square_and_decrement(1)

Next, let's assign "interim - 3" to a variable named "decremented", then take the square root of decremented and return it.

Name the function "square_decrement_root"

In [None]:
square_decrement_root <- function(x = 1)
{
    interim <- x^2 + 2*x + 1
    decremented <- interim - 3
    return(sqrt(decremented))
}

Now call:

- plus1squared()
- square_and_decrement()
- square_decrement_root()

for x = 5

In [None]:
plus1squared(5)
square_and_decrement(5)
square_decrement_root(5)

## Scope of function 

Let's view the code for function square_decrement_root():

In [None]:
square_decrement_root <- function(x = 1)
{
    interim <- x^2 + 2*x + 1
    decremented <- interim - 3
    return(sqrt(decremented))
}

Let's be more parametric. Your function may be a part of a larger body of code. The input value for x argument may be passed by an assigned object, and the return value may be assigned to another object

- Let's call square_decrement_root with a variable named var_1 with value of 5
- Assign the return value of the function to a variable named var_2
- Print var_1 and var_2

In [None]:
var_1 <- 5
var_2 <- square_decrement_root(var_1)

var_1
var_2

Good, we can access var_1 and var_2

How about interim and decremented?

In [None]:
interim

In [None]:
decremented

They were assigned inside the function but do not exist?

Why?

Because those variables are NOT GLOBAL. THEY ONLY EXIST WITHIN THE SCOPE OF THE FUNCTION

THEY CAN BE PRINTED FROM INSIDE THE FUNCTION

In [None]:
square_decrement_print_root <- function(x = 1)
{
    interim <- x^2 + 2*x + 1
    print("interim value is:")
    print(interim)
    
    decremented <- interim - 3
    print("decremented value is:")
    print(decremented)
    
    return(sqrt(decremented))
}

Now use var_3, and var_4 for the same task above

In [None]:
var_3 <- 5
var_4 <- square_decrement_print_root(var_3)

var_3
var_4

The printed values are not captured as the return value, they can be used for debugging purposes.

Here we used print to demonstrate that variables created inside a function exist only inside the scope of the function, not outside.

We pass inputs into a function through its arguments and get its output back through the return() call.
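
The same point can be checked programmatically with base R's `exists()`. In this sketch the names `scope_demo`, `interim_demo` and `result_demo` are made up for illustration:

```R
scope_demo <- function(x = 1)
{
    interim_demo <- (x + 1)^2   # local: lives only while the function runs
    return(interim_demo - 3)
}

result_demo <- scope_demo(5)

result_demo              # 33
exists("result_demo")    # TRUE: assigned outside the function
exists("interim_demo")   # FALSE: assigned only inside the function
```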

## Multiple arguments

Now let's say, we want our code to be more general.

Apart from reusability, another advantage of using functions is to write more general code that can respond to more scenarios 

Let's say our function square_decrement_root still decrements the interim value, but the hard-coded "3" is turned into another parameter named "decr" with no default value.

Let's name it square_decrement_multi

In [None]:
square_decrement_multi <- function(x = 1, decr)
{
    interim <- x^2 + 2*x + 1
    decremented <- interim - decr
    return(sqrt(decremented))
}

Call it with 5 for x and 4 for decr

In [None]:
square_decrement_multi(5, 4)

And with 5 for x and 7 for decr

In [None]:
square_decrement_multi(5,7)

So we can use the function more generally

## Order of arguments and named arguments

When we don't provide the name of the arguments to the function call, the argument values are taken as positional:

The first value goes to the first argument, the second value goes to the second, and so on. So the values must be in the same order as the arguments are defined in the function code.

First value is x and second value is decr

Let's say we only want to provide a value of 3 for decr, and accept the default value for x (1 in our code).

If we use positional parameters:

In [None]:
square_decrement_multi(3)

The 3 value is taken for x and since no value for decr is declared and no default value exists for decr, it throws an error

Now make the same call, but pass 3 explicitly to decr argument

In [None]:
square_decrement_multi(decr = 3)

It is the same as providing 1 and 3 values respectively for x and decr

In [None]:
square_decrement_multi(1, 3)

We can change the order the arguments are provided to the call when we pass them with names

In [None]:
square_decrement_multi(decr = 3, x = 1)

**NOTE THAT EQUAL SIGN IS USED WHEN PARAMETER VALUES ARE PASSED TO FUNCTION CALLS OR DEFAULT VALUES ARE DEFINED FOR PARAMETERS IN FUNCTION DECLARATIONS**

It is good practice to pass arguments by name whenever possible for clarity and debugging purposes (otherwise you have to go back to the function declaration to understand what each positional parameter stands for in a function call).
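
The two calling styles, and the error raised by a missing argument, can be compared side by side. This is a sketch that wraps the failing call in base R's `tryCatch()` so the block keeps running; the replacement message string is made up for illustration:

```R
square_decrement_multi <- function(x = 1, decr)
{
    interim <- x^2 + 2*x + 1
    decremented <- interim - decr
    return(sqrt(decremented))
}

# Named arguments can be given in any order
by_position <- square_decrement_multi(5, 4)
by_name     <- square_decrement_multi(decr = 4, x = 5)
identical(by_position, by_name)   # TRUE

# A single positional value is matched to x, so decr is missing and R raises an error
outcome <- tryCatch(square_decrement_multi(3),
                    error = function(e) "error: decr is missing")
outcome   # "error: decr is missing"
```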

## Importance of return

Now take square_decrement_multi again:

In [None]:
square_decrement_multi <- function(x = 1, decr)
{
    interim <- x^2 + 2*x + 1
    decremented <- interim - decr
    return(sqrt(decremented))
}

Let's create a new version:

The sqrt(decremented) statement is assigned to root_1 but no return statement exists

The name of the function is square_decrement_noreturn

In [None]:
square_decrement_noreturn <- function(x = 1, decr)
{
    interim <- x^2 + 2*x + 1
    decremented <- interim - decr
    root_1 <- sqrt(decremented)
}

Now call it with 7, 7

In [None]:
square_decrement_noreturn(7,7)

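Nothing is printed? Why?

Because we did not explicitly return a value: the last expression is an assignment, and R returns the value of an assignment invisibly, so nothing is shown at the console.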

Now let's rewrite it again so that last line returns root_1

Name it square_decrement_root_1

In [None]:
square_decrement_root_1 <- function(x = 1, decr)
{
    interim <- x^2 + 2*x + 1
    decremented <- interim - decr
    root_1 <- sqrt(decremented)
    return(root_1)
}

Call both square_decrement_multi() and square_decrement_root_1() with named parameters x = 10, decr = 6, in any order

In [None]:
square_decrement_root_1(x = 10, decr = 6)
square_decrement_multi(decr = 6, x = 10)

They do the same thing under the hood. However, with one more step, it is easier to debug square_decrement_root_1 and see what it does at each step. We may need this feature in the future with more complex code.

**IMPORTANT NOTICE FOR RETURN()**

- In different contexts, a function may use **print()** or **cat()** functions: For example print() may be used to provide debug or progress messages during execution or cat can be used when an R script is called from the operating system's shell and the output is required to have a certain format (such as tab separated values). So it is good that you have seen some examples using these functions.

**- HOWEVER IN THE CONTEXT OF CMPE140 QUIZZES OR EXAMS NEVER USE print() or cat() INSIDE YOUR FUNCTION SINCE THEY DO NOT CAPTURE THE RETURN VALUE AS AN R OBJECT AND WE CANNOT TEST THE OUTPUT FROM THESE FUNCTIONS**

## Multiple return values

Now we want our square_decrement_root_1 function to return two values: negative square root and positive square root

Let's name it square_decrement_roots

In [None]:
square_decrement_roots <- function(x = 1, decr)
{
    interim <- x^2 + 2*x + 1
    decremented <- interim - decr
    root_1 <- sqrt(decremented)
    root_2 <- -root_1
    return(root_1, root_2)
}

And call it with x = 9, decr = 3

In [None]:
square_decrement_roots(x = 9, decr = 3)

```
Error in return(root_1, root_2): multi-argument returns are not permitted
Traceback:
```

Why?

Because a function can return only one object! And we called return with two parameters, return can only take one!

In [None]:
?return

```
Usage
function( arglist ) expr
return(value)
Arguments
arglist	
Empty or one or more name or name=expression terms.

expr	
An expression.

value	
An expression.

Details
The names in an argument list can be back-quoted non-standard names (see ‘backquote’).

If value is missing, NULL is returned. If it is a single expression, the value of the evaluated expression is returned. (The expression is evaluated as soon as return is called, in the evaluation frame of the function and before any on.exit expression is evaluated.)

If the end of a function is reached without calling return, the value of the last evaluated expression is returned.
```

Note two things:

- Return takes only "value", a single argument
- "If the end of a function is reached without calling return, the value of the last evaluated expression is returned."

In our noreturn version, the last evaluated expression was an assignment. Assignments are silent: they return their value invisibly, so the function's result was not printed at the console, although it could still be captured by assigning the call to a variable.
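
A quick check with base R's `withVisible()` suggests what happens to the value of the noreturn version: it comes back from the call, but flagged as invisible, so the console shows nothing. A sketch (the names `res` and `captured` are illustrative):

```R
square_decrement_noreturn <- function(x = 1, decr)
{
    interim <- x^2 + 2*x + 1
    decremented <- interim - decr
    root_1 <- sqrt(decremented)
}

# withVisible() reports the value together with its visibility flag
res <- withVisible(square_decrement_noreturn(7, 7))
res$visible   # FALSE

# The value can still be captured by assignment and printed afterwards
captured <- square_decrement_noreturn(7, 7)
captured      # sqrt(57), about 7.55
```

An explicit `return()` avoids this surprise, which is one more reason to end the body with it.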

So what can we do to return both the + and - root?

We can return only single object but it can have multiple values!

For example we can concatenate both values into a single vector!

In [None]:
square_decrement_roots <- function(x = 1, decr)
{
    interim <- x^2 + 2*x + 1
    decremented <- interim - decr
    root_1 <- sqrt(decremented)
    root_2 <- -root_1
    return(c(root_1, root_2))
}

In [None]:
square_decrement_roots(x = 9, decr = 3)

Or we can name the concatenated vector and return it by name:

In [None]:
square_decrement_roots <- function(x = 1, decr)
{
    interim <- x^2 + 2*x + 1
    decremented <- interim - decr
    root_1 <- sqrt(decremented)
    root_2 <- -root_1
    roots <- c(root_1, root_2)
    return(roots)
}
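
A list is another single object that can carry several values, and named elements make the result self-describing. The sketch below is an alternative and not part of the original exercise; the function name `square_decrement_roots_list` and the element names `positive` and `negative` are made up for illustration:

```R
square_decrement_roots_list <- function(x = 1, decr)
{
    interim <- x^2 + 2*x + 1
    decremented <- interim - decr
    root_1 <- sqrt(decremented)
    # Still one returned object, holding two named values
    return(list(positive = root_1, negative = -root_1))
}

roots_list <- square_decrement_roots_list(x = 9, decr = 3)
roots_list$positive
roots_list$negative
```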

## Apres "return" le déluge 

Let's make our function return some value and do more things:

In [None]:
square_decrement_more <- function(x = 1, decr)
{
    interim <- x^2 + 2*x + 1
    decremented <- interim - decr
    root_1 <- sqrt(decremented)
    root_2 <- -root_1
    roots <- c(root_1, root_2)
    return(roots)
    roots <- roots + 10
    return(c("I did more after return, added 10 to roots now roots are:", roots))
}

Call our function with the same parameters.

We expect the function to return the last expression:

In [None]:
square_decrement_more(x = 9, decr = 3)

It seems the function disregarded everything we did after the first return statement.

**THE FUNCTION TERMINATES AT FIRST RETURN() CALL IT ENCOUNTERS**

To make it work as expected, we should delete or comment out the first return statement (so that it no longer executes):

In [None]:
square_decrement_more <- function(x = 1, decr)
{
    interim <- x^2 + 2*x + 1
    decremented <- interim - decr
    root_1 <- sqrt(decremented)
    root_2 <- -root_1
    roots <- c(root_1, root_2)
    #return(roots)
    roots <- roots + 10
    return(c("I did more after return, added 10 to roots now roots are:", roots))
}

In [None]:
square_decrement_more(x = 9, decr = 3)
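
Terminating at the first `return()` is also useful on purpose: a function can leave early when a condition is met. A minimal sketch, assuming a single value for `x`; the function name `square_decrement_safe` is hypothetical:

```R
# Hypothetical variant: leave early when the square root would not be a real number
square_decrement_safe <- function(x = 1, decr)
{
    decremented <- (x + 1)^2 - decr
    if (decremented < 0)
    {
        return(NA_real_)   # the function ends here; nothing below runs
    }
    return(sqrt(decremented))
}

early  <- square_decrement_safe(x = 1, decr = 10)   # 4 - 10 is negative: early exit
normal <- square_decrement_safe(x = 9, decr = 3)    # 100 - 3 is positive: normal exit
early    # NA
normal   # sqrt(97)
```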

## Functions that take two vectors

We know that even single values are vectors: the vector is the atomic (indivisible) object type in R. Now let's follow the lab exercise and write a function that orders the first vector according to the descending order of the second vector.

In [None]:
vec_1 <- sample(seq(from = 3, to = 5, length.out = 20), 10)
vec_1

In [None]:
vec_2 <- sample(seq(to = 1000, by = 7, length.out = 30), 10)
vec_2

In [None]:
ordervecs <- function(vector1, vector2)
{
    order2 <- order(vector2, decreasing = TRUE)
    ordered1 <- vector1[order2]
    return(ordered1)
}

Let's call it with vec_1 and vec_2 inputs

In [None]:
ordervecs(vector1 = vec_1, vector2 = vec_2)

Note that "vector1" is the name of the first argument of the function ordervecs(), and vec_1 is the value/object we pass to that argument. Likewise, "vector2" is the name of the second argument, and vec_2 is the value/object we pass to it.

**IT IS GOOD PRACTICE NOT TO USE THE SAME NAME FOR THE ARGUMENT AND THE OBJECT WE PASS INTO**
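
Because `vec_1` and `vec_2` are random samples, the effect of `order()` is easier to follow on small fixed inputs. In this sketch `labels_demo` and `scores_demo` are made-up vectors:

```R
ordervecs <- function(vector1, vector2)
{
    order2 <- order(vector2, decreasing = TRUE)
    return(vector1[order2])
}

labels_demo <- c("a", "b", "c")
scores_demo <- c(10, 30, 20)

# order() returns the positions of the largest, second largest, ... elements
order(scores_demo, decreasing = TRUE)                     # 2 3 1

# Indexing the first vector with those positions reorders it
ordervecs(vector1 = labels_demo, vector2 = scores_demo)   # "b" "c" "a"
```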

## YOUR FUNCTION BODY SHOULD NOT REFER TO A GLOBAL OBJECT

A global object is an object we define in the global environment, that is, not inside a function.

vec_1 and vec_2 are global objects

In [None]:
vec_1
vec_2

vector1 and vector2 are local objects that exist inside the scope of ordervecs function

In [None]:
vector1

In [None]:
vector2

They don't exist in the global environment unless we explicitly create them there.

Now let's say we made a mistake and confused vector1 (the named argument and the local object created with that argument) with vec_1 (the object that we want to pass to the vector1 argument as a value).

In [None]:
ordervecs_global <- function(vector1, vector2)
{
    order2 <- order(vector2, decreasing = TRUE)
    ordered1 <- vec_1[order2]
    return(ordered1)
}

When we have vec_1 and vec_2 in our global environment and they appear in the code as intended, we may not feel the difference:

In [None]:
ordervecs_global(vector1 = vec_1, vector2 = vec_2)

Now let's create a new vector to pass into vector1 argument, named as vec_1a:

In [None]:
vec_1a <- sample(seq(from = 7, to = 1000, by = 5), 10)
vec_1a

And delete the original vec_1 object!

In [None]:
rm(vec_1)
vec_1

We do not have vec_1 anymore. That may be the case when a collaborator uses your code inside her own R session where vec_1 was never defined

Now try to call ordervecs_global with vec_1a and vec_2

In [None]:
ordervecs_global(vector1 = vec_1a, vector2 = vec_2)

That's a major source of bugs in quiz responses with functions!

**DO NOT USE GLOBAL OBJECT, DEFINED OUTSIDE THE SCOPE OF THE FUNCTION, INSIDE THE FUNCTION CODE BODY!**


**ONLY INTERACTION BETWEEN THE GLOBAL OBJECTS AND YOUR FUNCTION SHOULD BE WHEN PASSING GLOBAL OBJECTS AS VALUES TO ARGUMENTS**

Note that we test your submissions in our own environments with test inputs that we create ourselves.

If your code refers to global objects, it cannot find them when run in our environment and throws an error, rendering your submission wrong!
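
The difference can be reproduced in a few lines. This is a sketch with made-up names (`threshold_demo`, `above_global`, `above_arg`), using `tryCatch()` so that the failing call does not stop the block:

```R
threshold_demo <- 10   # global object (hypothetical name)

# Fragile: the body silently depends on the global threshold_demo
above_global <- function(values)
{
    return(values[values > threshold_demo])
}

# Robust: everything the body needs arrives through the arguments
above_arg <- function(values, threshold)
{
    return(values[values > threshold])
}

rm(threshold_demo)   # simulate a session where the global was never defined

fragile <- tryCatch(above_global(c(5, 15, 25)), error = function(e) "error")
robust  <- above_arg(c(5, 15, 25), threshold = 10)
fragile   # "error"
robust    # 15 25
```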