# Control Structures in R

* Author: Johannes Maucher, few modifications by OK in 2019
* Last Update: 2017-03-13


In an R program, statements are executed sequentially from top to bottom. Like most other languages, R provides *control structures*, which allow statements to be executed repeatedly or only if certain conditions are met.
Compound statements in control structures contain more than one atomic R statement. Such blocks of statements must be enclosed in curly brackets *{}*.



## Conditional Execution

### If-Else
In an *If-Else* control structure the (compound) statement in the *if-block* is executed only if the *condition* at the *if*-keyword is true, otherwise the (compound) statement of the *else-block* is executed:

`if (cond) statement1 else statement2`

It is also possible to define only an *if-block*. In that case nothing is executed if the condition of the if-block is not met:

`if (cond) statement`

`cond` is an expression that evaluates to a single TRUE or FALSE.

Examples:

In [17]:
gender <- "Male"
if (gender == "Male") print("blue")


[1] "blue"


In [2]:
#Good style

gender <- "Female"
if (gender == "Male"){
    mycolor <- "blue"
} else {
    mycolor <- "pink"
}
print(mycolor)

[1] "pink"


Same as above, but as one-liner:

In [3]:
#Bad style - I don't like that
if (gender=="Male") mycolor <- "blue" else mycolor<-"pink"
print(mycolor)

[1] "pink"


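For simple cases, the if-else control structure can be written in the following compact form:

`ifelse(cond, value1, value2)`

`ifelse()` is vectorized: `cond` may be a logical vector, and the result contains `value1` wherever `cond` is TRUE and `value2` everywhere else. No explicit loop over the elements is needed.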

Example:

In [18]:
gender <- c("Male", "Otto", "Otto", "Male")
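# Assign the result of ifelse(); assignments inside the call would leave mycolor as a single value
mycolor <- ifelse(gender == "Male", "blue", "pink")
print(mycolor)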
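
Because `ifelse()` works element by element, its result has the same length as the condition. The following is a minimal sketch of that behaviour; the name `gender_colors` is chosen only for illustration:

```r
# ifelse() checks the condition for every element and returns a vector
gender <- c("Male", "Otto", "Otto", "Male")
gender_colors <- ifelse(gender == "Male", "blue", "pink")
print(gender_colors)
# [1] "blue" "pink" "pink" "blue"

# The result has one entry per element of the condition
print(length(gender_colors) == length(gender))
```

A plain `if` expects a single TRUE or FALSE, so it is not suited for testing a whole vector at once.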

### Switch
An *If-Else* control structure realizes a binary decision. If more than two alternatives must be checked, the *Switch* control structure can be applied. The general syntax is:

`switch(expression, value-assignment-pairs)`

That is, if `expression` evaluates to one of the listed values, `switch()` returns the result assigned to that value.


In [5]:
country <- "US"
currency <- switch(country,
      "Germany" = "Euro",
      "UK" = "Pound",
      "US" = "Dollar",
      "Russia" = "Ruble"
      )

print(currency)

[1] "Dollar"
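
If the expression matches none of the listed values, `switch()` returns `NULL` invisibly. An unnamed last argument acts as a default value. The sketch below illustrates this; the helper `get_currency()` and the fallback string `"unknown"` are hypothetical and not part of the original example:

```r
# Hypothetical helper: look up a currency, falling back to a default value
get_currency <- function(country) {
    switch(country,
           "Germany" = "Euro",
           "UK" = "Pound",
           "US" = "Dollar",
           "unknown")   # unnamed last argument is used if nothing matches
}

print(get_currency("UK"))
# [1] "Pound"
print(get_currency("France"))
# [1] "unknown"

# Without a default, an unmatched value yields NULL
print(is.null(switch("France", "Germany" = "Euro")))
# [1] TRUE
```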


## Repeated Execution and Loops

### For-Loop
In a *For-Loop*, a (compound) statement is executed once for each element of a sequence. In each iteration the loop variable takes the next value of the sequence, and the loop ends when the sequence is exhausted.

`for (variable in sequence) statement`


Example:

In [6]:
#Bad style
# The loop variable is named i so that it does not shadow the function c()
for (i in 1:5) print(c("Iteration ", i))

#Good style
for (i in 1:5) {
    print(c("Iteration ", i))
}

[1] "Iteration " "1"         
[1] "Iteration " "2"         
[1] "Iteration " "3"         
[1] "Iteration " "4"         
[1] "Iteration " "5"         
[1] "Iteration " "1"         
[1] "Iteration " "2"         
[1] "Iteration " "3"         
[1] "Iteration " "4"         
[1] "Iteration " "5"         
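
The sequence does not have to consist of numbers. As a small sketch, the loop below iterates over a character vector, first over its elements and then over its index positions. The names `countries` and `labels` are assumptions made for this illustration:

```r
countries <- c("Germany", "UK", "US")

# Loop directly over the elements of a character vector
for (country in countries) {
    print(country)
}

# Loop over the index positions; seq_along() also works for empty vectors
labels <- character(length(countries))
for (k in seq_along(countries)) {
    labels[k] <- paste("Iteration", k, countries[k])
}
print(labels)
```

Using `seq_along(countries)` instead of `1:length(countries)` avoids the sequence `1:0` when the vector is empty.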


### Nested Loop

Nested loops are loops that contain other loops in their body.

In [7]:
for (row in 1:3){
    for (col in 1:2){
        print(c(row, col))
    }
}

[1] 1 1
[1] 1 2
[1] 2 1
[1] 2 2
[1] 3 1
[1] 3 2
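
A typical use of nested loops is to visit every cell of a matrix. The following sketch fills a 3 x 2 matrix with the product of the row and column index; the name `products` is chosen only for illustration:

```r
# Preallocate a 3 x 2 matrix and fill it cell by cell
products <- matrix(0, nrow = 3, ncol = 2)
for (row in 1:3) {
    for (col in 1:2) {
        products[row, col] <- row * col
    }
}
print(products)
#      [,1] [,2]
# [1,]    1    2
# [2,]    2    4
# [3,]    3    6
```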


### While Loop
In a *While-Loop* a (compound) statement is executed repeatedly as long as a condition is true. The condition is checked before each iteration:

`while (cond) statement`


Example:

In [8]:
i <- 0
result <- 0
while (result < 100) {
    result <- i^2
    print(result)
    i <- i + 1
}

cat('After termination: i = ',i)

[1] 0
[1] 1
[1] 4
[1] 9
[1] 16
[1] 25
[1] 36
[1] 49
[1] 64
[1] 81
[1] 100
After termination: i =  11

> **Note:** When processing the rows or columns of large datasets, looping can be very inefficient and time-consuming. In contrast to many other programming languages, such as *Java*, most R functions and operators are vectorized: they can be applied to a whole vector in the same way as to a single element. As a consequence, many tasks that require loops in other programming languages can be implemented without loops in R.

> This is shown in the following example, which calculates squares:

In [9]:
(bases <- 1:10)
(squares <- bases^2)
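
To make the comparison explicit, the following sketch computes the same squares once with a loop and once with the vectorized operator, and checks that both results agree. The names `squares_loop` and `squares_vec` are assumptions made for this illustration:

```r
bases <- 1:10

# Loop version: compute each square individually into a preallocated vector
squares_loop <- numeric(length(bases))
for (k in seq_along(bases)) {
    squares_loop[k] <- bases[k]^2
}

# Vectorized version: the operator is applied to all elements at once
squares_vec <- bases^2

print(squares_vec)
# [1]   1   4   9  16  25  36  49  64  81 100
print(all(squares_loop == squares_vec))
# [1] TRUE
```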

### Terminate loops and iterations by *break* and *next*
The `break` statement within a loop causes the entire loop to terminate. The `next` statement skips the rest of the current iteration, and the loop then continues with the next iteration.

In [10]:
for (i in 1:5){
    if (i==4) {
        break
    } else {
        print(i)
    }
}

[1] 1
[1] 2
[1] 3


In [11]:
for (i in 1:5){
    if (i==4) {
        next
    } else {
        print(i)
    }
}

[1] 1
[1] 2
[1] 3
[1] 5


### What should not be done in loops
In the following loop, a new element is appended to a vector in each iteration. This is inefficient because in each iteration new memory must be allocated and the whole growing vector must be copied into it.

With [`system.time()`](https://www.rdocumentation.org/packages/base/versions/3.6.1/topics/system.time) we measure the execution time of an expression. It reports the "user" time, which is the CPU time spent by the current process (e.g. the current R session), and the "system" time, which is the CPU time spent by the kernel (the operating system) on behalf of that process. The "elapsed" time is the real (wall-clock) time that passed while the statements were running. That is our point of interest here.

In [20]:
system.time({
    a <- numeric()
    for (j in 1:100000){
        a <- c(a, j%%3)
    }
})

#length(a)

   user  system elapsed 
  24.13   10.78   35.14 

The same functionality as in the loop above can be implemented much faster by first allocating the required memory for the whole vector and then assigning, in each iteration, the corresponding value to the already allocated element.

In [21]:
system.time({
    a <- numeric(100000)
    for (j in 1:100000){
        a[j] <- j%%3
    }
}) 

   user  system elapsed 
   0.03    0.00    0.03 
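
In this particular case the loop can be avoided altogether, because `%%` is a vectorized operator. The following sketch, with the illustrative names `n`, `a_loop` and `a_vec`, checks that the vectorized form gives the same result as the preallocated loop. It makes no timing claim, since timings depend on the machine:

```r
n <- 100000

# Loop with preallocation, as in the cell above
a_loop <- numeric(n)
for (j in 1:n) {
    a_loop[j] <- j %% 3
}

# Vectorized: %% is applied to the whole sequence at once, no loop needed
a_vec <- (1:n) %% 3

print(all(a_loop == a_vec))
# [1] TRUE
```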

See [Strategies to Speedup R Code](https://www.r-bloggers.com/strategies-to-speedup-r-code/) for further ways to speed up your R code.


### Alternatives for Repeated Execution
Compared to other programming languages, explicit for- and while-loops are used less often in R. This is because R provides more compact and often more efficient ways to carry out repeated execution: the functions *apply()*, *lapply()*, *sapply()* and *vapply()*. These functions are described in an upcoming lecture.
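
As a brief preview, and only as a sketch, the squares from the earlier example can be computed with `sapply()` and `vapply()`. The variable names are chosen for illustration:

```r
bases <- 1:5

# sapply() applies the function to each element and simplifies the result to a vector
squares_sapply <- sapply(bases, function(x) x^2)

# vapply() does the same, but the expected type of each result is stated explicitly
squares_vapply <- vapply(bases, function(x) x^2, numeric(1))

print(squares_sapply)
# [1]  1  4  9 16 25
print(all(squares_sapply == squares_vapply))
# [1] TRUE
```

For a simple case like this one, the vectorized expression `bases^2` is the most direct solution; the apply functions become useful when the function applied to each element is more complex.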

## Exercises

[Exercise on control structures in R](../exercises/Ass04ControlStructuresR.ipynb)