# Lesson 05 - Programming and Parallel Programming in R
The basic programming features in R are:
* Decision structures
* Loops
* Functions and anonymous functions

Other features:
* Vectorization
* Parallel programming
* Developing and Running
* Debugging

## Decision Structures

### What `if`..., and what `else`?

In [None]:
if (!FALSE) print("* Be very careful when not using curly brackets!")

if (1 < 2) {
    print("Yes, 1 is less than 2")
}

In [None]:
n <- 0

if (n < 0) {
    print("Negative")
} else if (n == 0) {
    print("Zero")
} else {
    print("Positive")
}

### In case you have more cases
You may switch to `switch(n, result1, ..., resultN)` or `switch(str, name1=result1, ..., default_result)`
* One value is tested
* May return an object or a sequence of actions
* May have a default behaviour
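
The two forms described above can be sketched as follows. This is a minimal illustration with arbitrary values; the variable names are hypothetical.

```r
# Numeric form: the value selects a branch by position
picked <- switch(2, "first", "second", "third")

# Numeric form with no matching position: the result is NULL
missing_case <- switch(4, "first", "second", "third")

# String form: the unnamed last argument acts as the default
fallback <- switch("swim", walk = 1, run = 3, 0)

picked
is.null(missing_case)
fallback
```

Note that only the string form supports a default value.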

In [None]:
n <- 1 + 1
switch(n, {print("Case One")}, {print("Case Two")}, {print("Case Three")})

In [None]:
x <- 9
action <- "walk"

x <- x + switch(action, walk=1, run=3, 0)
x

In [None]:
action <- "run"
switch(action, walk={x <- x + 1}, run={x <- x + 3}, {print("error")})
x

action <- "jump"
switch(action,
       walk={x <- x + 1},
       run={x <- x + 3},
       {x <- NA ; print("Undefined action, please run or walk")})
x

### Ternary Operator on Assignment
* For scalar values: `var <- if (expr) val1 else val2` (R has no `?:` operator, but `if` is an expression that returns a value)
* For each element in a vector: `var <- ifelse(expr, val1, val2)`

In [None]:
x <- -12345678
sign <- if(x < 0) -1 else (if(x > 0) 1 else 0)
sign

In [None]:
v1 <- c(1, 1, 2, 3, 5, 8, 13, 21, 34, 55)
v2 <- ifelse(v1 %% 2 == 1, 7, 4)
v2

## Loops
* Use `for()` to loop over elements of collections (lists, vectors, ...)
* Use `while()` to loop as long as a condition remains true

In [None]:
words <- c("List", "of", "small", "words")

for (word in words) {
    print(word)
}

In [None]:
for (i in 1:3) {
    print(i)
}
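
When looping over the indices of a vector, `seq_along()` is often a safer choice than `1:length(v)`, because `1:0` counts down instead of being empty. A minimal sketch, using a hypothetical empty vector:

```r
empty <- character(0)

# 1:length(empty) is 1:0, which counts down and yields two iterations
length(1:length(empty))

# seq_along(empty) is an empty sequence, so the loop body never runs
visited <- 0
for (i in seq_along(empty)) {
    visited <- visited + 1
}
visited
```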

In [None]:
count <- 10
while (count >= 0) {
    print(count)
    count <- count - 1
}

### Not Using Loops in R
* Explicit loops in R are interpreted and tend to be slow compared to vectorized operations
* Prefer to avoid them when possible
* Use vectorized operations (like `ifelse()`) as much as possible
* We will see more about that later
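
As a small illustration of the idea, the sketch below computes the same sum of squares with a loop and with a vectorized expression. The variable names are hypothetical, and no timing is claimed here; wrapping each version in `system.time()` is one way to compare them on your own machine.

```r
values <- seq(1, 1e5)

# Loop version: accumulates one element at a time
total_loop <- 0
for (v in values) {
    total_loop <- total_loop + v^2
}

# Vectorized version: a single call over the whole vector
total_vec <- sum(values^2)

# Both approaches give the same result
all.equal(total_loop, total_vec)
```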

## Functions and Anonymous Functions
* In R, functions are objects that can be assigned to variables, like any other value

In [None]:
dotProd <- function(vect) {
    sum(vect * vect)
}

dotProd(1:5)
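
Because functions are ordinary objects, they can be bound to a second name or passed to another function. A minimal sketch, where `applyTwice()`, `square()` and `alias` are hypothetical names used only for illustration:

```r
# Hypothetical helper: applies any function `f` twice to `x`
applyTwice <- function(f, x) {
    f(f(x))
}

square <- function(x) x^2
alias <- square                 # a second name bound to the same function object

applyTwice(alias, 3)            # computes (3^2)^2
class(square)                   # functions have class "function"
```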

* As seen previously with built-in R functions, you may specify default values for optional arguments in functions:

In [None]:
dotProd <- function(vect, debugLevel=0) {
    if (debugLevel >= 1) {print("dotProd: multiplying...")}
    prod <- vect * vect
    if (debugLevel >= 2) {print(prod)}
    
    if (debugLevel >= 1) {print("dotProd: final sum...")}
    sum(prod)
}

dotProd(1:5)
dotProd(debugLevel=2, 6:10)

### Anonymous Functions
* You can use a function without giving it a name
* Useful if a small function is only needed on one line
* This is most frequently used with the `*apply()` set of functions. We will see more about that later

In [None]:
p <- (function(x, y) list(a=x**2-y**2, b=2*x*y, h=x**2+y**2)) (3, 1)
p

(p$a^2 + p$b^2 == p$h^2)    # Testing: A^2 + B^2 == H^2
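
Anonymous functions are more commonly passed directly as arguments to other functions. A minimal sketch using the base R functions `Filter()` and `mapply()`; the data is arbitrary:

```r
# Keep only the even numbers; the predicate is never given a name
evens <- Filter(function(x) x %% 2 == 0, 1:10)

# Add two vectors element by element with a two-argument anonymous function
sums <- mapply(function(a, b) a + b, 1:3, 4:6)

evens
sums
```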

### Lexical Scoping in R
* Go to: https://b.socrative.com/login/student/
* Type room name: PIERLUCSTONGE

What is the output of this R code? (Select your answer in Socrative)
```{R}
z <- 17
foo <- function(x) {
    bar <- function(y) {
        y + z
    }
    
    z <- 5
    x + bar(x)
}

z <- 10
foo(2)
```

* **Free variable**: not defined in the function nor as an argument
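* **Lexical scoping** (used in R): free variables are looked up in the environment where the function was defined; the lookup happens when the function runs, so the value found is the current one in that environment
* **Dynamic scoping**: free variables are looked up in the environment of the caller; see https://en.wikipedia.org/wiki/Scope_%28computer_science%29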
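
A minimal sketch of lexical scoping, using a hypothetical function factory `makeAdder()`. The inner function finds `n` in the environment where it was created, not in the global environment it is called from:

```r
# Hypothetical function factory
makeAdder <- function(n) {
    # `n` is a free variable inside the returned function; it is resolved
    # in the environment created by this call to makeAdder()
    function(x) x + n
}

n <- 1000               # a global `n` that the inner function never sees
addTwo <- makeAdder(2)

addTwo(5)               # uses n = 2 from the defining environment
```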

What is the output of this R code? (Select your answer in Socrative)
```{R}
foo <- function(x, y=x) {
    x+y
}

x <- 100
c(foo(2, x), foo(2))
```

## Vectorization
* `Reduce()` combines the elements of a list or vector successively, using a two-argument function
* R provides functions for applying a function repeatedly to sets of data:
  * `apply()` - apply a function over the rows or columns (margins) of a matrix or array, returns a vector, array or list
  * `sapply()` - apply a function to each element of a list or vector, returns a vector, matrix or list
  * `lapply()` - apply a function to each element of a list or vector, returns a list
* Use '`?apply`' or '`?Reduce`' for more details

In [None]:
lapply(c(2,4,6), function(x) x**2)
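
The other functions from this family can be sketched in the same way. The data and variable names below are arbitrary illustrations:

```r
# Reduce(): fold a vector with a binary function, here 1 * 2 * 3 * 4 * 5
product <- Reduce(function(a, b) a * b, 1:5)

# apply(): call a function on each row (MARGIN = 1) of a matrix
m <- matrix(1:6, nrow = 2)       # rows are (1, 3, 5) and (2, 4, 6)
row_sums <- apply(m, 1, sum)

# sapply(): like lapply(), but simplifies the result to a vector when it can
squares <- sapply(c(2, 4, 6), function(x) x^2)

product
row_sums
squares
```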

## Exercise 01
Edit the code below in order to reimplement `ifelse(v1 %% 2 == 1, 7, 4)` with `sapply()`

In [None]:
v1 <- c(1, 1, 2, 3, 5, 8, 13, 21, 34, 55)

v3 # <- sapply(...)
v3

## Parallel Programming in R
Built-in package `parallel`
* Included in R versions >= 2.14.0
* Builds on and merges `snow` (simple network of workstations) and `multicore` packages

In [None]:
library(parallel)

# Report the number of available cores
detectCores()

In [None]:
# Create a cluster with 2 worker processes
cl <- makeCluster(2)
cl

* A common way to parallelize R code is to replace expensive `lapply()` operations with the parallel `parLapply()` from the `parallel` package:

In [None]:
myList <- seq(4e6)

system.time(myList2 <-    lapply(    myList, function(x) log(x)))

system.time(myList2 <- parLapply(cl, myList, function(x) log(x)))
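
Two details are easy to miss with `parLapply()`: worker processes start with an empty workspace, and they keep running until the cluster is stopped. The sketch below shows one possible full lifecycle, assuming a hypothetical variable `scale_factor`, a hypothetical function `slow_scale()` and a separate cluster object named `workers`:

```r
library(parallel)

# Hypothetical data and function, used only for this illustration
scale_factor <- 10
slow_scale <- function(v) v * scale_factor

workers <- makeCluster(2)            # start 2 worker processes

# Copy the variable that slow_scale() depends on to every worker
clusterExport(workers, "scale_factor", envir = environment())

scaled <- parLapply(workers, 1:8, slow_scale)

stopCluster(workers)                 # release the worker processes when done

unlist(scaled)
```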

## Exercise 02
* The following serial code spends most of the time on an `lapply()` operation
* Parallelize this code using the `parallel` package in R

In [None]:
#Choose a seed for reproducible random numbers
set.seed(1)

#Size of random matrix 
size <- 512

#Generate random numbers and shape them 
#into a square matrix
x <- runif(size*size)
x <- matrix(x, nrow=size, ncol=size)

#Split the matrix into a list of rows
rows <- split(x, row(x))

#Compute the coefficient of variation of each row 
print("Serial time:")
print(system.time(y <- lapply(rows, function(r) sd(r)/mean(r))))

## Developing and Running
In the real world (outside of Jupyter):
* For small tests: use an interactive `R` session in a terminal
  * `R`
* For developing scripts: use RStudio
  * https://www.rstudio.com/products/rstudio/
* For running complex pipelines on a Unix system: use the `Rscript` command
  * `Rscript script.R`
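
A script run with `Rscript` can read values from the command line. A minimal sketch, assuming a hypothetical file name `script.R`, a hypothetical helper `parse_n()` and an optional integer argument; `commandArgs(trailingOnly = TRUE)` returns only the arguments given after the script name:

```r
# script.R (hypothetical file name)

# Hypothetical helper: use the first argument as an integer, or 5 by default
parse_n <- function(args) {
    if (length(args) >= 1) as.integer(args[1]) else 5L
}

n <- parse_n(commandArgs(trailingOnly = TRUE))
cat("Sum of 1..", n, " is ", sum(seq_len(n)), "\n", sep = "")
```

It could then be run as `Rscript script.R 10`, or without an argument to use the default.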

## Debugging in R
* **Good practices**:
  * Validate all your function inputs
    * Write understandable error messages
  * Create unit-tests that will trigger these errors
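
These practices can be sketched with base R only. The function `safe_sqrt()` is hypothetical, and the hand-rolled checks below stand in for a real unit-testing framework:

```r
# Hypothetical function used only to illustrate input validation
safe_sqrt <- function(x) {
    if (!is.numeric(x)) stop("safe_sqrt(): x must be numeric")
    if (any(x < 0)) stop("safe_sqrt(): x must be non-negative")
    sqrt(x)
}

# Minimal test: trigger the error on purpose and capture its message
err <- tryCatch(safe_sqrt(-1), error = function(e) conditionMessage(e))
stopifnot(err == "safe_sqrt(): x must be non-negative")

# Minimal test: valid input still gives the expected result
stopifnot(safe_sqrt(9) == 3)
```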


* Interactive debugging requires an interactive R session (for example, `R` started from a Unix shell like Bash)
  * `traceback()` - Prints the sequence of function calls that led to the last error
* For Jupyter notebooks:
  * Use `on.exit(traceback(2))` before aborting with `stop()`

In [None]:
factorial <- function(n) {
    if (n < 0) {
        on.exit(traceback(2))
        stop("factorial(): does not support n < 0")
    } else if (n == 0) {
        1
    } else {
        n * factorial(n - 1)
    }
}

In [None]:
factorial(-1)

* In an interactive R session in a terminal, use `debug(functionName)` in order to flag a function for debugging
  * When `functionName` runs, you will be taken to a code "browser"
  * There are four basic debug commands
    * n[ext] - Execute current line and print the next
    * c[ontinue] - Execute the rest of the function without stopping
    * q[uit] - Exit the "browser"
    * where - Show the call stack
* In Jupyter notebooks, we simply get the full debugging log (automatic `next` commands until the function returns)

In [None]:
debug(factorial)

factorial(-2)
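
A function flagged with `debug()` stays flagged until `undebug()` is called on it. A minimal sketch with a hypothetical function `addOne()`; `isdebugged()` reports the current state, and `debugonce()` is an alternative that flags a function for a single call only:

```r
# Hypothetical function, used only to show the flagging mechanism
addOne <- function(x) x + 1

debug(addOne)
isdebugged(addOne)      # the function is now flagged

undebug(addOne)
isdebugged(addOne)      # the flag has been removed

addOne(1)               # runs normally, without entering the browser
```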