
# BIOSTAT615 Lecture 3 - R

## 1. An example of recursion

A factorial $a_n = n!$ can be defined recursively as follows:

$$a_{n} = n a_{n-1}$$
$$a_0 = 1$$

This recursive formula can also be implemented as a function.

In [None]:
#' A function to calculate factorial
#' @param n - argument of factorial
#' @return n! = n * (n-1) * .. * 1 
factorial = function(n) {
  if ( n == 0 ) {  ## terminating condition; a_0 = 1
    ret = 1
  } else {         ## recursive calls; a_n = n * a_{n-1}
    ret = (factorial(n-1) * n)
  }
  return (ret)
}

In [None]:
factorial(10)

OK, it seems to work. But how does it really work?

Let's see what happens inside by printing a few messages from within the function.

In [None]:
#' A function to calculate factorial
#' @param n - argument of factorial
#' @return n! = n * (n-1) * .. * 1 
factorial = function(n) {
  print(paste0("factorial(", n, ") is starting")) ## comment out this line to hide the trace
  if ( n == 0 ) {  ## terminating condition
    ret = 1
  } else {         ## recursive calls
    ret = (factorial(n-1) * n)
  }
  print(paste0("factorial(", n, ") is ending")) ## comment out this line to hide the trace
  return (ret)
}

In [None]:
factorial(10)

Do you now understand how this function actually works?
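As a possible aid for reading the trace, the sketch below indents each message by the recursion depth, so nested calls line up visually. The name `factorialTrace()` and its `depth` argument are hypothetical additions for illustration, not part of the lecture code.

```r
#' factorialTrace() : hypothetical variant of factorial() with an indented trace
#' @param n - argument of factorial
#' @param depth - current recursion depth (assumed helper argument, used only for indentation)
#' @return n!
factorialTrace = function(n, depth = 0) {
  indent = strrep("  ", depth)  ## two spaces per level of recursion
  cat(indent, "factorial(", n, ") starts\n", sep = "")
  if ( n == 0 ) {   ## terminating condition
    ret = 1
  } else {          ## recursive call, one level deeper
    ret = n * factorialTrace(n - 1, depth + 1)
  }
  cat(indent, "factorial(", n, ") returns ", ret, "\n", sep = "")
  return (ret)
}

result = factorialTrace(3)
```

Each call prints its "starts" line before recursing and its "returns" line only after the deeper call has finished, which is why the innermost call `factorial(0)` returns first.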

## 2. Tower of Hanoi

We learned how the `towerOfHanoi()` algorithm can be implemented.

Can you actually write a recursive R function yourself?

In [None]:
#' towerOfHanoi() : print solution for hanoi tower problem
#' @param ndisk - Number of disks to move
#' @param beg - The name of rod stacking the disks initially
#' @param via - The name of rod used as a mediator
#' @param end - The name of rod to stack the disks eventually
towerOfHanoi = function(ndisk, beg, via, end) {
## Can you fill the function in?
}

Do not look at the implementation below until you have completed your own version (or given up).

In [None]:
#' towerOfHanoi() : print solution for hanoi tower problem
#' @param ndisk - Number of disks to move
#' @param beg - The name of rod stacking the disks initially
#' @param via - The name of rod used as a mediator
#' @param end - The name of rod to stack the disks eventually
towerOfHanoi = function(ndisk, beg, via, end) {
##  print(paste0("towerOfHanoi(", ndisk,",",beg,",",via,",",end,") is called")) ## uncomment this line for details
  if ( ndisk > 0 ) {
    towerOfHanoi(ndisk-1, beg, end, via)
    cat(paste0("Disk ", ndisk, " : ", beg , " -> ", end),"\n")
    towerOfHanoi(ndisk-1, via, beg, end)
  }
}

In [None]:
towerOfHanoi(3, "s", "i", "d")

In [None]:
towerOfHanoi(4, "1st", "2nd", "3rd")
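The recursive structure above makes two calls on `ndisk-1` disks plus one move, so the number of moves satisfies $T(n) = 2T(n-1) + 1$ with $T(0) = 0$, which gives $T(n) = 2^n - 1$. The sketch below checks this with a hypothetical helper, `countHanoiMoves()`, that mirrors the recursion but counts moves instead of printing them.

```r
#' countHanoiMoves() : hypothetical helper counting the moves towerOfHanoi() would print
#' @param ndisk - Number of disks to move
#' @return Number of single-disk moves
countHanoiMoves = function(ndisk) {
  if ( ndisk == 0 ) {
    return (0)                             ## nothing to move
  }
  ## two recursive sub-problems plus one move of the largest disk
  return (2 * countHanoiMoves(ndisk - 1) + 1)
}

for (n in 1:4) {
  cat(n, "disks:", countHanoiMoves(n), "moves; 2^n - 1 =", 2^n - 1, "\n")
}
```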

## 3. Insertion Sort

In [None]:
#' insertionSort() : sort an array in O(n^2)
#' @param x - An unsorted numeric vector
#' @return A sorted version of x
insertionSort = function(x) {
  n = length(x)
  if ( n < 2 ) return(x)  ## nothing to sort; also avoids an error from seq(2,n,1) when n < 2
  for(j in 2:n) {         ## assumption: a_1, a_2, ... , a_{j-1} is sorted
    key = x[j]            ## objective is to place a_j in the correct spot
    i = j-1               ## shift elements to right while x[i] > key
    while ( ( i > 0 ) && (x[i] > key) ) {
      x[i+1] = x[i]
      i = i - 1
    }
    x[i+1] = key          ## place key in the right spot (why i+1?)
  }
  return(x)
}

In [None]:
## check whether the algorithm works 
set.seed(2022)
x = sample(1:20)  ## shuffle 1...20
print(x)          ## print the original vector
print(insertionSort(x)) ## print the sorted output
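To see where the $O(n^2)$ cost comes from, one option is to count how many element shifts the inner `while` loop performs. The sketch below uses a hypothetical function, `countInsertionShifts()`, that repeats the same loop structure but returns the shift count rather than the sorted vector.

```r
#' countInsertionShifts() : hypothetical helper counting shifts made by insertion sort
#' @param x - A numeric vector
#' @return Number of times an element is shifted one position to the right
countInsertionShifts = function(x) {
  n = length(x)
  shifts = 0
  if ( n < 2 ) return(shifts)
  for(j in 2:n) {
    key = x[j]
    i = j - 1
    while ( ( i > 0 ) && ( x[i] > key ) ) {
      x[i+1] = x[i]
      i = i - 1
      shifts = shifts + 1   ## one shift per pass through the inner loop
    }
    x[i+1] = key
  }
  return(shifts)
}

n = 20
print(countInsertionShifts(1:n))  ## already sorted: the inner loop never runs
print(countInsertionShifts(n:1))  ## reversed: every earlier element is shifted
print(n * (n - 1) / 2)            ## worst-case count for comparison
```

A sorted input needs no shifts, while a reversed input needs $n(n-1)/2$ of them, which is the quadratic worst case.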

## 4. Merge Sort

In [None]:
#' mergeSort() : sort an array in O(n log n)
#' @param x - An unsorted numeric vector
#' @return A sorted version of x
mergeSort = function(x) {
    if(length(x)>1) {  
        mid = ceiling(length(x)/2)
        a = mergeSort(x[1:mid])             # divide - part 1
        b = mergeSort(x[(mid+1):length(x)]) # divide - part 2
        return( merge(a,b) ) # combine the solutions
    } else {  # terminating condition - only 1 element left
        return (x)
    }
}

In [None]:
#' merge() : merge two sorted vectors in O(n)
#' Note: defining merge() here masks base R's merge() (for data frames) in this session
#' @param a - A sorted numeric vector
#' @param b - Another sorted numeric vector
#' @return A sorted vector merging a and b
merge = function(a,b) {
    r = numeric(length(a)+length(b)) # make an empty vector
    i=1; j=1 # i and j are indices for a and b
    for(k in seq_along(r)) {  ## seq_along() is safe even when r is empty
      ## if b is used up or a[i] < b[j], copy from a
      if ( ( j > length(b) ) || ( i <= length(a) && a[i]<b[j] ) ) {
        r[k] = a[i]
        i = i + 1
      } else {  ## otherwise, copy from b
        r[k] = b[j]
        j = j + 1
      }
    }
    return(r) ## return the merged vector
}

In [None]:
print(mergeSort(x))
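The $O(n \log n)$ bound comes from the recursion halving the vector at each level while every level does $O(n)$ merging work in total. As a rough illustration, the hypothetical helper `mergeSortDepth()` below follows the same split rule (`mid = ceiling(n/2)`) and reports how many levels of recursion a vector of length `n` would need.

```r
#' mergeSortDepth() : hypothetical helper giving the recursion depth of mergeSort()
#' @param n - Length of the vector to sort
#' @return Number of divide levels before every piece has length <= 1
mergeSortDepth = function(n) {
  if ( n <= 1 ) {
    return (0)               ## terminating condition, no further split
  }
  mid = ceiling(n / 2)       ## the larger half determines the depth
  return (1 + mergeSortDepth(mid))
}

for (n in c(20, 1e4, 1e6)) {
  cat("n =", n, ": depth =", mergeSortDepth(n), "\n")
}
```

The depth grows only logarithmically: multiplying `n` by 100 (from `1e4` to `1e6`) adds just a handful of levels.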

## 5. Comparing efficiency of sorting algorithms 

In [None]:
## create a vector of 10,000 random values
x=rnorm(1e4)

In [None]:
## evaluate the time for insertion sort
system.time(insertionSort(x))

In [None]:
## evaluate the time for merge sort
system.time(mergeSort(x))

Which one do you prefer? `insertionSort()` or `mergeSort()`?

In [None]:
## evaluate the time for default sort in R
system.time(sort(x))

Which one do you prefer? `sort()` or `mergeSort()`?
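A single timing at one size says little about how the cost grows. One way to explore scaling is to time a sort function at several input sizes. The helper `timeSort()` below is a hypothetical sketch. It accepts any sorting function as its first argument, and it is shown here with base R's `sort()` so that the block runs on its own.

```r
#' timeSort() : hypothetical helper timing a sort function at several input sizes
#' @param sortFn - A function taking a numeric vector and returning it sorted
#' @param sizes - A vector of input lengths to try
#' @return A data frame with columns n and elapsed (seconds)
timeSort = function(sortFn, sizes) {
  elapsed = numeric(length(sizes))
  for(k in seq_along(sizes)) {
    x = rnorm(sizes[k])                               ## fresh random input of this size
    elapsed[k] = system.time(sortFn(x))[["elapsed"]]  ## wall-clock seconds
  }
  return(data.frame(n = sizes, elapsed = elapsed))
}

set.seed(2022)
print(timeSort(sort, c(1e4, 1e5, 1e6)))
```

Passing `insertionSort` or `mergeSort` as `sortFn` (with smaller sizes for `insertionSort`) should make the difference between quadratic and $n \log n$ growth visible, although exact timings depend on the machine.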