In [1]:
options(jupyter.rich_display = FALSE)

# Week 9 Tutorial: Fundamentals of R Programming II

## POP77001 Computer Programming for Social Scientists

##### Module website: [bit.ly/POP77001](https://bit.ly/POP77001)

## Code formatting

- Use a consistent style and indentation (RStudio indents by 2 spaces, Jupyter Notebook by 4)
- Indentation does not affect how R executes a program, but it makes the code much easier to read

In [2]:
# Good style
is_positive <- function(num) {
  if (num > 0) {
    res <- TRUE
  } else {
    res <- FALSE
  }
  return(res)
}

In [3]:
# Bad style
is_positive <- function(num) {
if (num > 0) {
res <- TRUE
}
else {
res <- FALSE
}
return(res)
}
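As a small check of the claim that layout does not change behaviour, the sketch below defines the same function twice under hypothetical names (`is_positive_indented` and `is_positive_flat` are not part of the tutorial) and compares their results. It also illustrates one layout caveat worth knowing: inside a function body a newline before `else` is fine, but at the top level of a script R treats the `if` statement as finished and rejects the stray `else`.

```r
# Two definitions that differ only in layout (names are hypothetical)
is_positive_indented <- function(num) {
  if (num > 0) {
    res <- TRUE
  } else {
    res <- FALSE
  }
  return(res)
}

is_positive_flat <- function(num) {
if (num > 0) {
res <- TRUE
}
else {
res <- FALSE
}
return(res)
}

# Both versions return the same results for the same inputs
inputs <- c(-2, 0, 3)
same <- identical(
  sapply(inputs, is_positive_indented),
  sapply(inputs, is_positive_flat)
)
print(same)

# Caveat: outside of braces, a newline before `else` is a syntax error
top_level <- "if (TRUE) {\n  1\n}\nelse {\n  2\n}"
parse_fails <- inherits(try(parse(text = top_level), silent = TRUE), "try-error")
print(parse_fails)
```

Keeping `} else {` on one line, as in the good-style version, avoids this problem everywhere.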

## Exercise 1: Iteration

- Below you see a matrix of 30 random observations of 5 variables
- Inspect the matrix visually
- Which variable(s) do you think has (have) the highest standard deviation?
- First, try subsetting individual rows and columns from this matrix
- Check the dimensionality of the matrix using the `dim()`, `nrow()` and `ncol()` functions
- Write a loop that goes over each variable and calculates its standard deviation
- You can use the `sd()` function to calculate the standard deviation
- Save these calculated standard deviations in a vector
- Find the variable with the maximum standard deviation using the `max()` or `which.max()` functions
- Is it the one you thought it would be?

In [4]:
# When dealing with random number generation it's always a good idea to make your code replicable
# by setting the seed with the set.seed() function
set.seed(2021)
# Here we create a matrix of 30 observations of 5 variables
# where each variable is a random draw from a normal distribution with mean 0
# and standard deviation drawn from a uniform distribution between 0 and 10
mat <- mapply(
  function(x) cbind(rnorm(n = 30, mean = 0, sd = x)),
  runif(n = 5, min = 0, max = 10)
)
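To see what setting the seed buys you, here is a minimal sketch that wraps the same matrix construction in a hypothetical helper, `make_mat()` (not part of the tutorial), and calls it with equal and different seeds. The seed value 2022 is an arbitrary choice for comparison.

```r
# Hypothetical helper: same construction as above, with the seed as an argument
make_mat <- function(seed) {
  set.seed(seed)
  mapply(
    function(x) cbind(rnorm(n = 30, mean = 0, sd = x)),
    runif(n = 5, min = 0, max = 10)
  )
}

mat_a <- make_mat(2021)
mat_b <- make_mat(2021)
mat_c <- make_mat(2022)  # arbitrary different seed

identical(mat_a, mat_b)  # same seed: the draws are reproduced exactly
identical(mat_a, mat_c)  # different seed: the draws differ
dim(mat_a)               # rows are observations, columns are variables
```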

In [5]:
mat

      [,1]       [,2]        [,3]         [,4]       [,5]       
 [1,]  2.3839361  -9.3570121  -3.38211062 -1.6188191   2.4345828
 [2,] -2.8108812   3.6398922   9.12221934  2.1498588   2.1145327
 [3,]  9.5657865 -13.0553064  -5.65808900 -1.5074431  -0.4870916
 [4,]  4.4413551   3.4095297  -5.55809112 -4.7089982 -11.5910021
 [5,]  0.7666776  12.0176689   0.86723020  0.8672294 -11.6148730
 [6,] -3.0214758  -0.8865633 -11.54534021  6.1809994  -1.3194973
 [7,]  5.0308587  -4.3594766  10.55399527  0.5751735   1.6799590
 [8,]  0.5180495  20.3242580  -1.64929984 -0.2931002  -1.1849911
 [9,]  7.6669813  -1.8116540  16.75477035 -4.4174753  -6.1632955
[10,] -2.7861283   3.2815657  -4.84693805  5.0580929   3.1865921
[11,]  6.1104984  -6.1228633  -1.06513883 -2.3779707  -3.9278496
[12,]  5.3133132   7.2338976   8.62467646 -0.9139698  -1.5229585
[13,]  4.0453120  -2.0328074  -0.09658005 -4.4241866  -2.7919515
[14,]  7.1206917   3.6832976 -12.75733596 -2.2678010  -7.4990240
[15,]  5.8374139   0.4654
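
The steps of this exercise can be sketched on a smaller toy matrix. The object `toy`, its size and its seed are assumptions made for illustration only, so the numbers will not match `mat`; the same pattern should carry over once `toy` is swapped for `mat`.

```r
set.seed(1)  # arbitrary seed, used only for this toy example
toy <- cbind(
  rnorm(20, mean = 0, sd = 1),
  rnorm(20, mean = 0, sd = 5),
  rnorm(20, mean = 0, sd = 2)
)

# Subsetting and dimensions
toy[1, ]   # first row (one observation)
toy[, 2]   # second column (one variable)
dim(toy)   # number of rows and columns
nrow(toy)
ncol(toy)

# Pre-allocate a vector, then fill it column by column
sds <- numeric(ncol(toy))
for (j in seq_len(ncol(toy))) {
  sds[j] <- sd(toy[, j])
}
sds

max(sds)        # the largest standard deviation
which.max(sds)  # the index of the column where it occurs
```

Note the difference between the last two calls: `max()` returns the value itself, while `which.max()` returns its position in the vector.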

## Exercise 2: Functions

- As R is a functional language, many iteration routines can be avoided
- For example, instead of writing a loop to calculate the standard deviations above, we are more likely to call `apply(<object_name>, 2, <function_name>)` to calculate the desired summary statistic for each of the variables (more on the `apply` family of functions in the next lecture)
- Apply this function to the matrix from the exercise above
- Now, change 2 in the function call to 1
- What do you see? What do the current numbers show? Does this summary make sense and why?
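
The role of the second argument of `apply()` can be seen on a tiny matrix where the answers are easy to verify by hand. The matrix `toy` below is a made-up example, and `mean` stands in for any summary function such as `sd`.

```r
# 2 rows, 3 columns; filled column by column: (1, 2), (3, 4), (5, 6)
toy <- matrix(1:6, nrow = 2)

# MARGIN = 2: apply the function to each column
col_means <- apply(toy, 2, mean)
col_means  # 1.5 3.5 5.5

# MARGIN = 1: apply the function to each row
row_means <- apply(toy, 1, mean)
row_means  # 3 4
```

With margin 2 the result has one value per variable; with margin 1 it has one value per observation.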
---
- Let's turn to a more complicated case
- Below you can see another matrix object, but this time it's interspersed with letters
- What is the type of this matrix?
- Write a function that can take this matrix as an input and return a list, where each element is a column of the input matrix
- Internally, you can re-use the loop from the previous exercise
- In addition, while building your list iteratively, try checking whether each column can be coerced to numeric

In [6]:
set.seed(2021)
mat2 <- cbind(
  letters[sample.int(26, 30, replace = TRUE)],
  mapply(
    function(x) cbind(rnorm(n = 30, mean = 0, sd = x)),
    runif(n = 3, min = 0, max = 10)
  ),
  letters[sample.int(26, 30, replace = TRUE)]
)

In [7]:
mat2

      [,1] [,2]               [,3]               [,4]               [,5]
 [1,] g    -2.26248101185558  -4.67722160983318  -1.50979300274865  w   
 [2,] f    -0.445765619916805 -21.0901358530494  -2.87809056214973  v   
 [3,] n    -2.6434201992139   0.40893792423111   -9.69321155043919  n   
 [4,] z    2.90026773641013   -3.4480824645454   2.83291610610758   v   
 [5,] g    -3.90091658817758  -8.9771249160351   -0.64825005423638  l   
 [6,] l    0.252977008172167  0.97011182624825   -5.54706110264981  v   
 [7,] t    -3.4940171234176   3.99472863747609   2.38858085002964   s   
 [8,] f    -0.849870553544038 -1.5938329162815   -7.10484622112137  m   
 [9,] f    -0.224941456038607 -14.4829217683358  -4.43291989578963  l   
[10,] f    2.64232547402139   -14.0758627365065  2.16178933687776   m   
[11,] n    -4.71446626344588  0.149991103137851  3.62617045505629   y   
[12,] e    -3.4760147063997   -1.73297205751811  -1.31366103638704  v   
[13,] o    2.44733178666701   3.66418632154723   -0
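
One possible shape for such a function is sketched below on a toy character matrix. The names `matrix_to_list` and `toy_chr` are hypothetical, and the coercion check used here (try `as.numeric()` and look for `NA`) is one assumption among several reasonable approaches.

```r
# Hypothetical helper: split a matrix into a list of its columns,
# converting a column to numeric only if every entry can be parsed.
# Assumption: the input has no genuine missing values; a column that
# already contains NA would be kept as it is.
matrix_to_list <- function(m) {
  out <- vector("list", ncol(m))
  for (j in seq_len(ncol(m))) {
    col <- m[, j]
    # as.numeric() returns NA (with a warning) for strings like "a"
    as_num <- suppressWarnings(as.numeric(col))
    out[[j]] <- if (anyNA(as_num)) col else as_num
  }
  out
}

# Mixing letters and numbers in cbind() coerces everything to character
toy_chr <- cbind(c("a", "b", "c"), c("1.5", "-2", "0"))
typeof(toy_chr)

res <- matrix_to_list(toy_chr)
str(res)
```

Because a matrix can hold only one type, the numbers in `toy_chr` are stored as strings; the list returned by the function can hold a character vector and a numeric vector side by side.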

## Week 9: Assignment 4

- Practice subsetting, conditional statements and functions in R
- Due at 11:00 on Monday, 15th November