# Lecture 1.2: Functions in R  

## Functions  

* All computations in R are carried out by calling functions
* Functions defined by users have the same status as the functions built into R

In [1]:
# A simple function that squares its argument

square = function(x) x * x

In [2]:
square(5)

In [3]:
square(1:7)
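
Because `*` works element by element, a function like `square` accepts whole vectors, and a user-defined function can be handed to other functions just like a built-in one. A small sketch of both points (the names `cube`, `fs` and `results` are illustrative and not part of the lecture):

```r
# A user-defined function is an ordinary R object of class "function"
cube = function(x) x * x * x   # hypothetical helper, for illustration only

class(cube)          # "function"
cube(1:4)            # elementwise: 1 8 27 64

# User-defined and built-in functions can be used interchangeably
fs = list(cube, sqrt)
results = sapply(fs, function(f) f(4))
results              # 64 2
```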

**Exercise**: Write a function 'sumsq' that returns the sum of the squares of its argument

In [4]:

sumsq(1:5)

**Exercise**: Write a function 'sqdiff' that squares the difference between the argument and its mean

In [5]:

sqdiff(1:10)

**Exercise**: Write a function 'sumsqr' that returns the sum of the squares of the difference between the argument and its mean

In [6]:

sumsqr(1:10)

### Example: The 'which' Function

* The 'which' function takes a logical vector and returns the indices of the values in the vector that are true

In [7]:
u = runif(10)

In [8]:
which(u > .5)

In [9]:
which(.25 < u & u < .75)

In [10]:
# Given a set of logical values stored in a vector x

x = u > .5
print(x)

 [1] FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE FALSE


In [12]:
# the indices of elements of the vector can be obtained using the expression

1:length(x)

In [13]:
# The indices corresponding to the TRUE elements of x can be selected by using logical subsetting

(1:length(x))[x]

**Exercise**: Write your own which function and call it 'mywhich'.

In [14]:

mywhich(u > 0.5)

**Quiz**: Can you think of any edge cases where the function breaks?

In [15]:
t = u[u < 0]
print(t)

numeric(0)


In [17]:
x = t > 0.5
mywhich(x)

In [18]:
print(x)

logical(0)


In [19]:
1:length(x)

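Not quite what we wanted: when x is empty, '1:length(x)' is '1:0', which counts down from 1 to 0 instead of being empty. How can we fix this?

Hint: use 'seq_along(x)' (or, equivalently, 'seq(along = x)') instead of '1:length(x)'.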
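
The difference between the two ways of generating indices can be checked directly on an empty vector. This is only a minimal sketch of the index step, not a full solution; the names `bad_idx` and `good_idx` are illustrative:

```r
# An empty logical vector, like the one in the edge case
x = logical(0)

bad_idx = 1:length(x)    # 1:0 counts DOWN, giving c(1, 0)
good_idx = seq_along(x)  # integer(0): no elements, so no indices

bad_idx
good_idx
length(good_idx)         # 0
```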

**Exercise** (Optional): Think of any other possible edge cases, and try to refine the implementation of the function so that it handles the calls below.

In [4]:
mywhich(c(0, 1, 0, 1))

In [6]:
mywhich(c("false", "true"))

In [7]:
mywhich(c(TRUE, NA, FALSE))

* In general, an R function definition has the form:  

function (*arglist*) *body*  

* *arglist*: a (comma separated) list of variable names known as the arguments of the function
* *body*: an expression known as the body of the function
* When there is more than one expression in the body, we put '{ }' around the body
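
As a sketch of the multi-expression form, the hypothetical function `spread` below (not part of the lecture) wraps three expressions in '{ }'. The value of the last expression is what the function returns:

```r
# Hypothetical example: the body contains several expressions, so it is wrapped in { }
spread = function(x) {
    lo = min(x)      # smallest value
    hi = max(x)      # largest value
    hi - lo          # the value of the last expression is returned
}

spread(c(3, 9, 4))   # 6
```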

**Exercise**: The monthly payment on a loan with principal p, repayment period of y years, and annual interest rate i (in percent) is given by the formula  

$$ \text{payment} = p \, \frac{(i/1200)(1 + i/1200)^{12y}}{(1 + i/1200)^{12y} - 1} $$  

Here i/1200 is the monthly interest rate as a fraction (i divided by 100 and by 12), and 12y is the number of monthly payments.

Write an R function 'payment' that takes three arguments p, y and i, and computes the monthly payment.

In [9]:
payment(1000000, 30, 5)

### If-Then-Else Statements  

if (*condition*) *expr1* else *expr2*


In [13]:
x = rnorm(1)
if (x > 0) y = sqrt(x) else y = -sqrt(-x)

In [14]:
print(y)

[1] -0.8959412


In [15]:
# Since if is an expression that returns a value, this can also be written as
y = if (x > 0) sqrt(x) else -sqrt(-x)
print(y)

[1] -0.8959412


### For Loops

for(*variable* in *vector*) *expression*

In [22]:
x = rnorm(5)
s = 0
for(i in 1:length(x))
    s = s + x[i]

In [23]:
s

In [24]:
# It can also be written as

s = 0
for(elt in x)
    s = s + elt

In [25]:
s

**Exercise**: Look up 'next' and 'break'. What is the difference between the two?

In [28]:
# What does the following code calculate?

s = 0
for (elt in x) {
    if (elt <= 0)
        next
    s = s + elt
}
s

In [29]:
sum(x[x > 0])

In [30]:
# What does the following code do?

s = 0
for (elt in x) {
    if (elt <= 0) {
        s = NA
        break
    }
    s = s + elt
}
s

[1] NA

### While Loops  

while (*condition*) *expression*

In [31]:
# The following example shows how to carry out long
# division (of integers) by repeated subtraction

dividend = 22
divisor = 5
wholes = 0
remainder = dividend
while (remainder >= divisor) {
    remainder = remainder - divisor
    wholes = wholes + 1
}
c(wholes, remainder)
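
For positive inputs, R's built-in integer-division operators give a quick cross-check of the loop's result. A minimal sketch (the name `check` is illustrative):

```r
dividend = 22
divisor = 5

# %/% gives the whole-number quotient and %% gives the remainder
check = c(dividend %/% divisor, dividend %% divisor)
check    # 4 2
```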

**Exercise**: Write a function 'divide' that takes in two arguments, dividend and divisor, and carries out the computation above.

**Exercise**: A word is a palindrome if it does not change when its letters are reversed. A number is a palindrome if reversing its digits produces an identical value. Write a function 'is.palindrome' to check for palindromes.  

Hint: you might find the 'substring' and 'paste' functions helpful.

## Matrices  

* A matrix is a rectangular arrangement of values which are all of the same basic type
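
One consequence of the "same basic type" rule is that mixed inputs are silently coerced to a common type. A short sketch, with illustrative matrices `m` and `n`:

```r
# Mixing numbers and strings: every element is coerced to character
m = matrix(c(1, 2, "a", "b"), nrow = 2)
m
typeof(m)        # "character"

# Mixing logicals and numbers: every element is coerced to numeric
n = matrix(c(TRUE, FALSE, 3, 4), nrow = 2)
typeof(n)        # "double"
```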

In [32]:
# Create a matrix

matrix(1:6, nrow = 3, ncol = 2)

     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6


In [33]:
# Create a matrix, but fill the values by row

matrix(1:6, nrow = 3, ncol = 2, byrow = TRUE)

     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    5    6


In [34]:
# Create a 2*3 matrix filled with 1's

matrix(1, nrow = 2, ncol = 3)

     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    1    1    1


In [35]:
matrix(1:6, nrow = 3)

     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6


In [36]:
x = matrix(1:6, nrow = 3)
print(x)

     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6


In [37]:
ncol(x)

In [38]:
nrow(x)

In [39]:
dim(x)

In [40]:
# Create a matrix by combining columns
cbind(1:3, 4:6)

     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6


In [41]:
# Create a matrix by combining rows
rbind(1:3, 4:6)

     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6


**Quiz**: Try the following expression and explain the output.

In [None]:
rbind(1:2, 1:3)

### Extracting Matrix Elements  

The ij-th element of a matrix x can be extracted with the expression x[i, j].

In [43]:
print(x)

     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6


In [44]:
x[1, 1]

### Matrix Subsets

In [49]:
x[2:3, 1:2]

     [,1] [,2]
[1,]    2    5
[2,]    3    6


In [50]:
x[c(1,3), 1:2]

     [,1] [,2]
[1,]    1    4
[2,]    3    6


In [51]:
x[c(1,3), 1:2] = 99

In [52]:
print(x)

     [,1] [,2]
[1,]   99   99
[2,]    2    5
[3,]   99   99


In [53]:
x[c(1,3), 1:2] = 1:4

In [54]:
print(x)

     [,1] [,2]
[1,]    1    3
[2,]    2    5
[3,]    2    4


In [55]:
x[1,]

In [56]:
x[,2]
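
Selecting a single row or column in this way returns a plain vector rather than a matrix. A short sketch of how the matrix shape can be kept with the `drop = FALSE` argument (the matrix `m` and the names `r` and `r2` are illustrative):

```r
m = matrix(1:6, nrow = 3)

r = m[1, ]                  # a vector of length 2
dim(r)                      # NULL: the matrix structure was dropped

r2 = m[1, , drop = FALSE]   # still a matrix, with 1 row and 2 columns
dim(r2)                     # 1 2
```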

In [57]:
x[row(x) < col(x)] = 0

In [58]:
print(x)

     [,1] [,2]
[1,]    1    0
[2,]    2    5
[3,]    2    4
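
To see why the assignment with `row(x) < col(x)` only touched the top-right entry, it helps to look at what `row` and `col` return. A sketch on a fresh 3-by-2 matrix (the names `m` and `above` are illustrative):

```r
m = matrix(1:6, nrow = 3)

row(m)    # each entry holds its own row index
col(m)    # each entry holds its own column index

# The comparison yields a logical matrix; TRUE marks entries above the diagonal
above = row(m) < col(m)
above
which(above)   # 4: only m[1, 2] qualifies, the 4th entry in column-major order
```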


**Exercise**: Set the diagonal values to be 1 in the following matrix.

In [59]:
x = matrix(0, nrow = 4, ncol = 4)
print(x)

     [,1] [,2] [,3] [,4]
[1,]    0    0    0    0
[2,]    0    0    0    0
[3,]    0    0    0    0
[4,]    0    0    0    0


### Computations with Matrices  

In [61]:
x = matrix(1:4, nrow = 2, ncol = 2)
x + x^2

     [,1] [,2]
[1,]    2   12
[2,]    6   20


In [62]:
x + 1

     [,1] [,2]
[1,]    2    4
[2,]    3    5


In [63]:
# What happens when we do this?
x + 1:3

Warning message: “longer object length is not a multiple of shorter object length”

     [,1] [,2]
[1,]    2    6
[2,]    4    5
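
The result follows from R's recycling rule: the shorter vector is repeated down the columns until it matches the length of the matrix. A sketch contrasting a vector whose length divides the matrix size with one whose length does not (`suppressWarnings` is used here only so that the snippet runs quietly; the names `m`, `a` and `b` are illustrative):

```r
m = matrix(1:4, nrow = 2, ncol = 2)

# Length 2 divides length 4 evenly: c(10, 20) is reused for each column, no warning
a = m + c(10, 20)
a

# Length 3 does not divide 4: 1:3 is extended to c(1, 2, 3, 1), with a warning
b = suppressWarnings(m + 1:3)
b
```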


In [65]:
# Find the row means
apply(x, 1, mean)

In [67]:
# Find the standard deviation for each row
apply(x, 1, sd)

In [66]:
# Find the column means
apply(x, 2, mean)

In [68]:
# Find the sum of squares for each column
apply(x, 2, function(x) sum(x^2))
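
For means (and sums), base R also provides the dedicated functions `rowMeans`, `colMeans`, `rowSums` and `colSums`, which give the same values as the corresponding `apply` calls. A minimal sketch (the names `m`, `row_avg` and `col_avg` are illustrative):

```r
m = matrix(1:4, nrow = 2, ncol = 2)

row_avg = rowMeans(m)    # same values as apply(m, 1, mean)
col_avg = colMeans(m)    # same values as apply(m, 2, mean)

all.equal(row_avg, apply(m, 1, mean))   # TRUE
all.equal(col_avg, apply(m, 2, mean))   # TRUE
```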

In [70]:
# Subtract the row mean
sweep(x, 1, apply(x, 1, mean), "-")

     [,1] [,2]
[1,]   -1    1
[2,]   -1    1


In [71]:
# Subtract the column mean
sweep(x, 2, apply(x, 2, mean), "-")

     [,1] [,2]
[1,] -0.5 -0.5
[2,]  0.5  0.5
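
A quick way to convince yourself that the column sweep did its job is to check that every column of the result has mean zero. A sketch on the same 2-by-2 values (the names `m` and `centered` are illustrative):

```r
m = matrix(1:4, nrow = 2, ncol = 2)

# Center each column by subtracting its mean
centered = sweep(m, 2, colMeans(m), "-")

colMeans(centered)   # 0 0: every column now has mean zero
```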


In [74]:
# Transpose x
t(x)

     [,1] [,2]
[1,]    1    2
[2,]    3    4


In [72]:
# Calculate the outer product
outer(1:8, 1:8)

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]    1    2    3    4    5    6    7    8
[2,]    2    4    6    8   10   12   14   16
[3,]    3    6    9   12   15   18   21   24
[4,]    4    8   12   16   20   24   28   32
[5,]    5   10   15   20   25   30   35   40
[6,]    6   12   18   24   30   36   42   48
[7,]    7   14   21   28   35   42   49   56
[8,]    8   16   24   32   40   48   56   64
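
By default `outer` multiplies each pair of elements, but a third argument can supply a different function. A small sketch (the names `sums` and `powers` and the anonymous function are illustrative):

```r
# outer() multiplies by default; a third argument selects another function
sums = outer(1:3, 1:3, "+")
sums

# Any vectorized function of two arguments works
powers = outer(1:3, 1:2, function(a, b) a^b)
powers
```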


In [75]:
# Matrix multiplication
x %*% t(x)

     [,1] [,2]
[1,]   10   14
[2,]   14   20
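
Note that `%*%` is not the same as `*`: the latter multiplies matching entries, as in the earlier elementwise computations. A sketch contrasting the two on the same values (the names `m`, `elementwise` and `product` are illustrative):

```r
m = matrix(1:4, nrow = 2, ncol = 2)

elementwise = m * t(m)    # multiplies matching entries
product = m %*% t(m)      # rows of m times columns of t(m)

elementwise
product
```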


**Exercise**:

1) Create a 4*4 matrix, you can fill it with any numbers you like  

2) Set the first row to be 1:4  

3) Set the 4th column to be 1:3  

4) Fill the 2*2 sub-matrix at the center with 99

5) Make all the values on the diagonal 1

* **More useful functions** - we don't have time to cover them in today's class, but you might want to look them up if you have extra time during the lab.

    * lapply, sapply, mapply

* Common **[data structures](http://www.statmethods.net/input/datatypes.html)** in R
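
As a starting point for exploring the `lapply`, `sapply` and `mapply` functions listed above, here is a minimal sketch of how the three differ. The list `values` and the result names `l`, `s` and `m` are illustrative:

```r
values = list(a = 1:3, b = 4:6)

l = lapply(values, sum)    # always returns a list: $a 6, $b 15
s = sapply(values, sum)    # simplifies to a named vector: a = 6, b = 15

# mapply walks over several vectors in parallel: 1 + 10, 2 + 20, 3 + 30
m = mapply(function(x, y) x + y, 1:3, c(10, 20, 30))

l
s
m
```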