# Introduction to Programming

## Executing the commands in a File

A set of R commands can be saved in a file and then executed as if you had typed them in from the command line. The source command is used to read the file and execute the commands in the same sequence given in the file.

\> source('file.R')

\> help(source)

\>
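As a minimal sketch of the round trip, the commands below write a small script to a temporary file and then run it with source. The file contents and the variable names are made up for illustration, and tempfile is used only so that the example does not depend on the current directory.

```r
# Write a two-line script to a temporary file (hypothetical contents).
scriptFile <- tempfile(fileext = ".R")
writeLines(c("a <- 2 + 3",
             "b <- a * 10"),
           scriptFile)

# Execute the commands in the file, in the order they appear.
source(scriptFile)

# The variables created by the script are now available in the session.
a
b
```

The script runs as if its two lines had been typed at the prompt, so a and b exist after the call.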


In [6]:
# Define a variable.
x <- rnorm(10)

# calculate the mean of x and print out the results.
mux = mean(x)
#cat("The mean of x is ",mux,"\n")

# print out a summary of the results
summary(x)
cat("The summary of x is \n",summary(x),"\n")
print(summary(x))

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
-0.8062 -0.6667  0.2208  0.3305  1.0029  2.3632 

The summary of x is 
 -0.8062084 -0.6666564 0.2208422 0.330463 1.002914 2.363238 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
-0.8062 -0.6667  0.2208  0.3305  1.0029  2.3632 


Some examples are given assuming that a file, simpleEx.R, is in the current directory. 

In [7]:
source('simpleEx.R')

The mean of x is  -0.5490513 
The summary of x is 
 -2.318656 -1.557799 -1.062676 -0.5490513 0.8493697 1.384212 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
-2.3187 -1.5578 -1.0627 -0.5491  0.8494  1.3842 



The file also demonstrates the use of # to specify comments. Anything after the # on a line is ignored. In addition, the file demonstrates the use of cat and print to send results to the standard output. The cat command has a file option to send results to a file instead, and the output of print can be redirected to a file with the sink command. Use help for more information.
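The following is a small sketch of sending results to a file rather than to the standard output. It assumes a temporary file as the destination; outFile, x, and printed are names chosen for this example. The file and append arguments of cat are used directly, and capture.output is used here as one way to collect what print would have displayed.

```r
# Destination file for the results (a temporary file for this sketch).
outFile <- tempfile(fileext = ".txt")
x <- c(1, 2, 3, 4)

# cat writes to the file given by its file argument.
cat("The mean of x is", mean(x), "\n", file = outFile)

# append = TRUE adds to the file instead of overwriting it.
cat("The largest value is", max(x), "\n", file = outFile, append = TRUE)

# capture.output collects the lines that print would have displayed.
printed <- capture.output(print(summary(x)))
cat(printed, sep = "\n", file = outFile, append = TRUE)

# Read the file back to check what was written.
fileLines <- readLines(outFile)
fileLines
```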

The output with the default settings and with the echo, print.eval, and verbose options of the source command can be found below:

In [8]:
source('simpleEx.R')

The mean of x is  0.2591133 
The summary of x is 
 -1.708577 -0.7353721 0.6607362 0.2591133 1.045939 1.877565 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
-1.7086 -0.7354  0.6607  0.2591  1.0459  1.8776 


In [9]:
source('simpleEx.R',echo=TRUE)


> x <- rnorm(10)

> mux = mean(x)

> cat("The mean of x is ", mean(x), "\n")
The mean of x is  0.05771005 

> summary(x)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
-1.21062 -0.72977 -0.26571  0.05771  0.32697  2.27370 

> cat("The summary of x is \n", summary(x), "\n")
The summary of x is 
 -1.210617 -0.7297702 -0.2657094 0.05771005 0.3269658 2.273703 

> print(summary(x))
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
-1.21062 -0.72977 -0.26571  0.05771  0.32697  2.27370 


In [10]:
source('simpleEx.R',print.eval=TRUE)

The mean of x is  0.2588207 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
-1.0812 -0.0650  0.2341  0.2588  0.7490  1.4642 
The summary of x is 
 -1.081173 -0.06500179 0.2341496 0.2588207 0.7489819 1.464185 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
-1.0812 -0.0650  0.2341  0.2588  0.7490  1.4642 


In [11]:
source('simpleEx.R',print.eval=FALSE)

The mean of x is  -0.2705606 
The summary of x is 
 -2.199802 -0.916397 -0.1260679 -0.2705606 0.4279287 0.9248894 
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
-2.1998 -0.9164 -0.1261 -0.2706  0.4279  0.9249 


In [12]:
source('simpleEx.R',verbose=TRUE)

'envir' chosen:<environment: R_GlobalEnv>
encoding = "native.enc" chosen
--> parsed 6 expressions; now eval(.)ing them:

>>>> eval(expression_nr. 1 )

> x <- rnorm(10)
curr.fun: symbol <-
 .. after ‘expression(x <- rnorm(10))’

>>>> eval(expression_nr. 2 )

> mux = mean(x)
curr.fun: symbol =
 .. after ‘expression(mux = mean(x))’

>>>> eval(expression_nr. 3 )

> cat("The mean of x is ", mean(x), "\n")
The mean of x is  0.2019424 
curr.fun: symbol cat
 .. after ‘expression(cat("The mean of x is ", mean(x), "\n"))’

>>>> eval(expression_nr. 4 )

> summary(x)
curr.fun: symbol summary
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
-1.4809 -1.0519  0.1729  0.2019  0.7191  2.8315 
 .. after ‘expression(summary(x))’

>>>> eval(expression_nr. 5 )

> cat("The summary of x is \n", summary(x), "\n")
The summary of x is 
 -1.480942 -1.051911 0.1728507 0.2019424 0.7190653 2.831464 
curr.fun: symbol cat
 .. after ‘expression(cat("The summary of x is \n", summary(x), "\n"))’

>>>> eval(expression_nr

One common problem that occurs is that R may not know where to find a file.

In [13]:
source('notThere.R')

“cannot open file 'notThere.R': No such file or directory”

ERROR: Error in file(filename, "r", encoding = encoding): cannot open the connection


R looks for a file given by a relative name in the current working directory. You can see what files are in the directory using the dir command, and you can determine the current directory using the getwd command.

In [14]:
getwd()

In [15]:
dir()
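One way to avoid the error is to check for the file before calling source. The sketch below uses file.exists for that check; scriptName and found are names chosen for this example, and the file is assumed to be missing.

```r
# Name of the script we would like to run (assumed not to exist here).
scriptName <- "notThere.R"

# file.exists returns TRUE or FALSE without raising an error.
found <- file.exists(scriptName)

if (found) {
    source(scriptName)
} else {
    # Report where R looked so the path can be corrected.
    cat("Cannot find", scriptName, "in", getwd(), "\n")
}
```

If the file lives elsewhere, either pass its full path to source or change the working directory with setwd first.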

## if statements
Conditional execution is available using the if statement and the corresponding else statement.

In [16]:
x = 0.1
if( x < 0.2)
    {
        x <- x + 1
        cat("increment that number!\n")
    }
#increment that number!
x

increment that number!


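The else statement can be used to specify an alternate option. In the example below, note that else is on the same line as the closing brace of the preceding if block. At the top level this is required: if else started a new line, R would treat the if statement as complete and then report an error at the unexpected else.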

In [19]:
x = 2.0
if ( x < 0.2)
    {
        x <- x + 1
        cat("increment that number!\n")
    }else
    {
        x <- x - 1
        cat("nah, make it smaller.\n");
    }
x

nah, make it smaller.
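The placement of else matters when the statement is typed at the top level, where R evaluates an expression as soon as it is complete. Inside braces, such as a function body, R is still waiting for the closing brace, so else may start its own line. A small sketch, with adjust as a made-up function name:

```r
# Inside a function body the if statement is not yet complete when the
# first closing brace is read, so else can start on a new line.
adjust <- function(x) {
    if (x < 0.2) {
        x <- x + 1
    }
    else {
        x <- x - 1
    }
    x
}

small <- adjust(0.1)   # takes the if branch
large <- adjust(2.0)   # takes the else branch
small
large
```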


Finally, the if statements can be chained together for multiple options. The if statement is considered a single code block, so more if statements can be added after the else.

In [20]:
x = 1.0
if ( x < 0.2)
    {
        x <- x + 1
        cat("increment that number!\n")
    } else if ( x < 2.0)
    {
        x <- 2.0*x
        cat("not big enough!\n")
    } else
    {
        x <- x - 1
        cat("nah, make it smaller.\n");
    }
x

not big enough!


## for statements



The for loop can be used to repeat a set of instructions, and it is used when you know in advance the values that the loop variable will have each time it goes through the loop. The basic format for the for loop is for(var in seq) expr


An example is given below:

In [21]:
for (lupe in seq(0,1,by=0.3))
{
        cat(lupe,"\n");
}

0 
0.3 
0.6 
0.9 


In [22]:
x <- c(1,2,4,8,16)
for (loop in x)
{
    cat("value of loop: ",loop,"\n");
}

value of loop:  1 
value of loop:  2 
value of loop:  4 
value of loop:  8 
value of loop:  16 
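A common use of the for loop is to build up a result one element at a time. The sketch below adds up the elements of a vector and then loops over character strings to show that the sequence can be any vector. The names values, total, v, and name are chosen for this example.

```r
# Add up the elements of a vector one at a time.
values <- c(1, 2, 4, 8, 16)
total <- 0
for (v in values) {
    total <- total + v
}
total

# The sequence can be any vector, including character strings.
for (name in c("mean", "median")) {
    cat("statistic:", name, "\n")
}
```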


## while statements
The while loop can be used to repeat a set of instructions, and it is often used when you do not know in advance how many times the instructions will be executed. The basic format for a while loop is while(cond) expr

In [23]:
lupe <- 1;
x <- 1
while(x < 4)
{
    x <- rnorm(1,mean=2,sd=3)
    cat("trying this value: ",x," (",lupe," times in loop)\n");
    lupe <- lupe + 1
}

trying this value:  1.849869  ( 1  times in loop)
trying this value:  2.291895  ( 2  times in loop)
trying this value:  2.2037  ( 3  times in loop)
trying this value:  7.675012  ( 4  times in loop)
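Because the example above relies on random numbers, its output changes from run to run. The following sketch uses a deterministic condition instead, doubling a number until it reaches 100 and counting the passes through the loop. The variable names are chosen for this example.

```r
# Double x until it is at least 100 and count how many passes that takes.
x <- 1
doublings <- 0
while (x < 100) {
    x <- 2 * x
    doublings <- doublings + 1
}
x
doublings
```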


## repeat statements
The repeat loop is similar to the while loop, with two differences. First, it has no condition, so the body always runs at least once; a while loop only starts if its condition is true the first time it is evaluated. Second, you must stop the loop explicitly by executing the break statement.

In [24]:
repeat
{
    x <- rnorm(1)
    if(x < -2.0) break
}
x
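The difference between the two loops can be seen by starting both from a value that already fails the test. In this sketch, with names chosen for the example, the while loop never runs its body, while the repeat loop runs it once before break is reached.

```r
# The while loop checks its condition first, so the body never runs.
n <- 10
while (n < 5) {
    n <- n + 1
}
afterWhile <- n

# The repeat loop runs the body before any test is made.
n <- 10
repeat {
    n <- n + 1
    if (n >= 5) break
}
afterRepeat <- n

afterWhile
afterRepeat
```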

## break and next statements
The break statement stops the execution of the current loop, and execution continues with the first statement after the loop. The next statement skips the remaining statements in the body and starts the next pass through the current loop. In a for loop, the next statement moves the loop variable on to its next value.

In [25]:
x <- rnorm(5)
for(lupe in x)
{
    if (lupe > 2.0)
        next
    if( (lupe<0.6) && (lupe > 0.5))
        break
    cat("The value of lupe is ",lupe,"\n");
}

The value of lupe is  -0.338553 
The value of lupe is  -0.8899581 
The value of lupe is  0.01398593 
The value of lupe is  -0.8135413 
The value of lupe is  -1.638293 
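With random data, neither next nor break may be triggered, as in the run above. The sketch below uses a fixed sequence so that both are exercised: even numbers are skipped with next, and the loop stops with break at the first odd number above 7. The names kept and i are chosen for this example.

```r
# Collect the odd numbers up to 7 from the sequence 1 to 10.
kept <- c()
for (i in 1:10) {
    if (i %% 2 == 0)
        next          # skip even numbers and move on to the next i
    if (i > 7)
        break         # leave the loop entirely
    kept <- c(kept, i)
}
kept
```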


## switch statement
The switch statement takes an expression and returns one item from a list of alternatives based on the value of the expression. How it does this depends on the data type of the expression. The basic syntax is switch(expression,item1,item2,item3,...,itemN).

If the result of the expression is a number, then switch returns the item at that position in the list. A number that is not an integer is truncated to an integer first, so 3.5 selects the third item.

In [27]:
x <- as.integer(2)
x
z = switch(x,1,2,3,4,5)
z
x <- 3.5
x
z = switch(x,1,2,3,4,5)
z
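The sketch below uses character items so that the selected position is easy to tell apart from the value returned. It also shows what happens when the number is larger than the number of items, in which case switch returns NULL. The variable names are chosen for this example.

```r
# 3.5 is truncated to 3, so the third item is returned.
third <- switch(3.5, "a", "b", "c", "d")
third

# There is no seventh item, so the result is NULL.
outOfRange <- switch(7, "a", "b", "c", "d")
is.null(outOfRange)
```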

If the result of the expression is a string, then the list of items should be in the form "valueN"=resultN, and the statement will return the result that matches the value.

In [28]:
y <- rnorm(5)
y
x <- "sd"
z <- switch(x,"mean"=mean(y),"median"=median(y),"variance"=var(y),"sd"=sd(y))
z
x <- "median"
z <- switch(x,"mean"=mean(y),"median"=median(y),"variance"=var(y),"sd"=sd(y))
z
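When the string does not match any of the names, switch returns NULL unless an unnamed item is supplied, which then acts as the default. A minimal sketch, with pickStat as a made-up helper and NA chosen here as the default value:

```r
# Return the requested statistic, or NA if the name is not recognized.
pickStat <- function(name, y) {
    switch(name,
           "mean" = mean(y),
           "median" = median(y),
           NA)   # unnamed item: used when no name matches
}

known <- pickStat("median", c(1, 2, 10))
unknown <- pickStat("range", c(1, 2, 10))
known
unknown
```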

## scan statement
The command to read input from the keyboard is the scan statement. It has a wide variety of options and can be fine tuned to your specific needs. We only look at the basics here. The scan statement waits for input from a user, and it returns the value that was typed in.

When using the command with no set number of lines the command will continue to read keyboard input until a blank line is entered.

\> help(scan)

\> a <- scan(what=double(0))

1: 3.5

2:

Read 1 item

\> a

[1] 3.5

\> typeof(a)

[1] "double"

\>

\> a <- scan(what=double(0))

1: yo!

1:

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
scan() expected 'a real', got 'yo!'

If you wish to limit how much is read, the nmax option sets the maximum number of values to read, and the nlines option sets the maximum number of lines. The multi.line option only applies when what is a list; setting it to FALSE requires each record to appear on a single line.

\> a <-  scan(what=double(0),nmax=1,multi.line = FALSE)

1: 6.7

Read 1 item

\> a

[1] 6.7
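The effect of these limits can be tried without typing anything by using the text argument of scan, which reads from a string instead of the keyboard. This is only a stand-in for keyboard input so that the sketch runs non-interactively; the strings and variable names are made up for the example, and quiet = TRUE suppresses the "Read N items" message.

```r
# nmax limits the number of values read, regardless of how they are laid out.
twoValues <- scan(text = "1 2 3 4", what = double(0), nmax = 2, quiet = TRUE)
twoValues

# nlines limits the number of lines read; both values on the first line are kept.
oneLine <- scan(text = "1 2\n3 4", what = double(0), nlines = 1, quiet = TRUE)
oneLine
```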

# Functions
A shallow overview of defining functions is given here. A few subtleties will be noted, but R can be a little quirky with respect to defining functions. The first bit of oddness is that you can think of a function as an object where you define the function and assign it to a variable name.

To define a function you assign it to a name, and the keyword function is used to denote the start of the function and its argument list.

In [34]:
newDef <- function(a,b)
{
    x = runif(10,a,b)
    mean(x)
}
newDef(-1,1)
newDef

The last expression in the function is what is returned. So in the example above the sample mean of the numbers is returned.

In [36]:
x <- newDef(0,1)
x
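Because a function is an object, it can be assigned to another name or passed to another function like any other value. A small sketch, using square and alias as made-up names:

```r
# Define a function and give the same object a second name.
square <- function(v) v^2
alias <- square

# Both names refer to a function and behave the same way.
is.function(alias)
nine <- alias(3)
nine

# A function can be passed as an argument, here to sapply.
squares <- sapply(c(1, 2, 3), square)
squares
```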

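Arguments are matched by position unless they are named explicitly. In the example below, the first call names its arguments, so their order does not matter. The second call passes them by position, which sets a to 10 and b to 1; runif is then asked for values with a lower limit above its upper limit, so it returns NaN values with a warning.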

In [39]:
newDef(b=10,a=1)
newDef(10,1)

“NAs produced”

You can also mix the two approaches: R first matches the named arguments and then matches the rest from left to right. Another bit of weirdness is that R will not evaluate an expression in the argument list until the moment it is needed in the function. If the function never uses the argument, the expression is never evaluated. This is a different kind of behavior than what most people are used to, so be very careful about this. The best rule of thumb is to avoid putting expressions with side effects, such as assignments, in an argument list if you rely on those effects after the function is called.
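Both points can be illustrated with a short sketch. All of the names here (describe, bump, ignoreArg, useArg, counter) are made up for the example. The first part calls the same function by position, by name, and with a mix of the two. The second part passes an expression with a side effect to a function that ignores its argument and to one that uses it.

```r
# Argument matching: all three calls set lower to 1 and upper to 10.
describe <- function(lower, upper) {
    paste("lower =", lower, "upper =", upper)
}
byPosition <- describe(1, 10)
byName <- describe(upper = 10, lower = 1)
mixed <- describe(upper = 10, 1)   # named first, then left to right

# Delayed evaluation: bump changes counter each time it is actually run.
counter <- 0
bump <- function() {
    counter <<- counter + 1
    counter
}
ignoreArg <- function(value) "done"     # never touches value
useArg <- function(value) value * 2     # forces value to be evaluated

ignoreArg(bump())
counterAfterIgnore <- counter   # bump was never run

doubled <- useArg(bump())
counterAfterUse <- counter      # bump ran exactly once
```

The call to ignoreArg leaves counter unchanged because the argument expression is never evaluated, which is why relying on side effects in an argument list is risky.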

Another common task is to have a function return multiple items. This can be accomplished by returning a list of items. The objects within a list can be accessed using the same $ notation that is used for data frames.

In [40]:
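# Use names that do not mask the built-in c and sample functions.
vals = c(1,2,3,4,5)
sampleStats <- function(a,b)
{
    # Pick the statistic named by a.
    value = switch(a,"median"=median(b),"mean"=mean(b),"variance"=var(b))
    # Count the entries of the argument b (not a global variable) above 1.
    largeVals = length(b[b>1])
    # Return both results in a list.
    list(stat=value,number=largeVals)
}
result <- sampleStats("median",vals)
result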

In [41]:
result$stat

In [42]:
result$number