# Scripting in R
My intention is to cover some of the basics of scripting in R, which will help you with the assignments in 421. These techniques are generally not immediately necessary for analyzing data in R, but learning them will likely make you more proficient in R overall, including for data analysis.

### If / Else

This is probably the most basic of the scripting techniques (and is one that will *sort of* come into play when analyzing data). The basic syntax is as follows:

```
if (TRUE) {
    (code to be executed)
} else {
    (code to be executed)
}
```
The curly braces are what matter most for organizing the code (the indenting doesn't matter). The key is the code inside the parentheses next to the `if` statement. This code needs to reduce to either `TRUE` or `FALSE` (i.e., a boolean). Last week we talked about a bunch of ways to generate TRUEs and FALSEs, but an `if` statement needs to process just *one* TRUE or FALSE. What happens if you give an `if` statement more than one boolean? Well:

In [None]:
if (c(TRUE, TRUE)) {
    print('Woo')
}

Notice that using `else` after the `if` is *not* necessary. It's only needed when I want to execute some code in the event that the `if` condition is `FALSE`.

What do you think will be returned in the following examples?

In [None]:
if (2 + 3 == 5){
    print('Good')
} else {
    print('Bad')
}

In [None]:
x <- 5
if (x == 5) print('Good') else print('Bad')

In [None]:
x <- c(0,1,2,2,3)

if (sum(x) > 3) {
    print('Good')
} else print('Bad')

In [None]:
x <- c(0,1,2,2,3)

if (x > 3) {
    print('Good')
} else print('Bad')
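If the goal is to test a whole vector at once, one option is to collapse the comparison down to a single `TRUE` or `FALSE` with `any()` or `all()`. Here's a minimal sketch; the `scores` vector is made up for illustration. For context, recent versions of R (4.2.0 and later) stop with an error when an `if` condition has more than one element, while older versions only warned and used the first element.

```r
scores <- c(0, 1, 2, 2, 3)

# scores > 3 is a vector of five TRUE/FALSE values, one per element.
print(scores > 3)

# any() asks "is at least one of these TRUE?"
if (any(scores > 3)) {
    print('At least one score is above 3')
} else {
    print('No score is above 3')
}

# all() asks "is every one of these TRUE?"
if (all(scores >= 0)) {
    print('Every score is non-negative')
}
```

Both `any()` and `all()` return exactly one boolean, which is what `if` is looking for.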

In [None]:
if (x[1] > -1 & x[3] == 2){
    print('Good')
} else print('Bad')
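A side note on the `&` used above: R has two "and" operators. `&` compares element by element and gives back a vector, while `&&` is meant for combining two single booleans. The sketch below shows the difference; the names `both`, `empty`, and `safe` are just for illustration.

```r
x <- c(0, 1, 2, 2, 3)

# `&` works element by element, so the result has one entry per element of x.
both <- x > 0 & x < 3
print(both)  # FALSE  TRUE  TRUE  TRUE FALSE

# `&&` combines two single values, which is what `if` wants.
if (x[1] > -1 && x[3] == 2) {
    print('Good')
} else print('Bad')

# `&&` also stops as soon as the left side is FALSE, so the right side
# (which would be NA here and break an `if`) is never evaluated.
empty <- numeric(0)
safe <- length(empty) > 0 && empty[1] > 3
print(safe)  # FALSE
```

With single values on both sides, as in the cell above, `&` and `&&` give the same answer; `&&` just makes the "one boolean only" intent explicit.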

In [None]:
if (x[1] == 1){
    print('Condition 1 Met')
} else if (x[3] == 2) {
    print('Condition 2 Met')
} else if (x[4] == 2) {
    print('Condition 3 Met')
} else print('Everything Failed')

In the example above, "conditions" 2 and 3 are both met. How come only condition 2 prints?  
Compare to the code below:

In [None]:
if (x[1] == 1){
    print('Condition 1 Met')
} else if (x[3] == 2) {
    print('Condition 2 Met')
}
if (x[4] == 2) {
    print('Condition 3 Met')
} else print('Everything Failed')
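One way to see the difference between the two versions is to record which branches actually run. This is just a sketch: the vectors `chained` and `separate` are hypothetical names used to collect the results instead of printing them as we go.

```r
x <- c(0, 1, 2, 2, 3)

# One chain: R stops at the first TRUE condition, so at most one branch runs.
chained <- character(0)
if (x[1] == 1) {
    chained <- c(chained, 'Condition 1')
} else if (x[3] == 2) {
    chained <- c(chained, 'Condition 2')
} else if (x[4] == 2) {
    chained <- c(chained, 'Condition 3')
}

# Separate if statements: each one is checked no matter what happened before.
separate <- character(0)
if (x[1] == 1) separate <- c(separate, 'Condition 1')
if (x[3] == 2) separate <- c(separate, 'Condition 2')
if (x[4] == 2) separate <- c(separate, 'Condition 3')

print(chained)   # "Condition 2"
print(separate)  # "Condition 2" "Condition 3"
```

In a chain, each `else if` is only reached when everything before it was `FALSE`. A brand new `if` starts fresh.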

### Loops
Loops can be tricky. Let me demonstrate an easy one first, the `while` loop:

In [None]:
x <- 1
while (x < 6) {
    x <- x + 1
    print(x)
    print(x < 6)
}

The logic here is:
```
while (some condition remains true) {
    keep doing all of this
}
```
Everything within the curly braces will be executed continually as long as the expression within the parentheses reduces to `TRUE`.
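A `while` loop is handy when you don't know ahead of time how many iterations you'll need. As a small sketch (the variables `value` and `steps` are made up for this example), here's how many times you have to double 1 before it passes 100:

```r
value <- 1
steps <- 0

# Keep doubling as long as value has not passed 100.
while (value <= 100) {
    value <- value * 2
    steps <- steps + 1
}

print(steps)  # 7
print(value)  # 128
```

The condition is checked *before* each pass, so the loop stops the first time `value <= 100` comes back `FALSE`. If nothing inside the braces ever changes the condition, the loop never ends.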

**For Loop**  
The for loop relies on the same logic except, instead of continually evaluating a boolean expression, it's executing code a set number of times. The way that the number of executions is defined can be tricky. The basic structure is:

```
for (iterator in iterable_object){
    keep doing all of this
    stop once you've reached the end of iterable object
}
```
The variable `iterator` takes on the value of each element of the `iterable_object`, one at a time, as the script is executed. Let's do an example you're probably already familiar with from 421.

In [None]:
for (i in 1:5){
    print(i)
}

In this case, `i` is just a variable that takes on the values 1 through 5. We could've named `i` anything. Remember what I said last week about the colon `:` notation: it's just *one* way of quickly defining a vector. A vector is the most common type of `iterable_object`; it's something whose elements you can go over one at a time. We could've written that loop in either of these two ways:

In [None]:
for (i in c(1,2,3,4,5)){
    print(i)
}

In [None]:
x <- 1:5
for (i in x){
    print(i)
}

But realize you can extend this logic to *any* vector:

In [None]:
fruits <- c('apple','pear','orange','pineapple')

for (fruit in fruits){
    print(fruit)
}
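Sometimes you want the *position* of each element as well as its value. One common way to get that, sketched here, is to loop over `seq_along(fruits)`, which is just the vector of positions 1 through the length of `fruits`, and then index back into the vector:

```r
fruits <- c('apple', 'pear', 'orange', 'pineapple')

# seq_along(fruits) is the vector 1 2 3 4, one position per element.
for (i in seq_along(fruits)) {
    print(paste('Fruit number', i, 'is', fruits[i]))
}
```

Here `i` is the position and `fruits[i]` is the value at that position.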

You can get even more tricky and nest for loops inside for loops. Let's say I wanted to print the multiples of 2, 3, and 4:

In [None]:
numbers <- c(2,3,4)

for (number in numbers) {
    print(number)
    for (i in 1:5){
        print(paste(number, 'times', i, 'equals:', i*number))
    }
}

It's often the case that you'll want to store the output from each loop iteration in an object so that you can clearly see and work with the results of your `for` loop. The way this works is that you create something (an object) to store all the output in *before* the loop begins. R will technically let you grow a vector as you go, but that's slow and easy to get wrong, so set the object up at its full size beforehand.

Think of the `for` loop as leaving a faucet running for a set period of time. Each second the faucet is running, a certain amount of water comes out of it. If we want to store all of this water so we can drink it, we need a cup that's large enough to hold it. (Yes, usually you just turn off the faucet when the cup fills up, but let's pretend that here you're "locked" into keeping the faucet on for a set period of time.) The cup has to be at least as large as all the water that will come out during that time, otherwise it will overflow and you'll lose water.

You need to tell R to create a data object that's large enough to store all the output from the `for` loop that you want to save. Then, on each iteration of the loop, you need to specify where in the object that output will go. This can be any object type (I'm pretty sure Chris shows you how to do it with arrays); I'll demonstrate with a vector.

Let's go back to the fruit example. Let's say I want to change the fruit names to include a statement about how much I love all of them, and store that in an object so I can access it after the fact:

In [None]:
fruits <- c('apples','pears','oranges','pineapples')

## initialize an output object with one slot per loop iteration; in this case
## that's just the length of the 'fruits' vector (each slot starts as an empty string)
out <- vector('character', length(fruits))
## I'm also going to add a counter to track which slot to fill next
count <- 0

for (fruit in fruits){
    count <- count + 1
    out[count] <- paste('Gosh, I really love', fruit)
    ## print the output on each iteration
    print(out)
    cat('\n')
}

#print(out)

You can see that, on each iteration, the "slots" of the vector get incrementally filled according to which iteration of the loop it is. This same logic applies to filling the entries of any data object; the indexing just gets trickier when the objects are no longer one-dimensional.
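To give a feel for the two-dimensional case, here's a minimal sketch that stores the multiplication results from the nested loop earlier in a matrix instead of printing them. The names `multipliers` and `products` are made up for this example. The matrix is created at full size first, and each result goes into a `[row, column]` slot:

```r
numbers <- c(2, 3, 4)
multipliers <- 1:5

# Preallocate: one row per number, one column per multiplier.
products <- matrix(NA_real_, nrow = length(numbers), ncol = length(multipliers))

for (r in seq_along(numbers)) {
    for (k in seq_along(multipliers)) {
        # Row r, column k holds numbers[r] times multipliers[k].
        products[r, k] <- numbers[r] * multipliers[k]
    }
}

print(products)
```

The only real change from the vector case is that each slot now needs two indices instead of one.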