# Week 2: Basics of R

Today we will cover the fundamental building blocks of R.

## Types and Common Objects
There are five fundamental types in R. They are...
- character ...... "this and", "t", "h", "a", "t"
- numeric ........ 3.14159, 12.345689
- integer ........ 1, 3, 10, 9, -1
- logical (T/F) .. TRUE, FALSE, T, F
- complex ........ 3 + 1i, -1 + 3i
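
A quick way to see which of these types a value has is `class()`. The sketch below uses only base R; the variable names are made up for illustration.

```r
# Hypothetical example values, one per type in the list above
a_character <- "PSTAT"
a_numeric   <- 3.14159
an_integer  <- 10L       # the L suffix makes the number an integer
a_logical   <- TRUE      # R's name for a boolean is "logical"
a_complex   <- 3 + 1i    # the imaginary unit is written 1i, not a bare i

class(a_character)   # "character"
class(a_numeric)     # "numeric"
class(an_integer)    # "integer"
class(a_logical)     # "logical"
class(a_complex)     # "complex"

# A whole number typed without the L suffix is still numeric
class(10)            # "numeric"
```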

Some other special values in R are
- Inf ... infinity
- NaN ... Not a Number
- NA .... Not Available (a missing value)
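
These special values show up in ordinary arithmetic. The following is a small sketch with made-up variable names, showing where each value comes from and how to test for it.

```r
positive_inf  <- 1 / 0    # Inf
not_a_number  <- 0 / 0    # NaN
missing_value <- NA

is.infinite(positive_inf)   # TRUE
is.nan(not_a_number)        # TRUE
is.na(missing_value)        # TRUE

# Most arithmetic involving NA gives NA back
missing_value + 1           # NA

# Comparing against NA with == also gives NA, so test with is.na() instead
missing_value == NA         # NA

# Many functions can skip missing values when asked to
mean(c(1, 2, NA))                 # NA
mean(c(1, 2, NA), na.rm = TRUE)   # 1.5
```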

And the most common object you will use in R is the **vector**
- <code>c(1,2,3,4), c(), c("some", "elements")</code>
- *Note* that we can make **lists** which can hold multiple types or objects of different classes. We will cover lists next week.
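
To see why lists are needed for mixed types, here is a minimal sketch (the names are hypothetical). Lists are covered next week, so treat the last part as a preview only.

```r
numbers <- c(1, 2, 3, 4)
class(numbers)     # "numeric"
length(numbers)    # 4

# Mixing types inside c() converts every element to one common type
mixed <- c(1, "two", TRUE)
class(mixed)       # "character"
mixed              # "1" "two" "TRUE"

# A list keeps each element's own type
mixed_list <- list(1, "two", TRUE)
class(mixed_list[[1]])   # "numeric"
class(mixed_list[[2]])   # "character"
class(mixed_list[[3]])   # "logical"
```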

Functions work much like they do in other common programming languages
- <code>abs(-5), sum(c(1,1,2,3,5,8)), ggplot(iris)</code>

In [1]:
# We don't need to specify the type/class on declaration like C++ or Scala.

# Remember, R has one class for both strings and characters
char <- "!"
also_char <- "PSTAT is awesome"

# I need to use "L" to specify that a number is an integer
integer <- 1L
not_int <- 3.3333333

# 1 and 0 can be interpreted as TRUE and FALSE
true <- 1
false <- FALSE

# Since R is a statistical computing environment, it makes sense that
# it would have a built-in type just for complex numbers
complex <- -1 + 2i

# true is 1, which if() treats as TRUE, so this block runs
if(true){
    print(sqrt(complex))
    
    print(!FALSE)
    
    print(also_char)
}

[1] 0.786151+1.27202i
[1] TRUE
[1] "PSTAT is awesome"


In [20]:
# Some operations on vectors
first_vector <- c(120, 130, 131, 174, 193)
second_vector <- c(1, 2, 3, 4, 5)

added_together <- first_vector + second_vector
product <- first_vector * second_vector

# The operations are element-wise
added_together
product
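
The results of the cell above can be checked by hand. This minimal sketch repeats the same two vectors and lists the expected values in comments. The final line, dividing by a single number, is an extra illustration that is not in the original cell.

```r
first_vector <- c(120, 130, 131, 174, 193)
second_vector <- c(1, 2, 3, 4, 5)

# Each element is paired with the element in the same position
first_vector + second_vector   # 121 132 134 178 198
first_vector * second_vector   # 120 260 393 696 965

# Arithmetic with a single number is applied to every element
first_vector / 10              # 12.0 13.0 13.1 17.4 19.3
```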

In [10]:
#  _________ EXERCISES _________

# Make two variables, both with numeric types
number_1 <- <FILL_IN>
number_2 <- <FILL_IN>

# Make two vectors. Their elements should be number_1 and number_2
vector_1 <- <FILL_IN>
vector_2 <- <FILL_IN>

# Make a vector of length 2 that contains FALSE
logical_vec <- <FILL_IN>

# Take the product of vector_1 and the logical vector
<FILL_IN> * logical_vec

# Were you surprised by the result?



## More on Vectors

Vectors are an important component of R. Let's take a moment to learn how to properly use them! 

### Selection

We can use square brackets, <strong><code>[ ]</code></strong>, at the end of a vector variable to access its elements. This is a common paradigm in a lot of programming languages... *except Scala*. Remember that R uses one-based indexing; Python's indexing is zero-based.

<code>
first_elem <- a_vector[1]
other_elem <- a_vector[5]
</code>

But what if you need multiple elements back? Just use a vector of indices in the <strong><code>[ ]</code></strong>! Remember that if you ask for more than one element from a vector, R will return those elements in a vector.
<code>
first_3 <- a_vector[c(1,2,3)]
</code>
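
The snippets above assume a vector called `a_vector` already exists. Here is a runnable sketch with made-up data so you can try the same selections yourself.

```r
a_vector <- c(10, 20, 30, 40, 50)   # hypothetical example data

# Indexing starts at 1, so [1] is the first element
a_vector[1]            # 10
a_vector[5]            # 50

# A vector of indices returns a vector of elements
a_vector[c(1, 2, 3)]   # 10 20 30

# 1:3 generates the indices 1, 2, 3
a_vector[1:3]          # 10 20 30

# Elements come back in the order you ask for them
a_vector[c(5, 1)]      # 50 10
```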

In [32]:
#  _________ YOUR TURN _________

the_vector <- c(0.01, 0.1, 1, 10, 100, 1000)

# Grab the first and last elements of the_vector
first <- <FILL_IN>
second <- <FILL_IN>

# Take their product
the_product <- <FILL_IN>

# Multiply the_product against the first two elements of the_vector...
# ... IN ONE LINE!
one_liner <- <FILL_IN> * <FILL_IN>



In [30]:
#  _________ SOLUTION _________

the_vector <- c(0.01, 0.1, 1, 10, 100, 1000)

# Grab the first and last elements of the_vector
first <- the_vector[1]
second <- the_vector[length(the_vector)]

# Take their product
the_product <- first * second

# Multiply the_product against the first two elements of the_vector...
# ... IN ONE LINE!
one_liner <- the_product * the_vector[c(1, 2)]
one_liner

Note that vectors <strong>do not</strong> nest other vectors within them. When we run the code below, the vector <code>d</code> just concatenates the contents of <code>b</code> to <code>a</code>. If you need to have a vector of vectors, use a list of vectors instead.

In [9]:
# Make some simple vectors
a <- c(1, 2, 3)
b <- c(4, 5, 6)

# Nest them inside another vector
d <- c(a, b)

# d is not a vector of vectors, it is one flat vector of numbers
print(paste("Length of d =", length(d)))
print(d)

[1] "Length of d = 6"
[1] 1 2 3 4 5 6
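
For comparison, here is a minimal sketch of the list-of-vectors alternative mentioned above. The names `flat` and `nested` are made up, and the double brackets `[[ ]]` are list syntax that we cover properly next week.

```r
a <- c(1, 2, 3)
b <- c(4, 5, 6)

# c() flattens: one vector with six numbers
flat <- c(a, b)
length(flat)       # 6

# list() keeps a and b as two separate vectors
nested <- list(a, b)
length(nested)     # 2
nested[[1]]        # 1 2 3
nested[[2]]        # 4 5 6

# Pull out a whole vector first, then index into it
nested[[2]][1]     # 4
```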


## Value Matching

In the previous section, we could retrieve elements of a vector by typing their index within the square brackets. We can also retrieve elements of a vector <strong>by a condition</strong>. Conditional statements are logical expressions that evaluate to TRUE or FALSE. Some statements you're probably familiar with are <code>==, !=, >=, <=, >, <, &&, and ||</code>. When comparing whole vectors element by element, use the single-character forms <code>&</code> and <code>|</code>.

R also comes with more interesting functions and operators for conditional selection. If this lecture is a breeze for you, try using these functions in the end-of-lecture exercises.

<ul>
    <li>
        <a target="_blank" href="https://www.tutorialspoint.com/r/r_operators.htm">"in" operator</a>: <code>7 %in% c(1,2,7,8,7)</code> will return <code>TRUE</code>
    </li>
    <li>
        <a target="_blank" href="https://stat.ethz.ch/R-manual/R-devel/library/base/html/which.html">which()</a>: <code>which(c(TRUE, FALSE, FALSE, TRUE))</code> will return <code>c(1, 4)</code> because those are the indices of the true elements.
    
    </li>
    <li>
        <a target="_blank" href="https://stat.ethz.ch/R-manual/R-devel/library/base/html/all.html">all()</a>: <code>all(c(T, T, T))</code> will return <code>TRUE</code> because all values were true. It is often used more like this: <code>all(c(5,5,5) == 5)</code>
    </li>
</ul>
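
The three helpers listed above can be tried on a small vector. This is only a sketch, and `values` is a made-up name.

```r
values <- c(1, 2, 7, 8, 7)   # hypothetical example data

# %in% asks "is the left-hand value anywhere in the right-hand vector?"
7 %in% values          # TRUE
c(2, 3) %in% values    # TRUE FALSE

# which() returns the positions where a condition is TRUE
which(c(TRUE, FALSE, FALSE, TRUE))   # 1 4
which(values == 7)                   # 3 5

# all() is TRUE only when every element is TRUE
all(values > 0)        # TRUE
all(values == 7)       # FALSE
```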

To match values in a vector using a condition, follow this syntax: 

<strong><code>aVector[ <em>condition involving aVector</em> ]</code></strong>

You must repeat "aVector" in the conditional expression. This is <strong>incorrect</strong>: <code>aVector[ >3]</code>. This is correct: <code>aVector[aVector > 3]</code>.
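
The reason the name has to be repeated is that the condition is evaluated first and produces a vector of TRUE/FALSE values, which is then used to pick elements. A minimal sketch with made-up data:

```r
aVector <- c(1, 5, 2, 8, 3)   # hypothetical example data

# The condition alone gives one TRUE/FALSE per element
is_big <- aVector > 3
is_big                 # FALSE TRUE FALSE TRUE FALSE

# Selecting with that logical vector keeps only the TRUE positions
aVector[is_big]        # 5 8

# Writing the condition inline does the same thing in one step
aVector[aVector > 3]   # 5 8

# Conditions can be combined element-wise with & (and) or | (or)
aVector[aVector > 1 & aVector < 8]   # 5 2 3
```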

In [24]:
# A quick example...

example <- c(-1, 2, -5, 4, 7, -11, 55, 44, 3, -9)

positives <- example[example > 0]
divisibleBy11 <- example[example %% 11 == 0]

# Notice that I had to repeat "example" in the square brackets.

print(positives)
print(divisibleBy11)

[1]  2  4  7 55 44  3
[1] -11  55  44


## Writing good code

<strong>A.K.A.</strong> How to not look like a fool at your first job.

Don't you hate it when you have to read someone's notes, and the writer had really poor handwriting? The same thing applies to code. Follow these tips so you don't give your boss a headache at your first job.

**In R, we use <-**
- Yes it is weird, yes you need to use it, and yes there are reasons
    - "=" and "<-" behave differently inside function calls: "=" names an argument, while "<-" creates a variable in the calling environment
    - We'll come back to this when we cover functions
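
One place where the two operators visibly differ is inside a function call. The sketch below is only an illustration: `numbers_eq` and `numbers_arrow` are made-up names, and it assumes no objects with those names already exist in your session.

```r
# Inside a call, "=" only labels the argument.
# No variable called numbers_eq is created.
total_eq <- sum(numbers_eq = c(1, 2, 3))
total_eq                   # 6
exists("numbers_eq")       # FALSE

# Inside a call, "<-" creates numbers_arrow first,
# then passes its value to sum().
total_arrow <- sum(numbers_arrow <- c(1, 2, 3))
total_arrow                # 6
exists("numbers_arrow")    # TRUE
numbers_arrow              # 1 2 3
```

Assigning inside a call like this is rarely what you want, which is one more reason to keep "<-" for assignment and "=" for arguments.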

**Use Whitespace *Wisely***
- In lists, function parameters, etc.
    - <code>c("This","is","ugly")</code> vs. <code>c("this", "is", "beautiful")</code>
    - <code>sum(1,2,3,4)</code> vs. <code>sum(1, 2, 3, 4)</code>
    - There is **some** wiggle room on this rule. Don't use whitespace in named function arguments
      - <code>excessiveSpacing <- f(a = "arg", b = 3, c = c(1, 2, 3))</code>
      - <code>betterSpacing <- f(a="arg", b=3, c=c(1, 2, 3))</code>
- Put spaces around the assignment operator
    - <code>no<-1.1</code>
    - <code>yes <- 6.6</code>
- Use a blank line between chunks of code that are accomplishing different steps, like paragraphs in writing
    
**Comment your Code**
- If you made a temporary work-around in a project, leave a comment
- If you made a very complex expression, explain it *briefly*
- Use comments to section off parts of code
    - Use <code>###</code> to separate parts of your R script (Data Collection, Cleaning, Modeling, Model Validation)

**Write code your friends can read**
- Keep this in mind and you'll be set.
- Use variable names that make sense
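
Here is a short sketch that tries to follow all of the tips above at once. The data and variable names are invented for the example, and the section titles are only one possible way to split up a script.

```r
### Data Collection
# Hypothetical exam scores; -1 marks an entry that was never recorded
exam_scores <- c(88, 92, -1, 79, 95)

### Cleaning
# Temporary work-around: drop the -1 placeholders until the data is fixed
valid_scores <- exam_scores[exam_scores >= 0]

### Summary
average_score <- mean(valid_scores)
average_score   # 88.5
```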
 

## Exercises

Stay sharp! Work on these and ask a fellow student for help if you get stuck. If you breeze through them, find a student and assist them. You'll understand R better by explaining the process to someone else!

In [1]:
# Assign three variables: "one", "two", and "three" with 
# the values 1, 2, and 3

one <- <FILLIN>
two <- <FILLIN>
three <- <FILLIN>

# vector1 should hold them in order. vector2 should hold them
# in reverse order

vector1 <- <FILLIN>
vector2 <- <FILLIN>



In [3]:
# You can get the dot product of two vectors using "%*%"
# Take the dot product of vector1 and vector2

dot <- <FILLIN> __ <FILLIN>




In [None]:
bigVect <- c(10, 9, 8, 7, 6, 5, 4, 2, 2, 1)

# Pull out the first three elements of bigVect and assign them
# to "first", "second", and "third".

first <- bigVect[<FILLIN>]
second <- bigVect<FILLIN>
third <- <FILLIN>

# There is an extra "2" in bigVect. Change the value using square
# brackets, so bigVect then counts down from 10 to 1.

bigVect[<FILLIN>] <- <FILLIN>

stopifnot(third == 8)
stopifnot(bigVect == 10:1)

In [None]:
aVect <- c(1,2,3,4,5,5,3,5,5,5,7,6,4,2,8)

# Use value matching to retrieve all the 5's from aVect

just5s <- <FILLIN>
stopifnot(all(just5s == 5))

In [13]:
# The vector below is contaminated with 0 values that should not
# be there. Use value matching to remove all 0's.

dirtyVect <- c(0,1,3,5,7,4,0,23,2,4,0,6,3,-4,2,0,-3,54,0,33,
               26,22,66,-8,22,1,3,6,8,12,63,16,23,0,8,-1,34,56,12,0)

cleanVect <- <FILLIN>

stopifnot(length(cleanVect) == 33)

In [19]:
# Clean this ugly code!
# The code is using a while loop to access and print elements of the vector
# ---------------------

wow=1
such=c(1,2,3,4)
aaaaa<-length(such)

while(wow<=aaaaa){    print( such[wow]  )
    
    
    wow <- wow + 1    }

[1] 1
[1] 2
[1] 3
[1] 4


In [28]:
# vect1 and vect2 are the same length. Wherever vect1 has "yes",
# retrieve the value in the corresponding index from vect2. (So 
# the first element of "answer" should be "3").

vect1 <- c("yes", "no", "no", "yes", "yes", "yes", "no", "yes")
vect2 <- c(3, 2, 4, 4, 6, 2, 1, 2)

answer <- <FILLIN>

stopifnot(answer[1] == 3 & length(answer) == 5)

### Cleaned Code (answer)

In [27]:
# Here's the clean code
# -----------------------

# Declare variables
iter <- 1
vect1 <- c(1, 2, 3, 4)
len <- length(vect1)

# Iterate over vect1, print elements
while(iter <= len)
{
    print(vect1[iter])
    iter <- iter + 1 
}

**How I cleaned the dirty code**:
- Added comments because I'm not a fool
- Renamed the variables
    - iter for "iterator"
    - vect1 for "vector one"
    - len for "length"
- Added spaces as appropriate
    - Around assignment
    - After comments
    - Around conditionals (<,>,==, <=, etc.)
- Removed unnecessary white space
    - Dropped down the leading curly brace
        - This is personal preference
    - Put the ending curly brace on a new line
        - This is <strong>not</strong> personal preference, always do this

## Congratulations!

You're done with tonight's exercises! Check back to [the syllabus](https://github.com/JasonFreeberg/R_Tutorials/blob/master/README.md) for this week's homework. And remember... *if you're going through hell, you keep going.*