# R and RStudio

To download R, go to CRAN, the Comprehensive R Archive Network. Use the cloud mirror, <https://cloud.r-project.org>, which automatically redirects you to a suitable server.

RStudio is an integrated development environment, or IDE, for R programming. Download and install it from <http://www.rstudio.com/download>. 

When you start RStudio, you will see three panes in the interface:
<img src="./figures/rstudio.jpg" alt="ds" style="width: 750px;"/>

# R Basics

* *Console pane*: where you enter your commands
* *Running code*: telling R to perform an action by giving it commands in the console
* *Objects*: where values are saved in R

## Data types

* integers: values like $-1,2,1024$ 
* doubles: a larger set of values containing the integers as well as fractions and decimal values like $-0.8, 20.15$
* logicals: either `TRUE` or `FALSE`
* characters/strings: text such as "cabbage", "UMass", "I am coding in R." Character values are written with quotation marks around them.

* factors: categorical data are commonly represented in R as factors. Categorical data can also be represented as strings, for example "male" and "female".

Integers and doubles are both called numerics.
You can use `typeof()` to investigate the type of a value.
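
As a quick illustration of the types listed above, the sketch below calls `typeof()` on one value of each kind. The values and the object name `sex` are made up for this example.

```r
# Illustrative values only; the object name `sex` is hypothetical.
typeof(1024L)        # "integer": the L suffix asks for an integer
typeof(20.15)        # "double"
typeof(FALSE)        # "logical"
typeof("cabbage")    # "character"

# A factor stores categories as integer codes plus a set of labels.
sex <- factor(c("male", "female", "female"))
levels(sex)          # "female" "male"
typeof(sex)          # "integer"
class(sex)           # "factor"
```

Note that `typeof()` reports how a factor is stored, while `class()` reports that it is a factor.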


### Your turn
* What are the types of the following values?
```
"1"  "one"   one 1
```

## Objects
An object is where a value is saved in R.
You can create new objects with `<-`:
```{r}
x <- 3
```

All R statements that create objects, called assignment statements, have the same form:
```{r eval=FALSE}
object_name <- value
```

The R community generally prefers `<-` over `=` for assigning values.
In R’s syntax the symbol `=` has two distinct meanings that are routinely conflated:

1. The first meaning is an assignment operator. Here it behaves the same as `<-`.

2. The second meaning is a syntax token that signals named argument passing in a function call.

In most cases `=` will work for assignment, but it sometimes causes confusion,
so I recommend using `<-`. RStudio has a shortcut for it: Alt + - (the minus sign).
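
To make the two meanings concrete, here is a small sketch using `mean()`, whose first argument is named `x`. The numbers are made up for illustration.

```r
x <- 3
mean(x = c(10, 20, 30))   # `=` names the argument x of mean(); returns 20
x                         # still 3: the call did not touch our object x

mean(x <- c(10, 20, 30))  # `<-` assigns first, then passes the value; returns 20
x                         # now 10 20 30
```

The two calls return the same mean, but only the second one changes the object `x` in your environment.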


Object names must start with a letter and can only contain letters, numbers, `_` and `.`.
You want your object names to be descriptive, so you need a convention for joining multiple words:
```{r eval=FALSE}
some.people.use.dots
someUseCamelCase
```

You can inspect an object by typing its name:
```r
x
```

Make another assignment:
```r
this_is_a_long_name <- 2.5
```

To inspect this object, try out RStudio’s completion facility: type `this`, press TAB, add characters until you have a unique prefix, then press Return.

## Using R as a calculator
You can use R as a calculator.
```r
1/100*20
1+2+4
pi^2
```

There are some other useful operators: `%/%` performs integer division and `%%` gives the remainder.

```r
100 %/% 30
100 %% 30
```

These operators are useful for extracting the hours and minutes from a time written as a number:
```r
1350 %/% 100
1350 %% 100
```
### Your turn
* Consider the format `hhmm` for the time.
How many minutes elapsed from `240` to `1530`? (Hint: first convert `240` and `1530` to total minutes counted from midnight.)


## Relational operators

Relational operators compare values.

```r
1 <  1 
1 <= 1
1 == 1
1 != 1
```

## Logical operators
Logical operators carry out Boolean operations such as AND (`&`), OR (`|`), and NOT (`!`).
When a number is used in a logical operation, zero is treated as `FALSE` and any non-zero number as `TRUE`.

```r
! 3>2
2>1  &  3<2 
2>1  |  3<2 
```
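
The rule about zero and non-zero numbers can be checked directly. The following sketch uses arbitrary numbers; it also shows that `!` is applied after the comparison, so `!3 > 2` means `!(3 > 2)`.

```r
as.logical(0)     # FALSE
as.logical(-2.5)  # TRUE: any non-zero number counts as TRUE
!0                # TRUE
5 & 0             # FALSE, because 0 is treated as FALSE
5 | 0             # TRUE

!3 > 2            # FALSE: parsed as !(3 > 2)
```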

### Your turn
* Try the following code. Are you surprised by the result? Think about why.
```r
sqrt(4)^2 == 4
sqrt(2)^2 == 2
```

## Vector
* Vectors: a series of values. These are created using `c()`, which stands for “combine” or “concatenate.” For example, `c(6, 11, 13, 31, 90, 92)` creates a six-element vector of numbers.

R is very good at dealing with vectors. The arithmetic operators introduced before can all be applied to vectors, where they work element by element.

```r
x <- c(1,2,3,4,5)
x/2
x^2
x %% 2
2/x
```
```r
y <- c(6,7,8,9,10)
x/y
x^y
```
```r
c(1,2,3) < c(3,2,1)
```

### Your turn

* Run the following code and explain the result. Experiment further to check your explanation.

```{r eval=FALSE}
x <- c(2,4,6,8)
y <- c(1,2)
x/y
```

## Subsetting a vector
**Positive integers** return elements at the specified positions:
```r
letters
letters[2]
letters[c(3,1,5)]
```

**Negative integers** omit elements at the specified positions:
```{r}
letters[-1]
letters[-c(1,3)]
```


**Logical vectors** select elements where the corresponding logical value is `TRUE`:

```{r}
a <- 1:4
a[c(TRUE,FALSE,TRUE,FALSE)]
a[a > 2]
```
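
In practice the logical vector is usually produced by a comparison rather than typed by hand. The sketch below applies the three kinds of subsetting to one vector; `scores` and its values are invented for this example.

```r
scores <- c(72, 88, 95, 60, 81)   # made-up values

scores[c(1, 3)]            # positive integers: 72 95
scores[-2]                 # negative integer:  72 95 60 81
scores[scores >= 80]       # logical vector:    88 95 81
scores[scores %% 2 == 0]   # even values only:  72 88 60
```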


### Your turn
* Use subsetting to create a vector with the 5th, 10th, 15th, 20th, and 25th letters.

## List and Data frame
A list is an object that can contain elements of different types: strings, numbers, vectors, and even other lists.


```{r}
list1 <-  list(name="Mike", gender="M", company="A")
list(x = c(1,2,3), gender="M", company="A")
list1$name
```
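
To show that the elements of a list keep their own types, here is a short sketch. The list `person` and its contents are hypothetical. Besides `$`, it uses double square brackets, `[[ ]]`, which also extract a single element by name.

```r
person <- list(name = "Mike", scores = c(90, 85), student = TRUE)  # made-up record

person$scores            # 90 85
person[["scores"]]       # the same element, selected with [[ ]]
person[["scores"]][2]    # 85: subset the vector stored inside the list

typeof(person$name)      # "character"
typeof(person$student)   # "logical"
length(person)           # 3 elements
```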

Data frames: rectangular spreadsheets. They are representations of datasets in R where the rows correspond to observations and the columns correspond to variables that describe the observations. 
They are lists of vectors of equal length.

```{r}
name <- c("Mike", "Lucy", "John") 
age <- c(20, 25, 30) 
student <- c(TRUE, FALSE, TRUE) 
df <-  data.frame(name, age, student)  
```

You can subset a data frame much like a vector, but a data frame has two dimensions: the row index comes first, then the column index.
```{r}
df[2,2]
df[,2]
df[2,]

df[ df$age<25,]
```
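
Because a data frame is a list of columns, columns can also be selected by name. The sketch below rebuilds the same small data frame so that it runs on its own, then combines row conditions with column names.

```r
df <- data.frame(
  name    = c("Mike", "Lucy", "John"),
  age     = c(20, 25, 30),
  student = c(TRUE, FALSE, TRUE)
)

str(df)                              # one line per column, with its type
df$age                               # the age column as a vector: 20 25 30
df[df$student, "name"]               # names of the students: Mike and John
df[df$age >= 25, c("name", "age")]   # rows for Lucy and John, two columns
nrow(df)                             # 3 rows
ncol(df)                             # 3 columns
```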

## Functions
Functions perform tasks in R. They take in inputs called arguments and return outputs. You can either manually specify a function’s arguments or use the function’s default values.

R has a large collection of built-in functions that are called like this:
```{r eval=FALSE}
function_name(arg1 = val1, arg2 = val2, ...)
```


Let’s try using `seq()` which makes regular sequences of numbers and, while we’re at it, learn more helpful features of RStudio. Type `se` and hit TAB. A popup shows you possible completions. 


 ```{r}
 seq(1,10)
 ```
If you want more help, type `?seq` to get all the details in the Help tab in the lower right pane.

When a function has more than one argument, you can supply the arguments in any order as long as you write out their names.
 
 ```{r}
 seq(1,4)
 seq(from = 4, to = 1)
 ```
 Here, we use `=` as a syntax token that signals named argument passing in a function call.
 

In the help page of a function, the values shown for the arguments are default values, which you do not have to specify when calling the function. If an argument has no value in the help page, you need to specify it.
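
As an illustration of default values, consider `sort()`, a base R function that these notes do not otherwise use. Its help page lists the usage as `sort(x, decreasing = FALSE, ...)`, so `x` must be supplied while `decreasing` may be left out. The vector `v` is made up.

```r
v <- c(3, 1, 2)

sort(v)                          # 1 2 3: decreasing falls back to its default, FALSE
sort(v, decreasing = TRUE)       # 3 2 1: the default is overridden
sort(decreasing = TRUE, x = v)   # 3 2 1: named arguments may appear in any order
```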
 
Next, look at the `sum()` function.
 ```{r}
 sum(1,2,3,4)
 ```

Carl Friedrich Gauss, as an elementary student in the late 1700s, amazed his teacher with how quickly he found the sum of the integers from 1 to 100. We can do it as well, using R.
```{r}
sum(seq(1,100))
```
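
The shortcut usually attributed to Gauss is the closed form $1+2+\cdots+n = \frac{n(n+1)}{2}$. As a small check, the sketch below compares that formula with the brute-force sum; the object name `n` is chosen just for this example.

```r
n <- 100

n * (n + 1) / 2                      # 5050, from the closed-form formula
sum(seq(1, n)) == n * (n + 1) / 2    # TRUE: both approaches agree
```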
 

Quotation marks and parentheses must always come in a pair. RStudio does its best to help you, but it is still possible to mess up and end up with a mismatch. If this happens, R will show you the continuation character `+`:
 ```{r eval=FALSE}
 x <- seq(1,3
 ```
 
 The `+` tells you that R is waiting for more input; it does not think you are done yet. Either add the missing pair, or press ESCAPE to abort the expression and try again.
 
Now look at your environment in the upper right pane. You can see all of the objects that you have created. You can remove objects from the environment using `rm()`, and remove all of them using `rm(list = ls())`.
 
You can clear the console using Ctrl + L.

### Your turn
 
 * Press  Alt + Shift + K. What happens?
 
 * Here are some examples that appear in the help page of `seq()`. Explain the results.
 ```{r eval=FALSE}
 seq(0, 1, length.out = 11)
 seq(1, 6, by = 3)
 ```
 
 * Generate sequence  `1, 1.5, 2, 2.5, ..., 5`. 
 
 * Calculate  $1^3+2^3+\cdots+20^3$ and $(1+2+\cdots+20)^2$
 

## Other useful functions

`sqrt()` computes the square root 
```{r}
sqrt(4)
```

`log()` and `exp()` compute the natural logarithm and the exponential
```{r}
log(2.718)
exp(1)
```


`mean()` computes the mean of a vector
```{r}
mean(1:100)
```

`min()` and `max()` compute the minimum and maximum

```{r}
min(1:100)
max(c(1,2,3,5,5))
```

`range()` returns the minimum and maximum together

```{r}
range(1:100)
```



## Scripts
To have more room to work, we can use the script editor. Open it up either by clicking the File menu, and selecting New File, then R script, or using the keyboard shortcut Cmd/Ctrl + Shift + N. Now you’ll see four panes.

The script editor is a great place to put code you care about. Keep experimenting in the console, but once you have written code that works and does what you want, put it in the script editor. RStudio will automatically save the contents of the editor when you quit RStudio, and will automatically load it when you re-open. Nevertheless,  save your scripts regularly to back them up.

The script editor will also highlight syntax errors with a red squiggly line and a cross in the sidebar. Hover over the cross to see what the problem is.
RStudio will also let you know about potential problems. Read
https://support.rstudio.com/hc/en-us/articles/205753617-Code-Diagnostics to learn about the common mistakes that RStudio’s diagnostics report.

### Your turn
* Create a script in RStudio and do the following step by step:

1. calculate $\frac{\pi^2}{6}$ and save it in `x`

2. calculate $\frac{1}{1^2}+\frac{1}{2^2}+\cdots+\frac{1}{10^2}$

3. take the difference between the two numbers obtained from the last two steps, save it in `d1`

4. replace $10$ with $20$ in step 2 and do step 3 again, save the difference in `d2`

5. replace $20$ with larger numbers, do the same calculation and save them in `d3`, `d4`,`d5`...

6. combine `d1`, `d2`, `d3`, ... in a vector `d`. Do you notice anything interesting in `d`?



## Errors, warnings, and messages

R reports errors, warnings, and messages in a glaring red font, which makes it seem like it is scolding you. R will show red text in the console pane in three different situations:

* **Errors**: When the red text is a legitimate error, it will be prefaced with “Error in…” and will try to explain what went wrong. Generally when there’s an error, the code will not run.

* **Warnings**: When the red text is a warning, it will be prefaced with “Warning:” and R will try to explain why there’s a warning. Generally your code will still work, but with some caveats.

* **Messages**: When the red text doesn’t start with either “Error” or “Warning”, it’s just a friendly message.


***
* If the text starts with “Error”, figure out what’s causing it

* If the text starts with “Warning”, figure out if it’s something to worry about.

* Otherwise, the text is just a message. Read it, and thank it for talking to you.
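
To see one example of each kind, the sketch below triggers an error, a warning, and a message on purpose. It wraps each call in `tryCatch()`, a base R function not covered in these notes, so that the text is captured in an object instead of being printed in red. The object names are made up, and the error and warning texts shown in the comments assume an English locale.

```r
# An error: adding a number to a string is not possible.
err <- tryCatch(1 + "a", error = function(e) conditionMessage(e))

# A warning: the conversion still returns a value (NA), with a caveat.
wrn <- tryCatch(as.numeric("abc"), warning = function(w) conditionMessage(w))

# A message: purely informational.
msg <- tryCatch(message("Just so you know."), message = function(m) conditionMessage(m))

err   # "non-numeric argument to binary operator"
wrn   # "NAs introduced by coercion"
msg   # "Just so you know.\n"
```

Running `1 + "a"`, `as.numeric("abc")`, and `message("Just so you know.")` directly in the console shows how each kind of text normally appears.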
