# Introduction to R

**R** is a `statistical programming language` that lets you write code to read, analyze, and visualize data sets. The name **R** is partly a nod to `S`, a language for **S**tatistics developed at AT&T's Bell Labs, and partly a reference to the first names of its creators, **R**oss Ihaka and **R**obert Gentleman.

- **R** is an interpreted language (there is no code compilation step), so you can execute each line of code in your script separately.
- **R** is a dynamically typed language.
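As a small illustration of dynamic typing, the sketch below rebinds one variable to values of different types. The variable names are chosen here only for illustration; `class()` reports the type of whatever value the variable currently holds.

```r
# A variable has no declared type; its type follows the value it holds.
x <- 3
class_before <- class(x)   # "numeric"

x <- "three"               # rebinding to a character value is allowed
class_after <- class(x)    # "character"

print(class_before)
print(class_after)
```

Each line can be run on its own in an interactive session, which is what makes this kind of step-by-step inspection convenient.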

### To install **R**:

1. Install the **R** language interpreter from [The R Project for Statistical Computing](https://www.r-project.org/).
2. Install the **RStudio Desktop** IDE from [Posit](https://posit.co/download/rstudio-desktop/).
3. Install the R kernel for Jupyter Lab, following [Installing the R kernel in Jupyter Lab](https://richpauloo.github.io/2018-05-16-Installing-the-R-kernel-in-Jupyter-Lab/):

- `$ R`
- `> install.packages(c('repr', 'IRdisplay', 'evaluate', 'crayon', 'pbdZMQ', 'devtools', 'uuid', 'digest'))`
- `> devtools::install_github('IRkernel/IRkernel')`
- `> q()`
- `$ /Library/Frameworks/R.framework/Versions/Current/Resources/bin/R` (the R binary on macOS)
- `> IRkernel::installspec() # install for the current user`
- `> IRkernel::installspec(user = FALSE) # or install system-wide`
- `> q()`
- Restart Jupyter Lab.
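
One way to check that the installation worked is to ask R which of the packages listed above can be loaded. This is a minimal sketch using base R's `requireNamespace()`; the variable names `kernel_pkgs`, `is_installed`, and `missing_pkgs` are illustrative only.

```r
# Packages from the installation steps, plus IRkernel itself.
kernel_pkgs <- c("repr", "IRdisplay", "evaluate", "crayon", "pbdZMQ",
                 "devtools", "uuid", "digest", "IRkernel")

# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise.
is_installed <- vapply(kernel_pkgs, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs <- kernel_pkgs[!is_installed]
if (length(missing_pkgs) == 0) {
  message("All kernel packages are installed.")
} else {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
}
```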


### To run **R** code:

- In RStudio
    - Select code to run and press `cmd+enter`
    - Run entire script by clicking "Source" button or via `shift+cmd+enter`
- From terminal
    - interactive **R** session:
        - `$ R`
        - `> <your code> + enter`
        - `> q()`
    - run entire script:
        - `$ Rscript <script name>.R`
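
For the script route, a file might look like the sketch below. The file name `hello.R` and the variable `greeting` are hypothetical; running the file from the terminal would execute its lines from top to bottom and print the message.

```r
# hello.R (hypothetical file name)
# Build a message that includes the version string of the running R.
greeting <- paste("Hello from", R.version.string)

# cat() writes plain text to the terminal, without the [1] prefix of print().
cat(greeting, "\n")
```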

### Help

- Use the `Help` menu at the top of RStudio to find many resources, including various cheatsheets.
- [RDocumentation](https://www.rdocumentation.org/) with examples
- [RStudio Community](https://community.rstudio.com/)
- Built-in help: `?`, `??`, `help()`, `example()` (see the Help section below)
- Free ebook [R for Data Science](https://r4ds.had.co.nz/)

### Syntax & Style

- For **commenting** one line of code use the pound symbol (`#`).
- **Variable names** are `case sensitive`, can contain any combination of letters, numbers, periods (.) and underscores (_), and must begin with a letter (or with a period that is not followed by a number).
- [Tidyverse style guide](https://style.tidyverse.org/)
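
The naming rules above can be tried out directly. The sketch below uses made-up variable names, and `make.names()` from base R to show how an invalid name would be repaired.

```r
# Everything after the pound symbol is ignored by R.
coffee <- 1      # lower-case name
Coffee <- 2      # a different variable: names are case sensitive

cups.per_day2 <- 3   # letters, numbers, periods and underscores can be mixed

# make.names() shows how base R would repair an invalid name:
# a name starting with a digit gets an "X" prepended.
repaired <- make.names("2cups")

print(coffee == Coffee)   # FALSE
print(repaired)           # "X2cups"
```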

## Data types

In [1]:
# Assign the value 3 to the variable num_cups_coffee
num_cups_coffee <- 3

In [3]:
# The [1] in the output is the index of the first value shown on that line
print(num_cups_coffee)

[1] 3


In [4]:
too_much_coffee <- 3 + 4

In [5]:
print(too_much_coffee)

[1] 7


In [12]:
# Numeric data type/mode
num <- 3.0

In [13]:
print(num)

[1] 3


In [7]:
# Character data type/mode
str <- "abc"

In [10]:
# Logical data type/mode
logical <- TRUE
logical2 <- FALSE
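# T and F are aliases for TRUE and FALSE; they can be reassigned, so prefer the full names
logical3 <- T
logical4 <- F
# Comparing num (3.0) with num_cups_coffee (3) gives TRUE
logical5 <- num == num_cups_coffee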
print(logical5)

[1] TRUE


In [14]:
# Integer (whole-number) type/mode
# The L suffix (for "long integer") stores 10 as an integer instead of a numeric
my_integer <- 10L

In [15]:
# Complex (imaginary) data type/mode
complex <- 2i
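
To confirm which type each of these values has, base R offers `class()` and `typeof()`. The sketch below recreates one value of each kind in a list (the names `values`, `classes`, and `types` are illustrative) and inspects them in one pass.

```r
# One example value per data type covered above.
values <- list(numeric = 3.0, character = "abc", logical = TRUE,
               integer = 10L, complex = 2i)

# class() gives the high-level type, typeof() the internal storage type.
classes <- vapply(values, class, character(1))
types <- vapply(values, typeof, character(1))

print(classes)
print(types)

# Without the L suffix, a whole number is stored as a numeric (double).
print(is.integer(3))    # FALSE
print(is.integer(3L))   # TRUE
```

Note that `class(3.0)` reports `"numeric"` while `typeof(3.0)` reports `"double"`; the other four types give the same answer from both functions.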

## Help

In [17]:
# Using built-in documentation
?sum

sum {base}, R Documentation

| Argument | Description |
|---|---|
| `...` | numeric or complex or logical vectors. |
| `na.rm` | logical. Should missing values (including NaN) be removed? |


In [19]:
# Search more broadly across all installed help pages
#??sum

In [21]:
#help(sum)

In [22]:
example(sum)


sum> ## Pass a vector to sum, and it will add the elements together.
sum> sum(1:5)
[1] 15

sum> ## Pass several numbers to sum, and it also adds the elements.
sum> sum(1, 2, 3, 4, 5)
[1] 15

sum> ## In fact, you can pass vectors into several arguments, and everything gets added.
sum> sum(1:2, 3:5)
[1] 15

sum> ## If there are missing values, the sum is unknown, i.e., also missing, ....
sum> sum(1:5, NA)
[1] NA

sum> ## ... unless  we exclude missing values explicitly:
sum> sum(1:5, NA, na.rm = TRUE)
[1] 15
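
Alongside the help pages, a function's arguments can be inspected from code. `?sum` is shorthand for `help("sum")` and `??sum` for `help.search("sum")`; the sketch below uses base R's `args()` and `formals()` instead, with illustrative variable names, to read the argument list of `sum()` directly.

```r
# args() returns a function with the same argument list, even for primitives like sum().
sum_args <- names(formals(args(sum)))
print(sum_args)

# na.rm defaults to FALSE, so missing values are kept unless you ask otherwise.
na_rm_default <- formals(args(sum))$na.rm
print(na_rm_default)
```

This matches the documentation shown above: `sum()` takes `...` and `na.rm`, and missing values are only dropped when `na.rm = TRUE` is passed explicitly.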
