# Base R

The first thing we must do as R users is install a set of packages, collectively called `tidyverse`. We will take a closer look at tidyverse in the next lecture.

In [3]:
install.packages("tidyverse") #This only needs to be done once per computer

Installing package into 'C:/Users/p/Documents/R/win-library/3.6'
(as 'lib' is unspecified)


package 'tidyverse' successfully unpacked and MD5 sums checked

The downloaded binary packages are in
	C:\Users\p\AppData\Local\Temp\RtmpAvfc5F\downloaded_packages


As always, the first thing a novice programmer should do is get hold of a good cheat sheet for reference: https://rstudio.com/wp-content/uploads/2016/10/r-cheat-sheet-3.pdf (there are many other high-quality cheat sheets on the web).

## The oddness of R
Many people with a background in computer science, the kind of people we might call _programmers_ or _software engineers_, find R very odd. The author is among them. However, R must be judged according to its audience.

For decades, R has proven extraordinarily successful among statisticians. Long before Python had a mature data science stack, researchers in medicine, poverty alleviation, and other humanitarian fields were using R to solve real-world problems.

Python's data science stack itself borrows heavily from R. The author of Python's `pandas` project and the author of R's `tidyverse` set of packages have worked closely together to bring advances to both communities.

Whatever one may think of R as a language, it is a fantastic tool for solving very important problems.

## Vectors

The most important built-in data structure in R is the vector.

Vectors are created using the `c()` function:

In [9]:
c(1,2,3,4,5,6,7,8,9,10)

In [10]:
my_list <- c(1,2,3,4,5,6,7,8,9,10)
my_list

Variables are assigned using the `<-` operator. You may also see `=` used for assignment, as in most other languages. Both work at the top level, but `<-` is the conventional choice in R code, and we will stick to it.
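As a small sketch of the two assignment forms, the following compares them side by side. The names `a`, `b`, and `r` are illustrative only and do not appear elsewhere in these notes.

```r
# Both forms bind a name to a value at the top level.
a <- c(1, 2, 3)
b = c(1, 2, 3)
identical(a, b)   # TRUE: the two vectors are the same

# Inside a function call, `=` names an argument instead of assigning.
# Here `times` is matched to a parameter of rep(); the result is stored in r.
r <- rep(1, times = 3)
r                 # 1 1 1
```

Keeping `<-` for assignment and `=` for arguments makes the two roles easy to tell apart when reading code.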

In some ways, accessing elements of a vector is very similar to indexing a list in Python. The first difference we must mention is that in R, indexes begin at 1, not 0.

In [11]:
my_list[0]

In [12]:
my_list[1]

In [13]:
my_list[1:5]

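Notice that, other than indexing beginning at 1, selecting a single value and selecting a range of values use the same syntax. Unlike a Python slice, the range `1:5` includes both endpoints, and index `0` returns an empty vector rather than the first element.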
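The indexing rules can be seen more clearly when the values differ from their positions. The sketch below uses a hypothetical vector `v` chosen for that reason.

```r
v <- c(10, 20, 30, 40, 50)   # hypothetical values, so value != index

v[1]           # first element: 10
v[2:4]         # elements 2 through 4, both ends included: 20 30 40
v[0]           # index 0 selects nothing: numeric(0)
length(v[0])   # 0
v[6]           # indexing past the end gives NA rather than an error
```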

Negating an index or a range does _not_ return values from the end of the vector (as it does in Python). In R, negation means "don't return these values".

In [14]:
my_list[-5]

Notice that 5 is missing from the result. `-5` means return the values at all indexes except index 5.

In [15]:
my_list[c(4,5)]

In [16]:
my_list[-c(4,5)]

Passing in a vector of indexes returns the values at those indexes. Negating that vector means return everything _but_ the values at those indexes.
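Here is a short sketch of negative indexing on a hypothetical vector `v`, along with two ways to reach the end of a vector, since negative indexes cannot be used for that in R.

```r
v <- c(10, 20, 30, 40, 50)   # hypothetical values

v[-1]          # drop the first element: 20 30 40 50
v[-c(1, 2)]    # drop the first two elements: 30 40 50
v[-(2:4)]      # drop a range; the parentheses are needed: 10 50

# To get values from the end, ask for them explicitly.
v[length(v)]   # last element: 50
tail(v, 2)     # last two elements: 40 50
```

The parentheses in `-(2:4)` matter because `-2:4` would be read as the range from -2 to 4, which mixes negative and positive indexes and raises an error.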

In [17]:
my_list > 4

In [18]:
my_list[my_list > 4 ]

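Notice in the examples above that R's vectors work more like a `Series` in Python's pandas library than like basic Python lists: comparisons apply element by element, and a logical vector can be used as an index.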
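The same element-wise behaviour applies to arithmetic and to combined conditions. A minimal sketch, using a hypothetical vector `v` and a hypothetical name `mask` for the logical vector:

```r
v <- c(3, 8, 1, 9, 4)   # hypothetical values

mask <- v > 4           # FALSE TRUE FALSE TRUE FALSE
v[mask]                 # keep elements where mask is TRUE: 8 9
which(mask)             # positions of the TRUE values: 2 4
v[v > 2 & v < 9]        # combine conditions element-wise with &: 3 8 4
v * 2                   # arithmetic is element-wise too: 6 16 2 18 8
sum(v > 4)              # TRUE counts as 1, so this counts matches: 2
```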

### Functions to generate vectors
`seq` and `rep` are two important functions for generating vectors. `seq` is similar to the `range` function in Python, while `rep` repeats values.

In [19]:
seq(1,10)

In [20]:
seq(1,10,by=2)

In the previous example, notice that `seq` serves essentially the same purpose as Python's `range` function, except that the end value is included. Further notice that R also has keyword arguments and default arguments.
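A few more calls to `seq` illustrate the keyword and default arguments. This is only a sketch of common usage, not a full tour of the function.

```r
seq(1, 10)                   # default step of 1, end value included
seq(1, 10, by = 2)           # 1 3 5 7 9
seq(10, 1, by = -3)          # a negative step counts down: 10 7 4 1
seq(0, 1, length.out = 5)    # ask for a length instead of a step: 0 0.25 0.5 0.75 1
seq(to = 10, from = 1)       # named arguments can be given in any order
seq_len(4)                   # shorthand for 1, 2, ..., n: 1 2 3 4
```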

In [23]:
rep(1, times=3)

In [24]:
rep(1:10, times=3)

In [25]:
rep(1, each=3)

In [26]:
rep(1:10, each=3)
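The difference between `times` and `each` is easier to see on a short vector. The sketch below uses `1:3` so that the whole result fits on one line.

```r
rep(1:3, times = 2)                 # repeat the whole vector: 1 2 3 1 2 3
rep(1:3, each = 2)                  # repeat each element in place: 1 1 2 2 3 3
rep(1:3, each = 2, times = 2)       # each first, then times: 1 1 2 2 3 3 1 1 2 2 3 3
rep(c("a", "b"), times = c(2, 3))   # a count per element: "a" "a" "b" "b" "b"
```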

### Functions to combine vectors

`cbind` and `rbind` are very common functions. You will see these in other people's code and use them yourself to join vectors into matrices.

In [33]:
#Combine vectors into a matrix of two columns
cbind(my_list, my_list)

my_list,my_list.1
1,1
2,2
3,3
4,4
5,5
6,6
7,7
8,8
9,9
10,10


In [35]:
#Combine vectors into a matrix of two rows
rbind(my_list, my_list)

0,1,2,3,4,5,6,7,8,9,10
my_list,1,2,3,4,5,6,7,8,9,10
my_list,1,2,3,4,5,6,7,8,9,10
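To confirm the shape of the result, `dim` reports rows and columns. The sketch below uses two short hypothetical vectors, `a` and `b`, so the two layouts are easy to compare.

```r
a <- 1:3   # hypothetical input vectors
b <- 4:6

m_cols <- cbind(a, b)   # each vector becomes a column
m_rows <- rbind(a, b)   # each vector becomes a row

dim(m_cols)        # 3 rows, 2 columns: 3 2
dim(m_rows)        # 2 rows, 3 columns: 2 3
colnames(m_cols)   # column names come from the variable names: "a" "b"
rownames(m_rows)   # row names come from the variable names: "a" "b"
m_rows[2, 3]       # matrices are indexed as [row, column]: 6
```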


## Getting help

In [28]:
?mean

In [29]:
help.search('range of values') #better to use R's web docs

starting httpd help server ... done


In [31]:
help(package='ggplot2')
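A few other base R helpers are handy when a full help page is more than you need. The sketch below is one possible workflow; the name `matches` is illustrative, and the `ggplot2` check assumes only that the package may or may not be installed.

```r
# Show a function's parameters without opening its help page.
args(mean)

# List objects on the search path whose names contain "mean".
matches <- apropos("mean")
head(matches)

# Ask for a package's help index only if the package is installed.
# requireNamespace() returns FALSE instead of raising an error when it is missing.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  help(package = "ggplot2")
}
```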