In [1]:
source(file = "helpers.r")

# Intro Notebook (Following Tutorial)
I've never used R before, so this is an attempt to learn. I am a data scientist, after all. I'm following the booklet [R for Beginners](https://cran.r-project.org/doc/contrib/Paradis-rdebuts_en.pdf) by Emmanuel Paradis from University of Montpellier.

### General & Basics:
- R is a dialect of S
- it is interpreted
- it is object-oriented
- objects, operators, functions (which are objects as well)
- results objects, data objects, and functions are all stored in active memory
- Figures, output files, and libraries reside on hard disk

- the assignment operator can point either way: `<-` (right to left) or `->` (left to right)

In [2]:
a <- 5
10 -> A

In [3]:
A

- `rnorm(1)` generates one normal random variate
- without an assignment, the result is displayed but not stored:

In [4]:
rnorm(1)

- you can separate commands on the same line with `;`
- `ls()` will list objects in memory. Pass option `pattern` (which can be abbreviated to `pat`) to keep only matching names.
- the pattern is a regular expression

In [5]:
ls(pat = "^a")

The tutorial calls the "arguments" to functions "options"; R's own documentation calls them arguments.

In [6]:
ls.str()

a :  num 5
A :  num 10
object_type : function (obj)  

- `rm(a)` will delete `a`
- `rm(list = ls())` will delete every obj in memory

In [7]:
rm(a)

- prepending a function with `?` will give help for it
- `#` for single-line comments

In [8]:
# ?ls

- every object has two intrinsic attributes, `mode` and `length`
- mode describes the type of the elements in the object: `numeric`, `logical`, `complex`, `character`
- length is self-explanatory
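A quick sketch of checking both attributes on a few hand-made objects. The object names are mine, not the tutorial's.

```r
# Illustrative objects, one per basic mode
num_vec <- c(1.5, 2, 3)
chr_vec <- c("a", "b")
lgl_val <- TRUE
cpx_val <- 1 + 2i

mode(num_vec); length(num_vec)   # "numeric", 3
mode(chr_vec); length(chr_vec)   # "character", 2
mode(lgl_val); length(lgl_val)   # "logical", 1
mode(cpx_val); length(cpx_val)   # "complex", 1
```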

In [9]:
mode(A)

In [10]:
n <- 2.1e23
o <- NaN
p <- Inf

In [11]:
q <- "quotation marks delimit variables of mode character \" and you can escape with a backslash"

In [12]:
cat(q)

quotation marks delimit variables of mode character " and you can escape with a backslash

There are quite a few types of objects:
- vector
- factor (categorical)
- array
- matrix
- data frame
- ts (time series)
- list

### Interacting w/ the filesystem

#### Directories

In [13]:
getwd()
# setwd("~/Documents/")

#### Reading files:

- `read.table()` is for reading ASCII data.
- returns a data frame
- many options: `read.table(file, header = FALSE, sep = "", quote = "\"'", dec = ".", ...)`
- variants such as `read.csv(), read.delim()` with different default options

Several ways to access data in a dataframe:
- `mydata$V1`
- `mydata["V1"]`
- `mydata[,1]`

These don't all return the same thing: `mydata$V1` and `mydata[,1]` return a vector, while `mydata["V1"]` returns a data frame with a single column.
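To check what each access form returns, here is a small sketch on a made-up data frame. `mydata` and its contents are invented for illustration; nothing is read from a file.

```r
# A tiny stand-in for a data frame returned by read.table()
mydata <- data.frame(V1 = c(10, 20, 30), V2 = c("a", "b", "c"))

by_dollar <- mydata$V1     # plain numeric vector
by_name   <- mydata["V1"]  # data frame with one column
by_index  <- mydata[, 1]   # plain numeric vector

is.vector(by_dollar)       # TRUE
is.data.frame(by_name)     # TRUE
is.vector(by_index)        # TRUE
dim(by_name)               # 3 1
```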

- `scan()` allows you to specify the mode of the data with the `what` option; passing a list to `what` gives one mode per field (column).
- `read.fwf()` allows for reading fixed-width format data

#### Writing files:

- `write.table(x, file)` where `x` is the object to be written, `file` is the file name, etc.
- option `append` may be useful. With `append = TRUE` the output is added to the end of `file`; with the default `FALSE`, anything already in `file` is overwritten.
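A minimal round trip to see `append` in action. This is only a sketch: the data frames are invented, and it writes to a temporary file rather than a real path.

```r
# Write two small data frames to the same temporary file, then read it back
tmp <- tempfile(fileext = ".txt")
df1 <- data.frame(id = 1:2, score = c(0.5, 0.7))
df2 <- data.frame(id = 3:4, score = c(0.9, 0.1))

write.table(df1, file = tmp, row.names = FALSE)
# append = TRUE adds rows at the end; col.names = FALSE avoids a second header line
write.table(df2, file = tmp, row.names = FALSE, col.names = FALSE, append = TRUE)

back <- read.table(tmp, header = TRUE)
nrow(back)   # 4
unlink(tmp)  # clean up the temporary file
```

Leaving `append` at its default would have left only the second data frame in the file.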

### Generating Data:

In [14]:
s <- 1:10; s

In [15]:
t <- seq(1, 5, 0.5);t

In [16]:
typeof(t)

In [17]:
u <- c(1, 1, 2, 3, 5, 8); u

In [18]:
is.vector(u)

- `gl(n, k)` ("generate levels") builds a factor with `n` levels, each repeated `k` times; see the next cell
- there is more functionality depending on the options, but that's a task to learn later
- there is also a `length` option to set the total length of the result

In [19]:
v <- gl(3, 4); v

In [20]:
class(v)

- the `is.*()` functions (`is.vector()`, `is.factor()`, ...) can tell you whether an object is of a given type if you are unsure.

In [21]:
is.vector(v); is.factor(v)

In [22]:
gl(n = 2, k = 3, labels = c("m", "f"))  # labels are just that: labels
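The `length` option mentioned above can be sketched like this. The pattern of `n` levels with `k` repeats is recycled until the requested length is reached. Variable names are mine.

```r
# Pattern "1 2" repeated until the factor has 6 elements
alternating <- gl(2, 1, length = 6)
as.integer(alternating)   # 1 2 1 2 1 2

# Pattern "1 1 2 2" repeated until the factor has 8 elements
paired <- gl(2, 2, length = 8)
as.integer(paired)        # 1 1 2 2 1 1 2 2
```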

- `expand.grid()` makes a data frame with every combination of the supplied values, like so:

In [23]:
expand.grid(h = c(60,80), w = c(100, 300))

h,w
<dbl>,<dbl>
60,100
80,100
60,300
80,300


#### Random data:
- the functions take the form of `r`*`func`*`(n, other_params...)`
- where _`func`_ is something like `exp`, `norm`, `binom` -- the names of distributions
- you can even replace `r` with `d` (for the pdf), `p` (for the cdf), or `q` (for quantiles)

In [29]:
w <- rpois(5, lambda = 0.5)
w

- of course, for `d`_`func`_`()` and `p`_`func`_`()`, the first argument becomes the value in question instead of the # of returns.

In [33]:
v <- pnorm(3, mean = 0.0, sd = 1.0)
v
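For the normal distribution, the four prefixes line up as sketched below. The variable names and the seed are arbitrary choices of mine.

```r
dens  <- dnorm(0)       # density at 0, equal to 1 / sqrt(2 * pi)
cum   <- pnorm(1.96)    # P(X <= 1.96), roughly 0.975
quant <- qnorm(0.975)   # inverse of the cdf, roughly 1.96
set.seed(1)             # arbitrary seed so the draws are repeatable
draws <- rnorm(3)       # three random draws

pnorm(quant)            # back to 0.975, since qnorm() inverts pnorm()
```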

#### Creating objects very directly:
- they tend to have a default value like `0` or `FALSE` depending on the mode

In [39]:
vector(mode = "logical", length = 4)

In [40]:
matrix(1:6, 2, 3)

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6


In [58]:
x <- 1:5
y <- seq(0.1, 0.5, 0.1)
data.frame(label = x, y)

label,y
<int>,<dbl>
1,0.1
2,0.2
3,0.3
4,0.4
5,0.5


#### Time Series:
- a `ts` object (created by the function of the same name) represents a regularly spaced time series, built on top of a `vector` object
- `ts(data, start, end, frequency, deltat, ...)`
- better suited to yearly, quarterly, or monthly data than to milliseconds and such

In [59]:
ts(1:10, start = 1959)
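The `frequency` option sets how many observations there are per time unit. A sketch with made-up quarterly data starting in the second quarter of 1959:

```r
# Eight quarterly observations; start = c(year, quarter)
quarterly <- ts(1:8, start = c(1959, 2), frequency = 4)

start(quarterly)       # 1959 2
end(quarterly)         # 1961 1
frequency(quarterly)   # 4
```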

#### Type conversion:

In [64]:
as.numeric(FALSE)

In [65]:
z <- c(1, 2)
object_type(z)
as.factor(z)
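One conversion that is easy to get wrong is factor to numeric: `as.numeric()` on a factor returns the internal level codes, not the values printed as labels. A small sketch with an invented factor:

```r
f <- factor(c("10", "20", "10"))

codes  <- as.numeric(f)                 # 1 2 1 (internal level codes)
values <- as.numeric(as.character(f))   # 10 20 10 (the labels, converted)
```

Going through `as.character()` first is the usual way to get the labelled values back.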

#### Functions in R:
- two spaces for indenting is standard
- enclosed with `{}`
- `name_name <- function() { ... }`
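Putting those pieces together, here is a sketch of a small function. `rescale` and its arguments are hypothetical names of mine, not something from the tutorial. The value of the last expression in the body is what the function returns.

```r
# Linearly map the values of x onto the interval [lower, upper]
rescale <- function(x, lower = 0, upper = 1) {
  rng <- range(x)  # c(min, max) of the input
  lower + (x - rng[1]) / (rng[2] - rng[1]) * (upper - lower)
}

rescale(c(2, 4, 6))                          # 0.0 0.5 1.0
rescale(c(2, 4, 6), lower = 10, upper = 20)  # 10 15 20
```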

# No longer following tutorial