# Vectors and Modes 

<img src=https://www.evernote.com/l/AAFwjrTTrFJI24Q5-ifUjOKGn9MeoAzMNAkB/image.png width=600px>

## Vectors

A vector is an ordered sequence of values. 

All elements of a vector must have the same **mode** or data type. 
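
A minimal sketch of what the single-mode rule means in practice: when values of different modes are combined with `c()`, R does not raise an error but silently coerces them to one common mode. The variable names `mixed` and `num_lgl` are illustrative only.

```r
# Combining modes in c() coerces every element to the most general mode
mixed <- c(1, 'a', TRUE)
mixed         # every element is now a character string
mode(mixed)   # "character"

# Logical values combined with numbers become 1 (TRUE) and 0 (FALSE)
num_lgl <- c(1, TRUE, FALSE)
num_lgl
mode(num_lgl) # "numeric"
```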

In [None]:
char_v <- c('a','b','c')
mode(char_v)

In [None]:
int_v <- c(1,2,3)
mode(int_v)  # reports "numeric": despite the name, these values are stored as doubles

In [None]:
int_v

In [None]:
int_v[1]

##### In R, vectors are subscripted beginning with 1
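
A short sketch of a few common subscripting forms that follow from 1-based indexing. The vector `v` and its values are made up for illustration.

```r
v <- c(10, 20, 30, 40)  # illustrative values

v[1]          # first element
v[length(v)]  # last element
v[2:3]        # a range of positions
v[-1]         # a negative index drops that position instead of selecting it
v[0]          # index 0 selects nothing and returns an empty vector
```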

### R doesn't have scalars 

Any scalar value is actually a vector of length 1. 

In [None]:
scalar <- 8

In [None]:
scalar[1]; length(scalar); mode(scalar)

### Strings

#### Strings are just character vectors

In [None]:
string <- 'abc'; mode(string)

#### `paste` to concatenate strings 

In [None]:
h <- 'hello'; w <- 'world'
paste(h,w)
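
By default `paste` separates its arguments with a single space. The sketch below shows the `sep` and `collapse` arguments and the `paste0` shortcut, which are often needed alongside the basic call.

```r
h <- 'hello'; w <- 'world'

paste(h, w)              # default separator is a single space
paste(h, w, sep = ', ')  # custom separator
paste0(h, w)             # shortcut for sep = ''

# collapse joins the elements of one vector into a single string
paste(c('a', 'b', 'c'), collapse = '-')
```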

In [None]:
blake <- 'The sun descending in the West,\nThe evening star does shine;\nThe birds are silent in their nest,\nAnd I must seek for mine.'

In [None]:
print(blake)
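
Note that `print` shows the string as stored, with each newline displayed as the escape sequence `\n`. If the goal is to see the text with real line breaks, `cat` is one option. A small sketch, using a made-up two-line string:

```r
two_lines <- 'first line\nsecond line'  # illustrative string

print(two_lines)  # one line of output, newline shown as \n
cat(two_lines)    # two lines of output, newline rendered as a line break
```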

#### `strsplit` to split a string on a character

In [None]:
blake_split <- strsplit(blake, "\n")

In [None]:
unlist(blake_split)

In [None]:
mode(unlist(blake_split))

In [None]:
# loop over the individual lines; `line` avoids shadowing base R's substring()
for (line in unlist(blake_split)) {
    print(line)
}
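
The calls to `unlist` above are needed because `strsplit` returns a list with one element per input string, not a plain character vector. The sketch below illustrates this with made-up inputs; it also shows `fixed = TRUE`, which treats the split pattern as literal text rather than as a regular expression.

```r
# strsplit returns one list element per input string
parts <- strsplit(c('a b', 'c d e'), ' ')
class(parts)    # "list"
length(parts)   # one element for each input string
parts[[2]]      # the pieces of the second string
lengths(parts)  # number of pieces per input string

# The pattern is a regular expression by default; fixed = TRUE splits on literal text
dotted <- strsplit('a.b.c', '.', fixed = TRUE)[[1]]
dotted
```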


### Matrices 

A matrix is a rectangular array of values, most often numbers.

A matrix in R is a vector with two additional properties, a number of rows and a number of columns.

In [None]:
mat <- rbind(c(1,2,3),c(4,5,6))
mat

We can see these properties with the `dim` function. Here we see that our matrix has two rows and three columns.

In [None]:
dim(mat)
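
To make the "vector with two additional properties" idea concrete, here is a small sketch showing that the dimensions are stored as an attribute, and that the underlying vector is laid out column by column. The variable `v` is illustrative.

```r
mat <- rbind(c(1, 2, 3), c(4, 5, 6))

attributes(mat)  # the dimensions are stored in the $dim attribute
length(mat)      # total number of elements in the underlying vector
as.vector(mat)   # elements come back column by column

# Going the other way: giving a plain vector a dim attribute turns it into a matrix
v <- 1:6
dim(v) <- c(2, 3)
is.matrix(v)
v
```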

Matrices are subscripted similarly to vectors, again beginning from an index of 1.

In [None]:
mat[1,2]; mat[2,3]
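
Leaving one subscript empty selects a whole row or column. A brief sketch, rebuilding the same matrix so the example stands on its own:

```r
mat <- rbind(c(1, 2, 3), c(4, 5, 6))

mat[1, ]                # entire first row, returned as a plain vector
mat[, 2]                # entire second column, returned as a plain vector
mat[, 2, drop = FALSE]  # drop = FALSE keeps the result as a 2-by-1 matrix
```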

## Lists

A list is a compound object like the vector or the matrix, but its elements can be of different data types. 

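List elements can be named, so a list can act as a set of key-value pairs, much like a dictionary in Python or a hash in Ruby or Perl. A list also keeps its elements in order, and names are optional, so elements can be accessed by position as well. 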

In [None]:
l <- list(a=1, b='foo', c=1:10)

In [None]:
l

We can store and access elements in a list using dollar sign notation.

#### Store an element

In [None]:
l$d <- 'bar'

In [None]:
l

#### Access an element

In [None]:
l$c
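
Dollar sign notation is not the only way to reach into a list. The sketch below, which rebuilds the list so it runs on its own, contrasts double brackets (which return the element itself) with single brackets (which return a shorter list).

```r
l <- list(a = 1, b = 'foo', c = 1:10)

names(l)   # the keys of the list
l[['c']]   # same element as l$c, looked up by name
l[[3]]     # same element, looked up by position
l['c']     # single brackets return a list of length 1, not the element itself

class(l[['c']])  # "integer"
class(l['c'])    # "list"
```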

### Usage of lists in R

Many of the objects we will be working with in R will be complex statistical objects. R uses lists to store these values. 

For example, we can generate a histogram with the `hist` function, as we did previously. The function also returns a list that we can assign to a variable.

In [None]:
library(repr)
options(repr.plot.width=12, repr.plot.height=4)

In [None]:
hist_nile <- hist(Nile)

Here, we display the output of this stored list. 

In [None]:
hist_nile
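
Because the result is a list, the usual list tools apply to it. A minimal sketch, using `plot = FALSE` (an argument of `hist`) so that only the list is computed and no plot is drawn:

```r
hist_nile <- hist(Nile, plot = FALSE)  # compute the histogram without drawing it

names(hist_nile)       # the names of the list components
hist_nile$counts       # number of observations in each bin
sum(hist_nile$counts)  # every observation falls in exactly one bin
length(Nile)           # so this matches the sum above
str(hist_nile)         # compact overview of the whole structure
```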

## Data Frames

A matrix can be thought of as a vector of vectors.

A data frame can be thought of as a list of lists (one list per row) or, closer to how R actually stores it, as a list of equal-length vectors (one vector per column).

Let's briefly consider why we would need such a data structure. 

> In an employee data set, for example, we might have character string data, such as employee names, and numeric data, such as salaries. So, although a data set of (say) 50 employees with 4 variables per worker has the look and feel of a 50-by-4 matrix, it does not qualify as such in R, because it mixes types.

### The Data Frame as a list of lists

In [None]:
# list() keeps each value's own mode; c() would coerce the salary to a character string
employee_1 <- list(name='Alice', salary=87000)
employee_2 <- list(name='Bob', salary=81000)
employee_3 <- list(name='Carol', salary=88000)
employee_4 <- list(name='Dean', salary=87000)

In [None]:
employees <- list(employee_1, employee_2, employee_3, employee_4)

In [None]:
# convert each employee list to a one-row data frame, then stack the rows
employees_df <- do.call(rbind, lapply(employees, as.data.frame))
employees_df

### The Data Frame as a list of vectors

In [None]:
salaries <- c(87000, 81000, 88000, 87000)
employee_names <- c('Alice','Bob','Carol','Dean')  # avoids shadowing base R's names()
data.frame(list(name=employee_names, salary=salaries))
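
The list-of-vectors view can be checked directly. In this sketch the data frame is stored in a variable named `df` (an illustrative name), and `stringsAsFactors = FALSE` is passed so the name column stays character regardless of R version.

```r
df <- data.frame(name = c('Alice', 'Bob', 'Carol', 'Dean'),
                 salary = c(87000, 81000, 88000, 87000),
                 stringsAsFactors = FALSE)

is.list(df)        # a data frame is a list underneath
names(df)          # the list's names are the column names
df$salary          # each column is an ordinary vector
mean(df$salary)    # so vector functions apply to it directly
sapply(df, class)  # each column keeps its own type

df[df$salary > 85000, ]  # rows and columns are subscripted like a matrix
```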

## Classes

The bottom line for object-oriented programming languages, like R, is that:

1. everything that you will be working with is an object
2. objects are instances of classes

We have seen this already in the `hist_nile` object returned by the `hist` function.

In [None]:
print(hist_nile)

Note that this object has six named components (`$breaks`, `$counts`, `$density`, `$mids`, `$xname`, and `$equidist`), but it also has a single **attribute**, `"class"`, and the class of this object is `"histogram"`. 
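
A short sketch of how the class can be inspected, followed by a toy example of attaching a class to a list by hand. The `employee` class and its `print.employee` method are hypothetical and exist only for this illustration.

```r
h <- hist(Nile, plot = FALSE)  # plot = FALSE: compute the object without drawing

class(h)                  # "histogram"
inherits(h, 'histogram')  # TRUE
is.list(unclass(h))       # with the class removed, a plain list remains
attributes(h)             # shows the component names and the class

# Hypothetical class: a list becomes an 'employee' once the class attribute is set
emp <- structure(list(name = 'Alice', salary = 87000), class = 'employee')

# Functions such as print() pick a method based on the class of their argument
print.employee <- function(x, ...) {
    cat(x$name, 'earns', x$salary, '\n')
    invisible(x)
}
print(emp)
```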

### `histogram` class

According to the [documentation](http://stat.ethz.ch/R-manual/R-devel/library/graphics/html/hist.html), the `hist` function returns:

Value

> an object of class "histogram" which is a list with components:

> `breaks`	
> the `n+1` cell boundaries (`= breaks` if that was a vector). These are the nominal breaks, not with the boundary fuzz.
> 
> `counts`	
> n integers; for each cell, the number of x[] inside.
> 
> `density`	
> values $\widehat{f}(x_i)$, as estimated density values. If `all(diff(breaks) == 1)`, they are the relative frequencies `counts/n` and in general satisfy $\sum_i \widehat{f}(x_i)\,(b_{i+1} - b_i) = 1$, where $b_i$ = `breaks[i]`.
> 
> `mids`	
> the n cell midpoints.
> 
> `xname`	
> a character string with the actual x argument name.
> 
> `equidist`	
> logical, indicating if the distances between breaks are all the same.
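
The relationships described in the documentation can be checked against the Nile histogram. This is a sketch for illustration; `all.equal` is used for the comparisons because they involve floating-point arithmetic.

```r
h <- hist(Nile, plot = FALSE)  # compute the histogram object without drawing it

# n cells are delimited by n + 1 breaks
length(h$breaks) == length(h$counts) + 1

# each midpoint lies halfway between two neighbouring breaks
all.equal(h$mids, (head(h$breaks, -1) + tail(h$breaks, -1)) / 2)

# density times bin width, summed over all bins, gives a total area of 1
all.equal(sum(h$density * diff(h$breaks)), 1)

h$equidist  # whether all bins have the same width
```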