# Essentials - Atomic Vectors 

__Atomic vectors are sequences of values that are all of the same type.__

See Chapter 16, "Vectors", in R for Data Science by Garrett Grolemund and Hadley Wickham, published by O'Reilly Media, Inc., 2016 (http://r4ds.had.co.nz, [Amazon](http://amzn.to/2aHLAQ1)).

![alt](http://www.scatter.com/images/DataLab_logo.jpg)

## Table of contents 

1. Creating Atomic Vectors 
2. Naming vectors 
3. Finding the length of vectors 
4. Subsetting vectors 
5. Arithmetic operations on vectors 
6. Comparing vectors 
7. Functions on vectors (numeric and character)

Vectors are the most basic R objects, and they come in two flavours: lists and atomic vectors.

Atomic vectors contain only one type of value, while lists can combine different types of values.

In this notebook, you will learn about atomic vectors.

There are four common types of atomic vectors: logical, integer, double and character. Integer and double vectors are together known as numeric vectors.
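
As a quick check of the types listed above, `typeof()` reports the underlying type of a vector. The sketch below also shows what happens when `c()` is given mixed values: R coerces them to one common type, which is why an atomic vector always holds a single type. The variable name `mixed` is illustrative only.

```r
# typeof() reports the underlying type of each vector
typeof(c(TRUE, FALSE))   # "logical"
typeof(c(1L, 2L))        # "integer" (the L suffix marks an integer)
typeof(c(1.5, 2))        # "double"
typeof(c("a", "b"))      # "character"

# Mixed values are coerced to one common type, here character
mixed <- c(1, "a", TRUE)
mixed                    # "1" "a" "TRUE"
typeof(mixed)            # "character"
```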

### Creating Atomic Vectors

There are several ways of creating vectors of length greater than `1`.

The two basic ways are:
1. The concatenation function `c()`
2. The colon operator `:` (works only with numeric vectors)

In [7]:
%r
myvec <- c("the", "brown" , "fox", "hello")
myvec

In [8]:
%r
c(234,34,-356, 2249999)

In [9]:
%r
c(TRUE, FALSE, TRUE)

In [10]:
%r
c(3,3,4)->x
typeof(x)

In [11]:
%r
10:20

In [12]:
%r
7:0

In [13]:
%r
3:-3

Another way to generate a sequence of numbers is the `seq()` function, which lets you set either the step size (`by`) or the number of values (`length.out`).

In [15]:
%r
seq(from=3, to=-3, by=-2)

In [16]:
%r
seq(from=1, to=10, length.out=6)
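
The two `seq()` calls above differ in how the spacing is chosen: `by` fixes the step between values, while `length.out` fixes how many values are returned and lets R work out the step. A minimal sketch, using the made-up names `by_step` and `by_count`:

```r
# Fix the step size with `by`
by_step <- seq(from = 0, to = 1, by = 0.25)
by_step      # 0.00 0.25 0.50 0.75 1.00

# Fix the number of values with `length.out`
by_count <- seq(from = 0, to = 1, length.out = 5)
by_count     # 0.00 0.25 0.50 0.75 1.00

# Both calls produce the same five values
all.equal(by_step, by_count)   # TRUE
```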

### Naming vectors

Vector elements can be named:
- with the `c()` function, when they are created 
- with the `names()` function, after they have been created

In [18]:
%r
c(x = 3, y = 20, z = 100)
typeof(c(x = 3, y = 20, z = 100))

In [19]:
%r
7:10 -> y
c("a","b","c","d") -> names(y)
y

Note: Supplying fewer names than there are elements does not generate an error; the remaining elements get the name `NA`. Supplying more names than there are elements, on the other hand, generates an error.

In [21]:
%r
7:10 -> z
c("a","b") -> names(z)
z
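
The opposite case from the note above, supplying too many names, can be sketched as follows. `tryCatch()` is used here only so that the cell runs to completion and the error message can be inspected; the variable `msg` is illustrative.

```r
z <- 7:10

# Assigning five names to four elements fails; tryCatch() captures
# the error message instead of stopping the cell.
msg <- tryCatch(
  {
    names(z) <- c("a", "b", "c", "d", "e")
    "no error"
  },
  error = function(e) conditionMessage(e)
)
msg

# z keeps no names because the assignment never completed
names(z)   # NULL
```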

### Length of vectors

The `length()` function returns the number of elements in a vector.

All values from the notebook `1. Values` are atomic vectors of length `1`.

In [23]:
%r
length("Hello")

In [24]:
%r
length(35)

In [25]:
%r
length(TRUE)

In [26]:
%r
myvec<-11:18
length(myvec)

In [27]:
%r
length(c("a","b","c"))

In [28]:
%r
length(c(TRUE, FALSE))

In [29]:
%r
length(c(1, NA, 3))

### Subsetting Vectors

- Single elements can be retrieved from a vector. 
- Collections of elements can be retrieved from a vector.

Retrieve single elements from a vector by using square brackets containing the position of the element to be retrieved.

In [32]:
%r
x = c("dog", "fish", "cat", "blue", "red", "green")
x[length(x)]

Collections of elements can be retrieved by placing a vector of positive integers inside the square brackets. 

This returns all elements at positions specified by these integers.

In [34]:
%r
y

In [35]:
%r
y[c("a", "c")]

In [36]:
%r
y[2:3]

In [37]:
%r
y[c(1,4)]

In [38]:
%r
x = c("dog", "fish", "cat", "blue", "red", "green")
x[6:1]
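
The cell `y[c("a", "c")]` above selects elements by name rather than by position, which works for any named vector. A small self-contained sketch (the vector `scores` is a made-up example, not part of the original notebook):

```r
scores <- c(a = 7, b = 8, c = 9, d = 10)

# Select by name
by_name <- scores[c("a", "c")]
by_name

# Select the same elements by position
by_pos <- scores[c(1, 3)]
by_pos

# Both approaches return the same named elements
identical(by_name, by_pos)   # TRUE
```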

The `NA` value is returned if the position integer does not specify an item in the vector. This happens when the position integer is greater than the length of the vector.

In [40]:
%r
x[c(6,1, 1, 1, 100)]

In [41]:
%r
length(x)

In [42]:
%r
y = x[length(x):1]
y

Collections of elements can also be retrieved by placing a vector of __negative integers__ inside the square brackets. 

In this case, all elements of the vector are returned __except__ for those at the positions given by the absolute value of each integer. For example, `-1` leaves out the first element and `-4` leaves out the fourth.

In [45]:
%r
x

In [46]:
%r
x[-1]

In [47]:
%r
x[c(-4,-1)]

Logical vectors can also be used to retrieve elements: the elements at positions where the logical vector is `TRUE` are returned.

In [49]:
%r
x[as.logical(c(1, 0, 0, 0, 0, 1))]
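
In practice the logical vector is usually produced by a comparison rather than typed by hand. A minimal sketch, assuming a made-up numeric vector `nums`:

```r
nums <- c(5, 12, 3, 20, 8)

# The comparison returns one TRUE/FALSE per element
keep <- nums > 6
keep           # FALSE TRUE FALSE TRUE TRUE

# Only elements where keep is TRUE are returned
nums[keep]     # 12 20 8

# The same selection written in one step
nums[nums > 6]
```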

### Operations on Vectors

Arithmetic operations can be performed on vectors, just as on single values, when:
- the vectors are the same length
- at least one of the vectors has length of `1`

In [51]:
%r
1:2

In [52]:
%r
1:4

If two vectors are of unequal length, the values in the shorter vector are repeated (recycled) until its length matches that of the longer vector, and then the operation is performed element by element.

In [54]:
%r
1:5 + 1:2

In [55]:
%r
c(1,2,3) * 4:6

In [56]:
%r
10 * 1:3

R will also perform arithmetic on two vectors whose lengths do not fit together evenly, __but don't do this.__ 

When the vectors are of different lengths, the length of the longer vector __should be__ a multiple of the length of the shorter vector. Otherwise R still recycles the shorter vector but issues a warning, as `1:5 + 1:2` does above.
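
The recycling behaviour described above can be sketched as follows. `tryCatch()` is used only to capture the warning text so it can be inspected; the variable names are illustrative.

```r
# Length 4 is a multiple of length 2, so 1:2 is recycled cleanly
even_fit <- 1:4 + 1:2
even_fit     # 2 4 4 6

# A length-1 vector is recycled against every element
scaled <- 10 * 1:3
scaled       # 10 20 30

# Length 5 is not a multiple of length 2: R raises a warning,
# which is captured here as a character string
warn_msg <- tryCatch(1:5 + 1:2, warning = function(w) conditionMessage(w))
warn_msg
```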

### Comparing Vectors

Two vectors can be compared element by element using relational operators. If the lengths differ, the shorter vector is recycled, just as in arithmetic. See the link below for a list of these operators:
- https://stat.ethz.ch/R-manual/R-devel/library/base/html/Comparison.html

For example,

In [59]:
%r
c(1,2,3,4) < c(4,3)

The elements of a vector can be compared to a vector with a single element.

In [61]:
%r
1:10 < 5

In [62]:
%r
c(1,2,3,4,5,6,7,8,9,10)

In [63]:
%r
1:10 == 10

Eileen: What is the difference between `=` and `==`? When should we use `=` and when should we use `==`?
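
As a short answer to the question above: `=` assigns a value to a name (much like `<-`) and is also used to pass named arguments in function calls, while `==` tests for equality element by element and returns a logical vector. A minimal sketch with an illustrative variable `v`:

```r
# `=` assigns: v now holds the integers 1 to 10
v = 1:10

# `==` compares: one TRUE/FALSE per element of v
is_ten <- v == 10
is_ten

# Position of the element that equals 10
which(is_ten)   # 10

# `=` also names arguments inside a function call
seq(from = 1, to = 3)   # 1 2 3
```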

### Functions on Vectors

There are many functions that take a vector as input. Here are some of the common ones. 

See also: http://www.statmethods.net/management/functions.html

#### Functions on Numeric Vectors

The most common functions on numeric vectors are: `sum`, `min`, `max`, `mean`. Those and a few others are listed below.

First create a vector `x` to use in the examples.

In [68]:
%r
1:4 -> x
x

In [69]:
%r
sum(x)

In [70]:
%r
prod(x)

In [71]:
%r
min(x)

In [72]:
%r
max(x)

In [73]:
%r
mean(x)

In [74]:
%r
newvc<-c(10,20,230,40,50)
weight<-c(1,2,3,4,5)
wtdvalue<-weighted.mean(newvc,weight)
wtdvalue
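
The weighted mean in the cell above is the sum of each value times its weight, divided by the sum of the weights. The sketch below recomputes it by hand to show that it agrees with `weighted.mean()`; `by_hand` is an illustrative name.

```r
newvc  <- c(10, 20, 230, 40, 50)
weight <- c(1, 2, 3, 4, 5)

# Sum of value * weight, divided by the total weight: 1150 / 15
by_hand <- sum(newvc * weight) / sum(weight)
by_hand

# Agrees with the built-in function
all.equal(by_hand, weighted.mean(newvc, weight))   # TRUE
```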

In [75]:
%r
sd(x)

In [76]:
%r
round(c(1.254342, 3.0, 4.42523459), 1)

In [77]:
%r
sort(c(4,8,1,3,5,2), decreasing= TRUE)

#### Functions on Character Vectors

The most common functions on character vectors are `nchar`, `tolower` and `toupper`.

In [79]:
%r
c("Live", "long", "and", "prosper") -> y
y 

In [80]:
%r
nchar(y)

In [81]:
%r
tolower(y)

In [82]:
%r
toupper(y)

`substr()` extracts part of each string in a character vector, from position `start` to position `stop`. Used on the left-hand side of an assignment, it replaces that part of each element instead.

In [84]:
%r
substr("abcdefg", start=2, stop=5)

In [85]:
%r
?substr

In [86]:
%r
c("mat", "car", "say") -> p
substr(p, 1,1)<- '#'
p

In [87]:
%r
c("mat", "car", "say") -> p
substr(p, 2, 3)<- '##'
p
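
Two details of `substr()` are worth a short sketch: it works on every element of a character vector at once, and a replacement that is shorter than the selected range only overwrites as many characters as it contains. The vector `words` is a made-up example.

```r
words <- c("Live", "long", "and", "prosper")

# Extract the first two characters of every element
first_two <- substr(words, 1, 2)
first_two    # "Li" "lo" "an" "pr"

# A one-character replacement over a three-character range
# only overwrites the first character
p <- c("mat", "car", "say")
substr(p, 1, 3) <- "#"
p            # "#at" "#ar" "#ay"
```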

__Exercise__: Create a vector named `over_half` that contains only those numbers from vector `my_vec` (see below) which are greater than `0.5`. 
Then take the average of the numbers in `over_half`. 
Store your result in variable `over_half_avg`.

The End