In [1]:
# Author: Jayme Anchante
# Date:
Sys.Date()

# 1. Functions and arguments
In R, a function call has the general form:

`function(arg1, arg2, ...)`

For example, `runif()` returns random numbers from a uniform distribution:

`runif(n, min=0, max=1)`

In [2]:
runif(n=10, min=1, max=10)

Arguments with default values can be omitted:

In [3]:
runif(10)

If arguments are named, the order is unimportant:

In [4]:
runif(min=1, max=100, n=10)

Named and unnamed arguments can be mixed, as long as the unnamed ones are in the correct position:

In [5]:
runif(10, max=100, min=1)
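The three calling styles above are equivalent. A minimal sketch to check this, assuming the random seed is reset before each call so that the draws are comparable (the variable names are illustrative only):

```r
# Reset the seed before each call so all three calls draw the same numbers
set.seed(123); by_name   <- runif(n = 5, min = 1, max = 100)
set.seed(123); reordered <- runif(min = 1, max = 100, n = 5)
set.seed(123); mixed     <- runif(5, max = 100, min = 1)

# Both comparisons should be TRUE: only the way arguments are matched differs
identical(by_name, reordered)
identical(by_name, mixed)
```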

## 1.1. Other types of arguments

For example, the `sample()` function:

```sample(x, size, replace=FALSE, prob=NULL)```

> `x` and `size` must be specified

> `replace` is logical: `TRUE` (`T`) or `FALSE` (`F`)

> `prob` is optional

For example: `plot()` function:

```plot(x, y, ...)```

> `...` allows passing additional arguments on to other functions

To see all arguments of a function, use `args()`:

In [6]:
args(sample)
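As a rough illustration of how `...` forwards arguments, the sketch below wraps `runif()`. The helper `scaled_runif()` and its `scale` parameter are hypothetical names invented for this example; they are not part of R.

```r
# Hypothetical helper: anything passed through `...` is handed to runif()
scaled_runif <- function(n, scale = 1, ...) {
    scale * runif(n, ...)
}

set.seed(1)
# min and max are not arguments of scaled_runif(); they travel through `...`
vals <- scaled_runif(5, scale = 2, min = 1, max = 3)
vals
```

Here every value should fall between 2 and 6, since `runif()` draws from [1, 3] and the result is doubled.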

# 2. Help

In [7]:
?runif

In [8]:
help(runif)

In [9]:
# searching for function names which contain some word:
apropos('mod')

In [10]:
apropos('model')

In [11]:
# search for functions whose documentation contains a word
# ('palavra' is Portuguese for 'word')

help.search('palavra')

No vignettes or demos or help files found with alias or concept or
title matching ‘palavra’ using fuzzy matching.

In [12]:
# In-browser help

help.start()

starting httpd help server ... done


If the browser launched by '/usr/bin/xdg-open' is already running, it
    is *not* restarted, and you must switch to its window.
Otherwise, be patient ...


R ships with a set of base packages, not all of which are attached when R starts. To list the attached packages:

In [13]:
search()

To load a package:

In [14]:
library(lattice)

In [15]:
require(MASS)

Loading required package: MASS


In [16]:
search()

Loading a package makes all its functions available to use. To install a package:

In [17]:
install.packages('devtools')

Installing package into ‘/home/jayme/.R/library’
(as ‘lib’ is unspecified)


To check the installed libraries and whether any packages can be updated:

In [18]:
packageStatus()

Number of installed packages:
                        
                         ok upgrade unavailable
  /home/jayme/.R/library 82       1           1
  /usr/lib/R/library     23       6           0

Number of available packages (each package counted only once):
                                                 
                                                  installed not installed
  http://cran.revolutionanalytics.com/src/contrib        98         12400

To update all packages:

In [19]:
# update.packages()

# 3. Creating functions

In [20]:
hello.world <- function() {
    # writing a simple 'Hello world!' function
    writeLines('Hello world!')
}

In [21]:
hello.world()

Hello world!


In [22]:
hello.world <- function(text) {
    # 'Hello world!' function with an argument
    writeLines(text)
}

In [23]:
hello.world('Functions are cool!')

Functions are cool!
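User-defined functions can also declare default values, just like `runif()` does for `min` and `max`. A small sketch, assuming a hypothetical `greet()` function with an invented `times` parameter:

```r
# Hypothetical variant of hello.world() with default argument values
greet <- function(text = 'Hello world!', times = 1) {
    # repeat the message `times` times, print it, and return it invisibly
    msg <- rep(text, times)
    writeLines(msg)
    invisible(msg)
}

greet()                                   # uses both defaults
greet('Functions are cool!', times = 2)   # overrides both
```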


## 3.1. Exercise 1

In [24]:
runif(n=30, min=0, max=1)

In [25]:
runif(n=30, min=-5, max=5)

In [26]:
runif(n=30, min=10, max=500)

In [27]:
?"+"

In [28]:
sum_function <- function(x, y) {
    # sum two numbers, x and y
    sum(x, y)
}

In [29]:
sum_function(4, 8)

In [30]:
dice <- function(n) {
     # simulate n throws of a die
     sample(x=1:6, size=n, replace=TRUE)
}

In [31]:
dice(5)

In [32]:
dices <- function(n) {
    # simulate n throws of two dice
    first_dice <- sample(x=1:6, size=n, replace=TRUE)
    second_dice <- sample(x=1:6, size=n, replace=TRUE)
    print(paste(first_dice, ', ', second_dice, sep=''))
}

In [33]:
dices(5)

[1] "6, 3" "5, 5" "5, 3" "1, 6" "5, 3"


# 4. Objects
An *object* is a symbol or variable capable of storing values or data structures

A *class* is the definition of an object: it describes the object and how it is going to be handled by different functions

*Methods* are the class-specific implementations of generic functions: the generic picks the method to run according to the class of the object

In [34]:
methods(plot)

 [1] plot.acf*            plot.correspondence* plot.data.frame*    
 [4] plot.decomposed.ts*  plot.default         plot.dendrogram*    
 [7] plot.density*        plot.ecdf            plot.factor*        
[10] plot.formula*        plot.function        plot.hclust*        
[13] plot.histogram*      plot.HoltWinters*    plot.isoreg*        
[16] plot.lda*            plot.lm*             plot.mca*           
[19] plot.medpolish*      plot.mlm*            plot.ppr*           
[22] plot.prcomp*         plot.princomp*       plot.profile*       
[25] plot.profile.nls*    plot.raster*         plot.ridgelm*       
[28] plot.shingle*        plot.spec*           plot.stepfun        
[31] plot.stl*            plot.table*          plot.trellis*       
[34] plot.ts              plot.tskernel*       plot.TukeyHSD*      
see '?methods' for accessing help and source code

In [35]:
# x receives 2, becoming an object
# <- is the assignment operator

x <- 2

In [36]:
# to see x contents, simply:

x

In [37]:
x <- 2
y <- 4

# the same as:
x <- 2; y <- 4

In [38]:
# arithmetic operation

x + y

In [39]:
# objects can store different data structures
# the previous value of y is overwritten

y <- runif(10)

In [40]:
y

## 4.1. Object names

* Can be letters, numbers, "_" and "."

* Can't begin with a number or an underscore, nor with a dot followed by a number

* Can't have spaces

* Avoid accents

* Avoid already declared function names

* Object names are case sensitive
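One way to check these rules is `make.names()`, which rewrites a string into a syntactically valid name; a candidate that comes back unchanged was already valid. The candidate strings below are made up for illustration.

```r
# Candidate names (illustrative only)
candidates <- c('my_var', 'my.var', '.hidden', '2fast', '.2fast', 'my var')

# A name is valid if make.names() does not need to change it
is_valid <- candidates == make.names(candidates)
data.frame(candidates, is_valid)
```

The first three candidates should be reported as valid and the last three as invalid (leading number, dot followed by a number, and a space).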

## 4.2. Managing the workspace

In [41]:
# list objects created

ls()

In [42]:
# removes an object

rm(x)

In [43]:
rm(x, y)

“object 'x' not found”

In [44]:
# removes all objects:

rm(list = ls())

## 4.3. Exercise 2

In [45]:
# 1.

x <- 32 + 16 ** 2 - 25 ** 3

In [46]:
# 2.

y <- x / 345

In [47]:
# 3.

random_values <- runif(n=30, min=10, max=50)

In [48]:
# 4.

rm(y)

In [49]:
# 5.

rm(list = ls())

In [50]:
# 6.

apropos('pois')
help(rpois)
rpois(n=100, lambda=5)

# 5. Types and object classes

R has two basic types of vectors:

* Atomic vectors: can only contain one type of element

> double

> integer

> character

> logical

> complex

> raw

* Lists, also known as recursive vectors because they can contain other lists: can contain elements of mixed types

A vector is a unidimensional structure, but structures with more than one dimension can be built from vectors. Objects with dimensions (or other attributes) typically carry a class. The `typeof()` function returns the internal storage type of an object, whereas `class()` returns its class:

In [51]:
x <- c(2, 4, 6)
typeof(x)

In [52]:
class(x)

In [53]:
x <- c(2L, 4L, 6L)
typeof(x)

In [54]:
class(x)

In [55]:
x <- c('a', 'b', 'c')
typeof(x)

In [56]:
class(x)

In [57]:
x <- c(TRUE, FALSE, TRUE)
typeof(x)

In [58]:
class(x)

In [59]:
x <- c(2 + 1i, 4 + 1i, 6 + 1i)
typeof(x)

In [60]:
class(x)

In [61]:
x <- raw(3)
typeof(x)

In [62]:
class(x)
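For plain atomic vectors the two functions give closely related answers. The difference becomes clearer once a vector gains dimensions, as in this small sketch (variable names are illustrative):

```r
v <- 1:6
typeof(v)                  # "integer"
class(v)                   # "integer"

# Same data, now with two dimensions
m <- matrix(v, nrow = 2)
typeof(m)                  # still "integer": the storage type is unchanged
class(m)                   # reports a matrix class instead
is.matrix(m)
```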

## 5.1. Numeric vectors

Ordered collection of numbers, unidimensional structure

Using the `c()` function to create vectors:

In [63]:
num <- c(10, 5, 2, 4, 8, 9)
num

In [64]:
typeof(num)

In [65]:
class(num)

To store a number as an integer, add the *L* suffix:

In [66]:
x <- c(10L, 5L, 2L, 4L, 8L, 9L)
x

In [67]:
typeof(x)

In [68]:
class(x)

One of the differences between *numeric* and *integer* is that integer uses less storage space

In [69]:
object.size(num)

88 bytes

In [70]:
object.size(x)

72 bytes

### 5.1.1. Numeric representation

The numbers displayed are a simplified representation of the numbers stored in memory

In [71]:
x <- runif(10)
x

In [72]:
# get the default number of decimals

getOption('digits')

7 is the default number of significant digits displayed. Internally, however, each number is stored as a 64-bit double (roughly 15 to 16 significant digits), which can sometimes introduce small errors:

In [73]:
sqrt(2) ^ 2 - 2

In [74]:
print(sqrt(2) ^ 2, digits = 22)

[1] 2.000000000000000444089


The result is not exactly zero because a double carries only about 16 significant digits. This kind of error is known as floating-point error. In R, numbers can be printed with up to 22 significant digits:

In [75]:
print(x, digits=1)

 [1] 0.53 0.73 0.93 0.30 0.87 0.38 0.03 0.76 0.09 0.41


In [76]:
print(x, digits=7)

 [1] 0.52906946 0.72586459 0.92717495 0.29825008 0.87389428 0.37516016
 [7] 0.02872974 0.75557713 0.09399029 0.40960838


In [77]:
print(x, digits=22)

 [1] 0.52906945627182722091675 0.72586458525620400905609
 [3] 0.92717494629323482513428 0.29825007519684731960297
 [5] 0.87389427586458623409271 0.37516016233712434768677
 [7] 0.02872973866760730743408 0.75557712675072252750397
 [9] 0.09399028727784752845764 0.40960838040336966514587
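Because of floating-point error, exact comparisons with `==` can give surprising results. A common alternative, sketched below, is a tolerance-based comparison with `all.equal()`; the explicit tolerance `1e-9` in the last line is an arbitrary choice for illustration.

```r
a <- sqrt(2) ^ 2

a == 2                       # FALSE: the stored value is slightly above 2
isTRUE(all.equal(a, 2))      # TRUE: equal within a small tolerance
abs(a - 2) < 1e-9            # manual version with an explicit tolerance
```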


In [78]:
# changing the display to scientific notation
# (print() silently ignores a `scientific` argument, so format() is used instead):

format(x, scientific=TRUE)


### 5.1.2. Sequence of numbers

In [79]:
seq(1, 10)

In [80]:
# same as

1:10

In [81]:
seq(from=1, to=10, by=2)

In [82]:
# 15 values between 1 and 10

seq(1, 10, length.out=15)

In [83]:
rep(1, 10)

In [84]:
rep(c(1, 2, 3), times = 5)

In [85]:
rep(c(1, 2, 3), each=5)

### 5.1.3. Mathematical operations with numeric vectors

In [86]:
# between vector and number

num * 2

In [87]:
# two vectors of the same size

num * num

In [88]:
# or two vectors where the longer length is a multiple of the shorter one

num * c(2, 4, 1)

### 5.1.4. Recycling rules

In [89]:
num / c(2, 4, 1, 3)

“longer object length is not a multiple of shorter object length”
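The warning appears because R recycles the shorter vector until it matches the longer one, and here the last repetition is cut short. A sketch that spells out the recycling by hand with `rep()`; `short`, `recycled`, `manual` and `auto` are names chosen for this example:

```r
num   <- c(10, 5, 2, 4, 8, 9)
short <- c(2, 4, 1, 3)

# What R does internally: repeat `short` up to the length of `num`
recycled <- rep(short, length.out = length(num))
recycled                               # 2 4 1 3 2 4

manual <- num / recycled
auto   <- suppressWarnings(num / short)
identical(manual, auto)                # TRUE
```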

## 5.2. Other types of vectors

In [90]:
# characters

characters <- c('brava', 'joaquina', 'armação')
characters

In [91]:
typeof(characters)

In [92]:
class(characters)

In [93]:
# logical

logical <- num > 4
logical

Some comparison operators (each returns a logical value):

|**Operator**|**Test**|
|------------|--------|
|<|less than|
|<=|less than or equal to|
|>|greater than|
|>=|greater than or equal to|
|==|equal to|
|!=|not equal to|
|%in%|is contained in|
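A few of these operators applied to the `num` vector defined earlier (repeated here so the example is self-contained):

```r
num <- c(10, 5, 2, 4, 8, 9)

num >= 5               # TRUE TRUE FALSE FALSE TRUE TRUE
num != 2               # TRUE everywhere except the third element
num %in% c(2, 4, 6)    # FALSE FALSE TRUE TRUE FALSE FALSE

# logical values count as 0/1, so sum() counts the matches
sum(num > 4)           # 4
```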

## 5.3. Mixing classes of objects

In [94]:
w <- c(5L, 'a')
x <- c(1.7, 'a')
y <- c(TRUE, 2)
z <- c('a', TRUE)

When different types are mixed in an atomic vector, **coercion** occurs so that all elements end up with the same type, following the order logical → integer → double → character. Above we saw R's implicit coercion, but we can explicitly coerce elements using the `as.*` functions
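To see which type each of the mixed vectors ended up with, `typeof()` can be applied to all of them at once. The vectors are redefined here so the sketch runs on its own:

```r
w <- c(5L, 'a')
x <- c(1.7, 'a')
y <- c(TRUE, 2)
z <- c('a', TRUE)

# typeof() of each vector after implicit coercion
sapply(list(w = w, x = x, y = y, z = z), typeof)

y    # TRUE became 1
z    # TRUE became the string "TRUE"
```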

In [95]:
x <- 0:6
typeof(x)

In [96]:
class(x)

In [97]:
as.numeric(x)

In [98]:
as.logical(x)

In [99]:
as.character(x)

In [100]:
as.factor(x)

In [101]:
(x <- c(FALSE, TRUE))

In [102]:
class(x)

In [103]:
as.numeric(x)

In [104]:
# sometimes it is not possible to coerce

x <- c('a', 'b', 'c')
as.numeric(x)

“NAs introduced by coercion”

In [105]:
as.logical(x)

## 5.4. Missing values and special values

Missing values are represented by `NA` (not available)

In [106]:
lost <- c(3, 5, NA, 2)
lost

In [107]:
class(lost)

In [108]:
# check to see if there are any NAs:

is.na(lost)

In [109]:
any(is.na(lost))

Other special values are `NaN` (not a number), e.g. `0/0`, and `-Inf` and `+Inf`, e.g. `-1/0` and `1/0`. `is.na()` also returns `TRUE` for `NaN`:

In [110]:
lost <- c(-1, 0, 1)/0
lost

In [111]:
is.na(lost)

In [112]:
# test if there are infinite values

is.infinite(lost)
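Missing values propagate through most computations, so many functions offer an `na.rm` argument to drop them first. A short sketch, which also contrasts `is.na()` with `is.nan()` (the vector `vals` is illustrative):

```r
vals <- c(3, 5, NA, 2)

mean(vals)                   # NA: the missing value propagates
mean(vals, na.rm = TRUE)     # mean of the three observed values

is.na(c(NA, NaN, Inf))       # TRUE  TRUE  FALSE
is.nan(c(NA, NaN, Inf))      # FALSE TRUE  FALSE
```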

## 5.5. Exercise 3

In [113]:
# 1.

obj <- c(54, 0, 17, 94, 12.5, 2, 0.9, 15)

In [114]:
obj + c(5, 6)

In [115]:
obj + c(5, 6, 7)

“longer object length is not a multiple of shorter object length”

In [116]:
# 2.

obj <- rep(c('A', 'B', 'C'), times=c(15, 12, 8))

In [117]:
obj == 'B'

In [118]:
?sum

In [119]:
sum(obj == 'B')

In [120]:
# 3.

obj <- runif(n=100)

In [121]:
sum(obj >= 0.5)

In [122]:
# 4.

seq1 <- 2 ** seq(1, 50)
seq1

In [123]:
seq2 <- seq(1, 50) ** 2
seq2

In [124]:
seq1[seq1 == seq2]

In [125]:
sum(seq1 == seq2)

In [126]:
sin <- sin(seq(0, 2 * pi, by=0.1))
cos <- cos(seq(0, 2 * pi, by=0.1))
tan <- tan(seq(0, 2 * pi, by=0.1))

In [127]:
tan2 <- sin / cos

In [128]:
tan - tan2

In [129]:
tan[tan == tan2]

In [130]:
max(abs(tan - tan2))

# 6. Other classes

## 6.1. Factor

Similar to character vectors, but stored and treated differently. Factors are a unidimensional collection of categories, or *levels*

In [131]:
factor <- factor(c('high', 'low', 'low', 'average',
                   'high', 'average', 'low', 'average', 'average'))
factor

In [132]:
class(factor)

In [133]:
typeof(factor)

Its class is factor, but its type is integer. Each level is internally represented by an integer code, and the levels have an order (alphabetical by default). Removing the class from the object with `unclass()` exposes these codes:

In [134]:
unclass(factor)

In [135]:
as.character(factor)

In [136]:
as.integer(factor)

In [137]:
# explicitly declaring levels

factor <- factor(c('high', 'low', 'low', 'average',
                   'high', 'average', 'low', 'average', 'average'),
                 levels = c('high', 'average', 'low'))
factor

In [138]:
# explicitly declaring levels and order

factor <- factor(c('high', 'low', 'low', 'average',
                   'high', 'average', 'low', 'average', 'average'),
                 levels = c('high', 'average', 'low'),
                 ordered=TRUE)
factor

In [139]:
typeof(factor)

In [140]:
class(factor)

In [141]:
levels(factor)

In [142]:
nlevels(factor)
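Declaring the order of the levels is what makes comparisons between categories meaningful. A minimal sketch, using a hypothetical `sizes` factor whose levels run from low to high (note this is a different ordering from the example above):

```r
# Hypothetical ordered factor: low < average < high
sizes <- factor(c('high', 'low', 'average', 'low'),
                levels = c('low', 'average', 'high'),
                ordered = TRUE)

as.integer(sizes)    # 3 1 2 1: position of each value among the levels
sizes > 'low'        # TRUE FALSE TRUE FALSE
max(sizes)           # largest level present
table(sizes)         # counts per level, in level order
```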

## 6.2. Matrix

Matrices are two-dimensional vectors; like atomic vectors, they can contain only one type of element

In [143]:
matrix <- matrix(1:12, nrow=3, ncol=4)
matrix

     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12


In [144]:
class(matrix)

In [145]:
typeof(matrix)

In [146]:
# changing the filling order

matrix <- matrix(1:12, nrow=3, ncol=4, byrow=TRUE)
matrix

     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12


In [147]:
# dimensions

dim(matrix)

In [148]:
# adding columns

cbind(matrix, rep(99, 3))

     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    3    4   99
[2,]    5    6    7    8   99
[3,]    9   10   11   12   99


In [149]:
# adding rows

rbind(matrix, rep(99, 4))

     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12
[4,]   99   99   99   99


Matrices can also be created from vectors by adding a dimension attribute:

In [150]:
m <- 1:10
m

In [151]:
class(m)

In [152]:
dim(m)

NULL

In [153]:
dim(m) <- c(2, 5)
m

     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10


In [154]:
class(m)

In [155]:
typeof(m)

### 6.2.1. Matrix operations

In [156]:
# multiply by scalar

matrix * 2

     [,1] [,2] [,3] [,4]
[1,]    2    4    6    8
[2,]   10   12   14   16
[3,]   18   20   22   24


In [157]:
# multiply by matrix

matrix2 <- matrix(1, nrow=4, ncol=3)
matrix %*% matrix2

     [,1] [,2] [,3]
[1,]   10   10   10
[2,]   26   26   26
[3,]   42   42   42
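Note that `*` and `%*%` are different operations: the first multiplies element by element, the second is the algebraic matrix product. A small sketch with a 2 x 2 matrix `A` (an illustrative name) makes the difference visible:

```r
A <- matrix(1:4, nrow = 2)   # filled by column: first row 1 3, second row 2 4

A * A       # element-wise: each entry is squared
A %*% A     # matrix product: rows of A times columns of A
t(A)        # transpose
```

For this `A`, the element-wise product has rows (1, 9) and (4, 16), while the matrix product has rows (7, 15) and (10, 22).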


## 6.3. Array

An array is a matrix with n dimensions

In [158]:
ar <- array(1:12, dim=c(2, 2, 3))
ar

In [159]:
ar <- array(1:12, dim=c(3, 2, 2))
ar

## 6.4. List

In [160]:
list1 <- list(1:30, 'R', list(TRUE, FALSE))
list1

In [161]:
# check the structure

str(list1)

List of 3
 $ : int [1:30] 1 2 3 4 5 6 7 8 9 10 ...
 $ : chr "R"
 $ :List of 2
  ..$ : logi TRUE
  ..$ : logi FALSE


In [162]:
# unidimensional

dim(list1)

NULL

In [163]:
length(list1)

In [164]:
list2 <- list(factor, matrix)
list2

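[[1]]
[1] high    low     low     average high    average low     average average
Levels: high < average < low

[[2]]
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12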


## 6.5. Dataframe

A dataframe is a two-dimensional list in which every element must have the same length; each vector of the list becomes a column

In [165]:
df <- data.frame(nome=c('João', 'José', 'Maria'),
                 sexo=c('M', 'M', 'F'),
                 idade=c(32, 34, 30))
df

nome,sexo,idade
João,M,32
José,M,34
Maria,F,30


In [166]:
class(df)

In [167]:
typeof(df)

In [168]:
dim(df)

In [169]:
str(df)

'data.frame':	3 obs. of  3 variables:
 $ nome : Factor w/ 3 levels "João","José",..: 1 2 3
 $ sexo : Factor w/ 2 levels "F","M": 2 2 1
 $ idade: num  32 34 30
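The `str()` output above shows the character columns stored as factors. That was the default of `data.frame()` before R 4.0.0; since then strings stay as character unless requested otherwise. The sketch below sets `stringsAsFactors` explicitly so the result does not depend on the R version (the data are made up):

```r
# Explicitly convert strings to factors (the old default)
df_fct <- data.frame(name = c('Ana', 'Bruno'), stringsAsFactors = TRUE)

# Explicitly keep strings as character (the default since R 4.0.0)
df_chr <- data.frame(name = c('Ana', 'Bruno'), stringsAsFactors = FALSE)

class(df_fct$name)    # "factor"
class(df_chr$name)    # "character"
```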


In [170]:
length(num); length(factor)

In [171]:
df2 <- data.frame(numeric=c(num, NA, NA, NA),
                  factor=factor)
df2

numeric,factor
10.0,high
5.0,low
2.0,low
4.0,average
8.0,high
9.0,average
,low
,average
,average


In [172]:
str(df2)

'data.frame':	9 obs. of  2 variables:
 $ numeric: num  10 5 2 4 8 9 NA NA NA
 $ factor : Ord.factor w/ 3 levels "high"<"average"<..: 1 3 3 2 1 2 3 2 2


In [173]:
# converting a data.frame to matrix
# converts to character

as.matrix(df2)

numeric,factor
10.0,high
5.0,low
2.0,low
4.0,average
8.0,high
9.0,average
,low
,average
,average


In [174]:
# converts to integer

data.matrix(df2)

numeric,factor
10.0,1
5.0,3
2.0,3
4.0,2
8.0,1
9.0,2
,3
,2
,2


# 7. Object attributes

An attribute is a piece of information attached to an object without interfering with its actual values; attributes can be seen as metadata. Examples: `names`, `dimnames`, `dim`, `class`. They can all be inspected with the `attributes()` function

In [175]:
x <- 1:6
attributes(x)

NULL

In [176]:
names(x) <- c('um', 'dois', 'três', 'quatro', 'cinco', 'seis')
names(x)

In [177]:
attributes(x)

In [178]:
x

In [179]:
# attributes can be removed by assigning NULL

names(x) <- NULL
attributes(x)

NULL

In [180]:
x

In [181]:
length(x)

In [182]:
length(x) <- 10
x

In [183]:
length(x) <- 6
dim(x)

NULL

In [184]:
dim(x) <- c(3, 2)
x

     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6


In [185]:
attributes(x)

In [186]:
dim(x) <- NULL
x
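Individual attributes can also be read and set with `attr()`. The sketch below shows that setting `dim` this way is what turns a vector into a matrix, and that arbitrary metadata can be attached too; the attribute name `source` and its value are invented for this example.

```r
v <- 1:6
attr(v, 'dim') <- c(2, 3)        # same effect as dim(v) <- c(2, 3)
is.matrix(v)                     # TRUE

# custom metadata (hypothetical attribute name)
attr(v, 'source') <- 'hypothetical example'
attributes(v)

# as.vector() drops the attributes again
v_plain <- as.vector(v)
attributes(v_plain)              # NULL
```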

In [187]:
# lists can also have names

x <- list(Curitiba = 1, Paraná = 2, Brasil = 3)
x

In [188]:
names(x)

In [189]:
# associating names with matrix rows and columns

matrix

     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12


In [190]:
attributes(matrix)

In [191]:
rownames(matrix) <- c('A', 'B', 'C')
colnames(matrix) <- c('T1', 'T2', 'T3', 'T4')
matrix

  T1 T2 T3 T4
A  1  2  3  4
B  5  6  7  8
C  9 10 11 12


In [192]:
attributes(matrix)

In [193]:
# data.frames

df

nome,sexo,idade
João,M,32
José,M,34
Maria,F,30


In [194]:
attributes(df)

In [195]:
names(df)

In [196]:
row.names(df)

## 7.1. Exercise 4

In [197]:
# 1.

matrix <- matrix(c(2, 8, 4, 0, 4, 1, 9, 7, 5), nrow=3, ncol=3, byrow=T)
matrix

     [,1] [,2] [,3]
[1,]    2    8    4
[2,]    0    4    1
[3,]    9    7    5


In [198]:
# 2.

rownames(matrix) <- c(1, 2, 3)
colnames(matrix) <- c('col1', 'col2', 'col3')
matrix

  col1 col2 col3
1    2    8    4
2    0    4    1
3    9    7    5


In [199]:
# 3.

list3 <- list(rep(c('A', 'B', 'C'), times=c(2, 5, 4)), matrix)
list3

[[1]]
 [1] "A" "A" "B" "B" "B" "B" "B" "C" "C" "C" "C"

[[2]]
  col1 col2 col3
1    2    8    4
2    0    4    1
3    9    7    5


In [200]:
# 4.

names(list3) <- c('comp1', 'comp2')
list3

$comp1
 [1] "A" "A" "B" "B" "B" "B" "B" "C" "C" "C" "C"

$comp2
  col1 col2 col3
1    2    8    4
2    0    4    1
3    9    7    5


In [201]:
list3 <- list(list3, fator = factor(c('brava', 'joaquina', 'armação')))
list3


[[1]]
[[1]]$comp1
 [1] "A" "A" "B" "B" "B" "B" "B" "C" "C" "C" "C"

[[1]]$comp2
  col1 col2 col3
1    2    8    4
2    0    4    1
3    9    7    5


$fator
[1] brava    joaquina armação 
Levels: armação brava joaquina


In [202]:
df <- data.frame(nome=c('João', 'Maria'),
                 sobrenome=c('Silva', 'Sauro'),
                 animal=c(TRUE, FALSE),
                 quantidadeAnimal=c(2,0))
df

nome,sobrenome,animal,quantidadeAnimal
João,Silva,TRUE,2
Maria,Sauro,FALSE,0


In [203]:
df <- rbind(df, data.frame(nome='José', sobrenome='Santos', animal=TRUE, quantidadeAnimal=5))
df

nome,sobrenome,animal,quantidadeAnimal
João,Silva,TRUE,2
Maria,Sauro,FALSE,0
José,Santos,TRUE,5


In [204]:
df <- cbind(df, data.frame(time=c('Inter', 'Grêmio', 'Juventude')))
df

nome,sobrenome,animal,quantidadeAnimal,time
João,Silva,TRUE,2,Inter
Maria,Sauro,FALSE,0,Grêmio
José,Santos,TRUE,5,Juventude


# 8. Object-oriented programming

In [205]:
methods(mean)

[1] mean.Date     mean.default  mean.difftime mean.POSIXct  mean.POSIXlt 
see '?methods' for accessing help and source code

In [206]:
set.seed(42)
vec <- rnorm(100)
class(vec)

In [207]:
mean(vec)

There is no method specific to the numeric class, so the call falls back to `mean.default()`. The generic function is `mean()` and the method actually executed is `mean.default()`

In [208]:
mean.default(vec)

In [209]:
# we could force another method

mean.Date(vec)

In [210]:
mean

In [211]:
mean.default

In [212]:
# suppose we want the row mean

mat <- matrix(rnorm(50), nrow=5)
mean(mat)
# not what we want

In [213]:
# we could define a custom function for that

mean.matrix <- function(x, ...) rowMeans(x)

In [214]:
methods(mean)

[1] mean.Date     mean.default  mean.difftime mean.matrix   mean.POSIXct 
[6] mean.POSIXlt 
see '?methods' for accessing help and source code

In [215]:
class(mat)

In [216]:
mean(mat)

We could do the same for other classes, such as dataframes, lists etc.
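The same mechanism works for classes we define ourselves. The sketch below is only an illustration: the class name `temperature_log` and its fields are invented here. It builds an object with `structure()` and gives the existing generic `mean()` a method for it.

```r
# 'temperature_log' is a made-up class used only for this illustration
temps <- structure(list(city = 'Curitiba', values = c(20, 22, 19)),
                   class = 'temperature_log')

# method for the generic mean(): average the stored values
mean.temperature_log <- function(x, ...) mean(x$values, ...)

mean(temps)                           # dispatches to mean.temperature_log()
inherits(temps, 'temperature_log')    # TRUE
```

Naming the function `generic.class` is what lets `mean()` find it: the generic looks up the class of its first argument and calls the matching method, falling back to `mean.default()` when none exists.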