# Lecture 3: An overview of R: Part II
- Access the values of an object
- Enter or import data into R
- Export data
- Save and load data
- View data

###### Before we start.
- I have been using print( ) to display R objects, which is redundant here: Jupyter Notebook displays an object when you simply type its name, using its own, nicer rendering.
- Jupyter is ideal for teaching: bigger display, live editing.
- Hope you like it.
- I am lazy and will not call print( ) unless there is something I want to show you.
- It does NOT make a difference for you: the values are the same, only the display differs.

In [1]:
df <- data.frame(names = c("Lucy", "John", "Mark", "Candy"),
                score = c(67, 56, 87, 91))
df

| names | score |
|---|---|
| Lucy | 67 |
| John | 56 |
| Mark | 87 |
| Candy | 91 |


In [2]:
print(df)

  names score
1  Lucy    67
2  John    56
3  Mark    87
4 Candy    91


## 3.1 Access the values of an object - the index system of R
###### Key operators are "[ ]", "[[ ]]" and "$"
###### Recall object classes:
- Vector
- Matrix
- Array
    - Recall that these three are essentially the same thing.
- Data frame
- List
    - (Factor)
        - I wonder why [---] does not teach how to index a factor.

###### Again, if you decide not to read [---], good! If you have to read it, proofread!

### 3.1.1 Index a vector

In [3]:
vector <- 2:6
vector

In [4]:
# Pick the 2nd
vector[2]

In [5]:
# Pick 2nd - 4th
vector[2:4]

In [6]:
# Pick no. 1, 3, 5
vector[c(1, 3, 5)]

In [7]:
# Code like a pro
# This good practice makes it clearer for revisits and/or edits - Reproducibility!

# Pick no. 1, 3, 5
index <- c(1, 3, 5)

vector[index]

###### Use the names

In [8]:
# Recall that we could give names to vector entries
names(vector) <- letters[2:6]; vector

In [9]:
vector["b"]
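Positions and names are not the only ways to index a vector. The sketch below shows negative and logical indices as well; the object name `v` is a stand-in introduced here (same values and names as the vector above), not something from the earlier cells.

```r
# Hypothetical vector `v`: same values and names as the vector above
v <- 2:6
names(v) <- letters[2:6]

# Negative index: drop the 2nd entry, keep the rest
v_drop <- v[-2]

# Logical index: keep only the entries greater than 3
v_big <- v[v > 3]

# Several names at once
v_named <- v[c("b", "f")]

print(v_drop)
print(v_big)
print(v_named)
```

The logical form is the same idea used later to filter rows of a data frame.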

###### Use "," to separate dimensions.
- 1st dimension: row
- 2nd dimension: column
- 3rd ...

### 3.1.2 Index a matrix

In [10]:
matrix <- matrix(c(3:14), nrow = 4, byrow = TRUE)
print(matrix)

     [,1] [,2] [,3]
[1,]    3    4    5
[2,]    6    7    8
[3,]    9   10   11
[4,]   12   13   14


In [11]:
matrix[2, 3]

In [12]:
matrix[2, ]

In [13]:
matrix[ , c(1, 3)]

| | |
|---|---|
| 3 | 5 |
| 6 | 8 |
| 9 | 11 |
| 12 | 14 |


###### Use the names

In [14]:
# Recall that we could give names to columns and rows
rownames <- c("row1", "row2", "row3", "row4")
colnames <- c("col1", "col2", "col3")
rownames(matrix) <- rownames
colnames(matrix) <- colnames
print(matrix)

     col1 col2 col3
row1    3    4    5
row2    6    7    8
row3    9   10   11
row4   12   13   14


In [15]:
matrix["row1", ]
# The output is a named vector as a result of dimension reduction

In [16]:
matrix["row2", "col3"]
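The dimension reduction noted above can be switched off with `drop = FALSE`, which is handy when later code expects a matrix. A minimal sketch, using a stand-in object `m` built to match the named matrix above (the threshold 5 is chosen only for illustration):

```r
# Hypothetical matrix `m`: same values and names as the matrix above
m <- matrix(3:14, nrow = 4, byrow = TRUE,
            dimnames = list(paste0("row", 1:4), paste0("col", 1:3)))

row_vec <- m["row1", ]                # dimensions dropped: a named vector
row_mat <- m["row1", , drop = FALSE]  # dimensions kept: a 1 x 3 matrix

print(is.matrix(row_vec))
print(dim(row_mat))

# Logical indexing works on rows too: keep rows whose col1 exceeds 5
big_rows <- m[m[, "col1"] > 5, ]
print(big_rows)
```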

### 3.1.3 Index an array

In [17]:
array <- array(3:14, dim = c(2, 3, 2))
print(array)

, , 1

     [,1] [,2] [,3]
[1,]    3    5    7
[2,]    4    6    8

, , 2

     [,1] [,2] [,3]
[1,]    9   11   13
[2,]   10   12   14



In [18]:
array[ , , 1]

| | | |
|---|---|---|
| 3 | 5 | 7 |
| 4 | 6 | 8 |


In [19]:
array[2, 3, 2]

In [20]:
array[1, , 2]
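Since an array is a vector with a `dim` attribute, every `[i, j, k]` position also has a single linear position. The sketch below checks this for the 2 x 3 x 2 array above; `a`, `i`, `j`, `k` and `pos` are names introduced here for illustration.

```r
# Hypothetical array `a`: same values and shape as the array above
a <- array(3:14, dim = c(2, 3, 2))

# R fills arrays with the first dimension varying fastest, so for
# dim = c(2, 3, 2) the linear position of [i, j, k] is
# i + (j - 1) * 2 + (k - 1) * 2 * 3
i <- 2; j <- 3; k <- 2
pos <- i + (j - 1) * 2 + (k - 1) * 2 * 3

print(a[i, j, k])
print(a[pos])

# Leaving a dimension empty keeps all of it: row 1 of layer 2
slice <- a[1, , 2]
print(slice)
```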

### 3.1.4 Index a data frame

In [21]:
print(df)

  names score
1  Lucy    67
2  John    56
3  Mark    87
4 Candy    91


In [22]:
df[2, ]

| | names | score |
|---|---|---|
| 2 | John | 56 |


In [23]:
df[ , 1]

###### Use the names

In [24]:
# There are (column) names that are ready to use in data frames.
names(df)

In [25]:
df$names
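There are several ways to pull a column out by name, and they do not all return the same kind of object. A short sketch, rebuilding the data frame from the first cell so that it runs on its own (the `col_*` names are introduced here):

```r
df <- data.frame(names = c("Lucy", "John", "Mark", "Candy"),
                 score = c(67, 56, 87, 91))

col_dollar <- df$score       # a vector
col_double <- df[["score"]]  # the same vector
col_comma  <- df[, "score"]  # also a vector (a single column is dropped)
col_single <- df["score"]    # a one-column data frame

print(identical(col_dollar, col_double))
print(class(col_single))
```

Use `df["score"]` when the result must stay a data frame, and `$` or `[[ ]]` when you want the values themselves.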

###### Very useful stuff that is not in [---].

In [26]:
# What is John's score?
df[df$names == "John", ]

| | names | score |
|---|---|---|
| 2 | John | 56 |


In [27]:
# Anyone scored 100?
print(df[df$score == 100, ])

[1] names score
<0 rows> (or 0-length row.names)


In [28]:
# Highest score?
max(df$score)     # max() for maximum

In [29]:
# Who had the highest score?
df[df$score == max(df$score), ]

| | names | score |
|---|---|---|
| 4 | Candy | 91 |


In [30]:
# Just give me the name!
df[df$score == max(df$score), ]$names
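The same questions can be answered with `which.max( )` and `subset( )`, two base R functions. This is only an alternative sketch; the pass mark of 60 is an assumption made up for the example.

```r
df <- data.frame(names = c("Lucy", "John", "Mark", "Candy"),
                 score = c(67, 56, 87, 91),
                 stringsAsFactors = FALSE)

# which.max() returns the position of the first maximum only,
# whereas df$score == max(df$score) keeps every tied row
top_name <- df$names[which.max(df$score)]
print(top_name)

# subset() filters rows without repeating `df$` (60 is a hypothetical pass mark)
passed <- subset(df, score >= 60)
print(passed)
```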

### 3.1.5 Index a list

In [31]:
list <- list("Red", factor(c("a","b")), c(21,32,11), TRUE)
print(list)

[[1]]
[1] "Red"

[[2]]
[1] a b
Levels: a b

[[3]]
[1] 21 32 11

[[4]]
[1] TRUE



In [32]:
list[[1]]

In [33]:
list[[3]][2]
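Note the difference between `[ ]` and `[[ ]]` on a list: the former returns a shorter list, the latter returns the element itself. The sketch below also names the elements so that `$` works, and indexes the factor stored inside the list. The object `lst` and its element names are introduced here for illustration.

```r
# Hypothetical list `lst`: same contents as the list above, plus names
lst <- list(colour = "Red",
            grp    = factor(c("a", "b")),
            nums   = c(21, 32, 11),
            flag   = TRUE)

sub_list <- lst[3]     # still a list, of length 1
element  <- lst[[3]]   # the numeric vector itself
by_name  <- lst$nums   # same element, accessed by name

print(class(sub_list))
print(class(element))

# Indexing a factor works like indexing a vector; the levels are kept
second_level <- lst[["grp"]][2]
print(second_level)
```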

## 3.2 Enter or import data into R
Here we talk about importing data frames.
### 3.2.1 Direct data entry
Recall the data.frame( ) function. See the first code chunk of this lecture.

### 3.2.2 Use datasets that come with R or R packages
Many R packages ship with datasets that help illustrate how their functions work. This includes packages that are installed with R, some of which are loaded every time you start R.

In [34]:
head(mtcars)

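| | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Mazda RX4 | 21.0 | 6 | 160 | 110 | 3.9 | 2.62 | 16.46 | 0 | 1 | 4 | 4 |
| Mazda RX4 Wag | 21.0 | 6 | 160 | 110 | 3.9 | 2.875 | 17.02 | 0 | 1 | 4 | 4 |
| Datsun 710 | 22.8 | 4 | 108 | 93 | 3.85 | 2.32 | 18.61 | 1 | 1 | 4 | 1 |
| Hornet 4 Drive | 21.4 | 6 | 258 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 |
| Hornet Sportabout | 18.7 | 8 | 360 | 175 | 3.15 | 3.44 | 17.02 | 0 | 0 | 3 | 2 |
| Valiant | 18.1 | 6 | 225 | 105 | 2.76 | 3.46 | 20.22 | 1 | 0 | 3 | 1 |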


In [35]:
data() # Lists the datasets in all currently loaded packages.

Some require loading the package that they are in.

In [36]:
# head(cancer)       # Running this line before loading the 'survival' package results in an error.
library(survival)
head(cancer)

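| inst | time | status | age | sex | ph.ecog | ph.karno | pat.karno | meal.cal | wt.loss |
|---|---|---|---|---|---|---|---|---|---|
| 3 | 306 | 2 | 74 | 1 | 1 | 90 | 100 | 1175 | NA |
| 3 | 455 | 2 | 68 | 1 | 0 | 90 | 90 | 1225 | 15 |
| 3 | 1010 | 1 | 56 | 1 | 0 | 90 | 90 | NA | 15 |
| 5 | 210 | 2 | 57 | 1 | 1 | 90 | 60 | 1150 | 11 |
| 1 | 883 | 2 | 60 | 1 | 0 | 100 | 90 | NA | 0 |
| 12 | 1022 | 1 | 74 | 1 | 1 | 50 | 80 | 513 | 0 |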
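You can also list the datasets of one specific package, or reach a dataset without attaching its package by using `::`. A hedged sketch: it assumes the 'survival' package is installed (it usually comes with R, but the code checks first), and `ds` and `lung_head` are names introduced here.

```r
# List the datasets of a single package; 'datasets' is installed with R
ds <- data(package = "datasets")$results
print(head(ds[, c("Item", "Title")]))

# Reach a dataset without library(), but only if the package is installed
if (requireNamespace("survival", quietly = TRUE)) {
  lung_head <- head(survival::cancer, 3)
  print(lung_head)
}
```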


### 3.2.3 Read data files
We need to import data into R before we can analyze it. Base R provides several functions for reading delimited text files: read.table( ), read.csv( ), read.csv2( ), read.delim( ) and read.delim2( ). Additional packages read files from Excel, SPSS, SAS and Stata, and connect to various relational databases.

In [37]:
?read.table

###### Example command
`data <- read.table(file, header = TRUE, sep = ",", quote = "\"", dec = ".", fill = TRUE, comment.char = "")`
- file: Path to a local file (absolute, or relative to the working directory) or a URL
- header: Whether to use the first row as the column names
- sep: The character that separates the entries in a row
- ...
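To try these arguments without downloading anything, the sketch below writes a tiny comma-separated file to a temporary location and reads it back. The file contents and the names `tmp`, `d1` and `d2` are made up for the example.

```r
# Write a small, made-up CSV file to a temporary location
tmp <- tempfile(fileext = ".csv")
writeLines(c("names,score", "Lucy,67", "John,56"), tmp)

# Spell out the arguments with read.table() ...
d1 <- read.table(tmp, header = TRUE, sep = ",")

# ... or use read.csv(), which presets header = TRUE and sep = ","
d2 <- read.csv(tmp)

str(d1)
print(d2)

unlink(tmp)  # remove the temporary file
```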

In [41]:
x <- read.table(file = "https://rllabmcgill.github.io/COMP-652/assignments/hw1-q1x.csv", header = FALSE)
dim(x)

In [42]:
head(x)

| V1 | V2 |
|---|---|
| -1.0326997 | 4.998879 |
| 6.1814855 | 4.385308 |
| 6.1754709 | -1.663962 |
| 5.2243959 | -1.311939 |
| -0.5013456 | 1.195997 |
| 4.5960525 | 3.308643 |


In [44]:
xx <- read.table(file = "https://rllabmcgill.github.io/COMP-652/assignments/hw1-q1x.csv", header = TRUE)
dim(xx)

In [45]:
head(xx) # header = TRUE makes the first row column names.

| X.1.0326997e.00 | X4.9988785e.00 |
|---|---|
| 6.1814855 | 4.385308 |
| 6.1754709 | -1.663962 |
| 5.2243959 | -1.311939 |
| -0.5013456 | 1.195997 |
| 4.5960525 | 3.308643 |
| 3.1859494 | 4.544457 |
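Why the strange column names? With header = TRUE the first data row is consumed as names, and R converts them into syntactically valid names with `make.names( )`. The sketch below reproduces this on a small temporary file; the whitespace-separated layout of the file is an assumption, since the lecture does not show the raw file.

```r
# A made-up two-line file; the whitespace-separated layout is an assumption
tmp <- tempfile()
writeLines(c("-1.0326997e+00 4.9988785e+00",
             "6.1814855e+00 4.3853080e+00"), tmp)

no_header   <- read.table(tmp, header = FALSE)
with_header <- read.table(tmp, header = TRUE)   # first row is lost as data

print(dim(no_header))
print(dim(with_header))
print(names(with_header))

# The same conversion, applied directly
print(make.names("-1.0326997e+00"))

unlink(tmp)  # remove the temporary file
```

So when a file has no header line, keep header = FALSE; otherwise you silently lose one observation.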
