#### LISTS and DATAFRAMES in R

A list is an ordered collection of R objects that may have different types: vectors, numbers, character strings, and even other lists.
This makes a list a convenient container for related pieces of information, well structured and easy to read.

In [1]:
movie <- list("Toy Story", 1995, c("Animation", "Adventure", "Comedy"))

In [2]:
class(movie)

In [3]:
typeof(movie[0])  # "list": R indexes from 1, so index 0 selects an empty list

In [4]:
movie

In [5]:
movie[0]  # empty list; the first element is movie[1]

In [6]:
movie[1]

In [7]:
movie[2:3]

In [8]:
a<-list("a",10,5.8,c("XYZ",60))

In [9]:
a

In [10]:
a[1]

In [12]:
a[-1]

In [13]:
a

In [14]:
a[1:2]
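
The cells above all use single brackets. As a minimal sketch (the variable names `first_as_list` and `first_as_value` are illustrative, not part of the notebook), the following contrasts single brackets, which return a sub-list, with double brackets, which return the element itself, and shows how index 0 and negative indices behave:

```r
# A small list with mixed element types (same values as the notebook's `movie`).
movie <- list("Toy Story", 1995, c("Animation", "Adventure", "Comedy"))

first_as_list  <- movie[1]    # single brackets: a list of length 1
first_as_value <- movie[[1]]  # double brackets: the element itself

class(first_as_list)   # "list"
class(first_as_value)  # "character"

# Indexing starts at 1, so index 0 selects nothing.
length(movie[0])       # 0

# A negative index drops that position instead of selecting it.
length(movie[-1])      # 2
```

Use double brackets when you need the stored value, for example to pass it to a function that expects a character vector.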

### Named lists

In [15]:
movie<-list( name = "Toy Story",
             year = 1995,
             genre = c("Animation", "Adventure", "Comedy")
           )

In [16]:
movie

In [17]:
movie["genre"]

In [19]:
movie$genre

In [20]:
class(movie$name)
class(movie$foreign)

In [21]:
class(movie["name"])
class(movie["foreign"])

In [22]:
movie[["age"]] <- 5

In [23]:
movie

In [24]:
movie[["age"]] <- 6

In [25]:
movie

In [26]:
movie[["age"]] <- NULL
movie
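
The following sketch summarizes the named-list operations used above, on a shortened copy of `movie`. The helper variable `names_with_age` is only there for illustration:

```r
movie <- list(name = "Toy Story", year = 1995)

class(movie$name)        # "character": the element itself
class(movie["name"])     # "list": a one-element sub-list
is.null(movie$foreign)   # TRUE: a missing name yields NULL, not an error

movie[["age"]] <- 5      # add a new element
movie[["age"]] <- 6      # overwrite it
names_with_age <- names(movie)
names_with_age           # "name" "year" "age"

movie[["age"]] <- NULL   # assigning NULL removes the element
names(movie)             # "name" "year"
```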

#### Concatenating Lists

In [27]:
movie_part1 <- list(name = "Toy Story")
movie_part2 <- list(year = 1995, genre = c("Animation", "Adventure", "Comedy"))

In [28]:
movie_part1

In [29]:
movie_part2

In [30]:
movie_concatenated <- c(movie_part1, movie_part2)

In [31]:
movie_concatenated 
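
Note that `c()` and `list()` combine lists differently. A small sketch, with illustrative variable names `flat` and `nested`:

```r
part1 <- list(name = "Toy Story")
part2 <- list(year = 1995, genre = c("Animation", "Adventure", "Comedy"))

flat   <- c(part1, part2)     # one list holding all three elements
nested <- list(part1, part2)  # a list whose two elements are themselves lists

length(flat)    # 3
names(flat)     # "name" "year" "genre"
length(nested)  # 2
```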

### DataFrames

In [32]:
movies <- data.frame(name = c("Toy Story", "Akira", "The Breakfast Club", "The Artist",
                              "Modern Times", "Fight Club", "City of God", "The Untouchables"),
                    year = c(1995, 1998, 1985, 2011, 1936, 1999, 2002, 1987)
                    )

In [33]:
movies

name,year
Toy Story,1995
Akira,1998
The Breakfast Club,1985
The Artist,2011
Modern Times,1936
Fight Club,1999
City of God,2002
The Untouchables,1987


In [34]:
movies <- data.frame(name = c("Toy Story", "Akira", "The Breakfast Club", "The Artist",
                              "Modern Times", "Fight Club", "City of God", "The Untouchables"),
                    year = c(1995, 1998, 1985, 2011, 1936, 1999, 2002, 1987),
                    stringsAsFactors = FALSE)

In [35]:
movies

name,year
Toy Story,1995
Akira,1998
The Breakfast Club,1985
The Artist,2011
Modern Times,1936
Fight Club,1999
City of God,2002
The Untouchables,1987


In [36]:
head(movies)

name,year
Toy Story,1995
Akira,1998
The Breakfast Club,1985
The Artist,2011
Modern Times,1936
Fight Club,1999


In [37]:
tail(movies)

row,name,year
3,The Breakfast Club,1985
4,The Artist,2011
5,Modern Times,1936
6,Fight Club,1999
7,City of God,2002
8,The Untouchables,1987


In [38]:
summary(movies)

     name                year     
 Length:8           Min.   :1936  
 Class :character   1st Qu.:1986  
 Mode  :character   Median :1996  
                    Mean   :1989  
                    3rd Qu.:2000  
                    Max.   :2011  

In [39]:
dim(movies)

In [40]:
movies$name

In [41]:
movies["name"]

name
Toy Story
Akira
The Breakfast Club
The Artist
Modern Times
Fight Club
City of God
The Untouchables


In [42]:
movies[1]

name
Toy Story
Akira
The Breakfast Club
The Artist
Modern Times
Fight Club
City of God
The Untouchables


str() is one of the most useful functions in R. It prints a compact textual summary of an object's structure.

In [43]:
str(movies)

'data.frame':	8 obs. of  2 variables:
 $ name: chr  "Toy Story" "Akira" "The Breakfast Club" "The Artist" ...
 $ year: num  1995 1998 1985 2011 1936 ...


In [44]:
movies

name,year
Toy Story,1995
Akira,1998
The Breakfast Club,1985
The Artist,2011
Modern Times,1936
Fight Club,1999
City of God,2002
The Untouchables,1987


In [45]:
class(movies$year)

In [46]:
movies[1,2]
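
The access forms used so far return different kinds of objects. This sketch, on a two-row copy of `movies`, compares them:

```r
movies <- data.frame(name = c("Toy Story", "Akira"),
                     year = c(1995, 1998),
                     stringsAsFactors = FALSE)

class(movies$name)        # "character": the column as a vector
class(movies[["name"]])   # "character": same as $
class(movies["name"])     # "data.frame": a one-column data frame
class(movies[1])          # "data.frame": column selected by position

movies[1, 2]              # 1995: row 1, column 2
dim(movies)               # 2 2: rows, then columns
```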

### Add new Column into our data frame

In [49]:
movies['length'] <- c(81, 125, 97, 100, 87, 139, 130, 119)
movies

name,year,length
Toy Story,1995,81
Akira,1998,125
The Breakfast Club,1985,97
The Artist,2011,100
Modern Times,1936,87
Fight Club,1999,139
City of God,2002,130
The Untouchables,1987,119


### Combine Data Frames
In R, the rbind() and cbind() functions combine two data frames.

#### rbind() - combines two data frames vertically
#### cbind() - combines two data frames horizontally

In [50]:
New_DF <- cbind(movies, movie_type=c("Comedy","Action","Thriller","Romantic","Comedy","Action","Thriller","Action"))

In [51]:
New_DF

name,year,length,movie_type
Toy Story,1995,81,Comedy
Akira,1998,125,Action
The Breakfast Club,1985,97,Thriller
The Artist,2011,100,Romantic
Modern Times,1936,87,Comedy
Fight Club,1999,139,Action
City of God,2002,130,Thriller
The Untouchables,1987,119,Action


### Add new row into our data frame

In [52]:
# add a new movie to our data set

In [53]:
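# A one-row data frame keeps year and length numeric; c() would coerce every value to character
movies <- rbind(movies, data.frame(name = "Dr. Strangelove", year = 1964, length = 94,
                                   stringsAsFactors = FALSE))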
movies

name,year,length
Toy Story,1995,81
Akira,1998,125
The Breakfast Club,1985,97
The Artist,2011,100
Modern Times,1936,87
Fight Club,1999,139
City of God,2002,130
The Untouchables,1987,119
Dr. Strangelove,1964,94
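
One thing to watch when appending a row with `rbind()`: a row built with `c()` is an atomic vector, so mixing text and numbers turns every value into text, and the numeric columns of the result become character. The sketch below, on a shortened two-column copy of `movies` with illustrative names `via_vector` and `via_df`, compares that with appending a one-row data frame:

```r
movies <- data.frame(name = c("Toy Story", "Akira"),
                     year = c(1995, 1998),
                     stringsAsFactors = FALSE)

# c() coerces "Dr. Strangelove" and 1964 to a common type: character.
new_row_vec <- c(name = "Dr. Strangelove", year = 1964)
class(new_row_vec)       # "character"

via_vector <- rbind(movies, new_row_vec)
class(via_vector$year)   # "character": the year column is no longer numeric

# A one-row data frame keeps each column's type.
via_df <- rbind(movies,
                data.frame(name = "Dr. Strangelove", year = 1964,
                           stringsAsFactors = FALSE))
class(via_df$year)       # "numeric"
```

Checking the result with `str()` after an `rbind()` is a quick way to catch this kind of silent type change.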


In [54]:
# Negative row indices drop rows; row 12 does not exist, so nothing is removed here
movies <- movies[-12, ]
movies

name,year,length
Toy Story,1995,81
Akira,1998,125
The Breakfast Club,1985,97
The Artist,2011,100
Modern Times,1936,87
Fight Club,1999,139
City of God,2002,130
The Untouchables,1987,119
Dr. Strangelove,1964,94
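
The `movies[-12, ]` call leaves the data frame unchanged because there is no row 12. As a sketch on a three-row copy of the data (the name `without_last` is illustrative), dropping a row that does exist looks like this:

```r
movies <- data.frame(name = c("Toy Story", "Akira", "Fight Club"),
                     year = c(1995, 1998, 1999),
                     stringsAsFactors = FALSE)

nrow(movies[-12, ])          # 3: index 12 is out of range, nothing is dropped

without_last <- movies[-3, ] # drop the third row
nrow(without_last)           # 2
without_last$name            # "Toy Story" "Akira"
```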


### To delete a column

In [55]:
movies[["length"]] <- NULL
movies

name,year
Toy Story,1995
Akira,1998
The Breakfast Club,1985
The Artist,2011
Modern Times,1936
Fight Club,1999
City of God,2002
The Untouchables,1987
Dr. Strangelove,1964


### Summarize the Data

In [56]:
movie

In [57]:
summary(movie)

      Length Class  Mode     
name  1      -none- character
year  1      -none- numeric  
genre 3      -none- character

In [58]:
length(movies)
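
`length()` on a data frame counts columns, not rows, because a data frame is stored as a list of columns. A short sketch of the size functions, on a three-row copy of `movies`:

```r
movies <- data.frame(name = c("Toy Story", "Akira", "Fight Club"),
                     year = c(1995, 1998, 1999),
                     stringsAsFactors = FALSE)

length(movies)  # 2: number of columns
ncol(movies)    # 2
nrow(movies)    # 3: number of rows
dim(movies)     # 3 2: rows, then columns
```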

### Combining Data Frames - Adding Rows rbind()

In [59]:
Data_Frame1 <- data.frame (
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45)
)

In [60]:
Data_Frame1

Training,Pulse,Duration
Strength,100,60
Stamina,150,30
Other,120,45


In [61]:
Data_Frame2 <- data.frame (
  Training = c("Stamina", "Stamina", "Strength"),
  Pulse = c(140, 150, 160),
  Duration = c(30, 30, 20)
)

In [62]:
Data_Frame2

Training,Pulse,Duration
Stamina,140,30
Stamina,150,30
Strength,160,20


In [63]:
New_Data_Frame <- rbind(Data_Frame1, Data_Frame2)
New_Data_Frame

Training,Pulse,Duration
Strength,100,60
Stamina,150,30
Other,120,45
Stamina,140,30
Stamina,150,30
Strength,160,20


### Combining Data Frames - Adding Columns cbind()

In [64]:
Data_Frame3 <- data.frame (
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45)
)

In [65]:
Data_Frame3

Training,Pulse,Duration
Strength,100,60
Stamina,150,30
Other,120,45


In [66]:
Data_Frame4 <- data.frame (
  Steps = c(3000, 6000, 2000),
  Calories = c(300, 400, 300)
)

In [67]:
Data_Frame4

Steps,Calories
3000,300
6000,400
2000,300


In [68]:
New_Data_Frame1 <- cbind(Data_Frame3, Data_Frame4)
New_Data_Frame1

Training,Pulse,Duration,Steps,Calories
Strength,100,60,3000,300
Stamina,150,30,6000,400
Other,120,45,2000,300


In [69]:
summary(New_Data_Frame1)

     Training     Pulse          Duration        Steps         Calories    
 Other   :1   Min.   :100.0   Min.   :30.0   Min.   :2000   Min.   :300.0  
 Stamina :1   1st Qu.:110.0   1st Qu.:37.5   1st Qu.:2500   1st Qu.:300.0  
 Strength:1   Median :120.0   Median :45.0   Median :3000   Median :300.0  
              Mean   :123.3   Mean   :45.0   Mean   :3667   Mean   :333.3  
              3rd Qu.:135.0   3rd Qu.:52.5   3rd Qu.:4500   3rd Qu.:350.0  
              Max.   :150.0   Max.   :60.0   Max.   :6000   Max.   :400.0  

In [70]:
New_Data_Frame1[1]

Training
Strength
Stamina
Other


In [71]:
Data_Frame <- data.frame (
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45)
)

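# Print the original data frame
Data_Frame

# Add a new row; a one-row data frame keeps Pulse and Duration numeric
New_row_DF <- rbind(Data_Frame,
                    data.frame(Training = "Strength", Pulse = 110, Duration = 110))

# Print the data frame with the new row
New_row_DF
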

Training,Pulse,Duration
Strength,100,60
Stamina,150,30
Other,120,45


Training,Pulse,Duration
Strength,100,60
Stamina,150,30
Other,120,45
Strength,110,110


In [72]:
Data_Frame <- data.frame (
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45)
)

Data_Frame 

# Add a new column
New_col_DF <- cbind(Data_Frame, Steps = c(1000, 6000, 2000))

# Print the data frame with the new column
New_col_DF

Training,Pulse,Duration
Strength,100,60
Stamina,150,30
Other,120,45


Training,Pulse,Duration,Steps
Strength,100,60,1000
Stamina,150,30,6000
Other,120,45,2000


### Remove Columns

In [73]:
Data_Frame <- data.frame (
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45)
)
Data_Frame 
# The next cell removes the first column


Training,Pulse,Duration
Strength,100,60
Stamina,150,30
Other,120,45


In [74]:
# A single negative index drops a column: here the first one (Training)
Data_Frame_New <- Data_Frame[-c(1)]

In [75]:
Data_Frame_New

Pulse,Duration
100,60
150,30
120,45


In [76]:
Data_Frame

Training,Pulse,Duration
Strength,100,60
Stamina,150,30
Other,120,45


In [77]:
library(dplyr)  # select() comes from the dplyr package, which must be installed and loaded
data <- select(Data_Frame, -Duration)


In [None]:
Data_Frame_New1 <- Data_Frame[,-c(1)]

In [None]:
Data_Frame_New1
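
For comparison, here is a sketch of how negative indices behave in the row position, the column position, or both, using base R only. The variable names are illustrative, and `subset()` is shown as one base-R way to drop a column by name without loading extra packages:

```r
df <- data.frame(Training = c("Strength", "Stamina", "Other"),
                 Pulse = c(100, 150, 120),
                 Duration = c(60, 30, 45))

no_first_col <- df[-1]       # one index: drops the first column
same_columns <- df[, -1]     # index after the comma: also drops the first column
no_first_row <- df[-1, ]     # index before the comma: drops the first row
neither      <- df[-1, -1]   # drops the first row and the first column

names(no_first_col)          # "Pulse" "Duration"
nrow(no_first_row)           # 2
dim(neither)                 # 2 2

# Selecting a single column with [ , ] returns a vector unless drop = FALSE.
pulse_vector <- df[, "Pulse"]
pulse_df     <- df[, "Pulse", drop = FALSE]

# Drop a column by name in base R.
no_duration <- subset(df, select = -Duration)
names(no_duration)           # "Training" "Pulse"
```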

## Saving a Data Frame

In [79]:
Data_Frame_New

Pulse,Duration
100,60
150,30
120,45


In [78]:
write.csv(Data_Frame_New,file="Completed.csv")

In [82]:
write.csv(Data_Frame_New,file="Completed1.csv",row.names=FALSE)
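
The two `write.csv()` calls differ only in `row.names`. As a sketch of the effect, the code below writes a small data frame both ways to temporary files (the paths come from `tempfile()` and are not the notebook's file names) and reads each file back:

```r
df <- data.frame(Pulse = c(100, 150, 120), Duration = c(60, 30, 45))

with_row_names    <- tempfile(fileext = ".csv")
without_row_names <- tempfile(fileext = ".csv")

write.csv(df, file = with_row_names)                        # row names become a first column
write.csv(df, file = without_row_names, row.names = FALSE)  # data columns only

read_with    <- read.csv(with_row_names)
read_without <- read.csv(without_row_names)

names(read_with)     # "X" "Pulse" "Duration": the unnamed row-name column is read back as X
names(read_without)  # "Pulse" "Duration"

unlink(c(with_row_names, without_row_names))  # remove the temporary files
```

Setting `row.names = FALSE` is usually the better choice when the row names are just 1, 2, 3 and carry no information.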