# Basic R structures

This notebook covers the most common data and code structures in R.

## Data frames

The single most common object when dealing with data in R is the `data.frame`.

A `data.frame` holds data in tabular form.[1](https://stat.ethz.ch/R-manual/R-devel/library/base/html/data.frame.html)
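
To make the tabular structure concrete, here is a minimal sketch that builds a small data frame by hand. The name `toy` and all values in it are made up for illustration and are not part of any R data set. Each column is a vector of the same length, and columns may hold different types.

```r
# Hypothetical toy data, invented for this example
toy <- data.frame(
  mpg   = c(21.0, 22.8, 18.7),        # numeric column
  cyl   = c(6, 4, 8),                 # numeric column
  sporty = c(FALSE, FALSE, TRUE),     # logical column
  row.names = c("car_a", "car_b", "car_c")
)

is.data.frame(toy)   # check the class
str(toy)             # compact overview of columns and their types
```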

Let us load a built-in R sample data set that contains car information.

In [65]:
df <- mtcars
is.data.frame(df)

# Print only the first six rows
head(df)

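|  | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Mazda RX4 | 21.0 | 6 | 160 | 110 | 3.9 | 2.62 | 16.46 | 0 | 1 | 4 | 4 |
| Mazda RX4 Wag | 21.0 | 6 | 160 | 110 | 3.9 | 2.875 | 17.02 | 0 | 1 | 4 | 4 |
| Datsun 710 | 22.8 | 4 | 108 | 93 | 3.85 | 2.32 | 18.61 | 1 | 1 | 4 | 1 |
| Hornet 4 Drive | 21.4 | 6 | 258 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 |
| Hornet Sportabout | 18.7 | 8 | 360 | 175 | 3.15 | 3.44 | 17.02 | 0 | 0 | 3 | 2 |
| Valiant | 18.1 | 6 | 225 | 105 | 2.76 | 3.46 | 20.22 | 1 | 0 | 3 | 1 |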


The names of the columns and rows can be accessed through `colnames` and `rownames`. 

The number of columns and rows can be obtained with `ncol` and `nrow`.

In [26]:
colnames(df)
rownames(df)
ncol(df)
nrow(df)
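
As a small complement to the calls above, the following sketch shows `dim`, which returns both counts at once (rows first, then columns), and `str`, which summarises the columns. The variable name `cars_df` is chosen here only so the example does not touch `df`.

```r
cars_df <- mtcars                 # separate copy for this example

size <- dim(cars_df)              # c(number of rows, number of columns)
size

# dim agrees with nrow and ncol
identical(size, c(nrow(cars_df), ncol(cars_df)))

head(rownames(cars_df), 3)        # first three row names
str(cars_df)                      # column names, types and first values
```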

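Rows and columns of a `data.frame` are typically accessed through these names, using the form `df[rows, columns]`. Leaving one position empty selects everything along that dimension.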

In [58]:
# Get specific row, all columns
df["Dodge Challenger",]
# Get all rows, specific columns
df[,c("cyl","disp")]
# Create a logical vector that is TRUE for each row where cyl == 6 and FALSE otherwise
cyl6 <- df[,"cyl"] == 6
cyl6
# Get all rows based on an indexing vector
df[cyl6,]

|  | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Dodge Challenger | 15.5 | 8 | 318 | 150 | 2.76 | 3.52 | 16.87 | 0 | 0 | 3 | 2 |


|  | cyl | disp |
|---|---|---|
| Mazda RX4 | 6 | 160.0 |
| Mazda RX4 Wag | 6 | 160.0 |
| Datsun 710 | 4 | 108.0 |
| Hornet 4 Drive | 6 | 258.0 |
| Hornet Sportabout | 8 | 360.0 |
| Valiant | 6 | 225.0 |
| Duster 360 | 8 | 360.0 |
| Merc 240D | 4 | 146.7 |
| Merc 230 | 4 | 140.8 |
| Merc 280 | 6 | 167.6 |


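|  | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Mazda RX4 | 21.0 | 6 | 160.0 | 110 | 3.9 | 2.62 | 16.46 | 0 | 1 | 4 | 4 |
| Mazda RX4 Wag | 21.0 | 6 | 160.0 | 110 | 3.9 | 2.875 | 17.02 | 0 | 1 | 4 | 4 |
| Hornet 4 Drive | 21.4 | 6 | 258.0 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 |
| Valiant | 18.1 | 6 | 225.0 | 105 | 2.76 | 3.46 | 20.22 | 1 | 0 | 3 | 1 |
| Merc 280 | 19.2 | 6 | 167.6 | 123 | 3.92 | 3.44 | 18.3 | 1 | 0 | 4 | 4 |
| Merc 280C | 17.8 | 6 | 167.6 | 123 | 3.92 | 3.44 | 18.9 | 1 | 0 | 4 | 4 |
| Ferrari Dino | 19.7 | 6 | 145.0 | 175 | 3.62 | 2.77 | 15.5 | 0 | 1 | 5 | 6 |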
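
The row and column selections shown above can also be combined in a single indexing step. The sketch below is one possible way to do that; the names `cars_df`, `six` and `six_efficient` are introduced here for illustration only. It also uses the `$` operator, which extracts a single column as a vector.

```r
cars_df <- mtcars   # separate copy for this example

# Logical row index plus column names in one step:
# six-cylinder cars, keeping only the mpg and hp columns
six <- cars_df[cars_df[, "cyl"] == 6, c("mpg", "hp")]
six

# `$` extracts one column as a vector, same as cars_df[, "cyl"]
identical(cars_df$cyl, cars_df[, "cyl"])

# Conditions can be combined with & (and) or | (or):
# six-cylinder cars that also do more than 20 mpg
six_efficient <- cars_df[cars_df$cyl == 6 & cars_df$mpg > 20, ]
rownames(six_efficient)
```

Note that the comma inside the brackets still matters: the condition goes in the row position, and the column position is left empty to keep all columns.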


Let's create a new column, *lkm*, that converts *mpg* to litres per 100 kilometres:

In [66]:
# 235.214583 converts miles per US gallon to litres per 100 km
df["lkm"] <- 235.214583 / df["mpg"]
head(df)

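|  | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb | lkm |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Mazda RX4 | 21.0 | 6 | 160 | 110 | 3.9 | 2.62 | 16.46 | 0 | 1 | 4 | 4 | 11.20069 |
| Mazda RX4 Wag | 21.0 | 6 | 160 | 110 | 3.9 | 2.875 | 17.02 | 0 | 1 | 4 | 4 | 11.20069 |
| Datsun 710 | 22.8 | 4 | 108 | 93 | 3.85 | 2.32 | 18.61 | 1 | 1 | 4 | 1 | 10.31643 |
| Hornet 4 Drive | 21.4 | 6 | 258 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 | 10.99134 |
| Hornet Sportabout | 18.7 | 8 | 360 | 175 | 3.15 | 3.44 | 17.02 | 0 | 0 | 3 | 2 | 12.57832 |
| Valiant | 18.1 | 6 | 225 | 105 | 2.76 | 3.46 | 20.22 | 1 | 0 | 3 | 1 | 12.99528 |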
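
The conversion constant used for *lkm* can be derived from unit definitions. The sketch below assumes that `mpg` in `mtcars` is measured in miles per US gallon, as the data set's help page states. The variable names (`litres_per_us_gallon`, `km_per_mile`, `conv`, `cars_df`) are introduced here for illustration.

```r
litres_per_us_gallon <- 3.785411784   # one US gallon in litres
km_per_mile          <- 1.609344      # one mile in kilometres

# x miles per gallon means 1 gallon per x miles, so
# litres per 100 km = 100 * litres_per_us_gallon / (x * km_per_mile)
conv <- 100 * litres_per_us_gallon / km_per_mile
conv

# Apply it to a separate copy, using `$` to add the column
cars_df <- mtcars
cars_df$lkm <- conv / cars_df$mpg
head(cars_df[, c("mpg", "lkm")])
```

Because the relationship is a reciprocal, a higher *mpg* always gives a lower *lkm*.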
