# Data reshaping in `R` with `tidyr`

A "tidy" dataset is one in which:

1. each variable is represented by a column
1. each observation is represented as its own row
1. each value has its own cell

The `R` package `tidyr` implements a handful of simple verbs that help organize data into a "tidy" format.

In [2]:
library(tidyr)

# An example un-tidy data.frame
grades <- data.frame(
    ID=c('jamie', 'cersei', 'hodor'),
    matrix(runif(15, 1, 10), nrow=3))
colnames(grades) <- c('ID', paste0('HW', 1:5))
grades$info <- c('male/lannister',
                 'female/lannister',
                 'male/stark')
grades

ID,HW1,HW2,HW3,HW4,HW5,info
jamie,7.922,4.843,4.012,2.345,6.429,male/lannister
cersei,6.545,1.428,4.475,8.489,6.267,female/lannister
hodor,1.08,1.886,7.811,7.035,3.228,male/stark


This data frame might seem harmless at first glance, but notice that the homework number (e.g., HW1, HW2, ...), which is a variable in the dataset, appears in the column names instead of having its own column. Likewise, the scores for the ID/HW pairs are spread across five columns when they should all sit in a single column of their own.

We can use the `gather()` verb from `tidyr` to collect multiple columns into key-value pairs with the syntax

`gather(data, key, value, ...)`

where `key` and `value` name the two new columns, and the `...` should be replaced by a specification of the columns to gather.
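
Column specifications can take several forms. As a minimal sketch, assuming `tidyr` is installed, the following uses a small hypothetical data frame `toy` (made-up values, not the `grades` data) to show three specifications that select the same columns:

```r
library(tidyr)

# Hypothetical toy data with fixed values (not the `grades` data frame)
toy <- data.frame(ID = c('a', 'b'), HW1 = c(1, 2), HW2 = c(3, 4))

# Three column specifications that all select HW1 and HW2
by_range     <- gather(toy, HW, score, HW1:HW2)   # a range of columns
by_name      <- gather(toy, HW, score, HW1, HW2)  # bare column names
by_exclusion <- gather(toy, HW, score, -ID)       # everything except ID
by_range
```

Excluding columns with `-` is often convenient when the columns to keep are few and the columns to gather are many.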

In [4]:
grades.tidy <- gather(grades, HW, score, HW1:HW5)
grades.tidy

ID,info,HW,score
jamie,male/lannister,HW1,7.922
cersei,female/lannister,HW1,6.545
hodor,male/stark,HW1,1.08
jamie,male/lannister,HW2,4.843
cersei,female/lannister,HW2,1.428
hodor,male/stark,HW2,1.886
jamie,male/lannister,HW3,4.012
cersei,female/lannister,HW3,4.475
hodor,male/stark,HW3,7.811
jamie,male/lannister,HW4,2.345
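
In `tidyr` 1.0.0 and later, `gather()` is superseded by `pivot_longer()`. As a minimal sketch, assuming a recent `tidyr` is installed, the same kind of reshaping could be written as below. The data frame `grades_demo` is a hypothetical stand-in for `grades`, with fixed made-up scores and only two homework columns so the result is reproducible.

```r
library(tidyr)

# Hypothetical stand-in for `grades`, with fixed scores instead of runif()
grades_demo <- data.frame(
    ID   = c('jamie', 'cersei', 'hodor'),
    HW1  = c(7, 6, 1),
    HW2  = c(4, 1, 2),
    info = c('male/lannister', 'female/lannister', 'male/stark'))

# Collect the HW columns into a name column (HW) and a value column (score)
demo_long <- pivot_longer(grades_demo, cols = HW1:HW2,
                          names_to = 'HW', values_to = 'score')
demo_long
```

Unlike `gather()`, `pivot_longer()` returns a tibble and keeps the rows that came from the same original row together, so the output is ordered by ID rather than by homework.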


Still, each cell in the `info` column of `grades.tidy` holds a pair of values (sex and house). A truly "tidy" data frame has exactly one value per cell. In this case, we can use the `separate()` verb to split the `info` column into two columns.

In [6]:
grades.split <- separate(grades.tidy, info, into = c('sex', 'house'), sep='/')
grades.split

ID,sex,house,HW,score
jamie,male,lannister,HW1,7.922
cersei,female,lannister,HW1,6.545
hodor,male,stark,HW1,1.08
jamie,male,lannister,HW2,4.843
cersei,female,lannister,HW2,1.428
hodor,male,stark,HW2,1.886
jamie,male,lannister,HW3,4.012
cersei,female,lannister,HW3,4.475
hodor,male,stark,HW3,7.811
jamie,male,lannister,HW4,2.345
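
One payoff of this layout is that grouped summaries need no further reshaping. As a minimal sketch, the following uses base R's `aggregate()` on a small hypothetical tidy data frame `scores_demo` (made-up values in the same shape as `grades.split`) to compute the mean score per house:

```r
# Hypothetical tidy data in the same shape as `grades.split`
scores_demo <- data.frame(
    ID    = c('jamie', 'cersei', 'hodor', 'jamie', 'cersei', 'hodor'),
    sex   = c('male', 'female', 'male', 'male', 'female', 'male'),
    house = c('lannister', 'lannister', 'stark',
              'lannister', 'lannister', 'stark'),
    HW    = c('HW1', 'HW1', 'HW1', 'HW2', 'HW2', 'HW2'),
    score = c(7, 6, 1, 4, 1, 2))

# Mean score per house: one row per value of `house`
house_means <- aggregate(score ~ house, data = scores_demo, FUN = mean)
house_means
```

Because `house` and `score` each have their own column, the grouping variable can be swapped (for example to `sex` or `HW`) by changing only the formula.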
