# Tidyverse

[Hadley Wickham](http://hadley.nz/) created some amazing R packages for loading, editing, and visualizing data.
This collection of R packages is collectively called the [Tidyverse](https://www.tidyverse.org/).

* "Tidy" refers to keeping data tidy: each variable in its own column and each observation in its own row
* This notebook is adapted from [this tutorial](http://tclavelle.github.io/dplyr-tidyr-tutorial/)

# Loading libraries

In [None]:
library(dplyr)
library(tidyr)
library(ggplot2)

# Read data

In [None]:
# Read the tab-delimited mtcars table into a data frame, then display it
cars = read.delim('../../src/mtcars.txt', sep='\t')
cars

# Piping

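`%>%` is the "pipe" operator in R (from magrittr, made available by dplyr). It passes the value on its left as the first argument of the function on its right, so `x %>% f(y)` is equivalent to `f(x, y)`.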

In [None]:
head(mtcars, n=3)

In [None]:
mtcars %>% head(., n=3)
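
The benefit of the pipe shows up once several steps are chained. A minimal sketch, assuming the built-in `mtcars` data set and base R's `subset()` as the intermediate step (the variable names `nested` and `piped` are illustrative only):

```r
library(dplyr)  # makes %>% available

# Nested call: has to be read from the inside out
nested <- head(subset(mtcars, cyl == 4), n = 3)

# Piped call: reads top to bottom; each result becomes
# the first argument of the next function
piped <- mtcars %>%
    subset(cyl == 4) %>%
    head(n = 3)

# Both forms produce the same data frame
identical(nested, piped)
```

The explicit `.` placeholder used in `head(., n=3)` is optional when the piped value goes into the first argument.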

# Tidy data

Data is reshaped and cleaned with a small set of "verbs" (functions named for the action they perform), such as:

## dplyr

  * `filter()` subset data based on logical criteria
  * `select()` select certain columns
  * `mutate()` create a new variable/column
  * `group_by()` group data by common variables for performing calculations
  * `summarize()` collapse each group into a single row of summary values
  * `arrange()` order rows by value of a column
  * `rename()` rename columns
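
Several of these verbs can be chained in one pipeline. The following is a rough sketch using the built-in `mtcars` data set; the column names `horsepower` and `kpl` and the conversion factor (about 0.425 km/L per US mpg) are assumptions introduced for illustration:

```r
library(dplyr)

result <- mtcars %>%
    rename(horsepower = hp) %>%       # new_name = old_name
    mutate(kpl = mpg * 0.425) %>%     # add a column derived from an existing one
    arrange(desc(kpl)) %>%            # sort rows, most efficient car first
    select(mpg, kpl, horsepower) %>%  # keep only these columns
    head(n = 3)

result
```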

## tidyr

  * `gather()` and `spread()` convert data between wide and long format (superseded in tidyr 1.0.0 by `pivot_longer()` and `pivot_wider()`, but still available)
  * `separate()` and `unite()` separate a single column into multiple columns and vice versa
  * `complete()` turns implicit missing values into explicit missing values by completing missing data combinations
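
The tidyr verbs are easiest to follow on a tiny table. The sketch below uses hypothetical toy data frames (`wide` and `sales` are made up for this example and are not part of the notebook's data):

```r
library(tidyr)

# Hypothetical wide table: one row per car, one column per measurement
wide <- data.frame(car = c("A", "B"), mpg = c(21.0, 22.8), hp = c(110, 93))

# gather(): wide -> long, one row per (car, measurement) pair
long <- gather(wide, key = "measurement", value = "value", mpg, hp)

# spread(): long -> wide again (measurement columns come back in alphabetical order)
wide_again <- spread(long, key = measurement, value = value)

# unite(): paste two columns into one; separate(): split it back apart
united    <- unite(wide, "mpg_hp", mpg, hp, sep = "_")
separated <- separate(united, mpg_hp, into = c("mpg", "hp"), sep = "_", convert = TRUE)

# complete(): the (2021, "S") combination is absent, so it is added with n = NA
sales     <- data.frame(year = c(2020, 2020, 2021), region = c("N", "S", "N"), n = c(1, 2, 3))
completed <- complete(sales, year, region)
```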

## filter() & select()

In [None]:
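cars %>% 
    filter(mpg > 21, hp < 100) %>%   # keep rows where both conditions hold
    select(mpg, hp)                  # keep only the columns used in the filter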

## group_by() & summarize()

In [None]:
cars %>% 
    group_by(gear) %>%
    summarize(mean_mpg = mean(mpg)) 

# ggplot2

## Simple plotting

In [None]:
options(repr.plot.height=3, repr.plot.width=4)
ggplot(cars, aes(mpg, hp)) +
    geom_point() 

## Combining tidy data & ggplot

In [None]:
cars_summary = cars %>% 
    group_by(gear) %>%
    summarize(mean_mpg = mean(mpg)) 

options(repr.plot.height=3, repr.plot.width=4)
ggplot(cars_summary, aes(gear, mean_mpg)) +
    geom_col()  # same as geom_bar(stat='identity')
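
Long-format data also pairs well with ggplot2 facets, since one column can drive the panel layout. A possible sketch, assuming the built-in `mtcars` data set and the illustrative names `cars_long`, `variable`, and `value`:

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Stack hp and wt into one "value" column, keeping mpg alongside each row
cars_long <- mtcars %>%
    select(mpg, hp, wt) %>%
    gather(key = "variable", value = "value", hp, wt)

# One panel per original column; each panel gets its own x-axis range
p <- ggplot(cars_long, aes(value, mpg)) +
    geom_point() +
    facet_wrap(~ variable, scales = "free_x")
p
```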

# Resources

* [tidyr overview](https://blog.rstudio.com/2016/02/02/tidyr-0-4-0/)
* [data wrangling cheat sheet](https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf)
* [ggplot2 cheat sheet](https://www.rstudio.com/wp-content/uploads/2015/03/ggplot2-cheatsheet.pdf)