---
title: "Take your data frames to the next level."
author: "Kiefer"
date: "March 30, 2017"
output: html_document
---

<img class="alignnone size-full wp-image-458 aligncenter" src="https://realdataweb.files.wordpress.com/2017/03/leo.jpg" alt="leo" width="600" height="400" />

While finishing up R-rockstar Hadley Wickham's book (<a href="https://realdataweb.wordpress.com/2017/01/06/free-book-r-for-data-science/" target="_blank">Free Book – R for Data Science</a>), I came across something pretty cool in <a href="http://r4ds.had.co.nz/model-building.html">the section on model building</a>: list columns.

Most of us have probably seen the following data frame column format:

In [1]:
df <- data.frame(col_uno = c(1, 2, 3), col_dos = c("a", "b", "c"), col_tres = factor(c("google", "apple", "amazon")))

And the output:

In [2]:
df

col_uno,col_dos,col_tres
1,a,google
2,b,apple
3,c,amazon


This is an awesome way to organize data and one of R's strong points. However, a column doesn't have to be a plain vector: it can also be a list, which lets us go deeper. Check this out:

In [3]:
library(tidyverse)
library(datasets)

Loading tidyverse: ggplot2
Loading tidyverse: tibble
Loading tidyverse: tidyr
Loading tidyverse: readr
Loading tidyverse: purrr
Loading tidyverse: dplyr
Conflicts with tidy packages ---------------------------------------------------
filter(): dplyr, stats
lag():    dplyr, stats


In [4]:
head(iris)

nested <- iris %>%
  group_by(Species) %>%
  nest()

Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species
5.1,3.5,1.4,0.2,setosa
4.9,3.0,1.4,0.2,setosa
4.7,3.2,1.3,0.2,setosa
4.6,3.1,1.5,0.2,setosa
5.0,3.6,1.4,0.2,setosa
5.4,3.9,1.7,0.4,setosa
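
The cell above builds `nested` but never prints it. As a minimal sketch (assuming `dplyr` and `tidyr` are installed, and using only the built-in `iris` data), this is one way to peek inside the result and confirm that each species now owns its own small data frame:

```r
library(dplyr)
library(tidyr)

nested <- iris %>%
  group_by(Species) %>%
  nest()

# One row per species; `data` is a list column holding a data frame per group
nested

# Each element of the list column is itself a data frame of measurements
dim(nested$data[[1]])
sapply(nested$data, nrow)
```

Double brackets (`[[1]]`) pull out a single nested data frame, while single brackets would return a list of length one.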


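Using `nest` we can compartmentalize our data frame: each species gets a single row, and its measurements are tucked into a list column named `data`. That keeps things readable and makes it easy to iterate over groups. Here we use `map` from the `purrr` package to compute the mean of each column in every nested data frame.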

In [8]:
means <- map(nested$data, colMeans)
means
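
The list returned by `map` is unnamed, so it is easy to lose track of which result belongs to which species. Here is a rough sketch of two ways around that, assuming the same `nested` object as above; the names `with_means` and `mean_sepal_length` are made up for illustration:

```r
library(dplyr)
library(tidyr)
library(purrr)

nested <- iris %>%
  group_by(Species) %>%
  nest()

# Option 1: name the plain list by species so each result is easy to look up
means <- set_names(map(nested$data, colMeans), as.character(nested$Species))
means$setosa

# Option 2: keep each result next to its group in a new list column;
# map_dbl() returns a plain number when one value per group is enough
with_means <- nested %>%
  mutate(
    means = map(data, colMeans),
    mean_sepal_length = map_dbl(data, ~ mean(.x$Sepal.Length))
  )
with_means
```

Storing results in the same table as the data they came from is the main appeal of list columns: the species, its data, and anything computed from it all stay on one row.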

Once you're done messing around with data-ception, use `unnest` to flatten the list column back into ordinary rows and columns (note that `Species` ends up as the first column).

In [7]:
head(unnest(nested, data))

Species,Sepal.Length,Sepal.Width,Petal.Length,Petal.Width
setosa,5.1,3.5,1.4,0.2
setosa,4.9,3.0,1.4,0.2
setosa,4.7,3.2,1.3,0.2
setosa,4.6,3.1,1.5,0.2
setosa,5.0,3.6,1.4,0.2
setosa,5.4,3.9,1.7,0.4
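
To check how close the round trip gets to the original `iris`, here is a small sketch under the same assumptions as before (`flat` is just an illustrative name). It names the list column explicitly, since recent versions of `tidyr` warn when the column is left out:

```r
library(dplyr)
library(tidyr)

nested <- iris %>%
  group_by(Species) %>%
  nest()

# Name the list column explicitly, then drop any leftover grouping
flat <- nested %>%
  unnest(data) %>%
  ungroup()

# Same number of rows and columns as iris
dim(flat)

# Compare against iris after converting back to a plain data frame
# and restoring the original column order
all.equal(as.data.frame(flat)[names(iris)], iris)
```

The result is a tibble with `Species` moved to the front, so it is not byte-for-byte the object we started with, but the measurements themselves come back in the same order.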


I was pretty excited to learn about this property of data frames and will definitely make use of it in the future. If you have any neat ideas for how this might be used, please feel free to share them in the comments.