JSM 2016 talk: Thinking with Data using R and RStudio: powerful idioms for analysts

Nicholas Horton: Sunday, July 31st, 2016

Making the Most of R Tools (Section on Statistical Computing)

4:05pm in CC-W183b


Statisticians and data scientists need to be able to "think with data" in order to answer statistical questions that arise from the flood of data that are now available. In this talk, I will introduce a set of key idioms due to Hadley Wickham that provide a framework to teach data management skills and facilitate loading, merging, and transforming large datasets.

This talk will demonstrate these idioms as implemented in new R packages (namely readr, dplyr, haven, lubridate, mosaic, rvest, stringr, and tidyr) to ingest, manage, transform, analyze, and model data. These packages are easy to learn and well worth the effort. The talk provides a head start on learning them, then points to the next steps. No prior experience with R is expected.

Slides and R Markdown files


airlines package (to set up a local airlines database using the etl package)

etl (extract, transform, load) package


Developing precursors to data science website

Setting the stage: integration of data management skills in introductory and second courses in statistics (Horton, Baumer, and Wickham)

Tidy data paper

Mere renovation is not enough website

RStudio cheatsheets

Thinking with data TAS editorial

Undergraduate statistics curriculum guidelines

R Markdown in introductory statistics (TISE) paper

Happy Git and GitHub for the useR

Last updated July 31, 2016 by Nicholas Horton