Nicholas Horton: Sunday, July 31st, 2016
Making the Most of R Tools (Section on Statistical Computing)
4:05pm in CC-W183b
Statisticians and data scientists need to be able to "think with data" in order to answer statistical questions that arise from the flood of data that are now available. In this talk, I will introduce a set of key idioms due to Hadley Wickham that provide a framework to teach data management skills and facilitate loading, merging, and transforming large datasets.
This talk will demonstrate these idioms implemented in new packages in R (namely readr, dplyr, haven, lubridate, mosaic, rvest, stringr, and tidyr) to ingest, manage, transform, analyze, and model data. You'll see that it is easy to learn to use these packages, and that it is very worthwhile to do so. The talk provides a head start on learning them, then points out the next steps. No prior experience with R is expected.
airlines package (to set up a local airlines database using the etl package)
etl (extract, transform, load) package
Developing precursors to data science website
Mere renovation is not enough website
Thinking with data TAS editorial
Undergraduate statistics curriculum guidelines
R Markdown in introductory statistics (TISE) paper
Happy Git and GitHub for the useR
Last updated July 31, 2016 by Nicholas Horton