Purdue Seminar - R for Data Science
Please post comments and questions here: Gitter
Subscribe here: Google Calendar (that can't be embedded)
02/22, 03/01, 03/08, (SPRING BREAK), 03/22 - CANCELLED, 03/29, Done for semester
Topics to cover:
- 02/15 - ggplot2; √
- 02/22 - github, R markdown, tibbles and strings; √
- Learn: https://try.github.io
- Guides: https://guides.github.com
- Daring Fireball: https://daringfireball.net/projects/markdown/syntax
- Regular Expressions
- 03/01 - tidy data√ and data transformation√. Maybe relational data
- tidy data
- Data structure
- Columns are variables
- Rows are observations
- Data frame cells are values
- separate - split one column into many columns
- gather - merge multiple columns into a key column and value column
- Data transformations
- mutate - add new columns or alter existing ones in the data set
- select - grab certain columns or remove certain columns
- group_by - set up the data set to be grouped by certain columns
- summarise - get summary metrics according to the grouped columns
- 03/08 - functions√ and pipes√
- 03/22 - CANCELLED!
- 03/29 - web scraping - updated presentation
- User-provided dataset exploration - would like to do this twice. Please suggest datasets you'd like to explore!
- nested data.frames ?
- Suggestions? Have already added some string manipulation and web scraping.
- Exploratory Data Analysis?
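
The tidy data and data transformation verbs listed under 03/01 can be sketched in one short pipeline. This is only an illustration, assuming the tidyr and dplyr packages are installed; the `scores` data set, its column names, and the passing threshold of 80 are made up for the example and are not from the seminar material. Note that the tidyr function for splitting one column into many is `separate()`.

```r
library(tidyr)
library(dplyr)

# Hypothetical untidy data: name and major are packed into one column,
# and each exam has its own column instead of being an observation.
scores <- data.frame(
  student = c("Ana_Biology", "Ben_Physics", "Cam_Biology"),
  exam1   = c(90, 75, 82),
  exam2   = c(85, 80, 88),
  stringsAsFactors = FALSE
)

# Tidy the data: one variable per column, one observation per row.
tidy_scores <- scores %>%
  separate(student, into = c("name", "major"), sep = "_") %>%  # one column -> two
  gather(key = "exam", value = "score", exam1, exam2)          # exam columns -> key/value

# Transform the tidy data.
summary_by_major <- tidy_scores %>%
  mutate(passed = score >= 80) %>%         # add a new column (assumed threshold)
  select(major, exam, score, passed) %>%   # keep only the columns we need
  group_by(major) %>%                      # group rows by major
  summarise(
    mean_score = mean(score),              # summary metrics per group
    n_passed   = sum(passed)
  )

print(summary_by_major)
```

Each step hands its result to the next with the `%>%` pipe, which is also the subject of the 03/08 session.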
- R for Data Science
- Link: http://r4ds.had.co.nz
- "This is the website for “R for Data Science”. This book will teach you how to do data science with R: You’ll learn how to get your data into R, get it into the most useful structure, transform it, visualise it and model it. In this book, you will find a practicum of skills for data science. Just as a chemist learns how to clean test tubes and stock a lab, you’ll learn how to clean data and draw plots—and many other things besides. These are the skills that allow data science to happen, and here you will find the best practices for doing each of these things with R. You’ll learn how to use the grammar of graphics, literate programming, and reproducible research to save time. You’ll also learn how to manage cognitive resources to facilitate discoveries when wrangling, visualising, and exploring data."
- Advanced R
- Link: http://adv-r.had.co.nz
- "This is the companion website for “Advanced R”, a book in Chapman & Hall’s R Series. The book is designed primarily for R users who want to improve their programming skills and understanding of the language. It should also be useful for programmers coming to R from other languages, as it explains some of R’s quirks and shows how some parts that seem horrible do have a positive side."
- DataCamp
- Link: https://www.datacamp.com/getting-started?step=2&track=r
- Price: Maybe $30 a month. Worth the price.
- swirl
- Link: http://swirlstats.com/students.html
- "Learn R, in R"
- Course Descriptions: https://github.com/swirldev/swirl_courses#swirl-courses
- Price: Free!
# setup
install.packages("swirl")
library("swirl")
swirl::install_course_url("https://github.com/swirldev/swirl_courses/archive/master.zip", multi = TRUE)

# regular usage
library(swirl)
swirl()
- RStudio Resources: