wdoyle42/vandy_ds_1000

Syllabus

Instructor Emails and Office Hours:

Prof. Clinton: josh.clinton@vanderbilt.edu Office Hours Signup

Prof. Doyle: w.doyle@vanderbilt.edu Office Hours Signup

Teaching Assistants:

Qi Xu: qi.xu.1@vanderbilt.edu Office Hours Mondays 3-4

Lecture Notes, Data and Code for Each Topic

Note: When new readings or async videos are available, they will be linked under the appropriate week.

To download lecture notes or data sets, Ctrl+click the link on a Mac or right-click it on a PC, choose "Save link as", and save the file to the class directory on your computer.
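
Once a data file is saved to the class directory, it can be read into R. The data sets listed below come in two kinds of files: `.Rds` and `.Rdata`/`.Rda`. The following is a minimal sketch of how each kind is typically loaded. It is not taken from the lecture notes; the folder name `ds1000`, the file names `example.Rds` and `example.Rdata`, and the small `players` data frame are all made-up stand-ins so the example runs without the real course files.

```r
# Hypothetical class directory; a temporary folder stands in for it here.
class_dir <- file.path(tempdir(), "ds1000")
dir.create(class_dir, showWarnings = FALSE, recursive = TRUE)

# Stand-in data (an assumption, not a course data set).
players <- data.frame(player = c("A", "B"), pts = c(10.5, 22.1))

# .Rds files hold a single object: write it with saveRDS(), read it with
# readRDS(), and assign the result to whatever name you like.
rds_path <- file.path(class_dir, "example.Rds")
saveRDS(players, rds_path)
players_from_rds <- readRDS(rds_path)

# .Rdata / .Rda files hold one or more named objects: load() restores them
# under their original names and returns those names as a character vector.
rdata_path <- file.path(class_dir, "example.Rdata")
save(players, file = rdata_path)
rm(players)                      # remove the object so load() visibly restores it
restored <- load(rdata_path)

print(restored)                                # names of the objects load() brought back
print(identical(players_from_rds, players))    # compare the two round trips
```

With a real course file, only the read step is needed, for example `readRDS()` on a saved `.Rds` file or `load()` on a saved `.Rdata` file, using the path where you saved it.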

1. The Wonderful World of Data Science

Lecture notes: PDF .Rmd

Survey Results: HTML

2. Hello, World!

Lecture notes: HTML PDF .Rmd

Data: College Debt Data

Part 2 of lecture notes: HTML PDF .Rmd

3. Data Wrangling

Lecture slides: PDF

Lecture notes: PDF

Michigan Exit Poll Data

Johns Hopkins Covid Data

Readings/Resources

"The Gender gap was expected to be historic."

"Exit polls, election surveys and more"

Wickham & Grolemund, Chapter 3

4. Univariate Data Analysis

Lecture Notes, Part 1: .Rmd PDF HTML

Dataset: nba_players.Rds

Codebook for nba_players.Rds

Lecture Notes, Part 2: .Rmd PDF HTML

Dataset: game_summary.Rds

Codebook for game_summary.Rds

If you're interested, code to access the NBA reference database for player-level data

If you're interested, code to access the NBA reference database for game-level data

5. Univariate Data Visualization

Detailed Lecture Notes: PDF

Lecture slides: PDF

Dataset: Pres2020.PV.Rdata

Student Outline: .Rmd

If interested: Challenger: The Final Flight

6. Conditional Data Visualization

Student Outline: .Rmd

Complete Markdown Code: .Rmd

Lecture slides - Part 1: PDF

Lecture slides - Part 2: PDF

Optional Report on 2020 Polls: PDF

7. Resampling Redux

Partial List of functions used thus far: .Rmd HTML

Lecture Notes - Part 1: .Rmd PDF HTML

Lecture Notes - Part 2: .Rmd PDF HTML

Dataset: pa.sample.select.Rdata

Dataset: Pres2020.PV.Rdata

8. Predicting (And Mapping)

Predicting the Electoral College - JDC Code: .Rmd HTML

Predicting the Electoral College: .Rmd HTML

Dataset: Pres2020.StatePolls.Rdata

Optional Notes - Mapping in R: .Rmd PDF HTML

Optional Dataset: StatePresidentialVote2020.Rdata

Optional Dataset: StateECV.Rdata

9. Regression

Regression notes, part 1: .Rmd HTML

Regression notes, part 2: .Rmd HTML

Regression notes, part 3: .Rmd HTML

"Simpler" version of workflows

Movie Data: mv.Rds

In case you're interested, code for accessing movie data, including the IMDB data

10. Clustering, Text, and Twitter...

Clustering notes, part 1: .Rmd HTML

Clustering notes, part 2: .Rmd HTML

Predicting with Text, part 3: .Rmd HTML

Sentiment Analysis with Twitter Text, part 4: .Rmd HTML

Florida County Data - for part 1: FloridaCountData.Rda

Federalist Papers Data - for part 2: FederalistPaperCorpusTidy.Rda

Federalist Papers Data - for part 3: FederalistPaperDocumentTermMatrix.Rda

Tweet-Level Data on Trump Tweets - for part 4: Trumptweets.Rda

Word-Level Data on Trump Tweets - for part 4: Trump_tweet_words.Rda

11. Classification: Or, Become an Admissions Dean for Fun and Profit!

Lecture 1: The Admissions "Funnel" and the Problem of Classification

Rmd Notes

HTML Notes

Data

Lecture 2: Logistic Regression and the AUC

Rmd Notes

HTML Notes

Data

Lecture 3: Cross Validation and Feature Selection for Classifiers

Rmd Notes

HTML Notes

Data

Lecture 4: Changing Policy

Rmd Notes

HTML Notes

Data

Our Last Day!

What happened to Zillow Offers?, part 1

What happened to Zillow Offers?, part 2

Leaderboard from Movies!

Google form for feedback

Helpful Resources

RStudio Cheat Sheet: Data Wrangling

RStudio Cheat Sheet: ggplot2

R Graphics Cookbook

... And the full list of RStudio cheat sheets

Tidymodels Resources

About

Vanderbilt's Data Science 1000 Course
