Outline for the Collective Campus data science course.

Introduction to R
Reading in data from csv files
Introduction to databases
Extracting data from databases
Merging data tables
Tidy data
Writing functions
The split-apply-combine strategy using dplyr
Generating summary statistics for arbitrary sub-groups
Writing data to files & databases

Introduction to predictive modelling
Structural modelling vs machine learning
Predicting different data types
Building a predictive model using linear regression
Under the hood: maximum likelihood
Feature selection & prediction using regularised GLMs
A brief introduction to Bayesian modelling
Classification and regression trees
Boosted trees and Random Forests

What is causality, and why won’t predictive models help me?
Data generating processes and observational equivalence
Unobserved data and simultaneity
The experimental ideal
Natural experiments as a way of thinking about the world
Instrumental variables
Other techniques (Difference-in-differences, regression discontinuity)
Matching routines

Introduction to ggplot2
Aesthetics – x, y, size, weight, group, colour, fill, etc.
Chart types
Data exploration using faceting and grouping
Customising chart appearance
Publishing work using Rmd and Rpres
Celebratory breakup drinks (!)