DSCI 561: Regression I

Inference on a numeric response, in the presence of predictors.

Course Learning Outcomes

By the end of the course, students are expected to:

  • Fit a linear regression model using R and broom.
  • Interpret how predictors influence a response using a fitted linear regression model.
  • Identify whether a linear regression model is appropriate for a given dataset.
  • Identify use cases of linear regression.
  • Evaluate the fit of a regression model using residual plots.
  • Evaluate the fit of a regression model using appropriate measures of model goodness (MSE and R-squared), and draw the connection back to the null model.
  • Quantify estimation error vs. prediction error in the presence of predictors, and understand the decomposition of error in each case.
  • Understand the effect of multicollinearity on an OLS estimate.
  • Convert categorical predictors for use in a linear regression model.
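
As a rough illustration of the outcome on MSE, R-squared, and the null model, here is a minimal sketch in plain Python (the course itself uses R and broom). The data values and the helper names `fit_simple_ols` and `mse` are made up for this example. It fits a one-predictor least squares line, then compares its MSE against the intercept-only (null) model, whose fitted value is simply the mean of the response.

```python
# Minimal sketch: MSE and R-squared for a simple linear regression,
# compared against the null (intercept-only) model.
# The data below are invented purely for illustration.

def fit_simple_ols(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def mse(y, y_hat):
    """Mean squared error between observed and fitted values."""
    return sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat)) / len(y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# Fitted values from the one-predictor model.
intercept, slope = fit_simple_ols(x, y)
fitted = [intercept + slope * xi for xi in x]

# Fitted values from the null model: every prediction is the mean of y.
y_mean = sum(y) / len(y)
null_fitted = [y_mean] * len(y)

mse_model = mse(y, fitted)
mse_null = mse(y, null_fitted)

# R-squared is the proportional reduction in squared error
# relative to the null model.
r_squared = 1 - mse_model / mse_null

print("intercept:", intercept, "slope:", slope)
print("MSE (model):", mse_model, "MSE (null):", mse_null)
print("R-squared:", r_squared)
```

Because both MSEs divide by the same number of observations, their ratio equals the ratio of the residual sum of squares to the total sum of squares, so this agrees with the usual definition of R-squared.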


This is an assignment-based course. You'll be evaluated as follows:

Assessment | Weight | Deadline | Submit to
Lab Assignment 1 | 15% | Saturday, Nov 24 at 18:00 | GitHub
Lab Assignment 2 | 15% | Saturday, Dec 1 at 18:00 | GitHub
Lab Assignment 3 | 15% | Saturday, Dec 8 at 18:00 | GitHub
Lab Assignment 4 | 15% | Wednesday, Dec 12 at 18:00 | GitHub
Quiz 1 | 20% | Monday, Sept 24, 14:00-14:30 | TBD (aiming for Canvas)
Quiz 2 | 20% | Thursday, Dec 13 | TBD (aiming for Canvas)

Lecture Schedule

Lecture | Topic
1 | Review of statistical inference; connection between the two-sample t-test, ANOVA, and linear regression
2 | The linear model in general matrix notation; different types of predictors; interpretation of coefficients and parametrizations; estimation and inference
3 | Continuous and categorical predictors; interaction terms; interpretation of coefficients; estimation and inference
4 | Least squares estimation; fitted values; residuals; confidence intervals
5 | Multiple linear regression; out-of-sample predictions; prediction intervals
6 | Goodness of fit; estimation error; prediction error
7 | Transformations; multicollinearity; diagnostics; unusual and influential data
8 | Bootstrapping
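
The link between the two-sample t-test and linear regression named in Lecture 1 can be sketched with a small example. This is an illustrative sketch in plain Python with made-up data, not course material. It codes a two-level categorical predictor as a 0/1 dummy variable and fits a least squares line. The intercept then equals the mean of the baseline group, and the slope equals the difference between the two group means, which is the quantity a two-sample t-test examines.

```python
# Sketch: regression on a 0/1 dummy variable recovers the group means.
# Group labels and response values are invented for illustration.

groups = ["a", "a", "a", "b", "b", "b"]
y = [4.0, 5.0, 6.0, 7.0, 9.0, 8.0]

# Dummy coding: baseline group "a" -> 0, group "b" -> 1.
dummy = [1.0 if g == "b" else 0.0 for g in groups]

# Ordinary least squares with a single predictor.
n = len(y)
d_bar = sum(dummy) / n
y_bar = sum(y) / n
sxy = sum((d - d_bar) * (yi - y_bar) for d, yi in zip(dummy, y))
sxx = sum((d - d_bar) ** 2 for d in dummy)
slope = sxy / sxx
intercept = y_bar - slope * d_bar

# Group means computed directly, for comparison.
y_a = [yi for g, yi in zip(groups, y) if g == "a"]
y_b = [yi for g, yi in zip(groups, y) if g == "b"]
mean_a = sum(y_a) / len(y_a)
mean_b = sum(y_b) / len(y_b)

print("intercept:", intercept, "mean of group a:", mean_a)
print("slope:", slope, "difference in means:", mean_b - mean_a)
```

With equal variances assumed in both groups, the t-statistic for this slope matches the pooled two-sample t-statistic, so the two procedures test the same hypothesis.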

Annotated Resources

  1. Intro to Statistical Learning (ISLR), especially Chapter 3.
    • A modern and approachable take on statistics / machine learning.
  2. R for Data Science (r4ds), especially Part IV.
    • Practical and approachable book on the use of R for data science.
  3. Linear Models with R
    • Comprehensive book on linear models.
  4. OpenIntro Statistics
    • Fairly accessible, with a more traditional approach. Chapters 7 and 8 are relevant for linear regression.

Policies

Please see the general MDS policies.

