Design Statistical Models on OpenClassrooms
P1CH3_01 Calculate Correlation.ipynb
P1CH3_02 Anscombes Quartet DatasaurusDozen.ipynb
P1CH4_01 Hypothesis Testing The T-test.ipynb
P1CH4_02 Hypothesis Testing The Kolmogorov Smirnoff test.ipynb
P2CH1 Univariate Regression.ipynb
P2CH2 Mutivariate Regression.ipynb
P2CH3_01 Assumptions of Linearity and Collinearity.ipynb
P2CH3_02 Linear Regression Assumptions on Residuals.ipynb
P3CH1 Logistic Regression.ipynb
P3CH2 Categorical Predictors.ipynb
P3CH3 Polynomial Regression.ipynb
P4CH1 Predicting and Model Selection.ipynb
P4CH2 Evaluating Classification Models.ipynb


This repository contains the Jupyter notebooks and datasets that accompany the OpenClassrooms course Design Statistical Models.

Part I

We start with the core concepts required to build linear regression models:

  • Linearity
  • Correlation
  • Hypothesis testing
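
As a rough illustration of the correlation concept listed above, here is a minimal sketch of the Pearson correlation coefficient in plain Python. It is not taken from the course notebooks; the function name `pearson_r` and the sample data are made up for this example.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sums of squared deviations
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Illustrative data (invented for this sketch): y is exactly 2x + 1,
# so the two variables are perfectly linearly related.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]
r = pearson_r(x, y)
r_reversed = pearson_r(x, list(reversed(y)))
```

A value of `r` close to 1 or -1 indicates a strong linear relationship, while a value near 0 indicates little linear relationship. It says nothing about non-linear structure.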

Part II

We move on to univariate and multivariate linear regression. We start with hands-on applications to standard datasets and follow up with the underlying theoretical basis.

  • Univariate and multivariate linear regression
  • The 5 assumptions of linear regression
  • The mathematical basis for linear regression
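
To make the univariate case concrete, the following sketch fits a line by ordinary least squares using the closed-form solution (slope = covariance of x and y divided by variance of x). It is an assumption-laden illustration rather than the course's own code: the helper `fit_line` and the data are hypothetical, and the notebooks may well use a statistics library instead.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Illustrative data (invented for this sketch): points lying on y = 1 + 2x
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(x, y)

# Residuals are the differences between observed and fitted values;
# several of the regression assumptions are checked on them.
residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
```

With an intercept in the model, the least squares residuals always sum to zero (up to rounding), which is a quick sanity check on any fit.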

Part III

We extend the linear regression framework to classification, categorical predictors, and polynomial regression.

  • Logistic regression
  • Dealing with categorical variables
  • Polynomial regression
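
The three topics above each amount to a transformation applied around an otherwise linear model. The sketch below shows one plausible form of each in plain Python; all three helper names (`sigmoid`, `dummy_encode`, `polynomial_features`) are hypothetical and not taken from the notebooks.

```python
import math

def sigmoid(z):
    """Logistic function: maps a linear score to a probability in (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # Equivalent form that avoids overflow for large negative z
    ez = math.exp(z)
    return ez / (1.0 + ez)

def dummy_encode(values):
    """Encode a categorical variable as 0/1 columns, dropping the first
    level (in sorted order) so that it acts as the reference category."""
    levels = sorted(set(values))
    kept = levels[1:]
    rows = [[1 if v == level else 0 for level in kept] for v in values]
    return kept, rows

def polynomial_features(x, degree):
    """Return [x, x**2, ..., x**degree] for use as regression inputs."""
    return [x ** d for d in range(1, degree + 1)]

# Illustrative inputs (invented for this sketch)
p = sigmoid(0.0)
columns, encoded = dummy_encode(["red", "green", "blue", "green"])
powers = polynomial_features(2.0, 3)
```

Dropping one level in `dummy_encode` is a common way to avoid perfect collinearity with the intercept; here "blue" becomes the reference category, so it is encoded as all zeros.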

Part IV

In the last part, we move from statistical modeling to a predictive analytics approach and address overfitting, cross-validation, and classification metrics.

  • Predicting with linear regression
  • Classification metrics and model selection
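
As a minimal sketch of two ideas from this part, the code below builds k-fold cross-validation splits and computes accuracy, precision, and recall for binary predictions. The function names and the labels are invented for illustration, and the split is deliberately unshuffled; in practice a library implementation with shuffling would likely be used.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous (train, test) index pairs."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # The first (n_samples % k) folds get one extra sample
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    folds, start = [], 0
    for size in sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        folds.append((train_idx, test_idx))
        start += size
    return folds

def classification_metrics(y_true, y_pred):
    """Accuracy, precision and recall for 0/1 labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Illustrative labels (invented for this sketch)
folds = k_fold_indices(10, 3)
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
accuracy, precision, recall = classification_metrics(y_true, y_pred)
```

Each sample appears in exactly one test fold, so every observation is used for evaluation once and for training in the remaining folds. Accuracy alone can be misleading on imbalanced classes, which is why precision and recall are reported alongside it.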

The datasets are available in the /data folder.
