This repository contains the Jupyter notebooks and datasets that accompany the OpenClassrooms course: Design Statistical Models.
We start with the core concepts required to build linear regression models:
- Linearity
- Correlation
- Hypothesis testing
We move on to univariate and multivariate linear regression. We start with hands-on applications to standard datasets and follow up with the underlying theoretical basis.
- Univariate and multivariate linear regression
- The 5 assumptions of linear regression
- The mathematical basis for linear regression
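
To give a flavor of the univariate case, here is a minimal sketch of an ordinary least squares fit in plain Python. The function name `fit_line` and the toy data are illustrative assumptions and are not taken from the course notebooks, which may well use a statistics library instead.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least squares line through the points."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # slope = covariance(x, y) / variance(x); the 1/n factors cancel out.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data lying exactly on y = 2x + 1 (made up for illustration).
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(x, y)
print(slope, intercept)
```

On this toy data the fit recovers the slope and intercept of the generating line, since the points contain no noise.
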
We then extend the linear regression framework to classification, categorical variables, and polynomial regression.
- Logistic regression
- Dealing with categorical variables
- Polynomial regression
In the last part, we move from statistical modeling to a predictive analytics approach and address overfitting, cross-validation, and classification metrics.
- Predicting with linear regression
- Classification metrics and model selection
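
As a rough illustration of cross-validation, the sketch below estimates the held-out mean squared error of a straight-line fit with k contiguous folds. The helper names (`k_fold_indices`, `fit_line`, `cross_val_mse`) and the toy data are assumptions made for this example, not code from the notebooks. The folds are not shuffled here, which real, ordered data would usually require.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        # The first n % k folds get one extra element.
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_line(x, y):
    """Least squares slope and intercept for one predictor."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cross_val_mse(x, y, k):
    """Average held-out mean squared error over k folds."""
    scores = []
    for test_idx in k_fold_indices(len(x), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        # Fit on the training folds only, then score on the held-out fold.
        slope, intercept = fit_line([x[i] for i in train_idx],
                                    [y[i] for i in train_idx])
        errors = [(y[i] - (slope * x[i] + intercept)) ** 2 for i in test_idx]
        scores.append(sum(errors) / len(errors))
    return sum(scores) / len(scores)


# Toy data on the line y = 2x + 1 (made up for illustration).
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0 * v + 1.0 for v in x]
mse = cross_val_mse(x, y, k=3)
print(mse)
```

Because the model is always scored on points it was not fitted on, a held-out error much larger than the training error is a sign of overfitting.
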
The datasets are available in the /data folder.