# DeltaAnalytics/machine_learning_for_good

Contents:

- images
- 2_1_linear_regression-build_univariate_model.ipynb
- 2_2_linear_regression_check_assumptions.ipynb
- 2_3_linear_regression_build_multivariate_model.ipynb
- 2_4_polynomial_regression.ipynb
- 2_5_linear_regression_regularization.ipynb

# Module 2: Linear Regression


Welcome to Module 2 of the introductory Data for Good course, where we will explore linear regression, the first machine learning algorithm of this course!

## Goals

By the end of this module you should feel comfortable with the fundamentals of linear regression. Specific topics include:

1. How to split the data between training and test data
2. Using training data to train a linear regression model
3. Analyzing the results of the model
4. Checking the assumptions of linear regression
5. Building a multivariate regressor
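Steps 1 through 3 above can be sketched with scikit-learn. This is only an illustrative outline, using synthetic data in place of the dataset used in the notebooks:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic data standing in for a real dataset: y is roughly 2x + 1 plus noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 2 * X[:, 0] + 1 + rng.normal(0, 0.5, size=200)

# 1. Split the data between training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# 2. Use the training data to train a linear regression model
model = LinearRegression()
model.fit(X_train, y_train)

# 3. Analyze the results of the model on held-out test data
print("intercept:", model.intercept_)
print("coefficient:", model.coef_[0])
print("test R^2:", r2_score(y_test, model.predict(X_test)))
```

The fitted coefficient should land close to the true slope of 2, and the test R^2 close to 1, since the synthetic data is nearly linear.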

## Topic overview

Linear regression is a parametric model that predicts a continuous outcome feature (Y) from one or more explanatory features (X).

    Y = beta_0 + beta_1 * X

- beta_0 is called the intercept term, and represents the expected mean value of Y when all explanatory features equal 0.
- beta_1 is called a beta coefficient, and represents the expected change in the value of Y that results from a one-unit change in X.
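To make the roles of the intercept and the beta coefficients concrete, here is a minimal sketch (not code from the notebooks) that recovers them with ordinary least squares in NumPy, using a toy dataset generated from known coefficients:

```python
import numpy as np

# Toy multivariate data generated from known coefficients:
# y = 3 + 2*x1 - 1*x2 (exact, no noise, so OLS recovers them exactly)
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 0.0], [5.0, 3.0]])
y = 3 + 2 * X[:, 0] - 1 * X[:, 1]

# Prepend a column of ones so beta_0 (the intercept) is estimated too
X_design = np.column_stack([np.ones(len(X)), X])

# Ordinary least squares: beta = argmin ||X_design @ beta - y||^2
beta, *_ = np.linalg.lstsq(X_design, y, rcond=None)
print(beta)  # approximately [3., 2., -1.]
```

The first entry of `beta` is the intercept (beta_0); the rest are the beta coefficients, one per explanatory feature.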

This module fits a straight line to your data, where the value of the outcome feature is calculated as a linear combination of the explanatory features. Sounds relatively simple? Afraid not: there are many nuances and conditions that need to be understood before using linear regression! We are going to delve into these assumptions and conditions and then demonstrate how to use this algorithm on the Kiva dataset.
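One such condition is that the residuals (the differences between observed and predicted values) should center on zero with no systematic pattern. A minimal sketch of that check, using synthetic data rather than the Kiva dataset:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data with a genuinely linear relationship plus noise
rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(300, 1))
y = 4 * X[:, 0] + 2 + rng.normal(0, 1.0, size=300)

model = LinearRegression().fit(X, y)
residuals = y - model.predict(X)

# With an intercept in the model, residuals average to (near) zero;
# a clear trend or funnel shape in a residual plot would signal trouble
print("mean residual:", residuals.mean())
print("residual std:", residuals.std())
```

In practice you would plot `residuals` against the fitted values and look for patterns, as the assumption-checking notebook does.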

## Resources

Linear regression is one member of a family of linear parametric models. Some additional advanced topics we recommend looking up are...

### Logistic regression

Logistic regression is very similar to linear regression but has a categorical outcome instead. Rather than modeling a continuous dependent variable, it models a binary classification: yes or no, true or false, 1 or 0. It is still a linear model, as it assumes a linear relationship between the independent variables and the link function.
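A minimal sketch of the difference, again with made-up data: the outcome here is a 0/1 label rather than a continuous value, and the model outputs class probabilities through the logistic link.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy binary outcome: the label is 1 whenever the feature exceeds 5
# (cleanly separable on purpose, to keep the example simple)
rng = np.random.default_rng(2)
X = rng.uniform(0, 10, size=(200, 1))
y = (X[:, 0] > 5).astype(int)

clf = LogisticRegression().fit(X, y)

# predict returns class labels (0 or 1); predict_proba returns the
# modeled probability of each class
print(clf.predict([[1.0], [9.0]]))
print(clf.predict_proba([[5.0]]))
```

The interface mirrors `LinearRegression`, which makes it easy to move between the two once you are comfortable with the linear case.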