# Introduction to Regression with statsmodels in Python

> Linear regression and logistic regression are two of the most widely used statistical models. They act like master keys, unlocking the secrets hidden in your data. Updating ...

- toc: true
- branch: master
- badges: true
- comments: true
- author: Datacamp
- categories: [statsmodels, Regression, Linear regression, logistic regression, Visualizations, Taiwanese house prices, Facebook advert clicks]
- image: images/intro_statsmodels.png
- hide: false
- search_exclude: true


[**Download Datasets and Presentation slides for this post HERE**](https://github.com/anhhaibkhn/Data-Science-selfstudy-notes-Blog/tree/master/_notebooks/Introduction%20to%20Regression%20with%20statsmodels%20in%20Python)

> Linear regression and logistic regression are two of the most widely used statistical models. They act like master keys, unlocking the secrets hidden in your data. In this course, you’ll gain the skills you need to fit simple linear and logistic regressions. Through hands-on exercises, you’ll explore the relationships between variables in real-world datasets, including motor insurance claims, Taiwan house prices, fish sizes, and more. By the end of this course, you’ll know how to make predictions from your data, quantify model performance, and diagnose problems with model fit.

```python
import pandas as pd
import numpy as np
import warnings
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = [8, 6]

pd.set_option('display.expand_frame_repr', False)

warnings.filterwarnings("ignore", category=DeprecationWarning)
warnings.filterwarnings("ignore", category=FutureWarning)
```

## Summary Statistics

> Summary statistics give you the tools you need to boil down massive datasets to reveal the highlights. In this chapter, you'll explore summary statistics including mean, median, and standard deviation, and learn how to accurately interpret them. You'll also develop your critical thinking skills, allowing you to choose the best summary statistics for your data.


## Simple Linear Regression Modeling

> You’ll learn the basics of this popular statistical model, what regression is, and how linear and logistic regressions differ. You’ll then learn how to fit simple linear regression models with numeric and categorical explanatory variables, and how to describe the relationship between the response and explanatory variables using model coefficients.


### A tale of two variables
> **Which one is the response variable?**
> **Visualizing two numeric variables**

### Fitting a linear regression
> **Estimate the intercept**
> **Estimate the slope**
> **Linear regression with ols()**
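
As a minimal sketch of the three steps above, the block below estimates the intercept and slope by hand from the least-squares formulas and then fits the same model with `ols()`. The DataFrame is toy data invented for illustration, and the column names `n_convenience` and `price_twd_msq` are assumptions, not values taken from the course dataset.

```python
import pandas as pd
from statsmodels.formula.api import ols

# Toy data invented for illustration (NOT the course dataset);
# the column names are assumptions.
houses = pd.DataFrame({
    "n_convenience": [1, 2, 3, 4, 5],
    "price_twd_msq": [5.1, 7.9, 11.2, 13.8, 17.0],
})

# Estimate the slope and intercept by hand with the least-squares formulas
x = houses["n_convenience"]
y = houses["price_twd_msq"]
slope = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
intercept = y.mean() - slope * x.mean()

# Fit the same model with ols(); the formula reads "response ~ explanatory"
mdl_price_vs_conv = ols("price_twd_msq ~ n_convenience", data=houses).fit()

print(intercept, slope)
print(mdl_price_vs_conv.params)
```

The hand-computed values and `mdl_price_vs_conv.params` should agree up to floating-point error, which is a quick sanity check that the formula is specified the right way round.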

### Categorical explanatory variables
> **Visualizing numeric vs. categorical**
> **Calculating means by category**
> **Linear regression with a categorical explanatory variable**
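
With a single categorical explanatory variable and no intercept, the fitted coefficients are simply the mean of the response within each category. The sketch below illustrates this on a small made-up fish table; the column names `species` and `mass_g` and all values are assumptions for illustration only.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols

# Toy data invented for illustration; column names are assumptions.
fish = pd.DataFrame({
    "species": ["Bream", "Bream", "Perch", "Perch", "Pike", "Pike"],
    "mass_g": [600.0, 640.0, 380.0, 400.0, 700.0, 740.0],
})

# Mean of the response for each category
mean_mass_by_species = fish.groupby("species")["mass_g"].mean()

# "+ 0" drops the intercept, so each coefficient is a category mean
mdl_mass_vs_species = ols("mass_g ~ species + 0", data=fish).fit()

print(mean_mass_by_species)
print(mdl_mass_vs_species.params)
```

Without `+ 0`, the first category becomes the intercept and the other coefficients are differences from it, which is usually harder to read for a single categorical variable.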


## Predictions and model objects

> In this chapter, you’ll discover how to use linear regression models to make predictions on Taiwanese house prices and Facebook advert clicks. You’ll also grow your regression skills as you get hands-on with model objects, understand the concept of "regression to the mean", and learn how to transform variables in a dataset.


### Making predictions
> **Predicting house prices**
> **Visualizing predictions**
> **The limits of prediction**
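
A common prediction workflow is to build a DataFrame of explanatory values, pass it to `.predict()`, and keep the results next to the inputs. The sketch below assumes the same toy house data and hypothetical column names as before; it is not the course dataset.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols

# Toy data invented for illustration; column names are assumptions.
houses = pd.DataFrame({
    "n_convenience": [1, 2, 3, 4, 5],
    "price_twd_msq": [5.1, 7.9, 11.2, 13.8, 17.0],
})
mdl_price_vs_conv = ols("price_twd_msq ~ n_convenience", data=houses).fit()

# Explanatory values to predict for: 0 to 10 convenience stores
explanatory_data = pd.DataFrame({"n_convenience": np.arange(0, 11)})

# Store the predictions alongside the explanatory values
prediction_data = explanatory_data.assign(
    price_twd_msq=mdl_price_vs_conv.predict(explanatory_data)
)
print(prediction_data)
```

Here the model was fitted on 1 to 5 stores only, so the rows for 0 and for 6 to 10 are extrapolations. The model will happily return numbers for them, but nothing in the data supports those values.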

### Working with model objects
> **Extracting model elements**
> **Manually predicting house prices**
> **Regression to the mean**
> **Home run!**
> **Plotting consecutive portfolio returns**
> **Modeling consecutive returns**
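
A fitted statsmodels result exposes its pieces as attributes. The sketch below pulls out the coefficients, fitted values, and residuals, then reproduces `.predict()` by hand as intercept plus slope times the explanatory value. Data and column names are again invented for illustration.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols

# Toy data invented for illustration; column names are assumptions.
houses = pd.DataFrame({
    "n_convenience": [1, 2, 3, 4, 5],
    "price_twd_msq": [5.1, 7.9, 11.2, 13.8, 17.0],
})
mdl_price_vs_conv = ols("price_twd_msq ~ n_convenience", data=houses).fit()

coeffs = mdl_price_vs_conv.params               # intercept and slope
fitted_values = mdl_price_vs_conv.fittedvalues  # predictions on the original data
residuals = mdl_price_vs_conv.resid             # actual minus fitted

# Manual prediction: intercept + slope * explanatory value
explanatory_data = pd.DataFrame({"n_convenience": np.arange(0, 11)})
manual_prediction = (
    coeffs["Intercept"] + coeffs["n_convenience"] * explanatory_data["n_convenience"]
)

print(coeffs)
print(manual_prediction.head())
```

Adding `fitted_values` and `residuals` back together recovers the original response column, which is a useful check when working with these attributes.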

### Transforming variables
> **Transforming the explanatory variable**
> **Transforming the response variable too**
> **Back transformation**
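
When both variables are transformed before fitting, predictions come out on the transformed scale and have to be back-transformed. The sketch below uses a square-root transformation on both sides and squares the prediction to undo it. The data are constructed to follow an exact square-root relationship purely for illustration, and the column names are assumptions.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols

# Toy data invented for illustration; column names are assumptions.
houses = pd.DataFrame({
    "dist_to_mrt_m": [4.0, 16.0, 36.0, 64.0, 100.0],
    "price_twd_msq": [81.0, 64.0, 49.0, 36.0, 25.0],
})

# Transform the explanatory and the response variable
houses["sqrt_dist_to_mrt_m"] = np.sqrt(houses["dist_to_mrt_m"])
houses["sqrt_price_twd_msq"] = np.sqrt(houses["price_twd_msq"])

mdl = ols("sqrt_price_twd_msq ~ sqrt_dist_to_mrt_m", data=houses).fit()

# Predict on the transformed scale for a new distance
explanatory_data = pd.DataFrame({"dist_to_mrt_m": [49.0]})
explanatory_data["sqrt_dist_to_mrt_m"] = np.sqrt(explanatory_data["dist_to_mrt_m"])
sqrt_prediction = mdl.predict(explanatory_data)

# Back transformation: square the prediction to return to the original scale
price_prediction = sqrt_prediction ** 2
print(price_prediction)
```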


## Assessing model fit

> In this chapter, you’ll learn how to ask questions of your model to assess fit. You’ll learn how to quantify how well a linear regression model fits, diagnose model problems using visualizations, and understand how each observation's leverage and influence shape the model.


### Quantifying model fit
> **Coefficient of determination**
> **Residual standard error**
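
Both metrics can be read from the fitted model or computed by hand. For simple linear regression the coefficient of determination equals the squared correlation between the two variables, and the residual standard error is the square root of the sum of squared residuals divided by the degrees of freedom (observations minus two coefficients). The sketch below checks both on invented toy data with assumed column names.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols

# Toy data invented for illustration; column names are assumptions.
houses = pd.DataFrame({
    "n_convenience": [1, 2, 3, 4, 5],
    "price_twd_msq": [5.1, 7.9, 11.2, 13.8, 17.0],
})
mdl = ols("price_twd_msq ~ n_convenience", data=houses).fit()

# Coefficient of determination, from the model and from the correlation
r_squared = mdl.rsquared
r_squared_manual = houses["n_convenience"].corr(houses["price_twd_msq"]) ** 2

# Residual standard error, from the model and by hand
rse = np.sqrt(mdl.mse_resid)
deg_freedom = len(houses) - 2  # two coefficients: intercept and slope
rse_manual = np.sqrt((mdl.resid ** 2).sum() / deg_freedom)

print(r_squared, r_squared_manual)
print(rse, rse_manual)
```

The residual standard error is in the same units as the response, so it can be read as a typical size of the prediction error.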

### Visualizing model fit
> **Residuals vs. fitted values**
> **Q-Q plot of residuals**
> **Scale-location**
> **Drawing diagnostic plots**

### Outliers, leverage, and influence
> **Leverage**
> **Influence**
> **Extracting leverage and influence**
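
Leverage and influence can be pulled out of a fitted model with `get_influence().summary_frame()`, where the `hat_diag` column holds the leverage and `cooks_d` holds Cook's distance. The sketch below uses invented toy data in which the last row has an extreme explanatory value; column names are assumptions.

```python
import pandas as pd
from statsmodels.formula.api import ols

# Toy data invented for illustration; the last row is deliberately extreme.
houses = pd.DataFrame({
    "n_convenience": [1, 2, 3, 4, 5, 12],
    "price_twd_msq": [5.1, 7.9, 11.2, 13.8, 17.0, 20.0],
})
mdl = ols("price_twd_msq ~ n_convenience", data=houses).fit()

# One row of influence measures per observation
summary_info = mdl.get_influence().summary_frame()

# Leverage: how extreme the explanatory value is
houses["leverage"] = summary_info["hat_diag"].values
# Cook's distance: how much the fit changes if the row is left out
houses["cooks_dist"] = summary_info["cooks_d"].values

print(houses.sort_values("cooks_dist", ascending=False))
```

Sorting by `cooks_dist` in descending order is a quick way to list the observations that most deserve a closer look.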


## Simple Logistic Regression Modeling

> Learn to fit logistic regression models. Using real-world data, you’ll predict the likelihood of a customer closing their bank account as probabilities of success and odds ratios, and quantify model performance using confusion matrices.


### Why you need logistic regression
> **Exploring the explanatory variables**
> **Visualizing linear and logistic models**
> **Logistic regression with logit()**
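
Fitting a logistic regression with `logit()` follows the same formula pattern as `ols()`, with a 0/1 response. The sketch below uses a tiny invented churn table; the column names `has_churned` and `time_since_last_purchase` and all values are assumptions for illustration.

```python
import pandas as pd
from statsmodels.formula.api import logit

# Toy data invented for illustration; column names are assumptions.
churn = pd.DataFrame({
    "time_since_last_purchase": [1, 2, 3, 4, 5, 6, 7, 8],
    "has_churned": [0, 0, 1, 0, 1, 0, 1, 1],
})

# disp=0 suppresses the optimizer's convergence printout
mdl_churn = logit(
    "has_churned ~ time_since_last_purchase", data=churn
).fit(disp=0)

# Predictions from a logistic model are probabilities between 0 and 1
churn["predicted_prob"] = mdl_churn.predict(churn)

print(mdl_churn.params)
print(churn)
```

Unlike a straight-line fit to a 0/1 response, these predictions can never fall below 0 or above 1, which is the main reason to prefer logistic regression here.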

### Predictions and odds ratios
> **Probabilities**
> **Most likely outcome**
> **Odds ratio**
> **Log odds ratio**
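
The four quantities above are different views of the same prediction. The sketch below starts from hypothetical logistic coefficients (the intercept and slope are made up, not fitted) and derives the probability, the most likely outcome, the odds ratio, and the log odds ratio with plain NumPy.

```python
import numpy as np

# Hypothetical coefficients, chosen only for illustration
intercept, slope = -2.0, 0.5
x = np.array([0.0, 2.0, 3.0, 6.0, 8.0])

# On the log-odds scale the model is a straight line
log_odds_ratio = intercept + slope * x

# Probability of "success" via the logistic function
probability = 1 / (1 + np.exp(-log_odds_ratio))

# Most likely outcome: round the probability to 0 or 1
most_likely_outcome = np.round(probability)

# Odds ratio: probability of success relative to probability of failure
odds_ratio = probability / (1 - probability)

print(probability)
print(most_likely_outcome)
print(odds_ratio)
```

Taking `np.log(odds_ratio)` recovers the straight line in `log_odds_ratio`, which is why log odds are convenient to plot even though probabilities are easier to explain.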

### Quantifying logistic regression fit
> **Calculating the confusion matrix**
> **Drawing a mosaic plot of the confusion matrix**
> **Accuracy, sensitivity, specificity**
> **Measuring logistic model performance**
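
A confusion matrix counts the four combinations of actual and predicted outcome, and the three performance metrics are ratios of those counts. The sketch below computes them with plain NumPy from invented actual and predicted labels; in practice the predicted labels would come from rounding a fitted model's probabilities.

```python
import numpy as np

# Invented labels for illustration: 1 = churned, 0 = did not churn
actual = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 0])
predicted = np.array([0, 0, 1, 0, 1, 1, 0, 1, 1, 0])

# The four cells of the confusion matrix
TN = int(np.sum((actual == 0) & (predicted == 0)))  # true negatives
FP = int(np.sum((actual == 0) & (predicted == 1)))  # false positives
FN = int(np.sum((actual == 1) & (predicted == 0)))  # false negatives
TP = int(np.sum((actual == 1) & (predicted == 1)))  # true positives

# Rows are actual outcomes, columns are predicted outcomes
conf_matrix = np.array([[TN, FP], [FN, TP]])

accuracy = (TN + TP) / (TN + FP + FN + TP)  # share of correct predictions
sensitivity = TP / (TP + FN)                # share of actual positives found
specificity = TN / (TN + FP)                # share of actual negatives found

print(conf_matrix)
print(accuracy, sensitivity, specificity)
```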

### Congratulations