# Bayesian Linear Regression

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

import pymc3 as pm
import theano.tensor as tt

## Linear Regression

Linear regression is a standard tool in every statistician's toolbox. It provides a simple, easy-to-understand framework for expressing a linear relationship between a dependent variable and one or more independent variables, and it applies readily in many situations. A linear regression takes the form:

$$ Y = X\beta + \epsilon $$

Here $Y$ is the dependent variable, $X$ is the matrix of independent variables (one column per feature), $\beta$ is the vector of coefficients, one for each feature in $X$, and $\epsilon$ is the error term, assumed to be normally distributed with mean zero and variance $\sigma^2$.

There are several ways to fit the coefficients. The two most common are Ordinary Least Squares (OLS) and Maximum Likelihood Estimation (MLE); under the normal-error assumption above, both yield the same coefficient estimates.
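As a minimal sketch of the OLS route, the snippet below fits the coefficients on synthetic data with NumPy. The data, the true coefficients in `true_beta`, and the noise level are illustrative assumptions, not values from this notebook.

```python
import numpy as np

# Synthetic data: the coefficients and noise scale are assumptions for illustration.
rng = np.random.default_rng(0)
n = 200
true_beta = np.array([1.0, 2.5])            # [intercept, slope]
x = rng.uniform(-3, 3, size=n)
X = np.column_stack([np.ones(n), x])        # design matrix with an intercept column
y = X @ true_beta + rng.normal(0.0, 0.5, size=n)

# OLS: choose beta to minimize the squared residuals ||y - X beta||^2.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# The same estimate from the normal equations (X^T X) beta = X^T y.
beta_normal_eq = np.linalg.solve(X.T @ X, X.T @ y)

print("lstsq estimate:           ", beta_ols)
print("normal-equations estimate:", beta_normal_eq)
```

`np.linalg.lstsq` is generally preferable to solving the normal equations directly, since it is more stable when the columns of $X$ are nearly collinear.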

## Probabilistic Regression

Our goal here is to build a Bayesian, probabilistic linear regression that is easy to use in Python. We will use PyMC3 for this, but it is worth reviewing the basic theory first. To reformulate the linear regression above in Bayesian terms, we write:

$$ Y \sim \mathcal{N}(X\beta, \sigma^2) $$

A key component of Bayesian inference is that we define the pieces of our model as probability distributions. Our dependent variable is therefore described by a probability distribution as well. In this case, we assume that $Y$ is normally distributed. The mean of this distribution is the linear predictor $X\beta$, while the variance is $\sigma^2$ (the variance of the error $\epsilon$ in the original model).
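To make the distributional statement concrete, here is a small NumPy sketch that draws $Y$ from $\mathcal{N}(X\beta, \sigma^2)$ and evaluates the corresponding log-likelihood. The helper `gaussian_log_likelihood` and the values of `beta_true` and `sigma_true` are hypothetical, chosen only for illustration, and the observations are assumed to be independent.

```python
import numpy as np

def gaussian_log_likelihood(beta, sigma, X, y):
    """Log-likelihood of y under Y ~ N(X beta, sigma^2), independent observations."""
    resid = y - X @ beta
    n = y.shape[0]
    return -0.5 * n * np.log(2.0 * np.pi * sigma**2) - 0.5 * np.sum(resid**2) / sigma**2

# Simulate from the model; these parameter values are illustrative assumptions.
rng = np.random.default_rng(1)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.5, -1.0])
sigma_true = 0.8
y = rng.normal(loc=X @ beta_true, scale=sigma_true)   # Y ~ N(X beta, sigma^2)

# For a fixed sigma, the least-squares coefficients maximize this likelihood,
# so moving away from them can only lower it.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
ll_hat = gaussian_log_likelihood(beta_hat, sigma_true, X, y)
ll_shifted = gaussian_log_likelihood(beta_hat + np.array([0.3, 0.0]), sigma_true, X, y)

print("log-likelihood at least-squares fit:", ll_hat)
print("log-likelihood at shifted beta:     ", ll_shifted)
```

This is the likelihood half of the Bayesian model; a full Bayesian treatment would combine it with prior distributions on the unknown parameters.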