### Preamble and Imports

*This notebook is adapted from notebooks originally developed for [Pankaj Mehta's ML for physics course](http://physics.bu.edu/~pankajm/PY580.html), with some modifications based on [Volodymyr Kuleshov's applied ML course](https://github.com/kuleshov/cornell-cs5785-2022-applied-ml).*

In [1]:
## Preamble / required packages
import numpy as np
import scipy.sparse as sp
np.random.seed(0)

# Import local plotting functions and in-notebook display functions
import matplotlib.pyplot as plt
from IPython.display import Image, display
%matplotlib inline

import warnings
# Comment this out to activate warnings
warnings.filterwarnings('ignore')


# Bayesian methods



## Bayesian linear regression

In this notebook, we will explore Bayesian linear regression. We will start with a simple example and then move on to a more complex example. We will also compare the Bayesian approach with the frequentist approach.

### Simple example

Let's start with a simple example. We will generate data from a linear model with Gaussian noise, fit a linear model to it using Bayesian linear regression, and compare the results with the frequentist approach.

First, let's import the necessary libraries.

```python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pymc3 as pm
import pandas as pd
import arviz as az
```

We will generate the data from a linear model with Gaussian noise. The model is

$$
y = \beta_0 + \beta_1 x + \epsilon
$$

where $\beta_0$ is the intercept, $\beta_1$ is the slope, and $\epsilon$ is zero-mean Gaussian noise. The following code draws 100 points with $\beta_0 = 1$, $\beta_1 = 2$, and unit-variance noise.

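```python
# Generate synthetic data from the linear model
np.random.seed(123)
n = 100                     # number of observations
x = np.random.randn(n)      # inputs drawn from a standard normal
eps = np.random.randn(n)    # unit-variance Gaussian noise
# The "_true" suffix keeps these from being overwritten by the model variables later
beta0_true = 1              # true intercept
beta1_true = 2              # true slope
y = beta0_true + beta1_true * x + eps
```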

Let's plot the data.

```python
plt.plot(x, y, 'o')
plt.xlabel('x')
plt.ylabel('y')
```
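
As a point of comparison for the Bayesian fit, here is a minimal sketch of a frequentist baseline. It assumes that "the frequentist approach" means ordinary least squares, which the notebook does not state explicitly, and it regenerates the same synthetic data so that the block runs on its own. The names `intercept_ols`, `slope_ols`, and `std_err` are illustrative.

```python
import numpy as np

# Regenerate the synthetic data with the same seed and true parameters as above
np.random.seed(123)
n = 100
x = np.random.randn(n)
eps = np.random.randn(n)
y = 1 + 2 * x + eps

# Design matrix with a column of ones for the intercept
X = np.column_stack([np.ones(n), x])

# Ordinary least squares estimate of (intercept, slope)
beta_hat, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
intercept_ols, slope_ols = beta_hat

# Unbiased noise-variance estimate and the resulting standard errors
residuals = y - X @ beta_hat
sigma2_hat = residuals @ residuals / (n - 2)
std_err = np.sqrt(sigma2_hat * np.diag(np.linalg.inv(X.T @ X)))

print(f"intercept: {intercept_ols:.3f} +/- {std_err[0]:.3f}")
print(f"slope:     {slope_ols:.3f} +/- {std_err[1]:.3f}")
```

The point estimates and standard errors play the role that posterior means and posterior standard deviations play in the Bayesian fit, which makes the two approaches easy to compare side by side.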

We will now fit a linear model to the data using Bayesian linear regression. We will use the `pymc3` library to do this. The model is

$$
y \sim \mathcal{N}(\beta_0 + \beta_1 x, \sigma^2)
$$

where $\beta_0$, $\beta_1$, and $\sigma$ are the parameters of the model. We place a uniform prior on $[-10, 10]$ on both $\beta_0$ and $\beta_1$ and a half-normal prior with unit scale on $\sigma$. We will use the following code to define the model and draw samples from the posterior.

```python
with pm.Model() as model:
    # Priors
    beta0 = pm.Uniform('beta0', lower=-10, upper=10)
    beta1 = pm.Uniform('beta1', lower=-10, upper=10)
    sigma = pm.HalfNormal('sigma', sd=1)
    
    # Likelihood
    y_pred = pm.Normal('y_pred', mu=beta0 + beta1 * x, sd=sigma, observed=y)
    
    # Inference
    trace = pm.sample(1000, tune=1000, cores=2)
```
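
To make explicit what the sampler is targeting, the sketch below writes out the unnormalized log posterior for this model by hand and samples from it with a random-walk Metropolis algorithm in plain NumPy. This is not what `pymc3` does internally (its default for continuous variables is NUTS); Metropolis is swapped in here only because it fits in a few lines. The function names, step size, chain length, and burn-in are illustrative assumptions.

```python
import numpy as np

# Same synthetic data as above (true intercept 1, true slope 2)
np.random.seed(123)
n = 100
x = np.random.randn(n)
eps = np.random.randn(n)
y = 1 + 2 * x + eps

rng = np.random.default_rng(0)  # separate generator for the sampler


def log_posterior(theta, x, y):
    """Unnormalized log posterior for theta = (beta0, beta1, sigma)."""
    beta0, beta1, sigma = theta
    # Uniform(-10, 10) priors on the coefficients; sigma must be positive
    if abs(beta0) > 10 or abs(beta1) > 10 or sigma <= 0:
        return -np.inf
    # Half-normal prior with unit scale on sigma, up to an additive constant
    log_prior = -0.5 * sigma**2
    # Gaussian likelihood, up to an additive constant
    resid = y - (beta0 + beta1 * x)
    log_lik = -len(y) * np.log(sigma) - 0.5 * np.sum(resid**2) / sigma**2
    return log_prior + log_lik


def metropolis(x, y, n_steps=6000, step=0.1, start=(0.0, 0.0, 1.0)):
    """Random-walk Metropolis with an isotropic Gaussian proposal."""
    theta = np.array(start, dtype=float)
    logp = log_posterior(theta, x, y)
    chain = np.empty((n_steps, 3))
    for i in range(n_steps):
        proposal = theta + step * rng.standard_normal(3)
        logp_new = log_posterior(proposal, x, y)
        # Accept with probability min(1, p_new / p_old)
        if np.log(rng.uniform()) < logp_new - logp:
            theta, logp = proposal, logp_new
        chain[i] = theta
    return chain


chain = metropolis(x, y)
samples = chain[1000:]  # discard the first 1000 steps as burn-in
post_mean = samples.mean(axis=0)
print("posterior means (beta0, beta1, sigma):", np.round(post_mean, 2))
```

Each row of `samples` is one joint draw of the intercept, slope, and noise scale, which is the same kind of object that the `pymc3` trace stores.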

Let's plot the posterior distributions of the parameters.

```python
# Plot inside the model context so ArviZ can convert the pymc3 trace reliably
with model:
    az.plot_trace(trace)
```

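We can also overlay regression lines sampled from the posterior on the data.

```python
# This helper lives in pymc3 (not arviz) and expects the MultiTrace returned by pm.sample;
# `lm` maps our variable names to a regression line
lm = lambda x, sample: sample['beta0'] + sample['beta1'] * x
pm.plot_posterior_predictive_glm(trace, eval=np.linspace(x.min(), x.max(), 100), lm=lm,
                                 samples=100, label='posterior predictive regression lines')
plt.plot(x, y, 'o', label='data')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
```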
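
Regression lines like these show uncertainty in the mean $\beta_0 + \beta_1 x$ only; a posterior predictive draw additionally includes observation noise with standard deviation $\sigma$. The sketch below illustrates the difference in plain NumPy. To stay self-contained it does not use the `pymc3` trace. Instead it assumes a Gaussian stand-in for the posterior of the coefficients, centered at the least-squares estimate with covariance $\hat\sigma^2 (X^\top X)^{-1}$, which is the exact posterior under a flat prior with $\sigma$ fixed at $\hat\sigma$. In practice you would substitute the sampled values of `beta0`, `beta1`, and `sigma`.

```python
import numpy as np

# Same synthetic data as above
np.random.seed(123)
n = 100
x = np.random.randn(n)
eps = np.random.randn(n)
y = 1 + 2 * x + eps

rng = np.random.default_rng(0)

# Least-squares fit, used to build the stand-in posterior (an assumption of this sketch)
X = np.column_stack([np.ones(n), x])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta_hat
sigma_hat = np.sqrt(resid @ resid / (n - 2))
cov = sigma_hat**2 * np.linalg.inv(X.T @ X)

# Stand-in posterior draws of (beta0, beta1)
n_draws = 2000
betas = rng.multivariate_normal(beta_hat, cov, size=n_draws)

x_grid = np.linspace(x.min(), x.max(), 50)

# Each row is one regression line evaluated on the grid (uncertainty in the mean)
mean_lines = betas[:, [0]] + betas[:, [1]] * x_grid

# Posterior predictive draws: the same lines plus observation noise
y_rep = mean_lines + sigma_hat * rng.standard_normal(mean_lines.shape)

# Central 94% bands for the mean and for new observations
mean_band = np.percentile(mean_lines, [3, 97], axis=0)
pred_band = np.percentile(y_rep, [3, 97], axis=0)
mean_width = mean_band[1] - mean_band[0]
pred_width = pred_band[1] - pred_band[0]

print("average width of the band for the mean:    ", round(mean_width.mean(), 2))
print("average width of the band for new y values:", round(pred_width.mean(), 2))
```

The band for new observations is much wider than the band for the mean because it also carries the noise term, which does not shrink as more data are collected.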

