Bayes Rules! book, Chapter 11:

https://www.bayesrulesbook.com/chapter-11.html

Materials from the Bayes Rules! GitHub repository:

https://github.com/bayes-rules/bayesrules

# Imports

In [1]:
import math, pyreadr
import pandas as pd
import numpy as np
import plotly.graph_objects as go
import plotly.express as px
import plotly.figure_factory as ff
from plotly.subplots import make_subplots
from scipy.stats import norm, beta, binom, mode
from os.path import exists

import pyro
import torch as t
import pyro.distributions as dist
import pyro.distributions.constraints as constraints
from pyro.infer import MCMC
from pyro.infer.mcmc.nuts import HMC, NUTS

device = t.device("cuda" if t.cuda.is_available() else "cpu")
t.set_default_tensor_type(t.FloatTensor)
if t.cuda.is_available():
    t.set_default_tensor_type(t.cuda.FloatTensor)


# Bayesian Regression with predictor

## Australian Temperatures

## Data

In [5]:
file_name = 'weather_WU'
folder = 'ch11'

data_url = f"https://github.com/bayes-rules/bayesrules/raw/master/data/{file_name}.rda"

base_dir = f"/Users/zr/Geek/tutorials/bayesian_rules/{folder}"
csv_path = f"{base_dir}/{file_name}.csv"

if exists(csv_path):
    df = pd.read_csv(csv_path)
else:
    # pyreadr downloads the remote file, saves it locally, and converts the RDA data to a pandas DataFrame
    rda_path = f"{base_dir}/{file_name}.rda"
    pyreadr.download_file(data_url, rda_path)
    result = pyreadr.read_r(rda_path)
    df = result[file_name]
    # index=False keeps the row index out of the cached CSV (no stray 'Unnamed: 0' column on reload)
    df.to_csv(csv_path, index=False)

In [11]:
keep_columns=[
    'location',
    'windspeed9am',
    'humidity9am',
    'pressure9am',
    'temp9am',
    'temp3pm'
]
df = df[keep_columns]

In [12]:
print(df.location.value_counts())
df.head().T

Wollongong    100
Uluru         100
Name: location, dtype: int64


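| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| location | Uluru | Uluru | Uluru | Uluru | Uluru |
| windspeed9am | 20 | 9 | 7 | 28 | 24 |
| humidity9am | 23 | 71 | 15 | 29 | 10 |
| pressure9am | 1023.3 | 1012.9 | 1012.3 | 1016 | 1010.5 |
| temp9am | 20.9 | 23.4 | 24.1 | 26.4 | 36.7 |
| temp3pm | 29.7 | 33.9 | 39.7 | 34.2 | 43.3 |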


## Simple Modeling

>Let’s begin our analysis with the familiar: a simple Normal regression model of temp3pm with one quantitative predictor, the morning temperature temp9am, both measured in degrees Celsius. As you might expect, there’s a positive association between these two variables – the warmer it is in the morning, the warmer it tends to be in the afternoon:

In [14]:
px.scatter(x=df.temp9am, y=df.temp3pm)

Modeling the above:

**Slope**: using two arbitrary points, `(9.8, 18.5)` and `(30, 36.3)`, rise over run gives $\frac{36.3-18.5}{30-9.8} = \frac{17.8}{20.2} \approx 0.88$. The prior below is centered on the inverse ratio, $\frac{20.2}{17.8} \approx 1.13$, which is a steeper guess but still well within one prior standard deviation of $0.88$.

**Slope $\sigma$**: a slope of `2x` seems too high, while `0x` is too low. Let's go with $\sim 1$.

**Intercept**: using our first point `(9.8, 18.5)` and the slope of $1.13$, we get $18.5 - 1.13 \times 9.8 \approx 7.43$.

**Intercept $\sigma$**: looking at points around $x=20$, `temp3pm` covers the range `[20.2 ... 32.4]`, a spread of `12.2`, so let's try $\sqrt{10} \approx 3$.

Finally, for the model-wide $\sigma$, we'll use the same rationale we used for **Intercept $\sigma$**: $\sim 3$.
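The two-point arithmetic can be checked directly. This is a minimal sketch using only the two points quoted above; the variable names are ours. It prints both the rise-over-run ratio and its inverse, since the value $1.13$ quoted in the text corresponds to the inverse ratio.

```python
# Two points read off the scatter plot (values taken from the text above).
x_a, y_a = 9.8, 18.5
x_b, y_b = 30.0, 36.3

rise = y_b - y_a  # change in temp3pm
run = x_b - x_a   # change in temp9am

slope_rise_over_run = rise / run  # about 0.88
slope_inverse = run / rise        # about 1.13, the value quoted in the text

# Intercept implied by the quoted slope of 1.13 and the first point.
intercept = y_a - 1.13 * x_a      # about 7.43

print(round(slope_rise_over_run, 2), round(slope_inverse, 2), round(intercept, 2))
```

Either slope value sits comfortably inside a prior with standard deviation $1$, so the choice of center has little effect once 200 observations are conditioned on.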

Our model:

- $Y_{i}|\beta_{0},\beta_{1},\sigma \sim N(\mu_{i}, \sigma^2) \text{ with } \mu_{i} = \beta_{0} + \beta_{1} x_{i}$
- $\beta_{0} \sim N(7.43, 3^2)$
- $\beta_{1} \sim N(1.13, 1^2)$
- $\sigma \sim \text{Exp}(2)$ *(chosen after graphing different `loc` values for the exponential distribution)*

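The book uses the following, with the prior placed on a centered intercept $\beta_{0c}$:

- $Y_{i}|\beta_{0},\beta_{1},\sigma \sim N(\mu_{i}, \sigma^2) \text{ with } \mu_{i} = \beta_{0} + \beta_{1} x_{i}$
- $\beta_{0c} \sim N(25, 5^2)$
- $\beta_{1} \sim N(0, 3.1^2)$
- $\sigma \sim \text{Exp}(0.13)$

The `model` function in the next cell mixes the two: it takes the intercept prior from our own estimate, $N(7.43, 3^2)$, and the $\sigma$ prior from the book, with a slope prior of $N(2, 3.1^2)$.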
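To get a feel for how different the two exponential priors on $\sigma$ are, it helps to look at their means and central ranges. The sketch below assumes both are parameterized by rate, as `dist.Exponential` is in Pyro; `exponential_prior_summary` is a helper name introduced here for illustration. Note that `scipy.stats.expon` takes a scale, which is the reciprocal of the rate.

```python
from scipy.stats import expon


def exponential_prior_summary(rate):
    """Mean and central 90% range of an Exponential prior with the given rate."""
    prior = expon(scale=1.0 / rate)  # scipy uses scale = 1 / rate
    low, high = prior.ppf([0.05, 0.95])
    return prior.mean(), low, high


for rate in (2.0, 0.13):
    mean, low, high = exponential_prior_summary(rate)
    print(f"rate={rate}: mean={mean:.2f}, central 90% range=({low:.2f}, {high:.2f})")
```

A rate of $2$ gives a prior mean of $0.5$ degrees, while a rate of $0.13$ gives a prior mean of about $7.7$ degrees, so the book's prior is far more permissive about large residual spread.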

In [65]:
def model(x, y=None):
    intercept = pyro.sample('intercept', dist.Normal(7.43, 3))
    slope     = pyro.sample('slope', dist.Normal(2, 3.1))
    sigma     = pyro.sample('sigma', dist.Exponential(0.13))

    mu = intercept + slope * x

    with pyro.plate('data', len(x)):
        return pyro.sample('obs', dist.Normal(mu, sigma), obs=y)

In [70]:
x_data = t.tensor(df.temp9am.values, dtype=t.float)
y_data = t.tensor(df.temp3pm.values, dtype=t.float)

pyro.clear_param_store()
mcmc = MCMC(NUTS(model), num_samples=1000, warmup_steps=250)
mcmc.run(x_data, y_data)
mcmc.summary()

Sample: 100%|██████████| 1250/1250 [00:28, 44.06it/s, step size=2.22e-01, acc. prob=0.924]



                 mean       std    median      5.0%     95.0%     n_eff     r_hat
  intercept      4.47      0.87      4.46      3.01      5.82    420.54      1.00
      sigma      4.13      0.22      4.11      3.77      4.46    440.11      1.00
      slope      1.03      0.04      1.03      0.96      1.09    405.55      1.00

Number of divergences: 0


From the book:

>Per the 80% credible interval for β1, there’s an 80% posterior probability that for every one degree increase in `temp9am`, the average increase in `temp3pm` is somewhere between $0.98$ and $1.1$ degrees.

Above, the `5.0%` and `95.0%` columns give a 90% credible interval for the slope of $0.96$ to $1.09$, which is pretty close, keeping in mind that it covers slightly more probability than the book's 80% interval.

From the book:

>Further, per the 80% credible interval for standard deviation σ, this relationship is fairly strong – observed afternoon temperatures tend to fall somewhere between only $3.87$ and $4.41$ degrees

Above, our 90% interval for σ, $3.77$ to $4.46$, is close to the book's.
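The Pyro summary reports the 5% and 95% quantiles, while the book quotes 80% intervals. To compare like with like, an equal-tailed interval at any level can be computed from the posterior draws. The sketch below is an illustration only: `credible_interval` is a helper name introduced here, and the draws are synthetic stand-ins generated from the posterior mean and standard deviation of the slope in the summary table, not the actual MCMC samples.

```python
import numpy as np


def credible_interval(samples, level=0.80):
    """Equal-tailed credible interval from a 1-D collection of posterior draws."""
    tail = (1.0 - level) / 2.0
    low, high = np.quantile(np.asarray(samples), [tail, 1.0 - tail])
    return float(low), float(high)


# Synthetic stand-in for the slope draws (assumption: roughly Normal(1.03, 0.04)).
rng = np.random.default_rng(0)
stand_in_slope_draws = rng.normal(loc=1.03, scale=0.04, size=1000)

interval_80 = credible_interval(stand_in_slope_draws, level=0.80)
interval_90 = credible_interval(stand_in_slope_draws, level=0.90)
print("80%:", interval_80)
print("90%:", interval_90)
```

With the real run, the draws from `mcmc.get_samples()['slope']`, converted to a NumPy array, would take the place of the synthetic array.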


In [72]:
sample_betas = pd.DataFrame(mcmc.get_samples())[['intercept', 'slope']]

def pull_sample():
    b0, b1 = sample_betas.sample(1).values[0]
    X = np.arange(df.temp9am.min() - 5, df.temp9am.max() + 5)
    Y = [b0+b1*x for x in X]
    return X,Y

fig = go.Figure()
fig.add_trace(
    go.Scatter(x=df.temp9am, y=df.temp3pm, mode='markers')
)
for i in range(10):
    x, y = pull_sample()
    fig.add_trace(
        go.Scatter(x=x, y=y)
    )

fig.update_layout(showlegend=False)
fig

## Adding location feature

The `location` feature is categorical. Let's explore its relationship to `temp3pm`:

In [92]:
px.scatter(df, x='temp9am', y='temp3pm', color='location', trendline='ols')

We can convert our categorical feature, `location`, into an indicator variable:
$
    x_{2} = \begin{cases} 
        1, & \text{if Wollongong} \\
        0, & \text{otherwise}
    \end{cases}
$

In [137]:
def model(x1, x2, y=None):
    intercept = pyro.sample('intercept', dist.Normal(7.43, 3))
    slope     = pyro.sample('slope', dist.Normal(2, 3.1))
    location  = pyro.sample('location', dist.Normal(0, 2))
    sigma     = pyro.sample('sigma', dist.Exponential(0.13))

    mu = intercept + slope * x1 + location * x2

    with pyro.plate('data', len(x1)):
        return pyro.sample('obs', dist.Normal(mu, sigma), obs=y)

In [138]:
# Indicator for location: 1.0 for Wollongong, 0.0 for Uluru
x2_data = t.tensor((df.location == 'Wollongong').to_numpy(dtype=float), dtype=t.float)

pyro.clear_param_store()
mcmc = MCMC(NUTS(model), num_samples=1000, warmup_steps=200)
mcmc.run(x_data, x2_data, y_data)
mcmc.summary()

Sample: 100%|██████████| 1200/1200 [00:50, 23.79it/s, step size=2.09e-01, acc. prob=0.949]


                 mean       std    median      5.0%     95.0%     n_eff     r_hat
  intercept     10.87      0.61     10.90      9.88     11.86    266.94      1.00
   location     -6.79      0.33     -6.79     -7.41     -6.30    478.81      1.00
      sigma      2.39      0.12      2.39      2.18      2.57    639.57      1.00
      slope      0.87      0.03      0.87      0.83      0.92    272.18      1.00

Number of divergences: 0
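Because the location indicator is either 0 or 1, this model describes two parallel lines: one for Uluru and one for Wollongong, shifted by the `location` coefficient. The sketch below plugs in the posterior means from the summary table above; `expected_temp3pm` is a helper name introduced here, and using point estimates ignores posterior uncertainty.

```python
# Posterior means copied from the summary table above.
post_mean = {"intercept": 10.87, "slope": 0.87, "location": -6.79}


def expected_temp3pm(temp9am, wollongong):
    """Posterior-mean line: intercept + slope * temp9am + location * x2."""
    x2 = 1.0 if wollongong else 0.0
    return post_mean["intercept"] + post_mean["slope"] * temp9am + post_mean["location"] * x2


for temp in (10.0, 20.0, 30.0):
    uluru = expected_temp3pm(temp, wollongong=False)
    wollongong = expected_temp3pm(temp, wollongong=True)
    print(f"temp9am={temp}: Uluru={uluru:.2f}, Wollongong={wollongong:.2f}")
```

At any morning temperature the two predictions differ by the same amount, the `location` coefficient, which is what a model without an interaction term implies.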





In [141]:
sample_betas = pd.DataFrame(mcmc.get_samples())[['intercept', 'slope', 'location']]

def pull_sample(use_b2=False):
    b0, b1, b2 = sample_betas.sample(1).values[0]
    X = np.arange(df.temp9am.min() - 1, df.temp9am.max() + 1)
    Y = [b0+b1*x+(b2 if use_b2 else 0) for x in X]
    return X,Y

fig = go.Figure()
fig.add_trace(
    go.Scatter(x=df.temp9am, y=df.temp3pm, mode='markers')
)
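for i in range(10):
    # Red lines: Uluru (x2 = 0), so the location coefficient is left out
    x, y = pull_sample(use_b2=False)
    fig.add_trace(
        go.Scatter(x=x, y=y, mode='lines', line_color='red')
    )

    # Blue lines: Wollongong (x2 = 1), so the location coefficient is added
    x, y = pull_sample(use_b2=True)
    fig.add_trace(
        go.Scatter(x=x, y=y, mode='lines', line_color='blue')
    )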

fig.update_layout(showlegend=False)
fig

## Adding an interaction term

Consider `humidity9am`:

In [143]:
px.scatter(df, x='humidity9am', y='temp3pm', color='location', trendline='ols')

Let $x_{1}$ be `temp9am`, $x_{2}$ the `location` indicator, and $x_{3}$ be `humidity9am`. Without interaction between `humidity` and `location`, our model would resemble:

$\mu = intercept+slope*x_{1}+location*x_{2}+humidity*x_{3}$

However, if `humidity` and `location` do interact, we can add an interaction term:

$\mu = intercept+slope*x_{1}+location*x_{2}+humidity*x_{3}+interaction_{loc,hum} * x_{2}x_{3}$
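The effect of the interaction term is easiest to see by asking how much $\mu$ changes per unit of humidity in each location. The sketch below follows the argument names of the `model` function (`x1` for `temp9am`, `x2` for the location indicator, `x3` for `humidity9am`). The coefficient values are hypothetical, chosen only for illustration, and the two helper functions are names introduced here.

```python
def mean_temp3pm(x1, x2, x3, b):
    """mu for temp9am x1, location indicator x2, and humidity9am x3."""
    return (b["intercept"] + b["slope"] * x1 + b["location"] * x2
            + b["humidity"] * x3 + b["interaction"] * x2 * x3)


def humidity_slope(x2, b):
    """Change in mu per unit of humidity at a given location indicator."""
    return b["humidity"] + b["interaction"] * x2


# Hypothetical coefficients, NOT the fitted posterior values.
b = {"intercept": 12.0, "slope": 0.8, "location": -6.0,
     "humidity": -0.05, "interaction": 0.03}

print("Uluru humidity slope:", humidity_slope(0, b))
print("Wollongong humidity slope:", humidity_slope(1, b))
```

When the interaction coefficient is zero the two slopes coincide and the model reduces to the version without the interaction term.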

In [144]:
def model(x1, x2, x3, y=None):
    intercept   = pyro.sample('intercept', dist.Normal(7.43, 3))
    slope       = pyro.sample('slope', dist.Normal(2, 3.1))
    location    = pyro.sample('location', dist.Normal(0, 2))
    humidity    = pyro.sample('humidity', dist.Normal(0, 100))
    interaction = pyro.sample('interaction', dist.Normal(0, 100))
    sigma       = pyro.sample('sigma', dist.Exponential(0.13))

    mu = intercept + slope * x1 + location * x2 + humidity * x3 + interaction * x2 * x3

    with pyro.plate('data', len(x1)):
        return pyro.sample('obs', dist.Normal(mu, sigma), obs=y)

In [147]:
x3_data = t.tensor(df.humidity9am.values, dtype=t.float)

pyro.clear_param_store()
mcmc = MCMC(NUTS(model), num_samples=1000, warmup_steps=200)
mcmc.run(x_data, x2_data, x3_data, y_data)
mcmc.summary()

Sample: 100%|██████████| 1200/1200 [02:09,  9.29it/s, step size=1.04e-01, acc. prob=0.936]


                   mean       std    median      5.0%     95.0%     n_eff     r_hat
     humidity     -0.03      0.01     -0.03     -0.05     -0.01    467.42      1.01
  interaction     -0.00      0.02     -0.00     -0.03      0.03    339.89      1.01
    intercept     13.01      0.94     13.05     11.38     14.40    366.39      1.01
     location     -6.10      1.06     -6.12     -7.86     -4.39    357.73      1.01
        sigma      2.33      0.12      2.32      2.14      2.53    852.96      1.00
        slope      0.83      0.03      0.83      0.78      0.88    432.16      1.00

Number of divergences: 0





In [165]:
sample_betas = pd.DataFrame(mcmc.get_samples())[['intercept', 'slope', 'location', 'humidity', 'interaction']]

def pull_sample():
    b0, b1, b2, b3, b4 = sample_betas.sample(1).values[0]
    y = [(b0+b1*x1+b2*x2+b3*x3+b4*x2*x3).item() for x1,x2,x3 in zip(x_data, x2_data, x3_data)]
    return y

fig = go.Figure()
fig.add_trace(
    go.Scatter(x=df.temp9am, y=df.temp3pm, mode='markers')
)
for i in range(10):
    y = pull_sample()
    fig.add_trace(
        go.Scatter(x=df.temp9am, y=y, marker_color='red', mode='markers')
    )

fig.update_layout(showlegend=False)
fig