# Bayesian linear regression with pystan

We will now conduct linear regression in a Bayesian framework. We use the [Stan probabilistic programming language](https://mc-stan.org/), which supports full Bayesian statistical inference. A Stan model is a block of text that can be written either in a separate file or inline in the calling script. A model defined in its own file can then be called from any language that has a Stan interface, such as R, Python or Julia.

This tutorial uses the same case study as the OLR tutorials with Python and R. Here, Stan is used with Python.

In the previous OLR tutorial, we selected a linear model with three predictors: $(T_i-T_e)$, $I_{sol}$ and $(T_i-T_s)$. The model can be written in probabilistic form: each data point $e_{hp,n}$ is normally distributed around the linear combination of the predictors, with a constant noise standard deviation $\sigma$:

$$ e_{hp,n} \sim N( \theta_1 (T_i-T_e)_n + \theta_2 I_{sol,n} + \theta_3 (T_i-T_s)_n, \sigma) $$
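
To make the probabilistic form concrete, the following minimal sketch evaluates the log-likelihood implied by the expression above on synthetic data. Everything in it is an assumption made for illustration: the helper name `normal_loglik`, the random predictor matrix, and the "true" parameter values are not taken from the case study.

```python
import numpy as np

def normal_loglik(theta, sigma, x, y):
    """Log-likelihood of y when y_n ~ N(x_n . theta, sigma), sigma being a standard deviation."""
    resid = y - x @ theta
    n = y.size
    return -0.5 * n * np.log(2 * np.pi * sigma**2) - 0.5 * np.sum(resid**2) / sigma**2

# Synthetic stand-ins for the three predictors (hypothetical data, not the case study)
rng = np.random.default_rng(0)
N = 200
x = rng.normal(size=(N, 3))
theta_true = np.array([1.5, -0.5, 2.0])  # arbitrary illustrative coefficients
sigma_true = 0.3                         # arbitrary illustrative noise level
y = x @ theta_true + rng.normal(scale=sigma_true, size=N)

# The generating parameters should explain the data better than all-zero coefficients
ll_true = normal_loglik(theta_true, sigma_true, x, y)
ll_zero = normal_loglik(np.zeros(3), sigma_true, x, y)
print(ll_true > ll_zero)
```

Bayesian inference combines this likelihood with priors on $\theta$ and $\sigma$ to obtain a posterior distribution, which Stan explores by sampling.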

In [1]:
# pandas for data handling; PyStan 3 is imported as `stan`
import pandas as pd
import stan

# Load the data file and use the parsed timestamps as the index
df = pd.read_csv('data/linearregression.csv')
df.set_index(pd.to_datetime(df['TIMESTAMP']), inplace=True, drop=True)

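# Candidate predictors built from temperature differences
# (only 'tite' and 'tits' are used in the model below)
df['tits'] = df['ti'] - df['ts']
df['vtite'] = df['wind_speed'] * (df['ti'] - df['te'])
df['tite'] = df['ti'] - df['te']
df['titg'] = df['ti'] - df['tg']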

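# Stan model: no priors are declared, so Stan falls back on its default
# (improper uniform) priors over the support of theta and sigma
lr_model = """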
data {
  int<lower=0> N;   // number of data items
  int<lower=0> K;   // number of predictors
  matrix[N, K] x;   // predictor matrix
  vector[N] y;      // outcome vector
}
parameters {
  vector[K] theta;       // coefficients for predictors
  real<lower=0> sigma;  // error scale
}
model {
  y ~ normal(x * theta, sigma);  // likelihood
}
"""

lr_data = {'N': len(df),
            'K': 3,
            'x': df[['tite', 'i_sol', 'tits']].values,
            'y': df['e_hp'].values}

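# Compile the model with its data, then draw posterior samples
posterior = stan.build(lr_model, data=lr_data)
fit = posterior.sample(num_chains=2, num_samples=1000, cores=1)
# One row per posterior draw, one column per parameter
df_post = fit.to_frame()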

print(fit)
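
Once the draws are available as a DataFrame such as `df_post`, a compact summary can be computed with pandas alone. The sketch below is a hedged illustration: `summarize_draws` is a hypothetical helper, the column names `theta.1`, `theta.2`, `theta.3` and `sigma` are an assumption about how the draws are labelled (check `df_post.columns`), and the draws are synthetic so that the example runs without Stan.

```python
import numpy as np
import pandas as pd

def summarize_draws(draws, columns):
    """Posterior mean, standard deviation and 95% interval for the selected columns."""
    sub = draws[columns]
    return pd.DataFrame({
        'mean': sub.mean(),
        'sd': sub.std(),
        'q2.5': sub.quantile(0.025),
        'q97.5': sub.quantile(0.975),
    })

# Synthetic stand-in for df_post (hypothetical values, not results from the case study)
rng = np.random.default_rng(1)
fake_post = pd.DataFrame({
    'theta.1': rng.normal(1.0, 0.1, 2000),
    'theta.2': rng.normal(-0.5, 0.05, 2000),
    'theta.3': rng.normal(2.0, 0.2, 2000),
    'sigma': np.abs(rng.normal(0.5, 0.05, 2000)),
})

summary = summarize_draws(fake_post, ['theta.1', 'theta.2', 'theta.3', 'sigma'])
print(summary)
```

With real output, the same call would be `summarize_draws(df_post, [...])` using whichever parameter columns `df_post` actually contains.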

