- 
          
 - 
                Notifications
    
You must be signed in to change notification settings  - Fork 124
 
Closed
Labels
Description
Summary:
We should talk about how to replace a normal over a data set with sufficient statistics.
Description:
These models induce the same posteriors over mu and sigma.  You can swap out the prior for something else.
normal.stan
data {
  int N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  y ~ normal(mu, sigma);
}normal-suff.stan
data {
  int N;
  vector[N] y;
}
transformed data {
  real mean_y = mean(y);
  real<lower=0> var_y = variance(y);
  real nm1_over2 = (N - 1) / 2.0;
  real sqrt_N = sqrt(N);
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  mean_y ~ normal(mu, sigma / sqrt_N);
  var_y ~ gamma(nm1_over2, nm1_over2 / sigma^2);
}And you can add priors for mu and sigma and still get the same results.  Here's some simulation code in Python using cmdstanpy:
import numpy as np
import cmdstanpy as csp
N = 1000
mu = -1.3
sigma = 3.2
y = np.random.normal(size=1000, loc=mu, scale=sigma)
data = {'N': N, 'y': y}
for path in ['normal.stan', 'normal-suff.stan']:
    print(f"{path=}")
    model = csp.CmdStanModel(stan_file = path)
    fit = model.sample(data = data)
    print(fit.summary())Additional Information:
This originated from a forum discussion:
Current Version:
v2.18.0