This lecture uses Bayesian methods offered by [pymc](https://www.pymc.io/projects/docs/en/stable/) and [numpyro](https://num.pyro.ai/en/stable/) to make statistical inferences about two parameters of a univariate first-order autoregression.

The model is a good laboratory for illustrating consequences of alternative ways of modeling the distribution of the initial $y_0$:

- As a fixed number

- As a random variable drawn from the stationary distribution of the $\{y_t\}$ stochastic process

We want to study how inferences about the unknown parameters $(\rho, \sigma_x)$ depend on what is assumed about the parameters $\mu_0, \sigma_0$ of the distribution of $y_0$.

Below, we study two widely used alternative assumptions:

- $(\mu_0,\sigma_0) = (y_0, 0)$, which means that $y_0$ is drawn from the degenerate distribution ${\mathcal N}(y_0, 0)$; in effect, we are **conditioning on an observed initial value**.

- $\mu_0,\sigma_0$ are functions of $\rho, \sigma_x$ because $y_0$ is drawn from the stationary distribution implied by $\rho, \sigma_x$.

**Note:** We do **not** treat a third possible case in which $\mu_0,\sigma_0$ are free parameters to be estimated.

The unknown parameters are $\rho, \sigma_x$.

We have independent **prior probability distributions** for $\rho, \sigma_x$ and want to compute a posterior probability distribution after observing a sample $\{y_{t}\}_{t=0}^T$.

The notebook uses `pymc4` and `numpyro` to compute a posterior distribution of $\rho, \sigma_x$.

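As a preview of the machinery, here is a minimal `numpyro` sketch of independent priors combined with an AR(1) likelihood conditioned on $y_0$; the prior families (uniform on $\rho$, half-normal on $\sigma_x$) and the tuning constants are illustrative assumptions, not necessarily the notebook's exact choices:

```python
# A minimal sketch, not the notebook's exact code: independent priors on
# rho and sigma_x, with the AR(1) likelihood conditioned on the observed y_0.
import jax.random
import numpyro
import numpyro.distributions as dist
from numpyro.infer import MCMC, NUTS

def ar1_model(data):
    rho = numpyro.sample('rho', dist.Uniform(-1., 1.))    # assumed prior family
    sigma = numpyro.sample('sigma', dist.HalfNormal(2.))  # assumed prior family
    # Conditioning on y_0: each y_{t+1} | y_t is N(rho * y_t, sigma^2)
    numpyro.sample('obs', dist.Normal(rho * data[:-1], sigma), obs=data[1:])

# mcmc = MCMC(NUTS(ar1_model), num_warmup=1000, num_samples=2000)
# mcmc.run(jax.random.PRNGKey(0), data=y)  # y: observed sample {y_t}_{t=0}^T
```
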
Thus, we explore consequences of making these alternative assumptions about the distribution of $y_0$:

- A first procedure is to condition on whatever value of $y_0$ is observed. This amounts to assuming that the probability distribution of the random variable $y_0$ is a Dirac delta function that puts probability one on the observed value of $y_0$.

- A second procedure assumes that $y_0$ is drawn from the stationary distribution of a process described by {eq}`eq:themodel`, so that $y_0 \sim {\cal N} \left(0, {\sigma_x^2 \over 1-\rho^2} \right)$.

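To make the two procedures concrete, here is a `pymc` sketch showing how each assumption about $y_0$ enters the likelihood; `build_model` is a hypothetical helper name, and the priors are the same illustrative assumptions as above:

```python
# A sketch under assumed priors; `build_model` is a hypothetical helper,
# not the notebook's code. `y` is a NumPy array holding {y_t}_{t=0}^T.
import pymc as pm

def build_model(y, condition_on_y0=True):
    with pm.Model() as model:
        rho = pm.Uniform('rho', lower=-1., upper=1.)
        sigma = pm.HalfNormal('sigma', sigma=2.)
        if not condition_on_y0:
            # Second procedure: y_0 is a draw from the stationary
            # distribution N(0, sigma^2 / (1 - rho^2)) implied by rho, sigma.
            pm.Normal('y0', mu=0., sigma=sigma / pm.math.sqrt(1. - rho**2),
                      observed=y[0])
        # In both procedures, y_{t+1} | y_t ~ N(rho * y_t, sigma^2); under the
        # first procedure the observed y_0 enters only here, as a conditioning value.
        pm.Normal('obs', mu=rho * y[:-1], sigma=sigma, observed=y[1:])
    return model

# with build_model(y):                  # sample the posterior of (rho, sigma)
#     idata = pm.sample(2000, tune=1000)
```
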
When the initial value $y_0$ is far out in a tail of the stationary distribution, conditioning on an initial value gives a posterior that is **more accurate** in a sense that we'll explain.

Basically, when $y_0$ happens to be in a tail of the stationary distribution and we **don't condition on $y_0$**, the likelihood function for $\{y_t\}_{t=0}^T$ adjusts the posterior distribution of the parameter pair $\rho, \sigma_x$ to make the observed value of $y_0$ more likely than it really is under the stationary distribution, thereby adversely twisting the posterior in short samples.

An example below shows how not conditioning on $y_0$ adversely shifts the posterior probability distribution of $\rho$ toward larger values.

We begin by solving a **direct problem** that simulates an AR(1) process.

How we select the initial value $y_0$ matters.

* If we think $y_0$ is drawn from the stationary distribution ${\mathcal N}(0, \frac{\sigma_x^{2}}{1-\rho^2})$, then it is a good idea to use this distribution as $f(y_0)$. Why? Because $y_0$ contains information about $\rho, \sigma_x$.

* If we suspect that $y_0$ is far in the tails of the stationary distribution -- so that variation in early observations in the sample has a significant **transient component** -- it is better to condition on $y_0$ by setting $f(y_0) = 1$.

To illustrate the issue, we'll begin by choosing an initial $y_0$ that is far out in a tail of the stationary distribution.

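A minimal sketch of that direct problem follows; the parameter values $\rho = 0.5$, $\sigma_x = 1$, and the tail initial condition $y_0 = 10$ are illustrative assumptions, not necessarily the notebook's settings:

```python
# Direct problem: simulate {y_t}_{t=0}^T from y_{t+1} = rho * y_t + sigma * eps_{t+1}.
# With rho = 0.5 and sigma = 1, the stationary standard deviation is
# 1 / sqrt(1 - 0.25) ~= 1.15, so y_0 = 10 is far out in a tail.
import numpy as np

def ar1_simulate(rho=0.5, sigma=1., y0=10., T=100, seed=0):
    rng = np.random.default_rng(seed)
    y = np.empty(T + 1)
    y[0] = y0
    for t in range(T):
        y[t + 1] = rho * y[t] + sigma * rng.standard_normal()
    return y

y = ar1_simulate()  # the sample used in the inference experiments below
```
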
In effect, the likelihood function is telling `numpyro` to explain what it interprets as "explosive" observations.

Bayes' Law is able to generate a plausible likelihood for the first observation by driving $\rho \rightarrow 1$ and $\sigma \uparrow$ in order to raise the variance of the stationary distribution.

Our example illustrates the importance of what you assume about the distribution of initial conditions.