Merged
1 change: 1 addition & 0 deletions environment.yml
@@ -14,6 +14,7 @@ dependencies:
- ghp-import==1.1.0
- sphinxcontrib-youtube==1.1.0
- sphinx-togglebutton==0.3.1
- arviz==0.12.1
# Sandpit Requirements
- quantecon
- array-to-latex
8 changes: 3 additions & 5 deletions lectures/ar1_bayes.md
@@ -11,9 +11,7 @@ kernelspec:
name: python3
---

# Posterior Distributions for AR(1) Parameters

We'll begin with some Python imports.

@@ -174,7 +172,7 @@

Now we shall use Bayes' law to construct a posterior distribution, conditioning …

First we'll use **pymc4**.

## `PyMC` Implementation

For a normal distribution in `pymc`, $\textrm{var} = 1/\tau = \sigma^{2}$.
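
As a quick illustration of the two parameterizations (the numbers here are ours, not the lecture's):

```{code-cell} ipython3
import pymc as pm

sigma = 2.0
with pm.Model():
    # Both declare the same N(0, sigma^2) distribution: `sigma` is the
    # standard deviation, `tau` the precision 1 / sigma**2.
    x1 = pm.Normal('x1', mu=0.0, sigma=sigma)
    x2 = pm.Normal('x2', mu=0.0, tau=1 / sigma**2)
```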

@@ -286,7 +284,7 @@

We'll return to this issue after we use `numpyro` to compute posteriors under our …

We'll now repeat the calculations using `numpyro`.

## `Numpyro` Implementation

```{code-cell} ipython3
# (cell contents elided in this diff)
```
89 changes: 39 additions & 50 deletions lectures/ar1_turningpts.md
@@ -13,6 +13,12 @@ kernelspec:

# Forecasting an AR(1) process

```{code-cell} ipython3
:tags: [hide-output]

!pip install arviz pymc
```

This lecture describes methods for forecasting statistics that are functions of future values of a univariate autoregressive process.

The methods are designed to take into account two possible sources of uncertainty about these statistics:

@@ -25,7 +31,7 @@

We consider two sorts of statistics:

- prospective values $y_{t+j}$ of a random process $\{y_t\}$ that is governed by the AR(1) process

- sample path properties that are defined as non-linear functions of future values $\{y_{t+j}\}_{j \geq 1}$ at time $t$.

**Sample path properties** are things like "time to next turning point" or "time to next recession".

@@ -54,54 +60,48 @@

```{code-cell} ipython3
logger = logging.getLogger('pymc')
logger.setLevel(logging.CRITICAL)
```

## A Univariate First-Order Autoregressive Process

Consider the univariate AR(1) model:

$$
y_{t+1} = \rho y_t + \sigma \epsilon_{t+1}, \quad t \geq 0
$$ (ar1-tp-eq1)

where the scalars $\rho$ and $\sigma$ satisfy $|\rho| < 1$ and $\sigma > 0$;
$\{\epsilon_{t+1}\}$ is a sequence of i.i.d. normal random variables with mean $0$ and variance $1$.

The initial condition $y_{0}$ is a known number.

Equation {eq}`ar1-tp-eq1` implies that for $t \geq 0$, the conditional density of $y_{t+1}$ is

$$
f(y_{t+1} | y_{t}; \rho, \sigma) \sim {\mathcal N}(\rho y_{t}, \sigma^2)
$$ (ar1-tp-eq2)

Further, equation {eq}`ar1-tp-eq1` also implies that for $t \geq 0$, the conditional density of $y_{t+j}$ for $j \geq 1$ is

$$
f(y_{t+j} | y_{t}; \rho, \sigma) \sim {\mathcal N}\left(\rho^j y_{t}, \sigma^2 \frac{1 - \rho^{2j}}{1 - \rho^2} \right)
$$ (ar1-tp-eq3)


The predictive distribution {eq}`ar1-tp-eq3` assumes that the parameters $\rho, \sigma$ are known, which we express by conditioning on them.
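
In code, the known-parameters density {eq}`ar1-tp-eq3` amounts to two one-line formulas; here is a minimal sketch (the helper names are ours):

```{code-cell} ipython3
import numpy as np
from scipy.stats import norm

def conditional_moments(rho, sigma, y_t, j):
    "Mean and standard deviation of y_{t+j} given y_t with known (rho, sigma)."
    mean = rho**j * y_t
    var = sigma**2 * (1 - rho**(2 * j)) / (1 - rho**2)
    return mean, np.sqrt(var)

def coverage_interval(rho, sigma, y_t, j, level=0.9):
    "Symmetric coverage interval for y_{t+j} implied by the normal density."
    mean, std = conditional_moments(rho, sigma, y_t, j)
    z = norm.ppf(0.5 + level / 2)
    return mean - z * std, mean + z * std
```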

We also want to compute a predictive distribution that does not condition on $\rho,\sigma$ but instead takes account of our uncertainty about them.

We form this predictive distribution by integrating {eq}`ar1-tp-eq3` with respect to a joint posterior distribution $\pi_t(\rho,\sigma | y^t)$
that conditions on an observed history $y^t = \{y_s\}_{s=0}^t$:

$$
f(y_{t+j} | y^t) = \int f(y_{t+j} | y_{t}; \rho, \sigma) \pi_t(\rho,\sigma | y^t ) d \rho d \sigma
$$ (ar1-tp-eq4)

Predictive distribution {eq}`ar1-tp-eq3` assumes that parameters $(\rho,\sigma)$ are known.

Predictive distribution {eq}`ar1-tp-eq4` assumes that parameters $(\rho,\sigma)$ are uncertain, but have a known probability distribution $\pi_t(\rho,\sigma | y^t)$.

We also want to compute some predictive distributions of "sample path statistics" that might include, for example:

- the time until the next "recession",
- the minimum value of $Y$ over the next 8 periods,

@@ -115,20 +115,16 @@

To accomplish that for situations in which we are uncertain about parameter values …
- for each draw $n=0,1,...,N$, simulate a "future path" of length $T_1$ with parameters $\left(\rho_n,\sigma_n\right)$ and compute our three "sample path statistics";
- finally, plot the desired statistics from the $N$ samples as an empirical distribution (a sketch of this loop follows the list).
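
A compressed sketch of this loop (the helper name is ours; `rhos` and `sigmas` stand for the $N$ posterior draws):

```{code-cell} ipython3
import numpy as np

def predictive_statistic_draws(rhos, sigmas, y_t, T1, statistic):
    """
    For each posterior draw (rho_n, sigma_n), simulate a future path of
    length T1 starting from y_t and evaluate `statistic` on it; the N
    values returned form an empirical predictive distribution.
    """
    stats = np.empty(len(rhos))
    for n, (rho, sigma) in enumerate(zip(rhos, sigmas)):
        path = np.empty(T1)
        y = y_t
        for j in range(T1):
            y = rho * y + sigma * np.random.normal()
            path[j] = y
        stats[n] = statistic(path)
    return stats
```

For instance, `statistic=lambda path: min(path[:8])` yields draws of the 8-period minimum.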

## Implementation

First, we'll simulate a sample path from which to launch our forecasts.

In addition to plotting the sample path, under the assumption that the true parameter values are known, we'll plot $.9$ and $.95$ coverage intervals using the conditional distribution {eq}`ar1-tp-eq3` described above.

We'll also plot a bunch of samples of sequences of future values and watch where they fall relative to the coverage interval.

```{code-cell} ipython3
def AR1_simulate(rho, sigma, y0, T):
    # Body elided in the diff; a minimal reconstruction of the simulator
    y = np.empty(T)
    y[0] = y0
    for t in range(1, T):
        y[t] = rho * y[t-1] + sigma * np.random.normal()
    return y
```

@@ -198,30 +194,30 @@

Wecker {cite}`wecker1979predicting` proposed using simulation techniques to characterize …

He called these functions "path properties" to contrast them with properties of single data points.

He studied two special prospective path properties of a given series $\{y_t\}$.

The first was **time until the next turning point**

* he defined a **"turning point"** to be the date of the second of two successive declines in $y$.

To examine this statistic, let $Z$ be an indicator process

$$
Z_t(Y(\omega)) :=
\begin{cases}
1 & \text{if } Y_t(\omega) < Y_{t-1}(\omega) < Y_{t-2}(\omega) \geq Y_{t-3}(\omega) \\
0 & \text{otherwise}
\end{cases}
$$

Then the random variable **time until the next turning point** is defined as the following **stopping time** with respect to $Z$:

$$
W_t(\omega):= \inf \{ k\geq 1 \mid Z_{t+k}(\omega) = 1\}
$$
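
In code, $W_t$ is a first-passage calculation over the indicator $Z$; here is a minimal sketch under the definition above (the helper name is ours):

```{code-cell} ipython3
def time_to_next_turning_point(path):
    """
    W_t: the smallest k >= 1 with Z_{t+k} = 1, where `path` stacks
    y_{t-3}, ..., y_{t+T} so that index 3 corresponds to date t.
    """
    for i in range(4, len(path)):
        # Z = 1: two successive declines preceded by a non-decline
        if path[i] < path[i-1] < path[i-2] >= path[i-3]:
            return i - 3
    return float('inf')  # no turning point within the simulated horizon
```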

Wecker {cite}`wecker1979predicting` also studied **the minimum value of $Y$ over the next 8 quarters**, which can be defined as the random variable

$$
M_t(\omega) := \min \{ Y_{t+1}(\omega); Y_{t+2}(\omega); \dots; Y_{t+8}(\omega)\}
Expand All @@ -230,36 +226,34 @@ $$
It is interesting to study yet another possible concept of a **turning point**.

Thus, let

$$
T_t(Y(\omega)) :=
\begin{cases}
1 & \text{if } Y_{t-2}(\omega) > Y_{t-1}(\omega) > Y_{t}(\omega) \ \text{and} \ Y_{t}(\omega) < Y_{t+1}(\omega) < Y_{t+2}(\omega) \\
-1 & \text{if } Y_{t-2}(\omega) < Y_{t-1}(\omega) < Y_{t}(\omega) \ \text{and} \ Y_{t}(\omega) > Y_{t+1}(\omega) > Y_{t+2}(\omega) \\
0 & \text{otherwise}
\end{cases}
$$

Define a **positive turning point today or tomorrow** statistic as

$$
P_t(\omega) :=
\begin{cases}
1 & \text{if } T_t(\omega)=1 \ \text{or} \ T_{t+1}(\omega)=1 \\
0 & \text{otherwise}
\end{cases}
$$

This is designed to express the event

- ``after one or two decrease(s), $Y$ will grow for two consecutive quarters''

A **negative turning point today or tomorrow** statistic $N_t$ can be defined symmetrically, replacing $T_t(\omega) = 1$ with $T_t(\omega) = -1$.

Following {cite}`wecker1979predicting`, we can use simulations to calculate probabilities of $P_t$ and $N_t$ for each period $t$.
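
With a batch of simulated future paths in hand, each such probability is just a sample fraction; a one-line sketch (the helper name is ours):

```{code-cell} ipython3
def empirical_probability(indicator, paths):
    "Fraction of simulated paths on which a 0/1 path statistic equals one."
    return sum(indicator(path) for path in paths) / len(paths)
```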

## A Wecker-Like Algorithm


The procedure consists of the following steps:

* index a sample path by $\omega_i$
Expand All @@ -270,11 +264,9 @@ $$
Y(\omega_i) = \left\{ Y_{t+1}(\omega_i), Y_{t+2}(\omega_i), \dots, Y_{t+N}(\omega_i)\right\}_{i=1}^I
$$

* for each path $\omega_i$, compute the associated value of $W_t(\omega_i), W_{t+1}(\omega_i), \dots$

* consider the sets $\{W_t(\omega_i)\}_{i=1}^{I}, \ \{W_{t+1}(\omega_i)\}_{i=1}^{I}, \ \dots, \ \{W_{t+N}(\omega_i)\}_{i=1}^{I}$ as samples from the predictive distributions $f(W_{t+1} \mid y_t, y_{t-1}, \dots)$, $f(W_{t+2} \mid y_t, y_{t-1}, \dots)$, $\dots$, $f(W_{t+N} \mid y_t, y_{t-1}, \dots)$.


## Using Simulations to Approximate a Posterior Distribution
Expand All @@ -283,7 +275,6 @@ The next code cells use `pymc` to compute the time $t$ posterior distribution of

Note that in defining the likelihood function, we choose to condition on the initial value $y_0$.
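
The model-definition cell below is elided in the diff; as a hedged sketch, a `pymc` model of this kind might look as follows (the priors and names here are our assumptions, not necessarily the lecture's):

```{code-cell} ipython3
import pymc as pm

def ar1_posterior_sketch(y, draws=1000):
    """
    Because we condition on y[0], the likelihood is built only from the
    transition densities f(y[t+1] | y[t]; rho, sigma); y[0] enters as data.
    """
    with pm.Model():
        rho = pm.Uniform('rho', lower=-1.0, upper=1.0)   # assumed prior
        sigma = pm.HalfNormal('sigma', sigma=10.0)       # assumed prior
        pm.Normal('obs', mu=rho * y[:-1], sigma=sigma, observed=y[1:])
        return pm.sample(draws, progressbar=False)
```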


```{code-cell} ipython3
def draw_from_posterior(sample):
"""
Expand Down Expand Up @@ -328,7 +319,6 @@ The graphs on the left portray posterior marginal distributions.

## Calculating Sample Path Statistics


Our next step is to prepare Python code to compute our sample path statistics.
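
The statistics cell itself is elided in the diff; sketches of two of the statistics might look like this ($W_t$ was sketched earlier; the names are ours):

```{code-cell} ipython3
def minimum_value_8(path):
    "M_t: the minimum of y over the next 8 periods of `path`."
    return min(path[:8])

def positive_turning_point_today_or_tomorrow(path):
    """
    P_t: 1 if T_t = 1 or T_{t+1} = 1; `path` stacks y_{t-2}, ..., y_{t+3}
    so that index 2 corresponds to date t.
    """
    for i in (2, 3):  # dates t and t+1
        # T = 1: two declines into date i, then two consecutive increases
        if path[i-2] > path[i-1] > path[i] < path[i+1] < path[i+2]:
            return 1
    return 0
```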

@@ -455,9 +445,9 @@

## Extended Wecker Method

Now we apply our "extended" Wecker method, based on predictive densities of $y$ defined by {eq}`ar1-tp-eq4` that acknowledge posterior uncertainty in the parameters $\rho, \sigma$.

To approximate the integration on the right side of {eq}`ar1-tp-eq4`, we repeatedly draw parameters from the joint posterior distribution each time we simulate a sequence of future values from model {eq}`ar1-tp-eq1`.

```{code-cell} ipython3
def plot_extended_Wecker(post_samples, initial_path, N, ax):
    # (function body elided in this hunk)
    ...
```

@@ -525,4 +515,3 @@

```{code-cell} ipython3
plot_extended_Wecker(post_samples, initial_path, 1000, ax)
plt.legend()
plt.show()
```

6 changes: 2 additions & 4 deletions lectures/bayes_nonconj.md
@@ -888,7 +888,7 @@

```{code-cell} ipython3
SVI_num_steps = 5000
true_theta = 0.8
```

### Beta Prior and Posteriors

Let's compare outcomes when we use a Beta prior.

@@ -944,7 +944,7 @@

Here the MCMC approximation looks good.

But the VI approximation doesn't look so good.

* even though we use the beta distribution as our guide, the VI-approximated posterior distributions do not closely resemble the posteriors that we had just computed analytically.

(Here, our initial parameters for the Beta guide are $(0.5, 0.5)$.)
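
For concreteness, here is a minimal sketch of a Beta-guide SVI setup in `numpyro` (the model, prior, step size, and toy data are our assumptions, not the lecture's exact code):

```{code-cell} ipython3
import numpyro
import numpyro.distributions as dist
from numpyro.distributions import constraints
from numpyro.infer import SVI, Trace_ELBO
from jax import random
import jax.numpy as jnp

def model(data):
    theta = numpyro.sample('theta', dist.Beta(1.0, 1.0))  # assumed prior
    with numpyro.plate('N', data.shape[0]):
        numpyro.sample('obs', dist.Bernoulli(theta), obs=data)

def beta_guide(data):
    # Variational parameters, initialized at (0.5, 0.5) as in the text
    alpha = numpyro.param('alpha', 0.5, constraint=constraints.positive)
    beta = numpyro.param('beta', 0.5, constraint=constraints.positive)
    numpyro.sample('theta', dist.Beta(alpha, beta))

svi = SVI(model, beta_guide, numpyro.optim.Adam(0.05), Trace_ELBO())
result = svi.run(random.PRNGKey(0), 2000, jnp.ones(10))  # toy data: ten "successes"
```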

Expand All @@ -960,8 +960,6 @@ BayesianInferencePlot(true_theta, num_list, BETA_numpyro).SVI_plot(guide_dist='b
```

## Non-conjugate Prior Distributions

Having assured ourselves that our MCMC and VI methods can work well when we have a conjugate prior (and so can also compute the posterior analytically), we …