In [1]:
import numpy as np
import pandas as pd
from autogluon.timeseries import TimeSeriesDataFrame, TimeSeriesPredictor
from autogluon.timeseries.models import PatchTSTModel, TemporalFusionTransformerModel
from plotnine import *
from statsmodels.tsa.arima_process import ArmaProcess

When we write a function, we ought to test it to ensure that it works. When a function operates on data, one way to test it is to simulate data from a model and verify that the function returns what it should for the simulated data.

One of the simplest time series models is the AR(1) model, the autoregressive model of order 1. It is defined by the equation
$$
Y_t = \phi Y_{t - 1} + \epsilon_t, \ t \in \mathbb{Z},
$$
where the $\epsilon_t$'s are uncorrelated random variables with common mean zero and common variance $\sigma^2$, and $\epsilon_t$ is independent of $Y_{t - 1}, Y_{t - 2}, Y_{t - 3}, \ldots$. The $\epsilon_t$'s are called *innovations*. For a Gaussian AR(1) model, the innovations are $N(0, \sigma^2)$ random variables.

One desirable property of a time series model is *stationarity*. If the model is stationary, then the mean and variance of $Y_t$ don't depend on $t$. Also, the covariance between $Y_t$ and $Y_u$ depends on $t$ and $u$ only through $|t - u|$, so we can talk about *the* covariance at lag $\ell$, $\text{Cov}(Y_t, Y_{t + \ell})$, which doesn't depend on $t$. The covariances at the various lags are called *autocovariances*. It can be shown that the AR(1) model is stationary if and only if $|\phi| < 1$.
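As a rough illustration of what stationarity implies, the sketch below iterates the AR(1) recursion directly with the standard library and compares sample moments with their theoretical values. It assumes the standard stationary-variance result $\text{Var}(Y_t) = \sigma^2 / (1 - \phi^2)$, which follows from taking variances on both sides of the defining equation, and the lag-1 autocovariance $\phi \, \text{Var}(Y_t)$. The helper names `ar1_stationary_variance` and `simulate_ar1_recursion` are hypothetical and are not part of the exercise.

```python
import random
import statistics


def ar1_stationary_variance(phi: float, sigma: float) -> float:
    """Theoretical variance of a stationary AR(1) process: sigma^2 / (1 - phi^2)."""
    if abs(phi) >= 1:
        raise ValueError("AR(1) is stationary only when |phi| < 1")
    return sigma**2 / (1 - phi**2)


def simulate_ar1_recursion(phi: float, sigma: float, n: int, seed: int = 0) -> list[float]:
    """Draw n values by iterating Y_t = phi * Y_{t-1} + eps_t (hypothetical helper)."""
    rng = random.Random(seed)
    # Start from the stationary distribution so that no burn-in is needed.
    y = rng.gauss(0.0, ar1_stationary_variance(phi, sigma) ** 0.5)
    sample = []
    for _ in range(n):
        y = phi * y + rng.gauss(0.0, sigma)
        sample.append(y)
    return sample


phi, sigma = 0.5, 1.0  # illustrative values
sample = simulate_ar1_recursion(phi, sigma, n=200_000)

# Sample variance versus the theoretical stationary variance.
sample_var = statistics.pvariance(sample)
theory_var = ar1_stationary_variance(phi, sigma)

# Sample lag-1 autocovariance versus phi * Var(Y_t).
mean = statistics.fmean(sample)
lag1_cov = sum((a - mean) * (b - mean) for a, b in zip(sample, sample[1:])) / (len(sample) - 1)

print(f"variance: sample={sample_var:.3f}, theory={theory_var:.3f}")
print(f"lag-1 autocovariance: sample={lag1_cov:.3f}, theory={phi * theory_var:.3f}")
```

With a long sample the two pairs of numbers should agree closely, and the same agreement should hold for any stretch of the series, since neither quantity depends on $t$.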

Define a function `simulate_ar1` that draws a sample of size $n$ from a Gaussian AR(1) model with coefficient $\phi$ and innovation standard deviation $\sigma$.
- Check whether $|\phi| < 1$, and raise a `ValueError` if it isn't.
- Use `statsmodels.tsa.arima_process.ArmaProcess` to create an object representing the time series.
- Use the object's `generate_sample` method to generate a sample of size $n$.
- Return the sample in a `TimeSeriesDataFrame`.
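One detail in the steps above that is easy to get wrong is the sign convention: `ArmaProcess` expects lag-polynomial coefficients, including the leading 1, so the AR(1) coefficient enters with a minus sign. The sketch below shows only that step and the call to `generate_sample`, with illustrative parameter values that are not from the text. Wrapping the result in a `TimeSeriesDataFrame` and the `ValueError` check are left out.

```python
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess

phi, sigma, n = 0.6, 2.0, 500  # illustrative values, not from the text

# Y_t = phi * Y_{t-1} + eps_t corresponds to the AR lag polynomial 1 - phi * L,
# so the coefficients are ar=[1, -phi]; there is no MA part, hence ma=[1].
process = ArmaProcess(ar=np.array([1.0, -phi]), ma=np.array([1.0]))

# The process object can report whether the AR polynomial is stationary.
print(process.isstationary)

# By default generate_sample draws from NumPy's global random state,
# so seeding it makes the draw reproducible.
np.random.seed(0)
values = process.generate_sample(nsample=n, scale=sigma)
print(values.shape)
```

Here `scale` is the standard deviation of the innovations. The `burnin` argument of `generate_sample` can be used to discard initial values if the start-up transient matters.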

In [None]:
def simulate_ar1(phi: float, sigma: float, n: int) -> TimeSeriesDataFrame:
    pass

For a model of the form $Y = f(X) + \epsilon$, where $X$ and $\epsilon$ are independent, the *signal-to-noise ratio (SNR)* is defined as
$$
\frac{\text{Var}(f(X))}{\text{Var}(\epsilon)}.
$$
The fraction of the variance of $Y$ explained by the signal $f(X)$, which we'll call the FVE, is
$$
\frac{\text{Var}(f(X))}{\text{Var}(Y)} = \frac{\text{Var}(f(X))}{\text{Var}(f(X)) + \text{Var}(\epsilon)} = \frac{\text{SNR}}{\text{SNR} + 1}.
$$
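The relationship between SNR and FVE can be inverted: solving $\text{FVE} = \text{SNR} / (\text{SNR} + 1)$ for the SNR gives $\text{SNR} = \text{FVE} / (1 - \text{FVE})$. A minimal sketch of the two conversions follows; the function names `fve_from_snr` and `snr_from_fve` are hypothetical, and the input checks assume a finite, nonnegative SNR, so that the FVE lies in $[0, 1)$.

```python
def fve_from_snr(snr: float) -> float:
    """Fraction of variance explained for a given signal-to-noise ratio."""
    if snr < 0:
        raise ValueError("snr must be nonnegative")
    return snr / (snr + 1)


def snr_from_fve(fve: float) -> float:
    """Signal-to-noise ratio for a given fraction of variance explained."""
    if not 0 <= fve < 1:
        raise ValueError("fve must lie in [0, 1)")
    return fve / (1 - fve)


print(fve_from_snr(1.0))   # 0.5: signal and noise have equal variance
print(fve_from_snr(3.0))   # 0.75
print(snr_from_fve(0.75))  # 3.0
```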

For a stationary AR(1) model, derive expressions for the SNR and FVE, treating $\phi Y_{t - 1}$ as the signal $f(X)$ and $\epsilon_t$ as the noise. Then define a function `calc_phi_from_fve` that takes an FVE and returns the nonnegative $\phi$ that yields that FVE.

In [None]:
def calc_phi_from_fve(fve: float) -> float:
    pass