<big> Time Series Analysis Code Sample </big>

**Author:** Liam Mason

**Note:** *This is code from my Applied Time Series Analysis (ATSA) course. Originally, I wrote this code in RStudio. While I have less experience with Python than R and Stata, I wanted to demonstrate my ability to use Python for statistical analysis. Therefore, I took one of my earlier problem sets in the ATSA class and performed the assignment in Python. More of my comments are below.*

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.graphics.tsaplots import plot_acf

Problem 2.5: Simulating a Random Walk Time Series

*In this problem, I generate a random walk of n = 1000 steps whose increments are independent normal draws with mean 1 and standard deviation 10, starting from a_0 = -2.3. I calculate the mean and variance of the simulated path and plot it below. I then run 500 simulations of the random walk, compute the sample mean and variance of the final value S(n) across simulations, and compare them to the theoretical mean and variance of S(n).*

In [None]:
np.random.seed(2645)
a0 = -2.3
n = 1000
mu = 1
sigma = 10
Xi = np.random.normal(mu, sigma, n)

S = np.zeros(n + 1)
S[0] = a0
S[1: ] = a0 + np.cumsum(Xi)

Sn = S[-1]
print("The value of S(n) is:", Sn)

mean_S = np.mean(S)
var_S = np.var(S)
print("The mean of the random walk series is", mean_S)
print("The variance of the random walk series is", var_S)

plt.figure(figsize=(10, 4))
plt.plot(range(0, n + 1), S, linestyle = '-', color='blue')
plt.xlabel("Step")
plt.ylabel("S(t)")
plt.title("Random Walk Simulation")
plt.grid(True)
plt.show()

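num_simulations = 500
Sn_values = np.empty(num_simulations)
for sim in range(num_simulations):
    # Draw a fresh set of increments and build the walk with a cumulative sum,
    # exactly as in the single-path simulation above.
    Xi = np.random.normal(mu, sigma, n)
    S = np.zeros(n + 1)
    S[0] = a0
    S[1:] = a0 + np.cumsum(Xi)
    # Keep only the final value S(n) of this simulated walk.
    Sn_values[sim] = S[-1]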

sample_mean = np.mean(Sn_values)
sample_var = np.var(Sn_values)
print("After 500 simulations, the average value of S(n) is:", sample_mean)
print("After 500 simulations, the variance of S(n) is:", sample_var)

theoretical_mean = a0 + n * mu
theoretical_var = n * (sigma ** 2)
print("Theoretical mean of S(n):", theoretical_mean)
print("Theoretical variance of S(n):", theoretical_var)
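# With 500 simulations, the sample mean and variance of S(n) are quite close to the theoretical values.
# With more simulations (e.g. num_simulations = 1000), they would tend to be closer still, since the
# standard error of the sample mean shrinks in proportion to 1 / sqrt(num_simulations).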
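
The comparison above can also be sketched in vectorized form. This is a minimal illustration rather than part of the original assignment: the helper name `simulate_endpoints`, its `seed` argument, and the choice of 500 versus 5000 simulations are my own assumptions. Because S(n) = a_0 + X_1 + ... + X_n with independent increments, its mean is a_0 + n * mu and its variance is n * sigma^2, and the sketch prints how far the sample mean lands from that theoretical mean relative to its standard error.

```python
import numpy as np


def simulate_endpoints(a0, mu, sigma, n, num_simulations, seed=0):
    """Hypothetical helper: return S(n) for many independent random walks."""
    rng = np.random.default_rng(seed)
    # Each row holds the n increments of one simulated walk.
    steps = rng.normal(mu, sigma, size=(num_simulations, n))
    # S(n) is the starting point plus the sum of all increments.
    return a0 + steps.sum(axis=1)


a0, mu, sigma, n = -2.3, 1, 10, 1000
theoretical_mean = a0 + n * mu
theoretical_var = n * sigma ** 2

for num_simulations in (500, 5000):
    endpoints = simulate_endpoints(a0, mu, sigma, n, num_simulations)
    # Standard error of the sample mean of S(n) across simulations.
    se_mean = np.sqrt(theoretical_var / num_simulations)
    print(f"{num_simulations} simulations:")
    print("  sample mean:", endpoints.mean(), "(theoretical:", theoretical_mean, ")")
    print("  sample variance:", endpoints.var(ddof=1), "(theoretical:", theoretical_var, ")")
    print("  standard error of the sample mean:", se_mean)
```

Drawing all increments as one array avoids the Python-level loop. It uses `num_simulations * n` floats of memory, which is small at these sizes.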


Problem 2.6: Calculating a Moving Average of White Noise

*In this problem, I create a moving average process of the form $X_t = W_{t-1} + 2W_t + W_{t+1}$, where the $W_t$ are independent with zero mean and variance $\sigma_W^2$. I print the sample autocorrelations, plot the sample autocorrelation function (acf), and compare the sample values to the theoretical acf values.*

In [None]:
np.random.seed(2625)
# I generate 502 white-noise observations. The three-term moving average cannot be
# computed for the first and last values, so dropping them (mode="valid" below)
# leaves 500 observations with no missing values.
w = np.random.normal(0.0, 1.0, 502)

smoother = np.array([1, 2, 1])  # moving-average weights
ma_series = np.convolve(w, smoother, mode="valid")
nlags = 26
acf_sample = sm.tsa.stattools.acf(ma_series, nlags=nlags)

print("Autocorrelations of series 'ma_series', by lag")
for lag, val in enumerate(acf_sample):
    print(f"{lag:2d}: {val: .3f}")

plot_acf(ma_series, lags=nlags, zero=True)
plt.show()
# It can be shown that for different values of lag k, the theoretical acf values are:
# k = 0: rho = 1
# k = 1: rho = 2/3
# k = 2: rho = 1/6
# k > 2: rho = 0
# The sample acf values closely match the theoretical acf values (e.g. 1 vs 1 at lag 0, 0.659 vs 0.667 at lag 1, etc.).
# Once again, for a larger sample size, the sample acf values would tend to be even closer to the theoretical acf values.
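
The "it can be shown" step can be checked numerically. For a moving average of independent, equal-variance noise with weights c_0, ..., c_q, the autocovariance at lag k is sigma_W^2 times the sum over j of c_j * c_{j+k}, and it is zero for k > q. The sketch below applies this to the weights (1, 2, 1) from the `smoother` array. The function name `ma_theoretical_acf` is my own and is not a statsmodels function.

```python
import numpy as np


def ma_theoretical_acf(coeffs, max_lag):
    """Hypothetical helper: theoretical acf of a moving average of white noise."""
    c = np.asarray(coeffs, dtype=float)
    q = len(c) - 1
    gamma = np.zeros(max_lag + 1)
    for k in range(max_lag + 1):
        if k <= q:
            # gamma(k) / sigma_W^2 = sum_j c_j * c_{j+k}
            gamma[k] = np.sum(c[: len(c) - k] * c[k:])
    # sigma_W^2 cancels when dividing by gamma(0).
    return gamma / gamma[0]


rho = ma_theoretical_acf([1, 2, 1], max_lag=4)
for lag, val in enumerate(rho):
    print(f"{lag:2d}: {val: .3f}")
```

With these weights, gamma(0), gamma(1), and gamma(2) are proportional to 6, 4, and 1, which gives the values 1, 2/3, and 1/6 listed above.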
