# Time Series Analysis Tutorials

This notebook demonstrates simple time series analysis techniques. We will work with both a small synthetic dataset and the real-world atmospheric CO2 dataset that ships with `statsmodels`.
Each section introduces one operation, so you can see step by step how common time series tools are used.


Run the cell below if you need to install the required packages.
These include `pandas` for data manipulation, `matplotlib` for plotting and `statsmodels` for forecasting utilities.
On Google Colab the packages are usually preinstalled, but on your local machine you may need to execute this cell first.

In [None]:
!pip install pandas matplotlib statsmodels

## Simple and exponential moving averages

1. Create a daily date range starting at 2020-01-01 for 200 days.
2. Generate a sine wave and add some random noise to simulate observations.
3. Put the data into a `pandas.Series` so we can use rolling-window functions.
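4. Compute 20-day and 50-day simple moving averages with `.rolling().mean()`; the first `window - 1` values are `NaN` because no full window is available yet.
5. Also compute an exponential moving average with a 20-day span using `.ewm(span=20, adjust=False).mean()`.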
6. Plot everything to see how the moving averages smooth the noisy signal.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

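# 200 daily timestamps starting at 2020-01-01
dates = pd.date_range(start='2020-01-01', periods=200, freq='D')

# Four sine cycles plus Gaussian noise; the seeded generator makes the run reproducible
noise_rng = np.random.default_rng(42)
data = np.sin(np.linspace(0, 8*np.pi, len(dates))) + noise_rng.normal(scale=0.5, size=len(dates))
series = pd.Series(data, index=dates)

# Rolling means are NaN until a full window is available
sma20 = series.rolling(window=20).mean()
sma50 = series.rolling(window=50).mean()
# adjust=False uses the recursive form with alpha = 2 / (span + 1)
ema20 = series.ewm(span=20, adjust=False).mean()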

plt.figure(figsize=(10,6))
plt.subplot(2,1,1)
plt.plot(series, label='Noisy series', alpha=0.6)
plt.plot(sma20, label='20-day SMA')
plt.plot(sma50, label='50-day SMA')
plt.title('Simple Moving Averages')
plt.legend()

plt.subplot(2,1,2)
plt.plot(series, label='Noisy series', alpha=0.6)
plt.plot(ema20, label='20-day EMA', color='tab:orange')
plt.title('Exponential Moving Average')
plt.legend()

plt.tight_layout()
plt.show()
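To make the two smoothers less of a black box, the sketch below recomputes them by hand on a tiny series and compares the result with pandas. The six-value series and the window and span of 3 are illustrative assumptions, not values used elsewhere in this notebook.

```python
import numpy as np
import pandas as pd

# Tiny illustrative series (hypothetical values, chosen for easy checking)
values = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Simple moving average: plain mean of the last `window` observations
window = 3
sma_manual = [np.nan] * (window - 1) + [
    values.iloc[i - window + 1 : i + 1].mean()
    for i in range(window - 1, len(values))
]
sma_pandas = values.rolling(window=window).mean()

# Exponential moving average with adjust=False:
#   y[0] = x[0]
#   y[t] = alpha * x[t] + (1 - alpha) * y[t-1],  alpha = 2 / (span + 1)
span = 3
alpha = 2 / (span + 1)
ema_manual = [values.iloc[0]]
for x in values.iloc[1:]:
    ema_manual.append(alpha * x + (1 - alpha) * ema_manual[-1])
ema_pandas = values.ewm(span=span, adjust=False).mean()

print(np.allclose(sma_manual, sma_pandas, equal_nan=True))
print(np.allclose(ema_manual, ema_pandas))
```

The simple average weights every point in the window equally, while the exponential average keeps a geometrically decaying memory of all past points, which is why it reacts faster to recent changes.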


## ARIMA forecasting

The next example uses the atmospheric CO2 dataset that ships with `statsmodels`.
We resample the weekly observations to monthly averages and forward-fill any missing values.
The final two years (24 months) are held out as test data while the rest is used for training.
An ARIMA(1,1,1) model is then fit to the training portion.
After fitting we forecast the 24 held-out months and plot the predictions along with a 95% confidence interval.
The gray shaded region highlights the forecast horizon.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.tsa.arima.model import ARIMA

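# Load the weekly CO2 measurements bundled with statsmodels
co2 = sm.datasets.co2.load_pandas().data
# Resample to month-start averages and forward-fill any months left without a value
co2 = co2['co2'].resample('MS').mean().ffill()

# Hold out the final 24 months for comparison with the forecast
train = co2.iloc[:-24]
test = co2.iloc[-24:]

# Fit ARIMA(p=1, d=1, q=1) on the training data only
model = ARIMA(train, order=(1,1,1))
model_fit = model.fit()

# Forecast the 24 held-out months; conf_int() defaults to a 95% interval
pred = model_fit.get_forecast(steps=24)
pred_mean = pred.predicted_mean
pred_ci = pred.conf_int()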

plt.figure(figsize=(10,6))
plt.plot(co2, label='Observed')
plt.plot(pred_mean.index, pred_mean, color='red', label='Forecast')
plt.fill_between(pred_ci.index, pred_ci.iloc[:,0], pred_ci.iloc[:,1], color='red', alpha=0.3, label='95% CI')
plt.axvspan(test.index[0], test.index[-1], color='gray', alpha=0.1, label='Forecast Horizon')
plt.title('ARIMA(1,1,1) Forecast of CO2')
plt.legend()
plt.tight_layout()
plt.show()
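The plot shows the forecast against the held-out data, but it does not quantify the gap. The sketch below is one possible way to score a forecast against a test set. `forecast_errors` is a hypothetical helper, not part of `statsmodels`, and the four-month example values are made up for illustration.

```python
import numpy as np
import pandas as pd

def forecast_errors(actual, predicted):
    """Return MAE and RMSE over the timestamps both series share.

    Hypothetical helper for illustration only.
    """
    actual, predicted = actual.align(predicted, join='inner')
    err = actual - predicted
    return {
        'mae': float(err.abs().mean()),
        'rmse': float(np.sqrt((err ** 2).mean())),
    }

# Made-up monthly values standing in for a test set and a forecast
idx = pd.date_range('2000-01-01', periods=4, freq='MS')
actual = pd.Series([370.0, 371.0, 372.0, 373.0], index=idx)
predicted = pd.Series([369.0, 371.0, 374.0, 373.0], index=idx)

scores = forecast_errors(actual, predicted)
print(scores)
```

In this notebook the same helper could be called as `forecast_errors(test, pred_mean)`, since both series are indexed by the 24 held-out months.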


## Seasonal decomposition

Classical decomposition separates a series into trend, seasonal and residual components.
Using `seasonal_decompose` with an additive model and a 12-month seasonal period,
we can visualize these three pieces of the CO2 time series to better understand its structure.

In [None]:
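from statsmodels.tsa.seasonal import seasonal_decompose

# Additive model: observed = trend + seasonal + residual, with a 12-month cycle
result = seasonal_decompose(co2, model='additive', period=12)

# result.plot() draws the observed, trend, seasonal and residual panels
fig = result.plot()
fig.set_size_inches(10, 6)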
plt.tight_layout()
plt.show()
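As a rough illustration of what an additive classical decomposition does, the sketch below rebuilds the three components by hand with NumPy and pandas. It runs on a synthetic monthly series, which is an assumption made to keep the example self-contained, and it is not guaranteed to match the output of `seasonal_decompose` number for number.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series (illustrative, not the CO2 data):
# linear trend + 12-month sine seasonality + a little noise
rng = np.random.default_rng(0)
index = pd.date_range('2000-01-01', periods=120, freq='MS')
t = np.arange(len(index))
series = pd.Series(
    0.1 * t + 2.0 * np.sin(2 * np.pi * t / 12) + rng.normal(scale=0.2, size=len(index)),
    index=index,
)

period = 12
half = period // 2

# Trend: centered 2x12 moving average (half weight on the two end points),
# undefined for the first and last six months
weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
trend = pd.Series(np.nan, index=series.index)
trend.iloc[half:-half] = np.convolve(series.to_numpy(), weights, mode='valid')

# Seasonal: average detrended value per calendar month, centered to sum to zero
detrended = series - trend
monthly_effect = detrended.groupby(detrended.index.month).mean()
monthly_effect -= monthly_effect.mean()
seasonal = pd.Series(monthly_effect.loc[series.index.month].to_numpy(), index=series.index)

# Residual: whatever trend and seasonality do not explain
resid = series - trend - seasonal
print(monthly_effect.round(2))
```

The centered moving average spans exactly one full cycle, so the seasonal swings cancel inside the window and only the slow-moving trend remains.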


## Holt-Winters exponential smoothing

Holt-Winters models level, trend and seasonality using exponential smoothing.
Again we reserve the last two years of data for testing.
A model with additive trend and additive seasonality is fitted to the training set and used to forecast 24 months ahead.
The resulting forecast is plotted along with the observed series for comparison.

In [None]:
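from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Same split as before: the final 24 months are held out
train = co2.iloc[:-24]
test = co2.iloc[-24:]

# Additive trend and additive seasonality with a 12-month cycle
model = ExponentialSmoothing(train, seasonal='add', trend='add', seasonal_periods=12)
fit = model.fit()

# Forecast the 24 held-out months
forecast = fit.forecast(24)
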
plt.figure(figsize=(10,6))
plt.plot(co2, label='Observed')
plt.plot(forecast.index, forecast, color='red', label='Holt-Winters Forecast')
plt.axvspan(test.index[0], test.index[-1], color='gray', alpha=0.1, label='Forecast Horizon')
plt.title('Holt-Winters Forecast of CO2')
plt.legend()
plt.tight_layout()
plt.show()
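The additive Holt-Winters recursions can be written out in a few lines. The sketch below is a simplified illustration, not what `ExponentialSmoothing` does internally: it uses fixed smoothing parameters and a simple heuristic initialisation from the first two seasons, whereas `fit()` estimates its parameters by optimisation by default. The function name `holt_winters_additive` and the example numbers are hypothetical.

```python
import numpy as np

def holt_winters_additive(y, m, alpha, beta, gamma, horizon):
    """Additive Holt-Winters with fixed parameters (illustrative sketch).

    level_t  = alpha * (y_t - s_{t-m}) + (1 - alpha) * (level_{t-1} + trend_{t-1})
    trend_t  = beta * (level_t - level_{t-1}) + (1 - beta) * trend_{t-1}
    s_t      = gamma * (y_t - level_{t-1} - trend_{t-1}) + (1 - gamma) * s_{t-m}
    """
    y = np.asarray(y, dtype=float)
    if len(y) < 2 * m:
        raise ValueError("need at least two full seasons to initialise")

    # Heuristic initial state (an assumption of this sketch)
    level = y[:m].mean()
    trend = (y[m:2 * m].mean() - y[:m].mean()) / m
    seasonals = list(y[:m] - level)

    for t, obs in enumerate(y):
        prev_level, prev_trend = level, trend
        s_old = seasonals[t % m]
        level = alpha * (obs - s_old) + (1 - alpha) * (prev_level + prev_trend)
        trend = beta * (level - prev_level) + (1 - beta) * prev_trend
        seasonals[t % m] = gamma * (obs - prev_level - prev_trend) + (1 - gamma) * s_old

    # h-step forecast: extrapolate level and trend, reuse the latest seasonal estimate
    n = len(y)
    return np.array([
        level + h * trend + seasonals[(n + h - 1) % m]
        for h in range(1, horizon + 1)
    ])

# Trend-free series with a repeating four-step pattern around 10
pattern = np.array([1.0, -1.0, 2.0, -2.0])
y = 10 + np.tile(pattern, 5)
hw_forecast = holt_winters_additive(y, m=4, alpha=0.3, beta=0.1, gamma=0.2, horizon=6)
print(hw_forecast)
```

On this noise-free input the state never needs correcting, so the forecast simply continues the repeating pattern. On real data the three smoothing parameters control how quickly each component adapts to new observations.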


## Autocorrelation and partial autocorrelation

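Finally we inspect the autocorrelation function (ACF) and partial autocorrelation function (PACF) of the CO2 series.
These plots show how strongly each observation is correlated with earlier time steps, which helps when choosing the AR and MA orders of models like ARIMA.
Because the CO2 series has a strong trend, expect the ACF to decay very slowly; orders are usually read from the plots of a differenced, roughly stationary series.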

In [None]:
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
fig, axes = plt.subplots(2,1,figsize=(10,8))
plot_acf(co2, lags=40, ax=axes[0])
plot_pacf(co2, lags=40, ax=axes[1])
axes[0].set_title('Autocorrelation')
axes[1].set_title('Partial Autocorrelation')
plt.tight_layout()
plt.show()
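For intuition about what the ACF plot displays, the sketch below computes the sample autocorrelation directly with NumPy and checks it on a simulated AR(1) process. `sample_acf`, the simulated series and its coefficient of 0.8 are illustrative assumptions, not part of the notebook's data.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation r_k for k = 0..max_lag (illustrative helper).

    r_k = sum_t (x_t - mean) * (x_{t-k} - mean) / sum_t (x_t - mean)^2
    """
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    denom = np.sum(centered ** 2)
    n = len(x)
    return np.array([
        np.sum(centered[k:] * centered[:n - k]) / denom
        for k in range(max_lag + 1)
    ])

# Simulate an AR(1) process x_t = phi * x_{t-1} + noise_t
rng = np.random.default_rng(0)
phi = 0.8
noise = rng.normal(size=2000)
x = np.zeros(len(noise))
for t in range(1, len(noise)):
    x[t] = phi * x[t - 1] + noise[t]

acf = sample_acf(x, max_lag=5)
print(acf.round(2))
```

For an AR(1) process the theoretical autocorrelation at lag k is `phi ** k`, so the printed values should fall off roughly geometrically from 1.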
