# Basics of time series forecasting

This lab introduces the fundamentals of time series forecasting with deep learning. 
You will learn how to build data loaders for time series data, implement simple 
autoregressive models, and experiment with preprocessing techniques such as 
standardization and differencing. You will also write training and evaluation
procedures and compare the impact of different modeling choices on forecasting performance.

In [None]:
import torch
import numpy
import matplotlib.pyplot as plt

## Part 1: Data loaders

In this section, you will load a standard time series forecasting dataset and prepare a data loader for it.

To begin, visit <https://github.com/zhouhaoyi/ETDataset> and download the ETTh1 dataset as a CSV file.

**Question 1.** Visualize the dataset, focusing on the univariate time series corresponding to the target variable. Do you observe a trend? Periodicity? Any abnormal segments?
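As a starting point for the visualization, here is a minimal sketch that reads one column of the CSV file and plots it at two time scales. It assumes the target column is named `OT` and that the file is sampled hourly; check both against the header of the file you downloaded. The helper names `load_column` and `plot_series` are illustrative only.

```python
import csv

import numpy as np


def load_column(csv_path, column="OT"):
    """Read one numeric column of a CSV file into a 1-D float32 array.

    The default column name "OT" is an assumption about the ETTh1 header.
    """
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f)
        return np.array([float(row[column]) for row in reader], dtype=np.float32)


def plot_series(values, samples_per_day=24, title="target"):
    """Plot the full series and a two-week zoom to expose daily patterns."""
    import matplotlib.pyplot as plt  # only needed when plotting

    fig, axes = plt.subplots(2, 1, figsize=(12, 6))
    axes[0].plot(values)
    axes[0].set_title(f"{title} (full series)")
    # Assumes `samples_per_day` samples per day (24 for hourly data).
    axes[1].plot(values[: samples_per_day * 14])
    axes[1].set_title(f"{title} (first two weeks)")
    fig.tight_layout()
    return fig
```

Calling `plot_series(load_column("ETTh1.csv"))` (with whatever path you saved the file under) gives one view for long-term trends and one for short-term periodicity.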

**Question 2.** Implement a `build_dataloader` function that returns a PyTorch `DataLoader`. The underlying dataset should read the CSV file at initialization time, let you specify the past window length and the forecast horizon, and provide batches of `(past, horizon)` pairs for the univariate forecasting problem.
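One possible shape for the windowing logic is sketched below. To stay self-contained it takes an in-memory 1-D array rather than a CSV path, so reading the file at initialization is left to you. The class name `WindowDataset` and the signature of `build_dataloader` are assumptions made for illustration.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset


class WindowDataset(Dataset):
    """Sliding-window view of a univariate series (hypothetical helper)."""

    def __init__(self, series, window, horizon):
        self.series = torch.as_tensor(np.asarray(series), dtype=torch.float32)
        self.window = window
        self.horizon = horizon

    def __len__(self):
        # Number of start positions that leave room for a full window + horizon.
        return max(0, len(self.series) - self.window - self.horizon + 1)

    def __getitem__(self, i):
        split = i + self.window
        past = self.series[i:split]
        future = self.series[split:split + self.horizon]
        return past, future


def build_dataloader(series, window, horizon, batch_size=32, shuffle=True):
    dataset = WindowDataset(series, window, horizon)
    return DataLoader(dataset, batch_size=batch_size, shuffle=shuffle)
```

Each batch is a pair of tensors with shapes `(batch, window)` and `(batch, horizon)`. Shuffling the windows is fine for training because each window keeps its internal time order.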

**Question 3.** Improve your `build_dataloader` function above to build both a training data loader and a validation data loader. What would be appropriate choices for a clean separation between training and validation datasets?
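One common option, sketched here under assumptions, is a chronological split: the first part of the series is used for training and the remainder for validation, and windows are built separately on each part so that no window straddles the boundary. The function name `chronological_split`, the 80/20 default, and the optional `gap` parameter are illustrative choices, not requirements of the lab.

```python
import numpy as np


def chronological_split(series, train_fraction=0.8, gap=0):
    """Split a series into an early training part and a later validation part.

    `gap` drops that many samples after the training part, which can be used
    to keep the two segments further apart in time.
    """
    series = np.asarray(series)
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut + gap:]
```

Whether a gap is needed, and how large it should be, depends on how strongly nearby samples are correlated; this is part of what the question asks you to think about.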

## Part 2: First models

In this part, you will build your first few forecasting models, train them, and
experiment with classical detrending techniques used in time series analysis.

**Question 4.** Implement a simple autoregressive (AR) model in `torch`.

The model should:
- take a past window of shape `(batch, window)`
- output a forecast of shape `(batch, horizon)`
- be linear in the inputs
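The three requirements above can be met by a single linear layer that maps the whole past window to the whole horizon at once. The sketch below is one such reading; the class name `LinearAR` is hypothetical. Note that `nn.Linear` includes a bias term by default, which makes the map affine; pass `bias=False` if you want a strictly linear model.

```python
import torch
from torch import nn


class LinearAR(nn.Module):
    """Direct multi-step AR model: one linear map from window to horizon."""

    def __init__(self, window, horizon, bias=True):
        super().__init__()
        self.linear = nn.Linear(window, horizon, bias=bias)

    def forward(self, past):
        # past: (batch, window) -> forecast: (batch, horizon)
        return self.linear(past)
```

An alternative design predicts one step at a time and feeds each prediction back as input; the direct version shown here is simply the shortest one to write.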

**Question 5.** Train the AR model using mean squared error (MSE) and evaluate
it on the validation set.

Implement:
- a training loop
- a validation loop
- reporting of train and validation losses
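A minimal sketch of these three pieces follows. It assumes the loaders yield `(past, future)` tensor pairs and uses the Adam optimizer with an arbitrary learning rate; both the optimizer choice and the helper names `run_epoch` and `fit` are assumptions, not part of the lab statement.

```python
import torch
from torch import nn


def run_epoch(model, loader, loss_fn, optimizer=None):
    """Run one pass over `loader`; train if an optimizer is given, else evaluate."""
    training = optimizer is not None
    model.train(training)
    total, count = 0.0, 0
    with torch.set_grad_enabled(training):
        for past, future in loader:
            loss = loss_fn(model(past), future)
            if training:
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            # Weight by batch size so the result is a per-sample average.
            total += loss.item() * past.shape[0]
            count += past.shape[0]
    return total / count


def fit(model, train_loader, val_loader, epochs=10, lr=1e-3):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    history = []
    for epoch in range(epochs):
        train_loss = run_epoch(model, train_loader, loss_fn, optimizer)
        val_loss = run_epoch(model, val_loader, loss_fn)
        history.append((train_loss, val_loss))
        print(f"epoch {epoch + 1:3d}  train MSE {train_loss:.4f}  val MSE {val_loss:.4f}")
    return history
```

The returned `history` list of `(train_loss, val_loss)` pairs can be plotted directly to compare convergence across the later questions.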

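**Question 6.** Add input/output standardization layers: standardize the past window before it enters the model, and map the model's output back to the original scale.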
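One way to organize this is a small module that stores a mean and a standard deviation and exposes both the forward and the inverse transform, plus a wrapper that applies them around any forecasting model. The sketch assumes the statistics are scalars computed beforehand (for instance on the training series only); the names `Standardizer` and `StandardizedModel` are hypothetical.

```python
import torch
from torch import nn


class Standardizer(nn.Module):
    """Fixed affine rescaling with precomputed statistics."""

    def __init__(self, mean, std, eps=1e-8):
        super().__init__()
        # Buffers move with the module across devices but are not trained.
        self.register_buffer("mean", torch.as_tensor(mean, dtype=torch.float32))
        self.register_buffer("std", torch.as_tensor(std, dtype=torch.float32))
        self.eps = eps

    def forward(self, x):
        return (x - self.mean) / (self.std + self.eps)

    def inverse(self, z):
        return z * (self.std + self.eps) + self.mean


class StandardizedModel(nn.Module):
    """Standardize the input, run the inner model, de-standardize the output."""

    def __init__(self, model, scaler):
        super().__init__()
        self.model = model
        self.scaler = scaler

    def forward(self, past):
        return self.scaler.inverse(self.model(self.scaler(past)))
```

With this wrapper the loss is computed in the original units, so train and validation errors stay comparable with the unstandardized model.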

**Question 7.** Implement a differencing layer that removes local trends:

  $x_t' = x_t - x_{t-1}$

Then implement the inverse operation (integration) to recover forecasts
back to the original scale.

Experiment with how differencing affects convergence and final error, and 
compare forecasts qualitatively.
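The differencing and integration steps can be sketched as two small functions. Integration needs an anchor value, so the sketch returns the last observed value of the past window alongside the differences; the function names `difference` and `integrate` are illustrative.

```python
import torch


def difference(past):
    """First differences of a (batch, window) tensor.

    Returns the differences, of shape (batch, window - 1), and the last
    observed value, of shape (batch, 1), which is needed to integrate later.
    """
    return past[:, 1:] - past[:, :-1], past[:, -1:]


def integrate(diff_forecast, last_value):
    """Turn forecast differences of shape (batch, horizon) back into levels."""
    return last_value + torch.cumsum(diff_forecast, dim=1)
```

Because differencing shortens the window by one sample, a model placed between these two functions takes `window - 1` inputs.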

**Question 8.** Replace the linear layer in your best-performing AR model with a multilayer perceptron (MLP).

Compare:
- convergence speed
- validation error
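A possible drop-in replacement for the linear layer is sketched below. The hidden width, the depth, and the ReLU activation are arbitrary defaults chosen for illustration, and `make_mlp` is a hypothetical helper name.

```python
import torch
from torch import nn


def make_mlp(window, horizon, hidden=64, depth=2):
    """Build an MLP with `depth` hidden layers mapping window -> horizon."""
    layers = []
    in_features = window
    for _ in range(depth):
        layers += [nn.Linear(in_features, hidden), nn.ReLU()]
        in_features = hidden
    # Final layer has no activation so the output is unconstrained.
    layers.append(nn.Linear(in_features, horizon))
    return nn.Sequential(*layers)
```

Setting `depth=0` recovers a single linear layer, which makes it easy to run the linear and MLP variants through the same training code.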

**Wrap-up question.** Which factor had the largest impact on performance?
1. model complexity
2. scaling
3. differencing

Justify your answer with quantitative results and/or plots.