# sktime worksheet - forecasting

This is an exercise-style worksheet introducing the [sktime](https://github.com/alan-turing-institute/sktime) forecasting module, intended for self-study and user testing.

This worksheet contains exercises for you to attempt after:
* going through the basic `sktime` demo, live or [on YouTube](https://www.youtube.com/watch?v=wqQKFu41FIw)
* having had a look at the tutorial notebooks yourself.

For this sheet, you are actively encouraged to seek out the sktime documentation and tutorial notebooks to solve the exercises.

We would very much appreciate it if you could leave feedback, especially critical feedback, in the markdown fields provided for this purpose. You can also raise issues for any bugs or improvement suggestions.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

### Exercise 1: basic forecasting

(a) use an ARMA(10,10) forecaster to forecast the series `y` below for the next 100 indices.

(b) plot the series and the forecast

(c) obtain 5%/95% prediction intervals, and add them to the plot

In [None]:
# noisy sine wave: 1000 points on [0, 30] plus Gaussian noise with standard deviation 0.1
y = np.sin(np.linspace(0, 30, 1000)) + 0.1*np.random.randn(1000)

Questions:
* was the forecaster easy/difficult to find? was the documentation helpful?
* was the forecasting interface easy/difficult to understand?
* anything else you would like to tell us?

Remarks:

### Exercise 2: evaluating forecasters

(a) evaluate the forecaster above, backtesting on the last 20% of the series `y`, evaluating by mean absolute error of prediction

(b) plot the actual series against the prediction, with 5%/95% prediction intervals

(c) evaluate the forecaster above on rolling 10-steps-ahead predictions, starting at the 200th index with a rolling step size of 10, and taking a double average of the absolute error: first over the 10 steps of each prediction, then over all rolling predictions

(d) obtain prediction intervals, and add them to the plot
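
As a library-agnostic hint for (a) and (c), the two evaluation schemes can be sketched in plain NumPy. This is only an illustration of the arithmetic, not the sktime interface: `naive_forecast` is a hypothetical stand-in for the ARMA forecaster, and the function names, the fixed seed, and the choice to refit on all data up to each cutoff are assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, an assumption for reproducibility
y = np.sin(np.linspace(0, 30, 1000)) + 0.1 * rng.standard_normal(1000)


def naive_forecast(history, horizon):
    """Hypothetical stand-in forecaster: repeat the last observed value."""
    return np.full(horizon, history[-1])


# (a) backtest on the last 20% of the series with mean absolute error
split = int(0.8 * len(y))
holdout_pred = naive_forecast(y[:split], len(y) - split)
holdout_mae = np.mean(np.abs(y[split:] - holdout_pred))


def rolling_mae(series, start=200, horizon=10, step=10):
    """Average absolute error over the horizon, then over all rolling cutoffs."""
    per_window = []
    for cutoff in range(start, len(series) - horizon + 1, step):
        y_pred = naive_forecast(series[:cutoff], horizon)
        y_true = series[cutoff:cutoff + horizon]
        per_window.append(np.mean(np.abs(y_true - y_pred)))  # inner average
    return np.mean(per_window), len(per_window)  # outer average


# (c) rolling 10-steps-ahead evaluation starting at index 200, step size 10
score, n_windows = rolling_mae(y)
```

In the exercise itself, the stand-in would be replaced by the fitted forecaster, and the looping would be handled by the evaluation tools you find in the sktime documentation.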

Questions:
* was it easy/difficult to find out how to do backtesting evaluation?
* was it easy/difficult to find out how to do rolling evaluation?
* was the absolute error metric easy/difficult to find?
* was it easy/difficult to find out how to obtain prediction intervals?

### Exercise 3: composite forecasters

(a) build the following forecaster: reduction to supervised learning using the direct recursive strategy with ordinary least squares regression, on a rolling window of length 10.

(b) use the forecaster in (a) to forecast the series `y` from Exercise 1 for the next 100 indices.

(c) to the forecaster in (a), add deseasonalization as a pre-processing step. Tune the deseasonalization parameter using grid parameter search with a suitable sliding window splitter. Try to avoid overly high runtimes.

(d) fit the (deseasonalized) forecaster in (c) to `y`, and check whether the tuning procedure results in the "obvious" seasonality.
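
To make the idea of reduction in (a) concrete, here is a minimal NumPy sketch of what happens conceptually: the series is tabulated into rolling windows of length 10, an ordinary least squares model is fitted to predict the next value, and forecasts are fed back into the window. For brevity the sketch uses the plain recursive strategy, which is a simplification of the strategy named in the exercise. All function names are hypothetical and none of this is the sktime API.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, an assumption for reproducibility
y = np.sin(np.linspace(0, 30, 1000)) + 0.1 * rng.standard_normal(1000)


def make_windows(series, window_length=10):
    """Tabulate a series: each row holds window_length lags, target is the next value."""
    X = np.lib.stride_tricks.sliding_window_view(series[:-1], window_length)
    target = series[window_length:]
    return X, target


def fit_ols(X, target):
    """Ordinary least squares with an intercept, solved by np.linalg.lstsq."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, target, rcond=None)
    return coef


def forecast_recursive(series, coef, horizon, window_length=10):
    """Predict one step at a time, appending each prediction to the window."""
    window = list(series[-window_length:])
    preds = []
    for _ in range(horizon):
        next_value = coef[0] + np.dot(coef[1:], window[-window_length:])
        preds.append(next_value)
        window.append(next_value)
    return np.array(preds)


X, target = make_windows(y)
coef = fit_ols(X, target)
preds = forecast_recursive(y, coef, horizon=100)
```

The composite forecaster you build in sktime wraps these steps behind the usual fit/predict interface, so that it can be combined with deseasonalization and tuning in (c).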

Questions:
* was it easy/difficult to find out how to construct the reduction?
* was it easy/difficult to find out how to deseasonalize the forecaster?
* was it easy/difficult to find out how to tune the forecaster?

### Exercise 4: model selection
Construct a forecaster which, using model selection (=autoML) based on sliding window evaluation via mean absolute error, at each time step selects one of: (i) deseasonalized moving average, (ii) autoARIMA.
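
The selection logic described above can be illustrated with a short NumPy sketch. It is not the sktime model selection composite: the two candidates `last_value` and `moving_average` are hypothetical stand-ins for the deseasonalized moving average and autoARIMA, and the window length, horizon, and step size are assumptions chosen for the illustration.

```python
import numpy as np


def last_value(history, horizon):
    """Hypothetical candidate: repeat the last observed value."""
    return np.full(horizon, history[-1])


def moving_average(history, horizon, window_length=10):
    """Hypothetical candidate: repeat the mean of the last window_length values."""
    return np.full(horizon, np.mean(history[-window_length:]))


def sliding_window_mae(forecaster, series, window_length=100, horizon=1, step=10):
    """Mean absolute error averaged over fixed-length sliding training windows."""
    errors = []
    for start in range(0, len(series) - window_length - horizon + 1, step):
        train = series[start:start + window_length]
        test = series[start + window_length:start + window_length + horizon]
        errors.append(np.mean(np.abs(test - forecaster(train, horizon))))
    return float(np.mean(errors))


def select_forecaster(series, candidates):
    """Return the name of the candidate with the lowest sliding window MAE."""
    scores = {name: sliding_window_mae(f, series) for name, f in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


candidates = {"last_value": last_value, "moving_average": moving_average}
trend = np.arange(300, dtype=float)  # toy series with a linear trend
best, scores = select_forecaster(trend, candidates)
```

Calling `select_forecaster` again whenever new observations arrive would redo the choice at each time step, which is the behaviour the exercise asks the composite forecaster to provide.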

Questions:
* was it easy/difficult to find out how to construct the model selection composite?
* was it easy/difficult to find out how to use autoARIMA?

### General feedback:
* What was the most difficult part to understand?
* Did you get stuck anywhere, or have to skip anything?
* How can we improve the documentation?
* What additional features would you like to see?