Modifications to the forecaster tutorial
kejsitake committed Feb 19, 2022
1 parent 2825392 commit b8aedd3
Showing 2 changed files with 76 additions and 24 deletions.
99 changes: 75 additions & 24 deletions examples/01_forecasting.ipynb
@@ -60,7 +60,8 @@
" * [1.2.1 Basic deployment workflow in a nutshell](#section_1_2_1)\n",
" * [1.2.2 Forecasters that require the horizon already in `fit`](#section_1_2_2)\n",
" * [1.2.3 Forecasters that can make use of exogeneous data](#section_1_2_3)\n",
" * [1.2.4 Prediction intervals](#section_1_2_4) \n",
" * [1.2.4 Multivariate Forecasters](#section_1_2_4)\n",
" * [1.2.5 Prediction intervals](#section_1_2_5) \n",
" * [1.3 basic evaluation workflow - evaluating a batch of forecasts against ground truth observations](#section_1_3) \n",
" * [1.3.1 The basic batch forecast evaluation workflow in a nutshell - function metric interface](#section_1_3_1)\n",
" * [1.3.2 The basic batch forecast evaluation workflow in a nutshell - metric class interface](#section_1_3_2) \n",
@@ -618,11 +619,60 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 1.2.4 prediction intervals<a class=\"anchor\" id=\"section_1_2_4\"></a>\n",
"#### 1.2.4 multivariate forecasting <a class=\"anchor\" id=\"section_1_2_4\"></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Forecasters can also be used with multivariate datasets. Below is an example using the Longley dataset from `sktime.datasets`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sktime.datasets import load_longley\n",
"from sktime.forecasting.compose import ColumnEnsembleForecaster\n",
"from sktime.forecasting.exp_smoothing import ExponentialSmoothing\n",
"from sktime.forecasting.trend import PolynomialTrendForecaster\n",
"\n",
"_, y = load_longley()\n",
"\n",
"y = y.drop(columns=[\"UNEMP\", \"ARMED\", \"POP\"])\n",
"\n",
"forecasters = [\n",
" (\"trend\", PolynomialTrendForecaster(), 0),\n",
" (\"ses\", ExponentialSmoothing(trend=\"add\"), 1),\n",
"]\n",
"\n",
"forecaster = ColumnEnsembleForecaster(forecasters=forecasters)\n",
"forecaster.fit(y, fh=[1, 2, 3])\n",
"\n",
"y_pred = forecaster.predict()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"y_pred"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 1.2.5 prediction intervals <a class=\"anchor\" id=\"section_1_2_5\"></a>\n",
"\n",
"`sktime` provides a unified interface to return prediction interval when forecasting. This is possible directly in the `predict` function, by setting the `return_pred_int` argument to `True`. The `predict` method then returns a second argument, Not all forecasters are capable of returning prediction intervals, in which case an error will be raised.\n",
"`sktime` provides a unified interface for returning prediction intervals when forecasting, through the `predict_interval` method. Forecasters that cannot return prediction intervals raise an error.\n",
"\n",
"Obtaining prediction intervals can be done as part of any workflow involving `predict`, by adding the argument `return_pred_int` - below, we illustrate this by modifying the basic workflow in Section 1.2:"
"The `coverage` argument is the nominal coverage of the prediction interval(s) and should be a float or a list of floats. For example, a coverage of 90% returns values at the lower `5 = 50 - (90/2)` and upper `95 = 50 + (90/2)` percentiles."
]
},
{
@@ -640,8 +690,11 @@
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# simple workflow\n",
"y = load_airline()\n",
"\n",
"fh = np.arange(1, 13)\n",
"\n",
"forecaster = ThetaForecaster(sp=12)\n",
@@ -650,23 +703,24 @@
"# setting return_pred_int argument to True; alpha determines percentiles\n",
"# intervals are lower = alpha/2-percentile, upper = (1-alpha/2)-percentile\n",
"alpha = 0.05 # 2.5%/97.5% prediction intervals\n",
"y_pred, y_pred_ints = forecaster.predict(fh, return_pred_int=True, alpha=alpha)"
"y_pred = forecaster.predict(fh)\n",
"y_pred_ints = forecaster.predict_interval(fh, coverage=alpha)"
]
},
{
"cell_type": "markdown",
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"`y_pred_ints` is a `pandas.DataFrame` with columns `lower` and `upper`, and rows the indices for which forecasts were made (same as in `y_pred`). Entries are lower/upper (as column name) bound of the nominal alpha predictive interval for the index in the same row."
"y_pred_ints"
]
},
{
"cell_type": "code",
"execution_count": null,
"cell_type": "markdown",
"metadata": {},
"outputs": [],
"source": [
"y_pred_ints"
"`y_pred_ints` is a `pandas.DataFrame` with nested columns: `Coverage` at the top level, then the coverage value, then `lower` and `upper`. Its rows are the indices for which forecasts were made (same as in `y_pred`). Entries are the lower/upper bounds (as per column name) of the predictive interval at the nominal coverage, for the index in the same row."
]
},
{
@@ -694,8 +748,8 @@
"fig, ax = plot_series(y, y_pred, labels=[\"y\", \"y_pred\"])\n",
"ax.fill_between(\n",
" ax.get_lines()[-1].get_xdata(),\n",
" y_pred_ints[\"lower\"],\n",
" y_pred_ints[\"upper\"],\n",
" y_pred_ints[\"Coverage\"][alpha][\"lower\"],\n",
" y_pred_ints[\"Coverage\"][alpha][\"upper\"],\n",
" alpha=0.2,\n",
" color=ax.get_lines()[-1].get_c(),\n",
" label=f\"{1 - alpha}% prediction intervals\",\n",
@@ -2792,8 +2846,7 @@
"hash": "fcc5fed35031463a248402718f3bbb1a61c709e60741a9777b2268658fc045fd"
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"display_name": "Python 3.8.12 64-bit",
"name": "python3"
},
"language_info": {
@@ -2804,19 +2857,17 @@
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.8"
"version": ""
},
"latex_envs": {
"LaTeX_envs_menu_present": true,
"autoclose": false,
"autocomplete": true,
"bibliofile": "biblio.bib",
"cite_by": "apalike",
"current_citInitial": 1,
"current_citInitial": 1.0,
"eqLabelWithNumbers": true,
"eqNumInitial": 1,
"eqNumInitial": 1.0,
"hotkeys": {
"equation": "Ctrl-E",
"itemize": "Ctrl-I"
@@ -2827,7 +2878,7 @@
"user_envs_cfg": false
},
"toc": {
"base_numbering": 1,
"base_numbering": 1.0,
"nav_menu": {},
"number_sections": true,
"sideBar": true,
@@ -2841,9 +2892,9 @@
},
"varInspector": {
"cols": {
"lenName": 16,
"lenType": 16,
"lenVar": 40
"lenName": 16.0,
"lenType": 16.0,
"lenVar": 40.0
},
"kernels_config": {
"python": {
1 change: 1 addition & 0 deletions sktime/forecasting/theta.py
@@ -84,6 +84,7 @@ class ThetaForecaster(ExponentialSmoothing):

_fitted_param_names = ("initial_level", "smoothing_level")
_tags = {
"scitype:y": "univariate",
"ignores-exogeneous-X": True,
"capability:pred_int": True,
"requires-fh-in-fit": False,
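
Note on `coverage` versus `alpha` in the tutorial hunk: the new markdown text describes the interval in terms of nominal coverage (a coverage of 90 maps to the 5th and 95th percentiles), while the code cell keeps `alpha = 0.05` with the comment "2.5%/97.5% prediction intervals" and then passes `coverage=alpha`. If `coverage` follows the convention stated in the markdown, that call would request a 5% interval rather than a 95% one. The sketch below only illustrates the arithmetic of that convention. The helper `interval_quantiles` is hypothetical and is not part of sktime; coverage is expressed here as a fraction in (0, 1), which is an assumption.

```python
def interval_quantiles(coverage):
    """Return (lower, upper) quantile levels of a symmetric interval.

    Hypothetical helper for illustration only, not sktime API.
    `coverage` is the nominal coverage as a fraction, e.g. 0.9 for 90%.
    """
    if not 0 < coverage < 1:
        raise ValueError("coverage must lie strictly between 0 and 1")
    lower = 0.5 - coverage / 2
    upper = 0.5 + coverage / 2
    return lower, upper


alpha = 0.05  # significance level, as in the notebook cell

# Compare: a 90% interval, the interval the comment intends (1 - alpha),
# and the interval obtained when alpha itself is passed as the coverage.
for coverage in (0.9, 1 - alpha, alpha):
    lower, upper = interval_quantiles(coverage)
    print(f"coverage={coverage:.2f}: lower={lower:.3f}, upper={upper:.3f}")
```

Under this convention, getting the 2.5%/97.5% bounds that the comment describes would mean passing `1 - alpha` as the coverage, and the column lookup in the plotting cell would then need the same key.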
