Modifications to the forecaster tutorial

sktime · Feb 19, 2022 · b8aedd3
1 parent 2825392
commit b8aedd3

Showing 2 changed files with 76 additions and 24 deletions.
diff --git a/examples/01_forecasting.ipynb b/examples/01_forecasting.ipynb
@@ -60,7 +60,8 @@
     "        * [1.2.1 Basic deployment workflow in a nutshell](#section_1_2_1)\n",
     "        * [1.2.2 Forecasters that require the horizon already in `fit`](#section_1_2_2)\n",
     "        * [1.2.3 Forecasters that can make use of exogeneous data](#section_1_2_3)\n",
-    "        * [1.2.4 Prediction intervals](#section_1_2_4)      \n",
+    "        * [1.2.4 Multivariate forecasting](#section_1_2_4)\n",
+    "        * [1.2.5 Prediction intervals](#section_1_2_5)      \n",
     "    * [1.3 basic evaluation workflow - evaluating a batch of forecasts against ground truth observations](#section_1_3)   \n",
     "        * [1.3.1 The basic batch forecast evaluation workflow in a nutshell - function metric interface](#section_1_3_1)\n",
     "        * [1.3.2 The basic batch forecast evaluation workflow in a nutshell - metric class interface](#section_1_3_2)           \n",
@@ -618,11 +619,60 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### 1.2.4 prediction intervals<a class=\"anchor\" id=\"section_1_2_4\"></a>\n",
+    "#### 1.2.4 multivariate forecasting <a class=\"anchor\" id=\"section_1_2_4\"></a>"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
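+    "Forecasters can also be fit to multivariate series. The example below loads the Longley dataset from `sktime.datasets`, keeps a subset of its columns, and uses `ColumnEnsembleForecaster` to fit a separate forecaster to each remaining column."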
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from sktime.datasets import load_longley\n",
+    "from sktime.forecasting.compose import ColumnEnsembleForecaster\n",
+    "from sktime.forecasting.exp_smoothing import ExponentialSmoothing\n",
+    "from sktime.forecasting.trend import PolynomialTrendForecaster\n",
+    "\n",
+    "_, y = load_longley()\n",
+    "\n",
+    "y = y.drop(columns=[\"UNEMP\", \"ARMED\", \"POP\"])\n",
+    "\n",
+    "forecasters = [\n",
+    "    (\"trend\", PolynomialTrendForecaster(), 0),\n",
+    "    (\"ses\", ExponentialSmoothing(trend=\"add\"), 1),\n",
+    "]\n",
+    "\n",
+    "forecaster = ColumnEnsembleForecaster(forecasters=forecasters)\n",
+    "forecaster.fit(y, fh=[1, 2, 3])\n",
+    "\n",
+    "y_pred = forecaster.predict()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "y_pred"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### 1.2.5 prediction intervals <a class=\"anchor\" id=\"section_1_2_5\"></a>\n",
     "\n",
-    "`sktime` provides a unified interface to return prediction interval when forecasting. This is possible directly in the `predict` function, by setting the `return_pred_int` argument to `True`. The `predict` method then returns a second argument, Not all forecasters are capable of returning prediction intervals, in which case an error will be raised.\n",
+    "`sktime` provides a unified interface for returning prediction intervals when forecasting, through the `predict_interval` method. Not all forecasters can return prediction intervals; those that cannot will raise an error.\n",
     "\n",
-    "Obtaining prediction intervals can be done as part of any workflow involving `predict`, by adding the argument `return_pred_int` - below, we illustrate this by modifying the basic workflow in Section 1.2:"
+    "The `coverage` argument of `predict_interval` is the nominal coverage of the prediction interval(s) and should be a float or a list of floats, each between 0 and 1. For example, a coverage of `0.9` returns the lower bound at the 5th percentile (`50 - 100 * coverage / 2`) and the upper bound at the 95th percentile (`50 + 100 * coverage / 2`)."
    ]
   },
   {
@@ -640,8 +690,11 @@
    "metadata": {},
    "outputs": [],
    "source": [
+    "import numpy as np\n",
+    "\n",
     "# simple workflow\n",
     "y = load_airline()\n",
+    "\n",
     "fh = np.arange(1, 13)\n",
     "\n",
     "forecaster = ThetaForecaster(sp=12)\n",
@@ -650,23 +703,24 @@
     "# setting return_pred_int argument to True; alpha determines percentiles\n",
     "#  intervals are lower = alpha/2-percentile, upper = (1-alpha/2)-percentile\n",
     "alpha = 0.05  # 2.5%/97.5% prediction intervals\n",
-    "y_pred, y_pred_ints = forecaster.predict(fh, return_pred_int=True, alpha=alpha)"
+    "y_pred = forecaster.predict(fh)\n",
+    "y_pred_ints = forecaster.predict_interval(fh, coverage=alpha)"
    ]
   },
   {
-   "cell_type": "markdown",
+   "cell_type": "code",
+   "execution_count": null,
    "metadata": {},
+   "outputs": [],
    "source": [
-    "`y_pred_ints` is a `pandas.DataFrame` with columns `lower` and `upper`, and rows the indices for which forecasts were made (same as in `y_pred`). Entries are lower/upper (as column name) bound of the nominal alpha predictive interval for the index in the same row."
+    "y_pred_ints"
    ]
   },
   {
-   "cell_type": "code",
-   "execution_count": null,
+   "cell_type": "markdown",
    "metadata": {},
-   "outputs": [],
    "source": [
-    "y_pred_ints"
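+    "`y_pred_ints` is a `pandas.DataFrame` whose rows are the indices for which forecasts were made (same as in `y_pred`). Its columns form a multi-index that is accessed as `y_pred_ints[\"Coverage\"][alpha][\"lower\"]` and `y_pred_ints[\"Coverage\"][alpha][\"upper\"]`: the second level is the coverage value passed to `predict_interval`, and the last level holds the `lower` and `upper` bounds of the corresponding predictive interval for the index in the same row."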
    ]
   },
   {
@@ -694,8 +748,8 @@
     "fig, ax = plot_series(y, y_pred, labels=[\"y\", \"y_pred\"])\n",
     "ax.fill_between(\n",
     "    ax.get_lines()[-1].get_xdata(),\n",
-    "    y_pred_ints[\"lower\"],\n",
-    "    y_pred_ints[\"upper\"],\n",
+    "    y_pred_ints[\"Coverage\"][alpha][\"lower\"],\n",
+    "    y_pred_ints[\"Coverage\"][alpha][\"upper\"],\n",
     "    alpha=0.2,\n",
     "    color=ax.get_lines()[-1].get_c(),\n",
     "    label=f\"{1 - alpha}% prediction intervals\",\n",
@@ -2792,8 +2846,7 @@
    "hash": "fcc5fed35031463a248402718f3bbb1a61c709e60741a9777b2268658fc045fd"
   },
   "kernelspec": {
-   "display_name": "Python 3",
-   "language": "python",
+   "display_name": "Python 3.8.12 64-bit",
    "name": "python3"
   },
   "language_info": {
@@ -2804,19 +2857,17 @@
    "file_extension": ".py",
    "mimetype": "text/x-python",
    "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.8.8"
+   "version": ""
   },
   "latex_envs": {
    "LaTeX_envs_menu_present": true,
    "autoclose": false,
    "autocomplete": true,
    "bibliofile": "biblio.bib",
    "cite_by": "apalike",
-   "current_citInitial": 1,
+   "current_citInitial": 1.0,
    "eqLabelWithNumbers": true,
-   "eqNumInitial": 1,
+   "eqNumInitial": 1.0,
    "hotkeys": {
     "equation": "Ctrl-E",
     "itemize": "Ctrl-I"
@@ -2827,7 +2878,7 @@
    "user_envs_cfg": false
   },
   "toc": {
-   "base_numbering": 1,
+   "base_numbering": 1.0,
    "nav_menu": {},
    "number_sections": true,
    "sideBar": true,
@@ -2841,9 +2892,9 @@
   },
   "varInspector": {
    "cols": {
-    "lenName": 16,
-    "lenType": 16,
-    "lenVar": 40
+    "lenName": 16.0,
+    "lenType": 16.0,
+    "lenVar": 40.0
    },
    "kernels_config": {
     "python": {

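Note on the `predict_interval` cell in the notebook diff above: the retained comments describe `alpha = 0.05` as giving 2.5%/97.5% bounds, while the new call passes `coverage=alpha`. Assuming `coverage` means the nominal coverage fraction of a symmetric interval (as the new markdown cell describes), the two readings give different bounds. The helper below is a hypothetical illustration of that arithmetic only; `coverage_to_quantiles` is not part of `sktime`.

```python
def coverage_to_quantiles(coverage):
    """Return (lower, upper) quantile levels of a symmetric interval.

    Hypothetical helper for illustration; assumes `coverage` is a
    fraction strictly between 0 and 1.
    """
    if not 0 < coverage < 1:
        raise ValueError("coverage must lie strictly between 0 and 1")
    half = coverage / 2
    return 0.5 - half, 0.5 + half


alpha = 0.05

# Reading 1: the value is passed as-is, i.e. coverage=alpha (a 5% interval).
as_written = coverage_to_quantiles(alpha)

# Reading 2: the comments' intent, i.e. coverage=1 - alpha (a 95% interval).
intended = coverage_to_quantiles(1 - alpha)

print("coverage=alpha     ->", [round(q, 4) for q in as_written])
print("coverage=1 - alpha ->", [round(q, 4) for q in intended])
```

Under this assumption, `coverage=alpha` yields a narrow band around the median rather than the 2.5%/97.5% bounds named in the comment. If the call were changed to `coverage=1 - alpha`, the column lookup in the plotting cell would need the same key, since the second column level is indexed by the coverage value.
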
diff --git a/sktime/forecasting/theta.py b/sktime/forecasting/theta.py
@@ -84,6 +84,7 @@ class ThetaForecaster(ExponentialSmoothing):
 
     _fitted_param_names = ("initial_level", "smoothing_level")
     _tags = {
+        "scitype:y": "univariate",
         "ignores-exogeneous-X": True,
         "capability:pred_int": True,
         "requires-fh-in-fit": False,