**Set-up instructions:** this notebook gives a tutorial on the forecasting learning task supported by `sktime`.
On binder, this should run out-of-the-box.

To run this notebook as intended, ensure that `sktime` with basic dependency requirements is installed in your python environment.

To run this notebook with a local development version of sktime, either uncomment and run the below, or `pip install -e` a local clone of the `sktime` `main` branch.

In [None]:
# import sys
# sys.path.append("..")

# In-memory data representations and data loading

`sktime` provides modules for a number of time series related learning tasks.

These modules use `sktime`-specific in-memory (i.e., python workspace) representations for time series and related objects, most importantly individual time series and time series panels. `sktime`'s in-memory representations rely on `pandas` and `numpy`, with additional conventions on the `pandas` and `numpy` objects.

Users of `sktime` should be aware of these representations, since presenting the data in an `sktime` compatible representation is usually the first step in using any of the `sktime` modules.

This notebook introduces the data types used in `sktime`, related functionality such as converters and validity checkers, and common workflows for loading and conversion:

**Section 1** introduces in-memory data containers used in `sktime`, with examples.

**Section 2** introduces validity checkers and conversion functionality for in-memory data containers.

**Section 3** introduces common workflows to load data from file formats.

## Section 1: in-memory data containers

This section provides a reference to data containers used for time series and related objects in `sktime`.

Conceptually, `sktime` distinguishes:

* the *scientific type* (or short: scitype) of a data container, defined by relational and statistical properties of the data being represented - for instance, a (scientific) "time series" or a (scientific) "time series panel", in the mathematical-abstract sense, without specifying a machine representation
* the *machine type* (or short: mtype) of a data container, which, for a defined *scientific type*, specifies the python type and conventions on structure and value of the python in-memory object. For instance, a specific (scientific) time series is represented by a concrete `pandas.DataFrame` in `sktime`, subject to certain conventions on the `pandas.DataFrame`. Formally, these conventions form a specific mtype, i.e., a way to represent the (abstract) "time series" scitype.

In `sktime`, the same scitype can be represented by multiple mtypes. For instance, `sktime` allows the user to specify time series as `pandas.DataFrame`, as `pandas.Series`, or as a `numpy.ndarray`. These are different mtypes which are admissible representations of the same scitype, "time series". Also, not all mtypes are equally rich in metadata - for instance, `pandas.DataFrame` can store column names, while this is not possible in `numpy.ndarray`.
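The difference in metadata richness can be seen with plain `pandas` and `numpy`, without any `sktime` functionality. The following is a minimal sketch; the column names and values are made up for illustration.

```python
import numpy as np
import pandas as pd

# A small bivariate time series held as a pandas.DataFrame.
# Column names and values are illustrative only.
df = pd.DataFrame(
    {"temperature": [20.1, 21.3, 19.8], "humidity": [0.41, 0.39, 0.45]},
    index=pd.period_range("2021-01", periods=3, freq="M"),
)

# The same values as a numpy.ndarray: rows are still time points,
# but column names and the time index are no longer attached.
arr = df.to_numpy()

print(list(df.columns))         # column names are stored in the DataFrame
print(arr.shape)                # (3, 2): 3 time points, 2 variables
print(hasattr(arr, "columns"))  # False: the array carries no column names
```

Both objects hold the same numbers, so both can represent the same time series; only the `pandas.DataFrame` also records which variable each column is and which time point each row belongs to.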

Both scitypes and mtypes are encoded by strings in `sktime`, for easy reference.

This section introduces the mtypes for the following scitypes:
* `"Series"`, the `sktime` scitype for time series of any kind
* `"Panel"`, the `sktime` scitype for time series panels of any kind

### Section 1.1: Time series - the `"Series"` scitype

The major representations of time series in `sktime` are:

* `pd.DataFrame` - uni- or multivariate series, with rows corresponding to different time points
* `pd.Series` - univariate series, with entries corresponding to different time points
* `np.ndarray` - uni- or multivariate series, with rows corresponding to different time points
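As a minimal sketch, the three container types listed above can be built from the same toy data using only `pandas` and `numpy`. The variable names and values are illustrative, and the 2D layout of the `np.ndarray` (rows as time points, columns as variables) is an assumption made here by analogy with the `pd.DataFrame` case.

```python
import numpy as np
import pandas as pd

# Shared monthly time index for the toy data.
time_index = pd.period_range("2020-01", periods=4, freq="M")

# pd.Series: univariate series, one entry per time point.
y_series = pd.Series([1.0, 2.5, 3.0, 4.5], index=time_index, name="y")

# pd.DataFrame: here a bivariate series, one row per time point,
# one column per variable.
y_frame = pd.DataFrame(
    {"a": [1.0, 2.5, 3.0, 4.5], "b": [10.0, 20.0, 30.0, 40.0]},
    index=time_index,
)

# np.ndarray: assumed layout is 2D, rows = time points, columns = variables.
y_array = y_frame.to_numpy()

print(y_series.shape, y_frame.shape, y_array.shape)
```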


#### step 2 - specifying the forecasting horizon

Now we need to specify the forecasting horizon and pass that to our forecasting algorithm.

There are two main ways:

* using a `numpy.array` of integers. This assumes either integer index or periodic index (`PeriodIndex`) in the time series; the integer indicates the number of time points or periods ahead we want to make a forecast for. E.g., `1` means forecast the next period, `2` the second next period, and so on.
* using a `ForecastingHorizon` object. This can be used to define forecast horizons, using any supported index type as an argument. No periodic index is assumed.

Forecasting horizons can be absolute, i.e., referencing specific time points in the future, or relative, i.e., referencing time differences to the present. By default, the present is the latest time point seen in any `y` passed to the forecaster.

`numpy.array` based forecasting horizons are always relative; `ForecastingHorizon` objects can be either relative or absolute. In particular, absolute forecasting horizons can only be specified using `ForecastingHorizon`.
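The relationship between relative and absolute horizons can be illustrated with plain `pandas` and `numpy`. This is a sketch of the semantics described above, not of `sktime`'s `ForecastingHorizon` implementation; the toy series and the variable names (`cutoff`, `fh_relative`, `fh_absolute`) are chosen for illustration only.

```python
import numpy as np
import pandas as pd

# Toy monthly series; the "present" (cutoff) is its latest time point.
y = pd.Series(
    np.arange(12, dtype=float),
    index=pd.period_range("2020-01", periods=12, freq="M"),
)
cutoff = y.index[-1]  # last observed period, 2020-12

# Relative horizon: integers count periods ahead of the cutoff.
fh_relative = np.array([1, 2, 3])

# The equivalent absolute horizon names the future time points themselves.
fh_absolute = pd.PeriodIndex([cutoff + int(h) for h in fh_relative])

# Going back from absolute to relative: number of periods past the cutoff.
fh_back = np.array([(p - cutoff).n for p in fh_absolute])

print(fh_absolute)
print(fh_back)
```

With a cutoff of December 2020, the relative horizon `[1, 2, 3]` corresponds to the absolute horizon January to March 2021. The same relative horizon maps to different absolute time points whenever the cutoff changes, which is why absolute horizons need an explicit index rather than bare integers.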