# **Sktime**

Sktime combines popular time series algorithms with the interface of the scikit-learn library. It supports reduction, which lets scikit-learn estimators solve time series tasks by recasting them as tabular problems. Other features include time series regression, classification (univariate and multivariate), time series clustering, time series annotation, forecasting, estimation, transformation, datasets, feature tools and utility functions (preprocessing and plotting).
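
To make the idea of reduction concrete, here is a minimal, library-free sketch. It assumes a sliding-window scheme: each row of the table holds a few consecutive values and the target is the value that follows. The helper `sliding_window` and the toy series are hypothetical illustrations, not part of sktime's API.

```python
def sliding_window(series, window_length):
    """Turn a 1-D series into tabular (X, y) data.

    Each row of X holds `window_length` consecutive values and the
    matching entry of y is the value that comes right after them.
    """
    X, y = [], []
    for start in range(len(series) - window_length):
        X.append(series[start:start + window_length])
        y.append(series[start + window_length])
    return X, y


# Hypothetical toy series
series = [10, 20, 30, 40, 50, 60]
X, y = sliding_window(series, window_length=3)
for row, target in zip(X, y):
    print(row, "->", target)
```

Any tabular regressor could then be fitted on `X` and `y`, which is the sense in which a forecasting task is reduced to a tabular one.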

You can read more about it in [this](https://analyticsindiamag.com/sktime-library/) article.

## **Installation**

In [None]:

!python -m pip install pip --upgrade --user -q --no-warn-script-location
!python -m pip install numpy pandas seaborn matplotlib scipy statsmodels scikit-learn tensorflow keras torch torchvision \
    tqdm scikit-image pmdarima pystan==2.19.1.1 --user -q --no-warn-script-location

import IPython
IPython.Application.instance().kernel.do_shutdown(True)


In [None]:
!python -m pip install fbprophet sktime --user -q
import IPython
IPython.Application.instance().kernel.do_shutdown(True)

### Forecasting

In [None]:
from sktime.forecasting.all import *
from sktime.performance_metrics.forecasting import (
    mean_absolute_percentage_error
)
y = load_airline()
y_train, y_test = temporal_train_test_split(y)
fh = ForecastingHorizon(y_test.index, is_relative=False)
forecaster = ThetaForecaster(sp=12)  # monthly seasonal periodicity
forecaster.fit(y_train)
y_pred = forecaster.predict(fh)
mean_absolute_percentage_error(y_test, y_pred) 
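
The cell above splits the series in time order and scores the forecast with a percentage error. The sketch below shows both ideas in plain Python under simple assumptions: `temporal_split`, `mape`, the toy series and the naive forecast are hypothetical stand-ins, and sktime's own metric may differ in details (for example a symmetric variant) depending on the version.

```python
def temporal_split(series, test_size):
    """Keep time order: the last `test_size` points form the test set."""
    return series[:-test_size], series[-test_size:]


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction (0.1 means 10%)."""
    errors = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return sum(errors) / len(errors)


# Hypothetical toy series
series = [80.0, 90.0, 100.0, 200.0, 400.0]
train, test = temporal_split(series, test_size=2)

# Naive forecast: repeat the last training value for every test step
naive_pred = [train[-1]] * len(test)
score = mape(test, naive_pred)
print(score)  # 0.625
```

Unlike a random split, the temporal split never lets the model see observations that come after the ones it is asked to predict.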

### Time Series Classification

In [None]:
from sktime.classification.all import *
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
X, y = load_arrow_head(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y)
classifier = TimeSeriesForestClassifier()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
accuracy_score(y_test, y_pred) 

### Univariate Time Series Classification with sktime

In univariate time series classification, each instance consists of a single time series variable and a corresponding label. The aim is to find a classifier that learns the relationship between the time series and their labels, and can then predict the label of a new series.
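
As a rough illustration of this setup, the sketch below uses hypothetical toy data: each instance is one series with one label, and a 1-nearest-neighbour rule with Euclidean distance stands in for a real classifier. It is not the method used later in this notebook.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_1nn(train_series, train_labels, new_series):
    """Return the label of the training series closest to `new_series`."""
    distances = [euclidean(s, new_series) for s in train_series]
    nearest = distances.index(min(distances))
    return train_labels[nearest]


# Hypothetical training instances: one univariate series and one label each
train_series = [
    [0.0, 1.0, 2.0, 3.0],
    [0.5, 1.5, 2.5, 3.5],
    [3.0, 2.0, 1.0, 0.0],
    [3.5, 2.5, 1.5, 0.5],
]
train_labels = ["rising", "rising", "falling", "falling"]

new_series = [0.2, 1.1, 2.2, 2.9]
predicted = predict_1nn(train_series, train_labels, new_series)
print(predicted)  # rising
```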

In [None]:
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier
from sktime.classification.all import TimeSeriesForestClassifier
from sktime.datasets import load_arrow_head
from sktime.utils.slope_and_trend import _slope 

Loading data

In this notebook, we use the arrowhead problem.

The arrowhead dataset is a time series dataset containing outlines of images of arrowheads. In anthropology, the classification of projectile points is an important topic. The classes are based on shape distinctions, e.g. the presence and location of a notch in the arrow.

Data representation

In [None]:
X, y = load_arrow_head(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y)
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)
#(158, 1) (158,) (53, 1) (53,)
# univariate time series input data
X_train.head() 

Multi-class target variable (the arrowhead data has three classes)

In [None]:
labels, counts = np.unique(y_train, return_counts=True)
print(labels, counts) 

In [None]:
fig, ax = plt.subplots(1, figsize=plt.figaspect(0.25))
for label in labels:
    X_train.loc[y_train == label, "dim_0"].iloc[0].plot(ax=ax, label=f"class {label}")
plt.legend()
ax.set(title="Example time series", xlabel="Time");

Time series forest

In [None]:
from sktime.transformations.panel.summarize import RandomIntervalFeatureExtractor
steps = [
    (
        "extract",
        RandomIntervalFeatureExtractor(
            n_intervals="sqrt", features=[np.mean, np.std, _slope]
        ),
    ),
    ("clf", DecisionTreeClassifier()),
]
time_series_tree = Pipeline(steps) 
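
The feature-extraction step in the pipeline above can be pictured with a small library-free sketch. It assumes that random intervals are drawn from the series and that the mean, standard deviation and slope are computed on each one. The helpers `slope` and `random_interval_features`, the `min_length` and `seed` parameters, and the toy series are hypothetical; sktime's `RandomIntervalFeatureExtractor` may choose intervals differently.

```python
import math
import random
import statistics


def slope(values):
    """Least-squares slope of `values` against their index 0..n-1."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def random_interval_features(series, n_intervals, min_length=3, seed=0):
    """Return [mean, std, slope] for each of `n_intervals` random intervals."""
    rng = random.Random(seed)
    features = []
    for _ in range(n_intervals):
        start = rng.randint(0, len(series) - min_length)
        end = rng.randint(start + min_length, len(series))
        window = series[start:end]
        features += [
            statistics.mean(window),
            statistics.pstdev(window),  # population std, like np.std
            slope(window),
        ]
    return features


# Hypothetical toy series: a straight line with slope 2
series = [2.0 * t + 1.0 for t in range(16)]
n_intervals = math.isqrt(len(series))  # mirrors the "sqrt" setting
row = random_interval_features(series, n_intervals)
print(len(row))  # 12
```

Each series becomes one fixed-length row of features, which is what allows an ordinary `DecisionTreeClassifier` to be trained on it.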

We can directly fit and evaluate the single time series tree (which is simply a pipeline).

In [None]:
time_series_tree.fit(X_train, y_train)
time_series_tree.score(X_test, y_test) 

The time series forest classifier builds an ensemble of trees like the single tree above. Here we configure it with 100 estimators.

In [None]:
tsf = TimeSeriesForestClassifier(
    n_estimators=100,
    random_state=1,
    n_jobs=-1,
) 

Fitting and obtaining the out-of-bag score:

In [None]:
tsf.fit(X_train, y_train)
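# Print the out-of-bag score only if this sktime version exposes it and it is enabled
if getattr(tsf, "oob_score", False):
    print(tsf.oob_score_)
# Refit with default settings and score on the held-out test set
tsf = TimeSeriesForestClassifier()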
tsf.fit(X_train, y_train)
tsf.score(X_test, y_test) 
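
The out-of-bag score relies on bootstrap sampling: each tree is trained on a sample drawn with replacement, and the instances that were never drawn can serve as a validation set for that tree. The sketch below shows only the sampling step; `bootstrap_split` and its `seed` parameter are hypothetical helpers, not sktime functions.

```python
import random


def bootstrap_split(n_samples, seed=0):
    """Draw a bootstrap sample of indices and return the left-out indices."""
    rng = random.Random(seed)
    # Sample n_samples indices with replacement (duplicates are expected)
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    # Indices that were never drawn are "out of bag" for this tree
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag


n_samples = 1000
in_bag, out_of_bag = bootstrap_split(n_samples)
oob_fraction = len(out_of_bag) / n_samples
print(oob_fraction)
```

For large samples, roughly a third of the instances end up out of bag for any one tree, so every tree has some unseen data to be evaluated on.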

We can also plot the feature importances over time to see how much the different features and intervals contribute.

In [None]:
fi = tsf.feature_importances_
fig, ax = plt.subplots(1, figsize=plt.figaspect(0.25))
plt.plot(fi)
ax.set(xlabel="Time", ylabel="Feature importance"); 

# **Related Articles:**

> * [Sktime](https://analyticsindiamag.com/sktime-library/)

> * [Time Series Forecasting with Streamlit](https://analyticsindiamag.com/how-to-deploy-time-series-forecasting-models-using-streamlit/)

> * [STRIPE](https://analyticsindiamag.com/guide-to-stripe-shape-and-time-diversity-in-probabilistic-forecast/)

> * [SelfTime](https://analyticsindiamag.com/guide-to-selftime-self-supervised-time-series-representation-learning-framework-with-python-code/)

> * [Giotto Time](https://analyticsindiamag.com/guide-to-giotto-time-a-time-series-forecasting-python-library/)

> * [Facebook Prophet](https://analyticsindiamag.com/comprehensive-guide-to-facebooks-prophet-with-python-code/)
