# Modeling with SKTIME

## Reference

This notebook is a template for training and evaluating any type of SKTIME classifier. The classifier API reference is here: http://www.sktime.net/en/latest/api_reference/classification.html

### Import Dependencies

In [1]:
# general 
import numpy as np
import pandas as pd
!pip install sktime

# for data pre-processing
from sklearn.model_selection import train_test_split

# for model evaluation
from sklearn.metrics import multilabel_confusion_matrix, accuracy_score, confusion_matrix, classification_report



### Load datasets

Edit the paths below to match the location of the data on your computer.

In [2]:
#loading our preprocessed datasets
X = np.load('/Users/ronjaweiblen/Bootcamp/Capstone_Project_SignMeUp/data/X-data.npy')
y = np.load('/Users/ronjaweiblen/Bootcamp/Capstone_Project_SignMeUp/data/y-data.npy')

# collapse y to 1-D integer class labels (argmax over the class axis), as SKTIME expects
y = np.argmax(y, axis=1)

# defining signs --> edit for the specific subset of data
actions = np.array(['alligator', 'radio', 'moon', 'sleep', 'grandpa', 'tiger', 'pencil', 'sleepy', 'grandma', 'chocolate'])
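
To make the label conversion above concrete, here is a minimal sketch. It assumes `y` is stored one-hot encoded (one column per sign), which the `argmax` call suggests but the notebook does not state. The toy arrays and the name `y_onehot` are illustrative only.

```python
import numpy as np

# toy subset of the sign names (illustrative, not the full list)
actions = np.array(['alligator', 'radio', 'moon'])

# hypothetical one-hot labels: 4 samples, 3 classes
y_onehot = np.array([
    [1, 0, 0],
    [0, 0, 1],
    [0, 1, 0],
    [0, 0, 1],
])

# argmax over the class axis gives one integer label per sample
y_int = np.argmax(y_onehot, axis=1)
print(y_int)           # [0 2 1 2]

# integer labels index back into the sign names
names = actions[y_int]
print(names)           # ['alligator' 'moon' 'radio' 'moon']
```

The same indexing, `actions[y_pred]`, can turn predicted integer labels back into sign names when reading the reports further down.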

### Splitting Train and Test Data

In [3]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)
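
The split above is random and unseeded, so the class counts in the test set and the reported scores vary between runs. A possible variation, sketched here on synthetic data, fixes `random_state` and passes `stratify` so each sign keeps roughly the same share in both sets. The array shape (instances, channels, time points) and the `_demo` names are assumptions for illustration, not the project's data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# synthetic stand-in: 40 series, 3 channels, 10 time points (assumed layout)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 3, 10))
y_demo = np.repeat(np.arange(4), 10)  # 4 classes, 10 samples each

# stratify keeps class proportions similar in train and test;
# random_state makes the split reproducible
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, stratify=y_demo, random_state=42
)

print(X_tr.shape, X_te.shape)
print(np.bincount(y_te, minlength=4))  # test samples per class
```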

**Here you can add all kinds of models from SKTIME. Just follow the steps of model 1.**

### Model 1 : KNN Time Series 

Step 1: Import Model

In [4]:
from sktime.classification.distance_based import KNeighborsTimeSeriesClassifier

Step 2: Train Model

In [5]:
classifier = KNeighborsTimeSeriesClassifier(distance="euclidean")
classifier.fit(X_train, y_train)

KNeighborsTimeSeriesClassifier(distance='euclidean')

Step 3: Make Predictions

In [6]:
y_pred = classifier.predict(X_test)



Step 4: Test Model

In [7]:
# accuracy
#print(accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))
# confusion matrix
print(confusion_matrix(y_test, y_pred))
#multilabel confusion matrix
multilabel_confusion_matrix(y_test, y_pred)



              precision    recall  f1-score   support

           0       0.42      0.62      0.50         8
           1       0.18      0.22      0.20         9
           2       0.00      0.00      0.00         6
           3       0.44      0.44      0.44         9
           4       0.50      0.20      0.29        10
           5       0.36      0.36      0.36        11
           6       0.22      0.33      0.27         6
           7       0.29      0.22      0.25         9
           8       0.12      0.11      0.12         9
           9       0.20      0.33      0.25         6

    accuracy                           0.29        83
   macro avg       0.27      0.29      0.27        83
weighted avg       0.29      0.29      0.28        83

[[5 2 0 0 0 0 1 0 0 0]
 [2 2 0 0 1 1 0 0 2 1]
 [1 1 0 0 0 2 0 1 1 0]
 [2 0 0 4 0 1 1 0 1 0]
 [2 1 1 0 2 0 2 0 1 1]
 [0 1 1 1 1 4 1 1 1 0]
 [0 0 0 2 0 0 2 0 1 1]
 [0 0 0 1 0 1 1 2 0 4]
 [0 2 0 1 0 2 0 2 1 1]
 [0 2 0 0 0 0 1 1 0 2]]


array([[[68,  7],
        [ 3,  5]],

       [[65,  9],
        [ 7,  2]],

       [[75,  2],
        [ 6,  0]],

       [[69,  5],
        [ 5,  4]],

       [[71,  2],
        [ 8,  2]],

       [[65,  7],
        [ 7,  4]],

       [[70,  7],
        [ 4,  2]],

       [[69,  5],
        [ 7,  2]],

       [[67,  7],
        [ 8,  1]],

       [[69,  8],
        [ 4,  2]]])
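
The array returned by `multilabel_confusion_matrix` holds one 2×2 matrix per class, laid out as `[[TN, FP], [FN, TP]]` in scikit-learn. As a sketch of how it relates to the 10×10 confusion matrix printed above, the per-class counts can be derived with plain NumPy. The variable names are illustrative.

```python
import numpy as np

# confusion matrix of the KNN model, copied from the output above
# (rows = true class, columns = predicted class)
cm = np.array([
    [5, 2, 0, 0, 0, 0, 1, 0, 0, 0],
    [2, 2, 0, 0, 1, 1, 0, 0, 2, 1],
    [1, 1, 0, 0, 0, 2, 0, 1, 1, 0],
    [2, 0, 0, 4, 0, 1, 1, 0, 1, 0],
    [2, 1, 1, 0, 2, 0, 2, 0, 1, 1],
    [0, 1, 1, 1, 1, 4, 1, 1, 1, 0],
    [0, 0, 0, 2, 0, 0, 2, 0, 1, 1],
    [0, 0, 0, 1, 0, 1, 1, 2, 0, 4],
    [0, 2, 0, 1, 0, 2, 0, 2, 1, 1],
    [0, 2, 0, 0, 0, 0, 1, 1, 0, 2],
])

tp = np.diag(cm)               # correctly predicted samples per class
fp = cm.sum(axis=0) - tp       # predicted as the class, but another class
fn = cm.sum(axis=1) - tp       # belongs to the class, predicted otherwise
tn = cm.sum() - tp - fp - fn   # everything else

# rebuild the per-class [[TN, FP], [FN, TP]] layout, shape (10, 2, 2)
per_class = np.stack(
    [np.stack([tn, fp], axis=1), np.stack([fn, tp], axis=1)], axis=1
)
print(per_class[0])            # [[68  7]
                               #  [ 3  5]]

accuracy = tp.sum() / cm.sum()
print(round(float(accuracy), 2))  # 0.29
```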

Step 5: Save Model

In [8]:
# This is where I assume MLflow would come in
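
Until experiment tracking is set up, one simple option is to serialize the fitted model to disk. The sketch below uses Python's `pickle` with a small scikit-learn `KNeighborsClassifier` on synthetic data as a stand-in. The assumption is that the fitted SKTIME classifier can be pickled the same way, which should be checked for the specific estimator. The file name `model.pkl` is hypothetical.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# stand-in model on synthetic data (not the notebook's classifier)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 5))
y_demo = np.repeat([0, 1], 10)
model = KNeighborsClassifier(n_neighbors=3).fit(X_demo, y_demo)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "model.pkl"  # hypothetical file name

    # save the fitted model
    with open(model_path, "wb") as f:
        pickle.dump(model, f)

    # load it back
    with open(model_path, "rb") as f:
        restored = pickle.load(f)

# the restored model should give identical predictions
same = np.array_equal(model.predict(X_demo), restored.predict(X_demo))
print(same)  # True
```

Pickled models are tied to the library versions they were saved with, so it helps to record the installed sktime and scikit-learn versions alongside the file.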

### Model 2 : ROCKET

Step 1: Import Model

In [9]:
from sktime.classification.kernel_based import RocketClassifier

Step 2: Train Model

In [10]:
clf = RocketClassifier(num_kernels=500) 
clf.fit(X_train, y_train) 

RocketClassifier(num_kernels=500)

Step 3: Make Predictions

In [11]:
y_pred = clf.predict(X_test)

Step 4: Test Model

In [12]:
# accuracy
#print(accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))
# confusion matrix
print(confusion_matrix(y_test, y_pred))
#multilabel confusion matrix
multilabel_confusion_matrix(y_test, y_pred)



              precision    recall  f1-score   support

           0       0.88      0.88      0.88         8
           1       0.67      0.44      0.53         9
           2       0.50      0.17      0.25         6
           3       0.64      0.78      0.70         9
           4       0.67      0.40      0.50        10
           5       0.39      0.64      0.48        11
           6       0.50      0.50      0.50         6
           7       0.50      0.56      0.53         9
           8       0.43      0.33      0.38         9
           9       0.56      0.83      0.67         6

    accuracy                           0.55        83
   macro avg       0.57      0.55      0.54        83
weighted avg       0.57      0.55      0.54        83

[[7 0 0 0 0 1 0 0 0 0]
 [0 4 0 0 1 1 1 0 2 0]
 [0 0 1 0 0 4 0 0 0 1]
 [0 0 0 7 1 0 0 1 0 0]
 [0 1 1 0 4 1 0 1 1 1]
 [0 0 0 2 0 7 1 0 1 0]
 [0 0 0 2 0 0 3 0 0 1]
 [1 0 0 0 0 2 0 5 0 1]
 [0 1 0 0 0 2 0 3 3 0]
 [0 0 0 0 0 0 1 0 0 5]]


array([[[74,  1],
        [ 1,  7]],

       [[72,  2],
        [ 5,  4]],

       [[76,  1],
        [ 5,  1]],

       [[70,  4],
        [ 2,  7]],

       [[71,  2],
        [ 6,  4]],

       [[61, 11],
        [ 4,  7]],

       [[74,  3],
        [ 3,  3]],

       [[69,  5],
        [ 4,  5]],

       [[70,  4],
        [ 6,  3]],

       [[73,  4],
        [ 1,  5]]])

Step 5: Save Model

In [13]:
# This is where I assume MLflow would come in

### Dealing with multivariate classification --> doesn't work yet!

Info from here: https://sktime-backup.readthedocs.io/en/v0.13.3/examples/02_classification.html



* Many classifiers (e.g. ROCKET, HC2) are configured to work with multivariate input.
* sktime offers two other ways of solving multivariate time series classification problems:
    * Concatenation: *ColumnConcatenator* joins the time series columns into a single long time series column, and a classifier is then applied to the concatenated data.
    * Dimension ensembling: *ColumnEnsembleClassifier* fits one classifier per time series column/dimension and combines their predictions through a voting scheme.
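
To illustrate the first option, the following NumPy sketch shows what concatenating columns means for the data layout. It assumes the input is a 3-D array of shape (instances, channels, time points). This is an illustration of the idea, not sktime's implementation of `ColumnConcatenator`.

```python
import numpy as np

# toy multivariate panel: 2 instances, 3 channels, 4 time points (assumed layout)
X_demo = np.arange(2 * 3 * 4).reshape(2, 3, 4)
n_instances, n_channels, n_timepoints = X_demo.shape

# put the channels one after another to form a single long univariate series
X_concat = X_demo.reshape(n_instances, 1, n_channels * n_timepoints)

print(X_concat.shape)   # (2, 1, 12)
print(X_concat[0, 0])   # [ 0  1  2  3  4  5  6  7  8  9 10 11]
```

The concatenated series is `n_channels` times longer than each original channel, which is one plausible reason fitting on concatenated data can become slow.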

Concatenating columns:

In [14]:
from sktime.transformations.panel.compose import ColumnConcatenator

#import classifier you want to use
from sktime.classification.dictionary_based import BOSSEnsemble


In [15]:

# pipeline: concatenate all dimensions, then classify the long univariate series
clf = ColumnConcatenator() * BOSSEnsemble(max_ensemble_size=3)
#clf.fit(X_train, y_train)
#clf.score(X_test, y_test)

In [16]:
clf

ClassifierPipeline(classifier=BOSSEnsemble(max_ensemble_size=3),
                   transformers=[ColumnConcatenator()])

In [17]:
clf.fit(X_train, y_train)



KeyboardInterrupt: 

In [None]:
#get column ensembler
from sktime.classification.compose import ColumnEnsembleClassifier

#get models we want to use
from sktime.classification.dictionary_based import WEASEL
from sktime.classification.dictionary_based import BOSSEnsemble

In [None]:
# needed for the TDE estimator in the ensemble below
from sktime.classification.dictionary_based import TemporalDictionaryEnsemble

In [None]:
#dimension ensembling for our data
clf = ColumnEnsembleClassifier(
    estimators=[
        ("WEASEL0", WEASEL(window_inc=4), [0]),  # classifier used for dimension 0
        ("TDE3", TemporalDictionaryEnsemble(max_ensemble_size=5), [3]),  # classifier used for dimension 3
    ]
)
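
The idea behind dimension ensembling can be sketched without sktime: each dimension's classifier predicts a label and the predictions are combined. The example below uses a plain majority vote over made-up predictions. It is one possible voting scheme chosen for illustration and is not claimed to be what `ColumnEnsembleClassifier` does internally.

```python
import numpy as np

# made-up predictions: one row per per-dimension classifier, one column per sample
preds = np.array([
    [0, 1, 2, 1],   # classifier fitted on dimension 0
    [0, 2, 2, 1],   # classifier fitted on dimension 3
    [1, 1, 2, 0],   # a hypothetical third classifier
])
n_classes = 3

# count the votes each class receives per sample, shape (n_classes, n_samples)
votes = np.stack([(preds == c).sum(axis=0) for c in range(n_classes)])

# majority vote; ties go to the lowest class index because of argmax
y_vote = votes.argmax(axis=0)
print(y_vote)  # [0 1 2 1]
```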
