# Timeseries Classification with the `sert` Python Package

**Author:** Amin Shoari Nejad | **Date created:** 2023/09/04

In this notebook, we demonstrate how to use the `sert` package for time series classification. The workflow covers data loading, preprocessing, model instantiation, compilation, training, prediction, and evaluation.

# 1. Imports

Assuming that you have installed the sert package, we can import the necessary modules.

In [2]:
from sert.models import TimeSERT
from sert.preprocessing import SeqDataPreparer
from sert.datasets import ts_classification
from sert.losses import WeightedCrossentropy

- **TimeSERT**: This is one of the primary classes of the `sert` package, suitable for both time series classification and multivariate forecasting. In this notebook, we'll use it for time series classification.

- **SeqDataPreparer**: This class assists in preparing data for time series classification. Models within the `sert` package expect data in a particular format, and `SeqDataPreparer` facilitates this transformation. It employs the same `fit_transform` and `transform` syntax found in scikit-learn.

- **ts_classification**: This function creates a thousand irregular time series of varied lengths. Specifically, both the number of observations and their respective times are randomly determined. 30% of these time series are chosen at random to be anomalous, signifying the presence of an outlier. The function returns the time series data and labels, with "one" indicating a time series containing an outlier and "zero" otherwise. The objective is to train a model that can discern whether a time series is anomalous. Users can adjust various parameters of the function, like the number of time series or the balance ratio. For comprehensive details, invoke `help(ts_classification)`.

- **WeightedCrossentropy**: This is a custom loss function for training models, essentially a weighted version of the cross-entropy loss. The per-class weights control how much each class contributes to the loss. For instance, when the classes are imbalanced, the underrepresented class can be given a larger weight. In this notebook, we'll use this loss function for training, but with equal weights for both classes.
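
To give a feel for the kind of data `ts_classification` describes, here is a minimal sketch that builds a few irregular series in the same long format (`id`, `time`, `value`, `var_name`). The function `make_irregular_series`, its parameters, the value ranges, and the outlier size are all assumptions made for illustration; this is not the package's implementation, and it labels each series as anomalous independently with the given probability rather than selecting an exact fraction.

```python
import numpy as np
import pandas as pd


def make_irregular_series(n_series=5, anomaly_ratio=0.3, seed=0):
    """Hypothetical stand-in for a generator like ts_classification."""
    rng = np.random.default_rng(seed)
    frames, labels = [], []
    for series_id in range(n_series):
        n_obs = int(rng.integers(20, 60))                  # random number of observations
        times = np.sort(rng.uniform(0, 100, size=n_obs))   # irregularly spaced times
        values = rng.uniform(0, 1, size=n_obs)
        is_anomalous = bool(rng.random() < anomaly_ratio)
        if is_anomalous:
            values[rng.integers(n_obs)] += 5.0             # inject a single outlier
        labels.append(int(is_anomalous))
        frames.append(pd.DataFrame({"id": series_id, "time": times,
                                    "value": values, "var_name": "var1"}))
    return pd.concat(frames, ignore_index=True), np.array(labels)


X, y = make_irregular_series()
print(X.head())
print(y)
```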


# 2. Load the Dataset

The `sert` package conveniently provides a synthetic time series classification dataset for demonstration purposes.

In [2]:
X_train, X_test, y_train, y_test = ts_classification()
X_train

| Unnamed: 0 | id | time | value | var_name |
|---|---|---|---|---|
| 0 | 0 | 1.220307 | 0.886436 | var1 |
| 1 | 0 | 6.661911 | 0.281329 | var1 |
| 2 | 0 | 10.291603 | 0.532506 | var1 |
| 3 | 0 | 14.312799 | 0.546110 | var1 |
| 4 | 0 | 16.673076 | 0.836684 | var1 |
| ... | ... | ... | ... | ... |
| 37432 | 699 | 85.584778 | 0.156501 | var1 |
| 37433 | 699 | 89.386548 | 0.604532 | var1 |
| 37434 | 699 | 92.356314 | 0.609343 | var1 |
| 37435 | 699 | 94.740727 | 0.075573 | var1 |


In the next step we'll use the `SeqDataPreparer` to transform the data into a format suitable for training. 

# 3. Data Preprocessing

To feed our data into the **TimeSERT** model, we need to perform some preprocessing steps. Specifically, we'll convert the data into a list of numpy arrays suitable for the model. We use the `SeqDataPreparer` class for this, which accepts a single argument: `token_capacity`. This is the maximum number of observations (a.k.a. tokens) expected in a sequence; below, we set it to the length of the longest sequence in the training data.

After instantiating the `SeqDataPreparer` class, you'll need to:

- Call the `fit_transform` method on the training data. This method returns a list of numpy arrays prepared for training.
- Call the `transform` method on the test data. This method returns a list of numpy arrays ready for testing. Note that this method assumes that the `fit_transform` method has already been called on the training data and the training and test data are of the same format.

In addition to the data itself, the `fit_transform` method requires four arguments, each naming a column of the input:

1. **index**: The name of the column that contains the unique identifier for each sequence. In our dataset, this is the `id` column.
2. **times**: The name of the column that contains the time variable for each observation. For us, it's the `time` column.
3. **names**: The name of the column that contains the variable name for each observation. In this example, it's the `var_name` column. Note that while this example uses a single variable, `var1`, you could generally have multiple variables.
4. **values**: This refers to the column that contains the values of the variable for each observation. In our example, this is the `value` column.


In [8]:
# Determine the token capacity based on the maximum sequence length in the training set
token_cap = X_train.groupby('id').size().max()

processor = SeqDataPreparer(token_capacity=token_cap)

train_input = processor.fit_transform(X_train,
                                      index='id',
                                      times='time',
                                      names='var_name',
                                      values='value')

test_input = processor.transform(X_test)
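
To build intuition for what `token_capacity` controls, the sketch below pads each sequence of a toy long-format table to a fixed length and keeps a mask of real versus padded positions. The helper `pad_column`, the zero padding value, and the resulting array layout are assumptions chosen for illustration; the arrays actually produced by `SeqDataPreparer` may be organized differently.

```python
import numpy as np
import pandas as pd

# Toy long-format data: two sequences with 3 and 2 observations
df = pd.DataFrame({
    "id":    [0, 0, 0, 1, 1],
    "time":  [1.2, 6.7, 10.3, 0.5, 4.1],
    "value": [0.89, 0.28, 0.53, 0.40, 0.75],
})

# Longest sequence determines the capacity, as in the cell above
token_capacity = int(df.groupby("id").size().max())


def pad_column(frame, column, capacity, pad_value=0.0):
    """Stack one column per sequence into a (num_sequences, capacity) array."""
    out = np.full((frame["id"].nunique(), capacity), pad_value)
    for row, (_, group) in enumerate(frame.groupby("id")):
        seq = group[column].to_numpy()[:capacity]   # truncate sequences that are too long
        out[row, :len(seq)] = seq
    return out


times = pad_column(df, "time", token_capacity)
values = pad_column(df, "value", token_capacity)
# 1.0 marks a real observation, 0.0 marks padding
mask = pad_column(df.assign(observed=1.0), "observed", token_capacity)
print(values)
print(mask)
```

One practical consequence of a fixed capacity: if a test sequence is longer than every training sequence, it exceeds a capacity computed from the training set alone, so it is worth comparing the longest sequence in `X_test` against `token_cap` before calling `transform`.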

# 4. Model Instantiation

Now, we'll instantiate the TimeSERT model with appropriate hyperparameters.

In [9]:
model = TimeSERT(num_var=1,
                 emb_dim=15,
                 num_head=3,
                 ffn_dim=5,
                 num_repeat=1,
                 num_out=y_train.shape[1],
                 task='classification')

### Hyperparameters:

- **num_var**: The number of variables in the dataset, which the embedding layer needs in order to encode variable names. In this case, there's 1 variable, which is `var1`.
- **emb_dim**: Dimension of the embedding layer. Represents the dimension of the latent space.
- **num_head**: Number of attention heads.
- **ffn_dim**: Dimension of the feedforward layer.
- **num_repeat**: Number of times the encoder block is repeated.
- **num_out**: Number of output classes.

The `emb_dim`, `num_head`, `ffn_dim`, and `num_repeat` hyperparameters determine the size of the model: the larger these values are, the more complex the model becomes. `num_out` is set to `y_train.shape[1]`, which equals 2 here because there are two classes: normal and anomalous. The `task` argument is set to `classification` since the goal is time series classification. For regression, set `task` to `regression`, which is the default.


# 5. Model Compilation

The instantiated model is a TensorFlow model. We must compile it using the `compile` method, specifying the optimizer, loss function, and metrics we want to track, as with any TensorFlow model. For this example, we are using the Adam optimizer, the weighted cross-entropy loss function provided by the package, and the accuracy as the metric. Note that the `WeightedCrossentropy` object requires the target variable to be one-hot encoded. Here's what the binary target in this example looks like: 

In [4]:
print(y_train)

[[1 0]
 [1 0]
 [1 0]
 ...
 [1 0]
 [1 0]
 [0 1]]
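
For readers unfamiliar with one-hot encoding, this short NumPy sketch converts integer class labels into the two-column layout shown above and back again. It assumes that column 0 corresponds to the normal class and column 1 to the anomalous class; the label values are made up for illustration.

```python
import numpy as np

labels = np.array([0, 0, 1, 0, 1])        # assumed: 0 = normal, 1 = anomalous

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(2, dtype=int)[labels]    # shape (5, 2)

# argmax along the class axis recovers the integer labels
recovered = one_hot.argmax(axis=1)

print(one_hot)
print(recovered)
```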


In [10]:
model.compile(optimizer='adam',
              loss=WeightedCrossentropy([1, 1]),
              metrics=['accuracy'])

In this example, we use the weight vector `[1, 1]` for the weighted cross-entropy loss function, so both classes are treated with equal importance. You can adjust the weights to prioritize one class over the other. For instance, if the classes are imbalanced, you might want to assign a higher weight to the underrepresented class. You can use `compute_class_weight` from scikit-learn to compute the weights and pass them to the loss function as follows:

```python
import numpy as np
import tensorflow as tf
from sklearn.utils.class_weight import compute_class_weight

# Convert the one-hot targets back to integer class labels (0 or 1)
y_labels = y_train.argmax(axis=1)

class_weights = compute_class_weight(class_weight='balanced', classes=np.array([0, 1]), y=y_labels)
class_weights = tf.cast(class_weights, dtype=tf.float32)
model.compile(optimizer='adam', loss=WeightedCrossentropy(class_weights), metrics=['accuracy'])
```
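
To make the effect of the weights concrete, the following NumPy sketch implements one common definition of weighted cross-entropy, together with the "balanced" heuristic `n_samples / (n_classes * class_count)`. The function names `balanced_weights` and `weighted_crossentropy` are illustrative, and the package's `WeightedCrossentropy` may differ in details such as clipping or reduction.

```python
import numpy as np


def balanced_weights(y_int, n_classes=2):
    """'Balanced' heuristic: n_samples / (n_classes * count_per_class)."""
    counts = np.bincount(y_int, minlength=n_classes)
    return len(y_int) / (n_classes * counts)


def weighted_crossentropy(y_true, y_prob, weights, eps=1e-7):
    """Mean over samples of -sum_c weights[c] * y_true[c] * log(y_prob[c])."""
    y_prob = np.clip(y_prob, eps, 1.0)     # avoid log(0)
    per_sample = -np.sum(weights * y_true * np.log(y_prob), axis=1)
    return per_sample.mean()


y_int = np.array([0] * 7 + [1] * 3)        # 70% normal, 30% anomalous
w = balanced_weights(y_int)
y_true = np.eye(2)[y_int]                  # one-hot targets
y_prob = np.full((10, 2), 0.5)             # an uninformative classifier

print(w)
print(weighted_crossentropy(y_true, y_prob, w))
```

With a 70/30 class split, the balanced weights come out at roughly 0.71 for the normal class and 1.67 for the anomalous class, so an error on an anomalous series contributes more to the loss than the same error on a normal one.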

# 6. Model Training

With our data and model ready, we can now train the TimeSERT model.

In [11]:
model.fit(train_input, y_train, epochs=100, batch_size=250, verbose=0)

<keras.src.callbacks.History at 0x12268f670>

# 7. Predictions

Post-training, we can use the model to make predictions on the test set.

In [None]:
y_pred = model.predict(test_input)
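
The predictions can be turned into class labels by taking the highest-scoring class per row. The sketch below assumes that `model.predict` returns an array of shape `(n_samples, 2)` with one score per class, which is consistent with `num_out=2` but is not confirmed here; `y_pred_demo` is a made-up array standing in for the real output.

```python
import numpy as np

# Hypothetical scores standing in for model.predict(test_input)
y_pred_demo = np.array([[0.92, 0.08],
                        [0.15, 0.85],
                        [0.60, 0.40]])

# Index of the highest-scoring class in each row
pred_labels = y_pred_demo.argmax(axis=1)
print(pred_labels)  # [0 1 0]
```

Applying the same `argmax(axis=1)` to the one-hot `y_test` gives integer labels that can be compared against the predicted ones directly.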

# 8. Model Evaluation

Lastly, let's evaluate the model's performance on the test dataset.

In [12]:
_, accuracy = model.evaluate(test_input, y_test)
print(f"Test Accuracy: {accuracy:.2f}")

Test Accuracy: 1.00
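
Because only about 30% of the series are anomalous, a model that always predicts "normal" would already reach roughly 70% accuracy, so precision and recall for the anomalous class are useful complements. The sketch below computes them with NumPy on made-up label arrays (`y_true_demo` and `y_pred_labels` are illustrative, not results from this notebook).

```python
import numpy as np

# Made-up integer labels: 1 = anomalous, 0 = normal
y_true_demo = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])
y_pred_labels = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 0])

tp = int(np.sum((y_pred_labels == 1) & (y_true_demo == 1)))  # anomalies correctly flagged
fp = int(np.sum((y_pred_labels == 1) & (y_true_demo == 0)))  # normal series flagged by mistake
fn = int(np.sum((y_pred_labels == 0) & (y_true_demo == 1)))  # anomalies that were missed

accuracy_demo = float(np.mean(y_pred_labels == y_true_demo))
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(f"accuracy={accuracy_demo:.2f} precision={precision:.2f} recall={recall:.2f}")
```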


# Conclusion

This notebook demonstrated how to effectively use the `sert` package for time series classification. With its intuitive API and preprocessing utilities, working with irregular time series data becomes efficient and straightforward.

### A few notes:

- In this tutorial, we didn't discuss hyperparameter tuning, as the goal was to demonstrate how to work with the package API. Nonetheless, the model reached a test accuracy of 1.00 with arbitrarily chosen hyperparameters.

- The package also provides an alternative to `TimeSERT` called `TimeSERNN`, which doesn't use the transformer architecture and relies only on set encoding and feedforward layers. `TimeSERNN` runs much faster but may sacrifice some predictive performance. It has fewer hyperparameters, and you can simply replace `TimeSERT` with it, as shown below:

```python
from sert.models import TimeSERNN

model = TimeSERNN(num_var=1,
                  emb_dim=15,
                  num_out=y_train.shape[1],
                  task='classification')
```