<a href="https://github.com/timeseriesAI/tsai-rs" target="_parent"><img src="https://img.shields.io/badge/tsai--rs-Time%20Series%20AI%20in%20Rust-blue" alt="tsai-rs"/></a>

# Intro to Time Series Classification with tsai-rs

This notebook demonstrates time series classification using **tsai-rs**, a Rust implementation of the tsai library with Python bindings.

## Purpose

This notebook shows how to:
1. Import tsai-rs and load UCR datasets
2. Prepare time series data
3. Configure models (InceptionTimePlus, PatchTST, etc.)
4. Use analysis tools (confusion matrix, metrics)
5. Apply transforms (standardization, augmentation)

## Install tsai-rs

First, build and install the tsai-rs Python bindings:

```bash
cd crates/tsai_python
maturin develop --release
```

## Import Libraries

In [None]:
import tsai_rs
import numpy as np
import sklearn.metrics as skm

# Display version info
print(f"tsai-rs version: {tsai_rs.version()}")
tsai_rs.my_setup()

## Prepare Data

### List Available Datasets

tsai-rs provides access to UCR Time Series Classification datasets.

In [None]:
# List univariate datasets (128 datasets)
univariate_datasets = tsai_rs.get_UCR_univariate_list()
print(f"Available univariate datasets ({len(univariate_datasets)}):")
print(univariate_datasets[:20], "...")

In [None]:
# List multivariate datasets (30 datasets)
multivariate_datasets = tsai_rs.get_UCR_multivariate_list()
print(f"Available multivariate datasets ({len(multivariate_datasets)}):")
print(multivariate_datasets)

### Download and Load Data

Let's load NATOPS, a multivariate time series classification dataset.

In [None]:
# Load dataset with train/test split
dsid = 'NATOPS'
X_train, y_train, X_test, y_test = tsai_rs.get_UCR_data(dsid, return_split=True)

print(f"Dataset: {dsid}")
print(f"X_train shape: {X_train.shape}")
print(f"y_train shape: {y_train.shape}")
print(f"X_test shape: {X_test.shape}")
print(f"y_test shape: {y_test.shape}")

In [None]:
# Alternatively, load combined data with split indices
X, y, splits = tsai_rs.get_UCR_data(dsid, return_split=False)
train_idx, test_idx = splits

print(f"Combined X shape: {X.shape}")
print(f"Combined y shape: {y.shape}")
print(f"Train indices: {len(train_idx)}, Test indices: {len(test_idx)}")

### Data Format

Time series data in tsai-rs uses the format: `(N, V, L)`
- **N**: Number of samples
- **V**: Number of variables/channels
- **L**: Sequence length (time steps)
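
As a quick illustration of this layout, the NumPy sketch below builds a synthetic array in `(N, V, L)` order and shows two common conversions into it. The array names and sizes are made up for the example and do not come from any UCR dataset.

```python
import numpy as np

# Synthetic stand-in for a dataset: 4 samples, 3 variables, 50 time steps.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(4, 3, 50)).astype(np.float32)

first_sample = X_demo[0]         # shape (V, L): every variable of one sample
first_channel = X_demo[:, 0, :]  # shape (N, L): one variable across all samples

# Data stored time-major as (N, L, V) can be moved into (N, V, L) with a transpose.
X_time_major = rng.normal(size=(4, 50, 3))
X_channels_first = np.transpose(X_time_major, (0, 2, 1))

# A univariate 2D array (N, L) gains a singleton variable axis.
X_uni = rng.normal(size=(4, 50))
X_uni_3d = X_uni[:, np.newaxis, :]

print(X_demo.shape, X_channels_first.shape, X_uni_3d.shape)
```

Indexing the first axis selects a sample, the second a variable, and the third a time step.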

In [None]:
n_samples, n_vars, seq_len = X_train.shape
n_classes = len(np.unique(y_train))

print(f"Samples: {n_samples}")
print(f"Variables: {n_vars}")
print(f"Sequence length: {seq_len}")
print(f"Number of classes: {n_classes}")
print(f"Classes: {np.unique(y_train)}")

### Create TSDataset

In [None]:
# Create TSDataset objects for train and test
train_ds = tsai_rs.TSDataset(X_train, y_train)
test_ds = tsai_rs.TSDataset(X_test, y_test)

print(f"Train dataset: {train_ds}")
print(f"Test dataset: {test_ds}")
print(f"Train n_vars: {train_ds.n_vars}, seq_len: {train_ds.seq_len}")

## Configure Model

tsai-rs provides configurations for state-of-the-art time series models.

In [None]:
# InceptionTimePlus configuration
inception_config = tsai_rs.InceptionTimePlusConfig(
    n_vars=n_vars,
    seq_len=seq_len,
    n_classes=n_classes,
    n_blocks=6,
    n_filters=32
)
print(f"InceptionTimePlus: {inception_config}")
print(f"Config JSON:\n{inception_config.to_json()}")

In [None]:
# ResNetPlus configuration
resnet_config = tsai_rs.ResNetPlusConfig(
    n_vars=n_vars,
    seq_len=seq_len,
    n_classes=n_classes
)
print(f"ResNetPlus: {resnet_config}")

In [None]:
# PatchTST configuration (Transformer-based)
patchtst_config = tsai_rs.PatchTSTConfig.for_classification(
    n_vars=n_vars,
    seq_len=seq_len,
    n_classes=n_classes
)
print(f"PatchTST: {patchtst_config}")
print(f"  d_model: {patchtst_config.d_model}")
print(f"  n_heads: {patchtst_config.n_heads}")
print(f"  n_layers: {patchtst_config.n_layers}")
print(f"  n_patches: {patchtst_config.n_patches}")

In [None]:
# TST (Time Series Transformer) configuration
tst_config = tsai_rs.TSTConfig(
    n_vars=n_vars,
    seq_len=seq_len,
    n_classes=n_classes,
    d_model=128,
    n_heads=8,
    n_layers=3
)
print(f"TST: {tst_config}")

In [None]:
# RNNPlus configuration (LSTM/GRU)
rnn_config = tsai_rs.RNNPlusConfig(
    n_vars=n_vars,
    seq_len=seq_len,
    n_classes=n_classes,
    hidden_size=128,
    n_layers=2,
    rnn_type='lstm',
    bidirectional=True
)
print(f"RNNPlus: {rnn_config}")

In [None]:
# MiniRocket configuration
minirocket_config = tsai_rs.MiniRocketConfig(
    n_vars=n_vars,
    seq_len=seq_len,
    n_classes=n_classes,
    n_features=10000
)
print(f"MiniRocket: {minirocket_config}")

## Training Configuration
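
For intuition about the one-cycle schedule configured later in this section, here is a pure-Python sketch of a cosine-annealed one-cycle curve. The helper `one_cycle_lr` and its defaults (`pct_start=0.3`, `div_factor=25`, `final_div_factor=1e4`) are assumptions modeled on PyTorch's `OneCycleLR` defaults. They are not taken from tsai-rs, whose `OneCycleLR.simple` may use different phases or factors.

```python
import math

def one_cycle_lr(step, total_steps, max_lr,
                 pct_start=0.3, div_factor=25.0, final_div_factor=1e4):
    """Hypothetical one-cycle schedule: cosine warm-up, then cosine decay."""
    initial_lr = max_lr / div_factor
    min_lr = initial_lr / final_div_factor
    warmup_steps = max(1, int(pct_start * total_steps))

    def cos_interp(start, end, frac):
        # frac = 0 returns start, frac = 1 returns end.
        return end + (start - end) * (1 + math.cos(math.pi * frac)) / 2

    if step < warmup_steps:
        return cos_interp(initial_lr, max_lr, step / warmup_steps)
    decay_steps = max(1, total_steps - 1 - warmup_steps)
    frac = min(1.0, (step - warmup_steps) / decay_steps)
    return cos_interp(max_lr, min_lr, frac)

# 25 epochs with 2 steps per epoch gives 50 steps in total.
schedule = [one_cycle_lr(s, 50, 1e-3) for s in range(50)]
print(f"start={schedule[0]:.2e}, peak={max(schedule):.2e}, end={schedule[-1]:.2e}")
```

The learning rate starts well below `max_lr`, peaks after the warm-up phase, and then decays to a value far smaller than the starting rate.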

In [None]:
# Learner configuration
learner_config = tsai_rs.LearnerConfig(
    lr=1e-3,
    weight_decay=0.01,
    grad_clip=1.0
)
print(f"Learner config: {learner_config}")

In [None]:
# One-cycle learning rate scheduler
n_epochs = 25
steps_per_epoch = len(X_train) // 64  # batch_size = 64; floor division ignores a final partial batch
total_steps = n_epochs * steps_per_epoch

scheduler = tsai_rs.OneCycleLR.simple(max_lr=1e-3, total_steps=total_steps)

# Get LR schedule for visualization
lr_schedule = scheduler.get_lr_schedule(total_steps)
print(f"Total steps: {total_steps}")
print(f"LR at step 0: {scheduler.get_lr(0):.6f}")
print(f"LR at step {total_steps//2}: {scheduler.get_lr(total_steps//2):.6f}")
print(f"LR at step {total_steps-1}: {scheduler.get_lr(total_steps-1):.6f}")

## Data Preprocessing

### Standardization
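
Per-sample standardization rescales each sample to zero mean and unit standard deviation. The NumPy sketch below shows one way to compute it, assuming the statistics are taken over all variables and time steps of a sample. The helper `standardize_by_sample` and its `eps` parameter are hypothetical and may differ from what `ts_standardize` does internally.

```python
import numpy as np

def standardize_by_sample(x, eps=1e-8):
    """Standardize an (N, V, L) array using one mean and std per sample."""
    mean = x.mean(axis=(1, 2), keepdims=True)
    std = x.std(axis=(1, 2), keepdims=True)
    return (x - mean) / (std + eps)  # eps guards against constant samples

# Synthetic data with a non-zero mean and a large spread.
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=3.0, scale=5.0, size=(4, 3, 50)).astype(np.float32)
X_std = standardize_by_sample(X_demo)

print(f"before: mean={X_demo[0].mean():.4f}, std={X_demo[0].std():.4f}")
print(f"after:  mean={X_std[0].mean():.4f}, std={X_std[0].std():.4f}")
```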

In [None]:
# Standardize data (by sample)
X_train_std = tsai_rs.ts_standardize(X_train.astype(np.float32), by_sample=True)
X_test_std = tsai_rs.ts_standardize(X_test.astype(np.float32), by_sample=True)

print(f"Before standardization - mean: {X_train[0].mean():.4f}, std: {X_train[0].std():.4f}")
print(f"After standardization - mean: {X_train_std[0].mean():.4f}, std: {X_train_std[0].std():.4f}")

### Data Augmentation
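
Both augmentations used in this section are simple element-wise operations. The NumPy sketch below shows the idea under stated assumptions: the helpers `add_noise` and `magnitude_scale` are hypothetical, and drawing one scale factor per sample is a choice made for this example, not something confirmed for `mag_scale`.

```python
import numpy as np

def add_noise(x, std=0.1, seed=None):
    """Add zero-mean Gaussian noise with the given standard deviation."""
    rng = np.random.default_rng(seed)
    return x + rng.normal(0.0, std, size=x.shape).astype(x.dtype)

def magnitude_scale(x, scale_range=(0.8, 1.2), seed=None):
    """Multiply each sample by one factor drawn uniformly from scale_range."""
    rng = np.random.default_rng(seed)
    low, high = scale_range
    factors = rng.uniform(low, high, size=(x.shape[0], 1, 1)).astype(x.dtype)
    return x * factors

X_demo = np.ones((5, 2, 20), dtype=np.float32)
X_noisy_demo = add_noise(X_demo, std=0.1, seed=42)
X_scaled_demo = magnitude_scale(X_demo, scale_range=(0.8, 1.2), seed=42)

print(f"noise std: {(X_noisy_demo - X_demo).std():.4f}")
print(f"scale factors: {X_scaled_demo[:, 0, 0]}")
```

Fixing the seed makes an augmentation reproducible, which helps when comparing runs.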

In [None]:
# Add Gaussian noise
X_noisy = tsai_rs.add_gaussian_noise(X_train_std, std=0.1, seed=42)
print(f"Original sample std: {X_train_std[0].std():.4f}")
print(f"Noisy sample std: {X_noisy[0].std():.4f}")

In [None]:
# Magnitude scaling
X_scaled = tsai_rs.mag_scale(X_train_std, scale_range=(0.8, 1.2), seed=42)
print(f"Original max: {X_train_std[0].max():.4f}")
print(f"Scaled max: {X_scaled[0].max():.4f}")

## Time Series to Image Transforms
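
These transforms turn a series of length `L` into an `L x L` image. The NumPy sketch below follows the standard textbook definitions of Gramian Angular Fields and thresholded recurrence plots. The helper names, the min-max rescaling to `[-1, 1]`, and the absolute-difference distance are assumptions of this sketch and may not match the tsai-rs implementations exactly.

```python
import numpy as np

def gramian_angular_fields(ts):
    """Return (GASF, GADF) for a 1D series; assumes the series is not constant."""
    ts = np.asarray(ts, dtype=np.float64)
    lo, hi = ts.min(), ts.max()
    scaled = np.clip(2.0 * (ts - lo) / (hi - lo) - 1.0, -1.0, 1.0)
    phi = np.arccos(scaled)                      # polar-angle encoding
    gasf = np.cos(phi[:, None] + phi[None, :])   # summation field
    gadf = np.sin(phi[:, None] - phi[None, :])   # difference field
    return gasf, gadf

def recurrence_plot(ts, threshold=0.1):
    """Binary matrix marking pairs of time steps whose values are within threshold."""
    ts = np.asarray(ts, dtype=np.float64)
    dist = np.abs(ts[:, None] - ts[None, :])
    return (dist <= threshold).astype(np.float32)

ts_demo = np.sin(np.linspace(0, 4 * np.pi, 51))
gasf_demo, gadf_demo = gramian_angular_fields(ts_demo)
rp_demo = recurrence_plot(ts_demo, threshold=0.1)
print(gasf_demo.shape, gadf_demo.shape, rp_demo.shape)
```

Under these definitions the GASF is symmetric, the GADF is antisymmetric, and the recurrence plot has ones on its diagonal.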

In [None]:
# Get a single univariate time series
sample_ts = X_train_std[0, 0, :].astype(np.float32)  # First sample, first variable
print(f"Time series shape: {sample_ts.shape}")

In [None]:
# Compute GASF (Gramian Angular Summation Field)
gasf_image = tsai_rs.compute_gasf(sample_ts)
print(f"GASF image shape: {gasf_image.shape}")

In [None]:
# Compute GADF (Gramian Angular Difference Field)
gadf_image = tsai_rs.compute_gadf(sample_ts)
print(f"GADF image shape: {gadf_image.shape}")

In [None]:
# Compute Recurrence Plot
rp_image = tsai_rs.compute_recurrence_plot(sample_ts, threshold=0.1)
print(f"Recurrence plot shape: {rp_image.shape}")

## Analysis: Confusion Matrix and Metrics

Let's simulate some predictions to demonstrate the analysis tools.
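
For reference, the metrics reported by these tools can be derived from raw counts. The NumPy sketch below is a minimal version, assuming rows index true labels and columns index predictions (the scikit-learn convention). The helpers `confusion_counts` and `macro_f1` are hypothetical, and the orientation used by tsai-rs is not stated here.

```python
import numpy as np

def confusion_counts(y_true, y_pred, n_classes):
    """Count matrix with rows = true labels, columns = predicted labels."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

def macro_f1(cm):
    """Unweighted mean of per-class F1; classes with no support score 0."""
    tp = np.diag(cm).astype(np.float64)
    predicted = cm.sum(axis=0)
    actual = cm.sum(axis=1)
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    return f1.mean()

y_true_demo = np.array([0, 0, 1, 1, 2, 2])
y_pred_demo = np.array([0, 1, 1, 1, 2, 0])
cm_demo = confusion_counts(y_true_demo, y_pred_demo, n_classes=3)
accuracy_demo = np.trace(cm_demo) / cm_demo.sum()

print(cm_demo)
print(f"accuracy={accuracy_demo:.4f}, macro F1={macro_f1(cm_demo):.4f}")
```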

In [None]:
# Simulate predictions (in practice, these come from your trained model)
np.random.seed(42)
y_test_int = y_test.astype(np.int64)

# Create "predictions" that are at least 93% accurate
y_pred = y_test_int.copy()
n_wrong = int(len(y_test) * 0.07)  # reassign 7% of labels at random (some may land on the true class)
wrong_idx = np.random.choice(len(y_test), n_wrong, replace=False)
y_pred[wrong_idx] = np.random.randint(0, n_classes, n_wrong)

print(f"True labels: {y_test_int[:10]}")
print(f"Predictions: {y_pred[:10]}")

In [None]:
# Compute confusion matrix using tsai-rs
cm = tsai_rs.confusion_matrix(y_pred, y_test_int, n_classes=n_classes)
print(f"Confusion Matrix: {cm}")
print(f"\nAccuracy: {cm.accuracy():.4f}")
print(f"Macro F1: {cm.macro_f1():.4f}")

In [None]:
# Per-class metrics
print("\nPer-class metrics:")
for i in range(n_classes):
    print(f"  Class {i}: Precision={cm.precision(i):.4f}, Recall={cm.recall(i):.4f}, F1={cm.f1(i):.4f}")

In [None]:
# Get the confusion matrix as numpy array
cm_matrix = cm.matrix()
print(f"\nConfusion matrix:\n{cm_matrix}")

In [None]:
# Compare with sklearn
sklearn_accuracy = skm.accuracy_score(y_test_int, y_pred)
sklearn_f1 = skm.f1_score(y_test_int, y_pred, average='macro')
print(f"sklearn accuracy: {sklearn_accuracy:.4f}")
print(f"sklearn macro F1: {sklearn_f1:.4f}")
print(f"\ntsai-rs matches sklearn: {np.isclose(cm.accuracy(), sklearn_accuracy)}")

## Top Losses Analysis
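
A top-losses analysis ranks samples by their per-sample loss so the hardest examples can be inspected first. The NumPy sketch below shows the ranking step only. The helper `top_k_indices` is hypothetical and returns plain indices, whereas `top_losses` returns richer records.

```python
import numpy as np

def top_k_indices(losses, k=10):
    """Indices of the k largest losses, ordered from highest to lowest."""
    k = min(k, len(losses))
    return np.argsort(losses)[::-1][:k]

losses_demo = np.array([0.2, 2.5, 0.1, 3.0, 0.7], dtype=np.float32)
targets_demo = np.array([0, 1, 2, 1, 0])
preds_demo = np.array([0, 2, 2, 0, 0])

top_idx = top_k_indices(losses_demo, k=2)
for i in top_idx:
    print(f"index={i}, loss={losses_demo[i]:.2f}, "
          f"target={targets_demo[i]}, pred={preds_demo[i]}")
```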

In [None]:
# Simulate per-sample losses and probabilities
losses = np.random.rand(len(y_test)).astype(np.float32)
losses[wrong_idx] = losses[wrong_idx] + 2.0  # Higher losses for wrong predictions
probs = np.random.rand(len(y_test)).astype(np.float32) * 0.5 + 0.5  # 0.5 to 1.0

# Find top 10 losses
top_10_losses = tsai_rs.top_losses(losses, y_test_int, y_pred, probs, k=10)

print("Top 10 losses:")
for tl in top_10_losses:
    print(f"  {tl}")

## Train/Test Split Utilities
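
The two utilities in this section are thin wrappers around index bookkeeping. The NumPy sketch below shows one plausible version of each. The helpers `split_indices` and `combine_splits` are hypothetical, and details such as how the test size is rounded are assumptions, not documented tsai-rs behavior.

```python
import numpy as np

def split_indices(n_samples, test_size=0.2, shuffle=True, seed=None):
    """Return (train_idx, test_idx) covering range(n_samples) without overlap."""
    indices = np.arange(n_samples)
    if shuffle:
        indices = np.random.default_rng(seed).permutation(n_samples)
    n_test = int(round(n_samples * test_size))
    return indices[n_test:], indices[:n_test]

def combine_splits(xs, ys):
    """Concatenate per-split arrays and return the index range of each split."""
    X = np.concatenate(xs, axis=0)
    y = np.concatenate(ys, axis=0)
    bounds = np.cumsum([0] + [len(x) for x in xs])
    splits = [np.arange(start, stop) for start, stop in zip(bounds[:-1], bounds[1:])]
    return X, y, splits

train_demo, test_demo = split_indices(360, test_size=0.2, seed=42)

X_a, y_a = np.zeros((6, 2, 10)), np.zeros(6, dtype=np.int64)
X_b, y_b = np.ones((4, 2, 10)), np.ones(4, dtype=np.int64)
X_all, y_all, splits_demo = combine_splits([X_a, X_b], [y_a, y_b])

print(len(train_demo), len(test_demo))
print(X_all.shape, [len(s) for s in splits_demo])
```

Working with indices keeps a single copy of the data in memory while still allowing several different splits.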

In [None]:
# Create custom train/test split indices
n_total = len(X)
train_indices, test_indices = tsai_rs.train_test_split_indices(
    n_samples=n_total,
    test_size=0.2,
    shuffle=True,
    seed=42
)

print(f"Total samples: {n_total}")
print(f"Train samples: {len(train_indices)}")
print(f"Test samples: {len(test_indices)}")

In [None]:
# Combine separate splits back together
X_combined, y_combined, combined_splits = tsai_rs.combine_split_data(
    [X_train, X_test],
    [y_train, y_test]
)

print(f"Combined X shape: {X_combined.shape}")
print(f"Combined y shape: {y_combined.shape}")
print(f"Split 0 (train) size: {len(combined_splits[0])}")
print(f"Split 1 (test) size: {len(combined_splits[1])}")

## Summary

This notebook demonstrated the key features of tsai-rs:

1. **Data Loading**: `get_UCR_data`, `get_UCR_univariate_list`, `get_UCR_multivariate_list`
2. **Dataset**: `TSDataset` for storing and manipulating time series data
3. **Model Configs**: `InceptionTimePlusConfig`, `ResNetPlusConfig`, `PatchTSTConfig`, `TSTConfig`, `RNNPlusConfig`, `MiniRocketConfig`
4. **Training**: `LearnerConfig`, `OneCycleLR` scheduler
5. **Preprocessing**: `ts_standardize`
6. **Augmentation**: `add_gaussian_noise`, `mag_scale`
7. **TS-to-Image**: `compute_gasf`, `compute_gadf`, `compute_recurrence_plot`
8. **Analysis**: `confusion_matrix`, `top_losses`
9. **Utilities**: `train_test_split_indices`, `combine_split_data`

For full training with GPU acceleration, use the CLI or call the Rust API directly from Rust code.

In [None]:
# Quick summary code
dsid = 'NATOPS'
X_train, y_train, X_test, y_test = tsai_rs.get_UCR_data(dsid, return_split=True)

# Preprocess (standardize each sample, as in the Standardization section)
X_train_std = tsai_rs.ts_standardize(X_train.astype(np.float32), by_sample=True)
X_test_std = tsai_rs.ts_standardize(X_test.astype(np.float32), by_sample=True)

# Configure model
n_vars, seq_len = X_train.shape[1], X_train.shape[2]
n_classes = len(np.unique(y_train))

config = tsai_rs.InceptionTimePlusConfig(
    n_vars=n_vars,
    seq_len=seq_len,
    n_classes=n_classes
)

print(f"Ready to train {config} on {dsid}!")