<a href="https://github.com/timeseriesAI/tsai-rs" target="_parent"><img src="https://img.shields.io/badge/tsai--rs-Time%20Series%20AI%20in%20Rust-blue" alt="tsai-rs"/></a>

# Intro to Time Series Regression with tsai-rs

This notebook demonstrates time series regression using **tsai-rs**, a Rust implementation with Python bindings.

## Purpose

Time series regression predicts a continuous value from a univariate or multivariate time series. Unlike classification, which predicts discrete categories, the target is numerical.

Common applications:
- Energy consumption prediction
- Weather forecasting
- Stock price prediction
- Sensor-based measurements

## Install tsai-rs

From a checkout of the tsai-rs repository, build the Python bindings with maturin, which compiles the crate and installs it into the active virtual environment:

```bash
cd crates/tsai_python
maturin develop --release
```

## Import Libraries

In [None]:
import tsai_rs
import numpy as np
import sklearn.metrics as skm

print(f"tsai-rs version: {tsai_rs.version()}")
tsai_rs.my_setup()

## Understanding Regression vs Classification

| Aspect | Classification | Regression |
|--------|----------------|------------|
| Target | Discrete categories | Continuous values |
| Loss | Cross-Entropy | MSE, MAE |
| Metrics | Accuracy, F1 | RMSE, MAE, R2 |
| Output | Class probabilities | One or more continuous values |
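
To make the loss row of the table concrete, here is a minimal NumPy sketch of the three losses it names. It does not use tsai-rs; the helper names `cross_entropy`, `mse`, and `mae` are illustrative and not part of the library.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean cross-entropy for integer class labels; logits has shape (n, n_classes)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # subtract max for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def mse(y_pred, y_true):
    """Mean squared error: penalizes large errors quadratically."""
    return float(np.mean((y_pred - y_true) ** 2))

def mae(y_pred, y_true):
    """Mean absolute error: penalizes all errors linearly."""
    return float(np.mean(np.abs(y_pred - y_true)))

# Classification: one logit per class, integer targets
logits = np.array([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
labels = np.array([0, 2])
ce = cross_entropy(logits, labels)

# Regression: one value per sample, float targets
y_pred = np.array([1.5, 2.0, 3.5])
y_true = np.array([1.0, 2.0, 4.0])
print(f"CE: {ce:.4f}  MSE: {mse(y_pred, y_true):.4f}  MAE: {mae(y_pred, y_true):.4f}")
```

The classification loss indexes into a probability vector with an integer label, while the regression losses compare two float arrays of the same shape directly.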

## Load Sample Data

We'll use a classification dataset from the UCR/UEA archive and treat its class labels as continuous targets, purely for demonstration.

In [None]:
# Load multivariate dataset
dsid = 'NATOPS'
X_train, y_train, X_test, y_test = tsai_rs.get_UCR_data(dsid, return_split=True)

print(f"Dataset: {dsid}")
print(f"X_train shape: {X_train.shape} (samples, variables, length)")
print(f"X_test shape: {X_test.shape}")
print(f"Original y_train (labels): {np.unique(y_train)}")

In [None]:
# For the regression demo, convert labels to continuous targets by adding small Gaussian noise
# (assumes the labels parse as numbers). In a real scenario, you would have actual continuous target values
np.random.seed(0)  # make the synthetic targets reproducible
y_train_reg = y_train.astype(np.float32) + np.random.randn(len(y_train)).astype(np.float32) * 0.1
y_test_reg = y_test.astype(np.float32) + np.random.randn(len(y_test)).astype(np.float32) * 0.1

print(f"Regression targets (y_train):")
print(f"  Min: {y_train_reg.min():.4f}")
print(f"  Max: {y_train_reg.max():.4f}")
print(f"  Mean: {y_train_reg.mean():.4f}")
print(f"  Std: {y_train_reg.std():.4f}")
print(f"  First 10 values: {y_train_reg[:10]}")

## Prepare Data for Regression

In [None]:
# Standardize input data
X_train_std = tsai_rs.ts_standardize(X_train.astype(np.float32), by_sample=True)
X_test_std = tsai_rs.ts_standardize(X_test.astype(np.float32), by_sample=True)

print(f"Standardized X_train shape: {X_train_std.shape}")
print(f"Sample 0 mean: {X_train_std[0].mean():.6f}")
print(f"Sample 0 std: {X_train_std[0].std():.6f}")
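
For intuition, per-sample standardization can be sketched in plain NumPy. This is an assumption about what `by_sample=True` means (one mean and one standard deviation per sample, computed over all variables and time steps); the source does not show how `ts_standardize` is implemented, and `standardize_by_sample` is a hypothetical helper.

```python
import numpy as np

def standardize_by_sample(x, eps=1e-8):
    """Standardize each sample over all of its variables and time steps.

    x has shape (samples, variables, length). The reduction axes are an
    assumption for illustration, not the confirmed tsai-rs behavior.
    """
    mean = x.mean(axis=(1, 2), keepdims=True)
    std = x.std(axis=(1, 2), keepdims=True)
    return (x - mean) / (std + eps)  # eps guards against constant samples

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(4, 2, 50)).astype(np.float32)
x_std = standardize_by_sample(x)

print("per-sample means:", x_std.mean(axis=(1, 2)))
print("per-sample stds: ", x_std.std(axis=(1, 2)))
```

Because the statistics come from each sample alone, no information leaks from the test set into the training set, which is why the same call can be applied to both splits independently.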

## Configure Models for Regression

For regression tasks, we set `n_classes=1` (a single output) instead of the number of classes. `RNNPlusConfig` names the same parameter `n_outputs`.

In [None]:
n_vars = X_train.shape[1]
seq_len = X_train.shape[2]
n_outputs = 1  # Single continuous output for regression

print(f"Variables: {n_vars}")
print(f"Sequence length: {seq_len}")
print(f"Outputs: {n_outputs}")
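
A single-output head usually produces predictions of shape `(batch, 1)`, while targets are often stored with shape `(batch,)`. The sketch below, in plain NumPy, shows a broadcasting pitfall to watch for when computing a loss or metric by hand. Whether tsai-rs reshapes outputs internally is not shown in this notebook, so treat the shapes here as assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
batch = 8
y_true = rng.normal(size=batch).astype(np.float32)       # shape (8,)
y_pred = rng.normal(size=(batch, 1)).astype(np.float32)  # shape (8, 1), as from a 1-unit head

# Pitfall: (8, 1) - (8,) broadcasts to (8, 8), so the mean runs over
# every prediction paired with every target instead of matching pairs.
wrong = np.mean((y_pred - y_true) ** 2)

# Fix: drop the trailing axis so both arrays have shape (8,)
right = np.mean((y_pred.squeeze(-1) - y_true) ** 2)

print((y_pred - y_true).shape, (y_pred.squeeze(-1) - y_true).shape)
print(f"wrong: {wrong:.4f}  right: {right:.4f}")
```

No error is raised in the broadcast case, so a shape check on predictions and targets before computing metrics is a cheap safeguard.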

In [None]:
# InceptionTimePlus for regression
inception_config = tsai_rs.InceptionTimePlusConfig(
    n_vars=n_vars,
    seq_len=seq_len,
    n_classes=n_outputs  # 1 for regression
)
print(f"InceptionTimePlus (Regression): {inception_config}")

In [None]:
# ResNetPlus for regression
resnet_config = tsai_rs.ResNetPlusConfig(
    n_vars=n_vars,
    seq_len=seq_len,
    n_classes=n_outputs
)
print(f"ResNetPlus (Regression): {resnet_config}")

In [None]:
# TSTConfig for regression
tst_config = tsai_rs.TSTConfig(
    n_vars=n_vars,
    seq_len=seq_len,
    n_classes=n_outputs,
    d_model=64,
    n_heads=4,
    n_layers=2
)
print(f"TST (Regression): {tst_config}")

In [None]:
# RNNPlus for regression
rnn_config = tsai_rs.RNNPlusConfig(
    n_vars=n_vars,
    seq_len=seq_len,
    n_outputs=n_outputs,  # RNNPlus uses n_outputs
    hidden_dim=64,
    n_layers=2,
    rnn_type='lstm',
    bidirectional=True
)
print(f"RNNPlus (Regression): {rnn_config}")

## Training Configuration

In [None]:
# Learner configuration for regression
learner_config = tsai_rs.LearnerConfig(
    lr=1e-3,
    weight_decay=0.01,
    grad_clip=1.0
)
print(f"Learner config: {learner_config}")
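
The three settings above map onto a standard optimizer update. The following NumPy sketch assumes that `grad_clip` is a maximum global L2 norm and that `weight_decay` is applied in decoupled form (as in AdamW); the notebook does not confirm either, and `clip_by_global_norm` and `sgd_step` are hypothetical helpers.

```python
import numpy as np

def clip_by_global_norm(grads, max_norm):
    """Scale all gradients so their combined L2 norm is at most max_norm."""
    total_norm = np.sqrt(sum(float(np.sum(g ** 2)) for g in grads))
    scale = min(1.0, max_norm / (total_norm + 1e-6))
    return [g * scale for g in grads], total_norm

def sgd_step(params, grads, lr, weight_decay):
    """Plain SGD with decoupled weight decay: shrink weights, then apply the gradient."""
    return [p - lr * weight_decay * p - lr * g for p, g in zip(params, grads)]

params = [np.array([1.0, -2.0]), np.array([0.5])]
grads = [np.array([3.0, 4.0]), np.array([12.0])]  # global norm = sqrt(9 + 16 + 144) = 13

clipped, norm = clip_by_global_norm(grads, max_norm=1.0)
new_params = sgd_step(params, clipped, lr=1e-3, weight_decay=0.01)

print(f"gradient norm before clipping: {norm:.1f}")
print("updated params:", new_params)
```

Clipping matters more in regression than in classification when targets are unscaled, because a few large residuals under MSE can produce very large gradients.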

In [None]:
# One-cycle learning rate scheduler
n_epochs = 50
batch_size = 32
steps_per_epoch = len(X_train) // batch_size  # floor division: assumes the last partial batch is dropped
total_steps = n_epochs * steps_per_epoch

scheduler = tsai_rs.OneCycleLR.simple(max_lr=1e-3, total_steps=total_steps)

print(f"Training setup:")
print(f"  Epochs: {n_epochs}")
print(f"  Batch size: {batch_size}")
print(f"  Steps per epoch: {steps_per_epoch}")
print(f"  Total steps: {total_steps}")
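
To see what a one-cycle schedule does with `total_steps`, here is a small pure-Python sketch. The warm-up fraction (`pct_start=0.3`), the divisors, and the cosine interpolation mirror the defaults of PyTorch's `OneCycleLR`; they are assumptions and may differ from what `tsai_rs.OneCycleLR.simple` uses. The sample count of 180 is illustrative.

```python
import math

def one_cycle_lr(step, total_steps, max_lr, pct_start=0.3,
                 div_factor=25.0, final_div_factor=1e4):
    """Learning rate at a given step: cosine warm-up to max_lr, then cosine decay."""
    initial_lr = max_lr / div_factor
    min_lr = initial_lr / final_div_factor
    warmup_steps = max(1, int(pct_start * total_steps))
    if step < warmup_steps:
        t = step / warmup_steps
        start, end = initial_lr, max_lr
    else:
        t = (step - warmup_steps) / max(1, total_steps - 1 - warmup_steps)
        start, end = max_lr, min_lr
    # Cosine interpolation from start (t=0) to end (t=1)
    return end + (start - end) * (1 + math.cos(math.pi * t)) / 2

n_train, batch_size, n_epochs = 180, 32, 50
steps_per_epoch = math.ceil(n_train / batch_size)  # counts the last partial batch
total_steps = n_epochs * steps_per_epoch
schedule = [one_cycle_lr(s, total_steps, max_lr=1e-3) for s in range(total_steps)]

print(f"total steps: {total_steps}")
print(f"start: {schedule[0]:.2e}  peak: {max(schedule):.2e}  end: {schedule[-1]:.2e}")
```

The schedule is defined over the exact number of optimizer steps, so `total_steps` should match how the data loader batches the data: use floor division if the last partial batch is dropped, and `math.ceil` if it is kept.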

## Regression Metrics

Common metrics for regression tasks:

In [None]:
def calculate_regression_metrics(y_true, y_pred):
    """Calculate common regression metrics."""
    mse = skm.mean_squared_error(y_true, y_pred)
    rmse = np.sqrt(mse)
    mae = skm.mean_absolute_error(y_true, y_pred)
    r2 = skm.r2_score(y_true, y_pred)
    
    return {
        'MSE': mse,
        'RMSE': rmse,
        'MAE': mae,
        'R2': r2
    }

# Simulate predictions for demonstration
np.random.seed(42)
y_pred_sim = y_test_reg + np.random.randn(len(y_test_reg)).astype(np.float32) * 0.5

metrics = calculate_regression_metrics(y_test_reg, y_pred_sim)
print("Regression Metrics (simulated):")
for name, value in metrics.items():
    print(f"  {name}: {value:.4f}")
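
The same metrics can be written directly from their definitions, which helps when interpreting them. The NumPy sketch below also evaluates a constant predictor that always outputs the target mean; the function name `regression_metrics` is illustrative.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """MSE, RMSE, MAE and R2 computed from their definitions."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    residuals = y_true - y_pred
    mse = np.mean(residuals ** 2)
    ss_res = np.sum(residuals ** 2)                   # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # total variation around the mean
    return {
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAE": np.mean(np.abs(residuals)),
        "R2": 1.0 - ss_res / ss_tot,
    }

y_true = np.array([1.0, 2.0, 3.0, 4.0])
perfect = regression_metrics(y_true, y_true)
baseline = regression_metrics(y_true, np.full_like(y_true, y_true.mean()))

print("perfect: ", perfect)
print("baseline:", baseline)
```

A model that always predicts the mean scores R2 = 0, so R2 reads as the fraction of target variance explained beyond that baseline, and a negative R2 means the model is worse than predicting the mean.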

## Data Augmentation for Regression

In [None]:
# Apply augmentation to training data
X_train_aug = tsai_rs.add_gaussian_noise(X_train_std, std=0.05, seed=42)
X_train_aug = tsai_rs.mag_scale(X_train_aug, scale_range=(0.9, 1.1), seed=42)

print(f"Original data stats:")
print(f"  Mean: {X_train_std.mean():.6f}")
print(f"  Std: {X_train_std.std():.6f}")

print(f"\nAugmented data stats:")
print(f"  Mean: {X_train_aug.mean():.6f}")
print(f"  Std: {X_train_aug.std():.6f}")
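
As a rough picture of what these two augmentations do, here is a NumPy sketch. It assumes that noise is added independently per element and that magnitude scaling multiplies each sample by one random factor; the actual behavior of `add_gaussian_noise` and `mag_scale` is not shown in this notebook, and the helper names below are hypothetical.

```python
import numpy as np

def add_noise(x, std, rng):
    """Add zero-mean Gaussian noise to every element."""
    return x + rng.normal(0.0, std, size=x.shape).astype(x.dtype)

def magnitude_scale(x, scale_range, rng):
    """Multiply each sample by one random factor drawn from scale_range."""
    low, high = scale_range
    factors = rng.uniform(low, high, size=(x.shape[0], 1, 1)).astype(x.dtype)
    return x * factors

rng = np.random.default_rng(42)
x = rng.normal(size=(16, 3, 40)).astype(np.float32)  # (samples, variables, length)
x_aug = magnitude_scale(add_noise(x, 0.05, rng), (0.9, 1.1), rng)

corr = np.corrcoef(x.ravel(), x_aug.ravel())[0, 1]
print(f"shape: {x_aug.shape}  correlation with original: {corr:.4f}")
```

For regression, check that an augmentation leaves the target valid. If the target depends on signal amplitude (for example, energy consumption), magnitude scaling changes the true answer and the targets would need to be adjusted or the augmentation dropped.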

## Create TSDataset for Regression

In [None]:
# Create datasets
train_ds = tsai_rs.TSDataset(X_train_std, y_train_reg)
test_ds = tsai_rs.TSDataset(X_test_std, y_test_reg)

print(f"Train dataset: {train_ds}")
print(f"Test dataset: {test_ds}")

## Multi-output Regression

For predicting multiple continuous values simultaneously:

In [None]:
# Example: Predicting multiple targets
n_outputs_multi = 3

# Simulate multi-output targets
y_train_multi = np.column_stack([
    y_train_reg,
    y_train_reg * 0.5 + np.random.randn(len(y_train)).astype(np.float32) * 0.1,
    y_train_reg * 0.25 + np.random.randn(len(y_train)).astype(np.float32) * 0.05
])

print(f"Multi-output targets shape: {y_train_multi.shape}")
print(f"First sample targets: {y_train_multi[0]}")
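
With several targets, a single aggregate metric can hide one poorly predicted output. The NumPy sketch below computes RMSE per output column alongside the overall value, using simulated predictions with a different noise level per target; all data here is synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_targets = 100, 3

y_true = rng.normal(size=(n_samples, n_targets))
# Simulated predictions: each target column gets a different error level
y_pred = y_true + rng.normal(scale=[0.1, 0.5, 1.0], size=(n_samples, n_targets))

squared_error = (y_true - y_pred) ** 2
rmse_per_output = np.sqrt(squared_error.mean(axis=0))  # one value per target
rmse_overall = np.sqrt(squared_error.mean())           # pooled over all targets

for i, value in enumerate(rmse_per_output):
    print(f"target {i}: RMSE = {value:.4f}")
print(f"overall:  RMSE = {rmse_overall:.4f}")
```

If the targets have very different scales, standardizing each target column before training keeps one output from dominating an MSE loss.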

In [None]:
# Configure model for multi-output regression
inception_multi_config = tsai_rs.InceptionTimePlusConfig(
    n_vars=n_vars,
    seq_len=seq_len,
    n_classes=n_outputs_multi  # Multiple outputs
)
print(f"InceptionTimePlus (Multi-output): {inception_multi_config}")

## MiniRocket for Regression

In [None]:
# MiniRocket can also be used for regression
minirocket_config = tsai_rs.MiniRocketConfig(
    n_vars=n_vars,
    seq_len=seq_len,
    n_classes=n_outputs,  # 1 for single-output regression
    n_kernels=10000
)
print(f"MiniRocket (Regression): {minirocket_config}")
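
Rocket-style models are typically used as a fixed feature transform followed by a linear model, and for regression a ridge regressor is a common choice for that second stage. The sketch below shows only the second stage in NumPy, with random features standing in for the transformed series. It does not call tsai-rs, and `fit_ridge` is a hypothetical helper rather than how the library fits its head.

```python
import numpy as np

def fit_ridge(features, targets, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    mu_x, mu_y = features.mean(axis=0), targets.mean()
    xc, yc = features - mu_x, targets - mu_y           # center so the intercept is not shrunk
    gram = xc.T @ xc + alpha * np.eye(xc.shape[1])     # regularized normal equations
    weights = np.linalg.solve(gram, xc.T @ yc)
    return weights, mu_y - mu_x @ weights

rng = np.random.default_rng(0)
features = rng.normal(size=(200, 20))   # stand-in for features from a fixed transform
true_w = rng.normal(size=20)
targets = features @ true_w + 3.0 + rng.normal(scale=0.1, size=200)

# Fit on the first 150 rows, evaluate on the remaining 50
weights, intercept = fit_ridge(features[:150], targets[:150], alpha=1.0)
pred = features[150:] @ weights + intercept
rmse = float(np.sqrt(np.mean((pred - targets[150:]) ** 2)))

print(f"held-out RMSE: {rmse:.4f}")
```

With thousands of kernel features and few training samples, the regularization strength `alpha` matters a great deal and is usually chosen by cross-validation.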

## Working with Multiple Datasets

In [None]:
# Test configuration with different UCR datasets
datasets = ['ECG200', 'FordA', 'Wafer', 'GunPoint']

print("Regression configurations for various datasets:")
print("-" * 70)

for name in datasets:
    try:
        # Use loop-local names so the NATOPS arrays loaded earlier are not overwritten
        X_tr, _, X_te, _ = tsai_rs.get_UCR_data(name, return_split=True)
        n_vars_ds, seq_len_ds = X_tr.shape[1], X_tr.shape[2]

        config = tsai_rs.InceptionTimePlusConfig(
            n_vars=n_vars_ds,
            seq_len=seq_len_ds,
            n_classes=1  # Regression
        )

        print(f"{name:20} | train: {X_tr.shape[0]:4d} | test: {X_te.shape[0]:4d} | "
              f"vars: {n_vars_ds:2d} | len: {seq_len_ds:4d} | config: {config}")
    except Exception as e:
        print(f"{name:20} | Error: {e}")

## Summary

This notebook demonstrated time series regression with tsai-rs:

### Key Differences from Classification
1. Set `n_classes=1` for single-output regression (`n_outputs=1` for `RNNPlusConfig`)
2. Set `n_classes=N` for multi-output regression (N outputs)
3. Target values (y) should be floats, not integers
4. Use regression metrics: MSE, RMSE, MAE, R2

### Model Configurations for Regression
- `InceptionTimePlusConfig`: CNN-based, fast and accurate
- `ResNetPlusConfig`: Residual connections
- `TSTConfig`: Transformer-based
- `RNNPlusConfig`: LSTM/GRU-based
- `MiniRocketConfig`: Feature-based, very fast

### Preprocessing
- `ts_standardize`: Standardize inputs (zero mean, unit variance)
- `add_gaussian_noise`, `mag_scale`: Data augmentation

In [None]:
# Complete regression workflow example
dsid = 'NATOPS'
X_train, y_train, X_test, y_test = tsai_rs.get_UCR_data(dsid, return_split=True)

# Convert to regression targets (in practice, you'd have real continuous targets)
y_train_reg = y_train.astype(np.float32)
y_test_reg = y_test.astype(np.float32)

# Preprocess
X_train_std = tsai_rs.ts_standardize(X_train.astype(np.float32), by_sample=True)
X_test_std = tsai_rs.ts_standardize(X_test.astype(np.float32), by_sample=True)

# Configure model for regression
n_vars, seq_len = X_train.shape[1], X_train.shape[2]

config = tsai_rs.InceptionTimePlusConfig(
    n_vars=n_vars,
    seq_len=seq_len,
    n_classes=1  # Single output for regression
)

print(f"Ready for regression training on {dsid}!")
print(f"Config: {config}")
print(f"Target range: [{y_train_reg.min():.2f}, {y_train_reg.max():.2f}]")