# sEMG-HHT CNN Classifier for Movement Quality and Fatigue Classification

This notebook implements a Convolutional Neural Network (CNN) encoder with an SVM classifier for classifying surface electromyography (sEMG) signals based on movement quality and fatigue levels.

## Architecture Overview
- **Input**: 256×256 matrix from HHT (Hilbert-Huang Transform) of sEMG signals
- **Encoder**: 3-layer CNN with Conv2D + InstanceNorm + LeakyReLU
- **Pooling**: Global Average Pooling
- **Classifier**: SVM for multi-class classification

## Requirements
- PyTorch
- scikit-learn
- NumPy
- Matplotlib

# Kaggle Integration

This notebook is configured to work with the **HILBERTMATRIX_NPZ** dataset on Kaggle.

Data path: `/kaggle/input/hilbertmatrix-npz/hht_matrices/`
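
As a quick check of the file format, the sketch below reads a single matrix from an `.npz` archive. It assumes the array is stored under the key `hht`, with a fallback to the first array in the file, which mirrors what the loader later in this notebook does. The helper name `load_hht_matrix` and the file `example.npz` are hypothetical, and the demo writes a synthetic file so that it runs without the dataset.

```python
import os
import tempfile

import numpy as np


def load_hht_matrix(path, key="hht"):
    """Return the array stored under `key`, or the first array if `key` is absent."""
    with np.load(path) as data:
        name = key if key in data.files else data.files[0]
        return np.asarray(data[name], dtype=np.float32)


# Demo with a synthetic file (hypothetical name); on Kaggle, `path` would
# point at a file inside the data directory instead.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.npz")
    np.savez(path, hht=np.random.rand(256, 256))
    matrix = load_hht_matrix(path)

print(matrix.shape, matrix.dtype)
```

Casting to `float32` here matches the dtype the encoder expects when the matrices are later converted to tensors.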

# sEMG-HHT CNN Classifier - Kaggle Training

This notebook trains a CNN-SVM classifier on sEMG Hilbert-Huang Transform data for 6-class classification. Each class combines one gender with one movement quality:

- **Gender**: Male (M), Female (F)
- **Movement Quality**: Full, Half, Invalid
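
The six classes are the Cartesian product of the two label groups above. A minimal sketch of how the combined names map to integer indices is shown here; it assumes alphabetical ordering, which is how scikit-learn's `LabelEncoder` stores its classes. The variable names are illustrative only.

```python
from itertools import product

genders = ["M", "F"]
movements = ["full", "half", "invalid"]

# LabelEncoder keeps classes in sorted order, so sorting reproduces
# the integer index each combined label would receive.
combined_classes = sorted(f"{g}_{m}" for g, m in product(genders, movements))
class_to_index = {name: i for i, name in enumerate(combined_classes)}

for name, idx in class_to_index.items():
    print(idx, name)
```

Note that the female classes come first in this ordering, so index 0 corresponds to `F_full` rather than `M_full`.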

In [None]:
# Configure data paths for Kaggle
import os

# Check if running on Kaggle
IS_KAGGLE = os.path.exists('/kaggle/input')

if IS_KAGGLE:
    DATA_DIR = '/kaggle/input/hilbertmatrix-npz/hht_matrices'
    CHECKPOINT_DIR = '/kaggle/working/checkpoints'
    print('Running on Kaggle')
else:
    DATA_DIR = './data'
    CHECKPOINT_DIR = './checkpoints'
    print('Running locally')

print(f'Data directory: {DATA_DIR}')
os.makedirs(CHECKPOINT_DIR, exist_ok=True)
print(f'Checkpoint directory: {CHECKPOINT_DIR}')

In [None]:
# Install dependencies (uncomment if running on Kaggle or a fresh environment)
# Note: the PyEMD module is published on PyPI under the name "EMD-signal"
# !pip install torch torchvision scikit-learn numpy matplotlib scipy EMD-signal

In [None]:
import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
from typing import Tuple, Optional
import warnings

# Suppress specific warnings that are not critical for this demo
warnings.filterwarnings('ignore', category=FutureWarning)
warnings.filterwarnings('ignore', category=UserWarning, module='sklearn')

# Set random seeds for reproducibility
SEED = 42
torch.manual_seed(SEED)
np.random.seed(SEED)

# Check for GPU availability
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

## 1. CNN Encoder Architecture

The encoder consists of 3 convolutional layers, each with:
- Conv2D (kernel=3, stride=2, padding=1)
- Instance Normalization
- LeakyReLU activation

This progressively reduces the spatial dimensions while extracting features.
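
To make the size reduction concrete, the sketch below applies the standard convolution output-size formula, `floor((size + 2*padding - kernel) / stride) + 1`, to the settings listed above. The helper names are hypothetical, and the channel counts assume the default `base_channels=64` doubling at each layer, as in the encoder defined below.

```python
def conv_out_size(size, kernel=3, stride=2, padding=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_encoder_shapes(input_size=256, base_channels=64, num_layers=3):
    """Return the (channels, height, width) after each ConvBlock."""
    shapes = []
    size = input_size
    for i in range(num_layers):
        size = conv_out_size(size)
        shapes.append((base_channels * 2 ** i, size, size))
    return shapes


shapes = trace_encoder_shapes()
for layer, shape in enumerate(shapes, start=1):
    print(f"after block {layer}: {shape}")
```

Each block halves the height and width, so a 256×256 input ends as a 32×32 map with 256 channels, which global average pooling then collapses to a 256-dimensional feature vector.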

In [None]:
class ConvBlock(nn.Module):
    """Convolutional block with Conv2D, InstanceNorm, and LeakyReLU."""
    
    def __init__(self, in_channels: int, out_channels: int, 
                 kernel_size: int = 3, stride: int = 2, padding: int = 1,
                 leaky_slope: float = 0.2):
        super(ConvBlock, self).__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, 
                              kernel_size=kernel_size, 
                              stride=stride, 
                              padding=padding)
        self.instance_norm = nn.InstanceNorm2d(out_channels)
        self.activation = nn.LeakyReLU(negative_slope=leaky_slope)
    
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.conv(x)
        x = self.instance_norm(x)
        x = self.activation(x)
        return x


class sEMGHHTEncoder(nn.Module):
    """
    CNN Encoder for sEMG-HHT matrix classification.
    
    Architecture:
    - Input: 1×256×256 (single-channel HHT matrix)
    - 3 ConvBlocks with increasing channels
    - Global Average Pooling
    - Output: Feature vector for SVM classification
    """
    
    def __init__(self, in_channels: int = 1, 
                 base_channels: int = 64,
                 num_layers: int = 3,
                 leaky_slope: float = 0.2):
        super(sEMGHHTEncoder, self).__init__()
        
        self.in_channels = in_channels
        self.base_channels = base_channels
        self.num_layers = num_layers
        
        # Build convolutional layers
        layers = []
        current_channels = in_channels
        
        for i in range(num_layers):
            out_channels = base_channels * (2 ** i)
            layers.append(ConvBlock(
                in_channels=current_channels,
                out_channels=out_channels,
                kernel_size=3,
                stride=2,
                padding=1,
                leaky_slope=leaky_slope
            ))
            current_channels = out_channels
        
        self.encoder = nn.Sequential(*layers)
        self.global_avg_pool = nn.AdaptiveAvgPool2d((1, 1))
        
        # Calculate output feature dimension
        self.feature_dim = base_channels * (2 ** (num_layers - 1))
    
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """
        Forward pass through the encoder.
        
        Args:
            x: Input tensor of shape (batch, channels, height, width)
        
        Returns:
            Feature tensor of shape (batch, feature_dim)
        """
        x = self.encoder(x)
        x = self.global_avg_pool(x)
        x = x.view(x.size(0), -1)  # Flatten to (batch, feature_dim)
        return x
    
    def get_feature_dim(self) -> int:
        """Return the output feature dimension."""
        return self.feature_dim

## 2. Complete Classification Pipeline

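This class combines the CNN encoder with an SVM classifier. Calling `fit` trains only the feature scaler and the SVM; the encoder weights are never updated, so unless a pre-trained encoder is passed in, the features come from a randomly initialized network.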
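
The normalization step deserves a closer look, because its statistics must come from the training features only. The NumPy sketch below reproduces what `StandardScaler` computes (per-feature mean and population standard deviation); the random arrays merely stand in for encoder outputs and are not real data.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for encoder outputs: (n_samples, feature_dim)
train_feats = rng.normal(loc=3.0, scale=2.0, size=(100, 256))
val_feats = rng.normal(loc=3.0, scale=2.0, size=(20, 256))

# Statistics are estimated on the training set only ...
mean = train_feats.mean(axis=0)
std = train_feats.std(axis=0)  # ddof=0, as StandardScaler uses
std[std == 0] = 1.0            # constant features are left unscaled

# ... and then reused unchanged for validation or test features.
train_scaled = (train_feats - mean) / std
val_scaled = (val_feats - mean) / std

print(train_scaled.mean(), train_scaled.std())
```

Reusing the training statistics is why `predict` calls `scaler.transform` rather than `fit_transform`: refitting on new data would shift the feature space the SVM was trained in.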

In [None]:
class sEMGHHTClassifier:
    """
    Complete classification pipeline combining CNN encoder and SVM classifier.
    
    The pipeline:
    1. Extracts features using CNN encoder
    2. Normalizes features using StandardScaler
    3. Classifies using SVM (supports multi-class)
    """
    
    def __init__(self, 
                 encoder: Optional[sEMGHHTEncoder] = None,
                 svm_kernel: str = 'rbf',
                 svm_C: float = 1.0,
                 svm_gamma: str = 'scale',
                 device: torch.device = torch.device('cpu')):
        """
        Initialize the classifier.
        
        Args:
            encoder: Pre-trained or new CNN encoder (creates default if None)
            svm_kernel: SVM kernel type ('rbf', 'linear', 'poly')
            svm_C: SVM regularization parameter
            svm_gamma: SVM gamma parameter
            device: Device to run the encoder on
        """
        self.device = device
        
        # Initialize encoder
        if encoder is None:
            self.encoder = sEMGHHTEncoder(
                in_channels=1, 
                base_channels=64, 
                num_layers=3
            )
        else:
            self.encoder = encoder
        
        self.encoder.to(self.device)
        
        # Initialize scaler and SVM
        self.scaler = StandardScaler()
        self.svm = SVC(
            kernel=svm_kernel,
            C=svm_C,
            gamma=svm_gamma,
            decision_function_shape='ovr',  # One-vs-Rest for multi-class
            probability=True  # Enable probability estimates
        )
        
        self._is_fitted = False
    
    def extract_features(self, X: np.ndarray, batch_size: int = 32) -> np.ndarray:
        """
        Extract features from HHT matrices using the CNN encoder.
        
        Args:
            X: Input array of shape (n_samples, height, width) or (n_samples, 1, height, width)
            batch_size: Batch size for processing
        
        Returns:
            Feature array of shape (n_samples, feature_dim)
        """
        self.encoder.eval()
        
        # Ensure correct shape
        if X.ndim == 3:
            X = X[:, np.newaxis, :, :]  # Add channel dimension
        
        features_list = []
        n_samples = X.shape[0]
        
        with torch.no_grad():
            for i in range(0, n_samples, batch_size):
                batch = torch.tensor(X[i:i+batch_size], dtype=torch.float32).to(self.device)
                batch_features = self.encoder(batch)
                features_list.append(batch_features.cpu().numpy())
        
        return np.vstack(features_list)
    
    def fit(self, X: np.ndarray, y: np.ndarray, batch_size: int = 32):
        """
        Fit the classifier (extract features and train SVM).
        
        Args:
            X: Training HHT matrices of shape (n_samples, height, width)
            y: Training labels of shape (n_samples,)
            batch_size: Batch size for feature extraction
        """
        print("Extracting features from training data...")
        features = self.extract_features(X, batch_size)
        
        print("Normalizing features...")
        features_scaled = self.scaler.fit_transform(features)
        
        print("Training SVM classifier...")
        self.svm.fit(features_scaled, y)
        
        self._is_fitted = True
        print("Training complete!")
    
    def predict(self, X: np.ndarray, batch_size: int = 32) -> np.ndarray:
        """
        Predict class labels for samples.
        
        Args:
            X: Input HHT matrices of shape (n_samples, height, width)
            batch_size: Batch size for feature extraction
        
        Returns:
            Predicted labels of shape (n_samples,)
        """
        if not self._is_fitted:
            raise RuntimeError("Classifier must be fitted before predicting")
        
        features = self.extract_features(X, batch_size)
        features_scaled = self.scaler.transform(features)
        return self.svm.predict(features_scaled)
    
    def predict_proba(self, X: np.ndarray, batch_size: int = 32) -> np.ndarray:
        """
        Predict class probabilities for samples.
        
        Args:
            X: Input HHT matrices of shape (n_samples, height, width)
            batch_size: Batch size for feature extraction
        
        Returns:
            Probability array of shape (n_samples, n_classes)
        """
        if not self._is_fitted:
            raise RuntimeError("Classifier must be fitted before predicting")
        
        features = self.extract_features(X, batch_size)
        features_scaled = self.scaler.transform(features)
        return self.svm.predict_proba(features_scaled)
    
    def evaluate(self, X: np.ndarray, y: np.ndarray, 
                 batch_size: int = 32) -> dict:
        """
        Evaluate the classifier on test data.
        
        Args:
            X: Test HHT matrices
            y: True labels
            batch_size: Batch size for feature extraction
        
        Returns:
            Dictionary containing accuracy, predictions, and classification report
        """
        y_pred = self.predict(X, batch_size)
        accuracy = accuracy_score(y, y_pred)
        
        return {
            'accuracy': accuracy,
            'predictions': y_pred,
            'classification_report': classification_report(y, y_pred),
            'confusion_matrix': confusion_matrix(y, y_pred)
        }

## 3. End-to-End Training with Encoder Fine-tuning (Optional)

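The SVM pipeline above never updates the encoder weights. To train the encoder on labeled data, use the model below, which replaces the SVM with a small fully connected classification head so that the whole network can be trained with backpropagation.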
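
The classification head outputs raw logits, and `nn.CrossEntropyLoss` applies log-softmax internally before averaging the negative log-likelihood. The NumPy sketch below spells out that computation; the function name is hypothetical and the all-zero logits are only a sanity-check input.

```python
import numpy as np


def cross_entropy_from_logits(logits, targets):
    """Mean negative log-likelihood of `targets` under softmax(logits)."""
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()


# With identical logits for all six classes, every class gets probability 1/6.
logits = np.zeros((4, 6))
targets = np.array([0, 1, 2, 3])
loss = cross_entropy_from_logits(logits, targets)
print(loss, np.log(6))
```

A loss near ln(6) ≈ 1.79 therefore corresponds to chance-level predictions on six classes, which is a useful reference when reading the training log. The model class below defaults to `n_classes=4`, so it would need `n_classes=6` for the six-class labels used in this notebook.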

In [None]:
class sEMGHHTEndToEndClassifier(nn.Module):
    """
    End-to-end trainable classifier with CNN encoder and linear classification head.
    
    This version allows the encoder to be fine-tuned during training.
    """
    
    def __init__(self, 
                 n_classes: int = 4,
                 in_channels: int = 1,
                 base_channels: int = 64,
                 num_encoder_layers: int = 3,
                 dropout_rate: float = 0.5):
        super(sEMGHHTEndToEndClassifier, self).__init__()
        
        self.encoder = sEMGHHTEncoder(
            in_channels=in_channels,
            base_channels=base_channels,
            num_layers=num_encoder_layers
        )
        
        feature_dim = self.encoder.get_feature_dim()
        
        # Classification head
        self.classifier = nn.Sequential(
            nn.Dropout(dropout_rate),
            nn.Linear(feature_dim, 128),
            nn.ReLU(),
            nn.Dropout(dropout_rate),
            nn.Linear(128, n_classes)
        )
    
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = self.encoder(x)
        logits = self.classifier(features)
        return logits
    
    def get_features(self, x: torch.Tensor) -> torch.Tensor:
        """Extract features without classification."""
        return self.encoder(x)

In [None]:
def train_end_to_end(model: nn.Module,
                     X_train: np.ndarray,
                     y_train: np.ndarray,
                     X_val: np.ndarray,
                     y_val: np.ndarray,
                     epochs: int = 50,
                     batch_size: int = 16,
                     learning_rate: float = 0.001,
                     device: torch.device = torch.device('cpu')) -> dict:
    """
    Train the end-to-end model.
    
    Args:
        model: The model to train
        X_train, y_train: Training data and labels
        X_val, y_val: Validation data and labels
        epochs: Number of training epochs
        batch_size: Batch size
        learning_rate: Learning rate
        device: Device to train on
    
    Returns:
        Dictionary containing training history
    """
    model = model.to(device)
    
    # Ensure correct shape
    if X_train.ndim == 3:
        X_train = X_train[:, np.newaxis, :, :]
        X_val = X_val[:, np.newaxis, :, :]
    
    # Create data loaders
    train_dataset = torch.utils.data.TensorDataset(
        torch.tensor(X_train, dtype=torch.float32),
        torch.tensor(y_train, dtype=torch.long)
    )
    train_loader = torch.utils.data.DataLoader(
        train_dataset, batch_size=batch_size, shuffle=True
    )
    
    val_dataset = torch.utils.data.TensorDataset(
        torch.tensor(X_val, dtype=torch.float32),
        torch.tensor(y_val, dtype=torch.long)
    )
    val_loader = torch.utils.data.DataLoader(
        val_dataset, batch_size=batch_size, shuffle=False
    )
    
    # Loss and optimizer
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode='max', factor=0.5, patience=5
    )
    
    history = {
        'train_loss': [],
        'train_acc': [],
        'val_loss': [],
        'val_acc': []
    }
    
    best_val_acc = 0.0
    
    for epoch in range(epochs):
        # Training phase
        model.train()
        train_loss = 0.0
        train_correct = 0
        train_total = 0
        
        for batch_X, batch_y in train_loader:
            batch_X, batch_y = batch_X.to(device), batch_y.to(device)
            
            optimizer.zero_grad()
            outputs = model(batch_X)
            loss = criterion(outputs, batch_y)
            loss.backward()
            optimizer.step()
            
            train_loss += loss.item() * batch_X.size(0)
            _, predicted = outputs.max(1)
            train_total += batch_y.size(0)
            train_correct += predicted.eq(batch_y).sum().item()
        
        train_loss /= train_total
        train_acc = train_correct / train_total
        
        # Validation phase
        model.eval()
        val_loss = 0.0
        val_correct = 0
        val_total = 0
        
        with torch.no_grad():
            for batch_X, batch_y in val_loader:
                batch_X, batch_y = batch_X.to(device), batch_y.to(device)
                
                outputs = model(batch_X)
                loss = criterion(outputs, batch_y)
                
                val_loss += loss.item() * batch_X.size(0)
                _, predicted = outputs.max(1)
                val_total += batch_y.size(0)
                val_correct += predicted.eq(batch_y).sum().item()
        
        val_loss /= val_total
        val_acc = val_correct / val_total
        
        # Update learning rate
        scheduler.step(val_acc)
        
        # Save history
        history['train_loss'].append(train_loss)
        history['train_acc'].append(train_acc)
        history['val_loss'].append(val_loss)
        history['val_acc'].append(val_acc)
        
        if val_acc > best_val_acc:
            best_val_acc = val_acc
        
        if (epoch + 1) % 10 == 0:
            print(f"Epoch {epoch+1}/{epochs}: "
                  f"Train Loss: {train_loss:.4f}, Train Acc: {train_acc:.4f}, "
                  f"Val Loss: {val_loss:.4f}, Val Acc: {val_acc:.4f}")
    
    print(f"\nBest Validation Accuracy: {best_val_acc:.4f}")
    return history

## 4. Model Saving and Loading

In [None]:
import pickle

def save_svm_classifier(classifier: sEMGHHTClassifier, path: str):
    """Save the SVM-based classifier to disk."""
    # Save encoder
    torch.save(classifier.encoder.state_dict(), f"{path}_encoder.pt")
    
    # Save scaler and SVM
    with open(f"{path}_scaler.pkl", 'wb') as f:
        pickle.dump(classifier.scaler, f)
    
    with open(f"{path}_svm.pkl", 'wb') as f:
        pickle.dump(classifier.svm, f)
    
    print(f"Classifier saved to {path}_*.pt/pkl")

def load_svm_classifier(path: str, device: torch.device = torch.device('cpu')) -> sEMGHHTClassifier:
    """Load a saved SVM-based classifier."""
    classifier = sEMGHHTClassifier(device=device)
    
    # Load encoder
    classifier.encoder.load_state_dict(torch.load(f"{path}_encoder.pt", map_location=device))
    
    # Load scaler and SVM
    with open(f"{path}_scaler.pkl", 'rb') as f:
        classifier.scaler = pickle.load(f)
    
    with open(f"{path}_svm.pkl", 'rb') as f:
        classifier.svm = pickle.load(f)
    
    classifier._is_fitted = True
    print(f"Classifier loaded from {path}_*.pt/pkl")
    return classifier

def save_e2e_model(model: nn.Module, path: str):
    """Save the end-to-end model."""
    torch.save(model.state_dict(), path)
    print(f"Model saved to {path}")

def load_e2e_model(path: str, n_classes: int = 4, 
                   device: torch.device = torch.device('cpu')) -> sEMGHHTEndToEndClassifier:
    """Load a saved end-to-end model."""
    model = sEMGHHTEndToEndClassifier(n_classes=n_classes)
    model.load_state_dict(torch.load(path, map_location=device))
    model.to(device)
    model.eval()
    print(f"Model loaded from {path}")
    return model

## 5. Real Data Training

Train the classifier on real sEMG HHT data from the Kaggle dataset.
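
Labels are derived from the filenames. The sketch below is a condensed restatement of the rules used by `parse_filename` in the next cell, applied to a few made-up filenames. The function name `parse_label` and all example filenames are hypothetical, since the dataset's actual naming scheme is not shown here.

```python
import re


def parse_label(basename):
    """Return a combined label such as 'M_full', or None if unlabeled."""
    if basename.lower().startswith("test"):
        return None  # unlabeled test data
    gender = re.search(r"[_-]([MF])[_-]", basename)
    if gender is None:
        return None
    lower = basename.lower()
    if "fatiguetest" in lower or "full" in lower:
        movement = "full"
    elif "half" in lower:
        movement = "half"
    elif "invalid" in lower or "wrong" in lower:
        movement = "invalid"
    else:
        return None
    return f"{gender.group(1)}_{movement}"


# Hypothetical filenames, for illustration only.
examples = [
    "S01_M_full_01.npz",
    "S02-F-half-03.npz",
    "S03_F_wrong_02.npz",
    "Test_0007.npz",
    "S04_full_01.npz",
]
parsed = [parse_label(name) for name in examples]
for name, label in zip(examples, parsed):
    print(f"{name} -> {label}")
```

The last example has no gender token, so it yields `None`; in the loader below, such files end up in the unlabeled test list alongside the files whose names start with `Test`.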

In [None]:
# Import required modules
import glob
import re
from sklearn.preprocessing import LabelEncoder

def parse_filename(filename):
    """
    Parse a filename to extract the gender and movement-quality labels.
    
    Returns None if the filename starts with 'Test' (unlabeled test data)
    or if no gender or movement-quality token can be found.
    """
    basename = os.path.basename(filename)
    
    # Skip files that start with 'Test'
    if basename.lower().startswith('test'):
        return None
    
    # Extract gender (M or F), delimited by '_' or '-'
    gender_match = re.search(r'[_-]([MF])[_-]', basename)
    if not gender_match:
        return None
    gender = gender_match.group(1)
    
    # Extract movement quality ('fatiguetest' recordings count as full movements)
    basename_lower = basename.lower()
    if 'fatiguetest' in basename_lower or 'full' in basename_lower:
        movement = 'full'
    elif 'half' in basename_lower:
        movement = 'half'
    elif 'invalid' in basename_lower or 'wrong' in basename_lower:
        movement = 'invalid'
    else:
        return None
    
    return {'gender': gender, 'movement': movement}

def load_data_from_directory(data_dir):
    """
    Load HHT matrices from npz files.
    
    Files whose names cannot be parsed into a label (including those starting
    with 'Test') are collected separately as unlabeled test files.
    """
    # Sort so that the sample order is the same on every run
    npz_files = sorted(glob.glob(os.path.join(data_dir, '*.npz')))
    
    X_list = []
    y_list = []
    filenames = []
    test_files = []
    
    # Create label encoder (classes are stored in sorted order)
    all_classes = ['M_full', 'M_half', 'M_invalid', 'F_full', 'F_half', 'F_invalid']
    label_encoder = LabelEncoder()
    label_encoder.fit(all_classes)
    
    for npz_file in npz_files:
        labels = parse_filename(npz_file)
        
        if labels is None:
            test_files.append(npz_file)
            continue
        
        try:
            data = np.load(npz_file)
            if 'hht' in data:
                hht_matrix = data['hht']
            else:
                hht_matrix = data[list(data.keys())[0]]
            
            # Skip matrices that do not match the encoder's expected input size
            if hht_matrix.shape != (256, 256):
                continue
            
            # Create combined label, e.g. 'M_full'
            combined = f"{labels['gender']}_{labels['movement']}"
            label = label_encoder.transform([combined])[0]
            
            X_list.append(hht_matrix)
            y_list.append(label)
            filenames.append(npz_file)
            
        except Exception as e:
            print(f'Error loading {npz_file}: {e}')
            continue
    
    X = np.array(X_list, dtype=np.float32)
    y = np.array(y_list)
    
    print(f'\nLoaded {len(X)} labeled samples')
    print(f'Found {len(test_files)} test files')
    print('\nClass distribution:')
    for i, class_name in enumerate(label_encoder.classes_):
        count = np.sum(y == i)
        print(f'  {class_name}: {count} samples')
    
    return X, y, filenames, test_files, label_encoder

# Load data from Kaggle dataset
if os.path.exists(DATA_DIR):
    X, y, filenames, test_files, label_encoder = load_data_from_directory(DATA_DIR)
else:
    print(f'Data directory not found: {DATA_DIR}')
    print('Please ensure the HILBERTMATRIX_NPZ dataset is added to this notebook.')

In [None]:
# Train the classifier
if 'X' in locals() and len(X) > 0:
    # Split data (stratified so each class keeps its proportion)
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.2, random_state=SEED, stratify=y
    )
    
    print(f'Training set: {X_train.shape[0]} samples')
    print(f'Validation set: {X_val.shape[0]} samples')
    
    # Initialize classifier
    classifier = sEMGHHTClassifier(
        encoder=None,
        svm_kernel='rbf',
        svm_C=10.0,
        svm_gamma='scale',
        device=device
    )
    
    # Train
    print('\n' + '='*60)
    print('Training SVM Classifier')
    print('='*60)
    classifier.fit(X_train, y_train, batch_size=32)
    
    # Evaluate on training set
    train_results = classifier.evaluate(X_train, y_train, batch_size=32)
    print(f'\nTraining Accuracy: {train_results["accuracy"]:.4f}')
    
    # Evaluate on validation set
    val_results = classifier.evaluate(X_val, y_val, batch_size=32)
    print(f'Validation Accuracy: {val_results["accuracy"]:.4f}')
    print('\nClassification Report:')
    print(val_results['classification_report'])
else:
    print('No data loaded. Please check the data directory.')

## 6. Real HHT Transformation (For Reference)

When you have raw sEMG signals, you can use the following function to compute their HHT matrices.

In [None]:
def compute_hht_matrix(signal: np.ndarray, 
                       fs: float, 
                       matrix_size: int = 256,
                       max_imf: int = 10) -> np.ndarray:
    """
    Compute HHT (Hilbert-Huang Transform) matrix from a signal.
    
    This is a reference implementation. It needs the PyEMD module (published
    on PyPI as EMD-signal) and SciPy: pip install EMD-signal scipy
    
    Args:
        signal: 1D input signal
        fs: Sampling frequency
        matrix_size: Output matrix size (matrix_size × matrix_size)
        max_imf: Maximum number of IMFs to extract
    
    Returns:
        HHT matrix of shape (matrix_size, matrix_size)
    """
    try:
        from PyEMD import EMD
        from scipy.signal import hilbert
    except ImportError:
        raise ImportError("Please install EMD-signal (provides PyEMD) and scipy: pip install EMD-signal scipy")
    
    # Perform EMD
    emd = EMD()
    imfs = emd(signal, max_imf=max_imf)
    
    # Compute Hilbert transform for each IMF
    n_samples = len(signal)
    t = np.arange(n_samples) / fs
    
    # Initialize time-frequency matrix (rows: frequency bins, columns: time bins)
    hht_matrix = np.zeros((matrix_size, matrix_size))
    
    for imf in imfs:
        # Compute analytic signal
        analytic = hilbert(imf)
        amplitude = np.abs(analytic)
        phase = np.unwrap(np.angle(analytic))
        
        # Compute instantaneous frequency
        inst_freq = np.diff(phase) / (2 * np.pi) * fs
        inst_freq = np.concatenate([inst_freq, [inst_freq[-1]]])
        inst_freq = np.clip(inst_freq, 0, fs/2)
        
        # Map to time-frequency matrix (vectorized; np.add.at accumulates
        # correctly when several samples fall into the same cell)
        t_idx = np.clip((t / t[-1] * (matrix_size - 1)).astype(int), 0, matrix_size - 1)
        f_idx = np.clip((inst_freq / (fs/2) * (matrix_size - 1)).astype(int), 0, matrix_size - 1)
        np.add.at(hht_matrix, (f_idx, t_idx), amplitude)
    
    # Normalize
    if hht_matrix.max() > 0:
        hht_matrix = hht_matrix / hht_matrix.max()
    
    return hht_matrix.astype(np.float32)


# Example usage (uncomment when you have real data):
# signal = np.random.randn(1000)  # Replace with real sEMG signal
# fs = 1000  # Sampling frequency in Hz
# hht_matrix = compute_hht_matrix(signal, fs, matrix_size=256)
# plt.imshow(hht_matrix, aspect='auto', cmap='hot', origin='lower')
# plt.colorbar()
# plt.show()

In [None]:
# Run inference on test files
# (check that both names exist so this cell does not fail when data loading was skipped)
if 'classifier' in locals() and 'test_files' in locals() and len(test_files) > 0:
    print('\n' + '='*60)
    print('Running Inference on Test Files')
    print('='*60)
    
    X_test_list = []
    valid_test_files = []
    
    for test_file in test_files[:10]:  # Limit to first 10 for demo
        try:
            data = np.load(test_file)
            if 'hht' in data:
                hht_matrix = data['hht']
            else:
                hht_matrix = data[list(data.keys())[0]]
            
            if hht_matrix.shape == (256, 256):
                X_test_list.append(hht_matrix)
                valid_test_files.append(test_file)
        except Exception as e:
            print(f'Error loading {test_file}: {e}')
    
    if len(X_test_list) > 0:
        X_test = np.array(X_test_list, dtype=np.float32)
        y_test_pred = classifier.predict(X_test, batch_size=32)
        y_test_proba = classifier.predict_proba(X_test, batch_size=32)
        
        # Probability columns follow the order of the classes seen by the SVM
        svm_classes = classifier.svm.classes_
        
        print(f'\nPredictions for {len(valid_test_files)} test files:\n')
        for filename, pred, proba in zip(valid_test_files, y_test_pred, y_test_proba):
            pred_idx = np.where(svm_classes == pred)[0][0]
            class_name = label_encoder.inverse_transform([pred])[0]
            confidence = proba[pred_idx]
            print(f'{os.path.basename(filename)}: {class_name} (confidence: {confidence:.4f})')
else:
    print('No test files to process or classifier not trained.')

## 7. Summary and Next Steps

This notebook provides:

1. **CNN Encoder Architecture**: 3-layer convolutional network with Instance Normalization and LeakyReLU
2. **SVM Classifier**: Multi-class classification using extracted CNN features
3. **End-to-End Model**: Optional fully trainable model with neural network classifier
4. **Real Data Training**: Load and train on actual HHT matrices from Kaggle dataset
5. **HHT Computation**: Reference implementation for real sEMG signals

### Hyperparameter Tuning:
- `base_channels`: Number of channels in first conv layer (default: 64)
- `svm_C`: SVM regularization parameter
- `svm_kernel`: SVM kernel type ('rbf', 'linear', 'poly')
- `learning_rate`: Learning rate for end-to-end training
- `dropout_rate`: Dropout rate in end-to-end model
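
Of these, `base_channels` also sets the size of the feature vector passed to the SVM, because the channel count doubles at every encoder layer. The small sketch below restates the encoder's `feature_dim` formula; the function name is hypothetical.

```python
def encoder_feature_dim(base_channels=64, num_layers=3):
    """Feature vector length after global average pooling."""
    return base_channels * 2 ** (num_layers - 1)


for base in (16, 32, 64):
    print(f"base_channels={base}: feature_dim={encoder_feature_dim(base)}")
```

Halving `base_channels` halves the feature dimension, which reduces both the encoder's cost and the number of inputs the SVM has to separate.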

In [None]:
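print("\n" + "="*60)
print("sEMG-HHT CNN Classifier - Ready for Use")
print("="*60)
print(f"\nDevice: {device}")
# `classifier` and `label_encoder` exist only if the data-loading and training cells ran
if 'classifier' in locals():
    print(f"Encoder feature dimension: {classifier.encoder.get_feature_dim()}")
if 'label_encoder' in locals():
    class_names = label_encoder.classes_
    print(f"Number of classes: {len(class_names)}")
    print("\nClass names:")
    for i, name in enumerate(class_names):
        print(f"  {i}: {name}")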