Signal2Recurrence: Deep Metric Learning for Sequential Signal Analysis

A generalizable deep learning pipeline for analyzing sequential signals through recurrence quantification analysis (RQA) and metric learning. Originally developed for character-level linguistic biomarkers in dementia detection (95.9% AUC), the methodology is designed to carry over to other kinds of sequential signal data.

📖 Overview

This repository implements a methodology that transforms sequential signals into visual recurrence patterns, then learns discriminative embeddings using deep metric learning. The approach is designed to be domain-agnostic. So far it has been validated on speech data; other candidate domains include:

Biomedical Signals: ECG, EEG, EMG, speech patterns
Financial Time Series: Stock prices, trading patterns
Industrial Sensors: Manufacturing quality control, predictive maintenance
Behavioral Data: User interaction sequences, activity recognition
Natural Language: Character or word-level text analysis

🔬 Methodology

The pipeline consists of four key stages:

1. Signal Preprocessing

Converts sequential data into fixed-length representations
Supports custom tokenization/embedding strategies
Handles variable-length sequences with padding
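
The preprocessing steps above can be sketched in plain Python for the character-level case. This is a minimal illustration, not the package's code: the helper names `build_vocab` and `encode_and_pad` are hypothetical, and reserving id 0 for padding is an assumption.

```python
from typing import Dict, List, Optional

PAD_ID = 0  # assumption: id 0 is reserved for padding


def build_vocab(signals: List[str]) -> Dict[str, int]:
    """Map each distinct character to an integer id, starting at 1."""
    chars = sorted({ch for s in signals for ch in s})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode_and_pad(signals: List[str], vocab: Dict[str, int],
                   max_length: Optional[int] = None) -> List[List[int]]:
    """Encode strings as equal-length id sequences (truncate and pad at the end)."""
    if max_length is None:  # auto-detect: use the longest sequence
        max_length = max(len(s) for s in signals)
    encoded = []
    for s in signals:
        ids = [vocab[ch] for ch in s][:max_length]   # truncate at the end
        ids += [PAD_ID] * (max_length - len(ids))    # pad at the end
        encoded.append(ids)
    return encoded


signals = ["abca", "ba"]
vocab = build_vocab(signals)
padded = encode_and_pad(signals, vocab)
print(vocab)   # {'a': 1, 'b': 2, 'c': 3}
print(padded)  # [[1, 2, 3, 1], [2, 1, 0, 0]]
```

Padding at the end keeps the start of every sequence aligned, which is one reasonable default when the beginning of the signal carries the most comparable content.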

2. Recurrence Plot Generation

Transforms signal embeddings into visual recurrence matrices
Captures temporal dynamics and self-similarity patterns
Uses Euclidean distance with adaptive epsilon thresholding
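
A recurrence matrix marks every pair of time steps whose embedded points lie within epsilon of each other. The sketch below shows that idea in plain Python. The function name `recurrence_matrix` is hypothetical, and the adaptive rule used here (epsilon as a fixed fraction of the largest pairwise distance) is an assumption, since the exact rule used by the package is not stated.

```python
import math
from typing import List, Optional, Sequence


def recurrence_matrix(points: Sequence[Sequence[float]],
                      epsilon: Optional[float] = None,
                      fraction: float = 0.1) -> List[List[int]]:
    """Return R where R[i][j] = 1 if ||x_i - x_j|| <= epsilon, else 0."""
    n = len(points)
    dist = [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    if epsilon is None:
        # Assumed adaptive rule: a fixed fraction of the largest pairwise distance.
        epsilon = fraction * max(max(row) for row in dist)
    return [[1 if d <= epsilon else 0 for d in row] for row in dist]


# A toy signal embedded as 1-D points; similar values recur at steps 0/2 and 1/3.
signal = [[0.0], [1.0], [0.1], [1.0]]
for row in recurrence_matrix(signal, epsilon=0.2):
    print(row)
# [1, 0, 1, 0]
# [0, 1, 0, 1]
# [1, 0, 1, 0]
# [0, 1, 0, 1]
```

The matrix is symmetric with ones on the diagonal; rendering it as a binary image gives the recurrence plot that the later stages consume.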

3. Deep Metric Learning (Siamese Network)

Learns discriminative embeddings through contrastive loss
Trains on pairs of similar/dissimilar samples
CNN-based architecture for feature extraction
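
To make the pairing and the loss concrete, here is a small sketch in plain Python. It uses one common form of the contrastive loss, y * d^2 + (1 - y) * max(0, margin - d)^2, where y = 1 marks a similar pair. Whether the package uses exactly this form (for example, with or without a 1/2 factor) is an assumption, and the function names are hypothetical.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple


def make_pairs(labels: Sequence[int]) -> List[Tuple[int, int, int]]:
    """Return (i, j, y) for every sample pair; y = 1 if labels match, else 0."""
    return [(i, j, int(labels[i] == labels[j]))
            for i, j in combinations(range(len(labels)), 2)]


def contrastive_loss(emb_a: Sequence[float], emb_b: Sequence[float],
                     y: int, margin: float = 1.0) -> float:
    """y * d^2 + (1 - y) * max(0, margin - d)^2, with d the Euclidean distance."""
    d = math.dist(emb_a, emb_b)
    return y * d ** 2 + (1 - y) * max(0.0, margin - d) ** 2


print(make_pairs([0, 0, 1]))  # [(0, 1, 1), (0, 2, 0), (1, 2, 0)]

far_a, far_b = [0.0, 0.0], [3.0, 4.0]           # distance 5
print(contrastive_loss(far_a, far_b, y=1))       # large: similar pair that is far apart
print(contrastive_loss(far_a, far_b, y=0))       # zero: dissimilar pair beyond the margin
print(contrastive_loss(far_a, [0.6, 0.0], y=0))  # small penalty: dissimilar but inside margin
```

Similar pairs are pulled together by the squared-distance term, while dissimilar pairs are only penalized when they fall inside the margin.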

4. Classification

Uses learned embeddings for downstream tasks
Supports any classifier (XGBoost, Random Forest, SVM, etc.)
Includes cross-validation and performance metrics
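
Stratified cross-validation keeps the class ratio roughly equal in every fold. The sketch below shows the idea with a simple round-robin assignment and no shuffling. The helper `stratified_folds` is hypothetical and is not the package's implementation; in practice a library routine such as scikit-learn's `StratifiedKFold` would normally be used.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def stratified_folds(labels: Sequence[int], k: int) -> List[List[int]]:
    """Assign sample indices to k folds, spreading each class round-robin."""
    by_class: Dict[int, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds: List[List[int]] = [[] for _ in range(k)]
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


labels = [0, 0, 0, 0, 1, 1]
for fold_id, test_idx in enumerate(stratified_folds(labels, k=2)):
    train_idx = [i for i in range(len(labels)) if i not in test_idx]
    print(fold_id, "train:", train_idx, "test:", sorted(test_idx))
# 0 train: [1, 3, 5] test: [0, 2, 4]
# 1 train: [0, 2, 4] test: [1, 3, 5]
```

Each test fold here holds two samples of class 0 and one of class 1, matching the 2:1 ratio of the full label set.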

🚀 Quick Start

Installation

# Clone the repository
git clone https://github.com/jkevin2010/signal2recurrence.git
cd signal2recurrence

# Install dependencies
pip install -r requirements.txt

Basic Usage

from signal2recurrence import SignalPipeline
import pandas as pd

# Load your sequential data
# Format: DataFrame with 'signal' and 'label' columns
data = pd.read_csv('your_data.csv')

# Initialize pipeline
pipeline = SignalPipeline(
    embedding_dim=32,
    max_sequence_length=None,  # Auto-detect
    recurrence_epsilon=None,    # Auto-calculate
    image_size=(128, 128)
)

# Process signals and generate recurrence plots
pipeline.fit_transform(
    signals=data['signal'],
    labels=data['label'],
    save_plots=True,
    output_dir='recurrence_plots'
)

# Train deep metric learning model
pipeline.train_siamese_network(
    epochs=20,
    batch_size=16,
    validation_split=0.2
)

# Extract embeddings
embeddings = pipeline.get_embeddings()

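# Train classifier
# NOTE: y_train and y_test are placeholders for the label arrays that match
# the pipeline's train/test split; the DataFrame loaded above only has a
# single 'label' column, so 'label_train'/'label_test' do not exist in it.
from xgboost import XGBClassifier
classifier = XGBClassifier(random_state=42)
classifier.fit(embeddings['train'], y_train)

# Evaluate (mean accuracy on the held-out split)
accuracy = classifier.score(embeddings['test'], y_test)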

📁 Repository Structure

signal2recurrence/
├── signal2recurrence/          # Main package
│   ├── __init__.py
│   ├── preprocessing.py        # Signal preprocessing & embedding
│   ├── recurrence.py          # Recurrence plot generation
│   ├── siamese.py             # Siamese network implementation
│   ├── pipeline.py            # End-to-end pipeline
│   └── utils.py               # Utility functions
├── examples/                   # Usage examples
│   ├── speech_analysis.py     # Character-level speech example
│   ├── ecg_classification.py  # ECG signal example
│   └── custom_signals.py      # Generic signal template
├── tests/                      # Unit tests
├── notebooks/                  # Jupyter notebooks
│   └── demo.ipynb             # Interactive demo
├── requirements.txt
├── setup.py
├── LICENSE
└── README.md

🔧 Detailed Configuration

Preprocessing Options

from signal2recurrence.preprocessing import SignalPreprocessor

preprocessor = SignalPreprocessor(
    tokenization='character',     # 'character', 'word', 'custom'
    embedding_type='learned',     # 'learned', 'onehot', 'pretrained'
    embedding_dim=32,
    max_length=None,             # Auto-detect or specify
    padding='post',              # 'post' or 'pre'
    truncation='post'            # 'post' or 'pre'
)
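
As an illustration of what an `embedding_type='onehot'` setting could produce, the sketch below turns a padded sequence of token ids into one-hot vectors. It assumes integer ids starting at 1 with id 0 reserved for padding; that convention and the helper name `one_hot` are assumptions, not the package's documented behavior.

```python
from typing import List


def one_hot(ids: List[int], vocab_size: int) -> List[List[float]]:
    """Turn token ids 1..vocab_size into one-hot rows; id 0 (padding) -> all zeros."""
    rows = []
    for token_id in ids:
        row = [0.0] * vocab_size
        if token_id > 0:
            row[token_id - 1] = 1.0
        rows.append(row)
    return rows


for row in one_hot([2, 1, 0], vocab_size=3):
    print(row)
# [0.0, 1.0, 0.0]
# [1.0, 0.0, 0.0]
# [0.0, 0.0, 0.0]
```

Each row is one time step of the signal as a vector, which is the form a distance-based recurrence plot needs.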

Recurrence Plot Parameters

from signal2recurrence.recurrence import RecurrencePlotGenerator

rp_generator = RecurrencePlotGenerator(
    epsilon=None,                # Auto-calculate or specify
    distance_metric='euclidean', # 'euclidean', 'cosine', 'manhattan'
    image_size=(128, 128),
    colormap='binary'
)
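
The three `distance_metric` names correspond to standard distance functions. The sketch below writes them out in plain Python so their different scales are visible. The `METRICS` lookup table is a hypothetical illustration of how a string option could be mapped to a function, not the package's internals.

```python
import math
from typing import Callable, Dict, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Straight-line distance."""
    return math.dist(a, b)


def manhattan(a: Vector, b: Vector) -> float:
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine distance: 1 - cosine similarity (undefined for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.hypot(*a) * math.hypot(*b)
    return 1.0 - dot / norm


METRICS: Dict[str, Callable[[Vector, Vector], float]] = {
    "euclidean": euclidean, "cosine": cosine, "manhattan": manhattan,
}

a, b = [1.0, 0.0], [0.0, 1.0]
for name, fn in METRICS.items():
    print(name, round(fn(a, b), 4))
# euclidean 1.4142
# cosine 1.0
# manhattan 2.0
```

Because the same pair of points yields different values under each metric, an epsilon tuned for one metric generally does not transfer to another.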

Siamese Network Architecture

from signal2recurrence.siamese import SiameseNetwork

siamese = SiameseNetwork(
    input_shape=(128, 128, 1),
    base_filters=32,
    embedding_dim=128,
    learning_rate=0.001,
    margin=1.0                   # Contrastive loss margin
)

📊 Performance Metrics

On the original dementia-detection task, the implementation achieved:

ROC AUC: 95.9% (character-level linguistic biomarkers)
Stratified 5-Fold CV: 0.9589 ± 0.0142
Precision/Recall: Balanced across classes
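
For readers who want to see how figures of this kind are computed, the sketch below calculates ROC AUC from its rank-based definition and summarizes per-fold scores as mean ± standard deviation. The fold scores are made-up toy numbers, not the paper's results, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75

# Illustrative per-fold AUCs (toy numbers, NOT the reported results).
fold_aucs = [0.90, 0.95, 1.00]
print(f"{mean(fold_aucs):.4f} ± {stdev(fold_aucs):.4f}")  # 0.9500 ± 0.0500
```

The pairwise definition is quadratic in the number of samples, so it suits small checks; library implementations use sorting to get the same value faster.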

🎯 Use Cases

Medical Applications

Early detection of cognitive decline
Parkinson's disease voice analysis
Sleep apnea detection from breathing patterns

Industrial IoT

Anomaly detection in sensor data
Predictive maintenance from vibration signals
Quality control in manufacturing

Finance

Fraud detection in transaction sequences
Market regime classification
Trading pattern recognition

📚 Citation

If you use this methodology in your research, please cite:

@article{mekulu2025character,
  title={Character-Level Linguistic Biomarkers for Precision Assessment of Cognitive Decline: A Symbolic Recurrence Approach},
  author={Mekulu, Kevin and Aqlan, Faisal and Yang, Hui},
  journal={medRxiv},
  year={2025},
  doi={10.1101/2025.06.12.25329529},
  note={Preprint}
}

Paper: Character-Level Linguistic Biomarkers for Precision Assessment of Cognitive Decline: A Symbolic Recurrence Approach

Published: June 13, 2025 (medRxiv preprint)

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

1. Fork the repository
2. Create your feature branch (git checkout -b feature/AmazingFeature)
3. Commit your changes (git commit -m 'Add some AmazingFeature')
4. Push to the branch (git push origin feature/AmazingFeature)
5. Open a Pull Request

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Developed at Penn State University, Industrial Engineering Department
Funded by NSF I-Corps ($50K)
Forbes 30 Under 30 Healthcare Recognition

📧 Contact

Kevin - jkevin2010.kj@gmail.com

Project Link: https://github.com/jkevin2010/signal2recurrence

Note: This is a research tool. For medical applications, consult with healthcare professionals and follow appropriate regulatory guidelines.
