# Biosignal Preprocessing Pipeline Tutorial

This tutorial demonstrates how to build a raw data preprocessing pipeline using the ECG_ACORAI codebase. In this notebook we simulate noisy biosignals (electrodermal activity (EDA), electromyography (EMG), respiration, and ECG) and then apply denoising techniques ranging from basic filtering to wavelet thresholding.

We will cover several scenarios:

1. Scenario 1: Basic Filtering for EDA & Respiration – Using low-pass filters to remove high-frequency noise.
2. Scenario 2: EMG Denoising – Using bandpass filtering, rectification, and smoothing.
3. Scenario 3: Advanced ECG Denoising with Wavelet Thresholding – Leveraging our repository’s denoising module.

Feel free to extend these examples with your own signals and processing functions.

## Setup & Imports

First, we import the necessary libraries. Some key packages are:

- NumPy and SciPy: For numerical operations and filtering
- Matplotlib: For visualization
- NeuroKit2: To simulate sample biosignals
- PyTorch: For tensor-based processing (and to work with our repository code)
- ECG_ACORAI modules: Advanced denoising functions from ecg_processor_torch. The repository supports several denoising techniques for ECG signals, and similar approaches can be adapted for EDA, EMG, and Respiration signals.

Make sure you have installed the repository requirements (e.g. using uv pip install -r requirements.txt).

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import butter, filtfilt
import neurokit2 as nk
import torch

# Import advanced ECG denoising function from our repository
from ecg_processor_torch.advanced_denoising import wavelet_denoise
from ecg_processor_torch.config import ECGConfig # For default config (e.g., sampling rate)

# For inline plotting in Jupyter Notebook
%matplotlib inline

# Set a seed for reproducibility
np.random.seed(42)
torch.manual_seed(42)

## Simulating Raw Biosignals

In this section, we simulate several raw biosignals for demonstration:

- EDA: A slow-varying signal with added high-frequency noise.
- EMG: A higher-frequency oscillatory signal, approximated here by a 20 Hz sine wave plus noise (a simplification that does not model distinct muscle bursts).
- Respiration: A low-frequency oscillatory signal with perturbations.

We generate the EDA, EMG, and Respiration signals with NumPy, and a noisy ECG signal with NeuroKit2.

In [None]:
# Define simulation parameters
fs = 500 # Sampling frequency in Hz
duration = 10 # seconds
t = np.arange(fs * duration) / fs # Time vector with exact 1/fs spacing (linspace would include the endpoint and stretch the step)

# Simulate EDA: slow sinusoidal component + high-frequency noise
eda = 0.5 * np.sin(0.2 * 2 * np.pi * t) + 0.05 * np.random.randn(len(t))

# Simulate EMG: 20 Hz oscillation + noise
# (20 Hz is also the lower bandpass edge used in Scenario 2, so this component is partly attenuated there)
emg = 0.1 * np.sin(20 * 2 * np.pi * t) + 0.02 * np.random.randn(len(t))

# Simulate Respiration: low-frequency oscillation with noise
resp = 0.3 * np.sin(0.5 * 2 * np.pi * t) + 0.03 * np.random.randn(len(t))

# For demonstration, also simulate a noisy ECG signal using NeuroKit2
ecg = nk.ecg_simulate(duration=duration, sampling_rate=fs, noise=0.1)

# Visualize the simulated signals
plt.figure(figsize=(12, 10))

plt.subplot(4, 1, 1)
plt.plot(t, eda, color='C0')
plt.title('Simulated EDA Signal (Raw)')

plt.subplot(4, 1, 2)
plt.plot(t, emg, color='C1')
plt.title('Simulated EMG Signal (Raw)')

plt.subplot(4, 1, 3)
plt.plot(t, resp, color='C2')
plt.title('Simulated Respiration Signal (Raw)')

plt.subplot(4, 1, 4)
plt.plot(t, ecg, color='C3')
plt.title('Simulated ECG Signal (Raw)')

plt.tight_layout()
plt.show()
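
Because each synthetic signal above is built around a known sinusoid, a quick spectral check can confirm that the simulation behaves as intended. The following is a minimal sketch and not part of the ECG_ACORAI codebase: the helper name `dominant_frequency` and the `_demo` variables are hypothetical, and the signals are regenerated so that the block runs on its own.

```python
import numpy as np

def dominant_frequency(signal, fs):
    """Return the frequency (Hz) of the largest peak in the magnitude spectrum."""
    # Remove the mean so the DC bin does not dominate
    spectrum = np.abs(np.fft.rfft(signal - np.mean(signal)))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return freqs[np.argmax(spectrum)]

rng = np.random.default_rng(42)
fs_demo = 500        # Hz
duration_demo = 10   # seconds
t_demo = np.arange(fs_demo * duration_demo) / fs_demo

# Same amplitudes and frequencies as the tutorial's simulated signals
eda_demo = 0.5 * np.sin(2 * np.pi * 0.2 * t_demo) + 0.05 * rng.standard_normal(len(t_demo))
emg_demo = 0.1 * np.sin(2 * np.pi * 20 * t_demo) + 0.02 * rng.standard_normal(len(t_demo))
resp_demo = 0.3 * np.sin(2 * np.pi * 0.5 * t_demo) + 0.03 * rng.standard_normal(len(t_demo))

for name, sig in [("EDA", eda_demo), ("EMG", emg_demo), ("Respiration", resp_demo)]:
    print(f"{name}: dominant frequency ~ {dominant_frequency(sig, fs_demo):.2f} Hz")
```

With a 10-second recording the frequency resolution is 0.1 Hz, so components closer together than that cannot be separated by this check.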

## Preprocessing Functions for Biosignals

We now define a set of preprocessing functions for our signals. These functions include:

- Butterworth Low-pass Filter: For smoothing slow-varying signals (like EDA and Respiration).
- Butterworth Bandpass Filter: For EMG signals (to retain the 20–150 Hz band, for example).
- Wavelet Denoising: (Advanced) Using the repository’s wavelet_denoise function to clean ECG signals.

You can adjust parameters like the cutoff frequencies, filter orders, and more to tune the processing to your needs.

In [None]:
def butter_lowpass(cutoff, fs, order=5):
    """Design a Butterworth low-pass filter; cutoff and fs are in Hz."""
    nyquist = 0.5 * fs
    normal_cutoff = cutoff / nyquist  # Normalize to the Nyquist frequency
    b, a = butter(order, normal_cutoff, btype='low', analog=False)
    return b, a

def filter_signal(data, cutoff, fs, order=5):
    """Apply a zero-phase low-pass filter (forward and backward pass)."""
    b, a = butter_lowpass(cutoff, fs, order=order)
    y = filtfilt(b, a, data)
    return y

# EDA Cleaning: Low-pass filter to smooth the slow-varying signal
def clean_eda(signal, fs):
    cutoff = 0.5  # Hz
    return filter_signal(signal, cutoff, fs, order=3)

# EMG Denoising: Bandpass filter, then rectify and smooth the signal
def butter_bandpass(lowcut, highcut, fs, order=4):
    """Design a Butterworth bandpass filter; lowcut, highcut, and fs are in Hz."""
    nyquist = 0.5 * fs
    low = lowcut / nyquist
    high = highcut / nyquist
    b, a = butter(order, [low, high], btype='band')
    return b, a

def denoise_emg(signal, fs):
    """Return a smoothed EMG envelope: bandpass, rectify, then low-pass."""
    lowcut = 20  # Hz
    highcut = 150  # Hz (must be below Nyquist; for fs=500 Hz, Nyquist=250 Hz)
    b, a = butter_bandpass(lowcut, highcut, fs, order=4)
    filtered = filtfilt(b, a, signal)
    rectified = np.abs(filtered)
    # Smooth the rectified signal with a low-pass filter
    smooth = filter_signal(rectified, cutoff=10, fs=fs, order=3)
    return smooth

# Respiration Cleaning: Low-pass filter to remove high-frequency noise
def clean_respiration(signal, fs):
    cutoff = 1  # Hz
    return filter_signal(signal, cutoff, fs, order=3)
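
When tuning cutoff frequencies and filter orders, it can help to inspect the filter's gain at a few frequencies of interest before applying it. The sketch below uses SciPy's `freqz` on the same designs as above (a 3rd-order low-pass at 0.5 Hz and a 4th-order bandpass at 20 to 150 Hz). The helper `gain_at` and the `_demo` names are hypothetical and only serve this illustration.

```python
import numpy as np
from scipy.signal import butter, freqz

fs_demo = 500  # Hz, matching the tutorial's sampling rate

# Passing fs lets butter accept cutoffs in Hz; this is equivalent to
# dividing the cutoff by the Nyquist frequency by hand.
b_lp, a_lp = butter(3, 0.5, btype="low", fs=fs_demo)
b_bp, a_bp = butter(4, [20, 150], btype="band", fs=fs_demo)

def gain_at(b, a, freqs_hz, fs):
    """Single-pass magnitude response |H(f)| at the requested frequencies (Hz)."""
    _, h = freqz(b, a, worN=np.asarray(freqs_hz, dtype=float), fs=fs)
    return np.abs(h)

lp_gain = gain_at(b_lp, a_lp, [0.2, 0.5, 5.0], fs_demo)
bp_gain = gain_at(b_bp, a_bp, [5.0, 20.0, 60.0, 150.0, 200.0], fs_demo)

# filtfilt runs the filter forward and backward, so its effective gain is |H(f)|**2
print("Low-pass single-pass gain at 0.2, 0.5, 5 Hz:", np.round(lp_gain, 4))
print("Low-pass filtfilt gain at 0.2, 0.5, 5 Hz:   ", np.round(lp_gain ** 2, 4))
print("Bandpass single-pass gain at 5, 20, 60, 150, 200 Hz:", np.round(bp_gain, 4))
```

A Butterworth filter has a gain of about 0.707 (-3 dB) at its cutoff in a single pass. Because `filtfilt` applies the filter twice, the effective gain at the cutoff is about 0.5 (-6 dB), which is worth keeping in mind when a signal component sits close to a band edge.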


## Scenario 1: Basic Filtering for EDA & Respiration

In this scenario, we apply simple low-pass filters to the EDA and Respiration signals. This removes unwanted high-frequency noise while preserving the main trends of these signals.

In [None]:
# Process EDA and Respiration signals
eda_clean = clean_eda(eda, fs)
resp_clean = clean_respiration(resp, fs)

plt.figure(figsize=(12, 6))

plt.subplot(2, 1, 1)
plt.plot(t, eda, label='Raw EDA', alpha=0.6)
plt.plot(t, eda_clean, label='Cleaned EDA', linewidth=2)
plt.title('EDA Signal Processing')
plt.legend()

plt.subplot(2, 1, 2)
plt.plot(t, resp, label='Raw Respiration', alpha=0.6)
plt.plot(t, resp_clean, label='Cleaned Respiration', linewidth=2)
plt.title('Respiration Signal Processing')
plt.legend()

plt.tight_layout()
plt.show()

## Scenario 2: EMG Denoising using Bandpass Filtering, Rectification, and Smoothing

For EMG signals, we first apply a bandpass filter (typically retaining frequencies between 20 and 150 Hz) to remove noise, then rectify the signal by taking its absolute value, and finally smooth it with a low-pass filter. The result is an envelope that highlights the underlying muscle activity.


In [None]:
# Process the EMG signal
emg_clean = denoise_emg(emg, fs)

plt.figure(figsize=(12, 4))
plt.plot(t, emg, label='Raw EMG', alpha=0.6)
plt.plot(t, emg_clean, label='Denoised EMG', linewidth=2)
plt.title('EMG Signal Processing')
plt.xlabel('Time (s)')
plt.ylabel('Amplitude')
plt.legend()
plt.show()
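
The envelope is easier to interpret on a signal that actually contains a burst of activity. The sketch below is an illustration under stated assumptions: the burst is modeled as white noise that is louder between 3 s and 5 s, which is a crude stand-in for real muscle activity, and all `_demo` names and the burst timing are hypothetical. The processing steps mirror `denoise_emg`.

```python
import numpy as np
from scipy.signal import butter, filtfilt

rng = np.random.default_rng(7)
fs_demo = 500
t_demo = np.arange(fs_demo * 10) / fs_demo

# Assumed burst timing: muscle active from 3 s to 5 s, at rest otherwise
active = (t_demo >= 3.0) & (t_demo < 5.0)

# White noise gated by the burst mask (a simplification, not a physiological model)
emg_burst = 0.02 * rng.standard_normal(len(t_demo))            # resting baseline
emg_burst[active] += 0.2 * rng.standard_normal(active.sum())   # burst activity

# Same steps as denoise_emg: bandpass 20-150 Hz, rectify, low-pass at 10 Hz
b_bp, a_bp = butter(4, [20, 150], btype="band", fs=fs_demo)
b_lp, a_lp = butter(3, 10, btype="low", fs=fs_demo)
band_limited = filtfilt(b_bp, a_bp, emg_burst)
envelope = filtfilt(b_lp, a_lp, np.abs(band_limited))

print(f"Mean envelope at rest:     {envelope[~active].mean():.4f}")
print(f"Mean envelope during burst: {envelope[active].mean():.4f}")
```

The envelope should be clearly higher inside the burst window than outside it, which is the property that onset detection or activation measures typically build on.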

## Scenario 3: Advanced ECG Denoising using Wavelet Thresholding

Our repository provides an advanced denoising function based on wavelet thresholding. Here we simulate a noisy ECG signal (using NeuroKit2) and apply the wavelet_denoise function from our repository (located in ecg_processor_torch.advanced_denoising).

The same technique can be applied to other signals such as EDA or EMG, provided the wavelet family, decomposition level, and threshold are chosen to suit that signal's frequency content.

In [None]:
# Simulate a noisy ECG signal (for demonstration purposes)
ecg_noisy = nk.ecg_simulate(duration=duration, sampling_rate=fs, noise=0.1)

# Convert to a Torch tensor (the wavelet_denoise function expects a tensor)
ecg_noisy_tensor = torch.tensor(ecg_noisy, dtype=torch.float32)

# Apply advanced wavelet denoising
ecg_denoised_tensor = wavelet_denoise(ecg_noisy_tensor)

# Convert back to NumPy for plotting (detach first in case the tensor tracks gradients)
ecg_denoised = ecg_denoised_tensor.detach().cpu().numpy()

plt.figure(figsize=(12, 4))
plt.plot(t, ecg_noisy, label='Noisy ECG', alpha=0.6)
plt.plot(t, ecg_denoised, label='Denoised ECG', linewidth=2)
plt.title('Advanced ECG Denoising with Wavelet Thresholding')
plt.xlabel('Time (s)')
plt.ylabel('Amplitude')
plt.legend()
plt.show()
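
To make the idea of wavelet thresholding concrete, here is a minimal NumPy-only sketch. It is not the repository's `wavelet_denoise` implementation, whose internals are not shown in this notebook. The choices made here (a Haar wavelet, four decomposition levels, and a universal soft threshold with a median-based noise estimate) are assumptions for illustration, and the function name `haar_wavelet_denoise` is hypothetical. Smoother wavelet families are often preferred for ECG in practice.

```python
import numpy as np

def haar_wavelet_denoise(x, levels=4):
    """Soft-threshold Haar wavelet denoising (illustrative sketch).

    The input length must be divisible by 2**levels.
    """
    x = np.asarray(x, dtype=float)
    if len(x) % (2 ** levels) != 0:
        raise ValueError("len(x) must be divisible by 2**levels")

    # Forward transform: repeatedly split into approximation and detail coefficients
    approx = x
    details = []  # details[0] is the finest scale
    for _ in range(levels):
        pairs = approx.reshape(-1, 2)
        details.append((pairs[:, 0] - pairs[:, 1]) / np.sqrt(2))
        approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2)

    # Universal threshold; noise level estimated from the finest details
    sigma = np.median(np.abs(details[0])) / 0.6745
    threshold = sigma * np.sqrt(2 * np.log(len(x)))

    # Soft thresholding: shrink every detail coefficient toward zero
    details = [np.sign(d) * np.maximum(np.abs(d) - threshold, 0.0) for d in details]

    # Inverse transform: rebuild from the coarsest level to the finest
    for d in reversed(details):
        rebuilt = np.empty(2 * len(approx))
        rebuilt[0::2] = (approx + d) / np.sqrt(2)
        rebuilt[1::2] = (approx - d) / np.sqrt(2)
        approx = rebuilt
    return approx

# Synthetic test signal (a slow sine, not an ECG) with additive Gaussian noise
rng = np.random.default_rng(0)
n_demo = 4096
t_demo = np.arange(n_demo) / 512
clean_demo = np.sin(2 * np.pi * 1.0 * t_demo)
noisy_demo = clean_demo + 0.2 * rng.standard_normal(n_demo)
denoised_demo = haar_wavelet_denoise(noisy_demo, levels=4)

rmse = lambda a, b: np.sqrt(np.mean((a - b) ** 2))
print(f"RMSE noisy:    {rmse(noisy_demo, clean_demo):.3f}")
print(f"RMSE denoised: {rmse(denoised_demo, clean_demo):.3f}")
```

The three stages (decompose, threshold the detail coefficients, reconstruct) are the core of most wavelet denoising schemes. The wavelet, the number of levels, and the threshold rule are the main parameters to adapt for a given signal.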

## Discussion & Next Steps

In this tutorial we explored several scenarios using the ECG_ACORAI codebase:

- Basic filtering for slow-varying signals (EDA & Respiration)
- EMG denoising through bandpass filtering, rectification, and smoothing
- Advanced ECG denoising using wavelet thresholding

### Next Steps

- Artifact Removal: Integrate routines to detect and eliminate artifacts (e.g. motion artifacts).
- Feature Extraction & Classification: Feed the preprocessed signals into the deep learning modules (e.g. convolutional autoencoders, transformer classifiers) provided in the repository.
- Extend to Other Biosignals: Adapt these techniques to handle additional biosignals such as skin temperature or blood pressure if needed.

This modular pipeline allows you to mix and match various preprocessing techniques depending on the quality of your raw data and the specific requirements of your analysis.
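
One way to make this mix-and-match idea concrete is to treat each preprocessing operation as a step with a uniform signature and to register an ordered list of steps per signal type. The sketch below is a hypothetical design, not an API of the ECG_ACORAI codebase: the names `lowpass`, `bandpass`, `rectify`, `PIPELINES`, and `preprocess` are all introduced here for illustration, with filter settings taken from the scenarios in this notebook.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def lowpass(cutoff, order=3):
    """Return a step that applies a zero-phase Butterworth low-pass filter."""
    def step(signal, fs):
        b, a = butter(order, cutoff, btype="low", fs=fs)
        return filtfilt(b, a, signal)
    return step

def bandpass(lowcut, highcut, order=4):
    """Return a step that applies a zero-phase Butterworth bandpass filter."""
    def step(signal, fs):
        b, a = butter(order, [lowcut, highcut], btype="band", fs=fs)
        return filtfilt(b, a, signal)
    return step

def rectify(signal, fs):
    """Full-wave rectification; fs is unused but keeps the step signature uniform."""
    return np.abs(signal)

# Hypothetical registry: signal type -> ordered list of steps
PIPELINES = {
    "eda": [lowpass(0.5)],
    "resp": [lowpass(1.0)],
    "emg": [bandpass(20, 150), rectify, lowpass(10)],
}

def preprocess(signal, fs, kind):
    """Run every step registered for `kind`, in order."""
    out = np.asarray(signal, dtype=float)
    for step in PIPELINES[kind]:
        out = step(out, fs)
    return out

# Usage on a random test signal
rng = np.random.default_rng(1)
demo_signal = rng.standard_normal(5000)
emg_envelope = preprocess(demo_signal, fs=500, kind="emg")
print(emg_envelope.shape)
```

Adding a new signal type or swapping a step then only requires editing the registry, while the `preprocess` function stays unchanged.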

## Conclusion

This tutorial walked through building a raw data preprocessing pipeline using the ECG_ACORAI codebase. It covered multiple scenarios, from basic filtering to advanced wavelet denoising, and showed how to clean EDA, EMG, Respiration, and ECG signals.

Feel free to customize and extend this pipeline further to meet your research or application needs. Happy coding!