# Interactive ECG Analysis Pipeline

This notebook demonstrates how to use the utility functions in `ecg_utils.py` to process and analyze ECG data collected with Zephyr devices.

Before running the notebook, set up the environment by installing all dependencies listed in `requirements.txt`.

## 1. Import utilities and set up the environment
Import functions from `ecg_utils.py` and other required libraries:

In [1]:
from ecg_utils import import_zephyr_ecg_data, processing_ecg_signal, ecg_feature_extraction
import os
import pandas as pd

Set the data directory and find all the Zephyr ECG data files:

In [2]:
# Prompt the user to input the folder path containing Zephyr ECG data
directory = input("Enter the full path to your Zephyr ECG data folder: ")
ecg_directory = os.path.join(directory, "ecg_data")
ecg_files = [f for f in os.listdir(ecg_directory) if f.endswith('_ECG.csv')]
print(f"Found {len(ecg_files)} ECG files. Example: {ecg_files[:3]}")

Found 45 ECG files. Example: ['3101_2024_05_29-10_55_46_ECG.csv', '3103_2024_05_31-10_54_42_ECG.csv', '3104_2024_05_31-14_58_26_ECG.csv']
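
The example file names above appear to follow an `<id>_<YYYY_MM_DD>-<HH_MM_SS>_ECG.csv` pattern. The sketch below splits such a name into its parts, which can be handy for labelling results later. It assumes the leading field is a participant or device ID and that the timestamp marks the start of the recording; neither is stated in the source. `parse_zephyr_filename` is a hypothetical helper and not part of `ecg_utils.py`.

```python
import re
from datetime import datetime

# Pattern inferred from the example file names printed above (an assumption)
_ECG_NAME = re.compile(
    r"^(?P<id>[^_]+)_(?P<stamp>\d{4}_\d{2}_\d{2}-\d{2}_\d{2}_\d{2})_ECG\.csv$"
)

def parse_zephyr_filename(fname):
    """Return (recording_id, timestamp) for a matching file name, else None."""
    match = _ECG_NAME.match(fname)
    if match is None:
        return None
    stamp = datetime.strptime(match.group("stamp"), "%Y_%m_%d-%H_%M_%S")
    return match.group("id"), stamp

print(parse_zephyr_filename("3101_2024_05_29-10_55_46_ECG.csv"))
```

Returning `None` for non-matching names lets the caller skip unexpected files instead of failing partway through a batch.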


Load and preprocess a single ECG file as an example:

In [5]:
# Load and normalize the data. The loader is called with the folder path;
# example_file only records which file serves as the example.
example_file = ecg_files[0]
file_data = import_zephyr_ecg_data(ecg_directory)

In [18]:
print(file_data[1])

                          HR    BR  Posture  Activity  PeakAccel  BRAmplitude  \
Time                                                                            
2024-05-29 10:55:46.448   65  10.0       72      0.18       0.24          0.0   
2024-05-29 10:55:47.448   62  10.0       77      0.15       0.42          0.0   
2024-05-29 10:55:48.448   66  10.0       86      0.26       0.37          0.0   
2024-05-29 10:55:49.448   68  10.0       91      0.16       0.22          0.0   
2024-05-29 10:55:50.448   69  10.0       96      0.12       0.16          0.0   
...                      ...   ...      ...       ...        ...          ...   
2024-05-29 13:17:15.448  118  17.0       94      0.57       0.96      41143.0   
2024-05-29 13:17:16.448  118  17.0       86      0.42       1.10      33681.0   
2024-05-29 13:17:17.448  118  17.0       78      0.20       0.30      26998.0   
2024-05-29 13:17:18.448  118  17.0       76      0.10       0.14      21227.0   
2024-05-29 13:17:19.448  118

Process the ECG signal: clean, detect R-peaks, and extract heart rate:

In [None]:
# Example: process the ECG signal (clean, detect R-peaks, extract HR)
import numpy as np
import matplotlib.pyplot as plt

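# ecg_data must hold the waveform as a DataFrame with a column 'ECG'.
# The cells above store the loader result in file_data, so ecg_data may not
# exist yet; fall back to None instead of raising a NameError.
try:
    ecg_data = globals().get('ecg_data')
    # Replace 'ECG' with the actual column name if different
    ecg_signal = ecg_data['ECG'] if ecg_data is not None else None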
    if ecg_signal is not None:
        signals = processing_ecg_signal(ecg_signal.values, sampling_rate=250, plot_signal=True)
        print("Processed ECG signal!")
    else:
        print("ECG waveform not found in loaded data.")
except Exception as e:
    print(f"Error during ECG processing: {e}")
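
To make the "extract heart rate" step concrete, the sketch below converts R-peak positions into a mean heart rate. It does not show how `processing_ecg_signal` works internally. It only illustrates the arithmetic, assuming R-peaks are given as sample indices and the sampling rate is 250 Hz as in the cell above. `heart_rate_from_r_peaks` is a hypothetical name.

```python
def heart_rate_from_r_peaks(r_peak_samples, sampling_rate=250):
    """Mean heart rate in beats per minute from R-peak sample indices."""
    if len(r_peak_samples) < 2:
        raise ValueError("need at least two R-peaks to form an RR interval")
    # RR intervals in seconds: spacing between consecutive peaks / sampling rate
    rr_seconds = [
        (later - earlier) / sampling_rate
        for earlier, later in zip(r_peak_samples, r_peak_samples[1:])
    ]
    mean_rr = sum(rr_seconds) / len(rr_seconds)
    return 60.0 / mean_rr

# Synthetic R-peaks spaced exactly one second apart at 250 Hz
print(heart_rate_from_r_peaks([0, 250, 500, 750]))
```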

Extract HRV (Heart Rate Variability) features from the processed ECG signal:

In [None]:
# Example: extract HRV features from the processed ECG signal
try:
    if ecg_signal is not None and signals is not None:
        # Example: create epochs (dummy, as actual event segmentation is context-dependent)
        epochs = [ecg_signal.values]  # Replace with actual epoching if needed
        sr = 250  # Sampling rate
        save_output_folder = os.path.join(directory, 'hrv_results')
        baseline_correction = False
        hrv_features = ecg_feature_extraction(epochs, sr, save_output_folder, baseline_correction)
        print("Extracted HRV features!")
    else:
        print("ECG signal or processed signals not available.")
except Exception as e:
    print(f"Error during HRV feature extraction: {e}")
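
For orientation, two common time-domain HRV measures can be computed directly from RR intervals. The sketch below is not the implementation behind `ecg_feature_extraction`, and the features that function returns may differ. It assumes RR intervals in milliseconds; the function and key names are chosen here for illustration.

```python
import math
import statistics

def time_domain_hrv(rr_ms):
    """Mean RR, SDNN, and RMSSD (all in ms) from a list of RR intervals in ms."""
    if len(rr_ms) < 3:
        raise ValueError("need at least three RR intervals")
    # Successive differences between neighbouring RR intervals
    diffs = [later - earlier for earlier, later in zip(rr_ms, rr_ms[1:])]
    return {
        "mean_rr": statistics.fmean(rr_ms),
        # SDNN: sample standard deviation of the RR intervals
        "sdnn": statistics.stdev(rr_ms),
        # RMSSD: root mean square of the successive differences
        "rmssd": math.sqrt(statistics.fmean([d * d for d in diffs])),
    }

print(time_domain_hrv([800, 810, 790, 805]))
```

SDNN reflects overall variability across the window, while RMSSD emphasises beat-to-beat changes, so the two can diverge on the same recording.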

Process all ECG files and generate a single dataframe of features for all participants:

In [None]:
all_features = []
sr = 250  # Sampling rate in Hz
save_output_folder = os.path.join(directory, 'hrv_results')
baseline_correction = False

for fname in ecg_files:
    # The loader receives the folder, not fname, so every iteration loads the
    # same path; adapt this if import_zephyr_ecg_data expects a per-file path
    subfolder_path = directory
    try:
        ecg_data = import_zephyr_ecg_data(subfolder_path)
        if ecg_data is not None:
            # Replace 'ECG' with the actual column name if different
            ecg_signal = ecg_data['ECG']
            signals = processing_ecg_signal(ecg_signal.values, sampling_rate=sr, plot_signal=False)
            # Example: create epochs (dummy, as actual event segmentation is context-dependent)
            epochs = [ecg_signal.values]
            hrv_features = ecg_feature_extraction(epochs, sr, save_output_folder, baseline_correction)
            # Add file/participant info if available
            if isinstance(hrv_features, pd.DataFrame):
                hrv_features['file_name'] = fname
                all_features.append(hrv_features)
    except Exception as e:
        print(f"Failed to process {fname}: {e}")

# Combine all features into a single DataFrame
if all_features:
    all_features_df = pd.concat(all_features, ignore_index=True)
    print("Processed all ECG files!")
else:
    print("No valid ECG files processed.")
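
The cells above pass the whole recording as a single dummy epoch. Where fixed-length windows are appropriate, one possible replacement is sketched below. It assumes non-overlapping windows and drops any incomplete trailing window. Whether `ecg_feature_extraction` accepts epochs in this form is not confirmed by the source, and `split_into_epochs` is a hypothetical helper.

```python
def split_into_epochs(signal, sampling_rate=250, epoch_seconds=60):
    """Split a 1-D sequence into non-overlapping, equally sized epochs."""
    samples_per_epoch = int(sampling_rate * epoch_seconds)
    if samples_per_epoch <= 0:
        raise ValueError("epoch length must be at least one sample")
    n_full = len(signal) // samples_per_epoch
    # Slicing works for plain lists as well as NumPy arrays
    return [
        signal[i * samples_per_epoch:(i + 1) * samples_per_epoch]
        for i in range(n_full)
    ]

# 1100 samples at 250 Hz with 2 s epochs: two full epochs, 100 samples dropped
demo_epochs = split_into_epochs(list(range(1100)), sampling_rate=250, epoch_seconds=2)
print(len(demo_epochs), [len(epoch) for epoch in demo_epochs])
```

Event-locked segmentation (for example around task markers) would need the event times instead of a fixed window length.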

Save the aggregated ECG features for further analysis:

In [None]:
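output_csv = os.path.join('..', 'rf_training_data', 'ecg_features.csv')
if 'all_features_df' in locals():
    # Create the output folder first; to_csv does not create missing directories
    os.makedirs(os.path.dirname(output_csv), exist_ok=True)
    all_features_df.to_csv(output_csv, index=False)
    print(f"Saved all ECG features to {output_csv}")
else:
    print("No features to save.")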