School of Engineering
Department of Computer Science and Engineering (Cyber Security)
(A State Private University under the Karnataka Act No. 20 of 2013)
Approved by UGC & AICTE, New Delhi
Explainable Federated Learning for Secure and Transparent Medical Diagnosis in IoT-based Smart Hospitals
High-Fidelity ML-based ECG Classification using Federated Learning & Explainability
TTEH Lab
This project presents an ECG classification system using Federated Learning combined with Explainable AI (XAI) for secure and transparent medical diagnosis.
Electrocardiogram (ECG) signals are widely used for detecting cardiac abnormalities, but sharing such sensitive patient data across hospitals raises serious privacy concerns. To address this, we implement a Federated Learning framework where multiple simulated hospitals (clients) train models locally on their own ECG data without sharing raw data.
A global model is then constructed by aggregating the locally trained models using the Flower federated learning framework.
In addition to model training, this project integrates Explainability techniques (SHAP) to interpret model predictions, enabling better transparency and trust in clinical decision-making.
The system is evaluated in both centralized and federated settings, demonstrating that federated learning can achieve comparable performance while preserving data privacy.
This work aims to contribute toward privacy-preserving, interpretable AI solutions for smart healthcare systems.
Keywords: Federated Learning, ECG Classification, Explainable AI, Healthcare AI, Privacy-Preserving Machine Learning
- Problem Statement
- Proposed Architecture
- System Architecture
- Components
- How It Works
- Performance Evaluation
- Explainability
- Code Architecture
- Core Modules
- Setup & Usage
- Implementation Results
- Limitations
Cardiovascular diseases are among the leading causes of mortality worldwide. Early detection using ECG signals is crucial for timely diagnosis. However, training machine learning models on ECG data requires access to large amounts of patient data, which raises serious privacy and security concerns.
Traditional centralized learning approaches require data to be collected and stored in a single location, increasing the risk of data breaches and violating healthcare data regulations.
Therefore, there is a need for a privacy-preserving, scalable, and interpretable system that can:
- Train models without sharing sensitive medical data
- Maintain high diagnostic accuracy
- Provide transparency in model predictions
This project addresses these challenges using Federated Learning and Explainable AI.
The system follows a distributed federated learning pipeline where clients collaboratively train a global model without sharing raw data.
| Component | Description | Technology Used |
|---|---|---|
| ECG Dataset | Raw ECG signals from MIT-BIH Arrhythmia Dataset | PhysioNet |
| Data Preprocessing | Signal extraction, segmentation, normalization, labeling | NumPy, WFDB |
| Clients (Hospitals) | Simulated distributed nodes training on local ECG data | Flower Clients |
| Local Model | ECG classification model trained independently at each client | PyTorch |
| Server | Central aggregator coordinating federated learning | Flower Server |
| Aggregation | Combines model weights from all clients using Federated Averaging (FedAvg) | Flower Strategy |
| Global Model | Updated shared model distributed back to clients | PyTorch |
| Explainability | Interprets model predictions using SHAP | SHAP Library |
| Results Storage | Stores training results, logs, and plots | Local Storage |
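The Aggregation row refers to Federated Averaging (FedAvg), where the server averages the client weights and gives each client a weight proportional to its number of local training samples. A minimal pure-Python sketch of that rule is shown here. The function name `fedavg` and the flat-list weight layout are illustrative assumptions; in this project the aggregation is delegated to a Flower strategy rather than written by hand.

```python
from typing import List, Sequence, Tuple


def fedavg(updates: Sequence[Tuple[List[float], int]]) -> List[float]:
    """Sample-weighted average of client weight vectors.

    `updates` holds one (weights, num_examples) pair per client.
    The flat-list layout is a simplification for illustration.
    """
    if not updates:
        raise ValueError("need at least one client update")
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total number of examples must be positive")
    dim = len(updates[0][0])
    if any(len(w) != dim for w, _ in updates):
        raise ValueError("all clients must send weights of the same shape")
    # Each coordinate is averaged, weighted by the client's sample count.
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]


# Three hypothetical hospitals with different local dataset sizes.
client_updates = [
    ([1.0, 2.0], 100),
    ([3.0, 4.0], 300),
    ([5.0, 6.0], 600),
]
global_weights = fedavg(client_updates)
print(global_weights)
```

Weighting by sample count means a hospital with more ECG beats pulls the global model further toward its local solution than a hospital with few beats.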
The proposed system follows a federated learning architecture with explainability support.
- Clients (Hospitals) → Local ECG training
- Server → Model aggregation
- Global Model → Shared knowledge
- Explainability Module → SHAP-based interpretation
- ECG data is loaded from the MIT-BIH dataset
- Signals are preprocessed and converted into training samples
- Data is split into multiple clients (simulated hospitals)
- Each client trains a local model independently
- Flower framework coordinates training across clients
- Model weights are sent to the server
- Server aggregates updates to create a global model
- Process repeats for multiple rounds
- Final global model is evaluated
- SHAP is used to explain model predictions
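The round structure in the steps above (local training, sending weights, aggregation, repetition) can be sketched with a toy one-parameter model standing in for the ECG classifier. Everything in this example is an assumption made for illustration: the linear model, the learning rate, the data, and the function names `local_train` and `run_federated`. The real project coordinates these rounds through Flower.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (input, target)


def local_train(w: float, data: List[Sample], lr: float = 0.05, epochs: int = 5) -> float:
    """Run a few gradient-descent steps on one client's private data."""
    for _ in range(epochs):
        # Gradient of the mean squared error for the model y = w * x.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def run_federated(clients: List[List[Sample]], rounds: int = 10) -> List[float]:
    """Return the global weight after each federated round."""
    global_w = 0.0
    history = []
    for _ in range(rounds):
        # Each client starts from the current global model and trains locally.
        local = [(local_train(global_w, data), len(data)) for data in clients]
        # The server only sees weights and sample counts, never raw data.
        total = sum(n for _, n in local)
        global_w = sum(w * n for w, n in local) / total
        history.append(global_w)
    return history


# Three simulated clients whose private data all follow y = 2x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(0.5, 1.0), (1.5, 3.0), (2.5, 5.0)],
    [(3.0, 6.0)],
]
history = run_federated(clients)
print([round(w, 3) for w in history])
```

The printed history shows the global weight moving toward 2.0 round by round, even though no client ever shares its samples.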
Centralized training:
- Trained on the full dataset
- Achieves stable accuracy
- Serves as the baseline for comparison

Federated training:
- 3 simulated clients
- Model trained without sharing raw data
- Accuracy comparable to the centralized approach

Observations:
- Federated learning preserves privacy
- Minimal drop in accuracy compared to the centralized model
- Model converges successfully across rounds

Metrics:
- Accuracy
- Loss
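For reference, the two metrics can be computed as in the following sketch. The README does not state which loss function is used, so cross-entropy is an assumption here, and the labels and probabilities are made-up values for illustration only.

```python
import math
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_entropy(y_true: Sequence[int], probs: Sequence[Sequence[float]], eps: float = 1e-12) -> float:
    """Mean negative log-probability assigned to the true class."""
    if len(y_true) != len(probs) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    # Clamp probabilities at eps so log(0) cannot occur.
    return -sum(math.log(max(p[t], eps)) for t, p in zip(y_true, probs)) / len(y_true)


# Made-up labels and predicted class probabilities for four beats.
labels = [0, 1, 1, 0]
probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.7, 0.3]]
# Predicted class is the index of the largest probability.
preds = [max(range(len(p)), key=p.__getitem__) for p in probs]

acc = accuracy(labels, preds)
loss = cross_entropy(labels, probs)
print(acc, round(loss, 4))
```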
Federated learning is effective for ECG classification while maintaining data privacy.
To improve model transparency, SHAP (SHapley Additive Explanations) is used to interpret predictions.
- SHAP identifies which parts of the ECG signal contribute most to classification
- Helps understand model decision-making
- Important for clinical trust and validation
- Certain waveform regions (QRS complex) show higher importance
- Model focuses on key ECG patterns for classification
- SHAP summary plots are generated and stored in the results/ folder
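SHAP attributions are based on Shapley values, which the SHAP library approximates for real models. The sketch below computes exact Shapley values by enumerating every feature subset, which is only feasible for a handful of features. It is meant to show what the attributions represent, not how the project's explainability script works. The toy model, the 4-sample window, and the all-zero baseline are hypothetical.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence, Set


def shapley_values(
    model: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Exact Shapley values of `model` at `x` relative to `baseline`."""
    n = len(x)

    def value(subset: Set[int]) -> float:
        # Features in `subset` take their real value, the rest the baseline.
        return model([x[i] if i in subset else baseline[i] for i in range(n)])

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):  # size of the coalition that feature i joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(window: Sequence[float]) -> float:
    """Hypothetical scorer that emphasises sample 2 (a stand-in for the R peak)."""
    return 0.1 * window[0] + 0.2 * window[1] + 3.0 * window[2] + 0.2 * window[3]


window = [0.1, 0.3, 1.0, 0.2]
baseline = [0.0, 0.0, 0.0, 0.0]
phi = shapley_values(toy_model, window, baseline)
print([round(v, 3) for v in phi])
```

The attributions sum to the difference between the model output on the window and on the baseline, and the largest one falls on the sample the toy model weights most heavily. This is the same kind of reading applied to the QRS region in the SHAP plots.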
ECG-Federated-Learning/
│
├── data/
├── notebooks/
├── results/
│ ├── .getkeep
│ ├── accuracy.png
│ ├── ecg_sample.png
│ ├── shap_plot.png
│ ├── system_architecture.png
├── src/
│ ├── model.py
│ ├── data_utils.py
│ ├── train_baseline.py
│ ├── federated_simulation.py
│ ├── explain.py
│ └── config.py
│
├── main.py
├── requirements.txt
└── README.md
data_utils.py handles:
- Dataset loading
- ECG signal extraction
- Preprocessing
- Label creation
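As an illustration of the segmentation and normalization steps, the sketch below cuts fixed-width windows around beat positions and z-score normalizes each window. The window width, the z-score choice, and the function names are assumptions, since the README does not specify them. The signal and peak indices are made up; in the real pipeline they would come from the MIT-BIH record and annotation files.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def segment_beats(signal: Sequence[float], r_peaks: Sequence[int], half_window: int = 2) -> List[List[float]]:
    """Cut one fixed-length window centred on each beat position."""
    beats = []
    for peak in r_peaks:
        start, end = peak - half_window, peak + half_window + 1
        if start < 0 or end > len(signal):
            continue  # skip beats too close to the signal edges
        beats.append(list(signal[start:end]))
    return beats


def zscore(beat: Sequence[float]) -> List[float]:
    """Scale a window to zero mean and unit standard deviation."""
    mu, sigma = mean(beat), pstdev(beat)
    if sigma == 0:
        return [0.0] * len(beat)  # flat window: nothing to scale
    return [(v - mu) / sigma for v in beat]


# Made-up signal with two clear peaks; index 9 is too close to the edge.
signal = [0.0, 0.1, 1.0, 0.1, 0.0, 0.0, 0.2, 1.2, 0.2, 0.0]
r_peaks = [2, 7, 9]
beats = [zscore(b) for b in segment_beats(signal, r_peaks)]
print(len(beats), [round(v, 2) for v in beats[0]])
```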
model.py defines:
- ECG classification model (1D CNN / simple model)
train_baseline.py implements:
- Centralized training
- Performance comparison
federated_simulation.py handles:
- Client creation
- Flower simulation
- Model aggregation
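Client creation needs some rule for dividing the preprocessed samples among the simulated hospitals. The README does not say how the split is done, so the sketch below assumes a shuffled, roughly equal (IID) partition. `split_among_clients` and its parameters are hypothetical names, not functions taken from the project's source.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_among_clients(samples: Sequence[T], num_clients: int = 3, seed: int = 42) -> List[List[T]]:
    """Shuffle samples and deal them into `num_clients` disjoint shards."""
    if num_clients < 1:
        raise ValueError("num_clients must be at least 1")
    indices = list(range(len(samples)))
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(indices)
    # Round-robin dealing keeps shard sizes within one sample of each other.
    return [[samples[i] for i in indices[c::num_clients]] for c in range(num_clients)]


# Ten placeholder beats shared among three simulated hospitals.
beats = [f"beat_{i}" for i in range(10)]
shards = split_among_clients(beats)
print([len(s) for s in shards])
```

A real multi-hospital deployment would rarely be IID, so a split by patient record or by class would be a more demanding test of the same pipeline.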
explain.py implements:
- SHAP explainability
- Visualization of feature importance
config.py contains:
- Hyperparameters
- Training settings
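One possible shape for such a configuration module is a frozen dataclass, sketched below. Only the client count (3), the dataset folder, and the results folder come from this README. Every other field name and value is a placeholder assumption and may differ from the actual file.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Config:
    """Hypothetical settings container; values are placeholders unless noted."""

    num_clients: int = 3         # stated in this README
    num_rounds: int = 5          # assumed value
    local_epochs: int = 1        # assumed value
    batch_size: int = 32         # assumed value
    learning_rate: float = 1e-3  # assumed value
    data_dir: str = "data/mit-bih-arrhythmia-database-1.0.0"  # from this README
    results_dir: str = "results"  # from this README


config = Config()
print(asdict(config))
```

Freezing the dataclass prevents a training script from silently changing a setting partway through a run.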
pip install -r requirements.txt

We use the MIT-BIH Arrhythmia Dataset.
🔗 Download here:
https://physionet.org/content/mitdb/1.0.0
Dataset is NOT included in this repository.
After downloading, place it inside:
data/
└── mit-bih-arrhythmia-database-1.0.0/
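A quick way to confirm the dataset landed in the right place is to look for records that have all three MIT-BIH file types: header (.hea), signal (.dat), and annotation (.atr). The helper below is a hypothetical convenience script, not part of the repository.

```python
from pathlib import Path
from typing import List

REQUIRED_SUFFIXES = (".hea", ".dat", ".atr")


def find_complete_records(data_dir: Path) -> List[str]:
    """Return names of records that have header, signal and annotation files."""
    if not data_dir.is_dir():
        return []
    names = {p.stem for p in data_dir.glob("*.hea")}
    return sorted(
        name
        for name in names
        if all((data_dir / f"{name}{suffix}").is_file() for suffix in REQUIRED_SUFFIXES)
    )


# Path taken from the folder layout described in this README.
records = find_complete_records(Path("data/mit-bih-arrhythmia-database-1.0.0"))
print(f"Found {len(records)} complete records")
```

If the count is zero, the archive was most likely extracted into a different folder or nested one level too deep.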
python src/train_baseline.py
python src/federated_simulation.py
python src/explain.py

- Successfully implemented federated learning with 3 simulated clients
- Verified model training across distributed datasets
- Achieved stable convergence across training rounds
- Demonstrated privacy-preserving training
- Generated SHAP plots for interpretability
- Compared centralized vs federated performance
- The simulation uses a limited number of clients (3 simulated hospitals)
- The dataset is relatively small
- The model architecture is relatively simple
- Federated learning overhead increases computation time
- SHAP explanations may be computationally expensive
| Name | USN | Email |
|---|---|---|
| Poorvika N | ENG23CY0030 | poorvikan99@gmail.com |
| B.Tanusree reddy | ENG23CY0054 | bojja104@gmail.com |
| D.Himaja Sri vyshnavi | ENG23CY0061 | himaja210205@gmail.com |
| K N Navya | ENG23CY0019 | knnavya27@gmail.com |
| Pooja N | ENG23CY0029 | poojanarayan0906@gmail.com |
Dr. Prajwalasimha S N
Associate Professor, Department of Computer Science and Engineering (Cyber Security)
School of Engineering, Dayananda Sagar University
Email: prajwasimha.sn1@gmail.com




