# Google Colab for Classifying ECG

Project Objectives and Suggestions:

- Main Objective: Identify and classify pathological heartbeats from ECG recordings.
- Dataset: Use the MIT-BIH Arrhythmia Database (https://physionet.org/content/mitdb/1.0.0/), also available in CSV format from Kaggle.
- Data Characteristics: 48 half-hour high-resolution recordings from 47 patients with various cardiac conditions.

## Steps

- Data Preparation:
  - Preprocess the MIT-BIH dataset.
  - Apply data balancing techniques.
- Model Training:
  - Train individual FFNN models per patient.
  - Train individual CNN models per patient.
  - Implement transfer learning with pre-trained CNNs (if time allows).
- Evaluation and Analysis:
  - Measure performance using defined evaluation metrics.
  - Analyze results, particularly for patients with lower accuracy.
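
The data balancing step above could be approached in several ways. The following is a minimal sketch of one of them, random oversampling of the minority class, using NumPy only. The function name `oversample_minority` and the toy arrays are hypothetical, and the plan itself does not commit to this technique.

```python
import numpy as np

def oversample_minority(X, y, seed=0):
    """Randomly duplicate beats of smaller classes until all classes have equal counts."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    idx_parts = []
    for cls, count in zip(classes, counts):
        cls_idx = np.flatnonzero(y == cls)
        # Draw the missing number of samples with replacement from this class.
        extra = rng.choice(cls_idx, size=target - count, replace=True)
        idx_parts.append(np.concatenate([cls_idx, extra]))
    # Shuffle so duplicated beats are not grouped at the end.
    idx = rng.permutation(np.concatenate(idx_parts))
    return X[idx], y[idx]

# Toy data (assumption): 8 "normal" beats (label 0) and 2 "abnormal" beats (label 1).
X = np.arange(10 * 4, dtype=float).reshape(10, 4)
y = np.array([0] * 8 + [1] * 2)
X_bal, y_bal = oversample_minority(X, y)
print(np.bincount(y_bal))  # [8 8]
```

Oversampling should be applied to the training split only; duplicating beats before splitting would leak copies of the same beat into the test set.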

## Detailed Steps

Objective: Train an individual model for each patient, using only data recorded from that patient with a fixed recording instrument.
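
Because each model sees a single patient's beats, every patient needs their own train/test split. The sketch below shows one possible way to do this with a stratified split, so that rare abnormal beats appear in both parts. The helper `stratified_split`, the 20% test fraction, and the label counts are all illustrative assumptions, not values from the dataset.

```python
import numpy as np

def stratified_split(y, test_fraction=0.2, seed=0):
    """Return train/test index arrays that keep each class's proportion."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for cls in np.unique(y):
        cls_idx = rng.permutation(np.flatnonzero(y == cls))
        # Keep at least one beat of every class in the test set.
        n_test = max(1, int(round(test_fraction * len(cls_idx))))
        test_idx.append(cls_idx[:n_test])
        train_idx.append(cls_idx[n_test:])
    return np.concatenate(train_idx), np.concatenate(test_idx)

# Hypothetical labels per patient record (0 = normal, 1 = abnormal).
labels_by_patient = {
    "100": np.array([0] * 90 + [1] * 10),
    "223": np.array([0] * 60 + [1] * 40),
}
splits = {pid: stratified_split(y) for pid, y in labels_by_patient.items()}
for pid, (train, test) in splits.items():
    print(pid, len(train), len(test))
```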

**Research and Literature** 📚
- Read Papers: Explore research based on the MIT-BIH arrhythmia dataset to find interesting objectives and alternative approaches.

**Techniques and Tools** 🛠
- Data Balancing: Address imbalance (more normal than abnormal data).
- Evaluation Metrics: Use accuracy, sensitivity, precision, and F1-score to evaluate model performance.
- Optimizer and Loss Function: Start with the Adam optimizer. Research alternative loss functions if needed (e.g., quadratic loss, L1-norm loss, logistic loss).
- Preprocessing: Apply a Butterworth filter to the data.
- Network Types (Feedforward vs. CNN): Experiment with both types and compare their performance.
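
As an illustration of the Butterworth preprocessing step, here is a minimal sketch using `scipy.signal`. The MIT-BIH recordings are sampled at 360 Hz; the band-pass design, the 0.5 Hz and 40 Hz cutoffs, the filter order, and the synthetic test signal are assumptions chosen for the example and would need tuning on real data.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 360  # sampling rate of the MIT-BIH recordings, in Hz

def bandpass(signal, low_hz=0.5, high_hz=40.0, fs=FS, order=4):
    """Zero-phase Butterworth band-pass filter (cutoffs are assumed values)."""
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    # sosfiltfilt runs the filter forward and backward, so beats are not shifted in time.
    return sosfiltfilt(sos, signal)

# Synthetic stand-in for an ECG trace: a 1.2 Hz component plus two disturbances.
t = np.arange(0, 10, 1 / FS)
clean = np.sin(2 * np.pi * 1.2 * t)
drift = 0.8 * np.sin(2 * np.pi * 0.05 * t)  # slow baseline wander
hum = 0.3 * np.sin(2 * np.pi * 60 * t)      # power-line interference
noisy = clean + drift + hum
filtered = bandpass(noisy)

def rms(x):
    return float(np.sqrt(np.mean(x ** 2)))

print("error before filtering:", rms(noisy - clean))
print("error after filtering: ", rms(filtered - clean))
```

The second-order-sections form (`output="sos"`) is used because it tends to be numerically more stable than transfer-function coefficients when a cutoff is very low relative to the sampling rate.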

**Experimentation with CNNs** (Propose Two CNNs): 🧪
- Identical Architecture: Use the same architecture for both CNNs.
- Different Data Input: Use different data inputs for each CNN.
- Different Pre-processing Techniques: Apply various pre-processing techniques to the same data before feeding it into the network.


*Approach 1: CNN Classification*
**Train CNNs:**
- Create a CNN classification model for each class given in the dataset.

*Approach 2: Transfer Learning (If Time Allows)*
**Transfer Learning Strategy:**
- Train CNN on dataset X with task A.
- Transfer learned features to train on dataset Y with task B.
- Compare results and document changes (note: this approach can be complex to report).
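
The transfer learning strategy above can be sketched as follows. This example assumes PyTorch (the notebook does not name a framework), and the `BeatCNN` architecture, the layer sizes, the 256-sample beat length, and the class counts are hypothetical. It shows only the mechanics of freezing the learned convolutional features and replacing the output layer; no training is performed.

```python
import torch
from torch import nn

class BeatCNN(nn.Module):
    """Small 1D CNN over single-channel beat windows (illustrative architecture)."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # one value per feature map, for any beat length
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x):  # x has shape (batch, 1, samples)
        return self.classifier(self.features(x).squeeze(-1))

def prepare_for_finetuning(model, n_classes):
    """Freeze the convolutional features and attach a fresh output layer for task B."""
    for param in model.features.parameters():
        param.requires_grad = False
    model.classifier = nn.Linear(model.classifier.in_features, n_classes)
    return [p for p in model.parameters() if p.requires_grad]

base = BeatCNN(n_classes=2)  # in practice, trained first on dataset X with task A
trainable = prepare_for_finetuning(base, n_classes=5)
optimizer = torch.optim.Adam(trainable, lr=1e-3)  # updates only the new classifier

logits = base(torch.randn(4, 1, 256))  # four random stand-in beats
print(logits.shape)  # torch.Size([4, 5])
```

Freezing every convolutional layer is the simplest variant. Unfreezing the last convolutional block with a smaller learning rate is a common alternative worth comparing when documenting the results.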

# Get started:

Table of contents:
- Get the data
- Inspect the data (become one with the data)
- Preprocess the data
- Create a model architecture (baseline)
- Fit the model
- Evaluate the model
- Improve the baseline
- Repeat until satisfied

## 1) Get the data

In [None]:
!wget -r -N -c -np https://physionet.org/files/mitdb/1.0.0/

Streaming output truncated to the last 5000 lines.
Length: 5468 (5.3K) [application/octet-stream]
Saving to: ‘physionet.org/files/mitdb/1.0.0/223.atr’


2024-05-16 12:44:09 (1.39 GB/s) - ‘physionet.org/files/mitdb/1.0.0/223.atr’ saved [5468/5468]

--2024-05-16 12:44:09--  https://physionet.org/files/mitdb/1.0.0/223.dat
Reusing existing connection to physionet.org:443.
HTTP request sent, awaiting response... 200 OK
Length: 1950000 (1.9M) [application/octet-stream]
Saving to: ‘physionet.org/files/mitdb/1.0.0/223.dat’


2024-05-16 12:44:16 (265 KB/s) - ‘physionet.org/files/mitdb/1.0.0/223.dat’ saved [1950000/1950000]

--2024-05-16 12:44:16--  https://physionet.org/files/mitdb/1.0.0/223.hea
Reusing existing connection to physionet.org:443.
HTTP request sent, awaiting response... 200 OK
Length: 258 [text/plain]
Saving to: ‘physionet.org/files/mitdb/1.0.0/223.hea’


2024-05-16 12:44:17 (169 MB/s) - ‘physionet.org/files/mitdb/1.0.0/223.hea’ saved [258/258]

--2024-05-16 12:44:17
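
Once the files are downloaded, each record can be read and cut into individual beats. The sketch below assumes the `wfdb` Python package (`pip install wfdb`) for reading, and uses its `rdrecord` and `rdann` functions. The window size, the `BEAT_SYMBOLS` set (possibly incomplete), the use of the first channel, and the binary labeling (`N` as normal, every other beat symbol as abnormal) are simplifying assumptions. The demo at the end runs on synthetic data, so it does not need the download.

```python
import numpy as np

# Directory layout produced by the wget command above; record 223 is used as an example.
RECORD_PATH = "physionet.org/files/mitdb/1.0.0/223"

# Assumed set of beat annotation symbols; other symbols (e.g. rhythm changes) are skipped.
BEAT_SYMBOLS = set("NLRAaJSVFejE/fQ")

def load_record(path=RECORD_PATH):
    """Read the first channel and the beat annotations of one record."""
    import wfdb  # imported lazily so the rest of the sketch runs without it
    record = wfdb.rdrecord(path)
    annotation = wfdb.rdann(path, "atr")
    return record.p_signal[:, 0], annotation.sample, annotation.symbol

def extract_beats(signal, r_peaks, symbols, half_window=90):
    """Cut a fixed window around each annotated beat and assign a binary label."""
    beats, labels = [], []
    for peak, symbol in zip(r_peaks, symbols):
        start, stop = peak - half_window, peak + half_window
        if symbol not in BEAT_SYMBOLS or start < 0 or stop > len(signal):
            continue  # skip non-beat annotations and windows that run off the record
        beats.append(signal[start:stop])
        labels.append(0 if symbol == "N" else 1)
    return np.array(beats), np.array(labels)

# Synthetic demo: four annotations, one of which ("+") is not a beat.
demo_signal = np.random.default_rng(0).normal(size=1000)
beats, labels = extract_beats(demo_signal, [100, 300, 500, 700], ["N", "V", "+", "N"])
print(beats.shape, labels)  # (3, 180) [0 1 0]
```

With real data the call would be `extract_beats(*load_record())`. A half-window of 90 samples corresponds to 0.25 s on each side of the annotation at 360 Hz.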

Notes:
- Hyperparameter tuning.
- Understand the techniques.
- Focus on the CNN.
- Train on all patients first, then fine-tune for each patient individually (transfer learning per patient).

Evaluation: confusion matrix, F1-score (combines precision and recall).
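
For the evaluation step, the following sketch derives the confusion matrix and the listed metrics for a binary normal/abnormal task with NumPy. The function name `binary_metrics` and the example predictions are made up for illustration; label 1 is assumed to mean "abnormal".

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Confusion matrix, accuracy, sensitivity (recall), precision, and F1 for labels 0/1."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    return {
        "confusion_matrix": [[tn, fp], [fn, tp]],  # rows: true class, columns: predicted
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": 2 * precision * sensitivity / denom if denom else 0.0,
    }

# Made-up predictions: 6 normal beats and 4 abnormal beats.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics["confusion_matrix"])  # [[5, 1], [1, 3]]
print(metrics["accuracy"], metrics["f1"])  # 0.8 0.75
```

On imbalanced data, accuracy alone can look good even when most abnormal beats are missed, which is why sensitivity and F1 are reported alongside it.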