# Task 2 Random Forest

References:
- [ECG Heartbeat Classification: A Deep Transferable Representation](https://arxiv.org/pdf/1805.00794.pdf)

## Load dependencies

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, accuracy_score, precision_score, recall_score

## MIT-BIH Arrhythmia Dataset

- Number of Samples: 109446
- Number of Categories: 5
- Sampling Frequency: 125 Hz
- Data Source: Physionet's MIT-BIH Arrhythmia Dataset
- Classes: {'N': 0, 'S': 1, 'V': 2, 'F': 3, 'Q': 4}
- Remark: All samples are cropped, downsampled and, where necessary, zero-padded so that every row has a fixed length of 188 values.
- The final element of each row is the class label, so each row holds 187 signal values followed by its label.
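The fixed-length remark above can be illustrated with a minimal sketch. The helper `fit_to_length` and the sample values are hypothetical and are not the dataset authors' preprocessing code; the sketch only shows what cropping or zero-padding a beat to 187 signal values means.

```python
def fit_to_length(samples, length=187):
    """Crop or zero-pad a 1-D sequence so it has exactly `length` values."""
    cropped = list(samples[:length])            # drop anything past `length`
    padding = [0.0] * (length - len(cropped))   # zeros needed to fill the row
    return cropped + padding


short_beat = [0.9, 0.4, 0.1]   # made-up values, not real ECG data
row = fit_to_length(short_beat)
print(len(row), row[:5])
```

Appending the class label to such a row would give the 188 values per row described in the list.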


In [2]:
df_mitbih_train = pd.read_csv("../ecg_dataset/mitbih_train.csv", header=None)
df_mitbih_test = pd.read_csv("../ecg_dataset/mitbih_test.csv", header=None)

# print shapes of the dataframes
print("The shape of the mitbih_train is : ", df_mitbih_train.shape)
print("The shape of the mitbih_test is : ", df_mitbih_test.shape)

The shape of the mitbih_train is :  (87554, 188)
The shape of the mitbih_test is :  (21892, 188)
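Before training, it can help to check how the samples are spread across the five classes, since class balance affects how aggregate metrics should be read. The following is a minimal sketch; `class_distribution` is a hypothetical helper and the labels are made up for illustration.

```python
from collections import Counter


def class_distribution(labels):
    """Return {label: (count, fraction)} for an iterable of class labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: (n, n / total) for label, n in sorted(counts.items())}


toy_labels = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 2.0, 4.0]   # made-up labels
for label, (n, frac) in class_distribution(toy_labels).items():
    print(f"class {label}: {n} samples ({frac:.1%})")
```

With the data loaded as above, `class_distribution(df_mitbih_train[187])` should give the same summary for the training labels, because the last column (index 187) holds the class.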


## Build the classification model

Split the training and test sets into a feature matrix (the first 187 columns) and a label vector (the last column).

In [3]:
# Columns 0-186 hold the signal; column 187 holds the class label
Xtrain = df_mitbih_train.iloc[:, :187].to_numpy()
ytrain = df_mitbih_train.iloc[:, 187].to_numpy()

Xtest = df_mitbih_test.iloc[:, :187].to_numpy()
ytest = df_mitbih_test.iloc[:, 187].to_numpy()

In [None]:
# Create a Random Forest classifier
rf = RandomForestClassifier(n_estimators=100, random_state=42)

# Train the classifier
rf.fit(Xtrain, ytrain)

# Predict on the test set
ypred = rf.predict(Xtest)

In [6]:
# Calculate accuracy
accuracy = accuracy_score(ytest, ypred)

# Calculate precision
precision = precision_score(ytest, ypred, average='weighted')

# Calculate recall
recall = recall_score(ytest, ypred, average='weighted')

# Print accuracy, precision, and recall
print("Accuracy:", accuracy)
print("Precision:", precision)
print("Recall:", recall)

# Print classification report
report = classification_report(ytest, ypred)
print(report)

Accuracy: 0.9746939521286314
Precision: 0.9748012566218289
Recall: 0.9746939521286314
              precision    recall  f1-score   support

         0.0       0.97      1.00      0.99     18118
         1.0       0.99      0.61      0.75       556
         2.0       0.98      0.88      0.93      1448
         3.0       0.88      0.64      0.74       162
         4.0       0.99      0.94      0.97      1608

    accuracy                           0.97     21892
   macro avg       0.96      0.81      0.87     21892
weighted avg       0.97      0.97      0.97     21892
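The gap between the macro-average recall (0.81) and the weighted-average recall (0.97) comes from how the two averages treat class size. The sketch below recomputes both from the per-class recall and support in the report above. Because those per-class values are rounded to two decimals, the results only approximate the reported figures.

```python
# Per-class recall and support, copied from the classification report above
recall = [1.00, 0.61, 0.88, 0.64, 0.94]
support = [18118, 556, 1448, 162, 1608]

total = sum(support)

# Macro average: every class counts equally, regardless of size
macro_recall = sum(recall) / len(recall)

# Weighted average: each class counts in proportion to its support
weighted_recall = sum(r * s for r, s in zip(recall, support)) / total

print(f"macro recall:    {macro_recall:.3f}")
print(f"weighted recall: {weighted_recall:.3f}")
```

Class 0 accounts for 18118 of the 21892 test samples, so it dominates the weighted average, while the low recall on classes 1 and 3 pulls the macro average down. Support-weighted recall is mathematically the same quantity as overall accuracy, which is why the `Accuracy` and `Recall` lines printed above are identical.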

