# ECG_explorer

This notebook serves as an introduction to working with ECG data. It will open one set of ECG data and make a few plots.

Information on wfdb is available at https://wfdb-python.readthedocs.io/en/latest/index.html    
Information on working with ECG and wfdb files in other datasets is also available at https://physionet.org/

File organization is expected to follow this pattern:

pilot_data_root           
&emsp;cardiac_ecg    
&emsp;&emsp;manifest.tsv    
&emsp;&emsp;ecg_12lead    
&emsp;&emsp;&emsp;philips_tc30    
&emsp;&emsp;&emsp;&emsp;0001    
&emsp;&emsp;&emsp;&emsp;&emsp;0001_ecg_<uniq_tag>.dat    
&emsp;&emsp;&emsp;&emsp;&emsp;0001_ecg_<uniq_tag>.hea    
&emsp;&emsp;&emsp;&emsp;0002    
&emsp;&emsp;&emsp;&emsp;&emsp;0002_ecg_<uniq_tag>.dat    
&emsp;&emsp;&emsp;&emsp;&emsp;0002_ecg_<uniq_tag>.hea    
&emsp;&emsp;&emsp;&emsp;... etc.
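
As a quick sanity check on this layout, the sketch below walks a directory tree and reports any `.hea` header that lacks a matching `.dat` file. The function name `find_unpaired_headers` is made up for this example, and it assumes only the folder pattern shown above.

```python
from pathlib import Path


def find_unpaired_headers(ecg_root):
    """Return sorted .hea paths under ecg_root that have no sibling .dat file."""
    unpaired = []
    for hea in Path(ecg_root).rglob("*.hea"):
        # with_suffix swaps only the final extension (.hea -> .dat)
        if not hea.with_suffix(".dat").exists():
            unpaired.append(hea)
    return sorted(unpaired)
```

Once `data_root` is set in the custom path cell, calling `find_unpaired_headers(data_root + "/cardiac_ecg/ecg_12lead")` should return an empty list if every header has its waveform file.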

In [None]:
import matplotlib.pyplot as plt
import pandas as pd
import wfdb  # install this from https://pypi.org/project/wfdb/

# from datetime import datetime

In [None]:
print(wfdb.__version__)  # expect 4.2.0

## custom path -- change to match your file structure

In [None]:
data_root = "/Volumes/data/datasets/AIREADI/YEAR2"  # change this to your own path

# Read the manifest

In [None]:
manifest_path = data_root + "/cardiac_ecg/manifest.tsv"
print(manifest_path)

In [None]:
df = pd.read_csv(manifest_path, sep="\t")
print(df.columns)

In [None]:
df["participant_id"].nunique()  # number of unique participants

In [None]:
df.head()

In [None]:
key_columns = [
    "participant_id",
    "wfdb_hea_filepath",
    "Rate",
    "QRS",
]  # optionally view only a few columns

df[key_columns].head(2)

# Select a set of data to explore

The wfdb format splits each record into two files:    
 * basename.dat  # contains the waveforms in a binary format    
 * basename.hea  # contains header and annotation information in ASCII format    
    
Note that the path `ecg_path` built in the cells that follow includes the basename of the final files, but not the .hea (header) or .dat (data) file extension.
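
The extension-free path described above can be built with the standard library alone. This is a minimal sketch: the helper name `record_path` and the example path are hypothetical, and it assumes manifest paths use forward slashes and may or may not start with one.

```python
import posixpath


def record_path(data_root, hea_filepath):
    """Build the extension-free record path that wfdb.rdrecord takes as input."""
    # splitext removes only the final extension, so dots elsewhere in the path are kept
    basename, _ext = posixpath.splitext(hea_filepath)
    # drop any leading slash so the join does not discard data_root
    return posixpath.join(data_root, basename.lstrip("/"))


# Hypothetical paths, for illustration only
print(record_path("/data/root", "/cardiac_ecg/ecg_12lead/philips_tc30/1001/1001_ecg_example.hea"))
```

Splitting on the last dot only, rather than the first, keeps the path intact if a folder or tag in it happens to contain a dot.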

In [None]:
pid = 1001  # select a participant ID

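# .iloc[0] takes the first matching row by position; a plain [0] looks up the index label 0,
# which only exists if the participant happens to be the first row of the manifest
pid_hea = df.loc[df["participant_id"] == pid, "wfdb_hea_filepath"].iloc[0]
pid_basename = pid_hea.rsplit(".", 1)[0]  # strip only the file extension, keeping the path and basename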
print(f"{pid} full path: {pid_hea}")
print(f"{pid} base name only: {pid_basename}")

In [None]:
ecg_path = data_root + pid_basename
print(ecg_path)

In [None]:
!ls {ecg_path + '.*'}

## plot everything using wfdb

This plot will give you an overview of all of the signals. 

Note that there is always a reference pulse at the end of the signal that is 1 mV x 0.2 seconds.
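
To get a feel for the size of that reference pulse in samples, the sketch below converts its 0.2 second width into a sample count and builds an idealized 1 mV rectangular pulse. The 500 Hz sampling rate is an assumed value for illustration only (the real one is available as `record.fs` once the record is loaded), and the rectangular shape is a simplification, not a description of the device output.

```python
def pulse_sample_count(fs, width_s=0.2):
    """Number of samples spanned by a pulse of width_s seconds at fs Hz."""
    return int(round(fs * width_s))


def ideal_pulse(fs, amplitude_mv=1.0, width_s=0.2):
    """Idealized rectangular calibration pulse as a list of millivolt values."""
    return [amplitude_mv] * pulse_sample_count(fs, width_s)


fs_example = 500  # assumed sampling rate in Hz, for illustration only
print(pulse_sample_count(fs_example), "samples")
```

If you want to exclude the pulse from an analysis, one option is to drop that many samples from the end of each trace, after first confirming on the plot where the pulse actually starts.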

In [None]:
record = wfdb.rdrecord(ecg_path)
fig_handle_grids = wfdb.plot_wfdb(
    record, figsize=(10, 14), ecg_grids="all", return_fig=True
)
# fig_handle_grids.savefig("example_wfdb_fig_ecg_grids.png")  # keep in mind that you must safeguard any exported data

## plot and explore selected traces

You may want to do something else with the traces such as view a section more closely, or select only a few traces to view.

Here, we will plot only two of the channels (indices 0 and 6). See the docs at https://wfdb-python.readthedocs.io/en/latest/index.html for more information.
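
To view a section more closely, `wfdb.rdsamp` also accepts `sampfrom` and `sampto` sample indices. The sketch below converts a time window in seconds into those indices; the helper name `window_to_samples` and the example numbers are assumptions made for illustration.

```python
def window_to_samples(start_s, stop_s, fs, sig_len=None):
    """Convert a [start_s, stop_s) window in seconds to (sampfrom, sampto) indices."""
    if stop_s <= start_s:
        raise ValueError("stop_s must be greater than start_s")
    sampfrom = max(0, int(round(start_s * fs)))
    sampto = int(round(stop_s * fs))
    if sig_len is not None:
        sampto = min(sampto, sig_len)  # do not read past the end of the record
    return sampfrom, sampto


# Example with assumed values: seconds 2 to 4 of a 500 Hz recording
sampfrom, sampto = window_to_samples(2.0, 4.0, fs=500)
print(sampfrom, sampto)
# With a real record: wfdb.rdsamp(ecg_path, sampfrom=sampfrom, sampto=sampto, channels=[0, 6])
```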

In [None]:
type(record)  # expect wfdb.io.record.Record

In [None]:
signals, fields = wfdb.rdsamp(
    ecg_path, channels=[0, 6]
)  # select a subset of the channels

In [None]:
wfdb.plot_items(
    signal=signals, fs=fields["fs"], title="Learning to work with some signals"
)

In [None]:
print(signals.shape, type(signals), "\n")
display(signals)

## view header or annotation data

This data may be helpful in interpreting the signals. It is the same information that would appear on a printed PDF report.

Please note that all diagnostic comments are reported directly from the Philips TC30 
and have not been reviewed by a cardiologist.
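
The `fields` dictionary returned by `wfdb.rdsamp` includes keys such as `fs`, `sig_len`, `sig_name`, `units`, and `comments`. As a rough sketch, the helper below derives the recording duration and pairs each channel name with its unit. The helper name `summarize_fields` and the example values are hypothetical and are not taken from this dataset.

```python
def summarize_fields(fields):
    """Return the duration in seconds and a 'name (unit)' label per channel."""
    duration_s = fields["sig_len"] / fields["fs"]
    labels = [
        f"{name} ({unit})"
        for name, unit in zip(fields["sig_name"], fields["units"])
    ]
    return duration_s, labels


# Hypothetical header values, for illustration only
example_fields = {"fs": 500, "sig_len": 5000, "sig_name": ["I", "V1"], "units": ["mV", "mV"]}
duration_s, labels = summarize_fields(example_fields)
print(f"{duration_s:.1f} s", labels)
```

The same call should work on the real `fields` object from the cells above, since it relies only on those four keys.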

In [None]:
display(
    fields
)  # this is information from the header *.hea file that relates to the signals

In [None]:
print("Done.")