## Purpose

This notebook pulls ECG recordings from the MIT-BIH Normal Sinus Rhythm Database (https://physionet.org/physiobank/database/nsrdb/) and saves each one to a local .csv file.  
It requires the **wfdb** library: https://pypi.org/project/wfdb/.  
Install it with pip (`pip install wfdb`) or conda (`conda install -c conda-forge wfdb`).

In [1]:
from pathlib import Path
import pandas as pd
import numpy as np
import wfdb
from IPython.display import display

Select the ECG recordings to download by editing the items in `records_lst`.

In [2]:
# https://physionet.org/physiobank/database/nsrdb/
records_lst = [16265, 16272, 16273]

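Get an individual record: `record = wfdb.rdrecord('16265', pb_dir='nsrdb/')`  
Display all attributes of the `record` object and their values: `display(record.__dict__)`  
Newer wfdb releases rename `pb_dir` to `pn_dir`; if the call above raises a `TypeError`, use `pn_dir='nsrdb/'` instead.

See https://github.com/MIT-LCP/wfdb-python/blob/master/demo.ipynb for more information.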

In [3]:
record = wfdb.rdrecord('16265', pb_dir='nsrdb/') # Pull a single record out

In [4]:
record.__dict__ # Examine the "record" object

{'record_name': '16265',
 'n_sig': 2,
 'fs': 128,
 'counter_freq': None,
 'base_counter': None,
 'sig_len': 11730944,
 'base_time': datetime.time(8, 4),
 'base_date': None,
 'comments': ['32 M'],
 'sig_name': ['ECG1', 'ECG2'],
 'p_signal': array([[-0.165, -0.325],
        [-0.155, -0.325],
        [-0.195, -0.305],
        ...,
        [-0.05 , -0.095],
        [-0.05 , -0.085],
        [-0.05 , -0.085]]),
 'd_signal': None,
 'e_p_signal': None,
 'e_d_signal': None,
 'file_name': ['16265.dat', '16265.dat'],
 'fmt': ['212', '212'],
 'samps_per_frame': [1, 1],
 'skew': [None, None],
 'byte_offset': [None, None],
 'adc_gain': [200.0, 200.0],
 'baseline': [0, 0],
 'units': ['mV', 'mV'],
 'adc_res': [12, 12],
 'adc_zero': [0, 0],
 'init_value': [-33, -65],
 'checksum': [15756, -21174],
 'block_size': [0, 0]}
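
Two of the header fields above can be cross-checked by hand. The sketch below copies its values from the output above and assumes the usual WFDB convention that a physical sample equals `(digital - baseline) / adc_gain`; it derives the recording duration from `sig_len` and `fs`, and converts `init_value` to millivolts to compare against the first row of `p_signal`.

```python
# Header values copied from the record.__dict__ output above (record 16265).
fs = 128                   # sampling frequency in Hz
sig_len = 11730944         # number of samples per channel
adc_gain = [200.0, 200.0]  # ADC units per mV
baseline = [0, 0]          # ADC value that corresponds to 0 mV
init_value = [-33, -65]    # first digital sample of each channel

# Recording duration follows from the sample count and the sampling rate.
duration_s = sig_len / fs
duration_h = duration_s / 3600

# Assumed convention: physical value = (digital value - baseline) / gain.
first_mv = [(d - b) / g for d, b, g in zip(init_value, baseline, adc_gain)]

print(duration_s)            # seconds
print(round(duration_h, 2))  # hours
print(first_mv)              # mV, compare with p_signal[0]
```

The converted values match the first row of `p_signal` shown above (-0.165 and -0.325 mV), and the duration works out to roughly 25.5 hours per record.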

Save the ECG data in .csv format. This for loop extracts the second channel (`ECG2`, column index 1) from the `p_signal` array, which this notebook treats as Lead II.  
Lead II is commonly used for rhythm analysis and arrhythmia detection.
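
Extracting one channel amounts to selecting a single column of the 2D `p_signal` array. The following is a minimal sketch on the first three rows from the output above, not the full record; looking the column up through `sig_name` is an illustrative choice rather than something the notebook does.

```python
import numpy as np

# First three rows of p_signal from the output above; columns are ECG1, ECG2.
p_signal = np.array([[-0.165, -0.325],
                     [-0.155, -0.325],
                     [-0.195, -0.305]])
sig_name = ['ECG1', 'ECG2']
fs = 128  # sampling frequency in Hz

channel = sig_name.index('ECG2')  # find the column by its signal name
ecg = p_signal[:, channel]        # one vectorized slice, no Python-level loop

# The slice holds the same values as indexing row by row.
looped = [p_signal[i][1] for i in range(len(p_signal))]
assert ecg.tolist() == looped

# Convert sample numbers to elapsed time in seconds.
time_s = np.arange(len(ecg)) / fs
```

On a record with over 11 million samples, the column slice avoids one Python-level indexing operation per sample.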

In [5]:
# Save each selected ECG record to its own CSV file

output_dir = Path('raw/human/PhysioNet-nsrdb/')
output_dir.mkdir(parents=True, exist_ok=True)

for x in records_lst:
    record = wfdb.rdrecord(str(x), pb_dir='nsrdb/')
    fs = record.fs  # sampling frequency in Hz

    # Second channel (column index 1, 'ECG2'), in mV
    ecg = record.p_signal[:, 1]

    df = pd.DataFrame({'ECG mV': ecg}, dtype=float)
    df.index = df.index / fs  # sample number -> elapsed time in seconds
    df.index.name = 'Time (s)'

    # File name encodes the record number and sampling frequency, e.g. 16265-128.csv
    output_file = f'{x}-{fs}.csv'
    df.to_csv(output_dir / output_file)
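
To check that a saved file has the expected layout, it can be read back with pandas. The sketch below is a self-contained round trip on a few synthetic samples written to a temporary directory; the file name `demo-128.csv` and the sample values are placeholders, not real nsrdb output.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

fs = 128  # sampling frequency in Hz, matching the nsrdb records

# Synthetic stand-in for one channel; not real ECG data.
ecg = np.array([-0.325, -0.325, -0.305, -0.295])
time_index = pd.Index(np.arange(len(ecg)) / fs, name='Time (s)')
df = pd.DataFrame({'ECG mV': ecg}, index=time_index)

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / f'demo-{fs}.csv'  # hypothetical file name
    df.to_csv(csv_path)
    # Restore the time column as the index when loading.
    loaded = pd.read_csv(csv_path, index_col='Time (s)')

# The sampling rate can be recovered from the spacing of the time index.
recovered_fs = 1 / (loaded.index[1] - loaded.index[0])
```

Because the sampling frequency is also part of the file name, a downstream notebook can read it from either place.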

Move on to `HRV-analysis.ipynb` for analysis.