#Physionet ECG-ID Database Files
---
This notebook downloads the ECG data from the [Physionet ECG-ID Database](http://www.physionet.org/physiobank/database/ecgiddb/) in the form of tab-delimited `.txt` files with no header.

The file names follow this pattern: `Person_##rec_#.txt` (for example, `Person_14rec_1.txt`).

The first column of data is **time (s)**, and the second column of data is **ECG I filtered (mV)**.
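As an illustration of the file layout described above, here is a minimal sketch of reading one exported file back into two Python lists. The function name `load_ecg` is hypothetical (it is not part of this notebook), and the sketch assumes every data line holds exactly two numeric fields separated by whitespace or tabs.

```python
def load_ecg(path):
    """Read a headerless two-column text file: time (s) and ECG I filtered (mV)."""
    times, ecg = [], []
    with open(path) as f:
        for line in f:
            fields = line.split()   # splits on tabs and any surrounding spaces
            if len(fields) < 2:     # skip blank or incomplete lines
                continue
            times.append(float(fields[0]))
            ecg.append(float(fields[1]))
    return times, ecg

# Example call, assuming this file has already been produced by the notebook:
# times, ecg = load_ecg('Person_14rec_1.txt')
```

Plain lists keep the sketch dependency-free; the same file could also be loaded with a numeric library if one is available.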

According to the database description, there are 310 records in total, made up of between 2 and 20 records from each of 90 individuals. Each record is 20 seconds long.

Before running this notebook, install the [Physionet WFDB software](http://www.physionet.org/physiotools/wfdb.shtml) and make sure you can run a `%%bash` cell. Then download the [RECORDS](http://www.physionet.org/physiobank/database/ecgiddb/RECORDS) file from Physionet and put it in the same directory as this notebook.

[Click here](https://drive.google.com/folderview?id=0BwnXy5kOXzB5fjJ3eVZUdVpqbmptVW1KSE8zUFlRM1dyTWp0ekFjdzZxMEk5OFZaYWg0ODg&usp=sharing) to see the results of this notebook.

In [1]:
# Read the RECORDS file into a single string
with open('RECORDS', 'r') as records:
    rec_list = records.read()

# Split into one entry per line. The file ends with a newline,
# so the split leaves an empty final entry; drop it.
rec_list = rec_list.split('\n')
del rec_list[-1]

# Create a dict where the keys are the Person names
# and the values are lists of that person's rec names
rec_dict = {}

for entry in rec_list:
    # Each entry looks like 'Person_14/rec_1'
    person, rec = entry.split('/')
    rec_dict.setdefault(person, []).append(rec)

In [2]:
print(len(rec_dict))
print(rec_dict)

90
{'Person_14': ['rec_1', 'rec_2', 'rec_3'], 'Person_15': ['rec_1', 'rec_2'], 'Person_16': ['rec_1', 'rec_2', 'rec_3'], 'Person_17': ['rec_1', 'rec_2'], 'Person_10': ['rec_1', 'rec_2', 'rec_3', 'rec_4', 'rec_5'], 'Person_11': ['rec_1', 'rec_2', 'rec_3'], 'Person_12': ['rec_1', 'rec_2'], 'Person_13': ['rec_1', 'rec_2'], 'Person_18': ['rec_1', 'rec_2'], 'Person_19': ['rec_1', 'rec_2'], 'Person_90': ['rec_1', 'rec_2'], 'Person_61': ['rec_1', 'rec_2', 'rec_3', 'rec_4'], 'Person_60': ['rec_1', 'rec_2', 'rec_3'], 'Person_63': ['rec_1', 'rec_2', 'rec_3', 'rec_4', 'rec_5', 'rec_6'], 'Person_62': ['rec_1', 'rec_2', 'rec_3'], 'Person_65': ['rec_1', 'rec_2'], 'Person_64': ['rec_1', 'rec_2', 'rec_3'], 'Person_67': ['rec_1', 'rec_2', 'rec_3'], 'Person_66': ['rec_1', 'rec_2'], 'Person_69': ['rec_1', 'rec_2'], 'Person_68': ['rec_1', 'rec_2'], 'Person_72': ['rec_1', 'rec_2', 'rec_3', 'rec_4', 'rec_5', 'rec_6', 'rec_7', 'rec_8'], 'Person_73': ['rec_1', 'rec_2'], 'Person_70': ['rec_1', 'rec_2', 'rec_3'
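The database description quoted earlier gives 310 records in total, with between 2 and 20 records for each of 90 individuals. A quick way to compare a dictionary shaped like `rec_dict` against those figures could look like the sketch below. The helper name `summarize_records` and the sample dictionary are made up for illustration.

```python
def summarize_records(rec_dict):
    """Return (people, total records, fewest per person, most per person)."""
    counts = [len(recs) for recs in rec_dict.values()]
    return len(counts), sum(counts), min(counts), max(counts)

# Small made-up dictionary in the same shape as rec_dict, for illustration only.
sample = {'Person_A': ['rec_1', 'rec_2'],
          'Person_B': ['rec_1', 'rec_2', 'rec_3']}
summary = summarize_records(sample)
```

If the RECORDS file matches the database description, calling `summarize_records(rec_dict)` on the real dictionary should give `(90, 310, 2, 20)`.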

In [3]:
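# Write a temporary file 'people' listing every Person name,
# plus one temporary file per person (named after the person)
# where each line is a rec entry.
# 'people' is opened once in write mode so that re-running
# this cell does not append duplicate names.
with open('people', 'w') as p:
    for k, v in rec_dict.items():
        p.write(k + '\n')
        with open(k, 'w') as f:
            for rec in v:
                f.write(rec + '\n')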

###Use a `%%bash` cell to download Physionet files.
Delete the first 2 lines (headers) from each file.

In [None]:
%%bash
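# For each person and record, export the filtered ECG as text.
# tail -n +3 drops the first 2 lines (headers) before the file is written.
for i in $(cat people); do
    for ii in $(cat "$i"); do
        rdsamp -r "ecgiddb/$i/$ii" -H -f 0 -t 20 -v -ps -s 'ECG I filtered' | tail -n +3 > "$i$ii.txt"
    done
done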
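If a `%%bash` cell is not available, the same export could in principle be driven from Python. The sketch below is an assumption-laden alternative, not the method this notebook uses: the helper names `rdsamp_args`, `drop_header`, and `export_record` are hypothetical, the `rdsamp` flags are copied unchanged from the bash cell, and `rdsamp` must still be installed and on the `PATH`.

```python
import subprocess

def rdsamp_args(person, rec):
    """Build the same rdsamp command line that the bash cell runs."""
    return ['rdsamp', '-r', 'ecgiddb/{0}/{1}'.format(person, rec),
            '-H', '-f', '0', '-t', '20', '-v', '-ps',
            '-s', 'ECG I filtered']

def drop_header(text, n_header=2):
    """Remove the first n_header lines (the column headers) from the output."""
    return ''.join(text.splitlines(True)[n_header:])

def export_record(person, rec):
    """Run rdsamp for one record and write Person##rec##.txt without headers."""
    out = subprocess.check_output(rdsamp_args(person, rec),
                                  universal_newlines=True)
    with open(person + rec + '.txt', 'w') as f:
        f.write(drop_header(out))

# Example loop over the dictionary built earlier (requires rdsamp and network access):
# for person, recs in rec_dict.items():
#     for rec in recs:
#         export_record(person, rec)
```

Working from `rec_dict` directly would also make the temporary `people` and per-person files unnecessary in this variant.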

In [6]:
# Remove temporary files
import os
os.remove('people')
for k in rec_dict:
    os.remove(k)