# Download & Process MIMIC II Waveform Data

Date: 02.01.2020

Objective: Download and process the waveform data for our HISP MIMIC Project

**Reference:** https://github.com/MIT-LCP/wfdb-python/blob/master/demo.ipynb

In [None]:
from IPython.display import display
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
import os
import shutil

import wfdb

The **rdsamp** function is a simple alternative to **rdrecord** for the common task of extracting the physical signals and a few important descriptor fields. It returns a (signals, fields) pair, whereas **rdrecord** returns a record object that carries several more attributes of the data.



In [None]:
# Read a slice of the record directly from PhysioBank (streaming the many large files is slow)
# Note: newer wfdb releases name this argument pn_dir instead of pb_dir
record = wfdb.rdsamp('s10042-2763-04-22-10-27', sampfrom=1000000, sampto=1100000, pb_dir='mimic2wdb/matched/s10042/')
record

In [None]:
# rdrecord reads the same slice from PhysioBank but returns a record object with more attributes
record = wfdb.rdrecord('s10042-2763-04-22-10-27', sampfrom=1000000, sampto=1100000, pb_dir='mimic2wdb/matched/s10042/')
record

Examine the dictionary of attributes contained within the record

In [None]:
display(record.__dict__)

The function below reads the signals we want to plot. **sampfrom** and **sampto** select the sample range of interest; omit them to read the whole time series.
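Since **sampfrom** and **sampto** are sample indices rather than times, it can help to derive them from a time window. The sketch below does that with a hypothetical helper, `seconds_to_sample_range`, which is not part of wfdb. The sampling frequency of 125 Hz is an assumption here; in practice it should be read from the record header (the `fs` field).

```python
def seconds_to_sample_range(start_s, end_s, fs):
    """Convert a time window in seconds to (sampfrom, sampto) sample indices.

    Hypothetical helper for illustration; not part of wfdb.
    """
    if start_s < 0 or end_s <= start_s:
        raise ValueError("expected 0 <= start_s < end_s")
    sampfrom = int(round(start_s * fs))
    sampto = int(round(end_s * fs))
    return sampfrom, sampto


# Assumed sampling frequency; check the record's 'fs' field before relying on it
ASSUMED_FS = 125

sampfrom, sampto = seconds_to_sample_range(8000, 8800, ASSUMED_FS)
print(sampfrom, sampto)  # prints: 1000000 1100000
```

Under that assumption, the range 1000000 to 1100000 used in this notebook corresponds to an 800-second window starting 8000 seconds into the record.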

In [None]:
# rdsamp returns the signals and a fields dictionary; sampfrom/sampto pick the sample range (not the frequency)
signals2, fields2 = wfdb.rdsamp('s10042-2763-04-22-10-27', sampfrom=1000000, sampto=1100000, pb_dir='mimic2wdb/matched/s10042/')

In [None]:
display(signals2.shape)  # (number of samples, number of channels) confirms the length of the slice
display(fields2)  # shows the attributes of the signal

In [None]:
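# Take channel names and units from the fields dictionary (plot_items expects one entry per channel)
wfdb.plot_items(signals2, sig_name=fields2['sig_name'], sig_units=fields2['units'], title='Record s10042 from the MIMIC II Waveform Database')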

The cell below lists the different waveform databases

In [None]:
# Examine the different waveform databases hosted by PhysioNet
dbs = wfdb.get_dbs()
display(dbs)

The code below looks to be a way to download specific files. This may be a more convenient way of downloading our patient cohort of interest. Perhaps we could use a for loop over the files we want.

**NEXT STEPS: Create a list of the subject_ids + waveform data**
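One way the for loop might look is sketched below. `build_file_list` is a hypothetical helper, not part of wfdb, and it assumes that the subject folder is the part of the record name before the first hyphen (as in `s10042-2763-04-20-10-41`). Paths use forward slashes because they are resolved relative to the database on the server, not on the local file system.

```python
def build_file_list(record_names, extensions=(".hea",)):
    """Build database-relative paths such as 's10042/<record>.hea'.

    Hypothetical helper for illustration; not part of wfdb.
    """
    file_list = []
    for record_name in record_names:
        # Assumption: the subject folder is the record name up to the first '-'
        subject_dir = record_name.split("-", 1)[0]
        for ext in extensions:
            file_list.append(f"{subject_dir}/{record_name}{ext}")
    return file_list


# Example cohort: the two record names that appear in this notebook
cohort_records = ["s10042-2763-04-20-10-41", "s10042-2763-04-22-10-27"]
file_list = build_file_list(cohort_records)
print(file_list)
```

The resulting list could then be passed as the third argument to `wfdb.dl_files`.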

In [None]:
# Demo 18 - Download specified files from a PhysioBank database

# Change to the project's waveform data directory
os.chdir('C:\\Users\\User\\Box Sync\\Projects\\Mimic_HSIP\\Waveform_Data')

# The files to download, relative to the database root (forward slashes, since these are server paths)
file_list = ['s10042/s10042-2763-04-20-10-41.hea']

# Make a temporary download directory in your current working directory
cwd = os.getcwd()
dl_dir = os.path.join(cwd, 'dl_dir_chf')

# Download the listed files
wfdb.dl_files('mimic2wdb/matched', dl_dir, file_list)

# Display the downloaded content in the folder
display(os.listdir(dl_dir))
# Downloaded files keep their subject subfolder by default
#display(os.listdir(os.path.join(dl_dir, 's10042')))

# Cleanup: delete the downloaded directory
# shutil.rmtree(dl_dir)