This repo contains the code for my main research project: phenogrouping patients with atrial fibrillation undergoing catheter ablation.
To do this I will train a CNN that takes the surface ECG as input and the intracardiac electrogram as output, so that intracardiac electrical activity can be quantified from the surface ECG alone. The features generated by this CNN are then used for phenogrouping. My dissertation covering the whole project can also be found as a PDF in this folder.
Here most of the pre-processing is really data extraction and cleaning: excluding anyone who does not have 60 seconds' worth of samples (at 1000 Hz that is 60,000 samples) and preventing data leakage, meaning the same patient must never appear in both the train and validation datasets. I can't share the data cleaning pipeline just yet, as it handles private information and I need to check how much can be released. I may share just the code for people who need to extract ECGs from txt files; the class for this is in the exemplar numpy_ecg.ipynb in the importing leads folder.
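To make the leakage rule concrete, here is a minimal sketch of a patient-level split (the helper name and the idea of per-segment patient IDs are my own illustration, not code from the repo): every segment belonging to a given patient lands entirely in train or entirely in validation.

```python
import random

def patient_level_split(segment_patient_ids, val_fraction=0.2, seed=0):
    """Split segment indices into train/val so no patient spans both sets.

    segment_patient_ids: one patient ID per ECG segment (segments from the
    same patient share an ID). The split is done over patients, not segments,
    which is what prevents train/validation leakage.
    """
    patients = sorted(set(segment_patient_ids))
    rng = random.Random(seed)
    rng.shuffle(patients)
    n_val = max(1, int(len(patients) * val_fraction))
    val_patients = set(patients[:n_val])
    train_idx = [i for i, p in enumerate(segment_patient_ids) if p not in val_patients]
    val_idx = [i for i, p in enumerate(segment_patient_ids) if p in val_patients]
    return train_idx, val_idx
```

The same idea is available off the shelf as scikit-learn's GroupShuffleSplit, with the patient ID passed as the group label.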
The two best models are U-Net autoencoder variants; the one with the LSTM bottleneck is slightly more stable than the U-Net with the standard convolutional bottleneck. Here is the LSTM model:
from tensorflow.keras.layers import (Input, Conv1D, MaxPooling1D,
                                     UpSampling1D, LSTM, concatenate)
from tensorflow.keras.models import Model

def build_autoencoder():
    inputs = Input(shape=(2000, 4))  # 2-second windows at 1000 Hz, 4 channels
    # Encoding path
    conv1 = Conv1D(8, 3, activation='relu', padding='same')(inputs)
    conv1 = Conv1D(16, 3, activation='relu', padding='same')(conv1)
    pool1 = MaxPooling1D(pool_size=2)(conv1)
    conv2 = Conv1D(32, 3, activation='relu', padding='same')(pool1)
    conv2 = Conv1D(64, 3, activation='relu', padding='same')(conv2)
    pool2 = MaxPooling1D(pool_size=2)(conv2)
    conv3 = Conv1D(128, 3, activation='relu', padding='same')(pool2)
    conv3 = Conv1D(256, 3, activation='relu', padding='same')(conv3)
    pool3 = MaxPooling1D(pool_size=2)(conv3)
    conv4 = Conv1D(256, 3, activation='relu', padding='same')(pool3)
    conv4 = Conv1D(512, 3, activation='relu', padding='same')(conv4)
    pool4 = MaxPooling1D(pool_size=2)(conv4)
    # Bottleneck with stacked LSTM layers
    lstm5 = LSTM(256, return_sequences=True)(pool4)
    lstm5 = LSTM(128, return_sequences=True)(lstm5)
    lstm5 = LSTM(64, return_sequences=True)(lstm5)
    # Decoding path with skip connections to the encoder feature maps
    up6 = concatenate([UpSampling1D(size=2)(lstm5), conv4], axis=-1)
    conv6 = Conv1D(512, 3, activation='relu', padding='same')(up6)
    conv6 = Conv1D(256, 3, activation='relu', padding='same')(conv6)
    up7 = concatenate([UpSampling1D(size=2)(conv6), conv3], axis=-1)
    conv7 = Conv1D(256, 3, activation='relu', padding='same')(up7)
    conv7 = Conv1D(128, 3, activation='relu', padding='same')(conv7)
    up8 = concatenate([UpSampling1D(size=2)(conv7), conv2], axis=-1)
    conv8 = Conv1D(128, 3, activation='relu', padding='same')(up8)
    conv8 = Conv1D(64, 3, activation='relu', padding='same')(conv8)
    up9 = concatenate([UpSampling1D(size=2)(conv8), conv1], axis=-1)
    conv9 = Conv1D(64, 3, activation='relu', padding='same')(up9)
    conv9 = Conv1D(32, 3, activation='relu', padding='same')(conv9)
    # Output layer: linear activation for continuous electrogram regression
    outputs = Conv1D(4, 3, activation='linear', padding='same')(conv9)
    # Create the model
    model = Model(inputs=inputs, outputs=outputs)
    return model
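The skip connections only line up if each upsampled decoder length matches the stored encoder feature map, which works out exactly here: 2000 samples halved four times gives 125 at the bottleneck, then doubling four times returns to 2000. A small sketch of that arithmetic (a hypothetical helper for illustration, not code from the repo):

```python
def unet_lengths(input_len=2000, depth=4):
    """Trace the sequence length through the encoder poolings and
    decoder upsamplings of the architecture above."""
    enc = [input_len]
    for _ in range(depth):
        enc.append(enc[-1] // 2)   # MaxPooling1D(pool_size=2)
    dec = [enc[-1]]
    for _ in range(depth):
        dec.append(dec[-1] * 2)    # UpSampling1D(size=2)
    return enc, dec
```

Because 2000 is divisible by 2**4, no cropping or padding is needed at the concatenate layers; an input length that is not a multiple of 16 would break the skip connections.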