# Extract indices for patient IDs

The pre-processed dataset does not contain the patient IDs, only their indices (0-47).
To extract the correct patient from the pre-processed dataset, the matching index has to be selected.
This is especially required for the inter-patient training on differentially private data.

In [None]:
import scipy.io as spio

In [None]:
DS1 = [101, 106, 108, 109, 112, 114, 115, 116, 118, 119, 122, 124, 201, 203, 205, 207, 208, 209, 215, 220, 223, 230]
DS2 = [100, 103, 105, 111, 113, 117, 121, 123, 200, 202, 210, 212, 213, 214, 219, 221, 222, 228, 231, 232, 233, 234]
left_out = [102, 104, 107, 217]  # patients with pacemakers, intentionally left out of the inter-patient analysis

First, collect all patient IDs in one list, sorted in ascending order.

In [None]:
all_patient_ids = DS1 + DS2 + left_out
all_patient_ids.sort()
len(all_patient_ids)

48
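Before building the mapping, it can help to confirm that the three groups do not overlap, since a duplicated ID would silently shift every index after it. The check below is a minimal sketch that repeats the ID lists from the cells above so it runs on its own; `all_ids` and `group_sizes` are names introduced here for illustration.

```python
# Patient ID lists, copied from the cells above
DS1 = [101, 106, 108, 109, 112, 114, 115, 116, 118, 119, 122, 124,
       201, 203, 205, 207, 208, 209, 215, 220, 223, 230]
DS2 = [100, 103, 105, 111, 113, 117, 121, 123, 200, 202, 210, 212,
       213, 214, 219, 221, 222, 228, 231, 232, 233, 234]
left_out = [102, 104, 107, 217]

all_ids = DS1 + DS2 + left_out

# Every ID must be unique across the three groups
assert len(set(all_ids)) == len(all_ids), "duplicate patient ID found"

# Group sizes: DS1, DS2, left_out
group_sizes = (len(DS1), len(DS2), len(left_out))
print(group_sizes)  # (22, 22, 4)
```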

Create a dictionary with the patient ID as key and its index as value.

In [3]:
patients_dict = {value: index for index, value in enumerate(all_patient_ids)}
patients_dict

{100: 0,
 101: 1,
 102: 2,
 103: 3,
 104: 4,
 105: 5,
 106: 6,
 107: 7,
 108: 8,
 109: 9,
 111: 10,
 112: 11,
 113: 12,
 114: 13,
 115: 14,
 116: 15,
 117: 16,
 118: 17,
 119: 18,
 121: 19,
 122: 20,
 123: 21,
 124: 22,
 200: 23,
 201: 24,
 202: 25,
 203: 26,
 205: 27,
 207: 28,
 208: 29,
 209: 30,
 210: 31,
 212: 32,
 213: 33,
 214: 34,
 215: 35,
 217: 36,
 219: 37,
 220: 38,
 221: 39,
 222: 40,
 223: 41,
 228: 42,
 230: 43,
 231: 44,
 232: 45,
 233: 46,
 234: 47}
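When results are reported per index, the reverse lookup (index to patient ID) is often needed as well. The sketch below inverts the dictionary; `index_to_patient` is a hypothetical name that does not appear in the original notebook, and the ID lists are repeated so the snippet is self-contained.

```python
# Patient ID lists, copied from the cells above
DS1 = [101, 106, 108, 109, 112, 114, 115, 116, 118, 119, 122, 124,
       201, 203, 205, 207, 208, 209, 215, 220, 223, 230]
DS2 = [100, 103, 105, 111, 113, 117, 121, 123, 200, 202, 210, 212,
       213, 214, 219, 221, 222, 228, 231, 232, 233, 234]
left_out = [102, 104, 107, 217]

# Same mapping as above: patient ID -> index in the sorted ID list
all_patient_ids = sorted(DS1 + DS2 + left_out)
patients_dict = {pid: idx for idx, pid in enumerate(all_patient_ids)}

# Inverse mapping (hypothetical helper): index -> patient ID
index_to_patient = {idx: pid for pid, idx in patients_dict.items()}

print(index_to_patient[3])   # 103
print(index_to_patient[36])  # 217
```

Because the IDs are unique, the inversion is lossless: both dictionaries have 48 entries.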

Recreate the DS1 and DS2 lists with patient indices instead of patient IDs.

In [4]:
DS1_idx = [patients_dict[key] for key in DS1]
DS2_idx = [patients_dict[key] for key in DS2]

In [5]:
DS2_idx

[0,
 3,
 5,
 10,
 12,
 16,
 19,
 21,
 23,
 25,
 31,
 32,
 33,
 34,
 37,
 39,
 40,
 42,
 44,
 45,
 46,
 47]

DS1_idx = [1, 6, 8,  9, 11, 13, 14, 15, 17, 18, 20, 22, 24, 26, 27, 28, 29, 30, 35, 38, 41, 43]

DS2_idx = [0, 3, 5, 10, 12, 16, 19, 21, 23, 25, 31, 32, 33, 34, 37, 39, 40, 42, 44, 45, 46, 47]
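As a cross-check, the indices that appear in neither list should be exactly those of the four left-out patients (102, 104, 107 and 217 in the dictionary above). The following is a small sketch of that check; `used_idx` and `left_out_idx` are names introduced here for illustration.

```python
DS1_idx = [1, 6, 8, 9, 11, 13, 14, 15, 17, 18, 20, 22,
           24, 26, 27, 28, 29, 30, 35, 38, 41, 43]
DS2_idx = [0, 3, 5, 10, 12, 16, 19, 21, 23, 25, 31, 32,
           33, 34, 37, 39, 40, 42, 44, 45, 46, 47]

# DS1 and DS2 must not share any patient
assert not set(DS1_idx) & set(DS2_idx)

# Indices 0-47 that belong to neither DS1 nor DS2
used_idx = set(DS1_idx) | set(DS2_idx)
left_out_idx = sorted(set(range(48)) - used_idx)
print(left_out_idx)  # [2, 4, 7, 36]
```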

Extract the counts of available beats per AAMI class and patient.

In [9]:
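# Load the pre-processed dataset and pull out the per-patient label sequences
dict_samples = spio.loadmat('../data/s2s_mitbih_aami.mat')
samples = dict_samples['s2s_mitbih']
labels = samples[0]['seg_labels']

# Split of the DS2 patient indices into shadow train and shadow test subsets
DS2_shadow_train = [0,  5, 12, 19, 23, 31, 33, 37, 44, 45, 46]
DS2_shadow_test  = [3, 10, 16, 21, 25, 32, 34, 39, 40, 42, 47]

# Select the label sequences of the patients in each subset
train_labels = [labels[i] for i in DS2_shadow_train]
test_labels = [labels[i] for i in DS2_shadow_test]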
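The two shadow lists are written out by hand, so it may be worth verifying that together they cover DS2 exactly once. This is a minimal sketch of such a check; it repeats `DS2_idx` from above so that it runs on its own, and `subset_sizes` is a name introduced here.

```python
DS2_idx = [0, 3, 5, 10, 12, 16, 19, 21, 23, 25, 31, 32,
           33, 34, 37, 39, 40, 42, 44, 45, 46, 47]
DS2_shadow_train = [0, 5, 12, 19, 23, 31, 33, 37, 44, 45, 46]
DS2_shadow_test = [3, 10, 16, 21, 25, 32, 34, 39, 40, 42, 47]

# No patient may be in both subsets
assert not set(DS2_shadow_train) & set(DS2_shadow_test)

# Together the subsets must reproduce DS2_idx exactly
assert sorted(DS2_shadow_train + DS2_shadow_test) == DS2_idx

subset_sizes = (len(DS2_shadow_train), len(DS2_shadow_test))
print(subset_sizes)  # (11, 11)
```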

In [10]:
from collections import Counter

label_counts_per_patient = []

for patient_labels in test_labels:
    counts = Counter(patient_labels)
    label_counts_per_patient.append(counts)

for i, counts in enumerate(label_counts_per_patient):
    print(f"Patient {DS2_shadow_test[i]}: {dict(counts)}")

Patient 3: {np.str_('NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN...
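The printed output suggests that each patient's labels arrive as a single long string, so `Counter` counts that whole string once instead of counting beats. One way to get per-class counts is to join the string(s) and count individual characters. The sketch below assumes that every character stands for one beat's AAMI class; `toy_labels` is made-up stand-in data (plain lists instead of the NumPy arrays returned by `loadmat`), not values from the dataset.

```python
from collections import Counter

# Made-up stand-in for two patients: each entry holds one label string,
# mirroring the structure suggested by the output above (assumption)
toy_labels = [["NNNVN"], ["NNSNNV"]]

label_counts_per_patient = []
for patient_labels in toy_labels:
    # Join the string(s) of this patient, then count single characters,
    # so that each beat contributes one count to its class
    beat_string = "".join(patient_labels)
    label_counts_per_patient.append(Counter(beat_string))

for i, counts in enumerate(label_counts_per_patient):
    print(f"Toy patient {i}: {dict(counts)}")
# Toy patient 0: {'N': 4, 'V': 1}
# Toy patient 1: {'N': 4, 'S': 1, 'V': 1}
```

Applied to the real data, the same loop would run over `test_labels` or `train_labels`, provided their entries can be joined into a string in the same way.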