# Author
[Koen Aerts](https://koenaerts.ca/) @ [Mobia Technology Innovations](https://mobia.io)

[myOpenHealth](https://mobia.io/healthcare/)

# Intro
This notebook reads all the CSV files generated by the "02_import_mitdb_data" notebook and combines them into two new CSV files that can be used to train and test your model. Specifically, it does the following:
* Read all CSV files generated by the "02_import_mitdb_data" notebook.
* Shuffle all records randomly.
* Save 80% of the records in a training CSV file.
* Save 20% of the records in a test CSV file.
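
The shuffle-and-split steps listed above can be sketched on a small synthetic array. This is only an illustration, not the notebook's own cell: the helper name `shuffle_and_split`, the `seed` parameter, and the 10-row demo array are assumptions introduced here. It uses a seeded NumPy generator so the split is repeatable, whereas the notebook itself shuffles without a seed.

```python
import numpy as np

def shuffle_and_split(data, train_fraction=0.8, seed=None):
    """Shuffle the rows of `data` and split them into (train, test).

    Hypothetical helper for illustration; the input array is not modified.
    """
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(data)  # returns a row-shuffled copy
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

# Synthetic stand-in for the combined records: 10 rows, 188 columns
demo = np.arange(10 * 188, dtype=float).reshape(10, 188)
train, test = shuffle_and_split(demo, seed=42)
print(train.shape, test.shape)
```

Every row ends up in exactly one of the two arrays, so the training and test sets do not overlap.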

# Save data
The data is saved to two separate CSV files: one for training and one for testing.
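
As a quick sanity check on the file format, the sketch below writes a small synthetic array with `np.savetxt` and reads it back with `np.loadtxt`. The array contents and the file name `demo_split.csv` are made up for illustration, and a temporary directory is used so nothing is written next to the real data. With `fmt='%f'` values are stored with six decimal places, so the reloaded array matches the original only to that precision.

```python
import os
import tempfile

import numpy as np

# Synthetic records (assumption: 5 rows, same 188-column width as the notebook)
rows = np.random.default_rng(0).random((5, 188))

with tempfile.TemporaryDirectory() as tmpdir:
    out_path = os.path.join(tmpdir, "demo_split.csv")  # hypothetical file name
    np.savetxt(out_path, rows, delimiter=",", fmt="%f")
    # ndmin=2 keeps the result two-dimensional even for a single-row file
    reloaded = np.loadtxt(out_path, delimiter=",", ndmin=2)

print(reloaded.shape)
print(np.allclose(rows, reloaded, atol=1e-6))
```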

In [2]:
from glob import glob
import numpy as np

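# Start with an empty array; each record has 188 columns
alldata = np.empty(shape=[0, 188])
print(alldata.shape)
paths = glob('data_ecg/mitdb/*.csv')
for path in paths:
    print('Loading ', path)
    # ndmin=2 keeps a single-row file two-dimensional so it can be appended
    csvrows = np.loadtxt(path, delimiter=',', ndmin=2)
    alldata = np.append(alldata, csvrows, axis=0)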

print("Final shape of alldata:", alldata.shape)

# Shuffle and separate the data
np.random.shuffle(alldata)
totrows = len(alldata)

# Calculate row counts for 80% training and 20% testing
trainrows = int(totrows * 0.8)  # 80%
testrows = totrows - trainrows  # 20%

# Save data
# np.savetxt accepts a path directly, so no explicit file handle is needed
np.savetxt('../train_human.csv', alldata[:trainrows], delimiter=",", fmt='%f')
np.savetxt('../test_human.csv', alldata[trainrows:], delimiter=",", fmt='%f')

print("Training data saved to '../train_human.csv' with", trainrows, "records.")
print("Testing data saved to '../test_human.csv' with", testrows, "records.")

(0, 188)
Loading  data_ecg/mitdb\100_MLII.csv
Loading  data_ecg/mitdb\101_MLII.csv
Loading  data_ecg/mitdb\103_MLII.csv
Loading  data_ecg/mitdb\105_MLII.csv
Loading  data_ecg/mitdb\106_MLII.csv
Loading  data_ecg/mitdb\107_MLII.csv
Loading  data_ecg/mitdb\108_MLII.csv
Loading  data_ecg/mitdb\109_MLII.csv
Loading  data_ecg/mitdb\111_MLII.csv
Loading  data_ecg/mitdb\112_MLII.csv
Loading  data_ecg/mitdb\113_MLII.csv
Loading  data_ecg/mitdb\114_MLII.csv
Loading  data_ecg/mitdb\115_MLII.csv
Loading  data_ecg/mitdb\116_MLII.csv
Loading  data_ecg/mitdb\117_MLII.csv
Loading  data_ecg/mitdb\118_MLII.csv
Loading  data_ecg/mitdb\119_MLII.csv
Loading  data_ecg/mitdb\121_MLII.csv
Loading  data_ecg/mitdb\122_MLII.csv
Loading  data_ecg/mitdb\123_MLII.csv
Loading  data_ecg/mitdb\124_MLII.csv
Loading  data_ecg/mitdb\200_MLII.csv
Loading  data_ecg/mitdb\201_MLII.csv
Loading  data_ecg/mitdb\202_MLII.csv
Loading  data_ecg/mitdb\203_MLII.csv
Loading  data_ecg/mitdb\205_MLII.csv
Loading  data_ecg/mitdb\207_M