# Dataset Wrapper
Here I explain the reasoning behind the implementation of the dataset wrapper for the vectorcardiograms and dyssynchrony indices.

## Goal
The goal is to iterate through a given dataset with minimal effort when training a neural network in batches. Ideally, calling ```next_batch``` returns the next batch of a specified size from the dataset.

## Next Batch
The ```next_batch``` function has the following additional requirements:
* Deliver a specified number of examples on each call to ```next_batch```, for both the dyssynchrony index and the vectorcardiogram.
* Deliver the batches sequentially. For example, with a batch size of 10, the first batch should contain examples 1 through 10 and the second batch should contain examples 11 through 20. Corner cases are exceptions (such as when the specified batch size is greater than the dataset size, or when we have reached the end of the dataset and need to pull from the beginning).
* Once we have iterated through the entire dataset, start pulling batches from the beginning again. This is reasonable because neural networks are usually trained for more than one epoch (that is, the network sees the entire dataset more than once).
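
The requirements above can be sketched in Python as follows. This is a minimal illustration, not the actual wrapper: the class name ```SequentialBatcher``` and its attributes are hypothetical, the data is held in plain lists, and I assume that an oversized batch simply wraps around the dataset as many times as needed.

```python
class SequentialBatcher:
    """Hypothetical wrapper that serves paired examples in order."""

    def __init__(self, vcgs, indices):
        if len(vcgs) != len(indices):
            raise ValueError("vcgs and indices must have the same length")
        if len(vcgs) == 0:
            raise ValueError("dataset must not be empty")
        self.vcgs = vcgs
        self.indices = indices
        self.position = 0  # index of the next example to deliver

    def next_batch(self, batch_size):
        """Return the next batch_size examples, wrapping at the end."""
        if batch_size <= 0:
            raise ValueError("batch_size must be positive")
        n = len(self.vcgs)
        # Assumption: wrap with modulo, so a batch that runs past the end
        # (or is larger than the dataset) continues from the beginning.
        picks = [(self.position + i) % n for i in range(batch_size)]
        self.position = (self.position + batch_size) % n
        return [self.vcgs[i] for i in picks], [self.indices[i] for i in picks]


# Toy data standing in for vectorcardiograms and dyssynchrony indices.
data = SequentialBatcher(vcgs=["a", "b", "c", "d", "e"], indices=[1, 2, 3, 4, 5])
first = data.next_batch(2)   # (['a', 'b'], [1, 2])
second = data.next_batch(2)  # (['c', 'd'], [3, 4])
third = data.next_batch(2)   # (['e', 'a'], [5, 1]), wraps to the beginning
```

Keeping a single ```position``` counter means both sequences are always indexed together, so a vectorcardiogram is never paired with the wrong dyssynchrony index.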

## Implementation Steps
### *Step 1: Rename Files*
We wish to rename the files for three reasons:
* Impose an ordering on the example numbers 
* Make the filenames more readable
* Keep the filenames predictable and in a well-defined format

Thus, we will execute the following bash script to rename all the ```.txt``` files in the current directory:
```
a=1
for file in allParams-1_ECG_VCG_{1..608}_dump.txt; do

    # Zero-pad the example number to 3 digits
    new=$(printf "version%03d.txt" "$a")

    # Rename the file
    mv "$file" "$new"

    # Advance the counter
    a=$((a + 1))
done
```
Since the original filenames were not zero-padded, ```ls``` lists ```allParams-1_ECG_VCG_109_dump.txt``` lexicographically before ```allParams-1_ECG_VCG_10_dump.txt```. Thus, instead of iterating over the output of ```ls *.txt```, we iterate in numeric order using the brace expansion ```{1..608}```. This ensures that ```version009.txt``` holds the contents of ```allParams-1_ECG_VCG_9_dump.txt``` and not those of another file such as ```allParams-1_ECG_VCG_109_dump.txt```.
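
The ordering problem, and how zero-padding removes it, can be reproduced in a few lines of Python. This is only an illustration using three of the example numbers; it sorts strings by code point, which matches the ordering described above.

```python
numbers = (9, 10, 109)

# Original, unpadded names: a plain string sort scrambles the numeric order.
original = [f"allParams-1_ECG_VCG_{i}_dump.txt" for i in numbers]
lexicographic = sorted(original)
# ['allParams-1_ECG_VCG_109_dump.txt',
#  'allParams-1_ECG_VCG_10_dump.txt',
#  'allParams-1_ECG_VCG_9_dump.txt']

# Zero-padded names: string order and numeric order now agree.
padded = [f"version{i:03d}.txt" for i in numbers]
padded_sorted = sorted(padded)
# ['version009.txt', 'version010.txt', 'version109.txt']
```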

As a spot check that the renaming preserved the original ordering, compare the beginning of the first file before and after the renaming.

#### Before:
```
$ head allParams-1_ECG_VCG_1_dump.txt
1.14859698e-06	-8.52689793e-07	-1.62738886e-07
6.27637865e-03	8.56158099e-04	-2.80092680e-05
1.73977577e-02	2.37706707e-03	-8.70847085e-05
0.03220872	0.00439809	-0.00017971
0.04663505	0.00636597	-0.00028655
0.05990819	0.00821321	-0.00043461
0.07573148	0.01035879	-0.00061512
0.09897242	0.01347624	-0.0008311
0.11859204	0.01606282	-0.00100526
0.13539736	0.01838106	-0.00139427
```

#### After:
```
$ head version001.txt
1.14859698e-06	-8.52689793e-07	-1.62738886e-07
6.27637865e-03	8.56158099e-04	-2.80092680e-05
1.73977577e-02	2.37706707e-03	-8.70847085e-05
0.03220872	0.00439809	-0.00017971
0.04663505	0.00636597	-0.00028655
0.05990819	0.00821321	-0.00043461
0.07573148	0.01035879	-0.00061512
0.09897242	0.01347624	-0.0008311
0.11859204	0.01606282	-0.00100526
0.13539736	0.01838106	-0.00139427
```
The first ten lines are identical, so ```version001.txt``` holds the same data as the original first file.
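
Comparing the head of a single file is only a spot check. A fuller check could hash every file before the renaming and compare the hashes afterwards, as in the Python sketch below. The helper names ```snapshot_digests``` and ```find_mismatches``` are hypothetical, and the sketch assumes that the snapshot is taken before the bash script runs.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def snapshot_digests(directory, count):
    """Run BEFORE renaming: map example number -> digest of the original file."""
    directory = Path(directory)
    return {
        i: file_digest(directory / f"allParams-1_ECG_VCG_{i}_dump.txt")
        for i in range(1, count + 1)
    }


def find_mismatches(directory, digests):
    """Run AFTER renaming: return example numbers whose renamed file differs."""
    directory = Path(directory)
    return [
        i
        for i, digest in sorted(digests.items())
        if file_digest(directory / f"version{i:03d}.txt") != digest
    ]
```

Calling ```snapshot_digests(".", 608)``` before the renaming and ```find_mismatches(".", digests)``` afterwards should return an empty list if every example kept its number.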

### *Step 2: Convert To NumPy Arrays*