```
Copyright 2018 FAU-iPAT (http://ipat.uni-erlangen.de/)
```

# Data extraction
This notebook loads the analytically generated time signal dataset (available at https://github.com/FAU-iPAT) from the ```./data.big.source/``` directory and converts it into a form that is easier to use for training the Keras neural network model.

The results are stored in a series of subdirectories of the ```./data.big.training/``` directory. The stored data include the signals themselves, the complex-valued FFTs computed with different window functions (rectangle, Bartlett, Hanning, Meyer wavelet, Poisson), an answer vector counting the frequency peaks in each FFT bin, and a configuration dictionary containing additional information about each data entry.
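
To make the stored quantities more concrete, the following is a minimal sketch of a windowed FFT and a per-bin peak count. It is not the code in `data_handler.py`, which is not shown here. The function names `windowed_ffts` and `count_peaks_per_bin`, the sampling parameters, and the rule of rounding each frequency to the nearest bin are all assumptions made for illustration. Only the rectangle, Bartlett and Hanning windows are included, since NumPy has no built-in Meyer wavelet or Poisson window.

```python
import numpy as np

def windowed_ffts(signal):
    """Return the complex FFT of a real signal for three standard windows."""
    n = len(signal)
    windows = {
        'rectangle': np.ones(n),
        'bartlett': np.bartlett(n),
        'hanning': np.hanning(n),
    }
    return {name: np.fft.rfft(signal * w) for name, w in windows.items()}

def count_peaks_per_bin(frequencies, n_bins, bin_width):
    """Count how many of the given frequencies fall into each FFT bin.

    Assumption: a frequency belongs to the nearest bin centre.
    """
    bins = np.rint(np.asarray(frequencies) / bin_width).astype(int)
    bins = bins[(bins >= 0) & (bins < n_bins)]
    return np.bincount(bins, minlength=n_bins)

# Hypothetical example: 1 s sampled at 256 Hz with components at 10, 10.3 and 40 Hz
fs, n = 256, 256
t = np.arange(n) / fs
freqs = [10.0, 10.3, 40.0]
signal = sum(np.sin(2 * np.pi * f * t) for f in freqs)

ffts = windowed_ffts(signal)
answer = count_peaks_per_bin(freqs, n_bins=n // 2 + 1, bin_width=fs / n)
```

In this example the two components at 10 Hz and 10.3 Hz share one bin, so the answer vector holds a count of two there, which is the kind of case a plain peak search on the spectrum cannot resolve.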

In [None]:
import numpy as np
import hashlib

In [None]:
def md5(data):
    """Return the MD5 hex digest of the string representation of data."""
    return hashlib.md5(str(data).encode()).hexdigest()
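
One caveat with hashing `str(data)`: with NumPy's default print options, arrays with more than 1000 elements are printed in abbreviated form, so the digest may only reflect the first and last few values. The sketch below illustrates this and shows one possible alternative that hashes the raw bytes instead. The names `md5_str` and `md5_bytes` are hypothetical and not part of the notebook.

```python
import hashlib
import numpy as np

def md5_str(data):
    # Same approach as md5() above: hash the printed representation
    return hashlib.md5(str(data).encode()).hexdigest()

def md5_bytes(*arrays):
    # Hypothetical alternative: hash the raw bytes of every array
    digest = hashlib.md5()
    for array in arrays:
        digest.update(np.ascontiguousarray(array).tobytes())
    return digest.hexdigest()

a = np.zeros(5000)
b = a.copy()
b[2500] = 1.0  # change a single value in the middle of the array

same_str = md5_str(a) == md5_str(b)      # abbreviated printouts are identical
same_bytes = md5_bytes(a) == md5_bytes(b)  # raw bytes differ
```

For a quick visual check in the log output the string-based digest may be sufficient; for verifying that two runs produced identical data, a byte-based digest is the safer choice.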

In [None]:
path_source = './data.big.source/'
path_target = './data.big.training/'
# path_source = './data.source/'
# path_target = './data.training/'

### Prepare the dataset and directories

In [None]:
%run -i 'data_handler.py'
dh = DataHandler(path_source)
valid = dh.validate_dataset()
if not valid:
    raise ValueError('Unknown dataset!')

In [None]:
dh.makedir(path_target)
dh.makedir(path_target+'signal/')
dh.makedir(path_target+'rectangle/')
dh.makedir(path_target+'bartlett/')
dh.makedir(path_target+'hanning/')
dh.makedir(path_target+'meyer/')
dh.makedir(path_target+'poisson/')
dh.makedir(path_target+'answer/')
dh.makedir(path_target+'config/')
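
The behaviour of `dh.makedir` is defined in `data_handler.py` and is not shown here. As a rough sketch, the same directory layout could be created with the standard library alone, assuming the intent is to create missing directories and leave existing ones untouched. The helper `make_target_dirs` is hypothetical.

```python
import os
import tempfile

SUBDIRS = ['signal', 'rectangle', 'bartlett', 'hanning',
           'meyer', 'poisson', 'answer', 'config']

def make_target_dirs(root, subdirs=SUBDIRS):
    """Create root and all subdirectories; existing ones are left untouched."""
    created = []
    for name in subdirs:
        path = os.path.join(root, name)
        os.makedirs(path, exist_ok=True)  # also creates root if it is missing
        created.append(path)
    return created

# Demonstrated on a temporary directory rather than the real target path
with tempfile.TemporaryDirectory() as tmp:
    paths = make_target_dirs(os.path.join(tmp, 'data.training'))
    all_exist = all(os.path.isdir(p) for p in paths)
```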

### Loop over all files of the dataset and store the extracted data

In [None]:
for idx in range(dh.file_count):
    data = dh.load_source_file(idx)
    signal, ffts, answer, config = dh.signal_to_training(data)
    # Log progress together with a checksum of the extracted data
    print('Converted file {:4d} of {:4d} ... {}'.format(idx+1, dh.file_count, md5((ffts, answer))))
    # One file per batch in each subdirectory; ffts holds one entry per window function
    np.save(path_target+'signal/batch_{:05d}.npy'.format(idx), signal)
    np.save(path_target+'rectangle/batch_{:05d}.npy'.format(idx), ffts[0])
    np.save(path_target+'bartlett/batch_{:05d}.npy'.format(idx), ffts[1])
    np.save(path_target+'hanning/batch_{:05d}.npy'.format(idx), ffts[2])
    np.save(path_target+'meyer/batch_{:05d}.npy'.format(idx), ffts[3])
    np.save(path_target+'poisson/batch_{:05d}.npy'.format(idx), ffts[4])
    np.save(path_target+'answer/batch_{:05d}.npy'.format(idx), answer)
    # config contains dictionaries, so NumPy stores it as a pickled object array
    np.save(path_target+'config/batch_{:05d}.npy'.format(idx), config)
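
For reading the stored batches back, for example in a training script, a minimal sketch could look as follows. The helper `load_batch` and the dummy data are hypothetical. The file naming follows the `batch_{:05d}.npy` pattern used above, and `allow_pickle=True` is needed because the configuration entries are saved as object arrays.

```python
import os
import tempfile
import numpy as np

def load_batch(root, idx, names=('rectangle', 'answer', 'config')):
    """Load the requested arrays of one batch into a dictionary."""
    batch = {}
    for name in names:
        path = os.path.join(root, name, 'batch_{:05d}.npy'.format(idx))
        # allow_pickle is required for object arrays such as the config entries
        batch[name] = np.load(path, allow_pickle=True)
    return batch

# Round trip on made-up data in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    dummy = {
        'rectangle': np.array([1 + 2j, 3 - 1j]),
        'answer': np.array([0, 2]),
        'config': {'note': 'hypothetical entry'},
    }
    for name, value in dummy.items():
        os.makedirs(os.path.join(tmp, name), exist_ok=True)
        np.save(os.path.join(tmp, name, 'batch_00000.npy'), value)
    batch = load_batch(tmp, 0)

# A saved dict comes back as a 0-d object array; item() unwraps it
config = batch['config'].item()
```

Only load pickled files from sources you trust, since unpickling can execute arbitrary code.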

```
Copyright 2018 FAU-iPAT (http://ipat.uni-erlangen.de/)

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
```