## Sample conversion: HDF5 file to Fleet file

This notebook demonstrates a simple example of converting an HDF5 file into a Fleet file. To do so, we read the HDF5 file into NumPy arrays and then write those arrays to a Fleet file.

In [1]:
# import packages
import h5py
from fleetfmt import FileWriter, FileReader
import numpy as np
from pathlib import Path

In [2]:
hdf5file = Path("./convert_example.hdf5")
fleetfile = Path("./convert_example.flt")

In this example, we create a test dataset that contains records of varied length. 

In [3]:
# Utility to create a test numpy array
def make_dataset():
    dataset = []
    def _rand_vec(size):
        # np.int was removed in NumPy 1.24; the built-in int is equivalent here
        data = np.array(10*np.random.random(size), dtype=int)
        print(data)
        return data

    for i in range(0, 2): # 2 loops = 6 records in example file
        for _ in range(2):
            dataset.append(_rand_vec(2))
        for _ in range(1):
            dataset.append(_rand_vec(5))
    return dataset
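
Because `make_dataset` draws from NumPy's global random state, each run produces different values. If reproducible test data is useful, a seeded generator is one option. The sketch below is a hypothetical variant (the name `make_seeded_dataset` and its `seed` and `repeats` parameters are not part of the original notebook) that keeps the same 2, 2, 5 length pattern and the same 0 to 9 value range:

```python
import numpy as np

def make_seeded_dataset(seed=0, repeats=2):
    """Return a list of integer vectors with lengths 2, 2, 5 per repeat."""
    rng = np.random.default_rng(seed)  # local generator, independent of global state
    dataset = []
    for _ in range(repeats):
        for size in (2, 2, 5):
            # integers in [0, 10), matching the range of the original helper
            dataset.append(rng.integers(0, 10, size=size))
    return dataset

records = make_seeded_dataset(seed=42)
print([len(r) for r in records])  # [2, 2, 5, 2, 2, 5]
```

Calling the function twice with the same seed yields identical records, which makes later comparisons between file formats easier to debug.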

### Convert HDF5 file into a Fleet file (by way of numpy)

In [4]:
print("Creating dataset and saving as HDF5 file.")
data = make_dataset()

# the context manager closes the file even if a write fails
with h5py.File(hdf5file, 'w') as f:
    for key, value in enumerate(data):
        f.create_dataset(str(key), data=value)

Creating dataset and saving as HDF5 file.
[2 5]
[4 2]
[9 3 2 0 1]
[2 9]
[2 2]
[8 2 3 1 1]


In [5]:
print("Reading HDF5 file into numpy array.")
fulldata = []

# open HDF5 file
h5f2 = h5py.File(hdf5file,'r')

# get keys
h5keys = list(h5f2.keys())
print("HDF5 keys:", h5keys)

# get values
print("HDF5 Values:")
for key in h5keys:
    value = np.array(h5f2[(key)])
    print(value)
    fulldata.append(list(value))

h5f2.close()

Reading HDF5 file into numpy array.
HDF5 keys: ['0', '1', '2', '3', '4', '5']
HDF5 Values:
[2 5]
[4 2]
[9 3 2 0 1]
[2 9]
[2 2]
[8 2 3 1 1]
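
The cell above converts each array with `list(value)` before it is handed to the writer. It may be worth knowing that `list()` on a NumPy array keeps the elements as NumPy scalars, whereas `ndarray.tolist()` converts them to built-in Python numbers. Whether Fleet's serializer treats the two differently is not stated in this notebook, so the following is only a sketch of the NumPy behavior:

```python
import numpy as np

arr = np.array([9, 3, 2, 0, 1])

as_list = list(arr)       # elements remain NumPy integer scalars
as_native = arr.tolist()  # elements become built-in Python ints

print(type(as_list[0]).__name__)    # a NumPy integer type (platform dependent)
print(type(as_native[0]).__name__)  # int
```

If a serializer rejects NumPy scalar types, `tolist()` is the usual way to obtain plain Python values.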


In [6]:
print("Writing numpy content to a new Fleet file.")
with fleetfile.open('wb') as fhandle, FileWriter(fhandle) as writer:
    for key, value in zip(h5keys, fulldata):
        writer.append(key, value)
print("Done.")

Writing numpy content to a new Fleet file.
Done.


### Verify Fleet content compared to original data

In [7]:
print("Reading back Fleet data.")
with fleetfile.open('rb') as fhandle, FileReader(fhandle) as reader:
    # get keys
    dkeys = list(reader.keys())
    print("Fleet keys:", dkeys)
    
    # get values
    print("Fleet Values:")
    for key in dkeys:
        value = reader.read(key)
        print(value)

Reading back Fleet data.
Fleet keys: ['0', '1', '2', '3', '4', '5']
Fleet Values:
[2, 5]
[4, 2]
[9, 3, 2, 0, 1]
[2, 9]
[2, 2]
[8, 2, 3, 1, 1]


In [8]:
# compare values across file types to validate integrity

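# open both files; the context managers close them when the block ends
with h5py.File(hdf5file, 'r') as h5f2:
    with fleetfile.open('rb') as fhandle, FileReader(fhandle) as reader:
        for key in dkeys:
            # compare element by element; comparing .all() results would only
            # check whether both arrays are free of zeros
            if not np.array_equal(np.array(reader.read(key)), np.array(h5f2[key])):
                print("Key {}: mismatch between HDF5 and Fleet".format(key))
            else:
                print("Key {}: HDF5 and Fleet match".format(key))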

Key 0: HDF5 and Fleet match
Key 1: HDF5 and Fleet match
Key 2: HDF5 and Fleet match
Key 3: HDF5 and Fleet match
Key 4: HDF5 and Fleet match
Key 5: HDF5 and Fleet match
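
When validating a conversion like this, the comparison needs to be element-wise. A check that reduces each array with `.all()` before comparing only tests whether both arrays are free of zeros, so it can report a match for different data. A minimal sketch with made-up arrays (not read from the files above), contrasted with `np.array_equal`:

```python
import numpy as np

a = np.array([2, 5])
b = np.array([4, 2])

# Each .all() collapses to True (no zeros), so the reduced values are equal
print(a.all() != b.all())    # False: no mismatch is reported

# array_equal compares shape and every element
print(np.array_equal(a, b))  # False: the arrays really do differ
```

`np.array_equal` also returns `False` when the shapes differ, which suits records of varied length.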
