# DataSource

The DataSource submodule provides an efficient wrapper for the
KeyFollower.Follower and KeyFollower.FrameGrabber classes. Instances of the
DataSource.DataFollower class are iterators that behave much like instances of
the KeyFollower.Follower class, but instead of yielding indices they yield the
frames themselves.

## DataFollower


The DataFollower class requires 3 arguments:

* An instance of an h5py.File object containing the datasets of interest.
* A list of paths to **groups** containing datasets of keys.
* A list of paths to **datasets** containing the data you wish to process.

The DataFollower also has an optional timeout argument, which defaults to 1
second unless otherwise specified. This works in exactly the same way as the
timeout for the KeyFollower.Follower class.
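
The following is a minimal sketch of what a timeout of this kind typically means, assuming that iteration stops once no new key has appeared for `timeout` seconds. It is not the library's implementation: the function `follow`, its `poll` callable and the `interval` parameter are hypothetical names introduced only for illustration.

```python
import time

def follow(poll, timeout=1.0, interval=0.01):
    """Yield new indices from poll() until none arrive for `timeout` seconds.

    `poll` is a hypothetical callable returning how many keys are currently
    available; it stands in for re-reading the key datasets of a file.
    """
    next_index = 0
    last_update = time.monotonic()
    while time.monotonic() - last_update < timeout:
        available = poll()
        if next_index < available:
            yield next_index
            next_index += 1
            # A new key was consumed, so restart the timeout clock.
            last_update = time.monotonic()
        else:
            # Nothing new yet: wait briefly before polling again.
            time.sleep(interval)

# Stand-in for a file that already holds four keys and never grows.
indices = list(follow(lambda: 4, timeout=0.05))
print(indices)
```

Under this assumption, a larger timeout tolerates slower writers, at the cost of a longer wait after the final frame before the loop ends.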

First we will create two small datasets (of the same size but containing different values)
and a corresponding dataset of unique keys to use in our example.
The keys are all non-zero, so we should expect to receive
every frame of each dataset.

In [None]:
from swmr_tools.KeyFollower import Follower, FrameGrabber
from swmr_tools.DataSource import DataFollower
import h5py
import numpy as np

# Create two small datasets to extract frames from (same shape, different values)
data_1 = np.random.randint(low=-10000, high=10000, size=(2, 2, 5, 10))
data_2 = np.random.randint(low=-10000, high=10000, size=(2, 2, 5, 10))
# Create one unique, non-zero key per frame
keys_1 = np.arange(1, 5).reshape(2, 2, 1, 1)

# Save the data to an HDF5 file
with h5py.File("example.h5", "w", libver="latest") as f:
    f.create_group("keys")
    f.create_group("data")
    f["keys"].create_dataset("keys_1", data=keys_1)
    f["data"].create_dataset("data_1", data=data_1)
    f["data"].create_dataset("data_2", data=data_2)

First we will iterate through the frames using only the classes found in the
KeyFollower submodule. Because we have two datasets, we will need two
instances of the FrameGrabber class (one for each dataset).

In [None]:
with h5py.File("example.h5", "r") as f:
    kf = Follower(f, ["keys"], timeout = 1)
    fg = FrameGrabber("data/data_1", f)
    for key in kf:
        frame = fg.Grabber(key)
        print(f"Frame number: {key}")
        print(str(frame) + "\n")

In [None]:
with h5py.File("example.h5", "r") as f:
    kf = Follower(f, ["keys"], timeout = 1)
    fg = FrameGrabber("data/data_2", f)
    for key in kf:
        frame = fg.Grabber(key)
        print(f"Frame number: {key}")
        print(str(frame) + "\n")
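
Both loops above use the index produced by the Follower as a frame number. As a rough illustration of how a flat frame number can relate to a position in the dataset, the sketch below assumes that frames are counted in row-major order over the leading (2, 2) dimensions of the example data. The helper `unravel` is a hypothetical name, not part of swmr_tools; NumPy's `np.unravel_index` performs the same conversion.

```python
def unravel(flat_index, scan_shape):
    """Convert a flat frame index into a row-major position in the scan grid."""
    position = []
    # Peel off the fastest-varying (last) dimension first.
    for size in reversed(scan_shape):
        flat_index, remainder = divmod(flat_index, size)
        position.append(remainder)
    return tuple(reversed(position))

# Leading dimensions of the example datasets (assumed to be the scan grid).
scan_shape = (2, 2)
positions = [unravel(i, scan_shape) for i in range(4)]
print(positions)  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```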

Using the DataFollower class eliminates the need to create multiple FrameGrabber
instances. Like instances of the KeyFollower.Follower class, instances of the
DataFollower class are iterators, and they are instantiated in the same way:
with the h5py.File object containing the data and a list of paths to the groups
containing the keys. We also pass a list of paths to the datasets we want frames from.

Once we have an instance of the class, we can use it in a for loop as with any
other iterator. At each step of the iteration a list containing the frame for
each dataset is returned. The ordering of the frames is the same as the ordering
of the list of datasets.

In [None]:
with h5py.File("example.h5", "r") as f:
    df = DataFollower(f, ['keys'], ['data/data_1', 'data/data_2'])
    # enumerate supplies the frame counter, starting from 0
    for key, frames in enumerate(df):
        print(f"Frame: {key}")
        print(frames)
        print("")
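
To make the shape of each iteration's result concrete, here is a toy stand-in written without h5py or swmr_tools. It is only a sketch of the pattern described above (one list per step, holding one frame per dataset, in dataset order). The names `follow_frames`, `grab_1` and `grab_2` are hypothetical, and plain lists stand in for the datasets.

```python
def follow_frames(indices, grabbers):
    """Yield one list per index, holding one frame per grabber, in grabber order."""
    for index in indices:
        yield [grab(index) for grab in grabbers]

# Plain lists stand in for the two datasets; each element plays the role of a frame.
data_1 = ["a0", "a1", "a2", "a3"]
data_2 = ["b0", "b1", "b2", "b3"]

# Hypothetical frame grabbers: each returns the frame stored at a given index.
grab_1 = data_1.__getitem__
grab_2 = data_2.__getitem__

results = list(follow_frames(range(4), [grab_1, grab_2]))
for key, frames in enumerate(results):
    print(key, frames)
```

Because the frames come back in the same order as the grabbers, each step can be unpacked directly, for example `frame_1, frame_2 = frames`.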