In [None]:
import Pkg 
Pkg.activate("./..")

In [None]:
import QuantumGrav as QG
import Flux 
import DataFrames
import CausalSets
import Arrow

Generate some dummy data first. Its only purpose is to demonstrate how the `Dataset` type works with `Flux.DataLoader`, so the details of the data generation don't matter here.

In [None]:
equal_data = [QG.DataGeneration.generateDataForManifold(
        ; dimension = 2,
        manifoldname = m,
        seed = 329478,
        num_datapoints = 256,
        equal_size = true,
        size_distr = d -> CausalSets.Uniform(0.7 * 10^(d + 1), 1.1 * 10^(d + 1)),
        make_diamond = d->CausalSets.CausalDiamondBoundary{d}(1.0),
        make_box = d->CausalSets.BoxBoundary{d}((
            ([-0.49 for i in 1:d]...,), ([0.49 for i in 1:d]...,)))
) for m in ["minkowski", "deSitter", "antiDeSitter"]]

Write one Arrow file per manifold from the generated data.

In [None]:
dir = tempdir()
# write one file per manifold into the directory the dataset will read from
for (i, data) in enumerate(equal_data)
    Arrow.write(joinpath(dir, "testdata_$(i).arrow"), data)
end

Create a dataset from the files just written. The dataset loads data lazily, on demand, and caches part of it as a compromise between memory usage and speed.

In [None]:
dset = QG.DataLoader.Dataset(
    dir, 
    ["testdata_1.arrow", "testdata_2.arrow", "testdata_3.arrow"],
    cache_size = 5
)
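
The idea of lazy loading with a bounded cache can be sketched in plain Julia. This is only an illustration under assumptions: the type `LazyFileCache`, the function `fetch!`, and the least-recently-used eviction rule are hypothetical and are not taken from `QuantumGrav`, whose `Dataset` may work differently internally.

```julia
# Hypothetical sketch of lazy loading with a bounded cache.
# Not the QuantumGrav implementation; names and eviction policy are assumptions.
mutable struct LazyFileCache{F}
    load::F                 # function mapping a file index to its contents
    cache_size::Int         # maximum number of files kept in memory
    cache::Dict{Int,Any}    # file index => loaded contents
    order::Vector{Int}      # file indices, least recently used first
    loads::Int              # how often `load` was actually called
end

LazyFileCache(load; cache_size = 2) =
    LazyFileCache(load, cache_size, Dict{Int,Any}(), Int[], 0)

function fetch!(c::LazyFileCache, i::Int)
    if haskey(c.cache, i)
        # cache hit: mark file i as most recently used
        filter!(!=(i), c.order)
        push!(c.order, i)
        return c.cache[i]
    end
    # cache miss: evict the least recently used file if the cache is full
    if length(c.order) >= c.cache_size
        oldest = popfirst!(c.order)
        delete!(c.cache, oldest)
    end
    data = c.load(i)
    c.loads += 1
    c.cache[i] = data
    push!(c.order, i)
    return data
end

# stand-in loader: "file" i is just a vector filled with i
cache = LazyFileCache(i -> fill(i, 3); cache_size = 2)
fetch!(cache, 1)
fetch!(cache, 2)
fetch!(cache, 1)   # served from the cache, no load
fetch!(cache, 3)   # cache is full, so file 2 is evicted
```

A larger cache means fewer reloads at the cost of more memory, which is the trade-off the `cache_size` argument above is meant to control.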

Use the dataset with a Flux `DataLoader` (itself based on `MLUtils.jl`). First look at the manifold labels of the first 32 entries in their stored order, then enable shuffling and confirm that the first batch comes out reordered.

In [None]:
[x.manifold for x in dset[collect(1:32)]] 
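
Indexing with a vector of indices, as in the cell above, is the kind of access a data loader relies on. As far as I understand `MLUtils.jl`, a custom container only needs `Base.length` and `Base.getindex` for this to work. The sketch below shows that pattern on a hypothetical `ToyDataset` type; it is a stand-in for illustration and says nothing about how `QG.DataLoader.Dataset` is implemented.

```julia
# Hypothetical container standing in for a dataset type (an assumption,
# not QuantumGrav code).
struct ToyDataset
    manifolds::Vector{String}
end

# number of observations
Base.length(d::ToyDataset) = length(d.manifolds)

# a single observation, returned as a named tuple with a `manifold` field
Base.getindex(d::ToyDataset, i::Integer) = (manifold = d.manifolds[i],)

# a batch of observations, one entry per requested index
Base.getindex(d::ToyDataset, idx::AbstractVector{<:Integer}) = [d[i] for i in idx]

toy = ToyDataset(repeat(["minkowski", "deSitter", "antiDeSitter"], inner = 4))
batch = toy[collect(1:5)]
manifolds = [x.manifold for x in batch]
```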

In [None]:
shuffle_loader = Flux.DataLoader(
    dset,
    batchsize = 32,
    shuffle = true,
)

In [None]:
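# take the first batch once: each call to first() starts a new, freshly shuffled pass
batch = first(shuffle_loader)
d = [batch[i].manifold for i in 1:32]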

The data is shuffled. Yay! Doing the same without shuffling should return the data in the order in which it is stored in the dataset.

In [None]:
deterministic_loader = Flux.DataLoader(
    dset,
    batchsize = 32,
    shuffle = false,
)

In [None]:
# take the first batch once instead of rebuilding it for every element
batch = first(deterministic_loader)
d = [batch[i].manifold for i in 1:32]

It's ordered now. Yay!
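
The difference between the two loaders comes down to the order in which observation indices are grouped into batches. A minimal sketch in plain Julia, assuming a hypothetical helper `batch_indices` that is not part of Flux or `MLUtils.jl`:

```julia
using Random

# Hypothetical helper (an assumption for illustration): split the indices
# 1:n into batches, optionally after a random permutation.
function batch_indices(n::Int, batchsize::Int;
                       shuffle::Bool = false, rng = Random.default_rng())
    idx = shuffle ? randperm(rng, n) : collect(1:n)
    return [idx[i:min(i + batchsize - 1, n)] for i in 1:batchsize:n]
end

# without shuffling, the first batch is simply 1:32
ordered = batch_indices(100, 32)

# with shuffling, every index still appears exactly once, in random order
shuffled = batch_indices(100, 32; shuffle = true, rng = MersenneTwister(42))
```

Passing an explicit random number generator, as in the last line, is one way to make a shuffled order reproducible.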