In [None]:
# DON'T FORGET TO ACTIVATE THE GPU when on Google Colab (Edit > Notebook settings)
from os import environ
GOOGLE_COLAB = "COLAB_GPU" in environ
if GOOGLE_COLAB:
    !pip install git+https://github.com/undark-lab/swyft.git

# The Store - Caching and (re-)using simulator results with SWYFT

The caching and (re-)use of simulator results is central to the working of SWYFT. Reuse is possible both within a single inference problem and between different experiments, provided the simulator used (including **all** its settings) is the same.
**It is the responsibility of the user to ensure the employed simulator is consistent between experiments using the same store.**

To this end, SWYFT incorporates a `store` class with two distinct subclasses, the `MemoryStore` and the `DirectoryStore`. Here we demonstrate the use of these stores.


In [None]:
import numpy as np
import matplotlib.pyplot as plt
import swyft

We again begin by defining some parameters, a toy simulator, and a prior.

In [None]:
DEVICE = 'cpu'
Ntrain = 3000
Npars = 2

In [None]:
def model(v, sigma = 0.2):
    x = v + np.random.randn(Npars)*sigma
    return dict(x=x)

In [None]:
import os
cwd = os.getcwd()


In [None]:
simulator = swyft.Simulator(model, Npars, sim_shapes = {"x": (Npars,)})

In [None]:
prior = swyft.Prior(lambda u: u*2-1, Npars)
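The function passed to `swyft.Prior` above appears to act as a map from the unit hypercube to parameter space, in which case `u*2-1` would correspond to a uniform prior on [-1, 1] in each of the `Npars` dimensions. The following standard-library sketch illustrates that mapping on its own, without relying on SWYFT; `to_parameters` is a hypothetical helper name, not part of the library.

```python
import random

Npars = 2

def to_parameters(u):
    """Map a point from the unit hypercube [0, 1)^Npars to [-1, 1)^Npars."""
    return [2.0 * ui - 1.0 for ui in u]

# Draw uniform points in the hypercube and transform them (fixed seed for repeatability).
rng = random.Random(0)
samples = [to_parameters([rng.random() for _ in range(Npars)]) for _ in range(1000)]

# The transformed samples should span (almost) the full interval [-1, 1).
lo = min(min(s) for s in samples)
hi = max(max(s) for s in samples)
print(lo, hi)
```

Under this assumption, changing the prior amounts to changing the mapping, e.g. `u*4-2` would give a uniform prior on [-2, 2].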

## The memory store

The `MemoryStore` class, which stores all results in active memory using `zarr`, is `SWYFT`'s simplest store option.

An empty store can be instantiated as follows, requiring only the specification of an associated simulator.

In [None]:
store = swyft.MemoryStore(simulator)

Subsequently, parameters, drawn according to the specified prior, can be added to the store as

In [None]:
store.add(Ntrain, prior=prior)

and it is possible to check whether entries in the store require simulator runs using

In [None]:
needs_sim = store.requires_sim()
needs_sim

Similarly, an overview of the exact simulation status of all entries can be obtained using

In [None]:
store.get_simulation_status()


Here a value of 0 corresponds to not yet simulated.

The required simulations can then be run using the store's `simulate` method.

In [None]:
store.simulate()

Afterwards, all simulations have been run, and their status in the store has been updated (2 corresponds to successfully simulated).

In [None]:
store.requires_sim()

In [None]:
store.get_simulation_status()
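To make the status bookkeeping concrete, here is a minimal pure-Python sketch of a store that tracks a per-sample status. `ToyStore` and `Status` are hypothetical names and this is not SWYFT's implementation; only the values 0 (not yet simulated) and 2 (successfully simulated) are taken from the text above, while the intermediate value 1 for a running simulation is an assumption.

```python
from enum import IntEnum

class Status(IntEnum):
    PENDING = 0   # not yet simulated (value stated in the text)
    RUNNING = 1   # assumption: a simulation that is in progress
    FINISHED = 2  # successfully simulated (value stated in the text)

class ToyStore:
    """Hypothetical stand-in for a store; not SWYFT's implementation."""

    def __init__(self, simulator):
        self.simulator = simulator
        self.params, self.results, self.status = [], [], []

    def add(self, params):
        # New entries start without simulation results.
        for p in params:
            self.params.append(p)
            self.results.append(None)
            self.status.append(Status.PENDING)

    def requires_sim(self):
        return any(s != Status.FINISHED for s in self.status)

    def simulate(self):
        # Run the simulator only for entries that are still pending.
        for i, s in enumerate(self.status):
            if s == Status.PENDING:
                self.status[i] = Status.RUNNING
                self.results[i] = self.simulator(self.params[i])
                self.status[i] = Status.FINISHED

toy = ToyStore(lambda v: [2 * x for x in v])
toy.add([[0.1, 0.2], [0.3, 0.4]])
before = (toy.requires_sim(), [int(s) for s in toy.status])
toy.simulate()
after = (toy.requires_sim(), [int(s) for s in toy.status])
print(before)
print(after)
```

Keeping the status separate from the results is what lets a store answer "what still needs to be simulated?" without inspecting the results themselves.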

### Sample re-use and coverage
`SWYFT`'s store enables reuse of simulations. To check what fraction of a required number of samples can be reused, inspect the coverage of the store for the desired prior, i.e. the fraction of the desired number of samples to be drawn from the specified prior that is already available in the store:

In [None]:
store.coverage(2*Ntrain,prior=prior) 
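As a rough mental model, coverage can be thought of as the ratio of available to requested samples, capped at one. The sketch below is a deliberate simplification: it assumes the store holds exactly `Ntrain` samples drawn from the same prior, whereas the value SWYFT reports depends on the samples actually stored and on the prior. The function names are hypothetical.

```python
def coverage(n_available, n_requested):
    """Fraction of the requested samples already present, capped at 1."""
    if n_requested <= 0:
        return 1.0
    return min(1.0, n_available / n_requested)

def n_missing(n_available, n_requested):
    """Number of samples that would still have to be added."""
    return max(0, n_requested - n_available)

Ntrain = 3000
print(coverage(Ntrain, 2 * Ntrain))   # half of the request is already covered
print(n_missing(Ntrain, 2 * Ntrain))  # the other half would be newly drawn
```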

Requesting a given number of samples then only requires adding the samples that are still missing.

In [None]:
store.add(2*Ntrain,prior=prior)

These, however, do not yet have associated simulation results.

In [None]:
store.requires_sim()

In [None]:
store._get_indices_to_simulate()

### Saving and loading
A memory store can also be saved, i.e. serialized to disk as a `DirectoryStore`, using the `save` method, which takes the desired path as an argument,

In [None]:
store.save(cwd+'/SavedStore')

and be loaded into memory by specifying the path to a directory store and a simulator

In [None]:
store2 = swyft.MemoryStore.load(cwd+'/SavedStore',simulator=simulator)

In [None]:
store2._get_indices_to_simulate()

## The directory store
In many cases, running an instance of a simulator may be quite computationally expensive. For such simulators `SWYFT`'s ability to support reuse of simulations across different experiments is of paramount importance.

`SWYFT` provides this capability in the form of the `DirectoryStore` class, which serializes the store to disk using `zarr` and keeps it up to date with regard to requested samples and parameters.

All methods demonstrated for the `MemoryStore`, with the exception of `load` and `save`, are also implemented for the `DirectoryStore`.

A directory store can be instantiated by invoking the `DirectoryStore` class and providing a path as an argument. If a store already exists at this path, `SWYFT` will connect to it; otherwise an empty store will be created.



In [None]:
dirStore = swyft.DirectoryStore(cwd+'/SavedStore')
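The connect-or-create behaviour can be illustrated with a small standard-library sketch. It uses a JSON file inside a directory as a stand-in for the on-disk store; the `open_or_create` helper and the `index.json` file name are hypothetical and unrelated to how SWYFT lays out its `zarr` data.

```python
import json
import os
import tempfile

def open_or_create(path):
    """Return (contents, created): load an existing toy store or create an empty one."""
    index = os.path.join(path, "index.json")
    if os.path.exists(index):
        # A store already exists at this path: connect to it.
        with open(index) as f:
            return json.load(f), False
    # No store yet: create an empty one on disk.
    os.makedirs(path, exist_ok=True)
    empty = {"params": [], "status": []}
    with open(index, "w") as f:
        json.dump(empty, f)
    return empty, True

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ToyStore")
    first, created_first = open_or_create(path)    # creates the store
    second, created_second = open_or_create(path)  # connects to it

print(created_first, created_second)
```

Because an existing path is silently reused, the caveat about simulator consistency applies: nothing in this pattern checks that the data on disk was produced by the same simulator.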

While it is possible to specify the simulator associated with a directory store upon instantiation by passing the `simulator` keyword, it is also possible, in contrast to the memory store, to instantiate the store without a simulator and set it afterwards.

In [None]:
dirStore.set_simulator(simulator)

### Updating on disk
We now briefly demonstrate the difference between a directory store and a memory store which has been loaded from an existing directory store.

In the example above, `dirStore` and `store2` are currently equivalent in content. In `dirStore`, we now run simulations for half of the samples that currently lack them.

In [None]:
all_to_sim = dirStore._get_indices_to_simulate()
sim_now = all_to_sim[:len(all_to_sim) // 2]
dirStore.simulate(sim_now)

Here we have made use of the ability to explicitly specify the indices of the samples to be simulated.

The remaining samples lacking simulation results in the `dirStore` are now

In [None]:
dirStore._get_indices_to_simulate()

i.e. the store has been updated on disk, while in comparison the samples lacking simulation results in `store2` are still

In [None]:
store2._get_indices_to_simulate()
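The difference between a store that lives on disk and an in-memory copy loaded from it can be reproduced with a toy example. The sketch below uses a JSON file as a stand-in for the on-disk data; `ToyDiskStore` and its methods are hypothetical and only mimic the behaviour described here, with status 0 meaning not yet simulated and 2 meaning simulated.

```python
import json
import os
import tempfile

class ToyDiskStore:
    """Hypothetical write-through store: every change is persisted immediately."""

    def __init__(self, path):
        self.path = path
        if not os.path.exists(path):
            self._write({"status": []})

    def _read(self):
        with open(self.path) as f:
            return json.load(f)

    def _write(self, data):
        with open(self.path, "w") as f:
            json.dump(data, f)

    def add(self, n):
        data = self._read()
        data["status"] += [0] * n  # new samples have no results yet
        self._write(data)

    def pending(self):
        return [i for i, s in enumerate(self._read()["status"]) if s == 0]

    def simulate(self, indices):
        data = self._read()
        for i in indices:
            data["status"][i] = 2  # mark as simulated
        self._write(data)

    def load_snapshot(self):
        """Return an in-memory copy that is decoupled from the file."""
        return self._read()

with tempfile.TemporaryDirectory() as tmp:
    disk = ToyDiskStore(os.path.join(tmp, "store.json"))
    disk.add(4)
    snapshot = disk.load_snapshot()            # in-memory copy taken now
    to_sim = disk.pending()
    disk.simulate(to_sim[: len(to_sim) // 2])  # simulate the first half
    disk_pending = disk.pending()
    snapshot_pending = [i for i, s in enumerate(snapshot["status"]) if s == 0]

print(disk_pending)      # only the second half is still pending on disk
print(snapshot_pending)  # the snapshot still lists every sample
```

The snapshot stays frozen at the moment it was loaded, so it has to be reloaded to pick up simulations that were added on disk afterwards.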

## Asynchronous usage

In contrast to the `MemoryStore`, the `DirectoryStore` also supports asynchronous usage: when simulations are requested, control returns immediately, while the simulations and the updating of the store happen in the background.

This is particularly relevant for long-running simulators and parallelization using Dask, as is showcased in a separate notebook.

Here, as a small example, we simply add further samples to the store and then execute the associated simulations without waiting for the results.


In [None]:
dirStore.add(5*Ntrain,prior=prior)

In [None]:
dirStore.simulate(wait_for_results=False)

In [None]:
print('control returned')
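The general pattern of returning control before the work has finished can be sketched with Python's standard `concurrent.futures` module. Threads are used here purely as a stand-in for the Dask-based execution mentioned above, and `slow_simulator` is a hypothetical placeholder for an expensive simulator.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_simulator(v):
    """Placeholder for an expensive simulation."""
    time.sleep(0.2)
    return [2 * x for x in v]

params = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

# submit() schedules the work and returns a future straight away.
executor = ThreadPoolExecutor(max_workers=2)
futures = [executor.submit(slow_simulator, p) for p in params]
print("control returned")  # printed without waiting for the simulations

# result() blocks until the corresponding simulation has finished.
results = [f.result() for f in futures]
executor.shutdown()
print(results)
```

In the same spirit, results requested without waiting are not guaranteed to be available immediately, so any step that consumes them should first check that the simulations have completed.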