# GT Storage Interface

* If a storage is passed without `__gt_interface__`, its dimensions are assumed to be in the order used in the stencil (currently I-J-K). To retrieve information about the buffer, the passed object is tested for the `__array_interface__` and/or, depending on the backend, the `__cuda_array_interface__` property. Alternatively, other standards are acceptable as long as they are compatible with `np.asarray` and/or `cp.asarray`.
* If the `__gt_interface__` is defined, it should return a `Map[Optional[str],Callable[Map[str, Any]]]`, which is explained in the following sections.
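
The lookup order described in the two points above could be sketched as follows. This is only an illustration, not GT4Py's actual implementation: the helper name `resolve_interface`, the constant `DEFAULT_DIMS` and the stand-in class `HostOnlyArray` are hypothetical, and the values of the mapping are taken to be plain dictionaries, as in the examples at the end of this document.

```python
from typing import Any, Dict, Optional

# Assumed default order when no semantic information is given (I-J-K).
DEFAULT_DIMS = ("I", "J", "K")


def resolve_interface(obj: Any) -> Dict[Optional[str], Dict[str, Any]]:
    """Return a device -> buffer-info mapping for `obj` (hypothetical helper)."""
    if hasattr(obj, "__gt_interface__"):
        return obj.__gt_interface__

    result: Dict[Optional[str], Dict[str, Any]] = {}
    if hasattr(obj, "__array_interface__"):
        info = dict(obj.__array_interface__)  # copy, do not mutate the original
        info["dims"] = DEFAULT_DIMS
        result[None] = info
    if hasattr(obj, "__cuda_array_interface__"):
        info = dict(obj.__cuda_array_interface__)
        info["dims"] = DEFAULT_DIMS
        result["gpu"] = info
    if not result:
        raise TypeError("object exposes no supported buffer interface")
    return result


class HostOnlyArray:
    """Stand-in for an array-like object that only has `__array_interface__`."""

    __array_interface__ = {
        "shape": (2, 2, 2),
        "typestr": "<f8",
        "data": (0, False),
        "version": 3,
    }


interface = resolve_interface(HostOnlyArray())
```

Here `interface` has a single entry under the key `None`, with the default I-J-K order attached as `"dims"`.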

## `__gt_interface__`

This property maps a device identifier to a structured description of the buffer on that device. Currently, device identifiers can be `None` for main memory or `"gpu"` for GPU buffers. These could be extended to numeric device IDs in case of multiple accelerators, FPGAs, etc., or to combinations such as `("gpu", 0)`. The value of the map would be a `Map` containing:
* a subset of the `__array_interface__` and `__cuda_array_interface__`. Namely, the `TODO` from there are used.
* the `"dims"` key, which returns a sequence of string keys denoting the semantic meanings (such as `"I", "J", "K"`) of the dimensions of the storage.
* optionally, callables taking no arguments at the `"acquire"`, `"release"` and `"touch"` keys. If defined and not `None`, `"acquire"` and `"release"` are called before and after running computations on the buffers, respectively. The `"touch"` function is called only for output fields, after computations have finished:
```
for field in fields:
    field.acquire()
run_stencil(*fields)
for field in output_fields:
    field.touch()
for field in fields:
    field.release()
```
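
A slightly more concrete sketch of this calling sequence is given below. It assumes that the hooks are looked up in the per-device dictionary returned by `__gt_interface__` and that missing or `None` hooks are skipped. The names `run_with_hooks` and `RecordingField` are hypothetical, and running `"release"` in a `finally` block is a design assumption, not something this document prescribes.

```python
def _call_hook(info, name):
    """Call the optional hook `name` if it is defined and not None."""
    hook = info.get(name)
    if hook is not None:
        hook()


def run_with_hooks(run_stencil, fields, output_fields, device=None):
    """Run `run_stencil` surrounded by acquire/touch/release calls (sketch)."""
    infos = [f.__gt_interface__[device] for f in fields]
    output_infos = [f.__gt_interface__[device] for f in output_fields]

    for info in infos:
        _call_hook(info, "acquire")
    try:
        run_stencil(*fields)
        for info in output_infos:
            _call_hook(info, "touch")
    finally:
        # Assumption: buffers are released even if the computation fails.
        for info in infos:
            _call_hook(info, "release")


class RecordingField:
    """Stand-in field that records which hooks were called, and when."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    @property
    def __gt_interface__(self):
        def record(event):
            return lambda: self.log.append((self.name, event))

        return {
            None: {
                "dims": ("I", "J", "K"),
                "acquire": record("acquire"),
                "touch": record("touch"),
                "release": record("release"),
            }
        }


log = []
a, b = RecordingField("a", log), RecordingField("b", log)
run_with_hooks(lambda *fields: log.append(("stencil", "run")), [a, b], [b])
```

After this call, `log` contains the two `acquire` events, the stencil run, the `touch` of the output field `b`, and finally the two `release` events, in that order.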

A buffer can be incompatible for a number of reasons:
* the backend is not compatible with the given memory due to low level details such as the layout
* the semantic dimensions or dtype are not the same as in the stencil definition
* no buffer is passed for the compute device of the stencil

In these cases, the behavior can be defined either to raise an error or to allocate temporary buffers, copy the data in for the computation, and copy the results back. The latter could be useful, e.g., for debugging GPU applications with single stencils in the `"debug"` backends.
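
The two options can be illustrated with a small sketch. All names here (`find_incompatibilities`, `choose_action`, the `policy` values and the returned action strings) are hypothetical, and the low-level layout check is left out because it depends on the backend.

```python
from typing import Any, Dict, List, Optional, Sequence


def find_incompatibilities(
    interface: Dict[Optional[str], Dict[str, Any]],
    device: Optional[str],
    expected_dims: Sequence[str],
    expected_typestr: str,
) -> List[str]:
    """Collect the reasons why a buffer cannot be used directly (sketch)."""
    info = interface.get(device)
    if info is None:
        return [f"no buffer for device {device!r}"]

    problems = []
    if tuple(info.get("dims", ())) != tuple(expected_dims):
        problems.append(f"dims {info.get('dims')!r} != {tuple(expected_dims)!r}")
    if info.get("typestr") != expected_typestr:
        problems.append(f"dtype {info.get('typestr')!r} != {expected_typestr!r}")
    return problems


def choose_action(interface, device, expected_dims, expected_typestr, policy="raise"):
    """Return "use_directly" or "copy", or raise, depending on `policy`."""
    problems = find_incompatibilities(
        interface, device, expected_dims, expected_typestr
    )
    if not problems:
        return "use_directly"
    if policy == "raise":
        raise ValueError("; ".join(problems))
    # Assumed alternative policy: allocate, copy in, compute, copy back.
    return "copy"


interface = {None: {"dims": ("J", "I", "K"), "typestr": "<f8"}}
action = choose_action(interface, None, ("I", "J", "K"), "<f8", policy="copy")
```

With the default `policy="raise"`, the same call raises a `ValueError` that names the mismatching dimensions.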


## Examples
The following is a simple example of a class providing both a buffer on GPU as well as in main memory:

```python
import numpy as np
import cupy as cp


class SomeStorage:
    # Synchronization states: which copy (if any) holds newer data.
    NONE_MODIFIED = 0
    NP_MODIFIED = 1
    CP_MODIFIED = 2

    def __init__(self, shape, dtype):
        self._np_buffer = np.zeros(shape, dtype)
        self._cp_buffer = cp.zeros(shape, dtype)
        self._sync_state = self.NONE_MODIFIED

    def _touch_np(self):
        self._sync_state = self.NP_MODIFIED

    def _touch_cp(self):
        self._sync_state = self.CP_MODIFIED

    def _np_to_cp(self):
        if self._sync_state == self.NP_MODIFIED:
            # One possible implementation: copy host -> device.
            self._cp_buffer[...] = cp.asarray(self._np_buffer)
            self._sync_state = self.NONE_MODIFIED

    def _cp_to_np(self):
        if self._sync_state == self.CP_MODIFIED:
            # One possible implementation: copy device -> host.
            self._np_buffer[...] = cp.asnumpy(self._cp_buffer)
            self._sync_state = self.NONE_MODIFIED

    @property
    def __gt_interface__(self):
        # Copy the dictionaries before adding the extra keys.
        np_interface = dict(self._np_buffer.__array_interface__)
        np_interface["acquire"] = self._cp_to_np
        np_interface["touch"] = self._touch_np

        cp_interface = dict(self._cp_buffer.__cuda_array_interface__)
        cp_interface["acquire"] = self._np_to_cp
        cp_interface["touch"] = self._touch_cp
        return {None: np_interface, "gpu": cp_interface}
```


This second example demonstrates how the GT4Py synced storages could be wrapped to additionally carry semantic information about the dimensions:

```python
from gt4py.storage import storage


class StorageWrapper:
    def __init__(self, *args, dims="IJK", **kwargs):
        self._gt4py_storage = storage(*args, **kwargs)
        self._dims = tuple(dims)

    @property
    def __gt_interface__(self):
        # Assumes the wrapped storage itself exposes `__gt_interface__`.
        interface = self._gt4py_storage.__gt_interface__
        for buffer_info in interface.values():
            buffer_info["dims"] = self._dims
        return interface
```

The third example demonstrates how `xarray` could be used to provide this interface:

```python
import numpy as np
import xarray as xr


@xr.register_dataarray_accessor("__gt_interface__")
def _sid_interface_from_xarray(xarray_obj):
    if hasattr(xarray_obj.data, "__gt_interface__"):
        interface = xarray_obj.data.__gt_interface__
    else:
        # Main memory is identified by `None`, as described above.
        interface = {None: dict(xarray_obj.data.__array_interface__)}
    for binfo in interface.values():
        binfo["dims"] = xarray_obj.dims
    return interface
```

A stencil could then be called with either a plain NumPy array or an `xarray.DataArray` whose dimensions are stored in a different order:

```python
import numpy as np
import xarray as xr
from gt4py import gtscript

# Fill the input with 1..8 in I-J-K (C) order.
in_field = np.arange(1, 9, dtype=np.float64).reshape(2, 2, 2)
print("INPUT:")
print(in_field)


@gtscript.stencil(backend="numpy")
def copy_stencil(in_field: gtscript.Field[np.float64], out_field: gtscript.Field[np.float64]):
    with computation(PARALLEL), interval(...):
        out_field = in_field


print("OUTPUT: numpy")
out_field = np.zeros((2, 2, 2), dtype=np.float64)
copy_stencil(in_field, out_field, origin=(0, 0, 0))
print(out_field)

print("OUTPUT: xarray")
# The output buffer is laid out as J-I-K; the "dims" entry carries this information.
out_field = xr.DataArray(np.zeros((2, 2, 2), dtype=np.float64), dims=("J", "I", "K"))
copy_stencil(in_field, out_field, origin=(0, 0, 0))
print(out_field.transpose("I", "J", "K"))
```
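
To make use of the `"dims"` entry, a backend has to map the storage's dimension order onto the order expected by the stencil. A minimal sketch of that mapping is shown below; the helper `axes_to_stencil_order` is hypothetical and only computes the axis permutation, which could then be passed to, e.g., `np.transpose`.

```python
from typing import Sequence, Tuple


def axes_to_stencil_order(
    dims: Sequence[str], stencil_order: Sequence[str] = ("I", "J", "K")
) -> Tuple[int, ...]:
    """Return the axis permutation that reorders `dims` into `stencil_order`."""
    dims = tuple(dims)
    if sorted(dims) != sorted(stencil_order):
        raise ValueError(f"dims {dims!r} do not match {tuple(stencil_order)!r}")
    # Entry i is the position in `dims` of the i-th stencil dimension.
    return tuple(dims.index(name) for name in stencil_order)


# A buffer stored as J-I-K with shape (3, 2, 4):
axes = axes_to_stencil_order(("J", "I", "K"))
shape_in_stencil_order = tuple((3, 2, 4)[axis] for axis in axes)
```

For the J-I-K buffer above, `axes` is `(1, 0, 2)`, and the shape seen in I-J-K order is `(2, 3, 4)`.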
