In [1]:
import sys
import os
import numpy
sys.path.append(os.path.dirname(os.path.abspath('')))

# Structs

Structs are containers for structured data, similar to NumPy arrays.
Unlike arrays, however, the data in a struct is structured, i.e. hierarchical and named.
This allows writing vectorized code for much more complex data structures than multi-dimensional arrays.

The `struct` API is closely related to the [`math` API](NumPy_and_TensorFlow_Execution.md).

## Examples

In [2]:
from phi import struct
a = [1, numpy.zeros([2])]
b = {'x0': 0, 'x1': 0.5, 'x2': 1}
c = (a, b)

All functions in `phi.math` can be called on structs. The call is broadcast to every array contained in the struct.

In [3]:
from phi import math
math.sin(c)

([0.8414709848078965, array([0., 0.])],
 {'x0': 0.0, 'x1': 0.479425538604203, 'x2': 0.8414709848078965})

Note that `math.sin` calls `numpy.sin` in this case.
If the struct contained TensorFlow tensors, `tensorflow.sin` would be called instead.
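
The broadcasting behaviour can be approximated in a few lines of plain Python. The following is only a minimal sketch, not the `phi.math` implementation: the helper name `broadcast_unary` is hypothetical, and it handles lists, tuples and dicts but no other struct types.

```python
import numpy


def broadcast_unary(func, obj):
    """Apply func to every leaf of a nested list/tuple/dict structure."""
    if isinstance(obj, dict):
        return {key: broadcast_unary(func, value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        # Rebuild the same container type from the transformed entries.
        return type(obj)(broadcast_unary(func, value) for value in obj)
    return func(obj)  # leaf: a number or an array


a = [1, numpy.zeros([2])]
b = {'x0': 0, 'x1': 0.5, 'x2': 1}
result = broadcast_unary(numpy.sin, (a, b))
```

The result has the same nesting as the input, with `numpy.sin` applied to each number or array.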

## Types of structs

The following types are considered structs:
- Lists
- Tuples
- Dicts containing strings as keys
- NumPy arrays with `dtype=object` (spelled `numpy.object` in older NumPy versions)
- Subclasses of [`phi.struct.Struct`](../phi/struct/struct.py)

All `phi.math` functions and functions of the struct API work with any of the above types.

While all entries of lists, tuples, dicts and NumPy arrays are expected to hold data,
subclasses of the `Struct` class can define further *properties* which are not affected by the above-mentioned functions.
Data-holding entries will be referred to as *attributes*.

## Iterating over structs

The struct interface provides the function `map` which iterates over all *attributes* of a struct and optionally all sub-structs.

In [4]:
struct.map(lambda x: str(x), c)

(['1', '[0. 0.]'], {'x0': '0', 'x1': '0.5', 'x2': '1'})

This iterates over all attributes of `c` and recursively over all of its sub-structs.
If we only wanted to affect the entries directly held by `c`, we could call:

In [5]:
struct.map(lambda x: str(x), c, recursive=False)

('[1, array([0., 0.])]', "{'x0': 0, 'x1': 0.5, 'x2': 1}")

Suppose we want only dicts to be stringified as a whole.
This can be achieved by defining a `leaf_condition`.

In [6]:
struct.map(lambda x: str(x), c, leaf_condition=lambda x: isinstance(x, dict))

(['1', '[0. 0.]'], "{'x0': 0, 'x1': 0.5, 'x2': 1}")
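
To make the roles of `recursive` and `leaf_condition` concrete, here is a rough pure-Python sketch of such a mapping function. It is an illustration under simplifying assumptions, not the code in `phi.struct`: the name `map_struct` is hypothetical and only lists, tuples and dicts are treated as structs.

```python
def map_struct(func, obj, recursive=True, leaf_condition=None):
    """Hypothetical stand-in for struct.map on lists, tuples and dicts."""

    def is_struct(value):
        if leaf_condition is not None and leaf_condition(value):
            return False  # treated as a single value although it is a container
        return isinstance(value, (list, tuple, dict))

    def visit(value, depth):
        # Stop descending at leaves, or below the top level if not recursive.
        if not is_struct(value) or (depth > 0 and not recursive):
            return func(value)
        if isinstance(value, dict):
            return {k: visit(v, depth + 1) for k, v in value.items()}
        return type(value)(visit(v, depth + 1) for v in value)

    return visit(obj, 0)


c = ([1, 2.5], {'x0': 0, 'x1': 0.5})
everything = map_struct(str, c)
top_level = map_struct(str, c, recursive=False)
dicts_whole = map_struct(str, c, leaf_condition=lambda x: isinstance(x, dict))
```

In this sketch, `leaf_condition` simply overrides the container check, so a dict for which it returns `True` is passed to `func` as one value.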

In some cases we require additional information when mapping a struct; not just the value but also where it is stored.
When calling `map(.., trace=True)`, a [`Trace`](../phi/struct/functions.py) object is passed to the mapping function `f` instead of the value. In addition to retrieving the value via `trace.value`, it provides access to the attribute key as `trace.key` and the parent structs via `trace.parent`.

In [7]:
struct.map(lambda trace: trace.key, c, trace=True)

([0, 1], {'x0': 'x0', 'x1': 'x1', 'x2': 'x2'})
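
The idea of passing location information along with the value can be sketched without the library. The `Trace` tuple and `map_with_trace` function below are hypothetical stand-ins that mirror only the `value`, `key` and `parent` fields named above; the real `Trace` class may offer more.

```python
from collections import namedtuple

# Hypothetical stand-in for the Trace object: value, key and parent container.
Trace = namedtuple('Trace', ['value', 'key', 'parent'])


def map_with_trace(func, obj):
    """Call func(Trace) on every leaf of a nested list/tuple/dict."""

    def visit(value, key, parent):
        if isinstance(value, dict):
            return {k: visit(v, k, value) for k, v in value.items()}
        if isinstance(value, (list, tuple)):
            # For sequences, the key is the index of the entry.
            return type(value)(visit(v, i, value) for i, v in enumerate(value))
        return func(Trace(value, key, parent))

    return visit(obj, None, None)


c = ([1, 2.5], {'x0': 0, 'x1': 0.5})
keys = map_with_trace(lambda trace: trace.key, c)
```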

## Usages in Φ<sub>*Flow*</sub>

In Φ<sub>*Flow*</sub>, structs are mostly used to store simulation states, i.e.
each attribute holds a tensor such as density or velocity of a smoke simulation.
In particular, the state base class [`phi.physics.physics.State`](../phi/physics/physics.py) extends `Struct`.
All `Field` classes such as `StaggeredGrid` are also structs.

Properties are used to hold additional parameters for the simulation that should be included in the `description.json` file. Typical examples of these include `viscosity` or `buoyancy_factor`.

In [8]:
from phi.flow import *
smoke = Smoke(Domain([80, 64]))

def print_name(trace):
    print('%s  (%s)' % (trace.path(), type(trace.value).__name__))
    return trace.value
struct.map(print_name, smoke, trace=True, include_properties=True);

TypeError: 'NoneType' object is not subscriptable

[Staggered grids](Staggered_Grids.md), as in `smoke.velocity`, are vector fields where the arrays for each component have different shapes.

## Tensor initialization

Initializer functions such as `zeros` or `placeholder` internally call their counterparts in NumPy or TensorFlow.
They can take 1D-tensors describing the shape as input but also support structs holding shapes.
The call `zeros(StaggeredGrid([1,65,65,2]))` will return a `StaggeredGrid` holding a NumPy array.

Some states simplify this even further by allowing a syntax like `SmokeState(density=zeros)` or `SmokeState(velocity=placeholder)`.

The `placeholder` and `variable` initializers also infer the name of the resulting tensors from the attribute names.
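
As a rough illustration of initializing from a struct of shapes, the sketch below replaces every shape in a nested structure with a NumPy zero array. The helper `zeros_from_shapes` is hypothetical and the shapes are made up for the example; it does not reproduce how `phi.math.zeros` handles `StaggeredGrid` or other `Struct` subclasses.

```python
import numpy


def zeros_from_shapes(shapes):
    """Hypothetical helper: replace each shape in a nested structure by zeros."""
    if isinstance(shapes, dict):
        return {key: zeros_from_shapes(value) for key, value in shapes.items()}
    # A flat sequence of ints is a shape, i.e. a leaf of the structure.
    if all(isinstance(dim, int) for dim in shapes):
        return numpy.zeros(shapes)
    return type(shapes)(zeros_from_shapes(value) for value in shapes)


# Illustrative shapes only: one density array and two velocity components.
state_shapes = {
    'density': (1, 64, 64, 1),
    'velocity': [(1, 65, 64, 1), (1, 64, 65, 1)],
}
state = zeros_from_shapes(state_shapes)
```

Because the dict keys survive the mapping, a name such as `'density'` is available at the point where each array is created, which is the kind of information a `placeholder` or `variable` initializer can use for naming.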

### Data I/O

The data writing and reading system accepts structs and automatically infers their names from the attributes.
See the [data documentation](Reading_and_Writing_Data.md).

### Session

The [`Session`](../phi/tf/session.py) class is a customized version of `tf.Session` which accepts structs for the `fetches` argument as well as inside the `feed_dict`.

This can be used to quickly run states through a graph like so:

In [9]:
from phi.tf.flow import *
numpy_state = Smoke(Domain([16, 16]), density=math.zeros, velocity=math.zeros)
placeholder_state = Smoke(Domain([16, 16]), density=placeholder, velocity=placeholder)
output_state = SMOKE.step(placeholder_state)
session = Session(None)
numpy_state = session.run(output_state, {placeholder_state: numpy_state})
numpy_state

Smoke[density: Grid[16x16(1), size=[16. 16.]], velocity: StaggeredGrid[16x16, size=[16. 16.]]]

## Validity

As structs are supposed to hold data in a specific structure, there is a preferred data type for each entry.
For a `CenteredGrid`, the `data` attribute should be a tensor or array with a certain rank and the `velocity` of a `Smoke` object should be a `StaggeredGrid`.

An entry is _valid_ if its value is of the preferred data type.
Subclasses of `Struct` can implement validity checks and modify their entries to make them valid.

Validity is not always required, however. Many math functions return invalid structs; for example, `math.staticshape(obj)` returns a struct containing shapes instead of data.
Code dealing with invalid structs should always be enclosed in a `with struct.anytype():` block.
This context skips all data validation steps.

## Immutability

While structs can be mutable in principle, the struct interface does not allow for changing a struct.
Attributes and properties can be "changed" using the `copied_with` method.
The struct itself is not altered; instead, a duplicate with the new values is created.
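
The same copy-instead-of-mutate pattern exists in the Python standard library, which makes for a small self-contained illustration. The `Particle` class below is a hypothetical example type and has nothing to do with `phi`; `dataclasses.replace` plays the role that `copied_with` plays for structs.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Particle:  # hypothetical example type, not part of phi
    position: float
    mass: float


original = Particle(position=0.0, mass=1.0)
# replace() builds a new object; `original` keeps its old values.
moved = replace(original, position=2.0)
```

Code holding a reference to `original` never observes the change, which is the main benefit of treating state objects as immutable.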

## Implementing a custom struct

The following code snippet defines a custom `Struct`:

In [10]:
class MyStruct(struct.Struct):

    def __init__(self, a, p, **kwargs):
        struct.Struct.__init__(**struct.kwargs(locals()))

    @struct.attr()
    def a(self, a): return a

    @struct.prop()
    def p(self, p): return str(p)

Attributes and properties are declared using the decorators `struct.attr()` and `struct.prop()`, respectively.
These items create read-only attributes which should be changed only using the inherited `copied_with()` method.

The methods themselves are used for validation. In addition to `self`, each attribute and property method receives the intended value as input. It can return this value unchanged without any validity checks, raise an error for invalid values, or transform the value into a valid one.

Unless created inside a `with struct.anytype()` block, structs are always valid when viewed from outside.

In [11]:
mystruct = MyStruct(a=0, p=0)
print(mystruct.p, type(mystruct.p))
mystruct = mystruct.copied_with(p=1)
print(mystruct.p, type(mystruct.p))

0 <class 'str'>
1 <class 'str'>
