# Deltaflow Data Types

Deltaflow lets you work with asynchronous processes written in different programming languages (currently only Python and C/C++). To facilitate interaction between them, we define the Deltaflow types, which users can use for data exchange.

## Design

Our type system was designed around three principles:

1. __Hardware Friendly__: a lightweight static system.

2. __Sizeable Connections__: all types are sizeable so that the size of a
    connection can be calculated in bits.

3. __Block Re-Use__: when possible, we want to allow users to
    encapsulate unique behaviour once and not have to re-write this with
    different types.

The system is informed by _Type Systems_ by Luca Cardelli, 1996.

## Main Types

Below we highlight the main data types and show how they are used in the context of Python nodes. Some of the steps might not look obvious to Python users; we address them below and give our reasoning, which mostly comes from hardware development and from strictly typed languages such as C/C++.


### Primitive

These are the main building blocks:

- `DInt`: default 32-bit, maps to python `int`

- `DUInt`: default 32-bit, maps to python `int`

- `DBool`: 1-bit implementation, maps to python `bool`

- `DChar`: 8-bit implementation for ASCII characters, has no python analogue

- `DFloat`: default 32-bit, maps to python `float`

- `DComplex`: default 64-bit, maps to python `complex`

Note that 16-, 64- and 128-bit versions are under development.

Python types can be converted into Deltaflow types:

In [1]:
import deltalanguage as dl

print(dl.as_delta_type(int))
print(dl.as_delta_type(bool))
print(dl.as_delta_type(float))
print(dl.as_delta_type(complex))

DInt32
DBool
DFloat32
DComplex64


In [2]:
print(dl.delta_type(-5))
print(dl.delta_type(True))
print(dl.delta_type(4.3))
print(dl.delta_type(3 + 5j))

DInt32
DBool
DFloat32
DComplex64


And in the opposite direction:

In [3]:
print(dl.DInt().as_python_type())
print(dl.DUInt().as_python_type())
print(dl.DBool().as_python_type())
print(dl.DFloat().as_python_type())
print(dl.DComplex().as_python_type())

<class 'int'>
<class 'int'>
<class 'bool'>
<class 'float'>
<class 'complex'>


### Compound (Non Primitive)

These types allow users to exchange more complicated blobs of data:

- `DTuple`: bundles a fixed number of fixed type elements together, maps to python `Tuple`

- `DArray`: as `DTuple` but with the same type elements,
    maps to python `List`

- `DStr`: `DArray` of `DChar`, default length is 1024, maps to python `str`

- `DRecord`: as `DTuple` but keyworded, maps to classes with attributes created
    via [attrs](http://www.attrs.org/en/stable/index.html)

- `DUnion`: holds a value of one of several types, the meta information about
    the current type is stored in an extra buffer; maps to python `Union`

Let's see a few examples:

In [4]:
print(dl.delta_type((-5, False, 2.5)))
print(dl.delta_type([4.5, 3.2, -1.0]))
print(dl.delta_type('Hello World'))

(DInt32, DBool, DFloat32)
[DFloat32 x 3]
DStr88
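
The sizes in these type names follow from the element sizes. As a rough illustration, here is a minimal sketch of how a connection size could be computed; `size_in_bits` and its lookup table are hypothetical helpers, not part of `deltalanguage`, and we assume a compound value simply occupies the sum of its element sizes with no padding:

```python
# Hypothetical size model: these names and rules are illustrative assumptions,
# not part of the deltalanguage API.
PRIMITIVE_BITS = {int: 32, bool: 1, float: 32, complex: 64}
CHAR_BITS = 8


def size_in_bits(value):
    """Estimate how many bits a Python value would occupy on a connection."""
    if isinstance(value, str):
        return CHAR_BITS * len(value)
    if isinstance(value, (tuple, list)):
        # Assumption: elements are laid out back to back, without padding.
        return sum(size_in_bits(item) for item in value)
    return PRIMITIVE_BITS[type(value)]


print(size_in_bits("Hello World"))      # 11 characters x 8 bits = 88
print(size_in_bits([4.5, 3.2, -1.0]))   # 3 x 32 bits = 96
print(size_in_bits((-5, False, 2.5)))   # 32 + 1 + 32 = 65
```

The first result matches the `DStr88` shown above: the string length is inferred from the value rather than taken from the default of 1024.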


The `DRecord` example requires importing `attr` and creating a special container class:

In [5]:
import attr

@attr.s(slots=True)
class RecBI:

    x: bool = attr.ib()
    y: int = attr.ib()

print(dl.delta_type(RecBI(x=True, y=41)))

{x: DBool, y: DInt32}


Deltaflow and Python data types that have a one-to-one mapping can be used interchangeably:

In [6]:
print(dl.DTuple([int, dl.DFloat()]))

(DInt32, DFloat32)


`DUnion` should be defined like this, and can contain as many types as you wish:

In [7]:
print(dl.DUnion([int, float]))

<DFloat32 | DInt32>


Note that `DUnion` does not simplify when it contains a single type (unlike `Union`), and since it has a specific packing format it cannot be accepted as that type.

In [8]:
print(dl.DUnion([int]))
assert dl.DUnion([int]) != dl.DInt()

<DInt32>


### Special

These are all unique classes which require an individual approach. They all relate to the data types, which is why they are covered here. In short,

- `DSize`: defines the type's size, can be a placeholder

- `Void`: used for nodes without any output

- `Top`: abstract type, maps to python `object`

- `DRaw`: maps from a base type to an integer containing that type's binary representation

Now in more detail.

#### `DSize`

You may have noticed that in the current implementation Deltaflow types are instances of classes, not the classes themselves. This is because almost all types can be defined with a precision, given as the number of bits the data type is stored in:

In [9]:
# 32 bits
print(dl.DInt(dl.DSize(32)).pack(15))
# 64 bits
print(dl.DInt(dl.DSize(64)).pack(15))

# these types are different
assert dl.DInt(dl.DSize(32)) != dl.DInt(dl.DSize(64))

# can you read it? ;)

b'00000000000000000000000000001111'
b'0000000000000000000000000000000000000000000000000000000000001111'
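
To make the output easier to read, the following sketch reproduces the same fixed-width bit strings in plain Python. `pack_int` is a hypothetical helper, not the library's `pack` method, and the two's complement handling of negative values is our assumption, since the cell above only packs a positive number:

```python
def pack_int(value: int, width: int = 32) -> bytes:
    """Render an integer as a fixed-width, big-endian string of '0'/'1'."""
    if not -(1 << (width - 1)) <= value < (1 << (width - 1)):
        raise ValueError(f"{value} does not fit in {width} signed bits")
    # Masking maps negative values to their two's complement bit pattern
    # (an assumption; the tutorial only shows a positive value).
    return format(value & ((1 << width) - 1), f"0{width}b").encode()


print(pack_int(15, 32))
print(pack_int(15, 64))
```

Both calls end in `1111`; only the number of leading zeros differs, which is why the 32-bit and 64-bit types cannot be treated as equal.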


Another use of this type is to define the length of `DArray` and `DStr`:

In [10]:
# 32 bits x 2 = 64 bits
print(dl.DArray(dl.DInt(), dl.DSize(2)).pack([15, 255]))

# these types are different
print(dl.DArray(dl.DInt(), dl.DSize(2)) == dl.DArray(dl.DInt(), dl.DSize(3)))

b'0000000000000000000000000000111100000000000000000000000011111111'
False


`DUnion` has a unique packing system:

In [11]:
# The last eight bits store the information about the data type.
# The rest of the buffer has the same size as the largest type,
# 32 bits in this case.
# So in total it's 40 bits.
print(dl.DUnion([dl.DInt(), dl.DBool()]).pack(1))
print(dl.DUnion([dl.DInt(), dl.DBool()]).pack(True))

b'0000000000000000000000000000000100000001'
b'1000000000000000000000000000000000000000'
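
The layout can be read off the two outputs above: the payload is left-aligned, padded with zeros up to the size of the largest member, and followed by an 8-bit tag. The sketch below rebuilds that layout in plain Python. `pack_union` is a hypothetical helper, and the tag values (`DBool` is 0, `DInt32` is 1, i.e. alphabetical order) are inferred from the printed output rather than taken from the library:

```python
TAG_BITS = 8  # width of the trailing type tag, as stated in the cell above


def pack_union(payload_bits: str, tag: int, largest_member_bits: int) -> bytes:
    """Left-align the payload, zero-pad it to the largest member, append the tag."""
    if len(payload_bits) > largest_member_bits:
        raise ValueError("payload is wider than the largest member type")
    body = payload_bits.ljust(largest_member_bits, "0")
    return (body + format(tag, f"0{TAG_BITS}b")).encode()


# Tags are assumed to follow the alphabetical order of the member types:
# DBool -> 0, DInt32 -> 1.
packed_int = pack_union(format(1, "032b"), tag=1, largest_member_bits=32)
packed_bool = pack_union("1", tag=0, largest_member_bits=32)
print(packed_int)
print(packed_bool)
print(len(packed_int), len(packed_bool))  # both are 40 bits long
```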


Try to pack more complicated data types if you wish.

#### `Void`

If a node does not have any output channels, use this class as its return type. If the node may or may not send a message, you need to create a channel by specifying the output type.

#### `Top`

This type is mostly used for debugging, as any data type can be accepted as `Top`. However, this rule does not apply to subtypes of compound types, as that would break their packing mechanism, e.g. `DTuple([int, int])` cannot be received as `DTuple([int, Top()])`. Note that `Top` cannot be used within the SystemC runtime, which requires stricter typing.

Both `Void` and `Top` can be illustrated in this example:

In [12]:
@dl.DeltaBlock(allow_const=False)
def return5() -> int:
    return 5

@dl.DeltaBlock(allow_const=False)
def print_and_exit(a: dl.Top()) -> dl.Void:
    print('Hello ', a)
    raise dl.DeltaRuntimeExit

with dl.DeltaGraph() as graph:
    print_and_exit(return5())

rt = dl.DeltaPySimulator(graph)
rt.run()

Hello  5


Try:
- Use python's `object` instead of `Top()` with the same effect.
- Use a wrong type, like `float`, and see that an error is raised.

#### `DRaw`

`DRaw` is a special type that extends the range of supported types in low-level programming languages such as Migen. Because these languages require inputs to be specified as binary signals, `DRaw` provides the methods `as_bits` and `from_bits` to convert data to a binary signal and back. This allows us to use other types, such as floating point numbers, within these nodes.

In [13]:
raw_type = dl.DRaw(float)

@dl.DeltaBlock(allow_const=False)
def return_half() -> raw_type:
    return raw_type.as_bits(0.5)

@dl.DeltaBlock(allow_const=False)
def print_raw_and_bits(a: raw_type) -> dl.Void:
    print("Input in bits:", raw_type.pack(a))
    print("Input as float:", raw_type.from_bits(a))
    raise dl.DeltaRuntimeExit

with dl.DeltaGraph() as graph:
    print_raw_and_bits(a=return_half())

rt = dl.DeltaPySimulator(graph)
rt.run()

Input in bits: b'00111111000000000000000000000000'
Input as float: 0.5
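
For readers who want to see what such a conversion involves, here is a minimal sketch using only the standard `struct` module. The helpers `float_to_bits` and `bits_to_float` are hypothetical stand-ins for illustration; we assume the 32-bit IEEE 754 format, which agrees with the bit string printed above, but this is not the library's implementation:

```python
import struct


def float_to_bits(value: float) -> int:
    """Reinterpret a float as the unsigned integer holding its binary32 bits."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


def bits_to_float(bits: int) -> float:
    """Inverse of float_to_bits."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


bits = float_to_bits(0.5)
print(format(bits, "032b"))   # 00111111000000000000000000000000
print(bits_to_float(bits))    # 0.5
```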


## NumPy

[NumPy](https://numpy.org/) is a popular library for numerical computation in Python. While Deltaflow does not provide full compatibility with NumPy, we support a number of use cases, including many primitive types and some structures of `numpy.ndarray`.

For a Deltaflow data type, the method `as_numpy_object` will produce an equivalent NumPy object, and the method `from_numpy_object` will take a NumPy array and reconstruct the original object from it. Deltaflow types also have an `as_numpy_type` method, which produces a data type which can be used in a NumPy array.

NumPy has its own implementations of primitive data types, which can be mapped to Deltaflow's own primitive types. See [here](https://numpy.org/devdocs/user/basics.types.html#array-types-and-conversions-between-types) for a full list of NumPy's primitive types. Nearly all fixed-size primitive types in NumPy are supported, with the only exceptions being `numpy.intp` and `numpy.uintp`, which do not have an equivalent type in Deltaflow. Some platform-defined types might be supported depending on implementation, but as a rule of thumb it is safer to use NumPy's fixed-size primitive types.

In [14]:
import numpy as np

print(dl.as_delta_type(np.bool_))
print(dl.as_delta_type(np.ubyte))
print(dl.as_delta_type(np.int32))
print(dl.as_delta_type(np.uint32))
print(dl.as_delta_type(np.float32))
print(dl.as_delta_type(np.complex64))

print(dl.DBool().as_numpy_type())
print(dl.DChar().as_numpy_type())
print(dl.DInt().as_numpy_type())
print(dl.DUInt().as_numpy_type())
print(dl.DFloat().as_numpy_type())
print(dl.DComplex().as_numpy_type())

DBool
DChar8
DInt32
DUInt32
DFloat32
DComplex64
<class 'numpy.bool_'>
<class 'numpy.uint8'>
<class 'numpy.int32'>
<class 'numpy.uint32'>
<class 'numpy.float32'>
<class 'numpy.complex64'>


Compound types can also be mapped to NumPy data structures through the methods mentioned above.
The simplest example is `DArray`, which maps to a 1-dimensional `numpy.ndarray`.

In [15]:
np_array = dl.DArray(int, dl.DSize(5)).as_numpy_object([1, 2, 3, 4, 5])
print(np_array)

original_array = dl.DArray(int, dl.DSize(5)).from_numpy_object(np_array)
assert original_array == [1, 2, 3, 4, 5]

[1 2 3 4 5]


`DStr` is a special case of `DArray`: it maps to `numpy.string_`, an ASCII-encoded string, rather than to `numpy.ndarray`.

In [16]:
np_str = dl.DStr(dl.DSize(12)).as_numpy_object("Hello world!")
print(np_str)
assert chr(np_str[4]) == 'o'

original_str = dl.DStr(dl.DSize(12)).from_numpy_object(np_str)
assert original_str == "Hello world!"

b'Hello world!'


Note that multidimensional NumPy arrays are not currently supported, as `DArray` types are one-dimensional.

`DTuple` and `DRecord` objects also have equivalent NumPy types, which are a single-row `numpy.ndarray` whose columns are different types. Elements from a `DTuple` can be accessed by integer indexing, whereas elements from a `DRecord` can be accessed as attributes.

In [17]:
np_tuple = dl.DTuple([bool, dl.DChar(), int]).as_numpy_object((True, 'c', 5))
# Recall DChar objects are stored in their ASCII format
print(np_tuple)
assert chr(np_tuple[0][1]) == 'c'
original_tuple = dl.DTuple([bool, dl.DChar(), int]).from_numpy_object(np_tuple)
assert original_tuple == (True, 'c', 5)

np_record = dl.DRecord(RecBI).as_numpy_object(RecBI(True, 5))
print(np_record)
assert np_record[0].y == 5
original_record = dl.DRecord(RecBI).from_numpy_object(np_record)
assert original_record == RecBI(True, 5)

[( True, 99, 5)]
[( True, 5)]


Finally, `DUnion` also has a NumPy object, which is a single-row `numpy.ndarray` whose columns have the same byte offset.
However, note that unlike the other types, there is no equivalent method for converting back to a `DUnion` object,
because the current type is not stored. Instead, the user needs to know which type is the correct type.

In [18]:
# Columns in NumPy unions are indexed by names
np_union = dl.DUnion([bool, float, int]).as_numpy_object(5)
print(np_union)
original_union = np_union[0]["DInt32"]
assert original_union == 5

# Alternatively, they can be indexed by position, in alphabetical order of the type names
np_union = dl.DUnion([bool, float, int]).as_numpy_object(3.5)
print(np_union)
original_union = np_union[0][1]
assert original_union == 3.5

[( True, 0., 5)]
[(False, 3.5, 1080033280)]
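
The odd-looking values in the other columns are the same bytes read as a different type. The sketch below builds a comparable NumPy structured dtype by hand, with every field at byte offset 0. The field names mirror the output above, but the explicit little-endian formats and the dtype construction are our assumptions about how such a union could be laid out, not the library's code:

```python
import numpy as np

# All three fields start at byte offset 0, so they share the same memory.
union_dtype = np.dtype({
    "names": ["DBool", "DFloat32", "DInt32"],
    "formats": ["?", "<f4", "<i4"],
    "offsets": [0, 0, 0],
})

row = np.zeros(1, dtype=union_dtype)
row["DFloat32"] = 3.5

# Reading the integer field reinterprets the float's four bytes.
as_int = int(row["DInt32"][0])
print(as_int)            # 1080033280, i.e. 0x40600000
print(row["DFloat32"][0])
```

This is why the caller has to know which column holds the meaningful value: nothing in the array records which field was written last.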


## Common Pitfalls

These are the most common mistakes made by Python users:

- Don't forget to use an instance of the type class, `DBool()`, and not just the class, `DBool`.

- Deltaflow does not have type conversion or casting.
Thus the primitive types are not related to each other,
for instance `DBool()` cannot be cast to `DUInt()` as it can in Python.
Silent casting in Python can cause a problem for Deltaflow.

- Typed data channels (a.k.a. wires, streams) connect nodes' input and output ports.
The type of the sending port should match the type of the receiving port. `Top()` can be used
at the receiving side for quick debugging but should be replaced by the matching type when used
beyond the Python runtime.

- `DUnion` with a single type is not equal to that type as they have different
packing mechanisms; an attempt to create a channel between these two types
will raise an error.
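
To illustrate the silent casting mentioned above, the following plain Python sketch shows how `bool` passes as `int` in ordinary code and how a strict, exact-type check rejects it. `check_exact_type` is a hypothetical helper written for this example only; it is not how Deltaflow validates channel types:

```python
# Python treats bool as a subclass of int, so these all succeed silently.
print(isinstance(True, int))   # True
print(True + 1)                # 2


def check_exact_type(value, expected: type):
    """Hypothetical helper: accept a value only if its type matches exactly."""
    if type(value) is not expected:
        raise TypeError(
            f"expected {expected.__name__}, got {type(value).__name__}"
        )
    return value


check_exact_type(5, int)         # passes
try:
    check_exact_type(True, int)  # rejected: bool is not exactly int
except TypeError as err:
    print(err)                   # expected int, got bool
```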