# Kosh Operators

This notebook introduces Kosh's *operators*. Unlike *transformers*, which act on a single feature, *operators* allow post-processing of data coming from several features, for example adding two features together. The features may come from the same source or from different sources.

Similarly to *loaders* and *transformers*, *operators* must declare the mime_types they can accept as inputs and the mime_types they export these inputs to. Where the *transformers* process the feature via their `transform` function, *operators* must define their `operate` function.

At the moment, all inputs are expected to have the same mime_type.

An operator's inputs can be features coming straight from the loader, features processed by one or more *transformers*, or the output of another *operator*.

In this example we will define a simple **ADD** operator that will add all the inputs it receives.

In [1]:
import kosh
import numpy as np

class ADD(kosh.KoshOperator):
    types = {"numpy" : ["numpy",]}  # Our operator accepts numpy arrays and outputs numpy arrays
    
    def operate(self, *inputs, **kargs):
        # *inputs are the inputs received from their predecessors in the execution graph
        # It is important to define **kargs as the function will receive `format=some_format`
        out = np.array(inputs[0])
        for input_ in inputs[1:]:
            # Not in-place (`+=`): adding floats into an integer array in place raises a casting error
            out = out + np.array(input_)
        return out

**Important points**:

  * The `operate` function will receive the desired output format via `format=output_format` so it *must* declare `**kargs`
  * inputs are sent via `*inputs`
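The need for `**kargs` follows from ordinary Python calling rules. Below is a minimal plain-Python sketch, with no Kosh involved; `operate_strict` and `operate_flexible` are hypothetical stand-ins for an operator's `operate` method, and the `format="numpy"` value simply mimics the keyword described above.

```python
# Hypothetical stand-ins for an operator's `operate` method (no Kosh involved).
def operate_strict(*inputs):
    # No **kargs: Python rejects any keyword argument passed to this function
    return sum(inputs)


def operate_flexible(*inputs, **kargs):
    # **kargs absorbs format=... (and any other keyword) without error
    return sum(inputs)


try:
    operate_strict(1, 2, format="numpy")
    strict_error = None
except TypeError as err:
    strict_error = err

flexible_result = operate_flexible(1, 2, format="numpy")

print("without **kargs:", type(strict_error).__name__)
print("with **kargs:", flexible_result)
```

The first call fails with a `TypeError` before the function body ever runs, which is why an `operate` that omits `**kargs` cannot accept the `format` keyword.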

In [2]:
import sys, os
store = kosh.create_new_db("operators_demo.sql")
ds = store.create()
ds.associate("../tests/baselines/node_extracts2/node_extracts2.hdf5", mime_type="hdf5")

f1 = ds["cycles"]
print(f1())

add = ADD(f1,f1)

print(add[:])


<HDF5 dataset "cycles": shape (2,), type "<i8">
[22 16]


In the next version the search function will return a generator.
You might need to wrap the result in a list.


As previously mentioned, we can also pass the feature through a transformer first:

In [3]:
class Twice(kosh.transformers.KoshTransformer):
    types = {"numpy":["numpy",]}
    def transform(self, input, format):
        return np.array(input) * 2.
    
twice = Twice()

f1 = ds.get_execution_graph("cycles", transformers=[twice,])
f2 = ds["cycles"]

add2 = ADD(f1, f2)
print(add2[:])


[33. 24.]


We can also use an operator as an input to another operator, and mix and match operators with regular features.

In [4]:
add3 = ADD(add2, add)
print(add3[:])

[55. 40.]
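
The three printed results are consistent with one another. As a quick sanity check, the sketch below redoes the arithmetic in plain Python, with no Kosh involved. It assumes `cycles` holds `[11, 8]`, a value inferred from the `[22 16]` printed for `ADD(f1, f1)`; the helpers `add_all` and `twice` are hypothetical stand-ins for the operator and the transformer, not part of Kosh.

```python
# Plain-Python stand-ins for the ADD operator and the Twice transformer (no Kosh).
# Assumption: cycles == [11, 8], inferred from ADD(f1, f1) printing [22 16].
cycles = [11, 8]


def add_all(*inputs):
    """Element-wise sum of all inputs, mirroring ADD.operate."""
    return [sum(values) for values in zip(*inputs)]


def twice(values):
    """Multiply each value by the float 2., mirroring Twice.transform."""
    return [v * 2. for v in values]


add = add_all(cycles, cycles)            # ADD(f1, f1)
add2 = add_all(twice(cycles), cycles)    # ADD(transformed feature, raw feature)
add3 = add_all(add2, add)                # ADD(add2, add)

print(add)   # [22, 16]
print(add2)  # [33.0, 24.0]
print(add3)  # [55.0, 40.0]
```

The float results for `add2` and `add3` come from the `2.` multiplier in the transformer, which matches the trailing dots in the notebook output.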


Sometimes these can get complicated and hard to follow.
You can draw the execution graph to see if everything is happening as you would expect.

In [5]:
kosh.utils.draw_execution_graph(add3.execution_graph(), png_name="exec_graph.png", output_format="numpy")

![Execution Graph](exec_graph.png)

Lastly, it is worth noting that transformers and operators can implement their own `__getitem__` function to subset the data. See [this notebook](Example_Advanced_Data_Slicing.ipynb) for more on this.