# Output CoTransformer (Advanced)

`OutputCoTransformer` is generally similar to `CoTransformer`, and any `CoTransformer` can be used as an `OutputCoTransformer`. It is important to understand the difference between the `transform` and `out_transform` operations:

* `transform` is lazy: Fugue does not guarantee that the computation runs immediately. For example, with `SparkExecutionEngine`, the actual computation of a `transform` happens only when an action, such as `save`, is reached.
* `out_transform` is an action: Fugue ensures the computation runs immediately, regardless of which execution engine is used.
* `transform` outputs a transformed dataframe for subsequent steps to use.
* `out_transform` is the last computation of a branch in the DAG and outputs nothing.
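
The lazy-versus-action distinction above can be illustrated with a small plain-Python analogy. This is only a sketch of the semantics, not how Fugue implements them: `lazy_transform`, `eager_out_transform`, `double`, and `log` are hypothetical names invented for this illustration, and a generator stands in for a lazy dataframe.

```python
from typing import Callable, Iterable, Iterator, List

log: List[str] = []  # records when work actually happens


def double(x: int) -> int:
    log.append(f"compute {x}")
    return x * 2


def lazy_transform(rows: Iterable[int], fn: Callable[[int], int]) -> Iterator[int]:
    # Analogy for `transform`: returns a generator, so fn is not called yet.
    return (fn(x) for x in rows)


def eager_out_transform(rows: Iterable[int], fn: Callable[[int], int]) -> None:
    # Analogy for `out_transform`: runs immediately and returns nothing.
    for x in rows:
        fn(x)


pending = lazy_transform([1, 2, 3], double)
calls_after_lazy = len(log)    # no compute has happened yet

result = eager_out_transform([1, 2, 3], double)
calls_after_eager = len(log)   # compute happened right away

downstream = list(pending)     # the lazy branch runs only when consumed
```

In this analogy, `pending` does no work until a later step consumes it, while `eager_out_transform` finishes its work on the spot and hands nothing back.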

You may find that `transform().persist()` can serve as an alternative to `out_transform`. That is generally fine, but note that the output dataframe of a transformation can be very large, and persisting or checkpointing it can take up a large portion of memory or disk space. In contrast, `out_transform` does not take any space, and it states your intent more explicitly.

A typical use case is to compare two dataframes partition by partition in a distributed way.
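
To make the idea of a per-partition comparison concrete, here is a minimal single-machine sketch in plain Python. It does not use Fugue: `partition_rows` and `compare_per_partition` are hypothetical helpers, and the sketch assumes rows are lists, the partition key is column 0, and rows are sorted by column 1 within each partition. Unlike a join-based zip, it also reports keys that appear on only one side.

```python
from collections import defaultdict
from typing import Any, Dict, List

Rows = List[List[Any]]


def partition_rows(rows: Rows, key_idx: int, sort_idx: int) -> Dict[Any, Rows]:
    # Group rows by the key column, then sort each group by the sort column.
    parts: Dict[Any, Rows] = defaultdict(list)
    for row in rows:
        parts[row[key_idx]].append(row)
    return {k: sorted(v, key=lambda r: r[sort_idx]) for k, v in parts.items()}


def compare_per_partition(
    left: Rows, right: Rows, key_idx: int = 0, sort_idx: int = 1
) -> List[Any]:
    # Return the keys whose partitions differ between the two datasets.
    lp = partition_rows(left, key_idx, sort_idx)
    rp = partition_rows(right, key_idx, sort_idx)
    return [k for k in sorted(set(lp) | set(rp)) if lp.get(k) != rp.get(k)]


left = [[0, 1], [0, 2], [1, 3]]
right = [[1, 3], [0, 2], [0, 1]]
mismatches = compare_per_partition(left, right)  # same rows, different order
```

Each partition is compared independently of the others, which is what allows a distributed engine to run the comparisons in parallel.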


## Native Approach

In [None]:
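from typing import List, Any

def assert_eq(df1: List[List[Any]], df2: List[List[Any]]) -> None:
    # Output-only function: checks one pair of partitions and returns nothing
    assert df1 == df2
    print(df1, "==", df2)

# schema: a:int
def assert_eq_2(df1: List[List[Any]], df2: List[List[Any]]) -> List[List[Any]]:
    # CoTransformer-like function: same check, but it also returns rows
    assert df1 == df2
    print(df1, "==", df2)
    return [[0]]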

In [None]:
from fugue import FugueWorkflow

with FugueWorkflow() as dag:
    df1 = dag.df([[0,1],[0,2],[1,3]], "a:int,b:int")
    df2 = dag.df([[1,3],[0,2],[0,1]], "a:int,b:int")
    z = df1.zip(df2, partition=dict(by=["a"],presort=["b"]))
    z.out_transform(assert_eq)
    z.out_transform(assert_eq_2)  # All CoTransformer-like functions and classes can be used directly

## Decorator Approach

There is no obvious advantage to using the decorator for `OutputCoTransformer`.

In [None]:
from typing import List, Any
from fugue.extensions import output_cotransformer
from fugue import FugueWorkflow

@output_cotransformer()
def assert_eq(df1: List[List[Any]], df2: List[List[Any]]) -> None:
    assert df1 == df2
    print(df1, "==", df2)
    
with FugueWorkflow() as dag:
    df1 = dag.df([[0,1],[0,2],[1,3]], "a:int,b:int")
    df2 = dag.df([[1,3],[0,2],[0,1]], "a:int,b:int")
    z = df1.zip(df2, partition=dict(by=["a"],presort=["b"]))
    z.out_transform(assert_eq)

## Interface Approach

Just like the interface approach of `CoTransformer`, this gives you full flexibility and control over your transformation.

In [None]:
from typing import List, Any
from fugue.extensions import OutputCoTransformer
from fugue import FugueWorkflow

class AssertEQ(OutputCoTransformer):
    # Note that the interface differs from CoTransformer:
    # you implement `process`, which returns nothing
    def process(self, dfs):
        df1, df2 = dfs[0].as_array(), dfs[1].as_array()
        assert df1 == df2
        print(df1, "==", df2)

with FugueWorkflow() as dag:
    df1 = dag.df([[0,1],[0,2],[1,3]], "a:int,b:int")
    df2 = dag.df([[1,3],[0,2],[0,1]], "a:int,b:int")
    z = df1.zip(df2, partition=dict(by=["a"],presort=["b"]))
    z.out_transform(AssertEQ)