# Tutorial 3: Operator

In FastEstimator, the most important concept is `Operator`, which is used extensively in `RecordWriter`, `Pipeline` and `Network`. In this tutorial, we're going to talk about everything you need to know about `Operator`.

Let's start with a short version: `Operator` is a class that works like a function.

As we all know, a function has three components: input variable(s), transformation logic, and output variable(s). Similarly, an `Operator` has three parts: input key(s), a transformation function, and output key(s).

Now you may think: "`Operator` and function are almost the same. What is the difference between them, and why do we need it?"

Here's the difference: a function uses variables, whereas an `Operator` uses keys (each key is a representation of a variable).

Here's why: the purpose of `Operator` is to allow users to construct a graph before the variables have been created. In FastEstimator, we take care of the creation, routing and management of all variables so that users can have a good night's sleep.

## How Operator works

Assume our data is a dictionary of key-value pairs, and that we have an `Operator` named `Add_one` that adds 1 to the value of whatever input key it is given:
```python
class Add_one(Operator):
    """
    Assume this is already defined; we will talk about how to define an Operator later.
    """

data = {"x":1, "y":2}

```

To add 1 to the value associated with key `x`, we can simply do:

```python

Add_one(inputs="x", outputs="x")
```
At run time, what the operator will do is:

1. take the value of the input key 'x' from the data dictionary
2. apply the transformation function to the value
3. write the output value to the data dictionary with output key 'x'

As a result, the data will become:
```python
{"x":2, "y":2}

```

Now let's add 1 to the value of `x` again and write the output to a new key `z`:
```python

Add_one(inputs="x", outputs="z")
```
The data then becomes:
```python
{"x":2, "y":2, "z":3}

```
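
The three run-time steps described above can be imitated in plain Python. The sketch below is not FastEstimator's implementation: `ToyOp`, `ToyAddOne`, and `apply_op` are hypothetical names used only to illustrate how an operator reads from an input key and writes to an output key.

```python
class ToyOp:
    """Hypothetical stand-in for an Operator; not the FastEstimator class."""

    def __init__(self, inputs=None, outputs=None):
        self.inputs = inputs      # key to read from
        self.outputs = outputs    # key to write to

    def forward(self, data):
        # Transformation logic; subclasses override this.
        return data


class ToyAddOne(ToyOp):
    def forward(self, data):
        return data + 1


def apply_op(op, store):
    """Run one toy operator against a dictionary of data."""
    value = store[op.inputs]       # 1. read the value stored under the input key
    result = op.forward(value)     # 2. apply the transformation
    store[op.outputs] = result     # 3. write the result under the output key
    return store


data = {"x": 1, "y": 2}
apply_op(ToyAddOne(inputs="x", outputs="x"), data)
print(data)  # {'x': 2, 'y': 2}
apply_op(ToyAddOne(inputs="x", outputs="z"), data)
print(data)  # {'x': 2, 'y': 2, 'z': 3}
```

Note that the operator itself never touches a variable directly: it only knows the names of the keys, which is what lets it be declared before any data exists.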


## How to express Operator connections in FastEstimator

An `Operator` can take multiple inputs and produce multiple outputs. The true power of `Operator` shows when several of them are combined in a sequence. The figure below lists several examples of graph topologies enabled by lists of `Operator`. We will talk about `Schedule` in detail in future tutorials.

<img src="image/Ops.png">
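
To make the idea of chaining concrete, here is a minimal plain-Python sketch of a list of operations with multiple inputs and outputs. It does not reproduce the exact topologies in the figure and does not use the FastEstimator API: `run_ops` and the `(function, inputs, outputs)` triples are assumptions made purely for illustration.

```python
def run_ops(ops, store):
    """Apply (function, input_keys, output_keys) triples to a dict, in order.

    Hypothetical helper for illustration; not part of FastEstimator.
    """
    for fn, inputs, outputs in ops:
        args = [store[key] for key in inputs]   # gather values for all input keys
        results = fn(*args)
        if len(outputs) == 1:
            results = (results,)                # normalize a single output to a tuple
        for key, value in zip(outputs, results):
            store[key] = value                  # write each result under its output key
    return store


ops = [
    (lambda x: x + 1, ["x"], ["x"]),                # one input, one output (in place)
    (lambda x, y: x + y, ["x", "y"], ["s"]),        # two inputs, one output
    (lambda s: (s * 2, s - 1), ["s"], ["d", "m"]),  # one input, two outputs
]

result = run_ops(ops, {"x": 1, "y": 2})
print(result)  # {'x': 2, 'y': 2, 's': 4, 'd': 8, 'm': 3}
```

Because each step only names keys, the output of one step can feed any later step, which is how a flat list can express branching and merging graphs.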

## What are different types of Operators?

At the base level, there are two types of `Operator`: `NumpyOp` and `TensorOp`.

`NumpyOp` is used only in the `ops` argument of `RecordWriter`. Users can use any library inside the transformation function to calculate the output; for example, they can call numpy, cv2, or scipy functions.

`TensorOp` is used in the `ops` argument of `Pipeline` and `Network`. Users are restricted to tensor graph operations when constructing the output; for example, the transformation logic has to be written as a TensorFlow graph.

## Operator demo in FastEstimator

Next, we will showcase different usages of `Operator` in an end-to-end deep learning task. Let's start with the same task as in tutorial 2 and build more complex logic using `Operator`.

Similar to tutorial 2, let's first generate some image data and csv files for later use:

```python
from fastestimator.dataset.mnist import load_data

train_csv, eval_csv, data_path = load_data()

print("image data is generated in {}".format(data_path))
```

```
image data is generated in /var/folders/5g/d_ny7h211cj3zqkzrtq01s480000gn/T/.fe/Mnist
```


### RecordWriter

In the new task, given the csv files and training images, we want to do two preprocessing steps upfront:

1. Read the image in grey scale and read the label. (We will use an existing Operator provided by FastEstimator.)
2. Rescale the image pixel value range from [0, 255] to [-1, 1]. (We will customize an Operator to achieve this.)
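
The rescaling in step 2 is a simple affine map, `(pixel - 127.5) / 127.5`. As a quick sanity check of the arithmetic, the NumPy sketch below applies it to a few sample values; the standalone `rescale` function is a hypothetical helper for illustration and is separate from any FastEstimator operator.

```python
import numpy as np


def rescale(image):
    """Map pixel values from [0, 255] to [-1, 1]."""
    # Cast to float32 so the output dtype is explicit.
    return (image.astype(np.float32) - 127.5) / 127.5


pixels = np.array([0, 127, 255], dtype=np.uint8)
scaled = rescale(pixels)
print(scaled.min(), scaled.max())  # -1.0 1.0
```

The endpoints 0 and 255 land exactly on -1 and 1, and the midpoint 127.5 maps to 0.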

```python
from fastestimator.util.op import NumpyOp
from fastestimator.record.preprocess import ImageReader
import fastestimator as fe
import numpy as np
import os

class Rescale(NumpyOp):
    # forward is an instance method, so it must take self as its first argument.
    def forward(self, data, state):
        # Map pixel values from [0, 255] to [-1, 1].
        data = (data - 127.5) / 127.5
        return data

writer = fe.RecordWriter(save_dir=os.path.join(data_path, "FEdata"),
                         train_data=train_csv,
                         validation_data=eval_csv,
                         ops=[ImageReader(inputs="x", parent_path=data_path, grey_scale=True), Rescale(outputs="x")])
```

Note that in the above ops sequence, `ImageReader` doesn't have outputs and 