## Pipeline Tutorial of HeteroNN with Pytorch Backend

In this tutorial, we offer two examples, a standard binary classification task and a simple recommendation task, to show how to use PyTorch Sequential to build Hetero-NN model structures. After these two examples, additional content shows the usage of some PyTorch layers, such as LSTM.

### Install

`Pipeline` is distributed along with [fate_client](https://pypi.org/project/fate-client/).

```bash
pip install fate_client
```

To use Pipeline, we first need to specify which `FATE Flow Service` to connect to. Once `fate_client` is installed, a command-line entry point named `pipeline` is available:

In [1]:
!pipeline --help

Usage: pipeline [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  config  pipeline config tool
  init     - DESCRIPTION: Pipeline Config Command.


Assume we have a `FATE Flow Service` at 127.0.0.1:9380 (the default in standalone mode), then execute:

In [2]:
!pipeline init --ip 127.0.0.1 --port 9380

Pipeline configuration succeeded.


### Hetero NN Binary Task Example - Pytorch Backend

In this example, we show how to build the bottom, interactive, and top model structures of a Hetero-NN model with PyTorch Sequential components. Before starting a modeling task, the data to be used must be uploaded. Please refer to this [guide](https://github.com/FederatedAI/FATE/blob/master/doc/tutorial/pipeline/pipeline_tutorial_upload.ipynb).

The `pipeline` package provides components to compose a `FATE pipeline`.

In [1]:
from pipeline.backend.pipeline import PipeLine
from pipeline.component import Reader, DataTransform, Intersection, HeteroNN, Evaluation
from pipeline.interface import Data, Model

We import torch, fate_torch, and fate_torch_hook. **Remember to call fate_torch_hook.** This function is important because it modifies native torch functions and classes, which allows FATE pipeline to parse the definitions of PyTorch Sequential models, optimizers, loss functions, and initializers.

In [2]:
import torch as t
from torch import nn
from torch.nn import init
from torch import optim

from pipeline import fate_torch_hook
from pipeline import fate_torch as ft

from collections import OrderedDict
# Call fate_torch_hook, this is important
t = fate_torch_hook(t)

Then we make a `pipeline` instance:

    - initiator: 
        * role: guest
        * party: 9999
    - roles:
        * guest: 9999
        * host: 10000
    

In [3]:
guest_party_id = 9999
host_party_id = 10000

pipeline = PipeLine() \
        .set_initiator(role='guest', party_id=guest_party_id) \
        .set_roles(guest=guest_party_id, host=host_party_id)

Define a `Reader` to load data

In [4]:
reader_0 = Reader(name="reader_0")
# set guest parameter
reader_0.get_party_instance(role='guest', party_id=guest_party_id).component_param(
    table={"name": "breast_hetero_guest", "namespace": "experiment"})
# set host parameter
reader_0.get_party_instance(role='host', party_id=host_party_id).component_param(
    table={"name": "breast_hetero_host", "namespace": "experiment"})

Add a `DataTransform` component to parse raw data into Data Instance

In [5]:
data_transform_0 = DataTransform(name="data_transform_0")
# set guest parameter
data_transform_0.get_party_instance(role='guest', party_id=guest_party_id).component_param(
    with_label=True)
data_transform_0.get_party_instance(role='host', party_id=[host_party_id]).component_param(
    with_label=False)

Add an `Intersection` component to perform PSI for the hetero scenario

In [6]:
intersect_0 = Intersection(name="intersect_0")

Now we can build the structures of the Hetero-NN model locally with PyTorch. In this case, the guest bottom model and the host bottom model are each a Sequential that contains 2 linear layers. The guest top model is a Sequential that contains 1 linear layer and a sigmoid activation. The interactive model is a single linear layer.

In [7]:
# guest bottom model
guest_bottom_a = nn.Linear(10, 10, True)
guest_bottom_b = nn.Linear(10, 8, True)
guest_bottom_seq = t.nn.Sequential(
    OrderedDict([
        ('layer_0', guest_bottom_a),
        ('relu_0', nn.ReLU()),
        ('layer_1', guest_bottom_b),
        ('relu_1', nn.ReLU())
    ])
)

# host bottom model
host_bottom_model = nn.Sequential(
    nn.Linear(20, 16, True),
    nn.ReLU(),
    nn.Linear(16, 8, True),
    nn.ReLU()
)

# interactive layer
interactive_layer = nn.Linear(8, 4, True)

# guest top model
guest_top_seq = t.nn.Sequential(
    nn.Linear(4, 1, True),
    nn.Sigmoid()
)
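The layer widths above have to line up: each bottom model ends at width 8, the interactive layer maps 8 to 4, and the top model maps 4 to 1. The following is a minimal sketch of that bookkeeping in plain Python, using only the `(in_features, out_features)` pairs from the cell above. `chain_output_dim` is a hypothetical helper written for this illustration and is not part of PyTorch or FATE.

```python
def chain_output_dim(layer_dims):
    """Check that consecutive (in, out) widths line up; return the final width."""
    prev_out = None
    for in_dim, out_dim in layer_dims:
        if prev_out is not None and in_dim != prev_out:
            raise ValueError(f"expected input width {prev_out}, got {in_dim}")
        prev_out = out_dim
    return prev_out


# Widths copied from the linear layers defined in the tutorial cell
guest_bottom_out = chain_output_dim([(10, 10), (10, 8)])
host_bottom_out = chain_output_dim([(20, 16), (16, 8)])
interactive_in, interactive_out = 8, 4
top_out = chain_output_dim([(interactive_out, 1)])

# In this tutorial both bottom models emit the width the interactive layer takes
assert guest_bottom_out == host_bottom_out == interactive_in
print(guest_bottom_out, host_bottom_out, interactive_out, top_out)  # 8 8 4 1
```

A mismatch, such as a bottom model ending at width 9, raises `ValueError` here before any job is submitted.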

After defining the model structures, we initialize the model parameters (weight and bias) with torch initializers. These initializers are modified by the hook so that they can initialize a whole Sequential, and you can choose whether to initialize the weights or the biases.

In [8]:
# init bottom models
init.normal_(guest_bottom_seq)
init.normal_(host_bottom_model)
init.normal_(guest_bottom_seq, init='bias')
# init interactive layer
init.normal_(interactive_layer, init='weight')
init.constant_(interactive_layer, val=0, init='bias')
# init top model
init.xavier_normal_(guest_top_seq)

We define the loss function and the optimizer needed for training. **Please notice that after the fate torch hook, optimizers can be created without parameters.** This modification makes the pipeline easier to compile.

In [9]:
# loss function
ce_loss_fn = nn.BCELoss()
# optimizer, after fate torch hook optimizer can be created without parameters
opt = optim.Adam(lr=0.01)
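One way to picture why a parameter-free optimizer helps compilation: at definition time only a description of the optimizer (its name and hyperparameters) is needed, and model parameters can be bound later when training starts. The sketch below illustrates that idea only. `OptimizerSpec` is a hypothetical class invented for this example; it is not FATE's implementation, and the JSON layout is an assumption.

```python
import json
from dataclasses import dataclass, field


@dataclass
class OptimizerSpec:
    """Hypothetical stand-in for an optimizer defined without model parameters."""
    name: str
    kwargs: dict = field(default_factory=dict)

    def to_json(self):
        # Only plain values are stored, so the spec can be written into a config
        return json.dumps({"optimizer": self.name, **self.kwargs}, sort_keys=True)


spec = OptimizerSpec("Adam", {"lr": 0.01})
print(spec.to_json())  # {"lr": 0.01, "optimizer": "Adam"}
```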

Now we can add each network structure to the Hetero-NN component and compile it.

In [None]:
# make HeteroNN components and define some parameters
hetero_nn_0 = HeteroNN(name="hetero_nn_0", epochs=10, floating_point_precision=None,
                       interactive_layer_lr=0.01, batch_size=-1, early_stop="diff")

# add sub-component to hetero_nn_0
guest_nn_0 = hetero_nn_0.get_party_instance(role='guest', party_id=guest_party_id)
guest_nn_0.add_bottom_model(guest_bottom_seq)
guest_nn_0.add_top_model(guest_top_seq)
guest_nn_0.set_interactve_layer(interactive_layer)

host_nn_0 = hetero_nn_0.get_party_instance(role='host', party_id=host_party_id)
host_nn_0.add_bottom_model(host_bottom_model)
# compile model with torch optimizer
hetero_nn_0.compile(opt, loss=ce_loss_fn)

To show the evaluation result, an "Evaluation" component is needed.

In [11]:
evaluation_0 = Evaluation(name="evaluation_0", eval_type="binary")

Then compile our pipeline to make it ready for submission.

In [12]:
hetero_nn_1 = HeteroNN(name="hetero_nn_1")

pipeline.add_component(reader_0)
pipeline.add_component(data_transform_0, data=Data(data=reader_0.output.data))
pipeline.add_component(intersect_0, data=Data(data=data_transform_0.output.data))
pipeline.add_component(hetero_nn_0, data=Data(train_data=intersect_0.output.data))
pipeline.add_component(hetero_nn_1, data=Data(test_data=intersect_0.output.data),
                       model=Model(model=hetero_nn_0.output.model))
pipeline.add_component(evaluation_0, data=Data(data=hetero_nn_0.output.data))
pipeline.compile()

<pipeline.backend.pipeline.PipeLine at 0x7f5cdd16dd90>

Now, submit(fit) our pipeline:

In [13]:
pipeline.fit()

2022-08-25 14:27:05.915 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:83 - Job id is 202208251427054295830
2022-08-25 14:27:05.926 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:98 - Job is still waiting, time elapse: 0:00:00
2022-08-25 14:27:06.937 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:98 - Job is still waiting, time elapse: 0:00:01
2022-08-25 14:27:07.953 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:125 -
2022-08-25 14:27:07.954 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:127 - Running component reader_0, time elapse: 0:00:02
2022-08-25 14:27:08.970 | INFO

2022-08-25 14:27:42.545 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:127 - Running component hetero_nn_0, time elapse: 0:00:36
2022-08-25 14:27:43.561 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:127 - Running component hetero_nn_0, time elapse: 0:00:37
2022-08-25 14:27:44.580 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:127 - Running component hetero_nn_0, time elapse: 0:00:38
2022-08-25 14:27:45.594 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:127 - Running component hetero_nn_0, time elapse: 0:00:39
2022-08-25 14:27:46.619 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:127

2022-08-25 14:28:22.242 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:127 - Running component hetero_nn_0, time elapse: 0:01:16
2022-08-25 14:28:23.257 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:127 - Running component hetero_nn_0, time elapse: 0:01:17
2022-08-25 14:28:24.272 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:127 - Running component hetero_nn_0, time elapse: 0:01:18
2022-08-25 14:28:25.287 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:127 - Running component hetero_nn_0, time elapse: 0:01:19
2022-08-25 14:28:26.303 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:127

2022-08-25 14:29:01.836 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:127 - Running component hetero_nn_0, time elapse: 0:01:55
2022-08-25 14:29:02.849 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:127 - Running component hetero_nn_0, time elapse: 0:01:56
2022-08-25 14:29:03.877 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:125 -
2022-08-25 14:29:03.881 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:127 - Running component hetero_nn_1, time elapse: 0:01:57
2022-08-25 14:29:04.896 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:127 - Running component hetero_nn_1, time elapse: 0:01

### Hetero NN Recommendation Task Example - Pytorch Backend

In this example we show how to use the PyTorch backend and fate-torch operations to build a simple recommendation task. First we introduce the dataset. We use the Movielens-100K dataset: the guest side holds the records of user-item interactions, while the host side holds some categorical movie metadata. Thus, this task requires sparse input data.

The first step is to define pipeline components that read and align data, like the first example.

In [None]:
import argparse

from collections import OrderedDict
from pipeline.backend.pipeline import PipeLine
from pipeline.component import DataTransform
from pipeline.component import HeteroNN
from pipeline.component import Intersection
from pipeline.component import Reader
from pipeline.component import Evaluation
from pipeline.interface import Data
from pipeline.utils.tools import load_job_config
from pipeline.interface import Model

from pipeline import fate_torch_hook
import torch as t
from torch import nn
from torch.nn import init
from torch import optim
from pipeline import fate_torch as ft

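# This is important: it modifies torch modules so that Sequential models can be parsed by pipeline
fate_torch_hook(t)

# Assumed values, not given in the original snippet: the party IDs reuse those of the
# first example, and an empty namespace suffix yields the "experiment" namespace
guest = 9999
host = 10000
namespace = ""

guest_train_data = {"name": "ml_hetero_guest", "namespace": f"experiment{namespace}"}
host_train_data = {"name": "ml_hetero_host", "namespace": f"experiment{namespace}"}

pipeline = PipeLine().set_initiator(role='guest', party_id=guest).set_roles(guest=guest, host=host)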

reader_0 = Reader(name="reader_0")
reader_0.get_party_instance(role='guest', party_id=guest).component_param(table=guest_train_data)
reader_0.get_party_instance(role='host', party_id=host).component_param(table=host_train_data)

data_transform_0 = DataTransform(name="data_transform_0")
data_transform_0.get_party_instance(role='guest', party_id=guest).component_param(with_label=True,
                                                                                  label_name='label')
data_transform_0.get_party_instance(role='host', party_id=host).component_param(with_label=False)

intersection_0 = Intersection(name="intersection_0")

The key step is building the structures of the guest and host NN components. Because Hetero-NN currently only supports parsing PyTorch Sequential models, we need to build the Embedding layers inside a single Sequential.

PyTorch embedding layers take Long type as input, so the first layer of the Sequential is the fate-torch operation layer `Astype`, which casts the input to Long type. The interaction records contain 610 users and 9742 items, and they are fed into the embedding layer in pairs, for example ([0, 612]). We therefore use a single Embedding layer that covers both users and items (610 + 9742 embeddings; the code below allocates one extra row), and item indices are offset by 610 in the dataset. The third layer of the Sequential is a fate-torch operation that reshapes the output of the embedding layer so that the user and item embeddings are concatenated. Finally, the concatenated embedding is fed into a linear layer followed by a ReLU activation.

In [15]:
# define embedding model
u_embd_count = 610  # idx 0 - 609 user embed
i_embd_count = 9742  # idx 611 - 10351 item embed

# guest model handle user-movie interaction pairs, user and movie share same Embedding layers
guest_model = t.nn.Sequential(
    ft.operation.Astype('int64'),  # cast to long
    t.nn.Embedding(u_embd_count + i_embd_count + 1, embedding_dim=8),
    ft.operation.Reshape((-1, 16)),  # fate_torch operation that concatenates u-i embeddings
    t.nn.Linear(16, 8),
    t.nn.ReLU()
)

# test guest_bottom model
test_u_i = t.Tensor([[0, 639]])
test_out = guest_model(test_u_i)
print(test_out)

tensor([[1.5611, 0.0000, 1.5532, 1.7774, 0.0394, 0.0000, 0.0000, 0.4997]],
       grad_fn=<ReluBackward0>)
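Because users and items share one embedding table, each item index has to be shifted past the user rows before lookup. The sketch below shows that offsetting in plain Python. `encode_pair` is a hypothetical helper, and it assumes item indices are zero-based before the offset is applied; the tutorial states only that item indices are offset by 610 in the dataset.

```python
U_EMBD_COUNT = 610  # number of users, taken from the tutorial


def encode_pair(user_idx, item_idx, offset=U_EMBD_COUNT):
    """Map a (user, item) pair to two row indices in one shared embedding table."""
    if not 0 <= user_idx < offset:
        raise ValueError("user index out of range")
    if item_idx < 0:
        raise ValueError("item index must be non-negative")
    # Items are stored after the user rows, so shift them by the user count
    return [user_idx, item_idx + offset]


print(encode_pair(0, 2))  # [0, 612]
```

With 8-dimensional embeddings, looking up both indices and concatenating the two vectors gives the width-16 input that the `Linear(16, 8)` layer expects.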


The guest top model is a Sequential that contains a ReLU, one linear layer, and a sigmoid activation, similar to the previous example.

In [16]:
# guest top model outputs score between 0-1
guest_top_model = t.nn.Sequential(
    t.nn.ReLU(),
    t.nn.Linear(4, 1),
    t.nn.Sigmoid()
)

The host model holds categorical movie features and transforms them into a dense feature using an Embedding layer. The host side has 21 movie genres and uses 0 as the padding index. We sum all embeddings of a movie to generate one dense feature.

In [20]:
host_model = t.nn.Sequential(
    ft.operation.Astype('int64'),  # cast to long
    t.nn.Embedding(21, embedding_dim=16, padding_idx=0),  # use 0 as padding index
    ft.operation.Sum(dim=1),  # operation that sum all categorical embeddings
    t.nn.Linear(16, 8),
    t.nn.ReLU()
)

# test categorical features
feat = t.Tensor([[1, 12, 13, 14, 15, 0, 0, 0, 0, 0]])
print(host_model(feat))

tensor([[2.4943, 6.1790, 3.8077, 5.0657, 1.8655, 0.0000, 0.0000, 0.0000]],
       grad_fn=<ReluBackward0>)
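The test row above lists five genre indices followed by zeros. In `torch.nn.Embedding`, the row at `padding_idx` defaults to all zeros and receives no gradient updates, so padded positions add nothing to the sum. A minimal sketch of building such fixed-length rows follows. `pad_genres` is a hypothetical helper, and the row length of 10 is taken from the test tensor rather than from any stated dataset property.

```python
def pad_genres(genre_ids, length=10, pad_idx=0):
    """Right-pad a list of genre indices to a fixed row length."""
    if len(genre_ids) > length:
        raise ValueError("too many genres for the fixed row length")
    return list(genre_ids) + [pad_idx] * (length - len(genre_ids))


print(pad_genres([1, 12, 13, 14, 15]))  # [1, 12, 13, 14, 15, 0, 0, 0, 0, 0]
```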


The interactive layer takes the sum of the guest and host bottom model outputs (the embeddings) as input.

In [18]:
interactive_layer = t.nn.Linear(8, 4)

Then finally we can add these models into Hetero-NN, and start training a recommendation model.

In [None]:
# loss function
ce_loss_fn = nn.BCELoss()

# optimizer, after fate torch hook optimizer can be created without parameters
opt: ft.optim.Adam = optim.Adam(lr=0.01)

hetero_nn_0 = HeteroNN(name="hetero_nn_0", epochs=1, floating_point_precision=None,
                       interactive_layer_lr=0.01, batch_size=4096, early_stop="diff")

guest_nn_0 = hetero_nn_0.get_party_instance(role='guest', party_id=guest)
guest_nn_0.add_bottom_model(guest_model)
guest_nn_0.add_top_model(guest_top_model)
guest_nn_0.set_interactve_layer(interactive_layer)
host_nn_0 = hetero_nn_0.get_party_instance(role='host', party_id=host)

host_nn_0.add_bottom_model(host_model)
# compile model with torch optimizer
hetero_nn_0.compile(opt, loss=ce_loss_fn)

hetero_nn_1 = HeteroNN(name="hetero_nn_1")
evaluation_0 = Evaluation(name="evaluation_0")

pipeline.add_component(reader_0)
pipeline.add_component(data_transform_0, data=Data(data=reader_0.output.data))
pipeline.add_component(intersection_0, data=Data(data=data_transform_0.output.data))
pipeline.add_component(hetero_nn_0, data=Data(train_data=intersection_0.output.data))
pipeline.add_component(hetero_nn_1, data=Data(test_data=intersection_0.output.data),
                       model=Model(model=hetero_nn_0.output.model))
pipeline.add_component(evaluation_0, data=Data(data=hetero_nn_0.output.data))
pipeline.compile()
pipeline.fit()

### Some examples of Pytorch and Fate-Torch

To make the PyTorch backend more convenient, fate-torch offers some operation layers that let users work with more complex layers, such as LSTM. Here are some examples.

#### LSTM usage

In [23]:
import torch as t
from pipeline import fate_torch as ft
from pipeline import fate_torch_hook

fate_torch_hook(t)

<module 'torch' from '/data/projects/fate/common/python/venv/lib/python3.8/site-packages/torch/__init__.py'>

We create a fake word sequence:

In [74]:
fake_word_seq = t.randint(1, 32, (64, 5)) 

Here is an example of using LSTM in a Sequential. We first cast the input type to int64 (long), feed the sequence into an Embedding layer and an LSTM layer, and then sum all word embeddings in the LSTM output sequence to produce a dense output. We use ft.operation.Index(0) to pick `outputs` from the tuple (outputs, (h_n, c_n)) returned by the LSTM.

In [75]:
seq_with_lstm = t.nn.Sequential(
    ft.operation.Astype('int64'),
    t.nn.Embedding(32, embedding_dim=16, padding_idx=0),
    t.nn.LSTM(input_size=16, hidden_size=16, num_layers=3, batch_first=True),
    ft.operation.Index(0),
    ft.operation.Sum(dim=1)
)

print(seq_with_lstm(fake_word_seq))

tensor([[4.7123, 4.7123, 4.7180,  ..., 4.7159, 4.7142, 4.7190],
        [4.6503, 4.6536, 4.6853,  ..., 4.6505, 4.6576, 4.7024],
        [4.6771, 4.6778, 4.7010,  ..., 4.6801, 4.6823, 4.7104],
        ...,
        [4.6044, 4.5972, 4.6423,  ..., 4.5680, 4.5999, 4.6759],
        [4.7167, 4.7167, 4.7194,  ..., 4.7186, 4.7177, 4.7197],
        [4.5361, 4.5358, 4.5816,  ..., 4.4659, 4.5261, 4.6402]],
       grad_fn=<SumBackward1>)


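In this example we select the last embedding of the output sequence. For the `fake_word_seq` defined above, the LSTM output has shape (64, 5, 16), that is (batch size, sequence length, hidden size). We pick the last embedding along the sequence dimension using Select, which leaves a tensor of shape (64, 16).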

In [77]:
lstm_module = t.nn.LSTM(input_size=16, hidden_size=16, num_layers=3, batch_first=True)
embd = t.nn.Embedding(32, embedding_dim=16, padding_idx=0)
t.nn.init.normal_(embd)
t.nn.init.xavier_normal_(lstm_module)

seq_with_lstm = t.nn.Sequential(
    ft.operation.Astype('int64'),
    embd, 
    lstm_module,
    ft.operation.Index(0),
    ft.operation.Select(dim=1, idx=-1), # dim 1, index last
)

out = seq_with_lstm(fake_word_seq)
print(out.shape)

torch.Size([64, 16])
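The two LSTM examples differ only in how they collapse the sequence axis: `ft.operation.Sum(dim=1)` adds up all time steps, while `ft.operation.Select(dim=1, idx=-1)` keeps only the last one. The sketch below mimics both reductions on a tiny nested list so the difference is easy to inspect. The helper functions are hypothetical, operate on plain Python lists, and do not call fate_torch.

```python
# A tiny (batch=1, seq_len=3, hidden=2) stand-in for an LSTM output
outputs = [[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]]


def sum_over_steps(batch):
    """Collapse the sequence axis by summing, like a sum over dim=1."""
    return [[sum(step[h] for step in seq) for h in range(len(seq[0]))] for seq in batch]


def select_last_step(batch):
    """Keep only the final time step, like selecting index -1 on dim=1."""
    return [seq[-1] for seq in batch]


print(sum_over_steps(outputs))    # [[9.0, 12.0]]
print(select_last_step(outputs))  # [[5.0, 6.0]]
```

Both reductions turn a (batch, seq_len, hidden) input into a (batch, hidden) result, which is why either can feed a following linear layer of the same width.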


Layers offered in the PyTorch nn modules are supported. Use them to build Hetero-NN structures in a Sequential.

For more demos on using pipeline to submit jobs, please refer to [pipeline demos](https://github.com/FederatedAI/FATE/tree/master/examples/pipeline/demo).