# Homo NN Standalone Quick Start

In this version, Homo-NN in the pipeline adds support for PyTorch. You can follow the usual PyTorch workflow: define a Sequential model from torch's built-in layers, then submit the model.

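Below is a basic Homo-NN binary classification task. Two clients take part, with party ids 10000 and 9999, and party 10000 is also designated as the server that aggregates the models.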

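Usage is the same as for other FATE algorithms: FATE's built-in reader and transformer interfaces read the tabular data and feed it into the algorithm component, which then trains with the model, optimizer, and loss function you define and aggregates the models.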

## Upload CSV data to FATE

In [1]:
from pipeline.backend.pipeline import PipeLine  # the PipeLine class

host_0 = 10000
host_1 = 9999
pipeline_upload = PipeLine().set_initiator(role='host', party_id=host_0).set_roles(host=[host_0, host_1],
                                                                            arbiter=[host_0])

partition = 4

# Upload one dataset
data = {"name": "breast_homo_host", "namespace": "experiment"}
pipeline_upload.add_upload_data(file="./examples/data/breast_homo_host.csv",  # path relative to the project root
                                table_name=data["name"],             # table name
                                namespace=data["namespace"],         # namespace
                                head=1, partition=partition)         # data info

pipeline_upload.upload(drop=1)

 UPLOADING:||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.00%

2022-11-11 12:19:42.829 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:83 - Job id is 202211111219416095940
2022-11-11 12:19:42.841 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:98 - Job is still waiting, time elapse: 0:00:00
2022-11-11 12:19:45.304 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:125 -
2022-11-11 12:19:45.305 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:127 - Running component upload_0, time elapse: 0:00:02
2022-11-11 12:19:46.334 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:127 - Running component upload_0, time elapse: 0:00:03
2022-11-11 12:19:47.377 | I

## Write and run the Pipeline script

### Import the required components

In [2]:
# torch
import torch as t
from torch import nn

# pipeline
from pipeline.component.homo_nn import HomoNN, TrainerParam  # HomoNN component and trainer parameters
from pipeline.backend.pipeline import PipeLine  # the PipeLine class
from pipeline.component import Reader, DataTransform, Evaluation  # data I/O and evaluation
from pipeline.interface import Data  # Data interface, used for data I/O

### fate torch hook

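Be sure to run the fate_torch_hook function below. It modifies some torch classes so that the torch neural network, optimizer, and loss function you define can be parsed and submitted by the pipeline.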

In [3]:
from pipeline import fate_torch_hook
fate_torch_hook(t)

<module 'torch' from '/home/cwj/standalone_fate_install_1.9.0_release/env/python/venv/lib/python3.8/site-packages/torch/__init__.py'>

### Pipeline script

In [9]:
# Create the pipeline
host_0 = 10000
host_1 = 9999
pipeline = PipeLine().set_initiator(role='host', party_id=host_0).set_roles(host=[host_0, host_1],
                                                                            arbiter=[host_0])

# Specify the uploaded tables that each party reads
train_data_0 = {"name": "breast_homo_host", "namespace": "experiment"}
train_data_1 = {"name": "breast_homo_host", "namespace": "experiment"}
reader_0 = Reader(name="reader_0")
reader_0.get_party_instance(role='host', party_id=host_0).component_param(table=train_data_0)
reader_0.get_party_instance(role='host', party_id=host_1).component_param(table=train_data_1)

# The transform component converts the uploaded data into FATE's standard format
data_transform_0 = DataTransform(name='data_transform_0')
data_transform_0.get_party_instance(
    role='host', party_id=host_0).component_param(
    with_label=True, output_format="dense")
data_transform_0.get_party_instance(
    role='host', party_id=host_1).component_param(
    with_label=True, output_format="dense")

"""
Define the model
"""
# Same as using torch Sequential locally
model = nn.Sequential(
    nn.Linear(30, 1),
    nn.Sigmoid()
)
loss = nn.BCELoss()
optimizer = t.optim.Adam(model.parameters(), lr=0.01)


"""
Create the component
"""
nn_component = HomoNN(name='nn_0',
                      model=model,
                      loss=loss,
                      optimizer=optimizer,
                      # TrainerParam passes parameters to fedavg_trainer; see below for details on the Trainer
                      trainer=TrainerParam(trainer_name='fedavg_trainer', epochs=20, batch_size=128, validation_freqs=1),
                      torch_seed=100  # global random seed
                      )

# Add the components and define the data I/O between them
pipeline.add_component(reader_0)
pipeline.add_component(data_transform_0, data=Data(data=reader_0.output.data))
pipeline.add_component(nn_component, data=Data(train_data=data_transform_0.output.data))
pipeline.add_component(Evaluation(name='eval_0'), data=Data(data=nn_component.output.data))

pipeline.compile()
pipeline.fit()

<pipeline.backend.pipeline.PipeLine at 0x7f3ba836c130>
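
As a rough illustration of what the model in the script above computes, the following pure-Python sketch reproduces a 30-input linear layer followed by a sigmoid, plus the binary cross-entropy loss for a single sample. The zero-valued weights and the all-ones sample are hypothetical values chosen only to make the arithmetic easy to follow; torch initializes the weights randomly, and `nn.BCELoss` averages the loss over a batch by default.

```python
import math


def forward(weights, bias, features):
    """Linear layer followed by a sigmoid: p = sigmoid(w . x + b)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def bce_loss(prob, label):
    """Binary cross-entropy for one sample with a 0/1 label."""
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))


# Hypothetical values for illustration only (not what torch would initialize).
weights = [0.0] * 30
bias = 0.0
sample = [1.0] * 30

prob = forward(weights, bias, sample)   # sigmoid(0) = 0.5
loss = bce_loss(prob, 1.0)              # -log(0.5) = log(2)
print(prob, loss)
```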

## Get the model's prediction results

In [5]:
pipeline.get_component('nn_0').get_output_data()

| | id | label | predict_result | predict_score | predict_detail | type |
|---|---|---|---|---|---|---|
| 0 | 1 | 0.0 | 0 | 0.02906397357583046 | {'0': 0.9709360264241695, '1': 0.0290639735758... | train |
| 1 | 4 | 1.0 | 1 | 0.9970678687095642 | {'0': 0.002932131290435791, '1': 0.99706786870... | train |
| 2 | 8 | 1.0 | 1 | 0.937065839767456 | {'0': 0.06293416023254395, '1': 0.937065839767... | train |
| 3 | 9 | 0.0 | 0 | 0.059680141508579254 | {'0': 0.9403198584914207, '1': 0.0596801415085... | train |
| 4 | 11 | 1.0 | 1 | 0.9879087805747986 | {'0': 0.012091219425201416, '1': 0.98790878057... | train |
| ... | ... | ... | ... | ... | ... | ... |
| 223 | 560 | 0.0 | 0 | 0.015546107664704323 | {'0': 0.9844538923352957, '1': 0.0155461076647... | train |
| 224 | 561 | 1.0 | 1 | 0.7906540632247925 | {'0': 0.20934593677520752, '1': 0.790654063224... | train |
| 225 | 563 | 1.0 | 1 | 0.9084046483039856 | {'0': 0.0915953516960144, '1': 0.9084046483039... | train |
| 226 | 565 | 1.0 | 1 | 0.9931216835975647 | {'0': 0.006878316402435303, '1': 0.99312168359... | train |
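
To help read the output columns, here is a small sketch of how `predict_result` and `predict_detail` appear to relate to `predict_score` in the rows above: the detail holds the probability of each class, and the result is the more likely class. The 0.5 threshold and the helper name `to_prediction` are assumptions made for illustration, not something taken from the FATE source.

```python
def to_prediction(score, threshold=0.5):
    """Turn a positive-class probability into (predict_result, predict_detail).

    Assumption: class 1 is predicted when the score exceeds the threshold.
    """
    detail = {"0": 1.0 - score, "1": score}
    result = 1 if score > threshold else 0
    return result, detail


# Scores copied from the rows with id 8 and id 9 above.
result_pos, detail_pos = to_prediction(0.937065839767456)
result_neg, detail_neg = to_prediction(0.059680141508579254)
print(result_pos, detail_pos)
print(result_neg, detail_neg)
```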


## TrainerParam: trainer parameters and the trainer

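In this version, both the training logic and the federated aggregation logic of Homo-NN are implemented in the Trainer class. FATE ships with a fedavg_trainer that implements the standard FedAvg algorithm and, by default, aggregates the parties' models at every epoch. TrainerParam serves two purposes: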

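- trainer_name='{module name}' selects the trainer to use. Trainers live in the federatedml.nn.homo.trainer directory, so you can also write your own; a dedicated chapter covers custom trainers.
- All remaining parameters are passed to the trainer's \_\_init\_\_() method.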
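
The two points above can be pictured with a toy sketch: a name selects a trainer class, and every other keyword argument is forwarded to its constructor. Everything here (`ToyTrainer`, `TRAINERS`, `build_trainer`) is hypothetical and stands in for FATE's real lookup, which resolves the trainer by module name under federatedml.nn.homo.trainer; this is not FATE code.

```python
class ToyTrainer:
    """Hypothetical stand-in for a trainer class; not part of FATE."""

    def __init__(self, epochs=10, batch_size=512, validation_freqs=None):
        self.epochs = epochs
        self.batch_size = batch_size
        self.validation_freqs = validation_freqs


# Hypothetical registry; FATE locates trainers by module name instead.
TRAINERS = {"toy_trainer": ToyTrainer}


def build_trainer(trainer_name, **kwargs):
    """Pick the trainer class by name and forward the remaining parameters."""
    trainer_cls = TRAINERS[trainer_name]
    return trainer_cls(**kwargs)


trainer = build_trainer("toy_trainer", epochs=20, batch_size=128, validation_freqs=1)
print(trainer.epochs, trainer.batch_size, trainer.validation_freqs)
```

A parameter name that the trainer's constructor does not accept raises a `TypeError` in this sketch, which is one reason to check the trainer's documentation before submitting a task.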

Let's take a look at the fedavg_trainer that ships with FATE.

In [6]:
from federatedml.nn.homo.trainer.fedavg_trainer import FedAVGTrainer

Check the FedAVGTrainer documentation to see the available parameters. When you submit a task, all of them can be passed through TrainerParam.

In [7]:
print(FedAVGTrainer.__doc__)



    Parameters
    ----------
    epochs: int >0, epochs to train
    batch_size: int, -1 means full batch
    early_stop: None, 'diff' or 'abs'. if None, disable early stop; if 'diff', use the loss difference between
                two epochs as early stop condition, if differences < eps, stop training ; if 'abs', if loss < eps,
                stop training
    eps: float, eps value for early stop
    secure_aggregate: bool, default is True, whether to use secure aggregation. if enabled, will add random number
                            mask to local models. These random number masks will eventually cancel out to get 0.
    weighted_aggregation: bool, whether add weight to each local model when doing aggregation.
                         if True, According to origin paper, weight of a client is: n_local / n_global, where n_local
                         is the sample number locally and n_global is the sample number of all clients.
                         if False, simply averagi
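
The `weighted_aggregation` option described in the docstring can be sketched in a few lines of plain Python. With weighting on, each client's parameters are scaled by n_local / n_global before summing. The unweighted branch below uses a plain mean, which is an assumption because the docstring output is cut off at that point. The function name `fedavg` and the flat-list model representation are illustrative only; the real trainer works on torch tensors and can add secure-aggregation masks.

```python
def fedavg(local_models, sample_counts, weighted=True):
    """Average client parameter vectors, optionally weighted by sample count."""
    n_global = sum(sample_counts)
    if weighted:
        # Weight of a client: n_local / n_global, as stated in the docstring.
        weights = [n / n_global for n in sample_counts]
    else:
        # Assumption: equal weight for every client.
        weights = [1.0 / len(local_models)] * len(local_models)
    n_params = len(local_models[0])
    return [
        sum(w * model[i] for w, model in zip(weights, local_models))
        for i in range(n_params)
    ]


client_a = [1.0, 2.0]   # parameters from a client with 300 samples
client_b = [3.0, 6.0]   # parameters from a client with 100 samples

weighted_avg = fedavg([client_a, client_b], [300, 100])
plain_avg = fedavg([client_a, client_b], [300, 100], weighted=False)
print(weighted_avg)  # [1.5, 3.0]
print(plain_avg)     # [2.0, 4.0]
```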

You can also refer to the interface code:

In [None]:
class FedAVGTrainer(TrainerBase):

    def __init__(self, epochs=10, batch_size=512,  # training parameter
                 early_stop=None, eps=0.0001,  # early stop parameters
                 secure_aggregate=True, weighted_aggregation=True, aggregate_every_n_epoch=None,  # federation
                 cuda=False, pin_memory=True, shuffle=True, data_loader_worker=0,  # GPU dataloader
                 validation_freqs=None,  # validation configuration
                 checkpoint_save_freqs=None,
                 task_type='auto'
                 ):
        ...
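
The `early_stop` and `eps` parameters in this signature follow the rule given in the docstring: 'diff' compares the loss of two consecutive epochs, and 'abs' compares the latest loss with `eps`. A minimal sketch of that rule follows. The helper `should_stop` is hypothetical, and taking the absolute value of the difference is an assumption, since the docstring only says "differences < eps".

```python
def should_stop(losses, early_stop=None, eps=0.0001):
    """Decide whether to stop training, given the per-epoch loss history."""
    if early_stop is None or not losses:
        return False
    if early_stop == "abs":
        # Stop once the latest loss itself drops below eps.
        return losses[-1] < eps
    if early_stop == "diff":
        # Stop once the loss changes by less than eps between two epochs.
        return len(losses) >= 2 and abs(losses[-2] - losses[-1]) < eps
    raise ValueError(f"unknown early_stop mode: {early_stop!r}")


print(should_stop([0.5, 0.4], early_stop="diff"))       # False: loss still moving
print(should_stop([0.5, 0.49995], early_stop="diff"))   # True: change below eps
print(should_stop([0.00005], early_stop="abs"))         # True: loss below eps
```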

## Parameters of the Homo NN component

In [None]:
print(HomoNN.__doc__)

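By now we have a basic understanding of Homo-NN and can use it for some basic modeling tasks. Homo-NN also supports customizing the model, the dataset, and the training process; see the other documents that follow.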