## Pipeline Upload Data Tutorial 

### Install

`Pipeline` is distributed along with [fate_client](https://pypi.org/project/fate-client/).

```bash
pip install fate_client
```

To use Pipeline, we first need to specify which `FATE Flow Service` to connect to. Once `fate_client` is installed, a command-line entry point named `pipeline` becomes available:

In [37]:
!pipeline --help

Usage: pipeline [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  config  pipeline config tool
  init    - DESCRIPTION: Pipeline Config Command.


Assume we have a `FATE Flow Service` running at 127.0.0.1:9380 (the default in standalone mode), then execute:

In [38]:
!pipeline init --ip 127.0.0.1 --port 9380

Pipeline configuration succeeded.
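If `pipeline init` succeeds but later jobs fail to submit, it can help to confirm that something is actually listening on the configured address. The following is a minimal sketch using only the Python standard library; the helper name `flow_service_reachable` is hypothetical and not part of `fate_client`, and it only checks that the TCP port accepts connections, not that the listener is FATE Flow.

```python
import socket


def flow_service_reachable(ip: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to ip:port can be opened.

    Hypothetical helper: it only proves that the port accepts connections,
    not that the service behind it is FATE Flow.
    """
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 127.0.0.1:9380 is the standalone default mentioned above.
    print(flow_service_reachable("127.0.0.1", 9380))
```

A `False` result usually means the service is not running or the IP and port passed to `pipeline init` need adjusting.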


### Upload data

Before starting a modeling task, the data to be used must be uploaded.
A party is typically a cluster with multiple nodes, so when the data is uploaded it is distributed across those nodes.

In [39]:
from pipeline.backend.pipeline import PipeLine

Make a `pipeline` instance:

    - initiator: 
        * role: guest
        * party: 9999
    - roles:
        * guest: 9999

Note that only the local party ID is needed.
    

In [40]:
pipeline_upload = PipeLine().set_initiator(role='guest', party_id=9999).set_roles(guest=9999)

Define the number of partitions for data storage:

In [41]:
partition = 4
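
To build intuition for what a partition count means, the sketch below spreads record keys over `partition` buckets with a stable hash. This is an illustration only: the function `assign_partition` and the sample IDs are hypothetical, and the actual placement strategy depends on the storage engine behind FATE and may differ.

```python
import zlib
from collections import Counter


def assign_partition(key: str, num_partitions: int) -> int:
    """Map a record key to a partition index with a stable hash.

    Illustrative only; not the scheme used by FATE's storage engines.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


partition = 4
sample_ids = [str(i) for i in range(1000)]  # hypothetical record IDs

# Count how many records land in each partition.
counts = Counter(assign_partition(k, partition) for k in sample_ids)
print(sorted(counts.items()))
```

More partitions generally allow more parallelism across nodes, at the cost of more per-partition overhead for small datasets.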

Define the table names and namespace, which will be used in the FATE job configuration:

In [42]:
dense_data_guest = {"name": "breast_hetero_guest", "namespace": "experiment"}
dense_data_host = {"name": "breast_hetero_host", "namespace": "experiment"}
# Note: tag_data uses the same name and namespace as dense_data_host
tag_data = {"name": "breast_hetero_host", "namespace": "experiment"}
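
Since a table is referred to by its name and namespace together, two dictionaries that carry the same pair point at the same table. The sketch below flags such cases before anything is uploaded; `find_shared_tables` is a hypothetical helper, not part of the Pipeline API.

```python
from collections import defaultdict


def find_shared_tables(tables: dict) -> dict:
    """Group labels by (namespace, name) and keep groups with more than one label."""
    groups = defaultdict(list)
    for label, info in tables.items():
        groups[(info["namespace"], info["name"])].append(label)
    return {key: labels for key, labels in groups.items() if len(labels) > 1}


# The three table descriptors defined in the cell above.
tables = {
    "dense_data_guest": {"name": "breast_hetero_guest", "namespace": "experiment"},
    "dense_data_host": {"name": "breast_hetero_host", "namespace": "experiment"},
    "tag_data": {"name": "breast_hetero_host", "namespace": "experiment"},
}

print(find_shared_tables(tables))
# {('experiment', 'breast_hetero_host'): ['dense_data_host', 'tag_data']}
```

Here `dense_data_host` and `tag_data` resolve to the same table, so the later upload would target the table written by the earlier one.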

Now, we add data to be uploaded

In [43]:
import os

# Root directory that contains the examples/ folder; adjust to your deployment
data_base = "/workspace/FATE/"
pipeline_upload.add_upload_data(file=os.path.join(data_base, "examples/data/breast_hetero_guest.csv"),
                                table_name=dense_data_guest["name"],             # table name
                                namespace=dense_data_guest["namespace"],         # namespace
                                head=1, partition=partition)               # data info

pipeline_upload.add_upload_data(file=os.path.join(data_base, "examples/data/breast_hetero_host.csv"),
                                table_name=dense_data_host["name"],
                                namespace=dense_data_host["namespace"],
                                head=1, partition=partition)

pipeline_upload.add_upload_data(file=os.path.join(data_base, "examples/data/breast_hetero_host.csv"),
                                table_name=tag_data["name"],
                                namespace=tag_data["namespace"],
                                head=1, partition=partition)
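
Because the upload calls above pass `head=1`, which is understood here to mean that each file starts with a header row, it can be worth checking the files locally before submitting a job. The helper below is a hypothetical sketch built on the standard `csv` module and is not part of the Pipeline API.

```python
import csv
import os


def read_csv_header(path: str) -> list:
    """Return the first row of a CSV file, raising if the file is missing or empty."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"data file not found: {path}")
    with open(path, newline="") as f:
        first_row = next(csv.reader(f), None)
    if not first_row:
        raise ValueError(f"data file is empty: {path}")
    return first_row


# Example usage (path assumed from the cell above):
# header = read_csv_header(os.path.join(data_base, "examples/data/breast_hetero_guest.csv"))
# print(header)
```

Catching a wrong path or an empty file at this point is cheaper than waiting for the upload job to fail.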

We can then upload the data. Passing `drop=1` asks FATE to overwrite any existing table with the same name and namespace:

In [44]:
pipeline_upload.upload(drop=1)

 UPLOADING:||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.00%


2021-11-15 08:08:32.438 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:123 - Job id is 202111150808306898680
2021-11-15 08:08:32.461 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:144 - Job is still waiting, time elapse: 0:00:00
2021-11-15 08:08:32.985 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:144 - Job is still waiting, time elapse: 0:00:00
2021-11-15 08:08:34.149 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:177 - Running component upload_0, time elapse: 0:00:01
2021-11-15 08:08:34.682 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:177 - Running component upload_0, 

 UPLOADING:||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.00%


2021-11-15 08:08:42.573 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:123 - Job id is 202111150808411456480
2021-11-15 08:08:42.591 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:144 - Job is still waiting, time elapse: 0:00:00
2021-11-15 08:08:43.106 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:144 - Job is still waiting, time elapse: 0:00:00
2021-11-15 08:08:43.630 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:144 - Job is still waiting, time elapse: 0:00:01
2021-11-15 08:08:44.144 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:144 - Job is still waiting, time elapse: 0:00

 UPLOADING:||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.00%


2021-11-15 08:08:56.191 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:123 - Job id is 202111150808546337770
2021-11-15 08:08:56.203 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:144 - Job is still waiting, time elapse: 0:00:00
2021-11-15 08:08:56.719 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:144 - Job is still waiting, time elapse: 0:00:00
2021-11-15 08:08:57.234 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:144 - Job is still waiting, time elapse: 0:00:01
2021-11-15 08:08:57.769 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:144 - Job is still waiting, time elapse: 0:00
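
Each upload in the output above reports its own job ID. If the log text has been captured (for example, copied from the notebook output), the IDs can be pulled out with a regular expression for later lookup. This is a small sketch under that assumption; `extract_job_ids` is a hypothetical helper, and the pattern relies only on the `Job id is <digits>` wording visible in the log.

```python
import re

# Matches the "Job id is <digits>" phrase seen in the upload log.
JOB_ID_PATTERN = re.compile(r"Job id is (\d+)")


def extract_job_ids(log_text: str) -> list:
    """Return all job IDs found in the log text, in order of appearance."""
    return JOB_ID_PATTERN.findall(log_text)


# Lines taken from the output above.
sample_log = """\
2021-11-15 08:08:32.438 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:123 - Job id is 202111150808306898680
2021-11-15 08:08:32.461 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:144 - Job is still waiting, time elapse: 0:00:00
2021-11-15 08:08:42.573 | INFO     | pipeline.utils.invoker.job_submitter:monitor_job_status:123 - Job id is 202111150808411456480
"""

print(extract_job_ids(sample_log))
# ['202111150808306898680', '202111150808411456480']
```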

For more demos on using Pipeline to submit jobs, please refer to the [pipeline demos](https://github.com/FederatedAI/FATE/tree/master/examples/pipeline/demo).