# Syft Duet for Federated Learning - Data Owner (Portuguese Bank)

## Setup

Every other syft project in this repo uses syft 0.2.9, but this notebook needs syft 0.3.0. The 0.3.0 release removed many of the older features and replaced them with the new 'Duet' functionality. To switch versions, open a terminal, cd into the repo directory and run:

> pip uninstall syft

Then confirm with 'y' and hit enter.

> pip install syft==0.3.0

NOTE: If you want to run any of the other projects in this repo, uninstall syft 0.3.0 and reinstall syft 0.2.9 first. Unfortunately, when PySyft updated from 0.2.9 to 0.3.0 it removed all of the FL, DP, and HE functionality that had previously been implemented.
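Because the two versions are easy to mix up, it can help to check the installed version programmatically before going further. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `has_expected_version` are illustrative and are not part of syft.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def has_expected_version(package, expected):
    """True only when `package` is installed at exactly the `expected` version."""
    return installed_version(package) == expected


if __name__ == "__main__":
    found = installed_version("syft")
    if has_expected_version("syft", "0.3.0"):
        print("syft 0.3.0 is installed.")
    else:
        # Covers both a missing install (None) and a different version such as 0.2.9
        print(f"Expected syft 0.3.0 but found {found!r}; reinstall before continuing.")
```

Running this before the Duet cells gives an early, readable warning instead of a failure later in the notebook.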

In [10]:
# Double check you are using syft 0.3.0 not 0.2.9
# !pip show syft

In [11]:
import syft as sy
import torch as th
import pandas as pd

## Initialising Duet

Each bank goes through this same initialisation step. Run the code cell below; it should produce a Syft logo and some information. The important part of that output is these lines of code:
```python
import syft as sy
duet = sy.duet("xxxxxxxxxxxxxxxxxxxxxxxxxxx")
```
Here the x's stand for some combination of letters and numbers. Take this key and paste it into the respective bank's Duet code in the central aggregator; the central aggregator notebook explains this step in detail. In essence, each bank generates a server and a key, then sends the key to the aggregator to give it access to this joint, secure server process.

Once you have run the key in the code on the aggregator side, the aggregator will return a similar key and ask you to enter it on this side. The Syft logo/information output in this notebook contains a box for that key. Enter it and hit enter, and the connection for this bank should be established.

In [12]:
# We now run the initialisation of the duet
# Note: this can be run with a specified network if required
# For example, if you don't trust the network provided by pysyft
# to not look at the data

duet = sy.duet()
sy.logging(file_path="./syft_do.log")

AttributeError: module 'syft' has no attribute 'duet'

>If the connection is established then there should be a green message above saying 'CONNECTED!'. Similarly, there should also be a Live Status indicating the number of objects, requests, and messages on the duet.

## Import Portuguese Bank Data

In [None]:
data = pd.read_csv('datasets/portuguese-bank-data.csv', sep = ',')
target = pd.read_csv('datasets/portuguese-bank-target.csv', sep = ',')
data.head()

In [None]:
data = th.tensor(data.values).float()
data

In [None]:
target = th.tensor(target.values).float()
target

In [None]:
from sklearn.preprocessing import StandardScaler

sc_X = StandardScaler()
data = sc_X.fit_transform(data)
data = th.tensor(data).float()
data
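The scaling step above standardises each feature column to zero mean and unit variance. As a rough illustration of that arithmetic, here is a pure-Python sketch on a toy table. It assumes the default `StandardScaler` behaviour of dividing by the population standard deviation and leaving constant columns unscaled; `standardize_columns` and the toy values are made up for this example and are not taken from the bank dataset.

```python
from statistics import fmean, pstdev


def standardize_columns(rows):
    """Scale each column of a list-of-rows table to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # Population standard deviation (divide by n); a constant column falls back to 1.0
    scales = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, scales)]
        for row in rows
    ]


# Toy table: the first column varies, the second is constant
toy = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
scaled = standardize_columns(toy)
```

After scaling, the first column has mean 0 and standard deviation 1, while the constant second column becomes all zeros.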

## Label and Send Data to Server

Here we tag and label this bank's data. Although we are sending the data, this does not mean that it is accessible by the central aggregator. We are sending the data to a trusted network server, which is why we can specify our own network when establishing the duet in case we don't trust the default one. This network should reside in the country of the data, more specifically within the bank's own network, therefore adhering to all regulations where necessary.

In [None]:
data = data.tag("data")
data.describe("Portuguese Bank Training Data")
target = target.tag("target")
target.describe("Portuguese Bank Training Target")

# Once we have sent the data we are left with a pointer to the data
data_ptr = data.send(duet, searchable=True)
target_ptr = target.send(duet, searchable=True)

In [None]:
# Detail what is stored 
duet.store.pandas

>NOTE: Although the data has been sent to this 'store', the other end of the connection cannot access or see the data without requesting it from you. However, because we sent the data from this side, we can retrieve it whenever we want without requesting permission. Simply run the following code:

```python
duet.store["tag"].get()
```

>Replace 'tag' with the tag of the data you wish to get. Once you run this, the data will be removed from the store and brought back locally here.
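To make the "get removes the data from the store" behaviour concrete, here is a toy in-memory model. `ToyStore` is a hypothetical stand-in written for this illustration; it is not PySyft code and says nothing about how the Duet store is actually implemented.

```python
class ToyStore:
    """Hypothetical stand-in for a tagged object store; not PySyft code."""

    def __init__(self):
        self._objects = {}

    def send(self, tag, value):
        """Place a value in the store under the given tag."""
        self._objects[tag] = value

    def get(self, tag):
        """Return the value and remove it, so a second get() for the tag fails."""
        return self._objects.pop(tag)

    def tags(self):
        """List the tags still held in the store."""
        return sorted(self._objects)


store = ToyStore()
store.send("data", [1.0, 2.0, 3.0])
store.send("target", [0.0, 1.0, 0.0])

retrieved = store.get("data")  # brought back locally
remaining = store.tags()       # only "target" is still in the store
```

The practical consequence is the same as described above: once the owner retrieves an object, the other side no longer has anything under that tag to request.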

In [None]:
# Detail any requests from the client side. As mentioned above,
# the other end needs to request access to data/anything
# on the duet server/store. This is where you can list any
# outstanding requests.
duet.requests.pandas

In [None]:
# Because the other end of the connection plans on running a
# model (with lots of requests), we can set up some request
# handlers that will automatically accept/deny requests
# carrying a given name.
duet.requests.add_handler(
    name="loss",
    action="accept",
    timeout_secs=-1,  # no timeout
    print_local=True  # print the result in your notebook
)

duet.requests.add_handler(
    name="model_download",
    action="accept",
    print_local=True  # print the result in your notebook
)

In [None]:
duet.requests.handlers
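The idea behind name-based handlers can be sketched with a small toy model: requests whose name matches a registered handler are resolved automatically, and everything else waits for a manual decision. `ToyRequestQueue` and its methods are hypothetical and only illustrate the concept; they do not reproduce the Duet request API or its timeout and printing options.

```python
class ToyRequestQueue:
    """Hypothetical model of name-based auto-handling; not the Duet API."""

    def __init__(self):
        self.handlers = {}  # request name -> "accept" or "deny"
        self.pending = []   # requests that still need a manual decision

    def add_handler(self, name, action):
        """Register an automatic decision for requests with this name."""
        if action not in ("accept", "deny"):
            raise ValueError(f"unknown action: {action!r}")
        self.handlers[name] = action

    def submit(self, request_name):
        """Resolve a request via its handler, or queue it for review."""
        action = self.handlers.get(request_name)
        if action is None:
            self.pending.append(request_name)
            return "pending"
        return action


queue = ToyRequestQueue()
queue.add_handler("loss", "accept")
queue.add_handler("model_download", "accept")

loss_result = queue.submit("loss")       # matched by a handler
other_result = queue.submit("raw_data")  # no handler, so it waits for the owner
```

In this model, only the two named request types are accepted without intervention, which mirrors the intent of the handlers registered above: routine training requests go through while anything unexpected still needs the data owner's approval.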

>There isn't much more to do on this end, unless you wish to retrieve the data at any point to ensure security. So now we head over to the main central aggregator to begin running the model.