This prototype shows the code a Data Owner uses to upload a dataset and grant access to a Data Scientist.

First, install syft. The command below pins version 0.6.0.

In [1]:
!pip install syft==0.6.0



Check the installed version

In [2]:
!pip list | grep syft

syft                          0.6.0
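The same check can be done from Python without shelling out. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and is not part of syft.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# In this notebook the expected value is "0.6.0"; None means syft is not installed.
print(installed_version("syft"))
```

Comparing the returned string against "0.6.0" at the top of the notebook makes a version mismatch visible before any domain call is made.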


Import all the necessary dependencies 

In [3]:
import syft as sy
from google.colab import files
import pandas as pd
import numpy as np

Define the IP address and port used to connect to the domain, or use its DNS name. The domain was deployed with the one-click installer for Azure.
See the Deploy to Cloud section of https://github.com/OpenMined/PySyft/tree/0.6.0


In [4]:
DOMAIN_URL1 ="http://40.118.207.120:80"
#"http://tesis.westus.cloudapp.azure.com:80" 
#"http://ort.westus.cloudapp.azure.com:80"
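Since the address can be given either as an IP or as a DNS name, it may help to assemble and validate it in one place. The sketch below uses only `urllib.parse` from the standard library; `build_domain_url` is a hypothetical helper introduced for illustration, not a syft function.

```python
from urllib.parse import urlsplit


def build_domain_url(host, port=80, scheme="http"):
    """Assemble a domain URL and reject obviously malformed input."""
    url = f"{scheme}://{host}:{port}"
    parts = urlsplit(url)
    # Accessing parts.port raises ValueError itself when the port is not a valid number.
    if parts.scheme not in ("http", "https") or not parts.hostname or parts.port is None:
        raise ValueError(f"invalid domain URL: {url!r}")
    return url


# Either form used in this notebook goes through the same helper.
print(build_domain_url("40.118.207.120", 80))
print(build_domain_url("tesis.westus.cloudapp.azure.com"))
```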

Log in to the domain as admin and create a user with the Data Scientist role

In [5]:
domain = sy.login(
    url=DOMAIN_URL1,
    email="info@openmined.org",
    password="changethis"
    )



Anyone can login as an admin to your node right now because your password is still the default PySyft username and password!!!

Connecting to http://40.118.207.120:80... done! 	 Logging into node... done!
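The warning above is a reminder that the default credentials are hard-coded in the cell. One way to keep credentials out of the notebook is to read them from environment variables. This is a minimal sketch: the variable names `SYFT_EMAIL` and `SYFT_PASSWORD` and the helper `load_credentials` are assumptions made for this example. It does not change the password stored on the node; it only avoids writing it in the notebook.

```python
import os


def load_credentials(env=os.environ):
    """Read login credentials from environment variables.

    SYFT_EMAIL and SYFT_PASSWORD are hypothetical names chosen for this sketch.
    """
    email = env.get("SYFT_EMAIL")
    password = env.get("SYFT_PASSWORD")
    if not email or not password:
        raise RuntimeError("set SYFT_EMAIL and SYFT_PASSWORD before logging in")
    if password == "changethis":
        # The default password is public knowledge, so refuse to use it.
        raise RuntimeError("refusing to log in with the default password")
    return email, password
```

The returned pair could then be passed to the login call shown above, for example `email, password = load_credentials()` followed by `sy.login(url=DOMAIN_URL1, email=email, password=password)`.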


Create the user

In [6]:
domain.users

Unnamed: 0,id,email,name,budget,verify_key,role,added_by,website,institution,daa_pdf,created_at,budget_spent
0,1,info@openmined.org,Jane Doe,5.55,07da602cdbeb26826410b4b9708db587ead1f976cb13e8...,Owner,,,,,2022-03-11 20:13:28.993651,5.55
1,2,test@gmail.com,test,0.0,f531025137b830cb01d06d0df928d8f027100dbdc568d1...,Data Scientist,Jane Doe,,,1.0,2022-03-11 20:30:20.071845,0.0
2,3,szanottag@gmail.com,TCS,0.0,80aea16da02e20d930a65197f190b6946c435ad883d76f...,Data Scientist,Jane Doe,,,2.0,2022-03-13 22:07:28.215712,0.0
3,4,betirod@hotmail.com,TCT,0.0,4d509a60e4cbc37762f6a34dd0cfb6f1b4d27938794037...,Data Scientist,Jane Doe,,,3.0,2022-03-13 22:07:28.640620,0.0


The next cell is commented out because each run would otherwise create a new user with the same name.

In [None]:
#domain.users.create(name='TCS',email='szanottag@gmail.com',role='Data Scientist',  password='passwordtest')
#domain.users.create(name='TCT',email='betirod@hotmail.com',role='Data Scientist',  password='passwordtest1')
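Instead of commenting the cell out, the creation step could be guarded so that it only runs for emails that are not registered yet. The sketch below works on plain Python lists. How the existing emails are extracted from `domain.users` is not shown in this notebook, so that step is left as an assumption, and `users_to_create` is a hypothetical helper.

```python
def users_to_create(existing_emails, wanted_users):
    """Return the entries of `wanted_users` whose email is not registered yet."""
    seen = {email.strip().lower() for email in existing_emails}
    missing = []
    for user in wanted_users:
        key = user["email"].strip().lower()
        if key not in seen:
            missing.append(user)
            seen.add(key)  # also guards against duplicates inside wanted_users
    return missing


# Emails assumed to be already registered on the domain.
existing = ["info@openmined.org", "test@gmail.com", "szanottag@gmail.com"]
wanted = [
    {"name": "TCS", "email": "szanottag@gmail.com", "role": "Data Scientist"},
    {"name": "TCT", "email": "betirod@hotmail.com", "role": "Data Scientist"},
]
for user in users_to_create(existing, wanted):
    print("would create:", user["name"])  # prints "would create: TCT"
```

Each returned entry could then be passed, together with a password, to `domain.users.create(...)` as in the commented cell above.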

List the users to check that they were created correctly

Next, load the dataset that the Owner will upload to the domain. This example uses the diabetes dataset taken from Kaggle:
https://www.kaggle.com/mathchi/diabetes-data-set


In [7]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


Let's see what information this dataset contains

In [8]:
df = pd.read_csv('/content/drive/MyDrive/df_consultas.csv', index_col=0 )
df

Unnamed: 0,Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,DiabetesPedigreeFunction,Age
537,0,57,60,0,0,21.7,0.735,67
538,0,127,80,37,210,36.3,0.804,23
539,3,129,92,49,155,36.4,0.968,32
540,8,100,74,40,215,39.4,0.661,43
541,3,128,72,25,190,32.4,0.549,27
...,...,...,...,...,...,...,...,...
763,10,101,76,48,180,32.9,0.171,63
764,2,122,70,27,0,36.8,0.340,27
765,5,121,72,23,112,26.2,0.245,30
766,1,126,60,0,0,30.1,0.349,47


Convert the DataFrame to a syft Tensor

In [9]:
data_tensors_consultas = sy.Tensor(df.values.astype(np.int32))
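Note that casting to an integer type discards the fractional part of float columns such as BMI and DiabetesPedigreeFunction. The sketch below illustrates the effect in plain Python, where `int()` truncates toward zero just as a float-to-integer cast does, and shows a fixed-point alternative that scales values before converting them. The scaling factor of 1000 and both helper names are assumptions made for this example, not part of the original notebook.

```python
def truncate_row(row):
    """Mimic a float-to-integer cast: the fractional part is discarded."""
    return [int(value) for value in row]


def scale_row(row, factor=1000):
    """Fixed-point alternative: scale first, then round to the nearest integer."""
    return [round(value * factor) for value in row]


# First row of the table above: Pregnancies, Glucose, ..., BMI, DiabetesPedigreeFunction, Age
row = [0, 57, 60, 0, 0, 21.7, 0.735, 67]

print(truncate_row(row))  # BMI 21.7 becomes 21 and the pedigree value 0.735 becomes 0
print(scale_row(row))     # BMI becomes 21700 and the pedigree value becomes 735
```

If the values are scaled this way, the minimum and maximum bounds declared later for the tensor would need to be scaled by the same factor.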

Check the min and max values 

In [10]:
df.min(), df.max()

(Pregnancies                  0.000
 Glucose                     56.000
 BloodPressure                0.000
 SkinThickness                0.000
 Insulin                      0.000
 BMI                          0.000
 DiabetesPedigreeFunction     0.085
 Age                         21.000
 dtype: float64, Pregnancies                  13.000
 Glucose                     199.000
 BloodPressure               114.000
 SkinThickness                99.000
 Insulin                     600.000
 BMI                          57.300
 DiabetesPedigreeFunction      1.699
 Age                          70.000
 dtype: float64)

In [11]:
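# Named max_val so the built-in max() is not shadowed.
max_val = max(df.max())
max_val

600.0

In [12]:
# Named min_val so the built-in min() is not shadowed.
min_val = min(df.min())
min_val

0.0

Make the tensor data private

In [14]:
print(data_tensors_consultas.shape)
# One entity label per row; a single global [min_val, max_val] range covers all columns.
private_data_tensors_consultas = data_tensors_consultas.private(
    min_val=min_val,
    max_val=max_val,
    entities=[str(s) for s in range(data_tensors_consultas.shape[0])],
)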

(231, 8)
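The call above relies on two conditions: every value lies inside the declared range, and there is exactly one entity label per row. A minimal sketch of these sanity checks in plain Python follows; `check_private_inputs` is a hypothetical helper, not a syft function, and the two sample rows are taken from the table above after the integer cast.

```python
def check_private_inputs(rows, lower, upper, entities):
    """Validate the inputs before annotating a tensor as private.

    Raises ValueError if a value falls outside [lower, upper] or if the
    number of entity labels differs from the number of rows.
    """
    if len(entities) != len(rows):
        raise ValueError(f"{len(entities)} entities for {len(rows)} rows")
    for i, row in enumerate(rows):
        for value in row:
            if not lower <= value <= upper:
                raise ValueError(f"row {i}: value {value} outside [{lower}, {upper}]")
    return True


rows = [
    [0, 57, 60, 0, 0, 21, 0, 67],
    [0, 127, 80, 37, 210, 36, 0, 23],
]
entities = [str(i) for i in range(len(rows))]  # one label per row
print(check_private_inputs(rows, 0, 600, entities))  # prints True
```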


Upload the dataset to the domain

In [17]:
domain.load_dataset(
    assets={"data": private_data_tensors_consultas},
    name="diabetes_consultas",
    description="Diabetes_Consultas"
)



Loading dataset... uploading... SUCCESS!                        

Run <your client variable>.datasets to see your new dataset loaded into your machine!


List the datasets for the domain

In [18]:
domain.datasets

Idx,Name,Description,Assets,Id
[0],diabetes_consultas,Diabetes_Consultas,"[""data""] -> Tensor",91ef9c35-f706-4ee9-a404-41fe11a4a768


Check the requests made by the Data Scientist

In [20]:
domain.requests

Unnamed: 0,Name,Email,Role,Request Type,Status,Reason,Request ID,Requested Object's ID,Requested Object's tags,Requested Budget,Current Budget
0,TCS,szanottag@gmail.com,Data Scientist,DATA,pending,Para salvar la tesis,<UID: 4a5f513439154d91a1093651db2ea51d>,<UID: 0188a01ab220455f93c75b1eb82ce0c6>,[#data],,


Approve the request, selecting it by its index in the list

In [21]:
domain.requests[0].approve()

In [22]:
domain.requests

Unnamed: 0,Name,Email,Role,Request Type,Status,Reason,Request ID,Requested Object's ID,Requested Object's tags,Requested Budget,Current Budget
0,TCS,szanottag@gmail.com,Data Scientist,DATA,accepted,Para salvar la tesis,<UID: 4a5f513439154d91a1093651db2ea51d>,<UID: 0188a01ab220455f93c75b1eb82ce0c6>,[#data],,
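Approving by position works when there is a single request, but with several entries it is safer to select the pending ones explicitly. The sketch below filters plain dictionaries that mirror the `Email` and `Status` columns of the table above. The dictionary layout and the helper `pending_indices` are assumptions for illustration; the actual request objects returned by the domain may expose these fields differently.

```python
def pending_indices(requests, email=None):
    """Return positions of pending requests, optionally filtered by requester email."""
    return [
        i
        for i, req in enumerate(requests)
        if req["Status"] == "pending" and (email is None or req["Email"] == email)
    ]


# Stand-in for the request table as it looked before approval.
requests = [
    {"Name": "TCS", "Email": "szanottag@gmail.com", "Status": "pending"},
]
print(pending_indices(requests, email="szanottag@gmail.com"))  # prints [0]
```

Each returned index could then be used as in the cell above, for example `domain.requests[i].approve()`.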
