# Sonar - Decentralized Model Training Simulation (local)

DISCLAIMER: This is a proof-of-concept implementation. It is not remotely production-ready and does not follow proper conventions for security, convenience, or scalability. It is part of a broader proof of concept demonstrating the vision of the OpenMined project, its major moving parts, and how they might work together.


# Getting Started: Installation

##### Step 1: Install IPFS

- https://ipfs.io/docs/install/

##### Step 2: Turn on IPFS Daemon
Execute on command line:
> ipfs daemon

##### Step 3: Install Ethereum testrpc

- https://github.com/ethereumjs/testrpc

##### Step 4: Turn on testrpc with 1000 initialized accounts (each with some money)
Execute on command line:
> testrpc -a 1000

##### Step 5: Install openmined/sonar and all dependencies
##### Step 6: Execute the following code

# The Simulation: Diabetes Prediction

In this example, a diabetes research center (Cure Diabetes Inc) wants to train a model to predict the progression of diabetes from several indicators. They have collected a small sample of data (42 patients), but it's not enough to train a model. So they intend to offer a bounty of $5,000 to the OpenMined community to train a high-quality model.

As it turns out, there are 400 diabetics in the network who are candidates for the model (they are already collecting the relevant fields). In this simulation, we're going to facilitate the training of Cure Diabetes Inc's model by using the Ethereum blockchain to incentivize these 400 anonymous contributors.

Note that in this simulation we're only going to use the sonar and syft packages, and everything is deployed locally on a test blockchain. Future simulations will incorporate mine and capsule for greater anonymity and automation.

### Imports and Convenience Functions

In [1]:
import warnings
import numpy as np
import phe as paillier
from sonar.contracts import ModelRepository, Model
from syft.he.Paillier import KeyPair
from syft.nn.linear import LinearClassifier
from sklearn.datasets import load_diabetes

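# return an account's balance in ether (relies on the global `repo`,
# which is created in the setup cell below)
def get_balance(account):
    return repo.web3.fromWei(repo.web3.eth.getBalance(account),'ether')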

warnings.filterwarnings('ignore')

### Setting up the Experiment

In [2]:
# for the purpose of the simulation, we're going to split our dataset up amongst
# the relevant simulated users

diabetes = load_diabetes()
y = diabetes.target
X = diabetes.data

validation = (X[0:42],y[0:42])
anonymous_diabetes_users = (X[42:],y[42:])

# we're also going to initialize the model trainer smart contract, which in the
# real world would already be on the blockchain (managing other contracts) before
# the simulation begins

repo = ModelRepository() # blockchain hosted model repository

# we're going to set aside 400 accounts for our 400 patients
# Let's go ahead and pair each data point with a patient's
# address so that we don't get them confused
patient_addresses = repo.web3.eth.accounts[100:500]
anonymous_diabetics = list(zip(patient_addresses,
                               anonymous_diabetes_users[0],
                               anonymous_diabetes_users[1]))

# we're going to set aside 1 account for Cure Diabetes Inc
cure_diabetes_inc = repo.web3.eth.accounts[501]

No account submitted... using default[2]
Deployed ModelRepository:0xb811aeafff1f6f0bd39b07f21a1e57cc64c6e8770b4b9b1133eaea154de66a9d
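
The split and pairing performed in the cell above can be sketched without a blockchain or the real dataset. This is only an illustration: it assumes the 442-row, 10-feature shape of scikit-learn's diabetes dataset, and the rows and the `accounts` list are hypothetical placeholders standing in for `load_diabetes()` and `repo.web3.eth.accounts`.

```python
# Stand-in for the 442-row diabetes dataset: each row is (features, target).
# The values are placeholders, not real patient data.
N_SAMPLES, N_FEATURES, N_VALIDATION = 442, 10, 42
rows = [([0.0] * N_FEATURES, float(i)) for i in range(N_SAMPLES)]

# The first 42 rows play the role of Cure Diabetes Inc's own sample;
# the remaining 400 rows are spread across the simulated contributors.
validation_rows = rows[:N_VALIDATION]
contributor_rows = rows[N_VALIDATION:]

# Hypothetical account list standing in for repo.web3.eth.accounts
# (testrpc was started with 1000 accounts).
accounts = ["0x{:040x}".format(i) for i in range(1000)]
patient_accounts = accounts[100:500]
owner_account = accounts[501]

# zip() silently truncates to the shorter input, so check the lengths first.
assert len(patient_accounts) == len(contributor_rows)
paired = [(addr, x, y) for addr, (x, y) in zip(patient_accounts, contributor_rows)]

print(len(validation_rows), len(paired))
```

The explicit length check is a small design choice: if fewer accounts than data rows were available, `zip` would drop the extra rows without any error.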


## Step 1: Cure Diabetes Inc Initializes a Model and Provides a Bounty

In [3]:
pubkey,prikey = KeyPair().generate(n_length=1024)
diabetes_classifier = LinearClassifier(desc="DiabetesClassifier",n_inputs=10,n_labels=1)
initial_error = diabetes_classifier.evaluate(validation[0],validation[1])
diabetes_classifier.encrypt(pubkey)

diabetes_model = Model(owner=cure_diabetes_inc,
                       syft_obj = diabetes_classifier,
                       bounty = 0.001,
                       initial_error = initial_error,
                       target_error = 10000
                      )
model_id = repo.submit_model(diabetes_model)
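
The model is encrypted with a Paillier public key before it is submitted. Paillier encryption is additively homomorphic: ciphertexts can be added together and multiplied by plaintext constants without decrypting them, which is presumably what allows a linear model to be updated while it stays encrypted. The following is a toy sketch of that property in pure Python (3.8+), not the `syft.he.Paillier` implementation. All function names here are hypothetical, and the primes are far too small to be secure.

```python
import math
import random

# Toy parameters: two small known primes. Real keys are 1024 bits or more,
# as in KeyPair().generate(n_length=1024) above.
P, Q = 104729, 1299709

def generate_keypair(p=P, q=Q):
    """Return (public modulus n, private key (lam, mu))."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse; valid because g = n + 1 is used
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) as (n+1)^m * r^n mod n^2."""
    n_sq = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(n, private_key, c):
    """Recover m from ciphertext c using L(u) = (u - 1) // n."""
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def add_encrypted(n, c1, c2):
    """Ciphertext of m1 + m2: multiply the ciphertexts."""
    return (c1 * c2) % (n * n)

def scale_encrypted(n, c, k):
    """Ciphertext of k * m: raise the ciphertext to the plaintext power k."""
    return pow(c, k, n * n)

n, private_key = generate_keypair()
# Compute 20 + 3 * 7 entirely on ciphertexts, then decrypt the result.
combined = add_encrypted(n, encrypt(n, 20), scale_encrypted(n, encrypt(n, 7), 3))
print(decrypt(n, private_key, combined))
```

Only additions and multiplications by plaintext constants are available, which fits a linear model; the real libraries also handle floating-point and negative values, which this sketch does not.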

## Step 2: An Anonymous Patient Downloads the Model and Improves It

In [4]:
diabetic_address,input_data,target_data = anonymous_diabetics[0]
repo[model_id].submit_gradient(diabetic_address,input_data,target_data)

## Step 3: Cure Diabetes Inc Evaluates the Gradient

In [5]:
repo[model_id]

Desc:DiabetesClassifier
Owner:0x54d1f5dd32e12c22eff83d24faa3afa604f7de16
Bounty:0.001
Initial Error:26645642
Best Error:None
Target Error:10000
Model ID:0
Num Grads:1

In [6]:
new_error = repo[model_id].evaluate_gradient(cure_diabetes_inc,repo[model_id][0],prikey,pubkey,validation[0],validation[1])

In [7]:
new_error

26623238
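
The internals of `submit_gradient` and `evaluate_gradient` are not shown in this document. As a rough mental model only, here is a plaintext sketch of the same two roles for a generic linear model with squared error: a contributor computes a gradient from one private example, and the owner applies it with step size `alpha` and measures the error on the validation set. The function names, the loss, and the data are assumptions for illustration; the real packages operate on an encrypted model and may differ in all of these.

```python
def predict(weights, x):
    """Linear prediction: dot product of weights and features."""
    return sum(w * xi for w, xi in zip(weights, x))

def squared_error(weights, inputs, targets):
    """Total squared error over a dataset."""
    return sum((predict(weights, x) - y) ** 2 for x, y in zip(inputs, targets))

def gradient(weights, x, y):
    """Gradient of 0.5 * (w.x - y)^2 with respect to w for one example."""
    residual = predict(weights, x) - y
    return [residual * xi for xi in x]

def apply_gradient(weights, grad, alpha):
    """One gradient-descent step with step size alpha."""
    return [w - alpha * g for w, g in zip(weights, grad)]

# Made-up validation set held by the model owner.
val_inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
val_targets = [2.0, 3.0, 5.0]
weights = [0.0, 0.0]

# Contributor side: compute a gradient from a single private example.
grad = gradient(weights, [1.0, 1.0], 5.0)

# Owner side: apply the gradient and compare validation error before and after.
error_before = squared_error(weights, val_inputs, val_targets)
candidate = apply_gradient(weights, grad, alpha=0.1)
error_after = squared_error(candidate, val_inputs, val_targets)
print(error_before, error_after)
```

Comparing the error before and after the update is what lets the owner judge a contribution without ever seeing the contributor's data.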

## Step 4: Rinse and Repeat

In [None]:
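# gradient 0 was already submitted in Step 2, so the gradient submitted on
# iteration i is stored at index i+1
for i,(addr, input_data, target_data) in enumerate(anonymous_diabetics):
    model = repo[model_id]
    model.submit_gradient(addr,input_data,target_data)
    new_error = model.evaluate_gradient(cure_diabetes_inc,model[i+1],prikey,pubkey,validation[0],validation[1],alpha=2)
    print(new_error)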

26531949
26596590
26471645
26481334
26218895
26063063
26122476
26010976
25975969
25931464
25957436
25997872
25947727
25971795
25878993
25844347
25824632
25955404
25784298
25687608
25705897
25658046
25737011
25536902
25546388
25378029
25327242
25389867
25265221
25261027
25183437
25212903
25242696
25220010
25282876
25149695
25148261
25184219
25225898
25136445
25071396
25093899
25168788
25075534
25038640
25092418
25069513
25072111
25093821
25096285
24990938
25030720
25075210
25014104
24997901
24860956
24940264
24747127
24825186
24790856
24770535
24784217
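
The printed errors are not strictly decreasing (for example, 26531949 is followed by 26596590), but the overall trend is downward. A small helper, sketched here with a hypothetical name and using only numbers reported above, puts the progress in relative terms.

```python
initial_error = 26645642  # "Initial Error" reported for the model in Step 3
first_error = 26623238    # error after the first evaluated gradient
last_error = 24784217     # last value printed by the loop above

def relative_reduction(start, end):
    """Fraction by which the error dropped going from start to end."""
    return (start - end) / start

print("{:.2%}".format(relative_reduction(initial_error, first_error)))
print("{:.2%}".format(relative_reduction(initial_error, last_error)))
```

Over the updates shown, the validation error falls by roughly 7% from its initial value, still far above the target error of 10000 set in Step 1.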
