# Tutorial 2: Exploring contributivity

With this example, we dive deeper into the potential of the library, and the notion of contributivity.

## 1 - Prerequisites

In order to run this example, you'll need to:

* use Python 3.7+
* install the requirements from the requirements.txt file
* install this package: https://test.pypi.org/project/pkg-test-distributed-learning-contributivity/0.0.18/

If you have not followed our first tutorial, we recommend [taking a look at it](https://github.com/SubstraFoundation/distributed-learning-contributivity/blob/master/notebooks/examples/1%20_INTRO_MNIST.ipynb) first.


In [1]:
import sys
import subprocess

requirements = ['librosa==0.8.0',
 'Keras==2.3.1',
 'matplotlib==3.1.3',
 'numpy==1.19.0',
 'scipy==1.4.1',
 'scikit-learn==0.22.1',
 'pandas==1.0.5',
 'seaborn==0.10.0',
 'loguru==0.4.1',
 'tensorflow==2.2.0',
 'ruamel.yaml==0.16.10']

def pip_install(package_name):
    subprocess.call(
        [sys.executable, '-m', 'pip', 'install', package_name]
    )


for package in requirements:
    pip_install(package)


!pip install -i https://test.pypi.org/simple/ subtest==0.0.0.18


Looking in indexes: https://test.pypi.org/simple/


## 2 - Context 

In collaborative data science projects, partners sometimes need to train a model on multiple datasets contributed by different data-providing partners. In such cases, the partners might have to measure how much each dataset contributed to the performance of the model. This is useful, for example, as a basis to agree on how to share the reward of the ML challenge or the future revenues derived from the predictive model, or to detect possibly corrupted datasets or partners not playing by the rules. The library explores this question and the opportunity to implement mechanisms that help partners in such scenarios measure each dataset's *contributivity* (as *contribution to the performance of the model*).

In the first tutorial, you learnt how to parametrize and run a scenario.
In this tutorial, you will learn how to add one of the implemented contributivity measurement methods to your scenario run.  

In [2]:
# imports
import pandas as pd
import seaborn as sns
sns.set()



## 2 - Set up and run the scenario

We will use the same dataset and the same overall setup for the scenario. The main change lies in the `methods` parameter, a list of the contributivity methods that will be run. Because these methods are time-consuming, this list is empty unless you fill it in.

All methods available are:

- `"Shapley values"`
- `"Independent scores"`
- `"TMCS"`
- `"ITMCS"`
- `"IS_lin_S"`
- `"IS_reg_S"`
- `"AIS_Kriging_S"`
- `"SMCS"`
- `"WR_SMC"`

See the [dedicated section](https://github.com/SubstraFoundation/distributed-learning-contributivity/blob/master/subtest/docs/documentation.md#contributivity-measurement-approaches-studied-and-implemented) of the documentation for an explanation of the different methods.  

Here we will use the Shapley value, a contributivity measurement that comes from cooperative game theory.
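To make the idea concrete, here is a minimal, self-contained sketch of an exact Shapley value computation. It is not the library's implementation: the `shapley_values` helper and the `toy_scores` accuracies are hypothetical and made up for illustration. Each partner's value is the weighted average of its marginal gain over every coalition of the other partners.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a characteristic function `value`.

    `value` maps a frozenset of players to the score of that coalition.
    """
    n = len(players)
    result = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                # Marginal gain of adding partner p to coalition s.
                total += weight * (value(s | {p}) - value(s))
        result[p] = total
    return result


# Hypothetical accuracies for every coalition of three partners (made-up numbers).
toy_scores = {
    frozenset(): 0.0,
    frozenset({0}): 0.30,
    frozenset({1}): 0.90,
    frozenset({2}): 0.80,
    frozenset({0, 1}): 0.91,
    frozenset({0, 2}): 0.82,
    frozenset({1, 2}): 0.95,
    frozenset({0, 1, 2}): 0.96,
}

values = shapley_values([0, 1, 2], toy_scores.__getitem__)
for partner, phi in values.items():
    print(f"partner {partner}: {phi:.3f}")
```

A useful sanity check is that the Shapley values add up to the score of the full coalition (0.96 in this toy example), so they can be read as a split of the collective performance among partners.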

In [3]:
from subtest.scenario import Scenario

my_scenario = Scenario(
    partners_count=3,
    amounts_per_partner=[0.001, 0.699, 0.3],
    epoch_count=10,
    minibatch_count=3,
    dataset_name='mnist',
    methods=["Shapley values"],  # <- Here is the difference
)

Using TensorFlow backend.
2020-09-21 10:48:22.197 | DEBUG    | subtest.scenario:__init__:88 - Dataset selected: mnist
2020-09-21 10:48:22.199 | DEBUG    | subtest.scenario:__init__:101 - Computation use the full dataset for scenario #1
2020-09-21 10:48:22.200 | INFO     | subtest.scenario:__init__:262 - ### Description of data scenario configured:
2020-09-21 10:48:22.201 | INFO     | subtest.scenario:__init__:263 -    Number of partners defined: 3
2020-09-21 10:48:22.202 | INFO     | subtest.scenario:__init__:264 -    Data distribution scenario chosen: random
2020-09-21 10:48:22.202 | INFO     | subtest.scenario:__init__:265 -    Multi-partner learning approach: fedavg
2020-09-21 10:48:22.203 | INFO     | subtest.scenario:__init__:266 -    Weighting option: uniform
2020-09-21 10:48:22.203 | INFO     | subtest.scenario:__init__:267 -    Iterations parameters: 10 epochs > 3 mini-batches > 8 gradient updates per pass
2020-09-21 10:48:22.204 | INFO     | subtest.scenario:__init__:273 - ###

In [4]:
my_scenario.run()

2020-09-21 10:48:22.559 | INFO     | subtest.scenario:split_data:521 - ### Splitting data among partners:
2020-09-21 10:48:22.560 | INFO     | subtest.scenario:split_data:522 -    Simple split performed.
2020-09-21 10:48:22.561 | INFO     | subtest.scenario:split_data:523 -    Nb of samples split amongst partners: 43738
2020-09-21 10:48:22.564 | INFO     | subtest.scenario:split_data:525 -    Partner #0: 43 samples with labels [1, 2, 3, 4, 5, 6, 7, 8, 9]
2020-09-21 10:48:22.565 | INFO     | subtest.scenario:split_data:525 -    Partner #1: 30573 samples with labels [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
2020-09-21 10:48:22.565 | INFO     | subtest.scenario:split_data:525 -    Partner #2: 13122 samples with labels [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
2020-09-21 10:48:22.957 | DEBUG    | subtest.scenario:compute_batch_sizes:569 -    Compute batch sizes, partner #0: 1
2020-09-21 10:48:22.958 | DEBUG    | subtest.scenario:compute_batch_sizes:569 -    Compute batch sizes, partner #1: 1273
2020-09-21 10:48

2020-09-21 10:51:43.878 | DEBUG    | subtest.multi_partner_learning:log_collaborative_round_partner_result:628 - Epoch 01/09 > Minibatch 02/02 > Partner id #1 (1/2) > val_acc: 0.93
2020-09-21 10:51:51.106 | DEBUG    | subtest.multi_partner_learning:log_collaborative_round_partner_result:628 - Epoch 01/09 > Minibatch 02/02 > Partner id #2 (2/2) > val_acc: 0.93
2020-09-21 10:51:51.112 | DEBUG    | subtest.multi_partner_learning:compute_collaborative_round_fedavg:336 - End of fedavg collaborative round.
2020-09-21 10:51:53.052 | INFO     | subtest.multi_partner_learning:compute_test_score:188 -    Model evaluation at the end of the epoch: ['0.285', '0.920']
2020-09-21 10:51:53.054 | DEBUG    | subtest.multi_partner_learning:compute_test_score:191 -       Checking if early stopping criteria are met:
2020-09-21 10:51:53.056 | DEBUG    | subtest.multi_partner_learning:compute_test_score:201 -          -> Early stopping criteria are not met, continuing with training.
2020-09-21 10:51:53.207 |

2020-09-21 10:55:44.478 | DEBUG    | subtest.multi_partner_learning:compute_collaborative_round_fedavg:289 - Start new fedavg collaborative round ...
2020-09-21 10:55:44.480 | INFO     | subtest.multi_partner_learning:compute_collaborative_round_fedavg:304 - (fedavg) Minibatch n°0 of epoch n°4, init aggregated model for each partner with models from previous round
2020-09-21 10:55:59.631 | DEBUG    | subtest.multi_partner_learning:log_collaborative_round_partner_result:628 - Epoch 04/09 > Minibatch 00/02 > Partner id #0 (0/2) > val_acc: 0.94
2020-09-21 10:56:14.163 | DEBUG    | subtest.multi_partner_learning:log_collaborative_round_partner_result:628 - Epoch 04/09 > Minibatch 00/02 > Partner id #1 (1/2) > val_acc: 0.97
2020-09-21 10:56:21.668 | DEBUG    | subtest.multi_partner_learning:log_collaborative_round_partner_result:628 - Epoch 04/09 > Minibatch 00/02 > Partner id #2 (2/2) > val_acc: 0.97
2020-09-21 10:56:21.675 | DEBUG    | subtest.multi_partner_learning:compute_collaborative_

2020-09-21 10:59:53.630 | DEBUG    | subtest.multi_partner_learning:compute_collaborative_round_fedavg:289 - Start new fedavg collaborative round ...
2020-09-21 10:59:53.631 | INFO     | subtest.multi_partner_learning:compute_collaborative_round_fedavg:304 - (fedavg) Minibatch n°1 of epoch n°6, init aggregated model for each partner with models from previous round
2020-09-21 11:00:06.307 | DEBUG    | subtest.multi_partner_learning:log_collaborative_round_partner_result:628 - Epoch 06/09 > Minibatch 01/02 > Partner id #0 (0/2) > val_acc: 0.9
2020-09-21 11:00:19.673 | DEBUG    | subtest.multi_partner_learning:log_collaborative_round_partner_result:628 - Epoch 06/09 > Minibatch 01/02 > Partner id #1 (1/2) > val_acc: 0.98
2020-09-21 11:00:27.724 | DEBUG    | subtest.multi_partner_learning:log_collaborative_round_partner_result:628 - Epoch 06/09 > Minibatch 01/02 > Partner id #2 (2/2) > val_acc: 0.98
2020-09-21 11:00:27.730 | DEBUG    | subtest.multi_partner_learning:compute_collaborative_r

2020-09-21 11:03:59.330 | DEBUG    | subtest.multi_partner_learning:compute_collaborative_round_fedavg:289 - Start new fedavg collaborative round ...
2020-09-21 11:03:59.331 | INFO     | subtest.multi_partner_learning:compute_collaborative_round_fedavg:304 - (fedavg) Minibatch n°2 of epoch n°8, init aggregated model for each partner with models from previous round
2020-09-21 11:04:11.741 | DEBUG    | subtest.multi_partner_learning:log_collaborative_round_partner_result:628 - Epoch 08/09 > Minibatch 02/02 > Partner id #0 (0/2) > val_acc: 0.94
2020-09-21 11:04:24.672 | DEBUG    | subtest.multi_partner_learning:log_collaborative_round_partner_result:628 - Epoch 08/09 > Minibatch 02/02 > Partner id #1 (1/2) > val_acc: 0.98
2020-09-21 11:04:31.641 | DEBUG    | subtest.multi_partner_learning:log_collaborative_round_partner_result:628 - Epoch 08/09 > Minibatch 02/02 > Partner id #2 (2/2) > val_acc: 0.98
2020-09-21 11:04:31.648 | DEBUG    | subtest.multi_partner_learning:compute_collaborative_

2020-09-21 11:16:29.471 | INFO     | subtest.multi_partner_learning:compute_test_score:136 - ## Training and evaluating model on partners with ids: ['#0', '#1']
2020-09-21 11:16:29.559 | DEBUG    | subtest.multi_partner_learning:compute_collaborative_round_fedavg:289 - Start new fedavg collaborative round ...
2020-09-21 11:16:29.560 | INFO     | subtest.multi_partner_learning:compute_collaborative_round_fedavg:301 - (fedavg) Very first minibatch of epoch n°0, init new models for each partner
2020-09-21 11:16:41.709 | DEBUG    | subtest.multi_partner_learning:log_collaborative_round_partner_result:628 - Epoch 00/09 > Minibatch 00/02 > Partner id #0 (0/1) > val_acc: 0.15
2020-09-21 11:16:54.884 | DEBUG    | subtest.multi_partner_learning:log_collaborative_round_partner_result:628 - Epoch 00/09 > Minibatch 00/02 > Partner id #1 (1/1) > val_acc: 0.75
2020-09-21 11:16:54.893 | DEBUG    | subtest.multi_partner_learning:compute_collaborative_round_fedavg:336 - End of fedavg collaborative roun

2020-09-21 11:20:19.367 | INFO     | subtest.multi_partner_learning:compute_collaborative_round_fedavg:304 - (fedavg) Minibatch n°2 of epoch n°2, init aggregated model for each partner with models from previous round
2020-09-21 11:20:31.047 | DEBUG    | subtest.multi_partner_learning:log_collaborative_round_partner_result:628 - Epoch 02/09 > Minibatch 02/02 > Partner id #0 (0/1) > val_acc: 0.86
2020-09-21 11:20:44.024 | DEBUG    | subtest.multi_partner_learning:log_collaborative_round_partner_result:628 - Epoch 02/09 > Minibatch 02/02 > Partner id #1 (1/1) > val_acc: 0.95
2020-09-21 11:20:44.031 | DEBUG    | subtest.multi_partner_learning:compute_collaborative_round_fedavg:336 - End of fedavg collaborative round.
2020-09-21 11:20:45.888 | INFO     | subtest.multi_partner_learning:compute_test_score:188 -    Model evaluation at the end of the epoch: ['0.233', '0.929']
2020-09-21 11:20:45.890 | DEBUG    | subtest.multi_partner_learning:compute_test_score:191 -       Checking if early sto

2020-09-21 11:23:47.865 | DEBUG    | subtest.multi_partner_learning:compute_collaborative_round_fedavg:289 - Start new fedavg collaborative round ...
2020-09-21 11:23:47.866 | INFO     | subtest.multi_partner_learning:compute_collaborative_round_fedavg:304 - (fedavg) Minibatch n°1 of epoch n°5, init aggregated model for each partner with models from previous round
2020-09-21 11:24:00.379 | DEBUG    | subtest.multi_partner_learning:log_collaborative_round_partner_result:628 - Epoch 05/09 > Minibatch 01/02 > Partner id #0 (0/1) > val_acc: 0.79
2020-09-21 11:24:13.239 | DEBUG    | subtest.multi_partner_learning:log_collaborative_round_partner_result:628 - Epoch 05/09 > Minibatch 01/02 > Partner id #1 (1/1) > val_acc: 0.97
2020-09-21 11:24:13.245 | DEBUG    | subtest.multi_partner_learning:compute_collaborative_round_fedavg:336 - End of fedavg collaborative round.
2020-09-21 11:24:13.247 | DEBUG    | subtest.multi_partner_learning:compute_collaborative_round_fedavg:289 - Start new fedavg c

KeyboardInterrupt: 

## 3 - Accuracy scores of each partner and comparison with the aggregated model's performance

As in the first tutorial, we take a look at the local and global scores.

In [None]:
# Average the per-partner and collective score matrices along axis 1
scores = my_scenario.mpl.score_matrix_per_partner.mean(axis=1)
score_collective = my_scenario.mpl.score_matrix_collective_models.mean(axis=1)

# One column per partner, plus one column for the collective model
scores_df = pd.DataFrame({
    f'partner {i}': scores[:, i] for i in range(my_scenario.partners_count)
})
scores_df['collective model'] = score_collective

scores_df

We can plot the evolution of the accuracy through the epochs. 

In [None]:
# Skip the first two rows of the scores table before plotting
sns.relplot(data=scores_df.iloc[2:], kind="line")

## 4 - Contributivity scores

We have set our scenario with Shapley values as a contributivity measurement method.

Computing Shapley values is quite heavy on computing resources, since models must be trained and evaluated on subsets of partners, but it provides a well-founded measurement tool.
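The cost can be illustrated with a short sketch. It assumes that an exact computation needs the score of one model per non-empty coalition of partners; the library may organize or cache these trainings differently, and the `non_empty_coalitions` helper is hypothetical.

```python
from itertools import combinations


def non_empty_coalitions(partner_ids):
    """List every non-empty subset of partners."""
    return [
        subset
        for size in range(1, len(partner_ids) + 1)
        for subset in combinations(partner_ids, size)
    ]


coalitions = non_empty_coalitions([0, 1, 2])
print(coalitions)

# With n partners there are 2**n - 1 non-empty coalitions.
for n in (3, 5, 10):
    print(n, "partners ->", 2**n - 1, "coalitions")
```

With 3 partners this gives 7 coalitions, which is manageable, but the count doubles with every extra partner. This is why approximate methods are worth considering for larger scenarios.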



In [None]:
contributivity_score = my_scenario.contributivity_list

In [None]:
print(contributivity_score[0])

Since we have artificially given our first partner only 0.1% of the total dataset, it should contribute less to the final model. Because we are using the MNIST dataset, even with 0.1% of the total data, the model is still able to perform reasonably well according to our accuracy values.

There are other ways to artificially generate poor contributors. For instance, we can shuffle the labels of one partner, which mislabels the whole dataset of that partner.  
To do so, we use the `corrupted_partner` parameter of the scenario object. It must be a list whose size is equal to the number of partners. For each partner, a string indicates the state of its dataset.

```python
corrupted_partner=['shuffled', 'not corrupted', 'not corrupted']
```
Here, with 3 partners, the first one will see its labels shuffled, and the two others will have their dataset untouched. 
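
To get an intuition of what shuffling labels does to a dataset, here is a minimal sketch on toy data. The `shuffle_labels` helper is hypothetical and only illustrates the principle; the library's own corruption code may work differently.

```python
import random


def shuffle_labels(labels, seed=0):
    """Return a shuffled copy of `labels`, leaving the original list untouched."""
    rng = random.Random(seed)
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return shuffled


# Toy partner dataset: each sample is paired with its true label.
samples = [f"image_{i}" for i in range(10)]
labels = [i % 5 for i in range(10)]

corrupted = shuffle_labels(labels, seed=42)
mismatches = sum(a != b for a, b in zip(labels, corrupted))
print(f"{mismatches} of {len(labels)} samples now carry a wrong label")
```

The label distribution is unchanged, but labels no longer match their samples, so a model trained on this partner's data learns mostly noise.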


In [10]:
my_second_scenario = Scenario(
    partners_count=3,
    amounts_per_partner=[0.2, 0.5, 0.3],  # <- The split is more even
    corrupted_partner=['shuffled', 'not corrupted', 'not corrupted'],  # <- Here is the true difference
    epoch_count=10,
    minibatch_count=3,
    dataset_name='mnist',
    methods=["Shapley values"],
)

my_second_scenario.run()

In [None]:
contributivity_score = my_second_scenario.contributivity_list

#TODO commentary on the result


# That's it!

Now you can explore our other tutorials for a better snapshot of what can be done with our library!

This work is collaborative: enthusiasts are welcome to comment on open issues and PRs, or to open new ones.

Should you be interested in this open effort and would like to share any question, suggestion or input, you can use the following channels:

- This GitHub repository (issues or PRs)
- Substra Foundation's [Slack workspace](https://substra-workspace.slack.com/join/shared_invite/zt-cpyedcab-FHYgpy08efKJ2FCadE2yCA), channel `#workgroup-mpl-contributivity`
- Email: hello@substra.org
- Come meet with us at La Paillasse (Paris, France), Le Palace (Nantes, France) or Studio Iconosquare (Limoges, France)

 ![logo Substra Foundation](./img/substra_logo_couleur_rvb_w150px.png)