# Experiment 1: SHAP values for different dataset sizes

This experiment shows how SHAP values change as the dataset size is reduced, which helps us understand how SHAP values vary independently of a poisoning attack.
This matters because in Federated Learning the data is never aggregated in one place, so each client trains on a drastically smaller dataset.


Experimental Setup:


* datasets: MNIST (FFNNCLient, nll-loss), FMNIST (CNNCLient, cross entropy)
* number of clients: (1, 10, 50, 100, 200)
* dataset size per client, respectively: (60000, 6000, 1200, 600, 300)
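
The per-client dataset sizes in the setup list follow from dividing the 60,000 training samples evenly among the clients. The sketch below reproduces that arithmetic; it assumes an even split with no remainder handling, and the helper `samples_per_client` is a hypothetical name used for illustration, not part of the `federated_learning` package.

```python
TRAIN_SET_SIZE = 60_000  # training samples available before splitting
CLIENT_COUNTS = (1, 10, 50, 100, 200)  # client counts from the setup list


def samples_per_client(total_samples: int, num_clients: int) -> int:
    """Return the shard size when the data is split evenly across clients.

    Hypothetical helper: assumes an even split and drops any remainder.
    """
    if num_clients <= 0:
        raise ValueError("num_clients must be positive")
    return total_samples // num_clients


sizes = [samples_per_client(TRAIN_SET_SIZE, n) for n in CLIENT_COUNTS]
print(sizes)  # [60000, 6000, 1200, 600, 300]
```

With 200 clients, each client sees only 0.5% of the data a single centralized model would train on, which is the regime this experiment is meant to probe.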

In [1]:
from federated_learning.utils import SHAPUtil
from federated_learning import ClientPlane, Configuration

In [2]:
%load_ext autoreload
%autoreload 2

In [3]:
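class ObserverConfiguration:
    """Metadata attached to the metrics reported for this experiment."""
    experiment_type = "datasize_shap"
    experiment_id = 0  # overwritten for every repetition of the experiment
    dataset_type = "MNIST"
    test = False

    # Client configuration
    client_name = "client"
    client_type = "client"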

In [4]:
config = Configuration()
data = config.DATASET(config)
shap_util = SHAPUtil(data.test_dataloader)
observer_config = ObserverConfiguration()

# Repeat the whole experiment 10 times; experiment_id identifies the repetition.
for i in range(10):
    # More clients means a smaller dataset per client.
    for number_client in [1, 10, 50, 100, 200]:
        config.NUMBER_OF_CLIENTS = number_client
        observer_config.experiment_id = i
        client_plane = ClientPlane(config, observer_config, data, shap_util)

        # Only the first client is trained and evaluated.
        client = client_plane.clients[0]
        client.test()  # baseline evaluation before any training
        for epoch in range(1, config.N_EPOCHS + 1):
            client.train(epoch)
            client.test()
        client.push_metrics()

MNIST training data loaded.
MNIST test data loaded.
Create 1 clients with dataset of size 60000

Test set: Average loss: 0.0023, Accuracy: 974/10000 (10%)


Test set: Average loss: 0.0004, Accuracy: 9278/10000 (93%)


Test set: Average loss: 0.0002, Accuracy: 9544/10000 (95%)


Test set: Average loss: 0.0001, Accuracy: 9666/10000 (97%)


Test set: Average loss: 0.0001, Accuracy: 9718/10000 (97%)


Test set: Average loss: 0.0001, Accuracy: 9746/10000 (97%)



Using a non-full backward hook when the forward contains multiple autograd Nodes is deprecated and will be removed in future versions. This hook will be missing some grad_input. Please use register_full_backward_hook to get the documented behavior.
Note that order of the arguments: ceil_mode and return_indices will changeto match the args list in nn.MaxPool2d in a future release.


Predictions: tensor([[0],
        [1],
        [2],
        [3],
        [4],
        [5],
        [6],
        [7],
        [8],
        [9]])
client,client_id=0,test=False,poisoned=False,poisoned_data=1,dataset_size=60000,type=client,experiment_type=datasize_shap,experiment_id=0,poisoned_clients=0,num_of_epochs=5,batch_size=64,num_clients=1,target=0 precision=0.963973 1653634447
client,client_id=0,test=False,poisoned=False,poisoned_data=1,dataset_size=60000,type=client,experiment_type=datasize_shap,experiment_id=0,poisoned_clients=0,num_of_epochs=5,batch_size=64,num_clients=1,target=1 precision=0.982662 1653634447
client,client_id=0,test=False,poisoned=False,poisoned_data=1,dataset_size=60000,type=client,experiment_type=datasize_shap,experiment_id=0,poisoned_clients=0,num_of_epochs=5,batch_size=64,num_clients=1,target=2 precision=0.947030 1653634447
client,client_id=0,test=False,poisoned=False,poisoned_data=1,dataset_size=60000,type=client,experiment_type=datasize_shap,experiment_id=