# Embedded ML Lab - Exercise 1 - Federated Learning

In this exercise, we will explore federated learning (FL).

To recap, the core idea of FL is that many *devices* collaboratively train a single model, where each device has its own local private training data. Each device performs local training with its own data, and exchanges the weights of the NN model with all other devices via the *server*.
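
The weight exchange can be pictured as the server averaging the parameters it receives from the devices. The following is a minimal sketch, assuming plain unweighted averaging in the style of federated averaging. The exercise text does not specify the aggregation rule used by the lab server, and `average_state_dicts` is a hypothetical helper, not part of the lab code.

```python
import torch


def average_state_dicts(state_dicts):
    """Return the element-wise mean of several model state dicts.

    Hypothetical helper for illustration; assumes all state dicts have
    the same keys and tensor shapes.
    """
    averaged = {}
    for key in state_dicts[0]:
        # Stack the same parameter from every device and average over devices.
        stacked = torch.stack([sd[key].float() for sd in state_dicts])
        averaged[key] = stacked.mean(dim=0)
    return averaged


# Three "devices" holding a tiny model with different local weights.
device_models = [torch.nn.Linear(2, 1) for _ in range(3)]
global_weights = average_state_dicts([m.state_dict() for m in device_models])

# Every device continues from the same averaged weights.
for m in device_models:
    m.load_state_dict(global_weights)
```

In a real FL round, each device would train locally before sending its weights, and this averaging step would repeat every round.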

You are one participant in an FL system that comprises all participating students in this lab.

This exercise comprises two parts.
* First, we will implement decentralized single-device training, i.e., no collaboration between devices. This will serve as a baseline to compare FL to.
* Second, we will implement actual FL, where all participants collaboratively train a model.

## Part 1: Decentralized single-device training

We start with the data. We randomly distribute the CIFAR-10 dataset among all participants. This is already implemented in the helper functions in `device_data`: `get_client_training_data` divides the dataset into `total_clients` parts and returns the part with the number `client_id`.
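
How such a split might work can be sketched as follows. This is only an illustration under assumptions: `client_indices` is a hypothetical function, and the actual logic in `device_data` may differ (for example in seeding or in how leftover samples are handled). A shared seed makes every participant compute the same shuffle, so the parts do not overlap.

```python
import random


def client_indices(num_samples, client_id, total_clients, seed=0):
    """Return the sample indices assigned to one client (hypothetical helper)."""
    indices = list(range(num_samples))
    # Same seed on every client -> same permutation -> disjoint parts.
    random.Random(seed).shuffle(indices)
    # Client k takes every total_clients-th index, starting at offset k.
    return indices[client_id::total_clients]


# CIFAR-10 has 50,000 training images.
parts = [client_indices(50_000, cid, 7) for cid in range(7)]
print([len(p) for p in parts])
```

The resulting index list could then be passed to `torch.utils.data.Subset` to obtain a client's share of the dataset.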

We also make use of test data. A separate test set is usually not available in FL. However, we still use it here to evaluate the model quality. Of course, the test data must not be used for training, only for testing.

Adjust the following constants to the participants in the lab. Each participant is one client. Make sure that each participant uses a different `CLIENT_ID`.

In [1]:
TOTAL_CLIENTS = 7  # number of participants in the lab
CLIENT_ID = 2  # between 0 and TOTAL_CLIENTS-1

In [2]:
import torch
import device_data

training_dataset = device_data.get_client_training_data(CLIENT_ID, TOTAL_CLIENTS)
train_loader = torch.utils.data.DataLoader(training_dataset, batch_size=16)
test_dataset = device_data.get_test_data()
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=16)

Using downloaded and verified file: data/cifar-10-python.tar.gz
Extracting data/cifar-10-python.tar.gz to data/
Files already downloaded and verified


You are already familiar with training an NN in PyTorch. We want to train a CifarNet with your part of the training data.

**Your Task:**
- Create an instance of `CifarNet` and define the optimizer and the criterion. For the optimizer, use `torch.optim.SGD` (https://pytorch.org/docs/stable/generated/torch.optim.SGD.html); for the loss criterion, use `torch.nn.CrossEntropyLoss`. Use a learning rate of `0.001` and a momentum of `0.9`.
- Implement the function `train` that trains the model. It takes the model (an instance), the optimizer, an optimization criterion, a trainloader, and a device (either `cpu` or `cuda`) as input.
    - First, set the model to training mode and move it to the device. Then, for each batch, move the `inputs` and `targets` to the device.
    - Inside the batch loop, reset the gradients via the optimizer (`.zero_grad()`) and pass the inputs through the model. After that, calculate the loss and call `backward()` on it. Finally, apply the optimizer step.
- Implement the function `test` that tests the model. It takes the model (an instance), a testloader, and the device as input and returns the accuracy.
    - Set the model to evaluation mode. For each batch, count the correctly classified samples and the total number of samples.
    - Return the accuracy (the number of correctly classified samples divided by the total number of samples).
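
The accuracy computation in `test` can be tried out on a toy batch first. The sketch below uses made-up logits for three classes, not CifarNet outputs: `argmax(dim=1)` picks the predicted class per sample, and `.item()` converts the count into a plain Python number.

```python
import torch

# Made-up logits for a batch of 4 samples and 3 classes (illustration only).
outputs = torch.tensor([[2.0, 0.1, 0.3],
                        [0.2, 1.5, 0.1],
                        [0.3, 0.2, 0.9],
                        [1.1, 0.4, 0.2]])
targets = torch.tensor([0, 1, 1, 0])

predictions = outputs.argmax(dim=1)                   # predicted class per sample
num_correct = (predictions == targets).sum().item()  # number of matches
num_samples = targets.size(0)
accuracy = num_correct / num_samples
print(f"{num_correct}/{num_samples} correct, accuracy {accuracy:.2f}")
```

Accumulating `num_correct` and `num_samples` over all batches and dividing once at the end gives the accuracy over the whole test set.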

In [14]:
# this reaches <70% with 4 clients

from models.cifarnet import CifarNet
from tqdm import tqdm
import sys

device = 'cuda' if torch.cuda.is_available() else 'cpu'  # fall back to the CPU if no GPU is available
epochs = 10

# create a model instance and define the loss criterion and the optimizer
#-to-be-done-by-student-------------
model = CifarNet()
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
#-----------------------------------

def train(model, optimizer, criterion, trainloader, device='cpu'):
    #-to-be-done-by-student---------
    model.train()
    model.to(device)
    #-------------------------------
    for inputs, targets in tqdm(trainloader, ncols=80,
                                file=sys.stdout, desc="Training", leave=False):
        
        #-to-be-done-by-student----
        inputs = inputs.to(device)
        targets = targets.to(device)

        optimizer.zero_grad()
        outputs = model(inputs)

        loss = criterion(outputs, targets)
        loss.backward()
        optimizer.step()
        #--------------------------

def test(model, testloader, device='cpu'):
    num_correct = 0
    num_samples = 0

    #-to-be-done-by-student---------
    model.eval()
    model.to(device)
    #-------------------------------
    # no gradients are needed for evaluation; this saves memory and time
    with torch.no_grad():
        for inputs, targets in tqdm(testloader, ncols=80,
                                    file=sys.stdout, desc="Testing", leave=False):
            #-to-be-done-by-student----
            inputs = inputs.to(device)
            targets = targets.to(device)

            outputs = model(inputs)

            # count correct predictions as a Python int instead of a tensor
            num_correct += (outputs.argmax(dim=1) == targets).sum().item()
            num_samples += len(targets)
            #--------------------------

    return num_correct / num_samples

for epoch in range(epochs):
    train(model, optimizer, criterion, train_loader, device=device)
    accuracy = test(model, test_loader, device=device)
    print(f'Epoch {epoch+1:2d}/{epochs:2d}: accuracy ({accuracy:.2f})')

Epoch  1/10: accuracy (0.10)
Epoch  2/10: accuracy (0.10)
Epoch  3/10: accuracy (0.10)
Epoch  4/10: accuracy (0.10)

KeyboardInterrupt: 