# Pretraining
A pretrained model is used to speed up training, which was necessary given the limited time available for the research. The pretrained model is created by training for 200 epochs on the pretrain set. To begin, we create a run folder for the pretrain2 and pretrain8 sets.

In [None]:
import os
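# exist_ok=True lets this cell be re-run without raising FileExistsError
os.makedirs('KITTI/yolo2/clients/run-pretrain2/', exist_ok=True)
os.makedirs('KITTI/yolo8/clients/run-pretrain8/', exist_ok=True)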

In this case we train centralized, i.e. on one device: device 0. Device 0 gets all pretrain samples as labeled samples.

In [None]:
from shutil import copyfile
copyfile('KITTI/yolo2/pretrain.txt', 'KITTI/yolo2/clients/run-pretrain2/l-0.txt')
copyfile('KITTI/yolo8/pretrain.txt', 'KITTI/yolo8/clients/run-pretrain8/l-0.txt')

Start the training on each of the pretrain sets for 200 epochs.

In [None]:
!python research.py --action train --epochs 200 --pretrain --client 0 --classes 2 --round 0 --test --name pretrain2

In [None]:
!python research.py --action train --epochs 200 --pretrain --client 0 --classes 8 --round 0 --test --name pretrain8

Then copy the resulting weights to an easily accessible location.

In [None]:
copyfile('runs/run-pretrain2-0-0/weights/last.pt', 'startingpoint.pt')
copyfile('runs/run-pretrain8-0-0/weights/last.pt', 'startingpoint8.pt')

# Experiments
## Introduction
The algorithm consists of multiple steps: an active learning step, a training step and a federated learning step. These are defined in research.py. The steps can be combined in multiple ways, with different parameters, to create the runs described in the paper. Each run has a numbered identifier; a mapping from the numbered identifier to the name in the paper is provided later.
#### Process spawning
In this notebook we assume that a new process is started for every step of the algorithm. This is done because Python does not free memory well enough after a training iteration: committed memory remains high, and sometimes graphics memory is not emptied. Forcing the garbage collector yields only marginal returns. A new process is therefore used for each step, because once the step finishes, all memory used by the process is forcefully reclaimed by the OS.
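
Outside IPython, the same one-process-per-step pattern could be sketched with `subprocess`. The helper names `build_command` and `run_step` are hypothetical and not part of research.py; the flags are the ones used in this notebook.

```python
import subprocess
import sys


def build_command(action, name, classes, **options):
    """Build the argument list for one research.py step.

    Hypothetical helper. Option keys are emitted verbatim as --key, so a
    hyphenated flag has to be passed via dict unpacking.
    Options set to True become bare flags (e.g. reuse_weights=True).
    """
    cmd = [sys.executable, "research.py", "--action", action,
           "--classes", str(classes), "--name", str(name)]
    for key, value in options.items():
        if value is True:
            cmd.append(f"--{key}")
        elif value is not None and value is not False:
            cmd += [f"--{key}", str(value)]
    return cmd


def run_step(action, name, classes, **options):
    """Run one step in a fresh process; the OS reclaims its memory on exit."""
    subprocess.run(build_command(action, name, classes, **options), check=True)


# Example: the command for one training step (built, but not executed here).
example = build_command("train", "example", 2, epochs=20,
                        weights="startingpoint.pt", client=0, round=0)
print(" ".join(example[1:]))
```

With `check=True`, a failing step raises `CalledProcessError`, so a long loop stops at the first crashed step instead of continuing with missing weights.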

Note: this adds overhead, and the datasets are often reinitialized between rounds. If enough memory is available, a simple modification can be made in start_training in research.py that selects the dataloader from a pre-initialized pool of dataloaders based on the device ID. One can then call research.py once per round, or add the necessary loops to research.py. Warning: this can take excessive amounts of memory and is thus prone to crashes. It can also only be used once the datasets for each device are constant, i.e. after the AL iterations have finished. A speedup of 15-20 seconds per device per round can be expected.
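
The pooling idea from the note above can be sketched as a cache keyed by device ID. `DataLoaderPool` and the `make_loader` factory are hypothetical names; the real dataloader construction lives in start_training in research.py and is not shown here, so a stand-in factory is used.

```python
class DataLoaderPool:
    """Cache one dataloader per device ID so later rounds can reuse it."""

    def __init__(self, make_loader):
        # make_loader: callable(device_id) -> dataloader (assumed expensive).
        self._make_loader = make_loader
        self._loaders = {}

    def get(self, device_id):
        # Build the loader on first use only; afterwards return the cached one.
        if device_id not in self._loaders:
            self._loaders[device_id] = self._make_loader(device_id)
        return self._loaders[device_id]

    def invalidate(self, device_id):
        # Drop a cached loader, e.g. if the labelled set of a device changes.
        self._loaders.pop(device_id, None)


# Stand-in factory that records how often a loader is built.
built = []


def fake_loader(device_id):
    built.append(device_id)
    return f"loader-for-device-{device_id}"


pool = DataLoaderPool(fake_loader)
for _round in range(3):
    for device_id in range(2):
        pool.get(device_id)
print(built)  # [0, 1]: each loader is built once, then reused
```

Such a pool only helps inside a single long-running process, so it trades the memory isolation of the process-per-step approach for speed.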

### Initializing Step
To begin the experiments for a run, we initialize the device files. That means creating the necessary directories and adding all files of each device to its unlabeled data pool. This is done as follows:

In [None]:
!python research.py --action init --classes 2 --name example 

### Active learning step
With the folders created, we can begin by running AL with the pretrained starting-point weights. The active learning aggregation method and the number of samples can be set through parameters. The client parameter indicates the device on which the active learning step is executed. The active learning step only uses the data available on that device.

In [None]:
!python research.py --action al --al_method sum --weights startingpoint.pt --client 0 --classes 2 --round 0 --name example 
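
To illustrate what the aggregation methods (sum, avg, max, rnd) could mean, here is a minimal sketch. The uncertainty measure used by research.py is not shown in this notebook, so the sketch assumes each detection in an image has an uncertainty of 1 - confidence. The functions `image_score` and `select_samples` are hypothetical names.

```python
import random


def image_score(uncertainties, method, rng=None):
    """Aggregate per-detection uncertainties into one score per image."""
    if method == "rnd":
        # Random baseline: the score ignores the detections entirely.
        return (rng or random).random()
    if not uncertainties:
        return 0.0
    if method == "sum":
        return sum(uncertainties)
    if method == "avg":
        return sum(uncertainties) / len(uncertainties)
    if method == "max":
        return max(uncertainties)
    raise ValueError(f"unknown method: {method}")


def select_samples(pool, method, n_samples, seed=0):
    """Return the n_samples image names with the highest aggregated score."""
    rng = random.Random(seed)
    ranked = sorted(pool, key=lambda name: image_score(pool[name], method, rng),
                    reverse=True)
    return ranked[:n_samples]


# Assumed uncertainty per detection: 1 - confidence.
pool = {
    "a.png": [0.1, 0.1, 0.1, 0.1],  # many fairly certain detections
    "b.png": [0.3],                 # one uncertain detection
    "c.png": [0.2, 0.05],
}
print(select_samples(pool, "sum", 1))  # ['a.png']
print(select_samples(pool, "max", 1))  # ['b.png']
```

Under this assumption, sum favours images with many detections, while avg and max favour images containing a single very uncertain detection.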

### Training step
Now that files are labelled on the device, we can begin training. The training code is based on the PyTorch YOLOv5 implementation, adapted specifically for this paper. The original code can be found at: https://github.com/ultralytics/yolov5.
In the chained methods, we use the --freeze-backbone flag during training.

In [None]:
!python research.py --action train --epochs 20 --weights startingpoint.pt --client 0 --classes 2 --round 0 --name example 

### FedAvg step
After multiple devices have finished training, we can average the resulting checkpoints with FedAvg to simulate federated learning (pseudo-FL).


In [None]:
!python research.py --action fedavg --classes 2 --round 0 --name example 
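
The averaging idea behind this step can be sketched as follows. This is not the code in research.py: plain Python lists stand in for tensors, and an unweighted mean over clients is assumed. The notebook does not state whether clients are weighted, for example by their number of labelled samples.

```python
def fedavg(state_dicts):
    """Average a list of checkpoints parameter by parameter (unweighted)."""
    if not state_dicts:
        raise ValueError("need at least one checkpoint")
    n = len(state_dicts)
    averaged = {}
    for key in state_dicts[0]:
        # Group the i-th value of this parameter across all clients.
        columns = zip(*(sd[key] for sd in state_dicts))
        averaged[key] = [sum(values) / n for values in columns]
    return averaged


# Toy checkpoints of three clients (hypothetical parameter names).
clients = [
    {"conv.weight": [1.0, 2.0], "conv.bias": [0.0]},
    {"conv.weight": [3.0, 4.0], "conv.bias": [1.0]},
    {"conv.weight": [5.0, 6.0], "conv.bias": [2.0]},
]
print(fedavg(clients))  # {'conv.weight': [3.0, 4.0], 'conv.bias': [1.0]}
```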

### Composing
These steps can now be combined in a for-loop using IPython, e.g. as follows:

In [None]:
# Initialize run
!python research.py --action init --classes 2 --name example1 

In [None]:
# The first round uses startingpoint.pt; all later rounds reuse the FedAvg'ed weights.
for client in range(0, 9):
    !python research.py --action al --al_method sum --weights startingpoint.pt --client {client} --classes 2 --round 0 --name example1 
    !python research.py --action train --epochs 20 --weights startingpoint.pt --client {client} --classes 2 --round 0 --name example1 
!python research.py --action fedavg --classes 2 --round 0 --name example1

In [None]:
# The Active Learning rounds
for curr_round in range(1,22):
    for client in range(0, 9):
        !python research.py --action al --al_method sum --reuse_weights --client {client} --classes 2 --round {curr_round} --name example1 
        !python research.py --action train --epochs 20 --reuse_weights --client {client} --classes 2 --round {curr_round} --name example1 
    !python research.py --action fedavg --classes 2 --round {curr_round} --name example1

In [None]:
# The non-active learning rounds. 
# These are the rounds you can optimize by reusing the dataloaders for each device through a pooled dataloader.
for curr_round in range(22,111):
    for client in range(0, 9):
        !python research.py --action train --epochs 20 --reuse_weights --client {client} --classes 2 --round {curr_round} --name example1 
    !python research.py --action fedavg --classes 2 --round {curr_round} --name example1

# Executing the experiments described in the paper
These are the experiments described in the paper. The names in the paper were given after the runs were completed. The original names for the runs are numbered.

A small side note: many of the experiments here were executed in separate Python notebooks. For the repository these notebooks have been aggregated into one large codebase with flags for each of the runs. This allows reviewers and contributors to understand the code more easily, rather than searching through 39 Python notebooks. It is recommended to run the code below in separate notebooks as well, because the logging output often results in a notebook of hundreds of megabytes, which makes it very slow to load.
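
One way to keep notebooks small is to send the output of each step to a log file instead of the cell output. The sketch below is an illustration only: `run_logged` is a hypothetical helper, and a stand-in command takes the place of a real research.py call.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_logged(cmd, log_path):
    """Run cmd in a new process and append stdout/stderr to log_path."""
    log_path = Path(log_path)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a") as log:
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


# Stand-in command; replace it with a research.py call from the cells below.
log_file = Path(tempfile.gettempdir()) / "research-logs" / "example.log"
code = run_logged([sys.executable, "-c", "print('training output')"], log_file)
print(code)
```

The notebook then only stores the return code, while the full training log stays on disk.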

It can occur that a slightly different AL sample is selected in one run compared to another run. This happens if the confidences are very close and model training produces slightly different weights, which is possible because training with SGD involves randomness. The randomness is seeded; however, it appears that PyTorch internally uses an unseeded random generator. The influence of this should be negligible. The originally selected AL samples and the resulting weights are kept by the author for reproducibility. Due to their size (more than 200GB) they cannot be uploaded to this repository, but they are available via the email address in the paper.
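
For reference, a common way to seed the usual random generators is sketched below. Whether research.py seeds in exactly this way is not shown in this notebook, and `seed_everything` is a hypothetical helper. Even with all seeds set, some GPU operations in PyTorch can remain nondeterministic, which is consistent with the small differences described above.

```python
import random


def seed_everything(seed):
    """Seed the random generators that are available in this environment."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed.
    try:
        import torch
        torch.manual_seed(seed)
        # Safe to call without a GPU; it is silently ignored in that case.
        torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass  # PyTorch not installed; nothing to seed.


seed_everything(0)
first = random.random()
seed_everything(0)
second = random.random()
print(first == second)  # True
```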

## FedAvg runs

### 2-2*-SUM (Run-4)

In [None]:
!python research.py --action init --classes 2 --name 4 

for client in range(0, 9):
    !python research.py --action al --al_method sum --weights startingpoint.pt --client {client} --classes 2 --round 0 --name 4 
    !python research.py --action train --epochs 20 --weights startingpoint.pt --client {client} --classes 2 --round 0 --name 4 
!python research.py --action fedavg --classes 2 --round 0 --name 4

for curr_round in range(1,22):
    for client in range(0, 9):
        weights = f'runs/run-4-{client}-{curr_round -1}/weights/last.pt'
        !python research.py --action al --al_method sum --weights {weights} --client {client} --classes 2 --round {curr_round} --name 4 
        !python research.py --action train --epochs 20 --weights {weights} --client {client} --classes 2 --round {curr_round} --name 4 
!python research.py --action fedavg --classes 2 --round 21 --name 4
    
for curr_round in range(22,111):
    for client in range(0, 9):
        !python research.py --action train --epochs 20 --reuse_weights --client {client} --classes 2 --round {curr_round} --name 4 
    !python research.py --action fedavg --classes 2 --round {curr_round} --name 4

### -> 2-2*-AVG (Run-4a)

In [None]:
!python research.py --action init --classes 2 --name 4a 

for client in range(0, 9):
    !python research.py --action al --al_method avg --weights startingpoint.pt --client {client} --classes 2 --round 0 --name 4a 
    !python research.py --action train --epochs 20 --weights startingpoint.pt --client {client} --classes 2 --round 0 --name 4a 
!python research.py --action fedavg --classes 2 --round 0 --name 4a

for curr_round in range(1,22):
    for client in range(0, 9):
        weights = f'runs/run-4a-{client}-{curr_round -1}/weights/last.pt'
        !python research.py --action al --al_method avg --weights {weights} --client {client} --classes 2 --round {curr_round} --name 4a 
        !python research.py --action train --epochs 20 --weights {weights} --client {client} --classes 2 --round {curr_round} --name 4a 
!python research.py --action fedavg --classes 2 --round 21 --name 4a
    
for curr_round in range(22,201):
    for client in range(0, 9):
        !python research.py --action train --epochs 20 --reuse_weights --client {client} --classes 2 --round {curr_round} --name 4a 
    !python research.py --action fedavg --classes 2 --round {curr_round} --name 4a

### -> 2-2*-MAX (Run-4b)

In [None]:
!python research.py --action init --classes 2 --name 4b

for client in range(0, 9):
    !python research.py --action al --al_method max --weights startingpoint.pt --client {client} --classes 2 --round 0 --name 4b 
    !python research.py --action train --epochs 20 --weights startingpoint.pt --client {client} --classes 2 --round 0 --name 4b 
!python research.py --action fedavg --classes 2 --round 0 --name 4b

for curr_round in range(1,22):
    for client in range(0, 9):
        weights = f'runs/run-4b-{client}-{curr_round -1}/weights/last.pt'
        !python research.py --action al --al_method max --weights {weights} --client {client} --classes 2 --round {curr_round} --name 4b 
        !python research.py --action train --epochs 20 --weights {weights} --client {client} --classes 2 --round {curr_round} --name 4b 
!python research.py --action fedavg --classes 2 --round 21 --name 4b
    
for curr_round in range(22,201):
    for client in range(0, 9):
        !python research.py --action train --epochs 20 --reuse_weights --client {client} --classes 2 --round {curr_round} --name 4b 
    !python research.py --action fedavg --classes 2 --round {curr_round} --name 4b

### 2-2*-RND (Run-5)

In [None]:
!python research.py --action init --classes 2 --name 5 

for client in range(0, 9):
    !python research.py --action al --al_method rnd --weights startingpoint.pt --client {client} --classes 2 --round 0 --name 5 
    !python research.py --action train --epochs 20 --weights startingpoint.pt --client {client} --classes 2 --round 0 --name 5 
!python research.py --action fedavg --classes 2 --round 0 --name 5

for curr_round in range(1,22):
    for client in range(0, 9):
        weights = f'runs/run-5-{client}-{curr_round -1}/weights/last.pt'
        !python research.py --action al --al_method rnd --weights {weights} --client {client} --classes 2 --round {curr_round} --name 5 
        !python research.py --action train --epochs 20 --weights {weights} --client {client} --classes 2 --round {curr_round} --name 5 
!python research.py --action fedavg --classes 2 --round 21 --name 5
    
for curr_round in range(22,111):
    for client in range(0, 9):
        !python research.py --action train --epochs 20 --reuse_weights --client {client} --classes 2 --round {curr_round} --name 5 
    !python research.py --action fedavg --classes 2 --round {curr_round} --name 5

### 2-2-RND (Run-8)

In [None]:
!python research.py --action init --classes 2 --name 8 

for client in range(0, 9):
    !python research.py --action al --al_method rnd --weights startingpoint.pt --client {client} --classes 2 --round 0 --name 8 
    !python research.py --action train --epochs 20 --weights startingpoint.pt --client {client} --classes 2 --round 0 --name 8 
!python research.py --action fedavg --classes 2 --round 0 --name 8

for curr_round in range(1,22):
    for client in range(0, 9):
        !python research.py --action al --al_method rnd --reuse_weights --client {client} --classes 2 --round {curr_round} --name 8 
        !python research.py --action train --epochs 20 --reuse_weights --client {client} --classes 2 --round {curr_round} --name 8 
    !python research.py --action fedavg --classes 2 --round {curr_round} --name 8
    
for curr_round in range(22,111):
    for client in range(0, 9):
        !python research.py --action train --epochs 20 --reuse_weights --client {client} --classes 2 --round {curr_round} --name 8 
    !python research.py --action fedavg --classes 2 --round {curr_round} --name 8

### 2-2-SUM (Run-10)

In [None]:
!python research.py --action init --classes 2 --name 10 

for client in range(0, 9):
    !python research.py --action al --al_method sum --weights startingpoint.pt --client {client} --classes 2 --round 0 --name 10 
    !python research.py --action train --epochs 20 --weights startingpoint.pt --client {client} --classes 2 --round 0 --name 10 
!python research.py --action fedavg --classes 2 --round 0 --name 10

for curr_round in range(1,22):
    for client in range(0, 9):
        !python research.py --action al --al_method sum --reuse_weights --client {client} --classes 2 --round {curr_round} --name 10 
        !python research.py --action train --epochs 20 --reuse_weights --client {client} --classes 2 --round {curr_round} --name 10 
    !python research.py --action fedavg --classes 2 --round {curr_round} --name 10
    
for curr_round in range(22,111):
    for client in range(0, 9):
        !python research.py --action train --epochs 20 --reuse_weights --client {client} --classes 2 --round {curr_round} --name 10 
    !python research.py --action fedavg --classes 2 --round {curr_round} --name 10

### 2-2-AVG (Run-11)

In [None]:
!python research.py --action init --classes 2 --name 11 

for client in range(0, 9):
    !python research.py --action al --al_method avg --weights startingpoint.pt --client {client} --classes 2 --round 0 --name 11 
    !python research.py --action train --epochs 20 --weights startingpoint.pt --client {client} --classes 2 --round 0 --name 11 
!python research.py --action fedavg --classes 2 --round 0 --name 11

for curr_round in range(1,22):
    for client in range(0, 9):
        !python research.py --action al --al_method avg --reuse_weights --client {client} --classes 2 --round {curr_round} --name 11 
        !python research.py --action train --epochs 20 --reuse_weights --client {client} --classes 2 --round {curr_round} --name 11 
    !python research.py --action fedavg --classes 2 --round {curr_round} --name 11
    
for curr_round in range(22,111):
    for client in range(0, 9):
        !python research.py --action train --epochs 20 --reuse_weights --client {client} --classes 2 --round {curr_round} --name 11 
    !python research.py --action fedavg --classes 2 --round {curr_round} --name 11

### 2-2-MAX (Run-12)

In [None]:
!python research.py --action init --classes 2 --name 12 

for client in range(0, 9):
    !python research.py --action al --al_method max --weights startingpoint.pt --client {client} --classes 2 --round 0 --name 12 
    !python research.py --action train --epochs 20 --weights startingpoint.pt --client {client} --classes 2 --round 0 --name 12 
!python research.py --action fedavg --classes 2 --round 0 --name 12

for curr_round in range(1,22):
    for client in range(0, 9):
        !python research.py --action al --al_method max --reuse_weights --client {client} --classes 2 --round {curr_round} --name 12 
        !python research.py --action train --epochs 20 --reuse_weights --client {client} --classes 2 --round {curr_round} --name 12
    !python research.py --action fedavg --classes 2 --round {curr_round} --name 12
    
for curr_round in range(22,111):
    for client in range(0, 9):
        !python research.py --action train --epochs 20 --reuse_weights --client {client} --classes 2 --round {curr_round} --name 12 
    !python research.py --action fedavg --classes 2 --round {curr_round} --name 12

In [None]:
for curr_round in range(111,201):
    for client in range(0, 9):
        !python research.py --action train --epochs 20 --reuse_weights --client {client} --classes 2 --round {curr_round} --name 12 
    !python research.py --action fedavg --classes 2 --round {curr_round} --name 12

In [None]:
curr_round = 122
for client in range(5, 9):
    !python research.py --action train --epochs 20 --reuse_weights --client {client} --classes 2 --round {curr_round} --name 12 
!python research.py --action fedavg --classes 2 --round {curr_round} --name 12
for curr_round in range(123,201):
    for client in range(0, 9):
        !python research.py --action train --epochs 20 --reuse_weights --client {client} --classes 2 --round {curr_round} --name 12 
    !python research.py --action fedavg --classes 2 --round {curr_round} --name 12

In [None]:
for curr_round in range(201,300):
    for client in range(0, 9):
        !python research.py --action train --epochs 2 --reuse_weights --client {client} --classes 2 --round {curr_round} --name 12 
    !python research.py --action fedavg --classes 2 --round {curr_round} --name 12

### 8-2-MAX (Run-14)

In [None]:
!python research.py --action init --classes 8 --name 14 

for client in range(0, 9):
    !python research.py --action al --al_method max --weights startingpoint8.pt --client {client} --classes 8 --round 0 --name 14 
    !python research.py --action train --epochs 20 --weights startingpoint8.pt --client {client} --classes 8 --round 0 --name 14 
!python research.py --action fedavg --classes 8 --round 0 --name 14

for curr_round in range(1,22):
    for client in range(0, 9):
        !python research.py --action al --al_method max --reuse_weights --client {client} --classes 8 --round {curr_round} --name 14 
        !python research.py --action train --epochs 20 --reuse_weights --client {client} --classes 8 --round {curr_round} --name 14 
    !python research.py --action fedavg --classes 8 --round {curr_round} --name 14
    
for curr_round in range(22,111):
    for client in range(0, 9):
        !python research.py --action train --epochs 20 --reuse_weights --client {client} --classes 8 --round {curr_round} --name 14 
    !python research.py --action fedavg --classes 8 --round {curr_round} --name 14

### 8-2-RND (Run-16)


In [None]:
!python research.py --action init --classes 8 --name 16 

for client in range(0, 9):
    !python research.py --action al --al_method rnd --weights startingpoint8.pt --client {client} --classes 8 --round 0 --name 16 
    !python research.py --action train --epochs 20 --weights startingpoint8.pt --client {client} --classes 8 --round 0 --name 16 
!python research.py --action fedavg --classes 8 --round 0 --name 16

for curr_round in range(1,22):
    for client in range(0, 9):
        !python research.py --action al --al_method rnd --reuse_weights --client {client} --classes 8 --round {curr_round} --name 16 
        !python research.py --action train --epochs 20 --reuse_weights --client {client} --classes 8 --round {curr_round} --name 16 
    !python research.py --action fedavg --classes 8 --round {curr_round} --name 16
    
for curr_round in range(22,111):
    for client in range(0, 9):
        !python research.py --action train --epochs 20 --reuse_weights --client {client} --classes 8 --round {curr_round} --name 16 
    !python research.py --action fedavg --classes 8 --round {curr_round} --name 16

## The chained runs
### 2-Chained (Run-Chainx)

In [None]:
!python research.py --action aggchain --classes 2 --round 0 --name Chainx # saves the aggregate to pseudo_fl/Chainxagg.pt


In [None]:
!python research.py --action init --classes 2 --name Chainx 

for client in range(0, 9):
    if client == 0:
        weights = 'startingpoint.pt'
    else:
        weights = f'runs/run-Chainx-{client-1}-0'
    !python research.py --action al --al_method sum --al_samples 220 --weights {weights} --client {client} --classes 2 --round 0 --name Chainx 
    !python research.py --action train --epochs 200 --weights {weights} --client {client} --classes 2 --round 0 --test --name Chainx 
!python research.py --action aggchain --classes 2 --round 0 --name Chainx --aggr_metric ap50 
# The aggregate is saved to pseudo_fl/Chainxagg.pt

for curr_round in range(1,21):
    for client in range(0, 9):
        if client == 0 and curr_round == 1:
            weights = 'pseudo_fl/Chainxagg.pt'
        elif client == 0:
            weights = f'runs/run-Chainx-8-{curr_round -1}'
        else:
            weights = f'runs/run-Chainx-{client -1}-{curr_round}'
        !python research.py --action train --epochs 20 --weights {weights} --freeze-backbone --client {client} --classes 2 --round {curr_round} --name Chainx 
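
The weight hand-off in the chained loops above can be summarized in one function. `chained_weights` is a hypothetical helper, not part of research.py; it only reproduces the paths used in the cells of this notebook.

```python
def chained_weights(name, client, curr_round, start="startingpoint.pt", n_clients=9):
    """Return the weights path a client starts from in a chained run."""
    if curr_round == 0:
        # Initial chain: device 0 starts from the pretrained weights,
        # every other device continues from the previous device.
        if client == 0:
            return start
        return f"runs/run-{name}-{client - 1}-0"
    if client == 0 and curr_round == 1:
        # The first fine-tuning round starts from the aggregated chain model.
        return f"pseudo_fl/{name}agg.pt"
    if client == 0:
        # Later rounds continue from the last device of the previous round.
        return f"runs/run-{name}-{n_clients - 1}-{curr_round - 1}"
    # Within a round, each device continues from the previous device.
    return f"runs/run-{name}-{client - 1}-{curr_round}"


print(chained_weights("Chainx", 0, 1))  # pseudo_fl/Chainxagg.pt
print(chained_weights("Chainx", 0, 5))  # runs/run-Chainx-8-4
```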

### 2-Chained-Fed-Avg (Run-CFed)

In [None]:
!python research.py --action init --classes 2 --name CFed
# The first round starts from pseudo_fl/Chainxagg.pt, shared with Run-Chainx
for client in range(0, 9):
    !python research.py --action train --epochs 20 --weights pseudo_fl/Chainxagg.pt --freeze-backbone --client {client} --classes 2 --round 1 --name CFed 
!python research.py --action fedavg --classes 2 --round 1 --name CFed

# Rounds after that in regular fedavg fashion with reuse_weights.
for curr_round in range(2,21):
    for client in range(0, 9):
        !python research.py --action train --epochs 20 --reuse_weights --freeze-backbone --client {client} --classes 2 --round {curr_round} --name CFed 
    !python research.py --action fedavg --classes 2 --round {curr_round} --name CFed

### 8-Chained (Run-15x)

In [None]:
!python research.py --action init --classes 8 --name 15x 

for client in range(0, 9):
    if client == 0:
        weights = 'startingpoint8.pt'
    else:
        weights = f'runs/run-15x-{client-1}-0'
    !python research.py --action al --al_method sum --al_samples 220 --weights {weights} --client {client} --classes 8 --round 0 --name 15x 
    !python research.py --action train --epochs 200 --weights {weights} --client {client} --classes 8 --round 0 --test --name 15x 
!python research.py --action aggchain --classes 8 --round 0 --name 15x # saves the aggregate to pseudo_fl/15xagg.pt

for curr_round in range(1,21):
    for client in range(0, 9):
        if client == 0 and curr_round == 1:
            weights = 'pseudo_fl/15xagg.pt'
        elif client == 0:
            weights = f'runs/run-15x-8-{curr_round -1}'
        else:
            weights = f'runs/run-15x-{client -1}-{curr_round}'
        !python research.py --action train --epochs 20 --weights {weights} --freeze-backbone --client {client} --classes 8 --round {curr_round} --name 15x 

## The centralized runs
To create the run files for the centralized baseline, we take everything in the labelled sets of all devices for one run, and store it in one file on one device.

In [None]:
import glob
import os

def create_centralized_run_files(name, nc):
    # Collect the labelled samples of every device in the run.
    total_images = []
    for file in glob.glob(f"KITTI/yolo{nc}/clients/run-{name}/l-*.txt"):
        with open(file, "r") as f:
            for line in f:
                total_images.append(line.rstrip('\n'))
    # Store them as the labelled set of a single device (device 0).
    os.makedirs(f"KITTI/yolo{nc}/clients/run-{name}-CENTRALIZED/", exist_ok=True)
    with open(f"KITTI/yolo{nc}/clients/run-{name}-CENTRALIZED/l-0.txt", "w") as f:
        for line in total_images:
            f.write(line + '\n')

With those files, a model is trained from scratch for 200 epochs. The number of classes must be known for each run, so it is entered manually in a mapping.

In [None]:
run_classes_mapping = {'4': 2, '5': 2, '8': 2, '10': 2, '11': 2, '12': 2, '14': 8, '16': 8, 'Chainx': 2, 'CFed': 2, '15x': 8}
for name, nc in run_classes_mapping.items():
    create_centralized_run_files(name, nc)
    !python research.py --action train --epochs 200 --pretrain --client 0 --classes {nc} --round 0 --test --name {name+"-CENTRALIZED"} 