# Project Lama Into the Wild: Transfer Learning & Domain Similarity metrics
<b>Group Number: 13</b><br>
<b>Name Group Member 1: Léo Brucker</b><br>
<b>u-Kürzel Group Member 1: uhugu</b><br>
<b>Name Group Member 2: Cyril Rudolph</b><br>
<b>u-Kürzel Group Member 2: udjvh</b>
## Project Description:
### Primary Milestone:
Goal: Transfer Learning (Domain Adaptation) in Precision Agriculture.
<br>Task: train a model for classification and evaluate it in different domains.

Understand the problems of federated learning and class imbalance by first applying a model that transfers the “global model” onto small local datasets (defined by the different clients in the EMNIST dataset).

### Secondary Milestone:
Use Federated Transfer Learning on a more complex dataset with two types of class categories (plant type and plant health). One of these class types will be used to simulate the clients (for the federated learning part). For example, we might only have access to pictures of healthy plants but still need to recognize sick plants. We will also analyze the effects of class imbalance (compare arXiv:2109.04094v2) and try to mitigate them.

## Roadmap
- [ ] Implement a Federated Transfer Learning model on the EMNIST dataset. Create/adapt a first model and test it.
- [ ] Implement class imbalance (by manually creating it in the dataset or using existing imbalance from the dataset) and train a model to reduce the effect of this imbalance as much as possible.
- [ ] Data preparation and visualization of the Plant Village dataset. Transfer the EMNIST model to the more complex dataset and build upon it.
- [ ] Evaluation of the results: which methods are valuable for reducing class imbalance?
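
One way to approach the "manually creating it in the dataset" item of the roadmap is per-class subsampling. The following is only a sketch under assumptions: `subsample_by_class` and its `keep_fraction` argument are hypothetical names, the labels are assumed to be available as a plain list of integers, and the chosen fractions are arbitrary illustration values.

```python
import random
from collections import Counter, defaultdict


def subsample_by_class(labels, keep_fraction, seed=0):
    """Return sorted indices of the examples kept after per-class subsampling.

    keep_fraction maps a label to the fraction (0..1) of its examples to keep;
    labels that are not listed are kept completely (assumption of this sketch).
    """
    rng = random.Random(seed)
    indices_by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        indices_by_class[label].append(idx)

    kept = []
    for label, indices in indices_by_class.items():
        n_keep = round(keep_fraction.get(label, 1.0) * len(indices))
        kept.extend(rng.sample(indices, n_keep))
    return sorted(kept)


# Toy example: three balanced classes, then shrink classes 1 and 2.
labels = [0] * 100 + [1] * 100 + [2] * 100
kept = subsample_by_class(labels, {1: 0.5, 2: 0.1})
counts = Counter(labels[i] for i in kept)
print(dict(counts))
```

The returned indices can then be used to select the matching images, so the same helper works regardless of how the pixel data is stored.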

#### Nice links:
- https://www.tensorflow.org/federated/tutorials/federated_learning_for_image_classification


In [3]:
# Import packages

import numpy as np
import matplotlib.pyplot as plt
import time
import os
import tensorflow as tf
import tensorflow_federated as tff
import collections


2024-01-29 14:53:17.775123: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-01-29 14:53:17.961879: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-01-29 14:53:17.961908: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-01-29 14:53:17.961937: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-01-29 14:53:17.970914: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2024-01-29 14:53:17.971840: I tensorflow/core/platform/cpu_feature_guard.cc:182] This Tens

#### Implementation of Multiclass Imbalance Degree (MID)

The paper takes the general approach of LRID and divides it by $LRID_{ext}$, which leaves us with the following equation:

$$
MID = \frac{LRID}{LRID_{ext}} = \sum_{c=1}^{C} \frac{n_c}{N} \log_C\left(\frac{C \, n_c}{N}\right)
$$
<br>

with $N$ the total number of data samples, $C$ the number of possible classes, and $n_c$ the number of samples with label $c$.
The larger the value (between 0 and 1), the more imbalanced the dataset is.
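
A minimal sketch of the MID computation in plain Python, assuming the per-class sample counts $n_c$ are already available as a list. The function name `multiclass_imbalance_degree` is ours, and treating empty classes as contributing 0 (the convention $0 \cdot \log 0 = 0$) is an assumption of this sketch.

```python
import math


def multiclass_imbalance_degree(class_counts):
    """MID = sum_c (n_c / N) * log_C(C * n_c / N) for a list of class counts."""
    num_classes = len(class_counts)   # C
    total = sum(class_counts)         # N
    if num_classes < 2 or total <= 0:
        raise ValueError("need at least two classes and one sample")

    mid = 0.0
    for n_c in class_counts:
        if n_c > 0:  # assumption: empty classes contribute 0 (0 * log 0 = 0)
            p = n_c / total
            mid += p * math.log(num_classes * p, num_classes)
    return mid


balanced = multiclass_imbalance_degree([100] * 10)          # perfectly balanced
one_class = multiclass_imbalance_degree([1000] + [0] * 9)   # all samples in one class
skewed = multiclass_imbalance_degree([500, 300, 100, 50, 50])
print(balanced, one_class, skewed)
```

A perfectly balanced dataset gives a value of (numerically) 0 and a dataset with all samples in one class gives 1, which matches the range stated above.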

#### Implementation of Weighted Cosine Similarity (WCS)
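The mean cosine similarity (MCS), on which the WCS is based, cannot take the clients' sample sizes into account, which can lead to a biased estimate of the relation between global and local imbalance. The WCS therefore weights each client's cosine similarity by its sample size:
$$
WCS = \sum_{i=1}^{P} \frac{\lVert l_i \rVert_1 \, (L \cdot l_i)}{\lVert L \rVert_1 \lVert L \rVert_2 \lVert l_i \rVert_2} = \frac{1}{\lVert L \rVert_1 \lVert L \rVert_2} \sum_{i=1}^{P} \frac{\lVert l_i \rVert_1}{\lVert l_i \rVert_2} \, L \cdot l_i
$$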
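
A minimal sketch of the WCS formula in plain Python. The symbols are not defined in this notebook, so the sketch assumes that $l_i$ is the label-count vector of client $i$, that $L$ is the global label-count vector obtained as the element-wise sum of all $l_i$, and that $P$ is the number of clients. The function name `weighted_cosine_similarity` is ours.

```python
import math


def weighted_cosine_similarity(client_counts):
    """WCS over a list of per-client label-count vectors (one list per client)."""
    # Assumption: the global vector L is the element-wise sum of the client vectors.
    global_counts = [sum(column) for column in zip(*client_counts)]
    global_l1 = sum(global_counts)
    global_l2 = math.sqrt(sum(v * v for v in global_counts))
    if global_l1 == 0:
        raise ValueError("no samples in any client")

    total = 0.0
    for l_i in client_counts:
        l1 = sum(l_i)                            # client sample size ||l_i||_1
        l2 = math.sqrt(sum(v * v for v in l_i))  # ||l_i||_2
        if l2 == 0:
            continue  # assumption: clients without samples are skipped
        dot = sum(a * b for a, b in zip(global_counts, l_i))
        total += (l1 / l2) * dot
    return total / (global_l1 * global_l2)


# Clients with the same label proportions as the global distribution.
same_shape = weighted_cosine_similarity([[10, 10], [30, 30]])
# Clients that each hold only one class.
disjoint = weighted_cosine_similarity([[10, 0], [0, 10]])
print(same_shape, disjoint)
```

When every client has the same label proportions as the global distribution the value is 1, and it drops as the local distributions diverge from the global one.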

In [4]:
# Load the femnist dataset

emnist_train, emnist_test = tff.simulation.datasets.emnist.load_data(only_digits=True, cache_dir=None)

NUM_CLIENTS = 10 # number of clients (writers) we will work with
BATCH_SIZE = 20


NUM_EPOCHS = 5
SHUFFLE_BUFFER = 100
PREFETCH_BUFFER = 10

def preprocess(dataset):

  def batch_format_fn(element):
    """Flatten a batch `pixels` and return the features as an `OrderedDict` (x,y)."""
    return collections.OrderedDict(
        x=tf.reshape(element['pixels'], [-1, 784]),
        y=tf.reshape(element['label'], [-1, 1]))

  return dataset.repeat(NUM_EPOCHS).shuffle(SHUFFLE_BUFFER, seed=1).batch(
      BATCH_SIZE).map(batch_format_fn).prefetch(PREFETCH_BUFFER)


# Create list of multiple given client datasets
def make_federated_data(client_data, client_ids): 
  return [
      preprocess(client_data.create_tf_dataset_for_client(x))
      for x in client_ids
  ]

sample_clients = emnist_train.client_ids[0:NUM_CLIENTS]

federated_train_data = make_federated_data(emnist_train, sample_clients)

print(f'Number of client datasets: {len(federated_train_data)}')
print(f'First dataset: {federated_train_data[0]}')

example_dataset = emnist_train.create_tf_dataset_for_client(emnist_train.client_ids[0])
preprocessed_example_dataset = preprocess(example_dataset)


2024-01-29 14:53:22.863074: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:894] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-01-29 14:53:22.865891: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2211] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...


Number of client datasets: 10
First dataset: <_PrefetchDataset element_spec=OrderedDict([('x', TensorSpec(shape=(None, 784), dtype=tf.float32, name=None)), ('y', TensorSpec(shape=(None, 1), dtype=tf.int32, name=None))])>


In [5]:
## Visualize

visualize = False

if visualize:
  # Number of examples per label for a sample of clients -> We can see some class imbalance is present
  f = plt.figure(figsize=(12, 7))
  f.suptitle('Label Counts for a Sample of Clients')
  for i in range(6):
    client_dataset = emnist_train.create_tf_dataset_for_client(
        emnist_train.client_ids[i])
    plot_data = collections.defaultdict(list)
    for example in client_dataset:
      label = example['label'].numpy()
      plot_data[label].append(label)
    plt.subplot(2, 3, i+1)
    plt.title('Client {}'.format(i))
    for j in range(10):
      plt.hist(
          plot_data[j],
          density=False,
          bins=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
  _ = plt.show()
  # Each client has different mean images, meaning each client will be nudging
  # the model in their own directions locally.

  for i in range(3):
    client_dataset = emnist_train.create_tf_dataset_for_client(
        emnist_train.client_ids[i])
    plot_data = collections.defaultdict(list)
    for example in client_dataset:
      plot_data[example['label'].numpy()].append(example['pixels'].numpy())
    f = plt.figure(i, figsize=(12, 5))
    f.suptitle("Client #{}'s Mean Image Per Label".format(i))
    for j in range(10):
        mean_img = np.mean(plot_data[j], 0)
        plt.subplot(2, 5, j+1)
        plt.imshow(mean_img.reshape((28, 28)))
        plt.axis('off')


In [6]:
# Creating the model

# Create a tensorflow (keras) model
def create_keras_model():
  initializer = tf.keras.initializers.GlorotNormal(seed=0)
  return tf.keras.models.Sequential([
      tf.keras.layers.Input(shape=(784,)),                          ## Input 784 pixels
      tf.keras.layers.Dense(10, kernel_initializer=initializer),    ## 10 fully-connected neurons
      tf.keras.layers.Softmax(),                                    ## Convert outputs to class probabilities
  ])

def model_fn():
  # We _must_ create a new model here, and _not_ capture it from an external
  # scope. TFF will call this within different graph contexts.
  keras_model = create_keras_model()
  return tff.learning.models.from_keras_model(
      keras_model,
      input_spec=preprocessed_example_dataset.element_spec,
      loss=tf.keras.losses.SparseCategoricalCrossentropy(),
      metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])



In [8]:
## Train!

training_process = tff.learning.algorithms.build_weighted_fed_avg(
    model_fn,   
    client_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=0.02),
    server_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=1.0))

# Uncomment to see the representation of the training process
#print(training_process.initialize.type_signature.formatted_representation())

# Initialize: Construct the server state
train_state = training_process.initialize()

# Next: A single round of federated averaging (push server state to clients -> local updates -> collect and average to new server state)
result = training_process.next(train_state, federated_train_data)
train_state = result.state
train_metrics = result.metrics
print('round  1, metrics={}'.format(train_metrics))


2024-01-29 15:23:51.496993: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:894] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-01-29 15:23:51.497161: I tensorflow/core/grappler/devices.cc:66] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0
2024-01-29 15:23:51.497278: I tensorflow/core/grappler/clusters/single_machine.cc:361] Starting new session
2024-01-29 15:23:51.497746: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:894] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at https://github.com/torvalds/linux/blob/v6.0/Documentation/ABI/testing/sysfs-bus-pci#L344-L355
2024-01-29 15:23:51.497857: W tensorflow/core/common_runtime/gpu/gpu_d

KeyboardInterrupt: 
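
The round structure described in the training cell (push server state to clients, local updates, collect and average) can be illustrated without TFF. The sketch below covers only the averaging step, assuming each client returns a flat weight vector together with its number of local examples. `weighted_federated_average` is a hypothetical helper and not a TFF API; the real `build_weighted_fed_avg` process aggregates model updates and applies the server optimizer, which this toy version does not model.

```python
def weighted_federated_average(client_weights, client_num_examples):
    """Average client weight vectors, weighting each client by its example count."""
    if len(client_weights) != len(client_num_examples) or not client_weights:
        raise ValueError("need one example count per client")
    total_examples = sum(client_num_examples)
    dim = len(client_weights[0])

    averaged = [0.0] * dim
    for weights, num_examples in zip(client_weights, client_num_examples):
        share = num_examples / total_examples  # clients with more data count more
        for k in range(dim):
            averaged[k] += share * weights[k]
    return averaged


# Toy example: two clients with two parameters each (values are made up).
client_weights = [[1.0, 2.0], [3.0, 4.0]]
client_num_examples = [1, 3]
new_global = weighted_federated_average(client_weights, client_num_examples)
print(new_global)
```

Because the second client holds three times as many examples, the averaged parameters lie closer to its values than to those of the first client.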