# Project Lama Into the Wild: Transfer Learning & Domain Similarity Metrics
<b>Group Number: 13</b><br>
<b>Name Group Member 1: Léo Brucker</b><br>
<b>u-Kürzel Group Member 1: uhugu</b><br>
<b>Name Group Member 2: Cyril Rudolph</b><br>
<b>u-Kürzel Group Member 2: udjvh</b>
## Project Description:
### Primary Milestone:
Goal: Transfer Learning (Domain Adaptation) in Precision Agriculture.
<br>Task: Train a classification model and evaluate it in different domains.

Understand the challenges of federated learning and class imbalance by first building a model that transfers the “global model” to small local datasets (one per client, i.e. per writer, in the FEMNIST dataset).

### Secondary Milestone:
Apply federated transfer learning to a more complex dataset with two kinds of class labels (plant type and plant health). One of these label types will be used to simulate the clients for the federated learning part. For example, a client might only have access to pictures of healthy plants, while the model also needs to recognize sick plants. We will also analyze the effects of class imbalance (compare arXiv:2109.04094v2) and try to mitigate them.
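
As a rough illustration of how one label type could define the simulated clients, the sketch below groups sample indices by a single attribute. The function name `partition_by_attribute` and the toy `health` array are hypothetical and not part of the project code; the real dataset layout may call for a different approach.

```python
import numpy as np

def partition_by_attribute(attribute_labels):
    """Group sample indices by the value of one label type (e.g. plant health).

    Returns a dict mapping each distinct attribute value to the array of
    sample indices carrying it; each entry plays the role of one client.
    """
    attribute_labels = np.asarray(attribute_labels)
    return {
        value.item(): np.flatnonzero(attribute_labels == value)
        for value in np.unique(attribute_labels)
    }

# Hypothetical toy data: 0 = healthy, 1 = sick (not taken from a real dataset).
health = np.array([0, 0, 1, 0, 1, 1, 0])
clients = partition_by_attribute(health)
```

Here `clients[0]` would hold the indices of the healthy samples and `clients[1]` those of the sick ones, so a client built from `clients[0]` never sees a sick plant during local training.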

## Roadmap
- [ ] Implement a federated transfer learning model on the FEMNIST dataset. Create or adapt a first model and test it.
- [ ] Introduce class imbalance (by creating it manually or by using imbalance that already exists in the dataset) and train a model that mitigates its effects as much as possible.
- [ ] Prepare and visualize the PlantVillage dataset. Transfer the FEMNIST model to this more complex dataset and build upon it.
- [ ] Evaluate the results: which methods are effective at reducing the impact of class imbalance?
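
The class-imbalance step of the roadmap could be prototyped along the following lines. This is a minimal sketch under our own assumptions: `subsample_class` creates imbalance by keeping only a fraction of one class, and `inverse_frequency_weights` is one common mitigation (weighting classes by inverse frequency), not necessarily the method we will end up using. Both function names and the toy labels are hypothetical.

```python
import numpy as np

def subsample_class(labels, target_class, keep_fraction, seed=0):
    """Return sorted indices that keep only a fraction of one class."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    target_idx = np.flatnonzero(labels == target_class)
    n_keep = int(round(keep_fraction * target_idx.size))
    kept = rng.choice(target_idx, size=n_keep, replace=False)
    others = np.flatnonzero(labels != target_class)
    return np.sort(np.concatenate([others, kept]))

def inverse_frequency_weights(labels, num_classes):
    """Weight each class by total / (num_classes * count); absent classes get 0."""
    counts = np.bincount(np.asarray(labels), minlength=num_classes)
    weights = np.zeros(num_classes, dtype=float)
    present = counts > 0
    weights[present] = counts.sum() / (num_classes * counts[present])
    return weights

# Toy labels (assumed for illustration): class 0 is heavily over-represented.
labels = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 2])
weights = inverse_frequency_weights(labels, num_classes=3)

# Artificially reduce class 0 to half of its samples.
reduced_idx = subsample_class(labels, target_class=0, keep_fraction=0.5)
reduced_labels = labels[reduced_idx]
```

With these weights, rare classes contribute more to a weighted loss, and the weighted class counts sum back to the total number of samples.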


In [1]:
# Import packages

import numpy as np
import matplotlib.pyplot as plt
import time
import os
import tensorflow as tf
import tensorflow_federated as tff




In [2]:
# Load the FEMNIST dataset

femnist_train, femnist_test = tff.simulation.datasets.emnist.load_data(only_digits=True, cache_dir=None)

NUM_CLIENTS = 10 # number of clients (writers) we will work with
BATCH_SIZE = 20

# Preprocessing function (adapted from the TFF tutorial)
def preprocess(dataset):
  def batch_format_fn(element):
    """Flatten a batch of EMNIST data and return a (features, label) tuple."""
    return (tf.reshape(element['pixels'], [-1, 784]),
            tf.reshape(element['label'], [-1, 1]))

  return dataset.batch(BATCH_SIZE).map(batch_format_fn)

# Create dataset of defined number of clients
client_ids = sorted(femnist_train.client_ids)[:NUM_CLIENTS]
federated_train_data = [
  preprocess(femnist_train.create_tf_dataset_for_client(client_id))
  for client_id in client_ids
]
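
Since each client corresponds to one writer, the label distribution may differ between clients. A small helper like the one below could be used to inspect it. This is a sketch: `label_histogram` is a hypothetical name, and it only assumes an iterable of `(features, labels)` batches in the format produced by `preprocess`.

```python
import numpy as np

def label_histogram(batches, num_classes=10):
    """Count how often each class label occurs in an iterable of (features, labels) batches."""
    counts = np.zeros(num_classes, dtype=np.int64)
    for _, labels in batches:
        counts += np.bincount(np.asarray(labels).ravel(), minlength=num_classes)
    return counts

# Toy batches mimicking the preprocessed format: (N, 784) features and (N, 1) labels.
toy_batches = [
    (np.zeros((3, 784), dtype=np.float32), np.array([[1], [1], [7]])),
    (np.zeros((2, 784), dtype=np.float32), np.array([[0], [7]])),
]
toy_counts = label_histogram(toy_batches)
```

For a real client, the batches could be obtained with `federated_train_data[0].as_numpy_iterator()`, which yields NumPy arrays for each element of the `tf.data.Dataset`.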





In [4]:
# Creating the model

# Create a tensorflow (keras) model
def create_keras_model():
  initializer = tf.keras.initializers.GlorotNormal(seed=0)
  return tf.keras.models.Sequential([
      tf.keras.layers.Input(shape=(784,)),                          ## 784 flattened pixels (28x28 image)
      tf.keras.layers.Dense(10, kernel_initializer=initializer),    ## Fully connected layer, one output per digit class
      tf.keras.layers.Softmax(),                                    ## Convert the 10 outputs into class probabilities
  ])

# Wrap the Keras model as a TFF model
def model_fn():
  keras_model = create_keras_model()
  return tff.learning.models.from_keras_model(
      keras_model,
      input_spec=federated_train_data[0].element_spec,
      loss=tf.keras.losses.SparseCategoricalCrossentropy(),
      metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])

# Optional sanity check that TFF is installed correctly:
# tff.federated_computation(lambda: 'Hello, World!')()
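
To make clear what a federated averaging process would do with the models returned by `model_fn`, here is a NumPy-only sketch of the aggregation step: client parameters are averaged, weighted by the number of local examples. This is not TFF's implementation (which also handles optimizers, metrics, and state); `federated_average` and the toy values are assumptions for illustration only.

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Average per-client parameter lists, weighting each client by its example count."""
    sizes = np.asarray(client_sizes, dtype=float)
    fractions = sizes / sizes.sum()
    num_tensors = len(client_weights[0])
    return [
        sum(f * np.asarray(w[i]) for f, w in zip(fractions, client_weights))
        for i in range(num_tensors)
    ]

# Two toy clients with the same shapes as the Dense(10) layer above:
# a (784, 10) kernel and a (10,) bias.
client_a = [np.full((784, 10), 1.0), np.full((10,), 0.0)]
client_b = [np.full((784, 10), 3.0), np.full((10,), 4.0)]

# Client B holds three times as many examples, so it dominates the average.
global_weights = federated_average([client_a, client_b], client_sizes=[10, 30])
```

With these toy values, every kernel entry of the averaged model is 0.25 * 1 + 0.75 * 3 = 2.5 and every bias entry is 0.75 * 4 = 3.0.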

