#### GRU Model Overview
The Gated Recurrent Unit (GRU) is a type of Recurrent Neural Network (RNN) designed to handle sequences of data. It is similar to the Long Short-Term Memory (LSTM) network but uses a simpler architecture with two gates (update and reset) and no separate cell state. GRUs are particularly useful for tasks involving temporal or sequential data because their gated hidden state can carry information across many time steps.

Components of the GRU Model:

GRU Layers:

- Update Gate: Determines how much of the previous hidden state is carried forward and how much is replaced by new information.
- Reset Gate: Controls how much of the previous hidden state is ignored when computing the candidate activation.
- Candidate Activation: Proposes a new hidden state from the current input and the reset-gated previous hidden state.
- Hidden State Calculation: Interpolates between the old hidden state and the candidate activation, weighted by the update gate, to produce the new hidden state.

Encoder:

- Input Layer: Accepts the input sequence data.
- GRU Layers: Process the input sequence through one or more GRU layers. Each layer updates its hidden state from the current input and the previous hidden state, capturing temporal dependencies.

Decoder:

- Dense Layers: After the GRU layers, the output passes through dense layers that transform the encoded features.
- Output Layer: Produces the final prediction from the processed sequence data.
How It Works:

Sequence Processing: GRUs process input one time step at a time, maintaining a hidden state that summarizes the earlier steps. This makes them suitable for tasks such as time series forecasting, language modeling, and sequential pattern recognition.

Gates Mechanism:

- Update Gate: Sets the balance between retaining the previous hidden state and overwriting it with new information.
- Reset Gate: Sets how much of the previous hidden state feeds into the candidate, letting the model drop stale context when the input changes.
- Candidate Activation: Provides a potential new value for the hidden state from the current input and the reset-gated previous state.
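
The gate mechanism described above can be sketched for a single time step in plain NumPy. This is a simplified textbook form, not the exact computation Keras performs: the parameter names (`W_*`, `U_*`, `b_*`), the helper `init_params`, and the toy dimensions are assumptions made for illustration. Here the update gate `z` weights the old state; some references swap the roles of `z` and `1 - z`.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, params):
    """One GRU time step for a single example (no batching)."""
    W_z, U_z, b_z = params["z"]
    W_r, U_r, b_r = params["r"]
    W_h, U_h, b_h = params["h"]

    z = sigmoid(W_z @ x_t + U_z @ h_prev + b_z)             # update gate
    r = sigmoid(W_r @ x_t + U_r @ h_prev + b_r)             # reset gate
    h_cand = np.tanh(W_h @ x_t + U_h @ (r * h_prev) + b_h)  # candidate activation
    return z * h_prev + (1.0 - z) * h_cand                  # new hidden state

def init_params(n_features, n_hidden, seed=0):
    """Hypothetical helper: small random weights and zero biases for each gate."""
    rng = np.random.default_rng(seed)

    def block():
        return (
            rng.normal(scale=0.1, size=(n_hidden, n_features)),
            rng.normal(scale=0.1, size=(n_hidden, n_hidden)),
            np.zeros(n_hidden),
        )

    return {"z": block(), "r": block(), "h": block()}

# Run a toy sequence of 5 time steps with 9 features through the cell
params = init_params(n_features=9, n_hidden=4)
sequence = np.random.default_rng(1).normal(size=(5, 9))
h = np.zeros(4)
for x_t in sequence:
    h = gru_step(x_t, h, params)
print(h)
```

Because each new state is a gated mix of the old state and a `tanh` output, the hidden state stays bounded in (-1, 1) when it starts at zero.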

In [7]:
import numpy as np
import pandas as pd
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, GRU, Dense, Dropout
from tensorflow.keras.optimizers import Adam
from sklearn.utils import shuffle
from sklearn.metrics import classification_report, accuracy_score
from tqdm import tqdm
import time
import tensorflow as tf


def load_and_preprocess_data(file_path):
    """Load the CSV file and drop rows with missing values."""
    data = pd.read_csv(file_path)
    data = data.dropna()
    return data

def prepare_features_and_labels(data):
    """Return the sensor feature matrix X and the label vector y."""
    features = ['TP2', 'DV_pressure', 'Oil_temperature', 'Motor_current', 'DV_eletric', 'Towers', 'LPS', 'Oil_level', 'Caudal_impulses']
    target = 'class'
    X = data[features].values
    y = data[target].values
    return X, y

def balance_and_sample(X, y, sample_fraction=0.4):
    """Sample and balance classes."""
    X, y = shuffle(X, y, random_state=42)
    sample_size = int(sample_fraction * len(X))
    X_sample, y_sample = X[:sample_size], y[:sample_size]
    
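    classes = np.unique(y_sample)
    # NOTE: capping at the largest class count leaves every class unchanged,
    # so this step does not rebalance; cap at the smallest count to undersample.
    max_samples = max(np.sum(y_sample == cls) for cls in classes)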
    
    X_balanced = []
    y_balanced = []
    
    for cls in classes:
        X_cls = X_sample[y_sample == cls]
        y_cls = y_sample[y_sample == cls]
        
        X_balanced.append(X_cls[:max_samples])
        y_balanced.append(y_cls[:max_samples])
    
    X_balanced = np.vstack(X_balanced)
    y_balanced = np.hstack(y_balanced)
    
    return X_balanced, y_balanced

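def preprocess_data(data):
    """Split the data by calendar month into global, client 1, and client 2 subsets."""
    if 'timestamp' not in data.columns or 'class' not in data.columns:
        raise ValueError("Data must contain 'timestamp' and 'class' columns.")
    
    # Work on copies so the caller's DataFrame is not modified and pandas does not
    # warn about assigning to a slice
    data = data.copy()
    data['timestamp'] = pd.to_datetime(data['timestamp'], errors='coerce')
    data = data.dropna(subset=['timestamp']).copy()
    data['month'] = data['timestamp'].dt.to_period('M')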
    months = data['month'].astype(str).unique()
    
    if len(months) < 4:
        raise ValueError("Not enough distinct months to split into global and client data.")
    
    months.sort()
    first_two_months = months[:2]
    last_two_months = months[-2:]
    
    global_data = data[data['month'].astype(str).isin(first_two_months)]
    client1_data = data[data['month'].astype(str) == last_two_months[0]]
    client2_data = data[data['month'].astype(str) == last_two_months[1]]
    
    if global_data.empty or client1_data.empty or client2_data.empty:
        raise ValueError("One or more of the filtered datasets are empty.")
    
    return global_data, client1_data, client2_data
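
The `balance_and_sample` function above slices each class to the size of the largest class, which leaves the class counts as they were. A minimal sketch of one way to actually equalize them is random undersampling to the rarest class. The function name `undersample_to_smallest_class` and the toy arrays are hypothetical and not part of the notebook.

```python
import numpy as np

def undersample_to_smallest_class(X, y, seed=42):
    """Keep an equal number of rows per class, set by the rarest class."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_keep = counts.min()
    # Draw n_keep row indices from each class without replacement
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == cls), size=n_keep, replace=False)
        for cls in classes
    ])
    rng.shuffle(keep)  # avoid returning the rows grouped by class
    return X[keep], y[keep]

# Toy data: 7 rows of class 0 and 3 rows of class 1
X_demo = np.arange(20, dtype=float).reshape(10, 2)
y_demo = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])
X_bal, y_bal = undersample_to_smallest_class(X_demo, y_demo)
print(np.bincount(y_bal))  # equal counts per class
```

Undersampling discards majority-class rows, so it trades data volume for balance. Passing per-class weights through the `class_weight` argument of Keras `fit` is another option that keeps every row.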


In [8]:
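def compress_gradients(gradients, compression_factor=0.1):
    """Quantize gradients to 8-bit integers to reduce communication overhead.

    Note: compression_factor is currently unused, and a later cell redefines
    compress_gradients, which replaces this version.
    """
    compressed_gradients = {}
    for name, grad in gradients.items():
        # Scale to [-127, 127] by the largest magnitude; guard against all-zero gradients
        max_abs = np.max(np.abs(grad))
        scale = max_abs if max_abs > 0 else 1.0
        compressed_gradients[name] = np.round(grad / scale * 127).astype(np.int8)
    return compressed_gradients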
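
For 8-bit quantization to be usable on the receiving side, the scale factor has to travel with the integers so the values can be restored. The following is a minimal sketch of that round trip under the same max-magnitude scaling idea. The names `quantize_int8` and `dequantize_int8` and the random test gradient are illustrative assumptions, not part of the notebook.

```python
import numpy as np

def quantize_int8(grad):
    """Map a float array to int8 plus the scale needed to undo the mapping."""
    max_abs = float(np.max(np.abs(grad)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0  # guard against all-zero input
    q = np.round(grad / scale).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    """Recover an approximation of the original float values."""
    return q.astype(np.float32) * scale

# A made-up gradient tensor with small values, as gradients typically are
rng = np.random.default_rng(0)
grad = rng.normal(scale=1e-3, size=(64, 32)).astype(np.float32)

q, scale = quantize_int8(grad)
restored = dequantize_int8(q, scale)

print("float32 bytes:", grad.nbytes)  # 64 * 32 elements * 4 bytes
print("int8 bytes:   ", q.nbytes)     # 64 * 32 elements * 1 byte
print("max abs error:", float(np.max(np.abs(grad - restored))))
```

The payload shrinks by a factor of four, and the per-element error is bounded by roughly half the scale. Using `nbytes` also reports size in bytes directly, rather than an element count.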


In [9]:
def create_gru_model(input_shape):
    """Create a GRU model."""
    inputs = Input(shape=input_shape)
    x = GRU(64, return_sequences=True)(inputs)
    x = Dropout(0.2)(x)
    x = GRU(32)(x)
    x = Dense(50, activation='relu')(x)
    x = Dropout(0.2)(x)
    outputs = Dense(1, activation='sigmoid')(x)
    
    model = Model(inputs, outputs)
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    
    return model


In [10]:
def measure_communication_time(func, *args, **kwargs):
    """Measure the time taken for a function to execute."""
    start_time = time.time()
    result = func(*args, **kwargs)
    elapsed_time = time.time() - start_time
    return result, elapsed_time

def compress_gradients(gradients):
    """Apply gradient compression (e.g., quantization or sparsification)."""
    compressed_gradients = []
    for grad in gradients:
        # Example of simple quantization: round gradients to the nearest integer.
        # Entries with magnitude below 0.5 round to zero, so this is illustrative only.
        compressed_gradients.append(tf.round(grad))
    return compressed_gradients

def federated_learning(global_data, client1_data, client2_data):
    """Perform federated learning with GRU model, tracking communication time."""
    
    # Prepare global data
    print("Preparing global data...")
    X_global, y_global = prepare_features_and_labels(global_data)
    X_global, y_global = balance_and_sample(X_global, y_global, sample_fraction=0.4)
    
    # Prepare client data
    print("Preparing client data...")
    X_client1, y_client1 = prepare_features_and_labels(client1_data)
    X_client2, y_client2 = prepare_features_and_labels(client2_data)
    X_client1, y_client1 = balance_and_sample(X_client1, y_client1, sample_fraction=0.4)
    X_client2, y_client2 = balance_and_sample(X_client2, y_client2, sample_fraction=0.4)
    
    # Reshape data for GRU: [samples, timesteps, features]. Each row becomes a
    # one-step sequence, so the GRU sees no context from neighbouring rows.
    X_global = X_global[:, np.newaxis, :]
    X_client1 = X_client1[:, np.newaxis, :]
    X_client2 = X_client2[:, np.newaxis, :]
    
    # Create and train global model
    print("Creating and training global model...")
    model = create_gru_model(input_shape=(X_global.shape[1], X_global.shape[2]))
    _, train_time_global = measure_communication_time(model.fit, X_global, y_global, epochs=10, batch_size=32, verbose=2)
    print(f"Time to train global model: {train_time_global:.2f} seconds.")
    
    # Fine-tune model on client data
    client_data = [
        (X_client1, y_client1, "Client 1"),
        (X_client2, y_client2, "Client 2")
    ]
    
    print("Fine-tuning model on client data...")
    for X_client, y_client, client_name in tqdm(client_data, desc="Clients", unit="client"):
        print(f"Fine-tuning on {client_name}...")
        
        # Measure time to fine-tune
        def train_func():
            model.fit(X_client, y_client, epochs=5, batch_size=32, verbose=2)
        
        _, update_time = measure_communication_time(train_func)
        print(f"Time to fine-tune on {client_name}: {update_time:.2f} seconds.")
        
        # Compute gradients on the client data, compress them, and simulate communication
        with tf.GradientTape() as tape:
            # The model ends in a sigmoid, so these are probabilities, not logits
            probabilities = model(X_client, training=True)
            
            # Ensure y_client has the correct shape
            y_client = tf.reshape(y_client, (-1, 1))
            
            # Compute loss
            loss = tf.keras.losses.binary_crossentropy(y_client, probabilities)
        
        gradients = tape.gradient(loss, model.trainable_variables)
        compressed_gradients = compress_gradients(gradients)
        # tf.size counts elements, so the figure printed below is in units of 2**20 elements, not bytes
        print(f"Compressed gradients size for {client_name}: {sum(tf.size(grad).numpy() for grad in compressed_gradients) / (1024 * 1024):.2f} MB")
    
    # Evaluate the updated model on the global data (the same data used for the initial training)
    print("Evaluating the updated global model...")
    y_global_pred = (model.predict(X_global) > 0.5).astype(int)
    print("Model Classification Report:")
    print(classification_report(y_global, y_global_pred))
    print("Model Accuracy Score:", accuracy_score(y_global, y_global_pred))
    
    print("Federated learning completed.")
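
The function above trains one model on the global data and then fine-tunes that same model on each client in turn. For comparison, here is a minimal sketch of the aggregation step used in federated averaging (FedAvg), where each client trains its own copy and the server combines the resulting weights in proportion to client sample counts. The names `federated_average`, `client_weights`, and `client_sizes` and the toy arrays are assumptions for illustration.

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Sample-weighted average of per-client weight lists.

    client_weights: one list of arrays per client, all with matching shapes.
    client_sizes:   number of training samples held by each client.
    """
    total = float(sum(client_sizes))
    n_layers = len(client_weights[0])
    averaged = []
    for layer in range(n_layers):
        layer_avg = sum(
            weights[layer] * (size / total)
            for weights, size in zip(client_weights, client_sizes)
        )
        averaged.append(layer_avg)
    return averaged

# Two toy clients with a single "layer" (kernel and bias) each
client1 = [np.ones((2, 2)), np.zeros(2)]
client2 = [3 * np.ones((2, 2)), np.ones(2)]
global_weights = federated_average([client1, client2], client_sizes=[100, 300])
print(global_weights[0])
print(global_weights[1])
```

With Keras models, weight lists of this form can be read with `model.get_weights()` and written back with `model.set_weights()`.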


In [11]:
file_path = 'Metro-Both-Classes.csv'
data = load_and_preprocess_data(file_path)
global_data, client1_data, client2_data = preprocess_data(data)
federated_learning(global_data, client1_data, client2_data)


Preparing global data...
Preparing client data...
Creating and training global model...
Epoch 1/10
5145/5145 - 19s - 4ms/step - accuracy: 0.9733 - loss: 0.0538
Epoch 2/10
5145/5145 - 13s - 2ms/step - accuracy: 0.9840 - loss: 0.0355
Epoch 3/10
5145/5145 - 12s - 2ms/step - accuracy: 0.9897 - loss: 0.0267
Epoch 4/10
5145/5145 - 12s - 2ms/step - accuracy: 0.9913 - loss: 0.0238
Epoch 5/10
5145/5145 - 12s - 2ms/step - accuracy: 0.9927 - loss: 0.0207
Epoch 6/10
5145/5145 - 12s - 2ms/step - accuracy: 0.9936 - loss: 0.0195
Epoch 7/10
5145/5145 - 13s - 2ms/step - accuracy: 0.9938 - loss: 0.0187
Epoch 8/10
5145/5145 - 12s - 2ms/step - accuracy: 0.9942 - loss: 0.0180
Epoch 9/10
5145/5145 - 12s - 2ms/step - accuracy: 0.9947 - loss: 0.0172
Epoch 10/10
5145/5145 - 12s - 2ms/step - accuracy: 0.9946 - loss: 0.0172
Time to train global model: 127.89 seconds.
Fine-tuning model on client data...


Clients:   0%|          | 0/2 [00:00<?, ?client/s]

Fine-tuning on Client 1...
Epoch 1/5
2707/2707 - 7s - 2ms/step - accuracy: 0.9971 - loss: 0.0106
Epoch 2/5
2707/2707 - 6s - 2ms/step - accuracy: 0.9980 - loss: 0.0062
Epoch 3/5
2707/2707 - 6s - 2ms/step - accuracy: 0.9984 - loss: 0.0059
Epoch 4/5
2707/2707 - 7s - 2ms/step - accuracy: 0.9986 - loss: 0.0052
Epoch 5/5
2707/2707 - 7s - 2ms/step - accuracy: 0.9987 - loss: 0.0046
Time to fine-tune on Client 1: 32.57 seconds.


Clients:  50%|█████     | 1/2 [00:34<00:34, 34.13s/client]

Compressed gradients size for Client 1: 0.02 MB
Fine-tuning on Client 2...
Epoch 1/5
2783/2783 - 7s - 2ms/step - accuracy: 0.9965 - loss: 0.0112
Epoch 2/5
2783/2783 - 7s - 3ms/step - accuracy: 0.9980 - loss: 0.0069
Epoch 3/5
2783/2783 - 8s - 3ms/step - accuracy: 0.9983 - loss: 0.0060
Epoch 4/5
2783/2783 - 7s - 3ms/step - accuracy: 0.9984 - loss: 0.0058
Epoch 5/5
2783/2783 - 8s - 3ms/step - accuracy: 0.9986 - loss: 0.0056
Time to fine-tune on Client 2: 36.96 seconds.


Clients: 100%|██████████| 2/2 [01:11<00:00, 35.92s/client]

Compressed gradients size for Client 2: 0.02 MB
Evaluating the updated global model...
5145/5145 ━━━━━━━━━━━━━━━━━━━━ 10s 2ms/step
Model Classification Report:
              precision    recall  f1-score   support

           0       0.97      1.00      0.99    160227
           1       0.00      0.00      0.00      4386

    accuracy                           0.97    164613
   macro avg       0.49      0.50      0.49    164613
weighted avg       0.95      0.97      0.96    164613

Model Accuracy Score: 0.9733435390886503
Federated learning completed.


#### Stacked Autoencoder (SAE) Model Overview
A Stacked Autoencoder (SAE) is a type of neural network used for unsupervised learning, typically for dimensionality reduction or feature learning. It consists of multiple autoencoders stacked on top of each other, forming an encoder-decoder architecture.

Components of the Stacked Autoencoder Model:

Encoder:

- Input Layer: Accepts the original input data.
- Hidden Layers:
  - Dense Layer 1: Maps the input to the first intermediate representation.
  - Dense Layer 2: Compresses that representation.
  - Dense Layer 3: Compresses it further, ahead of the bottleneck.
- Output of Encoder: The compressed (encoded) representation of the input data.

Bottleneck Layer:

- Dense Layer 4: The smallest layer in the network. It holds the most compact representation of the input and must retain the features needed for reconstruction.

Decoder:

- Hidden Layers:
  - Dense Layer 5: Expands the compressed representation to a larger intermediate space.
  - Dense Layer 6: Expands it further.
  - Dense Layer 7: Expands it to the widest decoder layer.
- Output Layer: Maps the result back to the original input dimension, reconstructing the input from the encoded representation.
How It Works:

Training Objective: The model is trained to reconstruct its input as accurately as possible. The loss is typically mean squared error, or binary cross-entropy when the inputs are scaled to the range [0, 1]; either one measures the difference between the input and the reconstructed output.

Encoder Function: The encoder compresses the input data into a lower-dimensional representation, which captures the key features of the input.

Decoder Function: The decoder then takes this compressed representation and attempts to reconstruct the original input data.

Dimensionality Reduction: By forcing the data through the bottleneck layer, the model learns a lower-dimensional representation that captures the essential features while discarding less important information.
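
Binary cross-entropy is only bounded below by zero when the targets lie in [0, 1]. With a sigmoid output and unscaled targets, the loss can become arbitrarily negative, which can destabilize training. The sketch below shows the effect with NumPy and applies a simple per-column min-max scaling. The helper names and the sample readings are made up for illustration.

```python
import numpy as np

def binary_cross_entropy(target, pred, eps=1e-7):
    """Mean binary cross-entropy; pred is clipped away from 0 and 1."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(np.mean(-(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))))

def min_max_scale(x):
    """Scale each column to [0, 1]; constant columns map to 0."""
    x_min = x.min(axis=0)
    span = x.max(axis=0) - x_min
    span = np.where(span > 0, span, 1.0)  # avoid division by zero
    return (x - x_min) / span

raw = np.array([[8.0, 60.0], [10.0, 75.0], [9.0, 66.0]])  # made-up sensor readings
pred = np.full_like(raw, 0.9)  # a sigmoid output always stays below 1

raw_loss = binary_cross_entropy(raw, pred)
scaled = min_max_scale(raw)
scaled_loss = binary_cross_entropy(scaled, pred)

print("loss on raw targets:   ", raw_loss)     # negative
print("loss on scaled targets:", scaled_loss)  # non-negative
```

In practice the minimum and maximum would be computed on the training data and reused for any other split. A linear output layer with mean squared error is an alternative that does not require inputs in [0, 1].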

In [17]:
import numpy as np
import pandas as pd
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.optimizers import Adam
from sklearn.utils import shuffle
from sklearn.metrics import mean_squared_error
from tqdm import tqdm
import time
import tensorflow as tf

def create_autoencoder_model(input_shape):
    """Create a stacked autoencoder model."""
    inputs = Input(shape=input_shape)
    
    # Encoder
    x = Dense(128, activation='relu')(inputs)
    x = Dense(64, activation='relu')(x)
    x = Dense(32, activation='relu')(x)
    
    # Bottleneck (encoded representation)
    encoded = Dense(16, activation='relu')(x)
    
    # Decoder
    x = Dense(32, activation='relu')(encoded)
    x = Dense(64, activation='relu')(x)
    x = Dense(128, activation='relu')(x)
    
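    # Output layer. The sigmoid activation and the binary cross-entropy loss below
    # both assume the input features are scaled to [0, 1].
    outputs = Dense(input_shape[0], activation='sigmoid')(x)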
    
    autoencoder = Model(inputs, outputs)
    autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
    
    return autoencoder

def compress_gradients(gradients):
    """Apply gradient compression (e.g., quantization or sparsification)."""
    compressed_gradients = []
    for grad in gradients:
        # Example of simple quantization: round gradients to the nearest integer
        compressed_gradients.append(tf.round(grad))
    return compressed_gradients

def measure_communication_time(func, *args, **kwargs):
    """Measure the time taken for a function to execute."""
    start_time = time.time()
    result = func(*args, **kwargs)
    elapsed_time = time.time() - start_time
    return result, elapsed_time

def check_and_handle_nan(arr):
    """Check for NaN values and handle them by replacing with zeros."""
    if np.any(np.isnan(arr)):
        print("Warning: NaN values found. Replacing NaNs with zeros.")
        arr = np.nan_to_num(arr)  # Replace NaNs with zeros
    return arr

def federated_learning(global_data, client1_data, client2_data):
    """Perform federated learning with Stacked Autoencoder model, tracking communication time."""
    
    # Prepare global data
    print("Preparing global data...")
    X_global, y_global = prepare_features_and_labels(global_data)
    X_global, y_global = balance_and_sample(X_global, y_global, sample_fraction=0.4)
    
    # Prepare client data
    print("Preparing client data...")
    X_client1, y_client1 = prepare_features_and_labels(client1_data)
    X_client2, y_client2 = prepare_features_and_labels(client2_data)
    X_client1, y_client1 = balance_and_sample(X_client1, y_client1, sample_fraction=0.4)
    X_client2, y_client2 = balance_and_sample(X_client2, y_client2, sample_fraction=0.4)
    
    # The autoencoder takes 2-D input of shape [samples, features], so no reshape is needed
    
    # Create and train global model
    print("Creating and training global model...")
    model = create_autoencoder_model(input_shape=(X_global.shape[1],))
    _, train_time_global = measure_communication_time(model.fit, X_global, X_global, epochs=10, batch_size=32, verbose=2)
    print(f"Time to train global model: {train_time_global:.2f} seconds.")
    
    # Fine-tune model on client data
    client_data = [
        (X_client1, y_client1, "Client 1"),
        (X_client2, y_client2, "Client 2")
    ]
    
    print("Fine-tuning model on client data...")
    for X_client, y_client, client_name in tqdm(client_data, desc="Clients", unit="client"):
        print(f"Fine-tuning on {client_name}...")
        
        # Measure time to fine-tune
        def train_func():
            model.fit(X_client, X_client, epochs=5, batch_size=32, verbose=2)
        
        _, update_time = measure_communication_time(train_func)
        print(f"Time to fine-tune on {client_name}: {update_time:.2f} seconds.")
        
        # Compute gradients on the client data, compress them, and simulate communication
        with tf.GradientTape() as tape:
            # The model ends in a sigmoid, so this is a reconstruction, not logits
            reconstruction = model(X_client, training=True)
            
            # Compute loss
            loss = tf.keras.losses.binary_crossentropy(X_client, reconstruction)
        
        gradients = tape.gradient(loss, model.trainable_variables)
        compressed_gradients = compress_gradients(gradients)
        # tf.size counts elements, so the figure printed below is in units of 2**20 elements, not bytes
        print(f"Compressed gradients size for {client_name}: {sum(tf.size(grad).numpy() for grad in compressed_gradients) / (1024 * 1024):.2f} MB")
    
    # Evaluate the updated global model
    print("Evaluating the updated global model...")
    X_global_pred = model.predict(X_global)
    
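    # Check and handle NaN values. NaN predictions usually mean training diverged;
    # zero-filling them only lets the metric compute and does not fix the model.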
    X_global = check_and_handle_nan(X_global)
    X_global_pred = check_and_handle_nan(X_global_pred)
    
    # Evaluate reconstruction loss
    reconstruction_loss = mean_squared_error(X_global, X_global_pred)
    print("Reconstruction Loss (Mean Squared Error):", reconstruction_loss)
    
    print("Federated learning completed.")


In [18]:
file_path = 'Metro-Both-Classes.csv'
data = load_and_preprocess_data(file_path)
global_data, client1_data, client2_data = preprocess_data(data)
federated_learning(global_data, client1_data, client2_data)


Preparing global data...
Preparing client data...
Creating and training global model...
Epoch 1/10
5145/5145 - 11s - 2ms/step - loss: -2.7707e+20
Epoch 2/10
5145/5145 - 10s - 2ms/step - loss: nan
Epoch 3/10
5145/5145 - 9s - 2ms/step - loss: nan
Epoch 4/10
5145/5145 - 10s - 2ms/step - loss: nan
Epoch 5/10
5145/5145 - 11s - 2ms/step - loss: nan
Epoch 6/10
5145/5145 - 8s - 2ms/step - loss: nan
Epoch 7/10
5145/5145 - 11s - 2ms/step - loss: nan
Epoch 8/10
5145/5145 - 8s - 1ms/step - loss: nan
Epoch 9/10
5145/5145 - 8s - 1ms/step - loss: nan
Epoch 10/10
5145/5145 - 8s - 1ms/step - loss: nan
Time to train global model: 93.35 seconds.
Fine-tuning model on client data...


Clients:   0%|          | 0/2 [00:00<?, ?client/s]

Fine-tuning on Client 1...
Epoch 1/5
2707/2707 - 4s - 1ms/step - loss: nan
Epoch 2/5
2707/2707 - 4s - 2ms/step - loss: nan
Epoch 3/5
2707/2707 - 4s - 1ms/step - loss: nan
Epoch 4/5
2707/2707 - 4s - 2ms/step - loss: nan
Epoch 5/5
2707/2707 - 4s - 2ms/step - loss: nan
Time to fine-tune on Client 1: 21.00 seconds.


Clients:  50%|█████     | 1/2 [00:21<00:21, 21.22s/client]

Compressed gradients size for Client 1: 0.02 MB
Fine-tuning on Client 2...
Epoch 1/5
2783/2783 - 5s - 2ms/step - loss: nan
Epoch 2/5
2783/2783 - 4s - 2ms/step - loss: nan
Epoch 3/5
2783/2783 - 5s - 2ms/step - loss: nan
Epoch 4/5
2783/2783 - 4s - 2ms/step - loss: nan
Epoch 5/5
2783/2783 - 4s - 2ms/step - loss: nan
Time to fine-tune on Client 2: 23.14 seconds.
Compressed gradients size for Client 2: 0.02 MB


Clients: 100%|██████████| 2/2 [00:44<00:00, 22.29s/client]

Evaluating the updated global model...
5145/5145 ━━━━━━━━━━━━━━━━━━━━ 6s 1ms/step
Reconstruction Loss (Mean Squared Error): 418.58946532034776
Federated learning completed.
