# Optimizing Offloading and Resource Allocation in Fog Computing using Deep Q-Networks

In the evolving landscape of Internet of Things (IoT) and edge computing, fog computing has emerged as a pivotal technology to complement cloud infrastructure by providing resources closer to the data source. This proximity aims to reduce latency, save bandwidth, and improve the overall efficiency of computational tasks. However, managing the offloading of tasks and allocating resources in a fog computing environment is a complex challenge due to the dynamic nature of IoT devices, the heterogeneity of resources, and varying network conditions.

This project leverages Deep Q-Networks (DQN), a reinforcement learning technique, to optimize both offloading decisions and resource allocation in a fog computing environment. By employing DQN, the system learns to make intelligent decisions that balance computational cost and time efficiency, thereby enhancing the overall performance of the fog infrastructure.

### Key Components and Methodology:

1. **Fog Computing Environment**:
   - Describes the architecture where fog nodes are placed between IoT devices and cloud data centers.
   - Emphasizes the heterogeneous nature of fog nodes in terms of computational power, storage, and network connectivity.

2. **Reinforcement Learning Framework**:
   - Utilizes a DQN agent to learn the optimal policy for offloading and resource allocation.
   - The agent interacts with the environment (fog nodes and IoT devices) to gather experiences, which are stored in a replay buffer.

3. **Replay Buffer**:
   - Used to store past experiences (state, action, reward, next state) to break the correlation between consecutive experiences and ensure stable learning.

4. **Deep Q-Network Model**:
   - A neural network model that approximates the Q-value function, which represents the expected cumulative reward of taking an action in a given state.
   - The model is trained using the experiences sampled from the replay buffer.

5. **Target Network**:
   - A second neural network that provides stable target values, updated less frequently than the primary network, to stabilize training.

6. **Training Process**:
   - The agent iteratively interacts with the fog environment, making decisions on task offloading and resource allocation.
   - At each step, it updates its knowledge based on the rewards received, aiming to improve the long-term efficiency in terms of cost and computation time.
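
The interplay between the primary network and the target network described above can be written compactly. The notation is introduced here for illustration and does not appear in the original text: $\theta$ denotes the weights of the primary network, $\theta^{-}$ the weights of the target network, $\gamma$ the discount factor, and $(s, a, r, s')$ a transition sampled from the replay buffer. Under those assumptions, the standard DQN regression target and loss are:

$$
y =
\begin{cases}
r & \text{if } s' \text{ is terminal} \\
r + \gamma \max_{a'} Q(s', a'; \theta^{-}) & \text{otherwise}
\end{cases}
\qquad
L(\theta) = \mathbb{E}\left[\left(y - Q(s, a; \theta)\right)^{2}\right]
$$

Only $\theta$ is updated by gradient descent; $\theta^{-}$ is periodically overwritten with a copy of $\theta$, which is what keeps the targets stable between updates.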

### Outcomes and Benefits:

The application of DQN to fog computing environments facilitates the following improvements:

- **Reduced Latency**: By strategically offloading tasks to the most appropriate fog nodes, the system minimizes latency, crucial for real-time applications.
- **Cost Efficiency**: Intelligent resource allocation ensures optimal use of available resources, reducing operational costs.
- **Scalability**: The reinforcement learning approach adapts to changes in the environment, making it scalable and robust to varying workloads and network conditions.
- **Enhanced Performance**: The overall efficiency of the fog computing infrastructure is significantly improved, supporting a higher quality of service (QoS) for end-users.

This project demonstrates the potential of reinforcement learning techniques like DQN in transforming fog computing environments, making them more adaptive, efficient, and capable of meeting the demanding needs of modern IoT applications.


## Deep Q-Learning with Keras
In this section, we implement a Deep Q-Network (DQN) for a reinforcement learning task using Keras. The DQN uses a replay buffer to store experiences and a target network for stable learning.

### Step 1: Import Required Libraries
First, we import all the necessary libraries.

In [None]:
from collections import deque
import numpy as np
import random
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import Adam
from keras.models import load_model

### Step 2: Implement the Replay Buffer
The replay buffer stores experiences and allows the agent to train on random batches of these experiences, which leads to more stable learning.

In [None]:
class ReplayBuffer:
    def __init__(self, max_size, input_shape, n_actions, discrete=False):
        self.mem_size = max_size
        self.mem_cntr = 0
        self.discrete = discrete
        self.state_memory = np.zeros((self.mem_size, input_shape))
        self.new_state_memory = np.zeros((self.mem_size, input_shape))
        dtype = np.int8 if self.discrete else np.float32
        self.action_memory = np.zeros((self.mem_size, n_actions), dtype=dtype)
        self.reward_memory = np.zeros(self.mem_size)
        self.terminal_memory = np.zeros(self.mem_size, dtype=np.float32)

    def store_transition(self, state, action, reward, state_, done):
        index = self.mem_cntr % self.mem_size
        self.state_memory[index] = state
        self.new_state_memory[index] = state_
        if self.discrete:
            actions = np.zeros(self.action_memory.shape[1])
            actions[action] = 1.0
            self.action_memory[index] = actions
        else:
            self.action_memory[index] = action
        self.reward_memory[index] = reward
        self.terminal_memory[index] = 1 - done
        self.mem_cntr += 1

    def sample_buffer(self, batch_size):
        max_mem = min(self.mem_cntr, self.mem_size)
        batch = np.random.choice(max_mem, batch_size)
        states = self.state_memory[batch]
        actions = self.action_memory[batch]
        rewards = self.reward_memory[batch]
        states_ = self.new_state_memory[batch]
        terminal = self.terminal_memory[batch]
        return states, actions, rewards, states_, terminal
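
The circular indexing used by `store_transition` (`index = self.mem_cntr % self.mem_size`) can be illustrated with a standard-library-only sketch. `TinyRingBuffer` is a hypothetical stand-in written for this illustration, not part of the project code; it keeps only the overwrite logic and drops the NumPy arrays.

```python
class TinyRingBuffer:
    """Hypothetical, simplified stand-in for ReplayBuffer (overwrite logic only)."""

    def __init__(self, max_size):
        self.mem_size = max_size
        self.mem_cntr = 0
        self.slots = [None] * max_size

    def store(self, item):
        # Same indexing as ReplayBuffer.store_transition: wrap around when full
        index = self.mem_cntr % self.mem_size
        self.slots[index] = item
        self.mem_cntr += 1

    def filled(self):
        # Same bound as ReplayBuffer.sample_buffer: never sample unused slots
        return min(self.mem_cntr, self.mem_size)


buf = TinyRingBuffer(max_size=3)
for step in range(5):
    buf.store(f"t{step}")

print(buf.slots)     # ['t3', 't4', 't2']
print(buf.filled())  # 3
```

Once the buffer is full, the oldest transitions (`t0`, `t1`) are overwritten first, so the memory footprint stays fixed regardless of how long training runs.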


### Step 3: Implement the DQN Agent
The agent will use two neural networks: one for the current Q-values and one for the target Q-values, which is updated less frequently for stability.

In [None]:
class DQNAgent:
    def __init__(self, state_size, action_size, learning_rate=0.001, discount_factor=0.95, exploration_rate=1.0,
                 exploration_decay=0.995, exploration_min=0.01, batch_size=64, memory_size=2000):
        self.state_size = state_size
        self.action_size = action_size
        self.learning_rate = learning_rate
        self.discount_factor = discount_factor
        self.exploration_rate = exploration_rate
        self.exploration_decay = exploration_decay
        self.exploration_min = exploration_min
        self.batch_size = batch_size
        self.memory = ReplayBuffer(memory_size, state_size, action_size, discrete=True)
        self.model = self._build_model()
        self.target_model = self._build_model()
        self.update_target_model()
        self.target_update_counter = 0

    def _build_model(self):
        model = Sequential()
        model.add(Dense(64, input_dim=self.state_size, activation='relu'))
        model.add(Dropout(0.5))
        model.add(Dense(64, activation='relu'))
        model.add(Dropout(0.5))
        model.add(Dense(32, activation='relu'))
        model.add(Dense(self.action_size, activation='linear'))
        model.compile(optimizer=Adam(learning_rate=self.learning_rate), loss='mse')
        return model

    def update_target_model(self):
        self.target_model.set_weights(self.model.get_weights())

    def remember(self, state, action, reward, next_state, done):
        self.memory.store_transition(state, action, reward, next_state, done)

    def choose_action(self, state):
        if np.random.rand() <= self.exploration_rate:
            return random.randrange(self.action_size)
        state = np.array(state).reshape(1, -1)  # Ensure state is 2D
        q_values = self.model.predict(state, verbose=0)
        return np.argmax(q_values[0])

    def replay(self):
        if self.memory.mem_cntr < self.batch_size:
            return
        # The buffer stores 1 - done, so the last array is 1.0 for non-terminal transitions
        states, actions, rewards, next_states, not_done = self.memory.sample_buffer(self.batch_size)

        targets = self.model.predict(states, verbose=0)
        target_next = self.target_model.predict(next_states, verbose=0)

        for i in range(self.batch_size):
            action_index = np.argmax(actions[i])  # Recover the action index from its one-hot encoding
            # Bootstrap from the target network only when the next state is not terminal
            targets[i, action_index] = rewards[i] + self.discount_factor * np.amax(target_next[i]) * not_done[i]

        self.model.fit(states, targets, epochs=1, verbose=0)

        if self.exploration_rate > self.exploration_min:
            self.exploration_rate *= self.exploration_decay

        # Update the target model every 10 calls to replay()
        self.target_update_counter += 1
        if self.target_update_counter % 10 == 0:
            self.update_target_model()
            self.target_update_counter = 0

    def load_model(self, path):
        self.model = load_model(path)

    def save_model(self, path):
        self.model.save(path)
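
To get a feel for the exploration schedule in `replay`, the following sketch counts how many decay steps it takes for the exploration rate to fall from its default start value to `exploration_min`. The helper `steps_until_min` is hypothetical and exists only for this illustration; it mirrors the `if rate > min: rate *= decay` logic with the agent's default values.

```python
import math


def steps_until_min(rate=1.0, decay=0.995, minimum=0.01):
    """Hypothetical helper: count multiplicative decay steps until rate <= minimum."""
    steps = 0
    while rate > minimum:
        rate *= decay
        steps += 1
    return steps, rate


steps, final_rate = steps_until_min()

# Closed form: smallest n with 1.0 * 0.995**n <= 0.01
closed_form = math.ceil(math.log(0.01 / 1.0) / math.log(0.995))

print(steps, closed_form)  # 919 919
```

With the defaults, the agent keeps exploring noticeably for roughly the first 900 training updates, which is worth keeping in mind when choosing the number of simulation runs.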

## Workflow Parser
This part of the code facilitates the representation and manipulation of workflows composed of tasks. Each task can have dependencies, and the goal is to model these workflows effectively for use in fog computing environments. It can parse the DAX files that describe these workflows and generate ensembles of workflows based on a chosen distribution.

### Main Components

- **Task Class**: Represents an individual task in the workflow.
- **Workflow Class**: Represents a collection of tasks forming a workflow.
- **Parsing DAX Files**: Functionality to parse XML-based DAX files to create workflow objects.
- **Generating Workflow Ensembles**: Creates multiple workflows based on specified distributions.

### Step 1: Import Required Libraries

In [None]:
import xml.etree.ElementTree as ET
from io import StringIO
import os
import re
import numpy as np

### Step 2: Define the Task Class
The Task class represents an individual computational task.

In [None]:
class Task:
    def __init__(self, id, instructions):
        self.id = id
        self.instructions = instructions  # Execution time or computational instructions
        self.children = []  # List of tasks that depend on this task
        self.parents = []  # List of tasks this task depends on
        self.executed = False  # Status of execution
        self.executed_on = None  # Node this task was executed on
        self.execution_time = 0  # Time taken to execute the task
        self.cost = 0  # Cost of executing the task
        self.comm_delay = 0  # Communication delay in seconds


### Step 3: Define the Workflow Class
The Workflow class manages a collection of Task objects and their dependencies.

In [None]:
class Workflow:
    def __init__(self, id):
        self.id = id  # Workflow identifier
        self.tasks = {}  # Dictionary of tasks in the workflow

    def add_task(self, task_id, instructions, parent_ids=None):
        # Create the task only once; later calls just attach dependencies
        if task_id not in self.tasks:
            self.tasks[task_id] = Task(task_id, instructions)
        task = self.tasks[task_id]
        for parent_id in parent_ids or []:
            if parent_id not in self.tasks:
                self.tasks[parent_id] = Task(parent_id, 0)
            parent_task = self.tasks[parent_id]
            parent_task.children.append(task)
            task.parents.append(parent_task)

### Step 4: Parse DAX File (Static Method)
The parse_dax method parses a DAX XML file and constructs a Workflow object.

In [None]:
    @staticmethod
    def parse_dax(file_path):
        tree = ET.parse(file_path)
        root = tree.getroot()

        workflow_id = root.attrib.get('name')
        workflow = Workflow(workflow_id)

        # Parse jobs
        jobs = {job.attrib['id']: job for job in root.findall('{http://pegasus.isi.edu/schema/DAX}job')}

        # Add jobs to workflow
        for job_id, job in jobs.items():
            instructions = float(job.attrib.get('runtime', 0))
            workflow.add_task(job_id, instructions)

        # Parse dependencies
        for child in root.findall('{http://pegasus.isi.edu/schema/DAX}child'):
            child_id = child.attrib['ref']
            parent_ids = [parent.attrib['ref'] for parent in child.findall('{http://pegasus.isi.edu/schema/DAX}parent')]
            workflow.add_task(child_id, 0, parent_ids)  # Adds a child node with its parent nodes, setting instructions to 0 to avoid overwrite

        return workflow
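
To make the namespace-qualified lookups in `parse_dax` concrete, here is a minimal sketch that runs the same `findall` queries against an in-memory document. The XML string is a hypothetical two-job sample written for this illustration (real DAX files carry more attributes and child elements), and `DAX_NS`, `runtimes`, and `parents` are names introduced here.

```python
import xml.etree.ElementTree as ET

# Namespace prefix used by the findall() calls in parse_dax
DAX_NS = "{http://pegasus.isi.edu/schema/DAX}"

# Hypothetical minimal DAX document: one parent job and one dependent job
SAMPLE_DAX = """
<adag xmlns="http://pegasus.isi.edu/schema/DAX" name="demo">
  <job id="ID00000" name="prepare" runtime="2.5"/>
  <job id="ID00001" name="process" runtime="4.0"/>
  <child ref="ID00001">
    <parent ref="ID00000"/>
  </child>
</adag>
"""

root = ET.fromstring(SAMPLE_DAX.strip())

# Job id -> runtime, as read for the `instructions` field
runtimes = {
    job.attrib["id"]: float(job.attrib.get("runtime", 0))
    for job in root.findall(DAX_NS + "job")
}

# Child id -> list of parent ids, as passed to add_task
parents = {
    child.attrib["ref"]: [p.attrib["ref"] for p in child.findall(DAX_NS + "parent")]
    for child in root.findall(DAX_NS + "child")
}

print(root.attrib.get("name"))  # demo
print(runtimes)                 # {'ID00000': 2.5, 'ID00001': 4.0}
print(parents)                  # {'ID00001': ['ID00000']}
```

Without the `{namespace}` prefix, `findall('job')` would return an empty list for such a document, because ElementTree stores every tag with its fully qualified name.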

### Step 5: Generate Ensemble of Workflows
The ensemble_of_workflows method generates a list of workflows based on the specified distribution.

In [None]:
    @staticmethod
    def ensemble_of_workflows(name, size=100, distribution='constant', dax_path=''):
        ensemble = []
        directory_path = dax_path  # Directory containing DAX files

        # List and filter files in directory
        files = os.listdir(directory_path)
        filtered_files = [file for file in files if name in file]

        if distribution == 'constant':
            pattern = r'100(?!\d)'
            for s in filtered_files:
                if re.search(pattern, s):
                    ensemble = [s] * size  # Replicate the matched file 'size' times
                    break
        else:
            numbers = np.random.randint(0, len(filtered_files), size)
            ensemble = [filtered_files[i] for i in numbers]  # Select random files based on uniform distribution

        return ensemble
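
The `constant` branch relies on the pattern `100(?!\d)`, which matches `100` only when it is not followed by another digit. The sketch below applies it to a few hypothetical file names (the names are assumptions made for this example, not taken from the project's data folder).

```python
import re

pattern = r'100(?!\d)'

# Hypothetical DAX file names of different workflow sizes
filtered_files = ["Montage_25.xml", "Montage_100.xml", "Montage_1000.xml"]

matches = [name for name in filtered_files if re.search(pattern, name)]
print(matches)  # ['Montage_100.xml']

# The 'constant' distribution repeats the first matching file `size` times
size = 3
ensemble = [matches[0]] * size
print(ensemble)  # ['Montage_100.xml', 'Montage_100.xml', 'Montage_100.xml']
```

The negative lookahead is what keeps `Montage_1000.xml` out of the result; a plain `'100'` substring test would have accepted it.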

## Loading DAX Files

- **Importing Required Libraries**:
   - `from google.colab import drive`: This imports the `drive` module from the `google.colab` package. The module provides functions to interact with Google Drive, enabling you to mount your Google Drive storage within a Google Colab environment.
   - `import glob`: This imports the `glob` module, which is used for finding all the pathnames matching a specified pattern according to the rules used by the Unix shell.
- **Mounting Google Drive**:
   - `drive.mount('/content/drive')`: This mounts your Google Drive to the `/content/drive` directory within the Google Colab environment. After mounting, all files and folders stored in your Google Drive become accessible as if they were part of the local file system of the Colab environment. You'll need to authorize this step, which usually involves a prompt to connect your Google account and grant the necessary permissions.
- **Specifying the Folder Path**:
   - `folder_path = '/content/drive/My Drive/Zahra/dax/'`: This assigns the directory path `/content/drive/My Drive/Zahra/dax/` to the variable `folder_path`. This path points to a folder named `dax` located inside the `Zahra` directory in your Google Drive's "My Drive" section. You can use this path to read files, write files, or perform other file operations within this folder.

In [4]:
from google.colab import drive
import glob

drive.mount('/content/drive')
folder_path = '/content/drive/My Drive/Zahra/dax/'

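
The `glob` module imported above can be used to list the DAX files in a folder. The sketch below is an illustration under stated assumptions: it uses a temporary directory and hypothetical file names so that it runs outside Colab; in the notebook, the pattern would be built from `folder_path` instead.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Create a few empty placeholder files (hypothetical names)
    for name in ("Montage_25.xml", "Montage_100.xml", "notes.txt"):
        open(os.path.join(folder, name), "w").close()

    # Shell-style wildcard: keep only the .xml files
    dax_files = sorted(glob.glob(os.path.join(folder, "*.xml")))
    names = [os.path.basename(path) for path in dax_files]

print(names)  # ['Montage_100.xml', 'Montage_25.xml']
```

Note that `glob.glob` returns paths in arbitrary order, so sorting the result makes runs reproducible.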

## Simulation of a Fog Computing Environment Using Reinforcement Learning Agents
This code snippet implements a comprehensive simulation of a fog computing environment using reinforcement learning agents to optimize task offloading and resource allocation.

### Step 1: Library Imports
* random and numpy: Libraries for random number generation and numerical operations.
* collections: Provides the deque for queue operations and defaultdict for easily creating default dictionary values.
* itertools: Provides product for generating Cartesian products of input iterables, which is useful for hyperparameter tuning.

In [1]:
import random
import numpy as np
from collections import deque, defaultdict
from itertools import product

# Set TensorFlow logging level to suppress detailed logs
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'  # Must be set before TensorFlow is imported
import tensorflow as tf
tf.get_logger().setLevel('ERROR')

### Step 2: Device Class
This class represents a device in the simulation (IoT, Fog, or Server) with attributes such as ID, computational power (mips), and cost per hour. It also includes a task queue and methods to add and retrieve tasks from the queue.

In [None]:
class Device:
    def __init__(self, id, mips, cost_per_hour):
        self.id = id
        self.mips = mips
        self.cost_per_hour = cost_per_hour
        self.queue = deque()

    def add_task_to_queue(self, task):
        self.queue.append(task)

    def get_next_task(self):
        return self.queue.popleft() if self.queue else None


### Step 3: Simulation Class
Initializes the simulation parameters and creates instances of IoT, Fog, and Server devices. It also prepares the reinforcement learning agents and resets the simulation state.

In [None]:
class Simulation:
    def __init__(self, num_iot, num_fog, num_server, learning_rate=0.001, discount_factor=0.95,
                 exploration_rate=1.0, exploration_decay=0.995, exploration_min=0.01, batch_size=64, memory_size=2000):
        self.num_iot = num_iot
        self.num_fog = num_fog
        self.num_server = num_server
        self.learning_rate = learning_rate
        self.discount_factor = discount_factor
        self.exploration_rate = exploration_rate
        self.exploration_decay = exploration_decay
        self.exploration_min = exploration_min
        self.batch_size = batch_size
        self.memory_size = memory_size
        self.reset()

Resets the simulation state, including creating new instances of devices, initializing total delay and cost, and setting up the reinforcement learning agents for IoT and broker tasks.

In [None]:
    def reset(self):
        # Devices: (id, MIPS, cost per hour)
        self.iot_devices = [Device(f'iot_{i}', 500, 0) for i in range(self.num_iot)]
        self.fog_devices = [Device(f'fog_{i}', 4000, 1) for i in range(self.num_fog)]
        self.server_devices = [Device(f'server_{i}', 6000, 8) for i in range(self.num_server)]
        self.total_delay = 0
        self.total_cost = 0
        self.workflows = []
        self.ready_tasks = defaultdict(deque)
        # Both agents share the same hyperparameters; only the state size differs
        agent_kwargs = dict(learning_rate=self.learning_rate, discount_factor=self.discount_factor,
                            exploration_rate=self.exploration_rate, exploration_decay=self.exploration_decay,
                            exploration_min=self.exploration_min, batch_size=self.batch_size,
                            memory_size=self.memory_size)
        self.iot_agent = DQNAgent(state_size=3, action_size=2, **agent_kwargs)
        self.broker_agent = DQNAgent(state_size=4, action_size=2, **agent_kwargs)


### Step 4: Adding Workflows
This method adds workflows to the simulation and assigns initial tasks to IoT devices.

In [None]:
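    def add_workflow(self, workflow):
        # `workflow` is an iterable of Workflow objects
        new_workflows = list(workflow)
        self.workflows.extend(new_workflows)
        # Queue only the newly added workflows, so repeated calls do not re-queue earlier ones
        for wf in new_workflows:
            iot_device = random.choice(self.iot_devices)
            for task_id, task in wf.tasks.items():
                if not task.parents:  # Root tasks have no dependencies and can start immediately
                    iot_device.add_task_to_queue(task)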


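### Step 5: Task Execution
Executes a given task on a specified device and updates the running totals. For zero-cost (IoT) devices, the execution time plus the communication delay is added to the total delay. For fog and server devices, the execution cost is added to the total cost and only the communication delay is added to the total delay. The division by 1000 converts the communication delay (presumably given in milliseconds) to seconds.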

In [None]:
    def execute_task(self, task, device, comm_delay):
        execution_time = task.instructions / device.mips / 1e6
        task.execution_time = execution_time + comm_delay / 1000
        task.comm_delay = comm_delay / 1000
        if device.cost_per_hour == 0:
            delay = task.execution_time
            self.total_delay += delay
            task.cost = 0
        else:
            cost = execution_time * device.cost_per_hour
            self.total_cost += cost
            task.cost = cost
            self.total_delay += comm_delay / 1000
        task.executed_on = device.id
        task.executed = True
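
The accounting in `execute_task` can be checked by hand with a small standalone sketch. `account_task` is a hypothetical helper written for this illustration that mirrors the formulas above without touching any simulation state; the instruction count and the 50 ms delay are made-up inputs, while the MIPS and cost values are those used for the IoT and fog devices in `reset`. Like the original, it multiplies the execution time by `cost_per_hour` without any unit conversion.

```python
def account_task(instructions, mips, cost_per_hour, comm_delay_ms):
    """Hypothetical helper: return (delay_added, cost_added) as in execute_task."""
    execution_time = instructions / mips / 1e6
    comm_delay = comm_delay_ms / 1000  # milliseconds -> seconds
    if cost_per_hour == 0:
        # Free (IoT) device: execution time and communication delay count as delay
        return execution_time + comm_delay, 0.0
    # Paid (fog/server) device: execution is billed, only communication counts as delay
    return comm_delay, execution_time * cost_per_hour


# Made-up task of 4e9 instructions with a 50 ms communication delay
iot_delay, iot_cost = account_task(4e9, mips=500, cost_per_hour=0, comm_delay_ms=50)
fog_delay, fog_cost = account_task(4e9, mips=4000, cost_per_hour=1, comm_delay_ms=50)

print(round(iot_delay, 2), iot_cost)  # 8.05 0.0
print(round(fog_delay, 2), fog_cost)  # 0.05 1.0
```

The example makes the trade-off visible: running locally costs nothing but adds 8.05 to the total delay, while offloading to a fog node adds only the 0.05 communication delay at a cost of 1.0.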


### Step 6: Simulation Loop
This is the core simulation loop, iterating over all devices and their task queues, and using the reinforcement learning agents to decide on task execution policies.

In [None]:
    def simulate(self):
        all_devices = self.iot_devices + self.fog_devices + self.server_devices
        # Keep going until every device queue is empty
        while any(device.queue for device in all_devices):
            for device in all_devices:
                if not device.queue:
                    continue

                task = device.get_next_task()
                if not task:
                    continue

                pending_tasks = len(device.queue)
                state = np.array([self.total_cost, self.total_delay, pending_tasks]).reshape(1, -1)

                if device in self.iot_devices:
                    # IoT devices use the 3-feature state
                    action = self.iot_agent.choose_action(state)
                else:
                    # Fog and server devices use the broker agent's 4-feature state
                    broker_state = np.array([self.total_cost, self.total_delay, 0, pending_tasks]).reshape(1, -1)
                    action = self.broker_agent.choose_action(broker_state)

                done = False

                # More code snippets for task execution


### Step 7: Hyperparameter Tuning
This function performs hyperparameter tuning by running one simulation per combination of parameters and keeping the combination with the lowest mean delay, using the mean cost to break ties. The agents of the best combination found so far are saved to disk.

In [None]:
def hyperparameter_tuning(num_runs=100):
    # Candidate values for each hyperparameter; add entries to widen the grid
    learning_rates = [0.0001]
    discount_factors = [0.99]
    exploration_rates = [0.5]
    exploration_decays = [0.995]
    exploration_mins = [0.05]

    best_mean_delay = float('inf')
    best_mean_cost = float('inf')
    best_params = None

    for lr, df, er, ed, em in product(learning_rates, discount_factors, exploration_rates, exploration_decays, exploration_mins):
        simulation = Simulation(num_iot=10, num_fog=8, num_server=5, learning_rate=lr, discount_factor=df, exploration_rate=er, exploration_decay=ed, exploration_min=em)
        mean_delay, mean_cost = simulation.run_simulation(num_runs=num_runs, dax_path=folder_path)
        #print(f"Params: LR={lr}, DF={df}, ER={er}, ED={ed}, EM={em} -> Mean Delay: {mean_delay:.2f}, Mean Cost: ${mean_cost:.2f}")
        # Prefer the lower mean delay; break ties on the lower mean cost
        if mean_delay < best_mean_delay or (mean_delay == best_mean_delay and mean_cost < best_mean_cost):
            best_mean_delay = mean_delay
            best_mean_cost = mean_cost
            best_params = (lr, df, er, ed, em)

            # Persist the agents of the best configuration found so far
            simulation.iot_agent.save_model('best_iot_agent_model_.keras')
            simulation.broker_agent.save_model('best_broker_agent_model.keras')

    # Report once, after all combinations have been evaluated
    if best_params is not None:
        print(f"Best Params: LR={best_params[0]}, DF={best_params[1]}, ER={best_params[2]}, ED={best_params[3]}, EM={best_params[4]} -> Mean Delay: {best_mean_delay:.2f}, Mean Cost: ${best_mean_cost:.2f}")


In [None]:
# Run hyperparameter tuning
hyperparameter_tuning(num_runs=1)
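
Because each list in `hyperparameter_tuning` currently holds a single value, the grid contains exactly one combination. The sketch below shows how `itertools.product` expands a wider grid and how the "lowest delay, then lowest cost" rule picks a winner. The grid values and the `fake_results` table are hypothetical stand-ins for real simulation runs, chosen so that the tie-break is exercised.

```python
from itertools import product

# Hypothetical wider grid (the notebook uses a single value per list)
learning_rates = [0.001, 0.0001]
discount_factors = [0.95, 0.99]

# Hypothetical (mean_delay, mean_cost) per combination, standing in for run_simulation
fake_results = {
    (0.001, 0.95): (12.0, 3.0),
    (0.001, 0.99): (10.0, 4.0),
    (0.0001, 0.95): (10.0, 2.5),
    (0.0001, 0.99): (11.0, 1.0),
}

combinations = list(product(learning_rates, discount_factors))
print(len(combinations))  # 4

best_delay, best_cost, best_params = float('inf'), float('inf'), None
for lr, df in combinations:
    mean_delay, mean_cost = fake_results[(lr, df)]
    # Same selection rule as hyperparameter_tuning
    if mean_delay < best_delay or (mean_delay == best_delay and mean_cost < best_cost):
        best_delay, best_cost, best_params = mean_delay, mean_cost, (lr, df)

print(best_params, best_delay, best_cost)  # (0.0001, 0.95) 10.0 2.5
```

The number of combinations grows multiplicatively with each added value, so with five hyperparameters even two values per list already means 32 full simulations.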