# Monitoring EdgeSimPy Simulations

In-depth data analysis is vital in simulation-based research. For this reason, EdgeSimPy incorporates a monitoring mechanism that collects a large amount of information about the simulated entities at each time step, enabling a deep understanding of the phenomena that occurred during the simulation.

Rather than using traditional formats for storing monitoring data, EdgeSimPy uses [MessagePack](https://msgpack.org/), an efficient serialization format. In a nutshell, MessagePack is like JSON but faster and smaller. While MessagePack raw files are binary, we can convert them to Python dictionaries and Pandas data frames with a simple command. By adopting MessagePack for storing logs, EdgeSimPy can collect a large amount of data without sacrificing the simulation performance or consuming computational resources excessively.
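As a rough illustration of the "like JSON but smaller" idea, the sketch below round-trips two records through the `msgpack` Python package and compares the encoded size against JSON. The records are hand-written for this example and only mimic the shape of EdgeSimPy's logs; the actual savings depend on the data being stored.

```python
import json

import msgpack

# Hand-written records that only mimic the shape of EdgeSimPy logs (assumption)
records = [
    {"Object": "EdgeServer_1", "Time Step": 0, "Coordinates": [0, 0], "CPU Demand": 0},
    {"Object": "EdgeServer_1", "Time Step": 1, "Coordinates": [0, 0], "CPU Demand": 6},
]

# Serializing the records to MessagePack (binary) and to JSON (text)
packed = msgpack.packb(records)
json_bytes = json.dumps(records).encode("utf-8")

# Converting the binary payload back to regular Python lists and dictionaries
restored = msgpack.unpackb(packed, raw=False)

print(f"MessagePack: {len(packed)} bytes | JSON: {len(json_bytes)} bytes")
print(restored == records)
```

Since `restored` is a list of dictionaries, passing it to `pd.DataFrame()` would produce one row per record.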

This notebook gives an overview of how we can retrieve monitored data in EdgeSimPy. In addition, it shows how to instruct EdgeSimPy to collect custom metrics.

## Running the Simulation

As the primary goal of this notebook is detailing EdgeSimPy monitoring, we will not dive into how to configure a simulation on EdgeSimPy. Instead, we will use a simple scenario described in [this notebook](https://github.com/EdgeSimPy/edgesimpy-tutorials/blob/master/notebooks/creating-placement-algorithm.ipynb).

Rather than saving logs to disk at every time step, EdgeSimPy dumps monitoring data to disk at fixed intervals. We can configure this interval through the `dump_interval` parameter, which is set when creating an instance of the `Simulator` class. For example, if we set `dump_interval=10`, EdgeSimPy will store the logs on disk every ten simulation time steps.

By default, simulation logs are stored in the `logs` directory. We don't need to create this directory, as EdgeSimPy does that automatically. If we don't want to save log files on disk, we can set `dump_interval=float("inf")`.

Let's go ahead and set up the simulation, instructing EdgeSimPy to dump log data to disk every five simulation time steps.

In [1]:
try:
    # Importing EdgeSimPy components
    from edge_sim_py import *
    import networkx as nx
    import msgpack

    # Importing Matplotlib, Pandas, and NumPy for logs parsing and visualization
    import matplotlib.pyplot as plt
    import pandas as pd
    import numpy as np

except ModuleNotFoundError:
    # Installing EdgeSimPy from GitHub (the "-q" flag suppresses pip's output; remove it to see the full logs)
    %pip install -q git+https://github.com/EdgeSimPy/EdgeSimPy.git

    # Installing Pandas, NumPy, and Matplotlib, which are useful for log parsing and visualization
    %pip install -q pandas
    %pip install -q numpy
    %pip install -q matplotlib

    # Importing EdgeSimPy components and its built-in libraries (NetworkX and MessagePack)
    from edge_sim_py import *
    import networkx as nx
    import msgpack

    # Importing Matplotlib, Pandas, and NumPy for logs parsing and visualization
    import matplotlib.pyplot as plt
    import pandas as pd
    import numpy as np

# Importing Python's default modules
import os
import random

In [2]:
def my_algorithm(parameters):
    # We can always call the 'all()' method to get a list with all created instances of a given class
    for service in Service.all():
        # We only provision services that are not hosted yet and are not already being provisioned
        if service.server is None and not service.being_provisioned:

            # Let's iterate over the list of edge servers to find a suitable host for our service
            for edge_server in EdgeServer.all():

                # We must check if the edge server has enough resources to host the service
                if edge_server.has_capacity_to_host(service=service):

                    # Start provisioning the service in the edge server
                    service.provision(target_server=edge_server)

                    # Once provisioning has started, we can move on to the next service
                    break


def stopping_criterion(model: object):
    # Defining a variable that will help us to count the number of services successfully provisioned within the infrastructure
    provisioned_services = 0
    
    # Iterating over the list of services to count the number of services provisioned within the infrastructure
    for service in Service.all():

        # Initially, services are not hosted by any server (i.e., their "server" attribute is None).
        # Once that value changes, we know that it has been successfully provisioned inside an edge server.
        if service.server is not None:
            provisioned_services += 1
    
    # As EdgeSimPy halts the simulation whenever this function returns True, its output is a boolean expression
    # that checks whether the number of provisioned services equals the number of services spawned in our simulation
    return provisioned_services == Service.count()

In [3]:
# Creating a Simulator object
simulator = Simulator(
    dump_interval=5,
    tick_duration=1,
    tick_unit="seconds",
    stopping_criterion=stopping_criterion,
    resource_management_algorithm=my_algorithm,
)

# Loading a sample dataset from GitHub
simulator.initialize(input_file="https://raw.githubusercontent.com/EdgeSimPy/edgesimpy-tutorials/master/datasets/sample_dataset2.json")

# Executing the simulation
simulator.run_model()

## Checking Logs

There are two ways we can access the simulation logs generated by EdgeSimPy:
- **Option 1:** Accessing the variables that store the logs directly.
- **Option 2:** Accessing log files stored on disk.

> Please note that whenever EdgeSimPy dumps simulation logs to disk, it resets the variables that stored that data to avoid excessive memory usage. If you want to access the log variables directly, do not forget to set `dump_interval=float("inf")`.
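To see why the note above matters, here is a toy model of a periodic dump. `BufferedLog` is a hypothetical stand-in written for this example, not EdgeSimPy's implementation, and the exact time steps at which EdgeSimPy dumps may differ. It only illustrates that a finite interval empties the in-memory buffer, while an infinite interval keeps everything in memory.

```python
class BufferedLog:
    """Hypothetical stand-in for a monitor that flushes its buffer periodically."""

    def __init__(self, dump_interval):
        self.dump_interval = dump_interval
        self.buffer = []  # Records kept in memory since the last dump
        self.dumped = []  # Stands in for what would be written to disk
        self.steps_since_dump = 0

    def record(self, step):
        self.buffer.append({"Time Step": step})
        self.steps_since_dump += 1

        # Flushing and resetting the buffer once the interval is reached
        # (never true when dump_interval is float("inf"))
        if self.steps_since_dump >= self.dump_interval:
            self.dumped.extend(self.buffer)
            self.buffer = []
            self.steps_since_dump = 0


periodic = BufferedLog(dump_interval=5)
in_memory = BufferedLog(dump_interval=float("inf"))

for step in range(12):
    periodic.record(step)
    in_memory.record(step)

print(len(periodic.dumped), len(periodic.buffer))    # 10 2
print(len(in_memory.dumped), len(in_memory.buffer))  # 0 12
```

With `dump_interval=5`, only the records collected after the last dump remain in memory, so reading the buffer directly would miss most of the run.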

### Option 1 (Accessing Variables Directly)

We can access the simulation logs through the `agent_metrics` attribute of our instance of the `Simulator` class. As this attribute stores logs of all entities in the simulation, let's get only the user logs.

In [4]:
simulator.agent_metrics["User"]

[{'Object': 'User_1',
  'Time Step': 0,
  'Instance ID': 1,
  'Coordinates': [6, 0],
  'Base Station': 'BaseStation_4 ([6, 0])',
  'Delays': {'1': None},
  'Communication Paths': {},
  'Making Requests': {'1': {'1': True}},
  'Access History': {'1': [{'start': 1,
     'end': inf,
     'duration': inf,
     'waiting_time': 0,
     'access_time': 0,
     'interval': 0,
     'next_access': inf}]}},
 {'Object': 'User_2',
  'Time Step': 0,
  'Instance ID': 2,
  'Coordinates': [3, 1],
  'Base Station': 'BaseStation_6 ([3, 1])',
  'Delays': {'2': None},
  'Communication Paths': {},
  'Making Requests': {'2': {'1': True}},
  'Access History': {'2': [{'start': 1,
     'end': inf,
     'duration': inf,
     'waiting_time': 0,
     'access_time': 0,
     'interval': 0,
     'next_access': inf}]}},
 {'Object': 'User_3',
  'Time Step': 0,
  'Instance ID': 3,
  'Coordinates': [2, 2],
  'Base Station': 'BaseStation_10 ([2, 2])',
  'Delays': {'3': None},
  'Communication Paths': {},
  'Making Requests
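
Since each entry of `simulator.agent_metrics["User"]` is a plain dictionary, standard Python is enough to slice these logs. The sketch below uses a trimmed, hand-copied subset of the records printed above (only four of their keys) and two helper names, `records_for` and `positions_at`, that are introduced here purely for illustration.

```python
# Trimmed copy of the user records shown above (only a few keys are kept)
user_logs = [
    {"Object": "User_1", "Time Step": 0, "Instance ID": 1, "Coordinates": [6, 0]},
    {"Object": "User_2", "Time Step": 0, "Instance ID": 2, "Coordinates": [3, 1]},
    {"Object": "User_3", "Time Step": 0, "Instance ID": 3, "Coordinates": [2, 2]},
]


def records_for(logs, instance_id):
    """Returns every record that belongs to the entity with the given ID."""
    return [record for record in logs if record["Instance ID"] == instance_id]


def positions_at(logs, time_step):
    """Maps each object name to its coordinates at the given time step."""
    return {
        record["Object"]: tuple(record["Coordinates"])
        for record in logs
        if record["Time Step"] == time_step
    }


print(records_for(user_logs, instance_id=2))
print(positions_at(user_logs, time_step=0))
```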

### Option 2 (Accessing Log Files)

In the cell below, we use the built-in functions of Python's `os` module to find all the MessagePack files created by EdgeSimPy with the simulation logs. Once we know where the MessagePack files are, we can read these files and convert them to Pandas data frames.

In [5]:
# Gathering the list of MessagePack files in the "logs" directory
logs_directory = os.path.join(os.getcwd(), "logs")
dataset_files = [file for file in os.listdir(logs_directory) if file.endswith(".msgpack")]

# Reading each MessagePack file found and converting it to a Pandas data frame
datasets = {}
for file in dataset_files:
    with open(os.path.join(logs_directory, file), "rb") as data_file:
        entity_name = file.replace(".msgpack", "")
        datasets[entity_name] = pd.DataFrame(msgpack.unpackb(data_file.read(), strict_map_key=False))

Now we have all the simulation logs stored in Pandas data frames. To check the format of these files, let's access the edge server logs:

In [6]:
datasets["EdgeServer"]

| | Object | Time Step | Instance ID | Coordinates | Available | CPU | RAM | Disk | CPU Demand | RAM Demand | Disk Demand | Ongoing Migrations | Services | Registries | Layers | Images | Download Queue | Waiting Queue | Max. Concurrent Layer Downloads | Power Consumption |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | EdgeServer_1 | 0 | 1 | [0, 0] | True | 8 | 16384 | 131072 | 0 | 0 | 0 | 0 | [] | [] | [] | [] | [] | [] | 3 | 165.996 |
| 1 | EdgeServer_2 | 0 | 2 | [0, 2] | True | 8 | 16384 | 131072 | 0 | 0 | 0 | 0 | [] | [] | [] | [] | [] | [] | 3 | 165.996 |
| 2 | EdgeServer_3 | 0 | 3 | [6, 0] | True | 8 | 8192 | 131072 | 0 | 0 | 0 | 0 | [] | [] | [] | [] | [] | [] | 3 | 66.9914 |
| 3 | EdgeServer_4 | 0 | 4 | [1, 3] | True | 8 | 8192 | 131072 | 0 | 0 | 0 | 0 | [] | [] | [] | [] | [] | [] | 3 | 66.9914 |
| 4 | EdgeServer_5 | 0 | 5 | [7, 1] | True | 12 | 16384 | 131072 | 1 | 1024 | 1017 | 0 | [] | [1] | [ADD file:5d673d25da3a14ce1f6cf, /bin/sh -c se... | [registry, alpine, nginx, ubuntu, python, redi... | [] | [] | 3 | 74.508333 |
| 5 | EdgeServer_6 | 0 | 6 | [6, 2] | True | 12 | 16384 | 131072 | 0 | 0 | 0 | 0 | [] | [] | [] | [] | [] | [] | 3 | 63.1 |
| 6 | EdgeServer_1 | 1 | 1 | [0, 0] | True | 8 | 16384 | 131072 | 6 | 12288 | 82 | 6 | [] | [] | [ADD file:5d673d25da3a14ce1f6cf] | [] | [ADD file:966d3669b40f5fbaecee1, /bin/sh -c se... | [ADD file:b83df51ab7caf8a4dc35f] | 3 | 240.249 |
| 7 | EdgeServer_2 | 1 | 2 | [0, 2] | True | 8 | 16384 | 131072 | 0 | 0 | 0 | 0 | [] | [] | [] | [] | [] | [] | 3 | 165.996 |
| 8 | EdgeServer_3 | 1 | 3 | [6, 0] | True | 8 | 8192 | 131072 | 0 | 0 | 0 | 0 | [] | [] | [] | [] | [] | [] | 3 | 66.9914 |
| 9 | EdgeServer_4 | 1 | 4 | [1, 3] | True | 8 | 8192 | 131072 | 0 | 0 | 0 | 0 | [] | [] | [] | [] | [] | [] | 3 | 66.9914 |


As EdgeSimPy stores a large amount of data for each entity, we can use the Pandas `filter()` method to retrieve only the information we are interested in.

In [7]:
# Defining the data frame columns that will be exhibited
properties = ['Coordinates', 'CPU Demand', 'RAM Demand', 'Disk Demand', 'Services']
columns = ['Time Step', 'Instance ID'] + properties

dataframe = datasets["EdgeServer"].filter(items=columns)
dataframe

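| | Time Step | Instance ID | Coordinates | CPU Demand | RAM Demand | Disk Demand | Services |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | [0, 0] | 0 | 0 | 0 | [] |
| 1 | 0 | 2 | [0, 2] | 0 | 0 | 0 | [] |
| 2 | 0 | 3 | [6, 0] | 0 | 0 | 0 | [] |
| 3 | 0 | 4 | [1, 3] | 0 | 0 | 0 | [] |
| 4 | 0 | 5 | [7, 1] | 1 | 1024 | 1017 | [] |
| 5 | 0 | 6 | [6, 2] | 0 | 0 | 0 | [] |
| 6 | 1 | 1 | [0, 0] | 6 | 12288 | 82 | [] |
| 7 | 1 | 2 | [0, 2] | 0 | 0 | 0 | [] |
| 8 | 1 | 3 | [6, 0] | 0 | 0 | 0 | [] |
| 9 | 1 | 4 | [1, 3] | 0 | 0 | 0 | [] |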
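
Once the columns are narrowed down, the usual Pandas tools apply. As a small sketch, the block below rebuilds the numeric columns of the ten rows shown above by hand (it does not read the log files), keeps only the servers with some CPU demand, and sums the demand per time step. The totals therefore cover only these ten rows, not the whole simulation.

```python
import pandas as pd

# Numeric columns of the ten rows shown above, copied by hand
rows = [
    (0, 1, 0, 0, 0),
    (0, 2, 0, 0, 0),
    (0, 3, 0, 0, 0),
    (0, 4, 0, 0, 0),
    (0, 5, 1, 1024, 1017),
    (0, 6, 0, 0, 0),
    (1, 1, 6, 12288, 82),
    (1, 2, 0, 0, 0),
    (1, 3, 0, 0, 0),
    (1, 4, 0, 0, 0),
]
columns = ["Time Step", "Instance ID", "CPU Demand", "RAM Demand", "Disk Demand"]
sample = pd.DataFrame(rows, columns=columns)

# Keeping only the rows where the edge server has some CPU demand
busy_servers = sample[sample["CPU Demand"] > 0]

# Summing the demand of all edge servers at each time step
demand_per_step = sample.groupby("Time Step")[["CPU Demand", "RAM Demand", "Disk Demand"]].sum()

print(busy_servers)
print(demand_per_step)
```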


## Monitoring Custom Metrics

Although EdgeSimPy collects a large amount of data from the simulated entities, we may need to monitor custom metrics. We can do that by extending the `collect()` method, which is present in all EdgeSimPy entities.

If our custom metric involves data from multiple entities, we can collect it by extending the `collect()` method of the `Simulator` class. Otherwise, we can customize the `collect()` method of a specific entity.

In this example, let's extend the `collect()` method of the `NetworkSwitch` class, adding a sample metric called `temperature`, which is generated randomly at each simulation time step. More specifically, let's add a new key to the `metrics` dictionary exported by the `collect()` method with our new metric.

> Please note that changing or removing any of the existing entries in the `collect()` method will affect the set of logs collected by EdgeSimPy.

In [8]:
def custom_collect_method(self) -> dict:
    temperature = random.randint(10, 50)  # Generating a random integer between 10 and 50 representing the switch's temperature
    metrics = {
        "Instance ID": self.id,
        "Power Consumption": self.get_power_consumption(),
        "Temperature": temperature,
    }
    return metrics

# Overriding the NetworkSwitch's collect() method
NetworkSwitch.collect = custom_collect_method
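
The cell above replaces `collect()` entirely, so every metric we want to keep must be listed again. An alternative is to wrap the original method and only add the new key. The sketch below shows this pattern on `FakeSwitch`, a hypothetical stand-in class written for this example so that the block runs without EdgeSimPy; its `collect()` output is made up and is not the real output of `NetworkSwitch.collect()`.

```python
import random


class FakeSwitch:
    """Hypothetical stand-in for an entity that exposes a collect() method."""

    def __init__(self, id):
        self.id = id

    def collect(self) -> dict:
        # Made-up metrics used only to demonstrate the wrapping pattern
        return {"Instance ID": self.id, "Power Consumption": 60.6}


# Keeping a reference to the original method before overriding it
original_collect = FakeSwitch.collect


def collect_with_temperature(self) -> dict:
    metrics = original_collect(self)  # Everything the original method exports
    metrics["Temperature"] = random.randint(10, 50)  # Our additional metric
    return metrics


# Overriding the class method with the wrapped version
FakeSwitch.collect = collect_with_temperature

sample_metrics = FakeSwitch(id=1).collect()
print(sample_metrics)
```

With this approach, keys added to the original method later on are kept without further changes, at the cost of depending on the original method returning a mutable dictionary.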

Now that we've extended the `collect()` method to get the temperature of the network switches at each simulation time step, let's create a new simulation and check the logs.

In [9]:
# Creating a Simulator object
simulator = Simulator(
    dump_interval=5,
    tick_duration=1,
    tick_unit="seconds",
    stopping_criterion=stopping_criterion,
    resource_management_algorithm=my_algorithm,
)

# Loading a sample dataset from GitHub
simulator.initialize(input_file="https://raw.githubusercontent.com/EdgeSimPy/edgesimpy-tutorials/master/datasets/sample_dataset2.json")

# Executing the simulation
simulator.run_model()

# Creating a Pandas data frame with the network switch logs
logs = pd.DataFrame(simulator.agent_metrics["NetworkSwitch"])
logs

| | Object | Time Step | Instance ID | Power Consumption | Temperature |
|---|---|---|---|---|---|
| 0 | NetworkSwitch_1 | 0 | 1 | 60.6 | 43 |
| 1 | NetworkSwitch_2 | 0 | 2 | 61.2 | 25 |
| 2 | NetworkSwitch_3 | 0 | 3 | 61.2 | 44 |
| 3 | NetworkSwitch_4 | 0 | 4 | 60.9 | 37 |
| 4 | NetworkSwitch_5 | 0 | 5 | 61.5 | 19 |
| ... | ... | ... | ... | ... | ... |
| 139 | NetworkSwitch_12 | 8 | 12 | 61.5 | 34 |
| 140 | NetworkSwitch_13 | 8 | 13 | 60.9 | 28 |
| 141 | NetworkSwitch_14 | 8 | 14 | 61.2 | 36 |
| 142 | NetworkSwitch_15 | 8 | 15 | 61.2 | 15 |
