# Pipeline
This notebook describes the usage and implementation of the Python scripts used by the pipeline. Each section walks through the main function of one script to illustrate how it works.

In [1]:
RES_DIR="../proj/simulations/results/"
CONFIG_NAME="NonFrequentUpdates-#0"

These are the results used throughout this demonstration.

## Importing Data from OMNeT++
Importing results from their `sca` and `vec` formats (straight from a simulation run) is handled by the `parseData` script.

First, the script converts the data to CSV using `scavetool` and opens the resulting CSV files as dataframes.

In [2]:
from parseData import convertToCsv, filterMetrics, filterNans, openDatasets, saveCsv, convertValsToList
convertToCsv(RES_DIR, CONFIG_NAME)

sca, vec = openDatasets(CONFIG_NAME) # This opens the files and parses some values into a workable format
# Remove entries that don't have data
sca, vec = filterNans(sca, vec)
sca, vec = convertValsToList(sca, vec)

Exported 935 scalars, 2625 parameters, 2 statistics, 70 histograms
Exported 3112 vectors
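
As a rough illustration of the parsing step, the sketch below turns vector columns stored as text into Python lists. It assumes the exported CSV keeps `vectime` and `vecvalue` as space-separated numbers in a single string cell; `parse_number_list` is a hypothetical helper and is not taken from `parseData`, so the real `convertValsToList` may work differently.

```python
import pandas as pd

def parse_number_list(cell):
    """Turn a string such as "0.1 0.2 0.3" into [0.1, 0.2, 0.3].

    Hypothetical helper: assumes space-separated numbers in one cell.
    Non-string cells (e.g. NaN) are returned unchanged.
    """
    if not isinstance(cell, str):
        return cell
    return [float(token) for token in cell.split()]

# Tiny stand-in for a vec dataframe as read from CSV.
raw = pd.DataFrame({"vectime": ["0.1 0.2 0.3"], "vecvalue": ["0 64 166"]})

parsed = raw.copy()
for col in ("vectime", "vecvalue"):
    parsed[col] = parsed[col].apply(parse_number_list)
```

After this step each cell holds a list of floats, which is the shape the later statistics functions operate on.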


### Metrics
Then, the script filters the data, keeping only the attributes that we deem relevant and want to analyze later.
Below are the metrics we chose to keep:
+ link layer throughput => eth mac txPk
+ application layer throughput => Load in routers (incomingDataRate)
+ end-to-end delay? => Useful?
+ request-response/communication latency
+ link utilization => Use throughput and channel capacity (TODO)
+ number of train updates that the server has received by simulation time => Vector
+ number of train updates sent as a response to each client request => Histogram + vector
+ number of train updates that were discarded per request as a result of being expired => Vector + histogram

We save this data, overwriting the original CSV files.

In [3]:
import pandas as pd
# The values below are duplicated in parseData; to change what the scripts filter, edit them there instead.
# Vectors - parse vec dataset
linkLayerThroughput = lambda x: (x["name"] == "txPk:vector(packetBytes)") & (("type" not in x) or (x["type"] == "vector"))
appLayerThroughput = lambda x: (x["name"].str.contains("DataRate")) & (("type" not in x) or (x["type"] == "vector")) #  (x["module"].str.contains("Router")) => Use this to filter only router
clientResponseDelay = lambda x: x["name"] == "timeToResponse"
serverSentTrainUpdates = lambda x: (x["name"] == "serverSentTrainUpdates")
serverDroppedTrainUpdates = lambda x: (x["name"] == "serverDroppedTrainUpdates")
serverReceivedTrainUpdates = lambda x: (x["name"] == "serverReceivedTrainUpdates")
vec_metrics = (linkLayerThroughput, appLayerThroughput, clientResponseDelay, serverSentTrainUpdates, serverDroppedTrainUpdates, serverReceivedTrainUpdates)

# Histograms - parse sca dataset
clientEndToEndDelay = lambda x: (x["name"] == "endToEndDelay:histogram")  & (x["module"].str.contains("client")) & (("type" not in x) or (x["type"] == "histogram"))
#sca_metrics = (clientEndToEndDelay, trainEndToEndDelay)
sca_metrics = [clientEndToEndDelay]

# vec[linkLayerThroughput]
#counts = sca["name"].value_counts()
#counts.to_csv('name.csv',index=True)

# Filter metrics only selects the values above
sca, vec = filterMetrics(sca, vec)
vec


Unnamed: 0,run,module,name,vectime,vecvalue
179,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.train[1].eth[0].queue,incomingDataRate:vector,"[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, ...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ..."
197,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.train[1].eth[0].queue,outgoingDataRate:vector,"[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, ...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ..."
222,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.train[1].eth[0].mac,txPk:vector(packetBytes),"[6.307255452421, 6.307275232421, 16.0602778538...","[64.0, 166.0, 166.0, 166.0, 166.0, 166.0, 166...."
257,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.trainRouter.eth[1].queue,incomingDataRate:vector,"[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, ...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ..."
275,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.trainRouter.eth[1].queue,outgoingDataRate:vector,"[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, ...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ..."
300,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.trainRouter.eth[1].mac,txPk:vector(packetBytes),"[6.307261262421, 130.760419253572, 255.5405511...","[64.0, 64.0, 64.0, 64.0, 64.0, 64.0, 64.0, 64...."
330,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.trainRouter.eth[2].queue,incomingDataRate:vector,"[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, ...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ..."
348,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.trainRouter.eth[2].queue,outgoingDataRate:vector,"[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, ...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ..."
373,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.trainRouter.eth[2].mac,txPk:vector(packetBytes),"[6.307281042421, 6.307300822421, 7.37915885222...","[64.0, 166.0, 166.0, 166.0, 166.0, 166.0, 166...."
408,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.server.eth[1].queue,incomingDataRate:vector,"[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, ...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ..."
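
The selection performed by `filterMetrics` can be pictured as keeping every row that matches at least one of the metric predicates. The sketch below is an assumption about that behaviour, not the actual implementation: `filter_by_any` and the demo dataframe are hypothetical, and only the predicate style (a lambda returning a boolean Series) follows the cell above.

```python
from functools import reduce
import operator

import pandas as pd

def filter_by_any(df, predicates):
    """Keep rows matched by at least one predicate.

    Each predicate takes the dataframe and returns a boolean Series;
    the masks are combined with a logical OR.
    """
    mask = reduce(operator.or_, (predicate(df) for predicate in predicates))
    return df[mask]

# Made-up rows that mimic the shape of the vec dataset.
demo = pd.DataFrame({
    "module": ["Net.client[0].app[0]", "Net.server.eth[1].mac", "Net.server.eth[1].queue"],
    "name": ["timeToResponse", "txPk:vector(packetBytes)", "queueLength:vector"],
})

is_tx = lambda x: x["name"] == "txPk:vector(packetBytes)"
is_delay = lambda x: x["name"] == "timeToResponse"

filtered = filter_by_any(demo, (is_tx, is_delay))
```

Combining masks with OR means a new metric can be added by appending one more predicate to the tuple, without touching the filtering logic.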


## Evaluate Results
This script uses the data produced in the previous step to generate statistics and results to be analyzed later using plots or tables.
The statistics that were deemed relevant are as follows:
+ link layer throughput and application layer throughput
  + clients => **calculate overall std, min, max**
  + server apps => **evolution of throughput**
+ end-to-end delay/request-response/communication latency => **clients histogram, max, mean, 99th percentile** - Quality of experience
+ link utilization => **Percentage Router to Server** => See if it's proportional to the number of trains
+ number of train updates that the server has received by simulation time
+ Number of train updates sent as a response to each client request
+ Number of train updates that were discarded per request as a result of being expired => Relationship with delay, **Standard deviation, Max**
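
For the latency statistics in the list above (max, mean, 99th percentile), a minimal sketch with NumPy could look as follows. `latency_summary` and the sample delays are hypothetical; the notebook itself only computes std, max, min and mean further down.

```python
import numpy as np

def latency_summary(samples):
    """Return max, mean and 99th percentile of a list of delays (seconds)."""
    arr = np.asarray(samples, dtype=float)
    return {
        "max": float(arr.max()),
        "mean": float(arr.mean()),
        # np.percentile interpolates linearly between samples by default.
        "p99": float(np.percentile(arr, 99)),
    }

# Made-up response delays with one slow outlier.
summary = latency_summary([0.010, 0.012, 0.011, 0.250, 0.013])
```

The 99th percentile is useful next to the mean because a few slow responses, like the outlier here, dominate the experienced quality while barely moving the average on large samples.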

Add the standard deviation, minimum, maximum and mean to the dataset.

In [4]:
def addStatistics(df, colname):
    """Add summary-statistic columns for a column of lists (modifies df in place)."""
    import numpy as np
    df[colname + "_std"] = df[colname].apply(lambda x: np.std(x))
    df[colname + "_max"] = df[colname].apply(lambda x: np.max(x))
    df[colname + "_min"] = df[colname].apply(lambda x: np.min(x))
    df[colname + "_mean"] = df[colname].apply(lambda x: np.mean(x))
    # Without weights, np.average is identical to np.mean.
    df[colname + "_avg"] = df[colname].apply(lambda x: np.average(x))

addStatistics(vec, "vecvalue")
#addStatistics(sca, "vecvalue")
vec


Unnamed: 0,run,module,name,vectime,vecvalue,vecvalue_std,vecvalue_max,vecvalue_min,vecvalue_mean,vecvalue_avg
179,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.train[1].eth[0].queue,incomingDataRate:vector,"[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, ...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...",1227.539043,16960.0,0.0,110.606496,110.606496
197,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.train[1].eth[0].queue,outgoingDataRate:vector,"[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, ...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...",1227.539043,16960.0,0.0,110.606496,110.606496
222,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.train[1].eth[0].mac,txPk:vector(packetBytes),"[6.307255452421, 6.307275232421, 16.0602778538...","[64.0, 166.0, 166.0, 166.0, 166.0, 166.0, 166....",29.237622,166.0,64.0,156.787097,156.787097
257,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.trainRouter.eth[1].queue,incomingDataRate:vector,"[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, ...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...",104.357768,3680.0,0.0,2.961771,2.961771
275,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.trainRouter.eth[1].queue,outgoingDataRate:vector,"[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, ...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...",104.357768,3680.0,0.0,2.961771,2.961771
300,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.trainRouter.eth[1].mac,txPk:vector(packetBytes),"[6.307261262421, 130.760419253572, 255.5405511...","[64.0, 64.0, 64.0, 64.0, 64.0, 64.0, 64.0, 64....",0.0,64.0,64.0,64.0,64.0
330,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.trainRouter.eth[2].queue,incomingDataRate:vector,"[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, ...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...",1720.989535,26560.0,0.0,218.251222,218.251222
348,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.trainRouter.eth[2].queue,outgoingDataRate:vector,"[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, ...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...",1720.989535,26560.0,0.0,218.251222,218.251222
373,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.trainRouter.eth[2].mac,txPk:vector(packetBytes),"[6.307281042421, 6.307300822421, 7.37915885222...","[64.0, 166.0, 166.0, 166.0, 166.0, 166.0, 166....",21.651951,166.0,64.0,161.175676,161.175676
408,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.server.eth[1].queue,incomingDataRate:vector,"[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, ...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...",104.357768,3680.0,0.0,2.961771,2.961771
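
Link utilization, listed among the statistics above, can be derived from a `txPk:vector(packetBytes)` vector and the channel capacity. The sketch below is only one possible definition: `link_utilization` is a hypothetical helper, and both the observation window and the 1 Mbps capacity are made-up values, since the notebook does not state the actual link capacity.

```python
def link_utilization(packet_bytes, duration_s, capacity_bps):
    """Fraction of the channel capacity used over an observation window.

    packet_bytes: sizes of the transmitted packets in bytes
    duration_s:   length of the observation window in seconds
    capacity_bps: channel capacity in bits per second (assumed known)
    """
    if duration_s <= 0 or capacity_bps <= 0:
        raise ValueError("duration_s and capacity_bps must be positive")
    throughput_bps = 8 * sum(packet_bytes) / duration_s
    return throughput_bps / capacity_bps

# Three packets sent within a 10 s window on a hypothetical 1 Mbps link.
utilization = link_utilization([64.0, 166.0, 166.0], duration_s=10.0, capacity_bps=1_000_000)
```

Multiplying the result by 100 gives the percentage form mentioned for the router-to-server link.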


Helper functions to filter rows by module, for later use.

In [5]:
# Raw strings keep the regex escape "\[" intact and avoid invalid-escape warnings.
filterByClients = lambda x: x["module"].str.contains(r"client\[")
filterByClientRouter = lambda x: x["module"].str.contains("clientR")
filterByTrains = lambda x: x["module"].str.contains(r"train\[")
filterByTrainRouter = lambda x: x["module"].str.contains("trainR")
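
The module filters above rely on `str.contains` treating its pattern as a regular expression, which is why the opening bracket has to be escaped. The sketch below, using made-up module names, shows that an escaped regex and a literal match with `regex=False` select the same rows; either form separates `client[...]` hosts from `clientRouter`.

```python
import pandas as pd

# Made-up module paths in the style of the simulation output.
modules = pd.Series([
    "Net.client[0].eth[0].queue",
    "Net.clientRouter.eth[1].queue",
    "Net.train[1].eth[0].mac",
])

# Regex match: "[" is a special character and must be escaped.
by_regex = modules.str.contains(r"client\[")

# Literal match: no escaping needed when regex matching is disabled.
by_literal = modules.str.contains("client[", regex=False)
```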

In [6]:
vec[(appLayerThroughput(vec)) & (filterByClients(vec))]

Unnamed: 0,run,module,name,vectime,vecvalue,vecvalue_std,vecvalue_max,vecvalue_min,vecvalue_mean,vecvalue_avg
667,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.client[0].eth[0].queue,incomingDataRate:vector,"[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, ...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...",2001.370637,28320.0,0.0,367.208968,367.208968
685,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.client[0].eth[0].queue,outgoingDataRate:vector,"[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, ...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...",2001.370637,28320.0,0.0,367.208968,367.208968
1095,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.client[1].eth[0].queue,incomingDataRate:vector,"[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, ...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...",1939.290375,22640.0,0.0,348.886462,348.886462
1113,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.client[1].eth[0].queue,outgoingDataRate:vector,"[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, ...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...",1939.290375,22640.0,0.0,348.886462,348.886462
1350,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.client[2].eth[0].queue,incomingDataRate:vector,"[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, ...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...",1970.001153,22640.0,0.0,355.909169,355.909169
1368,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.client[2].eth[0].queue,outgoingDataRate:vector,"[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, ...","[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...",1970.001153,22640.0,0.0,355.909169,355.909169


Save the results to CSV.

In [None]:
saveCsv(sca, vec, CONFIG_NAME)