# Pipeline
This small notebook serves to describe the usage and implementation of the python scripts used by the pipeline. Each section shows the flow of each script's main function, in a way to illustrate how they work.

In [1]:
RES_DIR="../proj/simulations/results/"
CONFIG_NAME="NonFrequentUpdates-#0"

These will be the results that will be used for this demonstration.

## Importing Data from Omnet
The process of importing results from their `sca` and `vec` formats (straight from a simulation run) is done in the parseData script

First, the script converts data to their csv format using `scavetool` and opens the csvs as dataframes.

In [2]:
from parseData import convertToCsv, filterMetrics, openDatasets, saveCsv
convertToCsv(RES_DIR, CONFIG_NAME)

sca, vec = openDatasets(CONFIG_NAME)

Exported 935 scalars, 2625 parameters, 2 statistics, 70 histograms
Exported 3112 vectors


### Metrics
Then, the scripts filters all data to select only the attributes that we deem relevant and want to analyze later.
Bellow are the metrics that chose to filter:
+ link layer throughput => eth mac txPk
+ application layer throughput - => Load in routers (incomingDataRate)
+ end to end delay? => Useful?
+ request-response/communication latency
+ link utilization => Use throughput and channel capacity TODO
+ number of train updates that the server has received by simulation time=> Vector
+ Number of train updates sent as a response to each client request. => Histogram + vector
+ Number of train updates that were discarded per request as a result of being expired => Vector + hist

We save this data to override the original csvs

In [3]:
import pandas as pd
# The values bellow are replicated to parseData, to change the scripts go there instead
# Vectors - parse vec dataset
linkLayerThroughput = lambda x: (x["name"] == "txPk:vector(packetBytes)") & (x["type"] == "vector")
appLayerThroughput = lambda x: (x["name"].str.contains("DataRate")) & (x["type"] == "vector") #  (x["module"].str.contains("Router")) => Use this to filter only router
clientResponseDelay = lambda x: x["name"] == "timeToResponse"
serverSentTrainUpdates = lambda x: (x["name"] == "serverSentTrainUpdates")
serverDroppedTrainUpdates = lambda x: (x["name"] == "serverDroppedTrainUpdates")
serverReceivedTrainUpdates = lambda x: (x["name"] == "serverReceivedTrainUpdates")
vec_metrics = (linkLayerThroughput, appLayerThroughput, clientResponseDelay, serverSentTrainUpdates, serverDroppedTrainUpdates, serverReceivedTrainUpdates)

# Histograms - parse sca dataset
clientEndToEndDelay = lambda x: (x["name"] == "endToEndDelay:histogram")  & (x["module"].str.contains("client")) & (x["type"] == "histogram")
#sca_metrics = (clientEndToEndDelay, trainEndToEndDelay)
sca_metrics = [clientEndToEndDelay]

# vec[linkLayerThroughput]
#counts = sca["name"].value_counts()
#counts.to_csv('name.csv',index=True)

# Filter metrics only selects the values above
sca, vec = filterMetrics(sca, vec)
saveCsv(sca, vec, CONFIG_NAME)
sca.head(5)


Unnamed: 0,run,module,name,vectime,vecvalue
179,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.train[1].eth[0].queue,incomingDataRate:vector,0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1.1 1.2 ...,0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...
197,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.train[1].eth[0].queue,outgoingDataRate:vector,0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1.1 1.2 ...,0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...
222,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.train[1].eth[0].mac,txPk:vector(packetBytes),6.307255452421 6.307275232421 16.060277853876 ...,64 166 166 166 166 166 166 166 166 166 166 64 ...
257,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.trainRouter.eth[1].queue,incomingDataRate:vector,0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1.1 1.2 ...,0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...
275,NonFrequentUpdates-0-20221208-21:22:50-7525,MuenchenNetwork.trainRouter.eth[1].queue,outgoingDataRate:vector,0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1.1 1.2 ...,0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...


## Evaluate Results
This script uses the data produced in the previous step to generate statistics and results to be analyzed later using plots or tables.

TODO: Definir o que raio queremos medir em cada simulação