
# Post-mortem analysis

Referenced test plan: https://docs.openstack.org/performance-docs/latest/test_plans/massively_distribute_rpc/plan.html

Test considered: Test Case 1 (One single large target)

## System under test


```
Client 1---------+      +----------------------+     +-----> Server 1
                 |      |                      |     |
                 +----> |  RabbitMQ            | ----+-----> Server 2
Client 2--------------> |  Standalone          |     |
                 +----> |                      |     |
...              |      |                      |     |
                 |      +----------------------+     +------> Server n
Client n---------+              |                             /
  \                                                         /
    \                           |                         / 
      \  --  --  --  --  -- Monitoring --  --  --  --  --
```

Direct links:

* [Hardware](#Hardware)
* [Software](#Software)
* Get Ombt stats
    * [Ombt-statistics](#Ombt-statistics) (general stats: message_rate, latency, ...)
    * [Graphs](#Ombt-statistics-graphs)
* Get Influxdb stats
    * [Influxdb-Metrics](#Influxdb-Metrics) (can take time, use if system metrics are needed, ...)
    * [Graphs](#Graphs) of system metrics
        * [RPC-CALLs-metrics](#RPC-CALLs-metrics)
        * [RPC-CASTs-metrics](#RPC-CASTs-metrics)
        
# Hardware

Platform: [Grid'5000](https://www.grid5000.fr/mediawiki/index.php/Grid5000:Home)

* Bus: One dedicated physical machine
    * 1x: https://www.grid5000.fr/mediawiki/index.php/Rennes:Hardware#Dell_Poweredge_R630_.28parasilo.29

* Clients: 125 physical machines.
    * 40x: https://www.grid5000.fr/mediawiki/index.php/Rennes:Hardware#Dell_Poweredge_R630_.28paravance.29

* Servers: 30 + 8 physical machines
    * 20x: https://www.grid5000.fr/mediawiki/index.php/Rennes:Hardware#Dell_Poweredge_R630_.28paravance.29
 
 
# Software

* Linux distribution

```
Distributor ID: Debian
Description:    Debian GNU/Linux 9.3 (stretch)
Release:        9.3
Codename:       stretch
```

* RabbitMQ

TODO

* Ombt versions

TODO

Built from [6ac8255](https://github.com/kgiusti/ombt/commit/6ac8255896c794254bad080fc89dcbb1d9e0cb38):

* oslo.messaging==5.35.0
* pyngus==2.2.2
* python-qpid-proton==0.19.0
    

# Data

This notebook was run with: http://enos.irisa.fr/ombt-orchestrator/test_case_1_rabbitmq/



# Get ombt statistics
## Preparation

In [None]:
# The path to the env dir of the experimental campaign
RESULT_PATH = "./test_case_1-incremental-it-1/" 

In [None]:
# Inserting some ombt code (this could be removed when ombt is used as a library)
# This is used to recover the global stats from the per-agent stats
# Per-agent stats are output by the controller in a dedicated log file.

import math

class Stats(object):
    """Manage a single statistic"""
    def __init__(self, min=None, max=None, total=0, count=0, sum_of_squares=0, distribution=None):
        self.min = min
        self.max = max
        self.total = total
        self.count = count
        self.sum_of_squares = sum_of_squares
        # distribution of values grouped by powers of 10
        self.distribution = distribution or dict()

    @classmethod
    def from_dict(cls, values):
        if 'distribution' in values:
            # hack alert!
            # when a Stats is passed via an RPC call it appears as if the
            # distribution map's keys are converted from int to str.
            # Fix that by re-indexing the distribution map:
            new_dict = dict()
            old_dict = values['distribution']
            for k in old_dict.keys():
                new_dict[int(k)] = old_dict[k]
            values['distribution'] = new_dict
        return Stats(**values)

    def to_dict(self):
        new_dict = dict()
        for a in ["min", "max", "total", "count", "sum_of_squares"]:
            new_dict[a] = getattr(self, a)
        new_dict["distribution"] = self.distribution.copy()
        return new_dict

    def update(self, value):
        self.total += value
        self.count += 1
        self.sum_of_squares += value**2
        self.min = min(self.min, value) if self.min is not None else value
        self.max = max(self.max, value) if self.max is not None else value
        log = int(math.log10(value)) if value >= 1.0 else 0
        base = 10**log
        index = int(value/base)  # 0..9
        if log not in self.distribution:
            self.distribution[log] = [0 for i in range(10)]
        self.distribution[log][index] += 1

    def reset(self):
        self.__init__()

    def average(self):
        return (self.total / float(self.count)) if self.count else 0

    def std_deviation(self):
        return math.sqrt((self.sum_of_squares / float(self.count))
                         - (self.average() ** 2)) if self.count else -1

    def merge(self, stats):
        if stats.min is not None and self.min is not None:
            self.min = min(self.min, stats.min)
        else:
            self.min = self.min or stats.min
        if stats.max is not None and self.max is not None:
            self.max = max(self.max, stats.max)
        else:
            self.max = self.max or stats.max

        self.total += stats.total
        self.count += stats.count
        self.sum_of_squares += stats.sum_of_squares
        for k in stats.distribution.keys():
            if k in self.distribution:
                self.distribution[k] = [z for z in map(lambda a, b: a + b,
                                                       stats.distribution[k],
                                                       self.distribution[k])]
            else:
                self.distribution[k] = stats.distribution[k]

    def __str__(self):
        return "min=%i, max=%i, avg=%f, std-dev=%f" % (self.min, self.max,
                                                       self.average(),
                                                       self.std_deviation())

    def print_distribution(self):
        keys = list(self.distribution.keys())
        keys.sort()
        for order in keys:
            row = self.distribution[order]
            # order=0, index=0 is a special case as it is < 1.0; for all orders >
            # 0, index 0 is ignored since everything < 10^order is accounted for
            # in index 9 of the (order - 1) row
            index = 0 if order == 0 else 1
            while index < len(row):
                print("[%d..<%d):  %d" %
                      ((10 ** int(order)) * index,
                       (10 ** int(order)) * (index + 1),
                       row[index]))
                index += 1

class TestResults(object):
    """Client results of a test run.
    """
    def __init__(self, start_time=None, stop_time=None, latency=None,
                 msgs_ok=0, msgs_fail=0, errors=None):
        super(TestResults, self).__init__()
        self.start_time = start_time
        self.stop_time = stop_time
        self.latency = latency or Stats()
        self.msgs_ok = msgs_ok  # count of successful msg transfers
        self.msgs_fail = msgs_fail  # count of failed msg transfers
        self.errors = errors or dict()  # error msgs and counts

    @classmethod
    def from_dict(cls, values):
        if 'latency' in values:
            values['latency'] = Stats.from_dict(values['latency'])
        if 'errors' in values:
            values['errors'] = values['errors'].copy()
        return TestResults(**values)

    def to_dict(self):
        new_dict = dict()
        for a in ['start_time', 'stop_time', 'msgs_ok', 'msgs_fail']:
            new_dict[a] = getattr(self, a)
        new_dict['latency'] = self.latency.to_dict()
        new_dict['errors'] = self.errors.copy()
        return new_dict

    def error(self, reason):
        key = str(reason)
        self.errors[key] = self.errors.get(key, 0) + 1

    def reset(self):
        self.__init__()

    def merge(self, results):
        self.start_time = (min(self.start_time, results.start_time)
                           if self.start_time and results.start_time
                           else (self.start_time or results.start_time))
        self.stop_time = (max(self.stop_time, results.stop_time)
                          if self.stop_time and results.stop_time
                          else (self.stop_time or results.stop_time))
        self.msgs_ok += results.msgs_ok
        self.msgs_fail += results.msgs_fail
        self.latency.merge(results.latency)
        for err in results.errors:
            self.errors[err] = self.errors.get(err, 0) + results.errors[err]

    def print_results(self):
        if self.msgs_fail:
            print("Error: %d message transfers failed"
                  % self.msgs_fail)
        if self.errors:
            print("Error: errors detected:")
            for err in self.errors:
                print("  '%s' (occurred %d times)" % (err, self.errors[err]))

        total = self.msgs_ok + self.msgs_fail
        print("Total Messages: %d" % total)

        delta_time = self.stop_time - self.start_time
        print("Test Interval: %f - %f (%f secs)" % (self.start_time,
                                                    self.stop_time,
                                                    delta_time))

        if delta_time > 0.0:
            print("Aggregate throughput: %f msgs/sec" % (float(total)/delta_time))

        latency = self.latency
        if latency.count:
            print("Latency %d samples (msecs): Average %f StdDev %f"
                  " Min %f Max %f"
                  % (latency.count,
                     latency.average(), latency.std_deviation(),
                     latency.min, latency.max))
            print("Latency Distribution: ")
            latency.print_distribution()


In [None]:
# Some util functions

import glob
import json
import statistics
from os import path

def load_stats(param):
    """Loads the stats from the controller output file."""
    try:
        all_controller = path.join(RESULT_PATH, param["backup_dir"], "*controller*.log")
        all_controller_docker = path.join(RESULT_PATH, param["backup_dir"], "*controller*_docker.log")
        # beware of the files _docker.log that would also match
        # and contains the global stats in a human readable format.
        allfiles = glob.glob(all_controller)
        alldocker = glob.glob(all_controller_docker)
        controller_logs = set(allfiles) - set(alldocker)

        stats_clients = {}
        stats_servers = {}
        # We build aggregates on all shards
        for controller_log in controller_logs:
            with open(controller_log) as f:
                a = f.readlines()
                # NOTE make sure rpc client|server names are different across shards
                stats_clients.update(json.loads(a[0]))
                stats_servers.update(json.loads(a[1]))
        return stats_clients, stats_servers
    except Exception:
        return False
    
def build_agg_results(results):
    agg = TestResults()
    for result in results:
        result["latency"] = Stats(**result["latency"])
        agg.merge(TestResults(**result))
        
    duration = agg.stop_time - agg.start_time
    total = agg.msgs_ok + agg.msgs_fail
    rate = float(total)/duration
    result = agg.to_dict()
    result["rate"] = rate
    result["latency_avg"] = agg.latency.average()
    result["latency_stdev"] = agg.latency.std_deviation()
    return result

def build_msgs_stats(results, msg_type):
    # NOTE(msimonin): we don't expect a TestResult here
    msgs = [r[msg_type] for r in results]
    return {
        "mean": statistics.mean(msgs),
        #"stdev": statistics.stdev(msgs),
        "min": min(msgs),
        "max": max(msgs)
    }

def augment(mydict, myparams, in_key, out_key=None):
    out_key = out_key or in_key
    mydict.update({out_key: [p[in_key] for p in myparams]})
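
`load_stats` assumes that each controller log holds two JSON documents, one per line: the per-client results first, then the per-server results. The sketch below writes such a file to a temporary directory and parses it the same way. The agent names and counters are made up for illustration.

```python
import json
import os
import tempfile

# Hypothetical per-agent results, one JSON object per line.
clients = {"rpc-client-0": {"msgs_ok": 100, "msgs_fail": 0}}
servers = {"rpc-server-0": {"msgs_ok": 100, "msgs_fail": 0}}

with tempfile.TemporaryDirectory() as tmp:
    log = os.path.join(tmp, "controller.log")
    with open(log, "w") as f:
        f.write(json.dumps(clients) + "\n")
        f.write(json.dumps(servers) + "\n")

    # Same parsing as load_stats: line 0 is clients, line 1 is servers
    with open(log) as f:
        lines = f.readlines()
    stats_clients = json.loads(lines[0])
    stats_servers = json.loads(lines[1])

print(sorted(stats_clients), sorted(stats_servers))
```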

In [None]:
# Load the params from the params file

params = []
with open(path.join(RESULT_PATH, "params.json")) as f:
    params = json.load(f)

In [None]:
# Which parameters to deal with
# this allows to test for a subset only

PARAMS = []
for param in params:
    stats = load_stats(param)
    if not stats:
        continue
    clients, servers = stats
    # what has been seen by ombt
    param["_ombt_clients"] = len(clients.values())
    param["_ombt_servers"] = len(servers.values())
    param["_ombt_msgs_sent_ok"] = build_msgs_stats(clients.values(), "msgs_ok")
    param["_ombt_msgs_received_ok"] = build_msgs_stats(servers.values(), "msgs_ok")
    param["_ombt_msgs_sent_fail"] = build_msgs_stats(clients.values(), "msgs_fail")
    param["_ombt_msgs_received_fail"] = build_msgs_stats(servers.values(), "msgs_fail")
    #param["_raw_servers_test_result"] = servers
    #param["_raw_clients_test_result"] = clients
    param["_agg_servers"] = build_agg_results(servers.values())
    param["_agg_clients"] = build_agg_results(clients.values())
    PARAMS.append(param)

In [None]:
with open("params_calculated.json", "w") as f:
    json.dump(PARAMS, f)
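
Note that JSON object keys are always strings, so a latency distribution indexed by integer powers of 10 comes back with string keys once it has been through JSON (this is the case the `Stats.from_dict` re-indexing handles). A small sketch of the round trip, using an invented distribution:

```python
import json

# Hypothetical distribution: one sample below 1, three samples in [100..<200)
distribution = {
    0: [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    2: [0, 3, 0, 0, 0, 0, 0, 0, 0, 0],
}

dumped = json.dumps({"distribution": distribution})
reloaded = json.loads(dumped)["distribution"]
print(sorted(reloaded))  # keys are now strings

# Re-index with integer keys, as Stats.from_dict does
reindexed = {int(k): v for k, v in reloaded.items()}
print(sorted(reindexed))
```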

## Getting some stats

In [None]:
extraction = {}
to_extract = ["_ombt_clients", "_ombt_servers",  "call_type", "driver", "iteration_id"]
for e in to_extract:
    augment(extraction, PARAMS, e)

# Rate server side
extraction.update({
    "server_rate": [p["_agg_servers"]["rate"] for p in PARAMS]
})

# Number of messages processed correctly by all the servers
extraction.update({
    "server_ok": [p["_agg_servers"]["msgs_ok"] for p in PARAMS]
})

# Number of messages processed with a failure by all the servers
extraction.update({
    "server_fail": [p["_agg_servers"]["msgs_fail"] for p in PARAMS]
})

# Average latency server side
extraction.update({
    "server_latency_avg": [p["_agg_servers"]["latency_avg"] for p in PARAMS]
})

# Stddev latency server side
extraction.update({
    "server_latency_stdev": [p["_agg_servers"]["latency_stdev"] for p in PARAMS]
})

# Average latency client side
extraction.update({
    "client_latency_avg": [p["_agg_clients"]["latency_avg"] for p in PARAMS]
})

# Stddev latency client side
extraction.update({
    "client_latency_stdev": [p["_agg_clients"]["latency_stdev"] for p in PARAMS]
})

# Rate client side
extraction.update({
    "client_rate": [p["_agg_clients"]["rate"] for p in PARAMS]
})

# Number of messages processed correctly by all the clients
extraction.update({
    "client_ok": [p["_agg_clients"]["msgs_ok"] for p in PARAMS]
})

# Number of messages processed with a failure by all the clients
extraction.update({
    "client_fail": [p["_agg_clients"]["msgs_fail"] for p in PARAMS]
})

# Get a sense of what is happening on each client/server
# min, max, avg of the number of messages processed correctly by the clients
extraction.update({
    "per_client_ok": [p["_ombt_msgs_sent_ok"] for p in PARAMS]
})

# min, max, avg of the number of messages processed with a failure by the clients
extraction.update({
    "per_client_fail": [p["_ombt_msgs_sent_fail"] for p in PARAMS]
})

# min, max, avg of the number of messages processed correctly by the servers
extraction.update({
    "per_server_ok": [p["_ombt_msgs_received_ok"] for p in PARAMS]
})

# min, max, avg of the number of messages processed with a failure by the servers
extraction.update({
    "per_server_fail": [p["_ombt_msgs_received_fail"] for p in PARAMS]
})


## Ombt statistics

In [None]:
import pandas
from IPython.display import display

df = pandas.DataFrame(extraction)
df

# Ombt statistics graphs

## Latency distribution

In [None]:
import matplotlib.pyplot as plt
import itertools
import operator


def plot_latency_dist(params, client_server, index):
    """
    Draw the (aggregated) latency distribution.
    For casts we take the server latency.
    For calls we take the client latency.
    (client_server is currently unused: the agent is derived from call_type.)
    """
    agent = "_agg_clients" if params[index]["call_type"] == "rpc-call" else "_agg_servers"
    distribution = params[index][agent]["latency"]["distribution"]
    x = []
    data = []
    labels = []
    for p, numbers in distribution.items():
        pw = int(p)
        # right edge of each of the 10 bins of this decade, on a log10 scale
        x.extend([math.log10(i * 10 ** pw) for i in range(1, 11)])
        labels.extend([10 ** pw] + 9 * [""])
        data.extend(numbers)
    # draw once, after all the decades have been collected
    plt.bar(x, data, tick_label=labels, align='edge', edgecolor='black', width=-0.01)
    # TODO fix the title
    plt.title("%s - %s" % (index + 1, params[index]["call_type"]))
    
PARAMS = sorted(PARAMS, key=operator.itemgetter('driver', 'call_type', 'nbr_clients'))
# groups = [list(g) for _, g in itertools.groupby(PARAMS, lambda x: x['call_type'])]
# print(len(groups))
#print(len(groups[0]))

In [None]:
for iteration in range(len(PARAMS)):
    plot_latency_dist(PARAMS, "server", iteration)
    plt.show()
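
The distribution plotted above is keyed by power of 10, and each row holds ten counters indexed by the leading digit of the latency. A minimal sketch of that bucketing, mirroring the arithmetic in `Stats.update` (the `bucket` helper is an illustrative name, not an ombt function):

```python
import math


def bucket(value):
    """Return (power of 10, leading digit) for a latency value."""
    log = int(math.log10(value)) if value >= 1.0 else 0
    index = int(value / 10 ** log)
    return log, index


for latency in (0.5, 7, 42, 250):
    print(latency, bucket(latency))
```

A latency of 250 ms, for instance, lands in row 2, index 2, which `print_distribution` reports as the `[200..<300)` bin.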

## Metric vs clients metrics

In [None]:
#PARAMS = sorted(PARAMS, key=operator.itemgetter('driver', 'call_type', 'nbr_clients'))
#groups = [list(g) for _, g in itertools.groupby(PARAMS, lambda x: x['driver'])]
#groups = [list(itertools.accumulate(iteration['nbr_servers'] for iteration in g)) for g in groups]
drivers = df.driver.unique().tolist()

In [None]:
def draw_metric_vs_iteration(df, drivers, call_type, metric, yerr=None):
    ax = None
    for driver in drivers:
        extract = df[(df.call_type == call_type) & (df.driver == driver)]
        kwargs = {"y": metric, "use_index": False}                        
        if yerr:
            kwargs.update({"yerr": yerr})
        if ax:         
            kwargs.update({"ax": ax})
        ax = extract.plot(**kwargs)
            
    #ax.set_xticklabels(extract._ombt_servers)
    ax.legend(drivers)
        

In [None]:
draw_metric_vs_iteration(df, drivers, "rpc-call", "client_latency_avg")
draw_metric_vs_iteration(df, drivers, "rpc-cast", "server_latency_avg")

# Recovering metrics from influxdb


## Preparation

In [None]:
import time

from influxdb import InfluxDBClient
from influxdb.exceptions import InfluxDBServerError
from requests import exceptions

def wait_influx_to_get_ready(debug=False, log=False):
    # NOTE: relies on the global `database` defined in a later cell
    influxdb = InfluxDBClient(database=database, timeout=600)
    if debug:
        print('Connecting to InfluxDB ', end='', flush=True)
    while True:
        try:
            version = influxdb.ping()                        
            if debug:
                print(' DONE')
            if log:
                print(version)
                print(influxdb.get_list_database())
            break
            
        except (InfluxDBServerError,
                exceptions.HTTPError,
                exceptions.ConnectionError,
                exceptions.Timeout,
                exceptions.RequestException) as error:            
            if debug:
                print('.', end='', flush=True)
            time.sleep(2)
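
The loop above polls until InfluxDB answers and never gives up. Where a bounded wait is preferable, the same pattern can be written with a maximum number of attempts. This is a generic sketch: `wait_until_ready`, `probe`, `fake_ping`, the limits and the version string are hypothetical and not part of the influxdb client.

```python
import time


def wait_until_ready(probe, delay=2.0, max_attempts=30, retry_on=(Exception,)):
    """Call probe() until it succeeds; return its result or raise TimeoutError."""
    for attempt in range(1, max_attempts + 1):
        try:
            return probe()
        except retry_on:
            if attempt == max_attempts:
                break
            time.sleep(delay)
    raise TimeoutError("service not ready after %d attempts" % max_attempts)


# Example with a fake probe that fails twice before answering.
calls = {"n": 0}


def fake_ping():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("not yet")
    return "1.4.2"  # made-up version string


version = wait_until_ready(fake_ping, delay=0.01)
print(version)
```

Passing the exception types through `retry_on` keeps unexpected errors visible instead of retrying on them silently.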

In [None]:
import docker
from influxdb import DataFrameClient

client = docker.from_env()
for container in client.containers.list():
    container.stop()
    container.remove(force=True)

In [None]:
import shutil
import tarfile
import time
from datetime import datetime
from tqdm import tqdm_notebook
# keep `tqdm` bound to the class so that the monitor_interval setting below applies to it
from tqdm import tqdm
import os
import subprocess
import sys

tqdm.monitor_interval = 0

# NOTE: depending on the version of ombt-orchestrator the 
# database may vary (telegraf or ombt-orchestrator)
database = "ombt-orchestrator"

#container_name=~/^router/
# Assumption : there is at least a group.

epoch = "30"

# router metrics
usage_mem_bus = "SELECT mean(usage) FROM docker_container_mem WHERE container_name=~/^router/ and time>='%s' AND time<='%s' GROUP BY container_name, time({}s)".format(epoch)
usage_cpu_percent_bus = "SELECT mean(usage_percent) FROM docker_container_cpu WHERE container_name=~/^router/ and time>='%s' AND time<='%s' GROUP BY container_name, time({}s)".format(epoch)
# This will work for rabbitmq or qdr
tcp_established_bus = "SELECT mean(tcp_established) FROM netstat WHERE role='bus' and time>='%s' AND time<='%s' GROUP BY host, time({}s)".format(epoch)

# rabbit metrics
usage_mem_rabbit = "SELECT mean(usage) FROM docker_container_mem WHERE container_name=~/^rabbit/ and time>='%s' AND time<='%s' GROUP BY container_name, time({}s)".format(epoch)
usage_cpu_percent_rabbit = "SELECT mean(usage_percent) FROM docker_container_cpu WHERE container_name=~/^rabbit/ and time>='%s' AND time<='%s' GROUP BY container_name, time({}s)".format(epoch)

# rpc-server metric
usage_mem_ombt_servers = "SELECT mean(usage) FROM docker_container_mem WHERE container_name=~/^rpc-server/ and time>='%s' AND time<='%s' GROUP BY container_image, time({}s)".format(epoch) # container_name
usage_cpu_percent_ombt_servers = "SELECT mean(usage_percent) FROM docker_container_cpu WHERE container_name=~/^rpc-server/ and time>='%s' AND time<='%s' GROUP BY container_image, time({}s)".format(epoch)

usage_mem_ombt_clients = "SELECT mean(usage) as usage_mem_ombt_clients FROM docker_container_mem WHERE container_name=~/^rpc-client/ and time>='%s' AND time<='%s' GROUP BY container_image, time({}s)".format(epoch)
usage_cpu_percent_ombt_clients = "SELECT mean(usage_percent) FROM docker_container_cpu WHERE container_name=~/^rpc-client/ and time>='%s' AND time<='%s' GROUP BY container_image, time({}s)".format(epoch)

usage_mem_ombt_controller = "SELECT mean(usage) FROM docker_container_mem WHERE container_name=~/^controller/ and time>='%s' AND time<='%s' GROUP BY container_image, time({}s)".format(epoch)
usage_cpu_percent_ombt_controller = "SELECT mean(usage_percent) FROM docker_container_cpu WHERE container_name=~/^controller/ and time>='%s' AND time<='%s' GROUP BY container_image, time({}s)".format(epoch)

# rabbitmq overview series
connections_rabbitmq_overview = "SELECT mean(connections) from rabbitmq_overview WHERE role='bus' AND time>='%s' AND time<='%s' GROUP BY time({}s)".format(epoch)

# network traffic
net_recv_bus = "SELECT derivative(mean(bytes_recv), 1s)/1048576 FROM net WHERE role='bus' and time>='%s' AND time<='%s' GROUP BY time({}s), host".format(epoch)
net_sent_bus = "SELECT derivative(mean(bytes_sent), 1s)/1048576 FROM net WHERE role='bus' and time>='%s' AND time<='%s' GROUP BY time({}s), host".format(epoch)

tqdm_params = tqdm_notebook(PARAMS, desc="Iterations:")
for param in tqdm_params:    
    # get experimentation boundaries
    start_time = min(param['_agg_clients']['start_time'], param['_agg_servers']['start_time']) - 30
    stop_time = max(param['_agg_clients']['stop_time'], param['_agg_servers']['stop_time']) + 30
    duration = stop_time - start_time
    start_utc = datetime.utcfromtimestamp(start_time)
    stop_utc = datetime.utcfromtimestamp(stop_time)    
    #print("start=%s, stop=%s" % (start_utc, stop_utc))
    iteration_directory = path.join(RESULT_PATH, param['backup_dir'])
    tar = path.join(iteration_directory, 'influxdb-data.tar.gz')
    subprocess.check_call("tar xfz {}".format(tar), shell=True)
    #tarfile.open(tar).extractall(numeric_owner=True) #path=iteration_directory)
    # docker run --name influxdb -v $(pwd)/influxdb-data:/var/lib/influxdb -p 8083:8083 -p 8086:8086 -ti influxdb
    # Evaluate the "load" of ombt-server/bus :
    # we take the min of the usage_idle of all host in the groups ombt-server/bus
    # we take the max of the memory usage of all routers in the groups bus (we assume that the bus containers are eating more than the other containers)
    QUERIES = {
        "usage_mem_bus": usage_mem_bus % (start_utc, stop_utc),
        "usage_cpu_percent_bus": usage_cpu_percent_bus % (start_utc, stop_utc),
        "tcp_established_bus": tcp_established_bus % (start_utc, stop_utc),
        "usage_mem_rabbit": usage_mem_rabbit % (start_utc, stop_utc),
        "usage_cpu_percent_rabbit": usage_cpu_percent_rabbit % (start_utc, stop_utc),
        "usage_mem_ombt_servers": usage_mem_ombt_servers % (start_utc, stop_utc),
        "usage_cpu_percent_ombt_servers": usage_cpu_percent_ombt_servers % (start_utc, stop_utc),
        "usage_mem_ombt_clients": usage_mem_ombt_clients % (start_utc, stop_utc),
        "usage_cpu_percent_ombt_clients": usage_cpu_percent_ombt_clients % (start_utc, stop_utc),
        "usage_mem_ombt_controller": usage_mem_ombt_controller % (start_utc, stop_utc),
        "usage_cpu_percent_ombt_controller": usage_cpu_percent_ombt_controller % (start_utc, stop_utc),
        "connections_rabbitmq_overview": connections_rabbitmq_overview % (start_utc, stop_utc),
        "net_recv_bus(MB/s)": net_recv_bus % (start_utc, stop_utc),
        "net_sent_bus(MB/s)": net_sent_bus % (start_utc, stop_utc),
    }

    #tqdm_queries = tqdm_notebook(QUERIES.items(), desc="Iteration", bar_format="{n}/|/ {desc}: {n_fmt}/{total_fmt}") 
    container = None
    try:
        volume_key = path.join(os.getcwd(), 'influxdb-data')
        container = client.containers.run('influxdb:latest',
                                          name="influxdb",
                                          detach=True,
                                          ports={'8086/tcp': 8086,
                                                 '8083/tcp': 8083},
                                          volumes={volume_key: {'bind': '/var/lib/influxdb',
                                                                'mode': 'rw'}})
        wait_influx_to_get_ready(debug=False)
        influx = DataFrameClient(database=database, timeout=600)
        for key, query in QUERIES.items(): #tqdm_queries:
            #tqdm_queries.set_description("Iteration {}:".format(tqdm_params.last_print_n + 1))
            result = influx.query(query)
            # saving the dataframes
            param.setdefault("_metrics", {})
            param["_metrics"][key] = result
    except Exception as e:
        print(e)
    finally:
        # only clean up a container that was actually started in this iteration
        if container is not None:
            container.stop()
            container.remove(force=True)
        subprocess.check_call("rm -rf influxdb-data", shell=True)
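
Each query is built in two steps: `.format(epoch)` fixes the `GROUP BY time(...)` bucket once, leaving the two `%s` placeholders to be filled per iteration with the UTC window. The reduced sketch below shows that mechanism with made-up timestamps. It uses `datetime.fromtimestamp(..., tz=timezone.utc)` and then drops the timezone, which yields the same naive UTC value as `utcfromtimestamp`.

```python
from datetime import datetime, timezone

epoch = "30"
template = ("SELECT mean(connections) FROM rabbitmq_overview "
            "WHERE time>='%s' AND time<='%s' GROUP BY time({}s)").format(epoch)

# Hypothetical experiment boundaries (seconds since the Unix epoch),
# widened by 30 s on each side as in the loop above.
start_time = 1514764800 - 30
stop_time = 1514764860 + 30
start_utc = datetime.fromtimestamp(start_time, tz=timezone.utc).replace(tzinfo=None)
stop_utc = datetime.fromtimestamp(stop_time, tz=timezone.utc).replace(tzinfo=None)

# Second step: fill in the time window for this iteration
query = template % (start_utc, stop_utc)
print(query)
```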

# Graphs

In [None]:
import operator
import pandas as pd
import matplotlib.dates as mdate
import matplotlib.ticker as ticker

from pandas.tseries.converter import (TimeSeries_DateLocator, TimeSeries_DateFormatter)


def draw_metrics(metric, params, title=""):
    # hold the plot/legends of the same key 
    # e.g (router0: ax, router1: ax)
    axs = {}
    legends = {}    
    for param in sorted(params, key=operator.itemgetter("_ombt_servers")):    
        dfs = param["_metrics"][metric]
        keys = dfs.keys() 
        
        for key in keys:
            # shift
            df = dfs[key]            
            v = df.index.values - df.index.values[0]            
            df['shifted'] = pd.Series(v, index=df.index)
            
            axs.setdefault(key, None)            
            axs[key] = df.plot(x="shifted", ax=axs[key], rot=30)      
            #axs[key].xaxis.set_major_formatter(TimeSeries_DateFormatter(df['shifted']))
            #print(type(axs[key]))
            #print(axs[key].get_xticks())
            #print(len(axs[key].get_xticks()))
            #print(type(axs[key].get_xticks()))
            
            # TODO add the key somewhere to differentiate between agent of the bus
            # e.g router0, router1, ...
            legends.setdefault(key, [])
            legends[key].append("%s, %s" % (param["_ombt_clients"], param["_ombt_servers"]))    
    for key, ax in axs.items():        
        if ax:            
            ax.set_xlabel("Time")            
            ax.set_title("%s \n %s \n %s" % (metric, key, title))
            ax.legend(legends[key], bbox_to_anchor=(1, 0.5), loc="center left")       
            ax.xaxis.set_major_locator(ticker.MaxNLocator(10))
            #ax.xaxis.set_minor_locator(ticker.MaxNLocator(len(axs.items())*10))
            #delta = pd.Timedelta(0, unit='s')
            #print(ax.get_xticklabels())
            #print(ax.xaxis.get_minor_locator())
            #majlocator = TimeSeries_DateLocator("S", plot_obj=ax)
            #ax.xaxis.set_minor_locator(majlocator)
            #ax.set_xticklabels(xxx.index)
            
            #date_fmt = '%M:%S'           
            #date_formatter = mdate.DateFormatter(date_fmt)
            #ax.xaxis.set_major_formatter(date_formatter)
                
        
def draw_metrics_versus_iteration(data, drivers, metric, call_type):        
    for driver in drivers:
        params = [p for p in data if p["call_type"] == call_type and p["driver"] == driver]
        draw_metrics(metric, params, title="%s - %s" % (driver, call_type))
 
# For each client there will be one line per number of servers in the test
# plus one axis per group (e.g. router1, router2 ...)

#PPARAMS = sorted(PARAMS, key=operator.itemgetter('driver', 'call_type', 'nbr_clients'))
#groups = [list(g) for _, g in itertools.groupby(PPARAMS, lambda x: x['driver'])]
#clients = [list(itertools.accumulate(iteration['nbr_clients'] for iteration in g)) for g in groups]
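
`draw_metrics` overlays iterations that ran at different wall-clock times by subtracting the first timestamp of each series, so every curve starts at zero. The same shift applied to plain `datetime` values looks like this (the sample timestamps are invented):

```python
from datetime import datetime, timedelta

# Hypothetical series sampled every 30 s
t0 = datetime(2018, 1, 1, 12, 0, 0)
index = [t0 + timedelta(seconds=30 * i) for i in range(4)]

# Shift: elapsed time since the first sample of the series
shifted = [t - index[0] for t in index]
print([s.total_seconds() for s in shifted])
```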

## RPC-CALLs metrics

### Memory usage on the bus (qdr)


In [None]:
draw_metrics_versus_iteration(PARAMS, drivers,"usage_mem_bus", "rpc-call")

### CPU usage on the bus (qdr)

In [None]:
draw_metrics_versus_iteration(PARAMS, drivers,"usage_cpu_percent_bus", "rpc-call")


### TCP connections established on the bus node (qdr or rabbit)


In [None]:
draw_metrics_versus_iteration(PARAMS, drivers, "tcp_established_bus", "rpc-call")
draw_metrics_versus_iteration(PARAMS, drivers, "connections_rabbitmq_overview", "rpc-call")


### Network traffic on the bus

In [None]:
draw_metrics_versus_iteration(PARAMS, drivers, "net_recv_bus(MB/s)", "rpc-call")
draw_metrics_versus_iteration(PARAMS, drivers, "net_sent_bus(MB/s)", "rpc-call")


### Memory usage on the bus (rabbit)

In [None]:
draw_metrics_versus_iteration(PARAMS, drivers, "usage_mem_rabbit", "rpc-call")


### CPU usage on the bus (rabbit)

In [None]:
draw_metrics_versus_iteration(PARAMS, drivers, "usage_cpu_percent_rabbit", "rpc-call")


### Memory usage of the ombt_servers

In [None]:
draw_metrics_versus_iteration(PARAMS, drivers, "usage_mem_ombt_servers", "rpc-call")


### CPU usage of the ombt_servers

In [None]:
draw_metrics_versus_iteration(PARAMS, drivers, "usage_cpu_percent_ombt_servers", "rpc-call")


### Memory usage of the ombt_clients

In [None]:
draw_metrics_versus_iteration(PARAMS, drivers, "usage_mem_ombt_clients", "rpc-call")

### CPU usage of the ombt clients


In [None]:
draw_metrics_versus_iteration(PARAMS, drivers, "usage_cpu_percent_ombt_clients", "rpc-call")

### Memory usage of the ombt_controller

In [None]:
draw_metrics_versus_iteration(PARAMS, drivers, "usage_mem_ombt_controller", "rpc-call")

### CPU usage of the ombt controller

In [None]:
draw_metrics_versus_iteration(PARAMS, drivers, "usage_cpu_percent_ombt_controller", "rpc-call")


## RPC-CASTs metrics

### Memory usage on the bus (qdr)


In [None]:
draw_metrics_versus_iteration(PARAMS, drivers,"usage_mem_bus", "rpc-cast")

### CPU usage on the bus (qdr)

In [None]:
draw_metrics_versus_iteration(PARAMS, drivers,"usage_cpu_percent_bus", "rpc-cast")

### TCP connections established on the bus node (qdr or rabbit)


In [None]:
draw_metrics_versus_iteration(PARAMS, drivers, "tcp_established_bus", "rpc-cast")
draw_metrics_versus_iteration(PARAMS, drivers, "connections_rabbitmq_overview", "rpc-cast")


### Network traffic on the bus

In [None]:
draw_metrics_versus_iteration(PARAMS, drivers, "net_recv_bus(MB/s)", "rpc-cast")
draw_metrics_versus_iteration(PARAMS, drivers, "net_sent_bus(MB/s)", "rpc-cast")


### Memory usage on the bus (rabbit)

In [None]:
draw_metrics_versus_iteration(PARAMS, drivers, "usage_mem_rabbit", "rpc-cast")


### CPU usage on the bus (rabbit)

In [None]:
draw_metrics_versus_iteration(PARAMS, drivers, "usage_cpu_percent_rabbit", "rpc-cast")


### Memory usage of the ombt_servers

In [None]:
draw_metrics_versus_iteration(PARAMS, drivers, "usage_mem_ombt_servers", "rpc-cast")


### CPU usage of the ombt_servers

In [None]:
draw_metrics_versus_iteration(PARAMS, drivers, "usage_cpu_percent_ombt_servers", "rpc-cast")


### Memory usage of the ombt_clients

In [None]:
draw_metrics_versus_iteration(PARAMS, drivers, "usage_mem_ombt_clients", "rpc-cast")

### CPU usage of the ombt clients


In [None]:
draw_metrics_versus_iteration(PARAMS, drivers, "usage_cpu_percent_ombt_clients", "rpc-cast")

### Memory usage of the ombt_controller

In [None]:
draw_metrics_versus_iteration(PARAMS, drivers, "usage_mem_ombt_controller", "rpc-cast")

### CPU usage of the ombt controller

In [None]:
draw_metrics_versus_iteration(PARAMS, drivers, "usage_cpu_percent_ombt_controller", "rpc-cast")
