# Demonstration

This notebook shows how to reproduce Figure 5 of the paper.

The section ```Local scheduler - Offline mode``` of the README file must be completed before running this notebook.

In [None]:
import pandas as pd
pd.options.mode.chained_assignment = None  # default='warn'
import os.path
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
dataset = pd.read_csv('debug/monitoring.csv', sep='\t')

keys_as_float = ['tmp', 'val', 'config', 'sb_oc', 'sb_unused']
# Missing values are stored as the string 'None'; everything else is cast to float
for key in keys_as_float:
    dataset[key] = dataset[key].apply(lambda x: None if x == 'None' else float(x))
dataset['time'] = dataset['tmp'] / 60

# Overall experiment

In [None]:
subsets = dataset.loc[dataset['rec'] == 'subset']
subsets_cpu = subsets.loc[subsets['res'] == 'cpu']
subsets_mem = subsets.loc[subsets['res'] == 'mem']

This figure summarizes the experiment: a 12-hour session in which an increasing number of 8-core VMs were deployed under our local scheduler using a specific oversubscription template.
Specifically, vCPU0 of each VM was not oversubscribed (i.e. offered at a 1:1 ratio), vCPU1 was offered at a 1.5:1 ratio, and vCPU2-7 at a 2.6:1 ratio.
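
As a back-of-the-envelope illustration of this template, the sketch below estimates how many physical cores one 8-vCPU VM reserves. It assumes that an n:1 ratio means n vCPUs share one physical core, so each vCPU reserves 1/n of a core. The function names `template_ratio` and `physical_cores_per_vm` are hypothetical and not part of the scheduler.

```python
def template_ratio(vcpu_index):
    """Oversubscription ratio (vCPUs per physical core) for a given vCPU index."""
    if vcpu_index == 0:
        return 1.0   # vCPU0: not oversubscribed
    if vcpu_index == 1:
        return 1.5   # vCPU1
    return 2.6       # vCPU2 and above


def physical_cores_per_vm(n_vcpus=8):
    """Physical cores reserved by one VM, assuming each vCPU reserves 1/ratio core."""
    return sum(1.0 / template_ratio(i) for i in range(n_vcpus))


cores = physical_cores_per_vm()
print(round(cores, 2))  # 1/1 + 1/1.5 + 6/2.6 -> 3.97
```

Under this assumption, an 8-vCPU VM reserves just under 4 physical cores, about half of what a 1:1 allocation would need.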

In our context, a subset is a collection of physical cores on which vCPUs may be pinned. Each subset has its own oversubscription ratio. Solid lines show the allocated size (i.e. the number of physical CPUs associated with a given oversubscription level), whereas the semi-transparent dashed lines show the amount in use.
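
To make the relation between the two kinds of lines concrete, here is a minimal sketch on made-up data. It assumes that, for subset records, `config` is the number of allocated physical cores and `val` is the number of cores in use, as plotted on the shared `cores` axis. The rows and subset labels in `toy` are invented for illustration and do not come from `monitoring.csv`.

```python
import pandas as pd

# Toy subset records (made-up values, hypothetical subset labels)
toy = pd.DataFrame({
    'subset': ['1.0', '1.5', '2.6', '2.6'],
    'config': [4.0, 3.0, 10.0, 12.0],  # physical cores allocated to the subset
    'val':    [2.0, 1.5, 5.0, 9.0],    # physical cores in use (assumed unit)
})

# Fraction of each allocation that is actually used
toy['utilization'] = toy['val'] / toy['config']

# Average utilization per oversubscription level
mean_util = toy.groupby('subset')['utilization'].mean()
print(mean_util)
```

The same two lines could be applied to `subsets_cpu` once the column semantics have been checked against the real data.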

In [None]:
palette = sns.color_palette("Set2", subsets_cpu['subset'].nunique())

g_val = sns.lineplot(data=subsets_cpu, x='tmp', y='val', hue='subset', palette=palette, linestyle='--', legend=False, alpha=0.5)
g_config = sns.lineplot(data=subsets_cpu, x='tmp', y='config', hue='subset', palette=palette)
g_config.legend(loc='upper right', title=None)

plt.xlim([0, 30000])
plt.ylim([0, 140])
g_config.set_ylabel('cores')
g_config.set_xlabel('time')
res = g_config.set_xticklabels([])

# VM usage pattern

In [None]:
vms = dataset.loc[dataset['rec'] == 'vm']
vms_cpu = vms.loc[vms['res'] == 'cpu']
vms_mem = vms.loc[vms['res'] == 'mem']

In [None]:
vms_cpu.tail()

In [None]:
print('Number of VMs:', vms_cpu['vm_cmn'].nunique())

We now illustrate the diversity of CPU usage patterns among the hosted VMs.
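
Besides plotting, one possible way to quantify this diversity is a per-VM summary of the cores used. The sketch below runs on made-up data. It assumes that `val` is the used fraction of the `config` vCPUs, which matches how `core_used` is computed in this notebook. The values in `toy_vms` are invented for illustration.

```python
import pandas as pd

# Toy VM records (made-up values); 'val' is assumed to be a fraction of 'config'
toy_vms = pd.DataFrame({
    'vm_cmn': ['vm1', 'vm1', 'vm2', 'vm2', 'vm3', 'vm3'],
    'val':    [0.10, 0.90, 0.50, 0.50, 0.00, 0.25],
    'config': [8, 8, 8, 8, 8, 8],
})
toy_vms['core_used'] = toy_vms['val'] * toy_vms['config']

# Mean gives the typical load; std gives a rough idea of how bursty each VM is
summary = toy_vms.groupby('vm_cmn')['core_used'].agg(['mean', 'std'])
print(summary)
```

In this toy example, `vm1` and `vm2` have the same mean load but very different variability, which is the kind of contrast the figure is meant to show.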

In [None]:
# Work on an explicit copy so that adding a column does not touch a view of the dataset
vms_cpu_focus = vms_cpu.loc[vms_cpu['vm_cmn'].isin(['vm1', 'vm2', 'vm3'])].copy()
vms_cpu_focus['core_used'] = vms_cpu_focus['val'] * vms_cpu_focus['config']
palette = sns.color_palette("Set2", vms_cpu_focus['vm_cmn'].nunique())

print('This step is time consuming...')
g = sns.lineplot(data=vms_cpu_focus, x='tmp', y='core_used', hue='vm_cmn', palette=palette)
g.set_ylabel('cores')
g.set_xlabel('time')
g.legend(loc='upper right', title=None)
res = g.set_xticklabels([])

# Host load

In [None]:
host = dataset.loc[dataset['rec'] == 'global']
host_cpu = host.loc[host['res'] == 'cpu']
host_mem = host.loc[host['res'] == 'mem']

We now report the host load during the experiment.

In [None]:
# assign() returns a new DataFrame, which avoids writing into a slice of the dataset
host_cpu = host_cpu.assign(core_used=host_cpu['val'] * host_cpu['config'])

palette = sns.color_palette("Set2", 2)

g = sns.lineplot(data=host_cpu, x='tmp', y='core_used', color=palette[0])
plt.hlines(host_cpu['config'].max(), xmin=0, xmax=host_cpu['tmp'].max(), colors=palette[1], linestyles='solid', label='config')
g.set_xlabel('time')

res = g.set_xticklabels([])