**Table of contents**<a id='toc0_'></a>    
- [Importing Libraries](#toc1_)    
- [Reading a Result](#toc2_)    
- [Reading Example of Audible](#toc3_)    
- [Reading Example of CLT](#toc4_)    
- [Reading Example of oversubscription-oracle](#toc5_)    
- [Reporting Average Utilization and Violation Rate](#toc6_)    

<!-- vscode-jupyter-toc-config
	numbering=false
	anchor=true
	flat=false
	minLevel=1
	maxLevel=6
	/vscode-jupyter-toc-config -->
<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->

# <a id='toc1_'></a>[Importing Libraries](#toc0_)

In [1]:
import pandas as pd
import numpy as np

# <a id='toc2_'></a>[Reading a Result](#toc0_)

The `result_df` dataframe stores each server's **usage** and potential **carry_over** (resulting from resource shortage) throughout the steady-state phase. Each row of the dataframe represents one server. With the steady-state time frame set to 2016, the utilization and carry-over metrics are recorded at 2016 distinct time points toward the end of the simulation (the example results below use a `steady_state_time` of 2880). The **deployed_times** column holds a list of tuples, each consisting of the simulation time point at which a VM was deployed, the time point at which it was terminated, and the VM ID, for the VMs that influence steady-state usage. Every VM that was active during all or part of the steady-state period is therefore recorded in this list for each server.
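
As a rough illustration of this layout, the sketch below builds a toy dataframe with the same three columns and inspects one server's row. All values are made up, a steady-state window of 4 points is assumed for brevity, and `toy_df` is a hypothetical stand-in rather than an actual simulator output.

```python
import numpy as np
import pandas as pd

# Toy stand-in for result_df: 2 servers, 4 steady-state points (all values are made up).
toy_df = pd.DataFrame({
    "usage": [np.array([24.1, 26.2, 26.8, 25.0]), np.array([35.4, 24.7, 27.0, 30.1])],
    "carry_over": [np.zeros(4), np.zeros(4)],
    # Each entry: (deployment time, termination time, VM ID)
    "deployed_times": [
        [(379, 87264, 1698679)],
        [(3417, 87264, 3009257), (4190, 87264, 52101)],
    ],
})

server = toy_df.iloc[0]
num_points = len(server["usage"])  # one usage sample per steady-state point

# Walk through the VMs that contributed to this server's steady-state usage
for deployed, terminated, vm_id in server["deployed_times"]:
    print(f"VM {vm_id}: deployed at {deployed}, terminated at {terminated}")

# Number of VMs recorded for each server
vms_per_server = toy_df["deployed_times"].apply(len).tolist()
```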


Additional columns in the dataframe are algorithm-dependent. To explain the columns unique to each algorithm, the sections below walk through the results of running each example in `run_simulator.ipynb`.

In [7]:
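# Helpers for reading a stored simulation result
def parse_filename_to_dict(filename):
    """Recover the simulation parameters encoded in a result filename."""
    keys = ["rand_seed", "algorithm_name", "ds_name", "num_arrival_vms_per_time_idx", "time_bound", "first_model", "prediction_type", "lb_name", "number_of_servers", "server_capacity", "acceptable_violation", "retreat_num_samples", "drop", "steady_state_time"]
    # Remove the extension explicitly: str.strip('.feather') strips a set of characters, not the suffix
    if filename.endswith('.feather'):
        filename = filename[:-len('.feather')]
    values = filename.split('_')[1:] # skip the leading prefix (e.g. 'small')
    values[2] += '_' + values[3] # to account for '_' in the ds_name
    values[8] += '_' + values[9] # to account for '_' in the lb_name
    values.pop(3)
    values.pop(8)
    return dict(zip(keys, values))

def read_result(location):
    """Load the result dataframe and its parameters; `location` is the path without extension."""
    try:
        # Prefer the saved parameter file when it is available
        simulation_param_dict = np.load(f'{location}_params.npy', allow_pickle = True).reshape(1, )[0]['params']
    except Exception:
        # Otherwise fall back to the parameters encoded in the filename (all values are strings)
        simulation_param_dict = parse_filename_to_dict(location.split('/')[-1])
    result_df = pd.read_feather(f'{location}.feather')

    print('Result for the following simulation setting has been retrieved:\n', simulation_param_dict)
    return result_df, simulation_param_dict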
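
When the parameters are recovered from the filename, every value is a string, so numeric fields need converting before use. The sketch below shows one possible way to do that; `cast_params` is a hypothetical helper that is not part of the notebook, and the choice of which keys are integers or floats is an assumption based on the example outputs.

```python
def cast_params(params):
    """Return a copy of `params` with numeric-looking fields converted (hypothetical helper)."""
    # Assumed integer-valued and float-valued keys, based on the example outputs
    int_keys = {"rand_seed", "num_arrival_vms_per_time_idx", "time_bound",
                "number_of_servers", "server_capacity", "retreat_num_samples",
                "steady_state_time"}
    float_keys = {"acceptable_violation"}

    typed = dict(params)
    for key, value in params.items():
        if key in int_keys:
            typed[key] = int(value)
        elif key in float_keys:
            typed[key] = float(value)
        elif key == "drop":
            typed[key] = str(value) == "True"
    # Other keys (e.g. 'first_model', which may be '0.95' or '0.4X') are left untouched
    return typed

example = {"server_capacity": "48", "acceptable_violation": "0.01",
           "drop": "True", "first_model": "0.4X"}
typed = cast_params(example)
```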

# <a id='toc3_'></a>[Reading Example of Audible](#toc0_)

In [8]:
location = 'results/audible/small_777_audible_2021_burstable_269_87264_0.95_oracle_worst-fit_usage_756_48_0.01_0_True_2880'
result_df, simulation_param_dict = read_result(location)
result_df.head(2)

Result for the following simulation setting has been retrieved:
 {'rand_seed': '777', 'algorithm_name': 'audible', 'ds_name': '2021_burstable', 'num_arrival_vms_per_time_idx': '269', 'time_bound': '87264', 'first_model': '0.95', 'prediction_type': 'oracle', 'lb_name': 'worst-fit_usage', 'number_of_servers': '756', 'server_capacity': '48', 'acceptable_violation': '0.01', 'retreat_num_samples': '0', 'drop': 'True', 'steady_state_time': '2880'}


| | usage | carry_over | deployed_times |
|---|---|---|---|
| 0 | [24.11999999999997, 26.23999999999996, 26.7799... | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ... | [[379, 87264, 1698679], [4541, 87264, 2002747]... |
| 1 | [35.40999999999998, 24.74999999999998, 26.9899... | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ... | [[3417, 87264, 3009257], [4190, 87264, 52101],... |


# <a id='toc4_'></a>[Reading Example of CLT](#toc0_)

Besides the standard columns present in result dataframes, the dataframe specific to the CLT algorithm features additional columns: **variance** and **mean**. These columns record the variance and mean values of the Gaussian distribution that models the aggregated server usage across each time point in the steady-state period for each of the servers.
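
One way to read these two columns is through the Gaussian tail: given a mean and variance, the probability that aggregated usage exceeds the server capacity follows from the normal distribution. The sketch below illustrates that calculation with the first values shown for server 0; it is an interpretation aid under that assumption, not necessarily how the CLT algorithm uses the columns, and `gaussian_exceedance_probability` is a hypothetical name.

```python
import math

def gaussian_exceedance_probability(mean, variance, capacity):
    """P(X > capacity) for X ~ Normal(mean, variance)."""
    z = (capacity - mean) / math.sqrt(variance)
    # Upper tail of the standard normal, expressed with the complementary error function
    return 0.5 * math.erfc(z / math.sqrt(2))

# First steady-state point of server 0 in the CLT example, with server_capacity = 48
p = gaussian_exceedance_probability(
    mean=34.60376454997121, variance=13.261393227559951, capacity=48
)
print(p)
```

For these values the capacity lies more than three standard deviations above the mean, so the resulting probability is well below the `acceptable_violation` of 0.01.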

In [9]:
location = 'results/CLT/small_777_CLT_2021_burstable_278_87264_0.95_oracle_worst-fit_usage_756_48_0.01_0_True_2880'
result_df, simulation_param_dict = read_result(location)
result_df.head(2)

Result for the following simulation setting has been retrieved:
 {'rand_seed': '777', 'algorithm_name': 'CLT', 'ds_name': '2021_burstable', 'num_arrival_vms_per_time_idx': '278', 'time_bound': '87264', 'first_model': '0.95', 'prediction_type': 'oracle', 'lb_name': 'worst-fit_usage', 'number_of_servers': '756', 'server_capacity': '48', 'acceptable_violation': '0.01', 'retreat_num_samples': '0', 'drop': 'True', 'steady_state_time': '2880'}


| | usage | variance | mean | carry_over | deployed_times |
|---|---|---|---|---|---|
| 0 | [35.84999999999998, 28.749999999999975, 32.239... | [13.261393227559951, 13.261393227559951, 13.26... | [34.60376454997121, 34.60376454997121, 34.6037... | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ... | [[8, 87264, 1900296], [2305, 87264, 2027791], ... |
| 1 | [31.749999999999968, 28.149999999999963, 27.42... | [8.356456911768229, 8.356456911768229, 8.35645... | [40.038667449687225, 40.038667449687225, 40.03... | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ... | [[1616, 87264, 170725], [10774, 87264, 1431534... |


# <a id='toc5_'></a>[Reading Example of oversubscription-oracle](#toc0_)

In addition to the regular columns in result dataframes, the dataframe for the oversubscription-oracle algorithm includes an extra column: **mean**. This column reflects the total allocated CPU, based on the 'first_model' algorithm parameter, at every point in the steady state for each server. For instance, if 'first_model' is set to '2X', the column would display the sum of 2X the baseline for colocated VMs at each simulation point.
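
The arithmetic behind this column can be sketched as follows. The baselines are made-up values, `allocated_cpu` is a hypothetical helper, and parsing 'first_model' by removing the trailing 'X' is an assumption about the format rather than the simulator's own code.

```python
def allocated_cpu(baselines, first_model):
    """Sum of (multiplier x baseline) over the VMs colocated on a server at one time point."""
    # Assumed format: a number followed by 'X', e.g. '2X' or '0.4X'
    multiplier = float(first_model.rstrip("Xx"))
    return sum(multiplier * baseline for baseline in baselines)

# Three colocated VMs with made-up baselines, 'first_model' set to '2X'
total = allocated_cpu([2.0, 4.0, 1.5], "2X")
print(total)  # 2 * (2.0 + 4.0 + 1.5) = 15.0
```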

In [10]:
location = 'results/oversubscription-oracle/small_777_oversubscription-oracle_2021_burstable_242_87264_0.4X_oracle_worst-fit_usage_756_48_0.01_0_True_2880'
result_df, simulation_param_dict = read_result(location)
result_df.head(2)

Result for the following simulation setting has been retrieved:
 {'rand_seed': '777', 'algorithm_name': 'oversubscription-oracle', 'ds_name': '2021_burstable', 'num_arrival_vms_per_time_idx': '242', 'time_bound': '87264', 'first_model': '0.4X', 'prediction_type': 'oracle', 'lb_name': 'worst-fit_usage', 'number_of_servers': '756', 'server_capacity': '48', 'acceptable_violation': '0.01', 'retreat_num_samples': '0', 'drop': 'True', 'steady_state_time': '2880'}


| | usage | mean | carry_over | deployed_times |
|---|---|---|---|---|
| 0 | [26.00999999999998, 30.38999999999997, 21.8099... | [39.97999999999992, 39.61999999999992, 39.6199... | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ... | [[3224, 87264, 1842505], [4032, 85536, 574079]... |
| 1 | [24.139999999999972, 23.809999999999963, 27.00... | [37.40399999999993, 37.40399999999993, 37.4039... | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ... | [[1983, 87264, 811923], [4852, 87264, 625541],... |


The dataframe corresponding to the rc algorithm does not contain any extra columns; therefore, we have chosen not to include it here.

# <a id='toc6_'></a>[Reporting Average Utilization and Violation Rate](#toc0_)

The following function reports the utilization and violation rate for each experiment result (the metrics are defined in Section 5.3 of the paper).

- Server utilization: The average CPU utilization in the steady state for each server.
- Server capacity violation rate: The fraction of all steady-state points with a server capacity violation (BVM CPU demand exceeded server capacity) for each server.
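
As a small worked example of the two definitions, the sketch below computes both metrics for a single server over 8 made-up steady-state points. The usage values are hypothetical, and a point is counted as a violation when usage is greater than or equal to capacity, matching the `>=` comparison in the notebook's function.

```python
import numpy as np

server_capacity = 48
# Made-up usage for one server over 8 steady-state points
usage = np.array([24.0, 36.0, 48.0, 50.0, 12.0, 30.0, 40.0, 40.0])

# Server utilization: average usage relative to capacity, in percent
utilization_pct = 100 * usage.mean() / server_capacity

# Violation rate: share of points at or above capacity, in percent
violation_rate_pct = 100 * np.count_nonzero(usage >= server_capacity) / len(usage)

print(utilization_pct, violation_rate_pct)
```

Here the mean usage is 35 out of 48 (about 72.9% utilization), and 2 of the 8 points reach capacity, giving a violation rate of 25%.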

In [11]:
def report_usage_violation(result_df, server_capacity, acceptable_violation, steady_state_time):
    # Average utilization (%): mean usage per server, averaged over servers, relative to capacity
    avg_usage = np.mean(result_df['usage'].apply(np.mean))*100/server_capacity
    print('Average utilization (%) across all servers:', avg_usage)
    # Count servers whose fraction of violating steady-state points reaches the acceptable threshold
    num_servers_with_severe_violation = np.count_nonzero(result_df['usage'].apply(lambda u: 1 if np.sum(u>=server_capacity)/steady_state_time >= acceptable_violation else 0))
    # acceptable_violation is a fraction, so it is converted to a percentage for display
    print('Number of servers with a violation rate of at least {}% in the steady state is {}'.format(acceptable_violation*100, num_servers_with_severe_violation))
    # Per-server violation rate (%), computed once and reused for the summary statistics
    violation_rates = result_df['usage'].apply(lambda x: 100*len(x[x>=server_capacity])/len(x))
    avg_violation_rate = np.mean(violation_rates)
    print('Average violation rate is {}%'.format(avg_violation_rate))
    p99_violation_rate = np.quantile(violation_rates, 0.99)
    print('99 percentile violation rate is {}%'.format(p99_violation_rate))
    max_violation_rate = np.max(violation_rates)
    print('max violation rate is {}%'.format(max_violation_rate))

    return avg_usage, num_servers_with_severe_violation, avg_violation_rate, p99_violation_rate, max_violation_rate

In [13]:
server_capacity = int(simulation_param_dict['server_capacity'])
acceptable_violation = float(simulation_param_dict["acceptable_violation"])
steady_state_time = int(simulation_param_dict['steady_state_time'])
report_usage_violation(result_df, server_capacity, acceptable_violation, steady_state_time)

Average utilization (%) across all servers: 51.08046053064675
Number of servers with a violation rate of at least 1.0% in the steady state is 0
Average violation rate is 0.00496031746031746%
99 percentile violation rate is 0.1857638888888936%
max violation rate is 0.5902777777777778%


(51.08046053064675,
 0,
 0.00496031746031746,
 0.1857638888888936,
 0.5902777777777778)