**Table of contents**<a id='toc0_'></a>    
- [Graphical Interface to Set Simulation Parameters](#toc1_1_)    
  - [Example of Running Audible](#toc1_2_)    
  - [Example of Running CLT](#toc1_3_)    
  - [Example of Running Resource Central](#toc1_4_)    
  - [Example of Running oversubscription-oracle](#toc1_5_)    
- [Simulation Parameter Values for the ASPLOS24 Paper](#toc2_)    


## <a id='toc1_1_'></a>[Graphical Interface to Set Simulation Parameters](#toc0_)

**Configuration dictionary**

The configuration dictionary contains various parameters used for simulation settings.

- `rand_seed`: Specifies the seed value to shuffle the order of arriving VMs for simulation.

- `algorithm_name`: Specifies the name of the algorithm being used.

- `ds_name`: Specifies the dataset name.

- `num_arrival_vms_per_time_idx`: Takes an integer to specify how many VMs are placed at each simulation time point.

- `time_bound`: Specifies the number of simulation time points, where each time point represents 5 minutes of real time. A value of 86400 corresponds to 300 days, or roughly 10 months.

- `first_model`: Specifies the first model to be used for each algorithm upon VM arrival based on its type. For Audible and CLT, it uses the 95th percentile, similar to what Resource Central did. For oversubscription-oracle, it could be any coefficient of the baseline.

- `prediction_type`: Specifies the prediction type.

- `lb_name`: Specifies the name of the load balancer. We fix this to "worst-fit_usage".

- `number_of_servers`: Specifies the number of servers.

- `server_capacity`: Specifies the number of cores in each server.

- `acceptable_violation`: Specifies the target violation that each algorithm is trying to achieve.

- `retreat_num_samples`: Specifies the number of simulation points, x, for which a server stops receiving new VMs after a violation; that is, VMs are not placed on a server that had a violation in the past x simulation points. This value is zero for the reported results, as it makes little difference.

- `drop`: If set to `False`, algorithms are allowed to reject VMs for placement. In this work, it is always set to `True` so that algorithms always accept VMs for placement.

- `steady_state_time`: Specifies the number of simulation points at the end of the run that are used to generate the results. The simulation needs time to reach a steady state where the results are credible, so only the last week (2016 five-minute points) of the almost 10-month run is used.
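
The time-related parameters above can be sanity-checked with a little arithmetic. The sketch below converts simulation time points to days, using the 5-minute step stated for `time_bound`. The helper name `points_to_days` is illustrative and is not part of the simulator.

```python
MINUTES_PER_POINT = 5  # each simulation time point represents 5 minutes


def points_to_days(num_points: int) -> float:
    """Convert a number of 5-minute simulation points to days."""
    return num_points * MINUTES_PER_POINT / (60 * 24)


print(points_to_days(86400))  # time_bound: 300.0 days, roughly 10 months
print(points_to_days(2016))   # steady_state_time: 7.0 days, i.e. one week
```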


In [1]:
# Configuration dictionary Example
config_dict = {
    "rand_seed": 0, 
    "algorithm_name": "audible",
    "ds_name": "2021_burstable", 
    "num_arrival_vms_per_time_idx": 9, 
    "time_bound": 86400, 
    "first_model": 0.95, 
    "prediction_type": "est", 
    "lb_name": "worst-fit_usage", 
    "number_of_servers": 30, 
    "server_capacity": 48, 
    "acceptable_violation": 0.01, 
    "retreat_num_samples": 0, 
    "drop": True, 
    "steady_state_time": 2016 
}

GUI

In [2]:
import os
import json
from ipywidgets import widgets
from IPython.display import display

# Initial widget options and default values
widget_options = {
    "server_capacity": [36, 48, 64],
    "acceptable_violation": [0.0025, 0.005, 0.01, 0.02, 0.03, 0.04, 0.05],
    "drop": [True, False],
    "ds_name": ['2021_burstable'],
    "first_model": {'audible': [0.95], 'CLT': [0.95], 'oversubscription-oracle': ['0.1X', '0.2X', '0.3X', '0.4X', '0.5X', '0.6X', '0.7X', '0.8X', '0.9X', '1.1X', '1.3X', '2X'], 'rc': ['rc-0.95']},
    "prediction_type": {'audible': ['est'], 'CLT': ['est', 'oracle'], 'oversubscription-oracle': ['oracle'], 'rc': ['oracle']},
    "lb_name": ['worst-fit_usage'],
    "algorithm_name": ['audible', 'CLT', 'oversubscription-oracle', 'rc']
}

all_widgets = {}
for key in config_dict:
    style = {'description_width': 'initial'}
    if key in widget_options:
        all_widgets[key] = widgets.Dropdown(options=widget_options[key] if key not in ['first_model', 'prediction_type'] else widget_options[key][config_dict["algorithm_name"]], 
                                            value=config_dict[key], 
                                            description=key.replace('_', ' ').capitalize() + ':', style = style)
    elif key in ["rand_seed", "num_arrival_vms_per_time_idx", "number_of_servers"]:
        all_widgets[key] = widgets.BoundedIntText(value = 1, min=1, max=1000000, description = key.replace('_', ' ').capitalize() + ':', style = style)
    elif key in ["time_bound"]:
        all_widgets[key] = widgets.BoundedIntText(value = 86400, min=1, max=1000000, description = key.replace('_', ' ').capitalize() + ':', style = style)
    else:
        all_widgets[key] = widgets.IntText(config_dict[key], description = key.replace('_', ' ').capitalize() + ':', disabled = True, style = style)
    

# Dynamic update functions
def update_first_model_options(*args):
    all_widgets['first_model'].options = widget_options['first_model'][all_widgets['algorithm_name'].value]
    if all_widgets['first_model'].value not in all_widgets['first_model'].options:
        all_widgets['first_model'].value = all_widgets['first_model'].options[0]

def update_prediction_type_options(*args):
    all_widgets['prediction_type'].options = widget_options['prediction_type'][all_widgets['algorithm_name'].value]
    if all_widgets['prediction_type'].value not in all_widgets['prediction_type'].options:
        all_widgets['prediction_type'].value = all_widgets['prediction_type'].options[0]

# Set observers
all_widgets['algorithm_name'].observe(update_first_model_options, 'value')
all_widgets['algorithm_name'].observe(update_prediction_type_options, 'value')

# Call update functions to set initial state correctly
update_first_model_options()
update_prediction_type_options()

# Function to save the widget values as a JSON file
def save_json_button_clicked(b):
    settings_dict = {key: widget.value for key, widget in all_widgets.items()}
    # print(settings_dict)
    fn = '_'.join([str(settings_dict[i]) for i in settings_dict]) 
    fn += '.json'
    file_path = f'simulation_param_files/{fn}'
    os.makedirs('simulation_param_files', exist_ok=True)
    with open(file_path, 'w') as json_file:
        json.dump(settings_dict, json_file, indent=4)
    print(f"Configuration saved to {file_path}")

# Generate Simulation Dict and Save it
gen_json_button = widgets.Button(description='Save Simulation Param Dict!')
gen_json_button.layout.width = '200px'
gen_json_button.on_click(save_json_button_clicked)

# Display all widgets and the button
widget_width = '600px'  # Adjust the width as needed

# Set layout width for each widget
for widget in all_widgets.values():
    widget.layout.width = widget_width
    display(widget)
display(gen_json_button)


BoundedIntText(value=1, description='Rand seed:', layout=Layout(width='600px'), max=1000000, min=1, style=Desc…

Dropdown(description='Algorithm name:', layout=Layout(width='600px'), options=('audible', 'CLT', 'oversubscrip…

Dropdown(description='Ds name:', layout=Layout(width='600px'), options=('2021_burstable',), style=DescriptionS…

BoundedIntText(value=1, description='Num arrival vms per time idx:', layout=Layout(width='600px'), max=1000000…

BoundedIntText(value=86400, description='Time bound:', layout=Layout(width='600px'), max=1000000, min=1, style…

Dropdown(description='First model:', layout=Layout(width='600px'), options=(0.95,), style=DescriptionStyle(des…

Dropdown(description='Prediction type:', layout=Layout(width='600px'), options=('est',), style=DescriptionStyl…

Dropdown(description='Lb name:', layout=Layout(width='600px'), options=('worst-fit_usage',), style=Description…

BoundedIntText(value=1, description='Number of servers:', layout=Layout(width='600px'), max=1000000, min=1, st…

Dropdown(description='Server capacity:', index=1, layout=Layout(width='600px'), options=(36, 48, 64), style=De…

Dropdown(description='Acceptable violation:', index=2, layout=Layout(width='600px'), options=(0.0025, 0.005, 0…

IntText(value=0, description='Retreat num samples:', disabled=True, layout=Layout(width='600px'), style=Descri…

Dropdown(description='Drop:', layout=Layout(width='600px'), options=(True, False), style=DescriptionStyle(desc…

IntText(value=2016, description='Steady state time:', disabled=True, layout=Layout(width='600px'), style=Descr…

Button(description='Save Simulation Param Dict!', layout=Layout(width='200px'), style=ButtonStyle())

Configuration saved to simulation_param_files/1_CLT_2021_burstable_5_86400_0.95_est_worst-fit_usage_10_48_0.01_0_True_2016.json
Configuration saved to simulation_param_files/1_CLT_2021_burstable_4_86400_0.95_est_worst-fit_usage_10_48_0.01_0_True_2016.json
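
The save callback above names each file by joining the configuration values with underscores. For scripted use without the widgets, the same convention can be reproduced directly. The following is a minimal sketch: `save_config` is a hypothetical helper (not part of the repository) that mirrors the naming logic of `save_json_button_clicked`, and the demo writes to a temporary directory to avoid touching `simulation_param_files`.

```python
import json
import os
import tempfile


def save_config(config: dict, out_dir: str = "simulation_param_files") -> str:
    """Write `config` as JSON, naming the file after its values joined by '_'."""
    file_name = "_".join(str(value) for value in config.values()) + ".json"
    os.makedirs(out_dir, exist_ok=True)
    file_path = os.path.join(out_dir, file_name)
    with open(file_path, "w") as json_file:
        json.dump(config, json_file, indent=4)
    return file_path


example_config = {
    "rand_seed": 1,
    "algorithm_name": "audible",
    "ds_name": "2021_burstable",
    "num_arrival_vms_per_time_idx": 2,
    "time_bound": 86400,
    "first_model": 0.95,
    "prediction_type": "est",
    "lb_name": "worst-fit_usage",
    "number_of_servers": 10,
    "server_capacity": 48,
    "acceptable_violation": 0.01,
    "retreat_num_samples": 0,
    "drop": True,
    "steady_state_time": 2016,
}

# Write to a throwaway directory for this demonstration
saved_path = save_config(example_config, out_dir=tempfile.mkdtemp())
print(os.path.basename(saved_path))
```

Because the file name depends on the order of the dictionary's keys, the keys should be kept in the same order as in `config_dict`.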


## <a id='toc1_2_'></a>[Example of Running Audible](#toc0_)

In [3]:
%run src/main.py "simulation_param_files/1_audible_2021_burstable_2_86400_0.95_est_worst-fit_usage_10_48_0.01_0_True_2016.json"

Running  {'rand_seed': 1, 'algorithm_name': 'audible', 'ds_name': '2021_burstable', 'num_arrival_vms_per_time_idx': 2, 'time_bound': 86400, 'first_model': 0.95, 'prediction_type': 'est', 'lb_name': 'worst-fit_usage', 'number_of_servers': 10, 'server_capacity': 48, 'acceptable_violation': 0.01, 'retreat_num_samples': 0, 'drop': True, 'steady_state_time': 2016}
Reading data files took 0:00:04.783214


100%|██████████| 86400/86400 [00:09<00:00, 8819.47it/s]


length of dropped_vmids for arrival  2  is  0
Average utilization (%) accross all servers: 32.848474702380926
Number of servers with violation more than 1.0% in the last week is 0


## <a id='toc1_3_'></a>[Example of Running CLT](#toc0_)

In [4]:
%run src/main.py "simulation_param_files/1_CLT_2021_burstable_2_86400_0.95_est_worst-fit_usage_10_48_0.01_0_True_2016.json"

Running  {'rand_seed': 1, 'algorithm_name': 'CLT', 'ds_name': '2021_burstable', 'num_arrival_vms_per_time_idx': 2, 'time_bound': 86400, 'first_model': 0.95, 'prediction_type': 'est', 'lb_name': 'worst-fit_usage', 'number_of_servers': 10, 'server_capacity': 48, 'acceptable_violation': 0.01, 'retreat_num_samples': 0, 'drop': True, 'steady_state_time': 2016}
Reading data files took 0:00:03.924341


100%|██████████| 86400/86400 [00:07<00:00, 11784.16it/s]


length of dropped_vmids for arrival  2  is  0
Average utilization (%) accross all servers: 32.848474702380926
Number of servers with violation more than 1.0% in the last week is 0


## <a id='toc1_4_'></a>[Example of Running Resource Central](#toc0_)

In [5]:
%run src/main.py "simulation_param_files/1_rc_2021_burstable_2_86400_rc-0.95_oracle_worst-fit_usage_10_48_0.01_0_True_2016.json"

Running  {'rand_seed': 1, 'algorithm_name': 'rc', 'ds_name': '2021_burstable', 'num_arrival_vms_per_time_idx': 2, 'time_bound': 86400, 'first_model': 'rc-0.95', 'prediction_type': 'oracle', 'lb_name': 'worst-fit_usage', 'number_of_servers': 10, 'server_capacity': 48, 'acceptable_violation': 0.01, 'retreat_num_samples': 0, 'drop': True, 'steady_state_time': 2016}
Reading data files took 0:00:03.631148


 23%|██▎       | 19608/86400 [00:00<00:02, 28354.30it/s]


Rejecting a VM at time 19608


## <a id='toc1_5_'></a>[Example of Running oversubscription-oracle](#toc0_)

In [6]:
%run src/main.py "simulation_param_files/1_oversubscription-oracle_2021_burstable_2_86400_0.5X_oracle_worst-fit_usage_10_48_0.01_0_True_2016.json"

Running  {'rand_seed': 1, 'algorithm_name': 'oversubscription-oracle', 'ds_name': '2021_burstable', 'num_arrival_vms_per_time_idx': 2, 'time_bound': 86400, 'first_model': '0.5X', 'prediction_type': 'oracle', 'lb_name': 'worst-fit_usage', 'number_of_servers': 10, 'server_capacity': 48, 'acceptable_violation': 0.01, 'retreat_num_samples': 0, 'drop': True, 'steady_state_time': 2016}
Reading data files took 0:00:03.775505


100%|██████████| 86400/86400 [00:02<00:00, 30038.19it/s]


length of dropped_vmids for arrival  2  is  0
Average utilization (%) accross all servers: 32.848474702380926
Number of servers with violation more than 1.0% in the last week is 0
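
Each example above passes one parameter file to `src/main.py`. To queue several files, the commands can be assembled programmatically. The sketch below only builds the command lines and does not launch them. `build_commands` is a hypothetical helper, and it assumes that `src/main.py` takes the JSON path as its single positional argument, as the `%run` calls above suggest.

```python
import glob
import os
import sys


def build_commands(param_dir: str = "simulation_param_files") -> list:
    """Return one [python, 'src/main.py', <file>] command per JSON file in `param_dir`."""
    param_files = sorted(glob.glob(os.path.join(param_dir, "*.json")))
    return [[sys.executable, "src/main.py", path] for path in param_files]


for command in build_commands():
    # Only print the command; running it (for example with subprocess.run)
    # starts a full simulation and is left to the user.
    print(" ".join(command))
```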


# <a id='toc2_'></a>[Simulation Parameter Values for the ASPLOS24 Paper](#toc0_)

The results reported in the paper come from experiments run with the simulation parameters in the `all_sim_params` list. These large-scale experiments require extensive computational resources; we used Azure Batch to run them.

| Configuration Key                         | Values                                                                                                  |
|-------------------------------------------|---------------------------------------------------------------------------------------------------------|
| `rand_seed`                               | 0, 1, 2, 3, ...                                                                                         |
| `algorithm_name`                          | "audible", "oversubscription-oracle", "CLT", or "rc"                                                    |
| `ds_name`                                 | "2021_burstable" or "2021_regular"                                                                      |
| `num_arrival_vms_per_time_idx`            | A positive value up to hundreds per simulation time point; varies depending on the algorithm            |
| `time_bound`                              | 86400 (equivalent to about 10 months of simulated time) \*                                              |
| `first_model`                             | 0.95 (95th percentile conservative model) for "audible" and "CLT"; a multiple of the baseline ("0.1X", "0.2X", "0.3X", ...) for "oversubscription-oracle"; "rc-0.95" for "rc" |
| `prediction_type`                         | "oracle" or "est"                                                                                       |
| `lb_name`                                 | "worst-fit_usage" (worst-fit according to server usage) \*                                              |
| `(number_of_servers, server_capacity)`    | (1008, 36), (756, 48), (567, 64)                                                                        |
| `acceptable_violation`                    | 0.005, 0.01, 0.025, and 0.05                                                                            |
| `retreat_num_samples`                     | 0 (no retreat upon violation) \*                                                                        |
| `drop`                                    | True (no rejection) \*                                                                                  |
| `steady_state_time`                       | 2016 (one week) \*                                                                                      |

\* Constant across simulations
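
The three `(number_of_servers, server_capacity)` pairs in the table appear to be chosen so that the cluster's total core count stays fixed while the server size varies. That reading is an inference from the numbers, not a statement from the source; a quick check:

```python
server_configurations = [(1008, 36), (756, 48), (567, 64)]

# Total cores in the cluster for each (number_of_servers, server_capacity) pair
total_cores = [num_servers * capacity for num_servers, capacity in server_configurations]
print(total_cores)  # [36288, 36288, 36288]
```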

In [7]:
from itertools import product
# Import simulator module
for module in ["simulator.py"]:
    temp =  'src/' + module
    %run $temp

# Define parameters
random_seeds = range(10)
dataset_names = ['2021_burstable', '2021_regular']
algorithms = ['audible', "CLT", "rc", "oversubscription-oracle"]
server_configurations = [(1008, 36), (756, 48), (567, 64)]
violation_thresholds = [0.005, 0.01, 0.025, 0.05]
arrival_rates = range(10, 400)


# Generate configurations
configurations = product(
    random_seeds,
    dataset_names,
    algorithms,
    server_configurations,
    violation_thresholds,
    arrival_rates
)

all_sim_params = []
results = {}
# Iterate over configurations
for rs, ds_name, algo, (num_server, server_cap), viol_th, arr_rate in configurations:
    for first_model in widget_options['first_model'][algo]:
        for prediction_type in widget_options["prediction_type"][algo]:
            config_dict = {
                "rand_seed": rs, 
                "algorithm_name": algo,
                "ds_name": ds_name, 
                "num_arrival_vms_per_time_idx": arr_rate, 
                "time_bound": 86400, 
                "first_model": first_model, 
                "prediction_type": prediction_type, 
                "lb_name": "worst-fit_usage", 
                "number_of_servers": num_server, 
                "server_capacity": server_cap, 
                "acceptable_violation": viol_th, 
                "retreat_num_samples": 0, 
                "drop": True, 
                "steady_state_time": 2016 
            }
            all_sim_params.append(config_dict)
            # Execute the simulator with the current configuration dictionary. (Uncomment the line below at your discretion.)
            # results['_'.join([str(i) for i in config_dict.values()])] = Simulation(config_dict, {}).run_simulation()

print(f'{len(all_sim_params)} simulation configurations have been generated!')

1497600 simulation configurations have been generated!
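
The reported total can be reproduced without building the dictionaries. The sketch below multiplies the sizes of the swept parameter lists by the number of (`first_model`, `prediction_type`) combinations per algorithm, with the counts read off `widget_options` above.

```python
# Sizes of the swept parameter lists from the cell above
num_seeds, num_datasets, num_server_cfgs, num_thresholds = 10, 2, 3, 4
num_arrival_rates = len(range(10, 400))  # 390

# (number of first_model options, number of prediction_type options) per algorithm
options_per_algorithm = {
    "audible": (1, 1),
    "CLT": (1, 2),
    "rc": (1, 1),
    "oversubscription-oracle": (12, 1),
}
combos = sum(models * preds for models, preds in options_per_algorithm.values())  # 16

total = (num_seeds * num_datasets * num_server_cfgs
         * num_thresholds * num_arrival_rates * combos)
print(total)  # 1497600
```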


To avoid generating a simulation parameter dictionary for every possible VM arrival rate, the code below revises the preceding cell to find the highest arrival rate each algorithm can support using binary search. This considerably reduces the number of simulations needed. The code is shown as a non-executed block so that users run it deliberately and with caution, since it still demands a considerable amount of computational resources.

```
for module in ["simulator.py"]:
    temp =  'src/' + module
    %run $temp
        
def find_max_arr_rate(initial_config, arrival_rates=range(1, 400)):
    # Binary search needs at most ceil(log2(len(arrival_rates) + 1)) simulation runs
    left, right = min(arrival_rates), max(arrival_rates)  # inclusive search bounds
    best_rate = None
    history = {}
    while left <= right:
        mid = (left + right) // 2
        current_config = initial_config.copy()
        current_config['num_arrival_vms_per_time_idx'] = mid
        result = Simulation(current_config, {}).run_simulation()
        history[mid] = result
        if result == 'succeed':
            best_rate = mid  # Found a working rate, try to find a higher one
            left = mid + 1
        else:
            right = mid - 1  # Failed, reduce the search space
        print(history)
    return best_rate

# Generate configurations without including arrival_rates
configurations = product(
    random_seeds,
    dataset_names,
    algorithms,
    server_configurations,
    violation_thresholds
)

all_sim_params = []
# Iterate over configurations to find the max arr_rate
for rs, ds_name, algo, (num_server, server_cap), viol_th in configurations:
    for first_model in widget_options['first_model'][algo]:
        for prediction_type in widget_options["prediction_type"][algo]:
            initial_config = {
                "rand_seed": rs,
                "algorithm_name": algo,
                "ds_name": ds_name,
                "time_bound": 86400,
                "first_model": first_model,
                "prediction_type": prediction_type,
                "lb_name": "worst-fit_usage",
                "number_of_servers": num_server,
                "server_capacity": server_cap,
                "acceptable_violation": viol_th,
                "retreat_num_samples": 0,
                "drop": True,
                "steady_state_time": 2016
            }
            
            # Find the max arr_rate for this configuration
            max_arr_rate = find_max_arr_rate(initial_config)
            initial_config['num_arrival_vms_per_time_idx'] = max_arr_rate
            all_sim_params.append(initial_config) # for each setup record the max VM arrival it supports


```
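
The search above relies on success being monotone in the arrival rate: if a rate is supported, every lower rate is supported too. Under that assumption, the logic can be tried out without the simulator. The sketch below replaces `Simulation(...).run_simulation()` with a hypothetical stand-in predicate `supports` and a made-up limit, so it illustrates the search only and is not the repository's code.

```python
import math


def find_max_supported(supports, low=1, high=399):
    """Binary search for the largest rate in [low, high] with supports(rate) True.

    Returns (best_rate, probes); best_rate is None if no rate is supported.
    """
    best, probes = None, 0
    while low <= high:
        mid = (low + high) // 2
        probes += 1
        if supports(mid):
            best = mid      # mid works, look for a higher rate
            low = mid + 1
        else:
            high = mid - 1  # mid fails, shrink the search space
    return best, probes


true_limit = 137  # made-up value standing in for the simulator's outcome
best, probes = find_max_supported(lambda rate: rate <= true_limit)
print(best)  # 137

# The number of probes never exceeds ceil(log2(n + 1)) for n candidate rates
max_probes = math.ceil(math.log2(399 + 1))
print(probes <= max_probes)  # True
```

With 399 candidate rates this bounds each configuration at 9 simulator runs, compared with 390 runs per configuration in the exhaustive sweep.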