# Performing SDN-assisted Adaptive Bit Rate streaming (SABR)

## Overview

This Jupyter environment was used to recreate the experiment from _Network Assisted Content Distribution for Adaptive Bitrate
Video Streaming_ by Divyashri Bhat, Amr Rizk, Michael Zink, and Ralf Steinmetz (DOI: http://dx.doi.org/10.1145/3083187.3083196).

## Prerequisites

This tutorial assumes that you have created two leases in Chameleon:

- A lease for 5 nodes and (optionally) 1 floating IP
- A lease for several nodes

These leases contain your server and cache nodes and your client nodes, respectively. You can reserve these resources using the Chameleon Cloud web interface, or using the python-chi library, which provides access to Chameleon resources through Python. If you don't know how to reserve these resources with python-chi, the [Python Create Isolated VLAN](../../tutorials/networking/Python-Networking-CreateIsolatedVLAN.ipynb) tutorial shows how to reserve and use networking resources.

We strongly encourage running this experiment from the [chameleoncloud_SABR](https://github.com/mwhicks-dev/chameleoncloud_SABR) experimental repository, because some of the files referenced in this notebook live there and are linked where they are used.

## Configure the Environment

### Configure Environment Variables

Modify all of the variables below to match your resources and Chameleon configuration.

In [None]:
import json
import os
import chi

import chi.lease as lease_manager
import chi.server as server_manager
import chi.network as networking_manager

from datetime import datetime, timedelta
from dateutil import tz

# Configure project and sites
project_name = 'CH-822154'  # In form of CH-XXXXXX
site = 'CHI@TACC'  # In form of CHI@site

# Configure SSH keypair
key_name = 'my_chameleon_key'  # Configure with your keypair's name on Chameleon
key_extension = '.pem'  # Configure with your keypair file's actual extension if it has one (e.g. '.pem')
key_path = '/Users/{}/.ssh/my_chameleon_key_tacc.pem'.format(os.getenv("USER"))  # Path to key file in this container associated with key_name

# Configure Resource Names
# Tip: Name resources with your username for easier identification
username = 'mwhicks2'
server_username = username + '_ABVS_Server'
server_lease_name = server_username + '_Lease'
server_node_name = server_username + '_Node'
cache_username = username + '_ABVS_Cache'
cache_lease_name = cache_username + '_Lease'
cache_node_name = cache_username + '_Node'
client_username = username + '_ABVS_Client'
client_lease_name = client_username + '_Lease'
client_node_name = client_username + '_Node_'

# Configure Server
image_name='CC-Ubuntu16.04'
flavor_name='baremetal'
network_name='sharednet1'

# Configure conditional variables
VERBOSE = True  # If set to True, this program will produce more output
LAZY = True  # If set to True, this program will only require 75% of the client nodes to be running in order to perform the experiment.
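
The expected formats noted in the comments above can be checked before any API call is made. The following is a minimal sketch, not part of the original notebook: `check_config` is a hypothetical helper, and the patterns assume a project ID of `CH-` followed by six digits and a site of the form `CHI@site`, which may be stricter than what Chameleon actually accepts.

```python
import re

# Hypothetical helper; the patterns are assumptions inferred from the
# "CH-XXXXXX" and "CHI@site" hints in the configuration cell.
PROJECT_PATTERN = re.compile(r'CH-\d{6}')
SITE_PATTERN = re.compile(r'CHI@\w+')


def check_config(project_name, site):
    """Return a list of problems found; an empty list means both values look plausible."""
    problems = []
    if not PROJECT_PATTERN.fullmatch(project_name):
        problems.append('project_name {!r} is not of the form CH-XXXXXX'.format(project_name))
    if not SITE_PATTERN.fullmatch(site):
        problems.append('site {!r} is not of the form CHI@site'.format(site))
    return problems


# Print any problems found for the example values used in this notebook
for problem in check_config('CH-822154', 'CHI@TACC'):
    print(problem)
```

If your project or site legitimately uses a different naming scheme, loosen the patterns rather than the values.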

### Launch Server and Caches

Using the server lease described in the prerequisites, create a server node with a floating IP address.

##### Get the Lease

Get the lease ID and the lease corresponding to the name provided in the environment variables cell.

In [None]:
# Set project/site
chi.set('project_name', project_name)
chi.use_site(site)

# Get the lease ID and lease by name
server_lease_id = lease_manager.get_lease_id(server_lease_name)
server_lease = lease_manager.get_lease(server_lease_id)

if VERBOSE:
    print(json.dumps(server_lease, indent=2))
else:
    print('server_lease: {}, server_lease_id: {}'.format(server_lease['name'], server_lease_id))

##### Get the Compute Reservation

The compute reservation ID is needed to launch our server node. We read it from the physical host reservation in our lease.

In [None]:
# Get the compute reservation from the lease
server_compute_reservation_id = list(filter(lambda reservation: reservation['resource_type'] == 'physical:host', server_lease['reservations']))[0]['id']

print("server_compute_reservation_id: {}".format(server_compute_reservation_id))
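
The lookup above assumes the lease contains at least one `physical:host` reservation and fails with an `IndexError` otherwise. As a minimal sketch, the same search can be written as a small function that raises a clearer error. `find_host_reservation` is a hypothetical helper, and `sample_lease` is made-up data that only mimics the fields this notebook reads (`resource_type`, `id`, `max`).

```python
def find_host_reservation(lease):
    """Return the first 'physical:host' reservation in a lease dict, or raise ValueError."""
    for reservation in lease.get('reservations', []):
        if reservation.get('resource_type') == 'physical:host':
            return reservation
    raise ValueError('lease {!r} has no physical:host reservation'.format(lease.get('name')))


# Made-up lease used only for illustration; real leases carry many more fields
sample_lease = {
    'name': 'demo_Lease',
    'reservations': [
        {'id': 'net-0001', 'resource_type': 'virtual:floatingip'},
        {'id': 'host-0001', 'resource_type': 'physical:host', 'max': 5},
    ],
}

host_reservation = find_host_reservation(sample_lease)
print(host_reservation['id'], host_reservation['max'])
```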

#### Start the Server Node

Now, we have all of the information needed to launch our server node.

In [None]:
# Create the server
server_node = server_manager.create_server(server_node_name, 
                       reservation_id=server_compute_reservation_id, 
                       key_name=key_name, 
                       network_name=network_name, 
                       image_name=image_name, 
                       flavor_name=flavor_name)

print('{} id: {}'.format(server_node.name, server_node.id))

#### Start the Cache Nodes

We can also launch our cache nodes. They use the same compute reservation as the server node.

In [None]:
# Create the cache nodes
size = 4
cache_nodes = []
for i in range(0, size):
    cache_node = server_manager.create_server(cache_node_name + str(i),
                       reservation_id=server_compute_reservation_id, 
                       key_name=key_name, 
                       network_name=network_name, 
                       image_name=image_name, 
                       flavor_name=flavor_name)
    if VERBOSE:
        print('{} id: {}'.format(cache_node.name, cache_node.id))
    cache_nodes.append(cache_node)

### Launch Clients

Using the client lease described in the prerequisites, create several client nodes.

##### Get the Lease

Get the lease ID and the lease corresponding to the name provided in the environment variables cell.

In [None]:
# Set project/site
chi.set('project_name', project_name)
chi.use_site(site)

# Get the lease ID and lease by name
client_lease_id = lease_manager.get_lease_id(client_lease_name)
client_lease = lease_manager.get_lease(client_lease_id)

if VERBOSE:
    print(json.dumps(client_lease, indent=2))
else:
    print('client_lease: {}, client_lease_id: {}'.format(client_lease['name'], client_lease_id))

##### Get the Compute Reservation

The compute reservation ID is needed to launch our client nodes. We read it from the physical host reservation in our lease.

In [None]:
# Get the compute reservation from the lease
client_compute_reservation_id = list(filter(lambda reservation: reservation['resource_type'] == 'physical:host', client_lease['reservations']))[0]['id']

print("client_compute_reservation_id: {}".format(client_compute_reservation_id))

##### Start the Client Nodes

This is slightly trickier than creating the server node. Your client lease has several physical nodes allocated, and we intend to launch all of them. We create each node in turn and append it to a list so that the nodes can be accessed later. The list also makes it easy to clean up the nodes afterwards.

In [None]:
# Create the client nodes
size = list(filter(lambda reservation: reservation['resource_type'] == 'physical:host', client_lease['reservations']))[0]['max']  # Find number of reserved nodes from lease
client_nodes = []
for i in range(0, size):
    client_node = server_manager.create_server(client_node_name + str(i),
                       reservation_id=client_compute_reservation_id, 
                       key_name=key_name, 
                       network_name=network_name, 
                       image_name=image_name, 
                       flavor_name=flavor_name)
    if VERBOSE:
        print('{} id: {}'.format(client_node.name, client_node.id))
    client_nodes.append(client_node)

### Server Networking Setup

We will access the server node directly using the `fabric2` library. Two conditions must therefore be met before the program can continue:

1. The server node must be active and running
2. The server node must have a floating IP attached to it

Unless both conditions hold, we cannot reach the server to conduct experiments.

##### Wait for Server Node

We need to wait for the server to reach `ACTIVE` status before continuing. If the status is `ERROR`, the node takes longer than `MAX_LAUNCH_TIME` minutes to launch, or the server cannot be found, an error is raised.

In [None]:
import dateutil.parser as parse
from dateutil import tz
from datetime import datetime
import time

MAX_LAUNCH_TIME = 15  # minutes

# Reducing duplicate code using a status getter
def get_status( node_id ):
    node = server_manager.get_server(node_id)
    launch_date = parse.isoparse(node.created)
    now = datetime.now(tz=tz.tzutc())
    diff = now - launch_date
    minutes = diff.total_seconds() / 60
    status = node.status
    if VERBOSE:
        print('Node {} Status: {}, Age: {}'.format(node.name, status, diff))
    return minutes, status

# Wait until the server is ACTIVE; fail on ERROR status or on timeout
minutes, status = get_status(server_node.id)
while status != 'ACTIVE':
    if status == 'ERROR':
        raise Exception('Node {} has ERROR status.'.format(server_node_name))
    if minutes > MAX_LAUNCH_TIME:
        raise Exception('Node {} was not ACTIVE after {} minutes.'.format(server_node_name, MAX_LAUNCH_TIME))
    time.sleep(60)  # Poll once per minute
    minutes, status = get_status(server_node.id)
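
The waiting logic can also be expressed as a reusable function that is testable without a real node. This is a minimal sketch rather than the notebook's own code: `wait_until_active` is a hypothetical helper that takes any callable returning `(age_in_minutes, status)`, which is the shape `get_status` returns above, and the `sleep` parameter exists only so the demonstration does not actually wait.

```python
import time


def wait_until_active(get_state, max_minutes=15, poll_seconds=60, sleep=time.sleep):
    """Poll get_state() until it reports ACTIVE and return the node age in minutes.

    get_state must return a (age_in_minutes, status) tuple.
    Raises RuntimeError on ERROR status and TimeoutError once the age exceeds max_minutes.
    """
    while True:
        minutes, status = get_state()
        if status == 'ACTIVE':
            return minutes
        if status == 'ERROR':
            raise RuntimeError('node entered ERROR status')
        if minutes > max_minutes:
            raise TimeoutError('node not ACTIVE after {} minutes'.format(max_minutes))
        sleep(poll_seconds)


# Demonstration with canned states instead of a real node (no actual waiting)
fake_states = iter([(1, 'BUILD'), (2, 'BUILD'), (3, 'ACTIVE')])
age = wait_until_active(lambda: next(fake_states), sleep=lambda seconds: None)
print('ACTIVE after {} minutes'.format(age))
```

Against a real node, the call would look like `wait_until_active(lambda: get_status(server_node.id))`.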

##### Associate Floating IP to Server

The server node is remotely accessed using `fabric2`. To be able to do this, the server must have a floating IP address that allows us to connect from a different network.

In [None]:
try:
    server_floating_ip = server_manager.associate_floating_ip(server_node.id)
    if VERBOSE:
        print('Server Floating IP: {}'.format(server_floating_ip))
except Exception as e:
    output = 'The server {} raised an exception while attaching a floating IP.\n Reason: '.format(server_node_name)
    output += str(e)
    raise Exception(output)

### Client Networking Setup

We will access our client nodes as a series of local hops from the server node. As a result, we do not need to associate floating IP addresses with these nodes. We do, however, need to record the fixed IP address of each client node. Before proceeding, we need to ensure the following condition:

* All client nodes must be active and running

However, this condition need not always be met. The environment variables configuration cell defines a variable called `LAZY`. If it is set to **True**, the program will not raise an error as long as at least 75% of the client nodes are running.

##### Wait for Client Nodes

We need to wait for each client node to reach `ACTIVE` status. If `LAZY` is set, a node may instead end in `ERROR` status or exceed `MAX_LAUNCH_TIME`, provided that no more than 25% of the nodes do so. If that limit is exceeded, or if `LAZY` is not set, an error is raised.

In [None]:
# Define threshold as well as active nodes array
if LAZY:
    error_threshold = len(client_nodes) / 4
    error_counter = 0
active_client_nodes = []

# Wait until all client node statuses are ACTIVE or ERROR
for client_node in client_nodes:
    client_node_id = client_node.id
    minutes, status = get_status(client_node_id)
    while ( status != 'ACTIVE' ):
        if ( status == 'ERROR' or minutes > MAX_LAUNCH_TIME ):
            if LAZY:
                error_counter += 1
                if error_counter > error_threshold:
                    raise Exception('At least {} client nodes have ERROR status.'.format(error_counter))
            else:
                raise Exception('Node {} has ERROR status.'.format(client_node.name))
            break
        minutes, status = get_status(client_node_id)
        if status != 'ACTIVE':
            time.sleep(60)
    if status == 'ACTIVE':
        active_client_nodes.append(client_node)
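
To make the `LAZY` threshold concrete: the cell above raises once the failure count exceeds `len(client_nodes) / 4`, so the largest tolerated number of failed nodes is that quotient rounded down. The sketch below works this out for a few node counts. `max_tolerated_failures` is a hypothetical helper, and the node counts are arbitrary examples.

```python
def max_tolerated_failures(total_nodes):
    """Largest failure count that does not exceed total_nodes / 4."""
    return total_nodes // 4


# Arbitrary example sizes; substitute len(client_nodes) for a real run
for total in (4, 8, 10):
    allowed = max_tolerated_failures(total)
    print('{} nodes: up to {} may fail, at least {} must be ACTIVE'.format(
        total, allowed, total - allowed))
```

With 10 nodes, for example, the threshold is 2.5, so two failures are tolerated and a third raises the exception.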

### Experiment Configuration

Now that all of our nodes are up and running, there are several things that we need to do before we can run our experiment. Most of these steps are performed on the server, but one is performed on the clients as well.

The [chameleoncloud_SABR](https://github.com/mwhicks-dev/chameleoncloud_SABR) experimental repository contains five script files that this notebook imports, which avoids the clutter of embedding them all here. For this and similar reasons, we *strongly* encourage running the experiment from this repository.

##### Fabric Setup

We need to import `fabric2`'s `Connection` class. This allows us to SSH into our server node and run scripts.

We also must establish a dictionary `key` containing the path to our SSH key defined in the environment variables configuration cell. This is passed into `fabric2` so that we can access the node, which is secured with that SSH key.

In [None]:
import paramiko
from fabric2 import Connection

key = {
    "key_filename": key_path,
}

# Copy the SSH key onto the server so that it can be used from there
print('Attempting to access server...')
with Connection(host=server_floating_ip, user="cc", connect_kwargs=key) as c:
    c.put(local=key_path, remote='.ssh/{}{}'.format(key_name, key_extension))
print('Success')

##### Clients Setup

In order to perform client actions, the client nodes must be given the AStream GitHub repository. AStream is a command-line based video streaming service that works well for experimental systems with no graphical user interface. For more information, see https://github.com/pari685/AStream.git. Slight modifications were made to AStream to convert the program from Python 2 to Python 3; the edited version can be found at https://github.com/abstractionAlpha/AStream.git.

In [None]:
user='cc'
port=22

# Set up script environment
client_setup = '#!/bin/bash\n'
client_setup += 'USER={}\n'.format(user)
client_setup += 'PORT_NUMBER={}\n'.format(port)
client_setup += 'KEY_NAME={}{}\n'.format(key_name, key_extension)
active_client_ips = []
for node in active_client_nodes:
    active_client_ips.append(server_manager.get_host_ip(node.id))
client_setup += 'CLIENT_IPS=({})\n'.format(str(active_client_ips)[1:-1]).replace(",", "")  # Drop the list brackets and commas to form a bash array

# Import script execution
handle = open('./scripts/client_setup.txt')
for line in handle.readlines():
    client_setup += line
handle.close()

In [None]:
with Connection(host=server_floating_ip, user=user, connect_kwargs=key) as c:
    c.run(client_setup)
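
The `CLIENT_IPS` line above builds a bash array by slicing the string form of a Python list. As a sketch of an alternative that does not depend on how Python prints lists, each element can be quoted with the standard library's `shlex.quote` and joined with spaces. `bash_array` is a hypothetical helper, and the addresses are made up for illustration.

```python
import shlex


def bash_array(name, values):
    """Render NAME=(v1 v2 ...) as one line of bash, shell-quoting each element."""
    quoted = ' '.join(shlex.quote(str(value)) for value in values)
    return '{}=({})'.format(name, quoted)


# Made-up addresses, used only for illustration
demo_line = bash_array('CLIENT_IPS', ['10.0.0.5', '10.0.0.6'])
print(demo_line)
```

The resulting line can be appended to a script string in the same way as the other variable definitions, for example `client_setup += bash_array('CLIENT_IPS', active_client_ips) + '\n'`.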

##### Caches Setup

We have 4 cache nodes that initially contain nothing, but will in time contain whatever is requested from them.

In [None]:
# Set up script environment
caches = '#!/bin/bash\n'
caches += 'USER={}\n'.format(user)
caches += 'PORT_NUMBER={}\n'.format(port)
caches += 'KEY_NAME={}{}\n'.format(key_name, key_extension)
cache_ips = []
for node in cache_nodes:
    cache_ips.append(server_manager.get_host_ip(node.id))
caches += 'CACHE_IPS=({})\n'.format(str(cache_ips)[1:-1]).replace(",", "")  # Drop the list brackets and commas to form a bash array

# Import script execution
handle = open('./scripts/caches.txt')
for line in handle.readlines():
    caches += line
handle.close()

In [None]:
with Connection(host=server_floating_ip, user=user, connect_kwargs=key) as c:
    c.run(caches)

##### Server Setup

The paper linked at the top of this notebook contains instructions for repeatability within its appendix on page 13. We follow these instructions very closely in order to make sure all needed packages and repositories are installed on the server.

In [None]:
of_port = str(6653)

# Set up script environments
controller = '#!/bin/bash\n'

clients = '#!/bin/bash\n'

orchestration = '#!/bin/bash\n'
orchestration += 'PORT_NUMBER={}\n'.format(of_port)
orchestration += 'CONTROLLER_IP={}\n'.format(server_floating_ip)
ovs_tags = ['1a', '2a', '2b', '3a', '3b', '3c', '3d', '4a', '4b', '4c', '4d']
orchestration += 'OVS_TAGS=({})\n'.format(str(ovs_tags)[1:-1]).replace(",", "")  # Drop the list brackets and commas to form a bash array

# Import script executions
handle = open('./scripts/controller.txt')
for line in handle.readlines():
    controller += line
handle.close()

handle = open('./scripts/clients.txt')
for line in handle.readlines():
    clients += line
handle.close()

handle = open('./scripts/orchestration.txt')
for line in handle.readlines():
    orchestration += line
handle.close()
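
Every setup cell in this notebook follows the same pattern: a shebang line, a few `NAME=value` definitions, then the body of a file from `./scripts`. A minimal sketch of that pattern as one function is shown below. `build_script` is a hypothetical helper, not something the repository provides, and the demonstration writes a throwaway file rather than reading one of the real scripts.

```python
import os
import tempfile


def build_script(variables, body_path):
    """Return a bash script: shebang, one NAME=value line per variable, then the file body."""
    header = ['#!/bin/bash']
    header += ['{}={}'.format(name, value) for name, value in variables.items()]
    with open(body_path) as handle:  # closed automatically, even if reading fails
        body = handle.read()
    return '\n'.join(header) + '\n' + body


# Demonstration with a temporary file standing in for a file under ./scripts
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, 'demo.txt')
    with open(demo_path, 'w') as handle:
        handle.write('echo "controller port: $PORT_NUMBER"\n')
    demo_script = build_script({'PORT_NUMBER': 6653}, demo_path)

print(demo_script)
```

Values are inserted verbatim, so anything containing spaces still has to be quoted or rendered as a bash array by the caller.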

In [None]:
scripts = [controller, orchestration, clients]
index = 0

In [None]:
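# Each run of this cell executes the next script in `scripts`.
# Run it once per script (controller, orchestration, clients).
with Connection(host=server_floating_ip, user="cc", connect_kwargs=key) as c:
    if index < len(scripts):
        c.run(scripts[index])
        index += 1
    else:
        print("All scripts ran")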
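
The cell above runs one script per execution, so it has to be re-run until it reports that all scripts ran. If you would rather run them in a single pass, the loop below is one possible sketch. `run_all` is a hypothetical helper that accepts any callable as the runner, and the demonstration uses a stub that only records what it was given.

```python
def run_all(run, named_scripts):
    """Call run(script) for each (name, script) pair in order.

    Stops at the first failure and reports which script caused it.
    Returns the names of the scripts that were run.
    """
    finished = []
    for name, script in named_scripts:
        try:
            run(script)
        except Exception as exc:
            raise RuntimeError('script {!r} failed'.format(name)) from exc
        finished.append(name)
    return finished


# Demonstration with a stub runner that only records its input
recorded = []
demo_finished = run_all(recorded.append, [
    ('controller', 'echo controller'),
    ('orchestration', 'echo orchestration'),
    ('clients', 'echo clients'),
])
print(demo_finished)
```

Inside a `Connection` block, the runner would be `c.run`, for example `run_all(c.run, [('controller', controller), ('orchestration', orchestration), ('clients', clients)])`.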

### Experiment Execution

Now that all of our nodes have been adequately set up, we can finally run our experiment!

#### Automate SABR

We have modified a script provided by Bhat et al. that artificially runs the SABR framework and stores relevant results in the server node. Similar to our node set-up scripts, we will create a new script that runs the framework based on existing variables, and then run this on the controller (server) node.

In [None]:
# Set up Python variables: space-separated IP lists for the script
raw_clients = ' '.join(active_client_ips)
raw_caches = ' '.join(cache_ips)

server_ip = server_manager.get_host_ip(server_node.id)

# Set up script environment
automate_sabr = '#!/bin/bash\n'
automate_sabr += 'KEY_NAME={}\n'.format(key_name)
automate_sabr += 'CLIENT_IPS="{}"\n'.format(raw_clients)  # Quoted so bash keeps the whole space-separated list
automate_sabr += 'CACHE_IPS="{}"\n'.format(raw_caches)
automate_sabr += 'SERVER_IP={}\n'.format(server_ip)  # Fixed IP of the server node

# Import script execution
handle = open('./scripts/automate_sabr.txt')
for line in handle.readlines():
    automate_sabr += line
handle.close()

In [None]:
with Connection(host=server_floating_ip, user="cc", connect_kwargs=key) as c:
    c.run(automate_sabr)

### Clean Up

Clean up your resources when you are finished with them.

##### Delete Server Node

Delete your server node when experimentation is complete.

In [None]:
server_manager.delete_server(server_node.id)

##### Delete Cache Nodes

Delete all of your cache nodes when experimentation is complete.

In [None]:
while len(cache_nodes) > 0:
    node = cache_nodes.pop()
    server_manager.delete_server(node.id)

##### Delete Client Nodes

Delete all of your client nodes when experimentation is complete. This is done using the `client_nodes` list (as we want to delete `ERROR` status nodes as well).

In [None]:
while len(client_nodes) > 0:
    node = client_nodes.pop()
    server_manager.delete_server(node.id)
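
In the loops above, an exception from one delete call stops the loop and leaves the remaining nodes running. A minimal sketch of a more forgiving variant follows: it attempts every deletion and collects the failures for review. `delete_all` is a hypothetical helper, and the demonstration uses a stub in place of the real delete call.

```python
def delete_all(delete, nodes):
    """Empty `nodes`, calling delete(node) on each; return [(node, error), ...] for failures."""
    failures = []
    while nodes:
        node = nodes.pop()
        try:
            delete(node)
        except Exception as exc:  # keep going so one failure does not strand the rest
            failures.append((node, exc))
    return failures


# Demonstration with a stub that fails for one made-up node name
def fake_delete(node):
    if node == 'node_1':
        raise RuntimeError('simulated failure')


demo_nodes = ['node_0', 'node_1', 'node_2']
demo_failures = delete_all(fake_delete, demo_nodes)
print([node for node, _ in demo_failures])
```

In this notebook the call might look like `delete_all(lambda node: server_manager.delete_server(node.id), client_nodes)`. Any node returned in the failure list should be checked and removed by hand.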