# Artifact Evaluation Instructions for "Acto: Push-Button End-to-End Testing of Kubernetes Operators and Controllers"

## Create experiment container

This container provides the following:

- One node of type "compute_cascadelake_r" ([see all types](https://chameleoncloud.readthedocs.io/en/latest/technical/reservations.html#chameleon-node-types))
- One public IP

### Configuration

If you are not a member of `CHI-231080`, enter your own project ID in the code block below.

In [None]:
import chi

chi.use_site("CHI@UC")
chi.set("project_name", "CHI-231080")

print(f'Using Project {chi.get("project_name")}')
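If you prefer not to edit the cell by hand, the project name can be picked up from the environment. The following is a minimal sketch, assuming the standard OpenStack variable `OS_PROJECT_NAME` is set in your Jupyter environment; the helper `resolve_project_name` and the placeholder `CHI-000000` are hypothetical and not part of the artifact.

```python
import os

DEFAULT_PROJECT = "CHI-231080"  # project name used in this notebook


def resolve_project_name(env=None, default=DEFAULT_PROJECT):
    """Return OS_PROJECT_NAME if it is set and non-empty, else the default.

    `env` is any mapping; it defaults to the process environment.
    Relying on OS_PROJECT_NAME being set is an assumption.
    """
    env = os.environ if env is None else env
    name = env.get("OS_PROJECT_NAME", "").strip()
    return name or default


# Hypothetical values, for illustration only.
print(resolve_project_name({"OS_PROJECT_NAME": "CHI-000000"}))
print(resolve_project_name({}))
```

The result can then be passed to `chi.set("project_name", ...)` in place of the hard-coded string.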

### Create reservation

Chameleon resources need to be reserved before they can be used. 
We will reserve one bare metal node and one public IP address.

If you get an error such as "no host available", all of our nodes may already be reserved. Check the availability calendar to see whether this is the case:
https://chi.uc.chameleoncloud.org/project/leases/calendar/host/

It may take about a minute for your lease to become active.

In [None]:
import os

USER = os.getenv('USER')

In [None]:
import os
import keystoneauth1, blazarclient
from chi import lease

reservations = []
lease_node_type = "compute_cascadelake_r"

try:
    print("Creating lease...")
    lease.add_fip_reservation(reservations, count=1)
    lease.add_node_reservation(reservations, node_type=lease_node_type, count=1)

    start_date, end_date = lease.lease_duration(hours=3)

    l = lease.create_lease(
        f"{os.getenv('USER')}-power-management", 
        reservations, 
        start_date=start_date, 
        end_date=end_date
    )
    lease_id = l["id"]

    print("Waiting for lease to start ...")
    lease.wait_for_active(lease_id)
    print("Lease is now active!")
except keystoneauth1.exceptions.http.Unauthorized as e:
    print("Unauthorized.\nDid you set your project name and run the code in the first cell?")
except blazarclient.exception.BlazarClientException as e:
    print(f"There is an issue making the reservation. Check the calendar to make sure a {lease_node_type} node is available.")
    print("https://chi.uc.chameleoncloud.org/project/leases/calendar/host/")
    print(e)
except Exception as e:
    print("An unexpected error happened.")
    print(e)

### Provision bare metal node

Next, we will launch the reserved node with an image. 
It will take approximately 10 minutes for the bare metal node to be successfully provisioned. 

This step takes the longest. The controller node first boots the requested node into a deploy image. That image downloads the real image and copies it onto the hard drive, and the node is then configured to reboot into the new OS.

You can browse the images we offer in our appliance catalog: http://chameleoncloud.org/appliances

In [None]:
from chi import server, lease

image = "CC-Ubuntu22.04"

s = server.create_server(
    f"{os.getenv('USER')}-power-management", 
    image_name=image,
    reservation_id=lease.get_node_reservation(lease_id)
)

print("Waiting for server to start ...")
server.wait_for_active(s.id)
print("Done")

In [None]:
floating_ip = lease.get_reserved_floating_ips(lease_id)[0]
with open("floating_ip.txt", "w") as f:
    f.write(f"{floating_ip}")
server.associate_floating_ip(s.id, floating_ip_address=floating_ip)

print(f"Waiting for SSH connectivity on {floating_ip} ...")
timeout = 60*2
import socket
import time
# Repeatedly try to open a TCP connection to the SSH port until the deadline.
start_time = time.perf_counter()
while True:
    try:
        # Short per-attempt timeout; `timeout` bounds the whole loop.
        with socket.create_connection((floating_ip, 22), timeout=10):
            print("Connection successful")
            break
    except OSError as ex:
        if time.perf_counter() - start_time >= timeout:
            print(f"After {timeout} seconds, could not connect via SSH. Please try again.")
            break  # give up instead of looping forever
        time.sleep(10)
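The connectivity check above is a poll-until-deadline loop. The sketch below shows the same pattern as a reusable function that returns a boolean instead of printing. The name `wait_for_port` and its parameters are illustrative, not part of `chi` or of the artifact.

```python
import socket
import time


def wait_for_port(host, port, timeout=120.0, interval=10.0, attempt_timeout=5.0):
    """Poll a TCP port until it accepts a connection or `timeout` seconds pass.

    Returns True on the first successful connection, False once the
    overall deadline has passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Each attempt has its own short timeout so a single hanging
            # connect cannot consume the whole budget.
            with socket.create_connection((host, port), timeout=attempt_timeout):
                return True
        except OSError:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False
            # Never sleep past the deadline.
            time.sleep(min(interval, remaining))
```

With the variables from this notebook, a call could look like `wait_for_port(floating_ip, 22)`, and a `False` result means the node should be checked before continuing.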

## Set up the environment on the node (~10 minutes)

In [None]:
from chi import ssh
import subprocess
import sys
from os.path import expanduser

subprocess.check_call([sys.executable, "-m", "pip", "install", "ansible"])
subprocess.run(["ansible-galaxy", "collection", "install", "ansible.posix"])
subprocess.run(["ansible-galaxy", "collection", "install", "community.general"])

with open("./ansible/ansible_hosts", mode="w") as f:
    f.write("{} ansible_connection=ssh ansible_user=cc ansible_port=22".format(floating_ip))

# Make sure ~/.ssh exists before appending to known_hosts.
os.makedirs(expanduser("~/.ssh"), exist_ok=True)

os.system("ssh-keyscan "+ floating_ip + " >> $HOME/.ssh/known_hosts")

# No shell is involved in this call, so expand $HOME explicitly.
key_file = os.path.expandvars("$HOME/work/.ssh/id_rsa")
subprocess.run(["ansible-playbook", "-i", "./ansible/ansible_hosts", "./ansible/configure.yaml", "--key-file", key_file])

In [None]:
with ssh.Remote(floating_ip) as conn:
    conn.put("requirements.sh")
    conn.run("bash requirements.sh")
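The setup cell writes a one-line Ansible inventory and then calls `ansible-playbook` with a private key. When `subprocess.run` receives an argument list, Python does not pass it through a shell, so variables such as `$HOME` reach the program unexpanded, and whether the program expands them itself is up to that program. The sketch below builds both pieces explicitly. The helper names are hypothetical and `192.0.2.10` is a documentation-only address.

```python
import os


def inventory_line(host, user="cc", port=22):
    """Build the single inventory entry used by the setup cell."""
    return f"{host} ansible_connection=ssh ansible_user={user} ansible_port={port}"


def playbook_command(inventory, playbook, key_file):
    """Return the ansible-playbook argument list with the key path expanded.

    Both `$VAR` references and a leading `~` are resolved here, so the
    command does not depend on a shell or on the callee to expand them.
    """
    key_path = os.path.expanduser(os.path.expandvars(key_file))
    return ["ansible-playbook", "-i", inventory, playbook, "--key-file", key_path]


print(inventory_line("192.0.2.10"))
print(playbook_command("./ansible/ansible_hosts",
                       "./ansible/configure.yaml",
                       "$HOME/work/.ssh/id_rsa"))
```

The returned list can be passed directly to `subprocess.run(..., check=True)` so that a failed playbook raises an error instead of passing silently.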

## Run the experiment
Following the instructions below, you will reproduce all 56 bugs that were found by Acto and confirmed by developers.

The process will take approximately 6 hours. 
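Before starting a long run, it can help to compare the lease length with the time the steps need. The sketch below does that arithmetic using only figures quoted in this notebook: the 3-hour `lease_duration` in the reservation cell, about 10 minutes each for provisioning and setup, and about 6 hours for the experiment. These are estimates, and the helper `lease_covers` is hypothetical.

```python
from datetime import timedelta

LEASE = timedelta(hours=3)          # lease_duration(hours=3) in the reservation cell
PROVISION = timedelta(minutes=10)   # approximate, from the provisioning section
SETUP = timedelta(minutes=10)       # approximate, from the setup section heading
EXPERIMENT = timedelta(hours=6)     # approximate, from this section


def lease_covers(lease, *steps):
    """Return (fits, total), where total is the sum of the step durations."""
    total = sum(steps, timedelta())
    return total <= lease, total


fits, total = lease_covers(LEASE, PROVISION, SETUP, EXPERIMENT)
print(f"Estimated total: {total}, lease: {LEASE}, covered: {fits}")
# prints "Estimated total: 6:20:00, lease: 3:00:00, covered: False"
```

If the check reports that the lease is not covered, the lease would end before the run finishes unless a longer duration is requested when it is created.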

In [None]:
with ssh.Remote(floating_ip) as conn:
    conn.put("start_acto.sh")
    print("Start reproducing all bugs...")
    print("Please wait 6 hours...")
    conn.run("bash start_acto.sh", disown=True)

## Generate results

The following commands can be run independently of the cells above and gather all results from the reproduction. They generate Tables 5, 6, 7, and 8 of the paper.

In [None]:
from chi import ssh

with open("floating_ip.txt", "r") as f:
    floating_ip = f.read().strip()
    
with ssh.Remote(floating_ip) as conn:
    conn.get("./workdir/acto/table5.txt")
    conn.get("./workdir/acto/table6.txt")
    conn.get("./workdir/acto/table7.txt")
    
with open('table5.txt', 'r') as f:
    print("Table 5:\n" + f.read() + "\n")
    
with open('table6.txt', 'r') as f:
    print("Table 6:\n" + f.read() + "\n")
    
with open('table7.txt', 'r') as f:
    print("Table 7:\n" + f.read() + "\n")
    
with ssh.Remote(floating_ip) as conn:
    print("Table 8:")
    conn.run("cd ./workdir/acto/ && python3 collect_number_of_ops.py")
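If the experiment has not finished, some table files may be missing or empty after the download. The sketch below is a small local check that could run before printing the tables. The helper `missing_or_empty` is hypothetical, and it assumes the files were downloaded into the current directory under the names used above.

```python
from pathlib import Path

EXPECTED_TABLES = ("table5.txt", "table6.txt", "table7.txt")


def missing_or_empty(directory=".", names=EXPECTED_TABLES):
    """Return the expected file names that are absent or have zero size."""
    base = Path(directory)
    problems = []
    for name in names:
        path = base / name
        if not path.is_file() or path.stat().st_size == 0:
            problems.append(name)
    return problems


problems = missing_or_empty()
if problems:
    print("Not ready yet:", ", ".join(problems))
else:
    print("All table files are present.")
```

An empty list means all three files were retrieved. Table 8 is computed on the node itself, so it is not covered by this check.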