# Edge to Cloud Hello world

This notebook contains the skeleton of an experiment that runs from edge to cloud. Its main parts are:

## Cloud worker
1. Leases a node and launches an instance on it.
2. Installs the required software and configures an MQTT broker.
3. Runs a service that subscribes to the broker and passes incoming messages to an ML model.

## Edge worker
1. Leases a device and launches a container on it.
2. Runs a container command that fetches data, preprocesses and checks it, and publishes it to the MQTT broker.

## Analysis
1. Periodically downloads results from the cloud and plots them.
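
As a rough mental model of the three parts above, the sketch below simulates the data flow in a single process. It is only an illustration: a `queue.Queue` stands in for the MQTT broker, and `fake_model`, `edge_worker`, and `cloud_worker` are hypothetical names that do not appear in this notebook's scripts.

```python
import queue
from collections import Counter


def fake_model(sample):
    """Hypothetical stand-in for the ML model: labels a number as even or odd."""
    return "even" if sample % 2 == 0 else "odd"


def edge_worker(broker, samples):
    """Edge side: check each sample, then publish it to the broker."""
    for sample in samples:
        if isinstance(sample, int):  # trivial "preprocess and check" step
            broker.put(sample)
    broker.put(None)  # sentinel meaning "no more data"


def cloud_worker(broker):
    """Cloud side: consume messages until the sentinel and run the model."""
    predictions = []
    while True:
        message = broker.get()
        if message is None:
            break
        predictions.append(fake_model(message))
    return predictions


broker = queue.Queue()  # in-memory stand-in for the MQTT broker
edge_worker(broker, [1, 2, 3, "bad", 4])
pipeline_counts = Counter(cloud_worker(broker))  # analysis: count each class
print(pipeline_counts)
```

In the real experiment the two workers run on different machines and the broker is a network service, so the sentinel and the shared queue have no direct counterpart.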

First, we choose the sites to use. Here, we use `CHI@UC` for the cloud and `CHI@Edge`, the only edge site. You could also use `CHI@TACC` for the cloud. Other bare metal sites will work as well, provided that you update the node type to one supported at that site.

**Enter your project ID** before you continue.

In [None]:
CLOUD_SITE = "CHI@UC"
EDGE_SITE = "CHI@Edge"
# Please enter your project ID!
PROJECT_ID = "Chameleon"

import chi
chi.set("project_name", PROJECT_ID)

# Cloud site

First, we will set up the cloud node.

In [None]:
chi.use_site(CLOUD_SITE)

from chi import lease
from chi import server
import os
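# Import the exception submodules explicitly, since they are referenced below
import keystoneauth1.exceptions
import blazarclient.exception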

Now, we will reserve a node. You may want to change the `lease_node_type`, as your experiment might require a GPU or a different type of node. If this fails, check the [node calendar](https://chi.uc.chameleoncloud.org/project/leases/calendar/host/) to see what nodes are available.

In [None]:
lease_node_type = "compute_cascadelake_r"

reservations = []
try:
    print("Creating lease...")
    lease.add_fip_reservation(reservations, count=1)
    lease.add_node_reservation(reservations, node_type=lease_node_type, count=1)

    start_date, end_date = lease.lease_duration(days=1)

    l = lease.create_lease(
        f"cloud-{lease_node_type}-{start_date}", 
        reservations, 
        start_date=start_date, 
        end_date=end_date
    )
    cloud_lease_id = l["id"]

    print("Waiting for lease to start ...")
    lease.wait_for_active(cloud_lease_id)
    print("Lease is now active!")
except keystoneauth1.exceptions.http.Unauthorized as e:
    print("Unauthorized.\nDid you set your project name and run the code in the first cell?")
except blazarclient.exception.BlazarClientException as e:
    print(f"There is an issue making the reservation. Check the calendar to make sure a {lease_node_type} node is available.")
    print("https://chi.uc.chameleoncloud.org/project/leases/calendar/host/")
    print(e)
except Exception as e:
    print("An unexpected error happened.")
    print(e)

Now, launch an instance on the reserved node. This part may take several minutes. Here, we use a general-purpose Ubuntu image. If you want to use CUDA, you can change this to our CUDA appliance. See [this page](https://chameleoncloud.org/appliances/) for the images we provide. If you would rather use a custom image, see our [documentation](https://chameleoncloud.readthedocs.io/en/latest/technical/images.html) on how to set one up.

In [None]:
s = server.create_server(
    f"cloud-{lease_node_type}-{start_date}", 
    image_name="CC-Ubuntu20.04",
    reservation_id=lease.get_node_reservation(cloud_lease_id)
)

print("Waiting for server to start ...")
server.wait_for_active(s.id)
print("Done")

After the server has started, attach a floating IP to it.

In [None]:
floating_ip = lease.get_reserved_floating_ips(cloud_lease_id)[0]
server.associate_floating_ip(s.id, floating_ip_address=floating_ip)

print(f"Waiting for SSH connectivity on {floating_ip} ...")
server.wait_for_tcp(floating_ip, 22)
print("SSH successful")
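
`server.wait_for_tcp` blocks until the given port accepts connections. The sketch below shows the general idea using only the standard library. It is not python-chi's implementation, and `wait_for_port` and its parameters are hypothetical.

```python
import socket
import time


def wait_for_port(host, port, timeout=300.0, interval=1.0):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # not reachable yet; retry shortly
    return False
```

A call such as `wait_for_port(floating_ip, 22)` would play a similar role to the cell above. Note that an open port 22 only shows that something is listening there; it does not prove that an SSH login will succeed.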

Now, we will configure the cloud worker by uploading configuration files, and then running a script. If you are 

In [None]:
# Run the worker on the cloud
from chi import ssh

with ssh.Remote(floating_ip) as conn:
    # The MQTT broker configuration
    conn.put("mosquitto.conf")
    # The Python program used for ML inference, which loops waiting for new input data
    conn.put("predict_loop.py")
    # The systemd service which runs `predict_loop.py` in the background
    conn.put("edge_cloud.service")
    # The script that sets up the needed components and enables the service
    conn.put("cloud_worker.sh")
    conn.run("bash cloud_worker.sh")
print("Configuration complete")

Now the cloud configuration is complete. 

## Edge Worker

First, we must switch to the edge site.

In [None]:
chi.use_site(EDGE_SITE)

from chi import container

Next, we lease a device by name. Check the [device calendar](https://chi.edge.chameleoncloud.org/project/leases/calendar/device/) to see when each device is available, and change the name as needed.

In [None]:
# Create and wait for a lease
device_name = "iot-rpi4-03"
start, end = lease.lease_duration(days=1)
reservations = []
lease.add_device_reservation(reservations, count=1, device_name=device_name)
container_lease = lease.create_lease(f"edge-{device_name}-{start}", reservations)
edge_lease_id = container_lease["id"]

print("Waiting for lease to start ...")
lease.wait_for_active(edge_lease_id)
print("Done!")

Next, we launch a container. To keep this experiment generic, we use the default Ubuntu container image and run a script to set everything up. See `edge_worker.sh` for more details about how to configure the edge device further.

For your own experiment, you could build and publish a custom image to Docker Hub. With a custom image, it may be easiest to pass in needed configuration values via the `environment` keyword argument to `create_container` as described [here](https://python-chi.readthedocs.io/en/latest/modules/container.html#chi.container.create_container).
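
If you do go the custom-image route, the program inside the container could read its settings from environment variables. The sketch below is one way to do that. The variable names `MQTT_HOST`, `MQTT_PORT`, and `SAMPLE_COUNT` and the helper `load_config` are hypothetical and would need to match whatever you pass through `environment`.

```python
import os


def load_config(env=None):
    """Read worker settings from environment variables, with fallbacks."""
    env = os.environ if env is None else env
    return {
        "mqtt_host": env.get("MQTT_HOST", "localhost"),
        # Environment values are always strings, so convert numbers explicitly.
        "mqtt_port": int(env.get("MQTT_PORT", "1883")),  # 1883 is the standard MQTT port
        "sample_count": int(env.get("SAMPLE_COUNT", "100")),
    }


# Example with an explicit mapping instead of the real environment.
config = load_config({"MQTT_HOST": "192.0.2.10", "SAMPLE_COUNT": "25"})
print(config)
```

Accepting an optional mapping keeps the function easy to test without modifying the real process environment.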

In [None]:
with open("edge_worker.sh", "rb") as worker:
    mqtt_server = container.create_container("edge-worker", 
         image="arm64v8/ubuntu", 
         mounts=[{
                 "type": "bind", 
                 "source": worker.read(), 
                 "destination": "/edge_worker.sh"
             }
         ],
         command=["bash", "/edge_worker.sh", "edge_data", 100, floating_ip],
         reservation_id=lease.get_device_reservation(edge_lease_id),
         platform_version=2,
         interactive=True)
    print("Done!")

Now we have a cloud server listening for incoming data, and an edge container sending data to it.

# Analysis

To analyze the data, we periodically download the results file. In this example, the results file contains one prediction per line, where the entries on each line are the predicted classes. The cell below displays these as a horizontal bar chart showing how many times each class appears in the data. My results show that the most commonly identified objects are cow, bird, horse, sheep, zebra, and elephant. These results make sense, since the data comes from Serengeti pictures, in particular pictures of mammals.

In [None]:
from time import sleep
from collections import defaultdict

from IPython import display
import matplotlib.pyplot as plt

%matplotlib inline
plt.rcParams['figure.figsize'] = (10, 10)

with ssh.Remote(floating_ip) as conn:
    while True:
        # Download the results file and parse it into a dictionary
        conn.get("out.csv")
        data = defaultdict(int)
        with open("out.csv") as f:
            for line in f.readlines():
                # Strip before de-duplicating so each class counts once per line
                line_categories = {lc.strip() for lc in line.split(",") if lc.strip()}
                for line_category in line_categories:
                    data[line_category] += 1
        sorted_data = sorted(data.items(), key=lambda x: x[1], reverse=False)
        categories = [i[0] for i in sorted_data]
        counts = [i[1] for i in sorted_data]
        
        # Display a plot of the data
        display.clear_output(wait=True)
        plt.barh(categories, counts)
        plt.xlabel("Count") 
        plt.ylabel("Class") 
        plt.title("Classes appearing in edge data")
        plt.show()
        
        # Wait a few seconds before downloading the data again.
        sleep(5)
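
The parsing step in the loop above can also be pulled into a small function, which makes it easy to check against a known input. This is a sketch of that counting step, not a replacement for the cell. `count_classes` and the sample lines are made up for illustration.

```python
from collections import Counter


def count_classes(lines):
    """Count, for each class, the number of lines (predictions) it appears in."""
    counts = Counter()
    for line in lines:
        # Strip whitespace first, then de-duplicate with a set, so a class
        # repeated within one line is counted only once for that line.
        categories = {c.strip() for c in line.split(",") if c.strip()}
        counts.update(categories)
    return counts


sample_lines = ["zebra, elephant\n", "zebra,zebra\n", "\n", "bird\n"]
sample_counts = count_classes(sample_lines)
print(sample_counts.most_common())
```

Here `zebra` is counted twice because it appears on two lines, even though the second line lists it twice, and the blank line contributes nothing.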