# Baremetal Experiment Pattern

This is a simple experiment that illustrates how you can capture power measurements from the bare metal machine directly. It's also an example of how to structure an experiment in a reproducible manner, and illustrates several of the capabilities offered by Chameleon for reproducibility. 

## Configuring resources

We need to configure the experiment "container," meaning the isolated, reproducible environment that our experiment will run in. Initially, this will configure Chameleon resources to the point where you will be able to `ssh` to a node. The remainder of this setup is installing software on that node, which in this case is loaded from GitHub.

The first thing you must do is set what site and project to use. You can select this from the dropdown that displays, or use the defaults that are automatically picked. For more information about setting context in python-chi, [see here](https://python-chi.readthedocs.io/en/dev/modules/context.html).

In [None]:
from chi import context

# During the transition period, we need to opt into some of the
# new python-chi functions. Otherwise the functional interface will
# return the old types.
context.version = "1.0"

context.choose_site(default="CHI@TACC")
context.choose_project()

### Check available hardware

Next, we'll pick which hardware to use. The following code cell looks for nodes whose type matches the `node_type` variable, and filters out the ones that are reserved.

This information comes from [the Chameleon hardware repository](https://chameleoncloud.org/hardware), and can be [queried via python-chi](https://python-chi.readthedocs.io/en/dev/modules/hardware.html).

In [None]:
from chi import hardware

node_type = "compute_cascadelake_r"
available_nodes = hardware.get_nodes(node_type=node_type, filter_reserved=True)
if available_nodes:
    print(f"There currently are {len(available_nodes)} {node_type} nodes ready to use")
else:
    print(f"All {node_type} nodes are in use! You could use next_free_timeslot to see how long you need to wait, or use the calendar.")

### Reserve node

In order to use hardware on Chameleon, you'll need to [make a reservation](https://chameleoncloud.readthedocs.io/en/latest/technical/reservations.html). You can do this via [the python-chi lease module](https://python-chi.readthedocs.io/en/dev/modules/lease.html).

If the resources you want to use are currently free, you can make a lease that starts right now. The following code does this to reserve one of the nodes found above, and it also reserves a floating IP.

In [None]:
from chi import lease
from datetime import timedelta
import os

my_lease = lease.Lease(f"{os.getenv('USER')}-power-management", duration=timedelta(hours=3))
my_lease.add_node_reservation(nodes=[available_nodes[0]]) # or you could use node_type=node_type
my_lease.add_fip_reservation(1) # include a floating ip
my_lease.submit(idempotent=True)
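
The lease name above is built from `os.getenv('USER')`, which returns `None` when that variable is unset, so the name would start with the literal text `None`. A minimal sketch of a fallback, using only the standard library; `resource_name` is a hypothetical helper, not part of python-chi:

```python
import getpass
import os


def resource_name(suffix: str) -> str:
    """Build a '<user>-<suffix>' name (hypothetical helper for illustration)."""
    user = os.getenv("USER")
    if not user:
        # USER is unset or empty: ask the OS, and fall back to a fixed label
        # if the lookup fails (for example, in a minimal container).
        try:
            user = getpass.getuser()
        except Exception:
            user = "unknown"
    return f"{user}-{suffix}"


print(resource_name("power-management"))
```

The returned string could be passed wherever the notebook currently builds the name inline, for the lease and for the server.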

### Create a server on the node

Next, we will launch the reserved node with [an image](https://chameleoncloud.readthedocs.io/en/latest/technical/images.html). You can search for [Chameleon supported images in python-chi](https://python-chi.readthedocs.io/en/dev/modules/image.html). Here, we use an Ubuntu image.

Then, we submit the request to create a server, using the reservation information from our lease. You'll need to wait for the server to fully provision, which can take around 10 minutes depending on the node.

In [None]:
from chi import server

my_server = server.Server(
    f"{os.getenv('USER')}-power-management",
    reservation_id=my_lease.node_reservations[0]["id"],
    image_name="CC-Ubuntu22.04", # a Chameleon-supported Ubuntu image
)
my_server.submit(idempotent=True)

### Configure networking on the node

Now, we must configure the server to use the floating IP we reserved earlier. We'll also need to wait for the networking to finish configuring, which may take a few additional minutes.

In [None]:
fip = my_lease.get_reserved_floating_ips()[0]
my_server.associate_floating_ip(fip)
my_server.check_connectivity(host=fip)

### Install software on the node

Now we will install our software on the node, over SSH. In your own experiment, you would likely want to change these commands.

You can use the `execute` [method](https://python-chi.readthedocs.io/en/dev/modules/server.html#chi.server.Server.execute), or for more advanced usage you can [get a Fabric Connection object](https://python-chi.readthedocs.io/en/dev/modules/server.html#chi.server.Server.ssh_connection).

In [None]:
# Clone git repo with experiment source code
my_server.execute("git clone https://github.com/ChameleonCloud/bare_metal_experiment_pattern")

# Run setup script
my_server.execute("bash bare_metal_experiment_pattern/scripts/setup.sh")

## Run Experiment

Now, we can finally run the experiment. This runs the `stress-ng` program on different numbers of CPUs for 10 seconds and measures the power consumption via `perf`. For more information on measuring power, see [this Chameleon blog post](https://chameleoncloud.org/blog/2024/06/18/power-measurement-and-management-on-chameleon/).

You can edit `iterations` to gather more data points, which will result in a more interesting result.

In [None]:
iterations = 1
for i in range(iterations):
    my_server.execute("bash bare_metal_experiment_pattern/scripts/run_experiment.sh 10")
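
The measurements collected here are energy readings, which the plots later in this notebook label in Joules. If average power is more convenient to compare, energy can be divided by the duration of the run. A minimal sketch, assuming each reading covers one full 10-second run (an assumption about `run_experiment.sh`, not something confirmed here); `average_power_watts` and the sample readings are hypothetical:

```python
def average_power_watts(energy_joules: float, duration_seconds: float) -> float:
    """Average power in Watts: energy (J) divided by elapsed time (s)."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return energy_joules / duration_seconds


RUN_SECONDS = 10  # assumed to match the argument passed to run_experiment.sh
readings_joules = [520.0, 540.0]  # made-up values, for illustration only

watts = [average_power_watts(j, RUN_SECONDS) for j in readings_joules]
print(watts)
```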

## Analyze Results

The experiment uploaded its results to an object store bucket named for today's date. If you re-run the experiment above, additional results will be placed there.

To analyze these, we'll download the data locally. The Jupyter environment has limited resources, so if the data were large, we could process it on the bare metal server instead.

In [None]:
from chi import storage
from datetime import datetime
import os

# Create a local directory for the downloaded results
os.makedirs("./out", exist_ok=True)
# Get current date in YYYY-MM-DD format
current_date = datetime.now().strftime('%Y-%m-%d')
b = storage.ObjectBucket(f"bare_metal_experiment_pattern_data_{current_date}")
for obj in b.list_objects():
    print(f"Downloading {obj.name}")
    obj.download(f"out/{obj.name}")
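
Because the bucket name embeds the date, results from an earlier day live in a differently named bucket. A minimal sketch of building those names, assuming the naming scheme shown in the cell above; `bucket_name_for` and `recent_bucket_names` are hypothetical helpers:

```python
from datetime import date, timedelta
from typing import List, Optional

# Prefix copied from the download cell above
BUCKET_PREFIX = "bare_metal_experiment_pattern_data_"


def bucket_name_for(day: date) -> str:
    """Return the bucket name for a given day (YYYY-MM-DD suffix)."""
    return f"{BUCKET_PREFIX}{day.strftime('%Y-%m-%d')}"


def recent_bucket_names(days: int, today: Optional[date] = None) -> List[str]:
    """Return bucket names for the last `days` days, newest first."""
    today = today or date.today()
    return [bucket_name_for(today - timedelta(days=n)) for n in range(days)]


print(recent_bucket_names(3))
```

Each name could then be passed to `storage.ObjectBucket` in the same way as in the cell above. A bucket for a day with no runs may not exist.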

Now, we process the data from the files so we can plot it.

In [None]:
from collections import defaultdict
data = {
    "power/energy-pkg/": defaultdict(list),
    "power/energy-ram/": defaultdict(list),
}

for filename in os.listdir("out/"):
    # Only parse our data files
    if ".out" not in filename:
        continue
    with open(f"out/{filename}") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines, which cannot be unpacked
            # Each line holds three space-separated fields
            cores, value, measurement = line.split()
            data[measurement][cores].append(float(value))
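
Before plotting, a quick numeric summary can help confirm that the data parsed as expected. The sketch below applies the same three-field line format to a few made-up sample lines and computes the mean per group. `parse_lines` is a hypothetical helper, and the sample values are not real measurements:

```python
from collections import defaultdict
from statistics import mean


def parse_lines(lines):
    """Group values by measurement name, then by the first field."""
    parsed = defaultdict(lambda: defaultdict(list))
    for raw in lines:
        parts = raw.split()
        if len(parts) != 3:
            continue  # ignore blank or malformed lines
        cores, value, measurement = parts
        parsed[measurement][cores].append(float(value))
    return parsed


# Made-up sample lines in the "<cores> <value> <measurement>" layout
sample = [
    "1 50.0 power/energy-pkg/",
    "1 54.0 power/energy-pkg/",
    "2 80.0 power/energy-pkg/",
    "",
    "1 10.0 power/energy-ram/",
]

parsed = parse_lines(sample)
means = {
    measurement: {cores: mean(values) for cores, values in by_cores.items()}
    for measurement, by_cores in parsed.items()
}
print(means)
```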

Finally, we display a box plot of the data. If you ran the experiment multiple times, you may see some variation on the plot.

In [None]:
import matplotlib.pyplot as plt

# Format perf's label into a nicer string
PERF_CHART_TYPE_FORMAT = {
    "power/energy-pkg/": "CPU",
    "power/energy-ram/": "RAM",
}

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
axes = iter([ax1, ax2])

for chart_type, chart_data in data.items():
    subplot = next(axes)
    labels = list(chart_data.keys())
    values = list(chart_data.values())
    subplot.boxplot(values, labels=labels)
    
    subplot.set_title(f'{PERF_CHART_TYPE_FORMAT[chart_type]} Energy Consumption\nfor CPU Utilization % Box Plot')
    subplot.set_xlabel('CPU Utilization %')
    subplot.set_ylabel('Joules')
plt.show()
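
The x-axis labels above follow the order in which keys were first inserted into each dictionary, which depends on the order the files were read. A minimal sketch of sorting them numerically before plotting, assuming the keys are numeric strings; `sorted_box_data` is a hypothetical helper and the sample values are made up:

```python
def sorted_box_data(chart_data):
    """Return (labels, values) ordered by the numeric value of each label."""
    labels = sorted(chart_data, key=float)
    values = [chart_data[label] for label in labels]
    return labels, values


# Made-up data whose insertion order is not numeric order
chart_data = {"100": [9.0], "25": [3.0, 3.5], "50": [6.0]}
labels, values = sorted_box_data(chart_data)
print(labels)
```

Sorting with `key=float` avoids the lexicographic ordering that a plain `sorted` would give, where `"100"` would come before `"25"`.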