# Spirit Artifact Evaluation

So far, we have installed system dependencies and applications. We also started the remote memory system from the memory node side. 

Now we need to:

- Start the remote memory client on the compute node.
- Set up an Ansible connection from the memory node to the compute node to orchestrate benchmarks.
  - Specifically, we need to generate an SSH key on the memory node and register it on the compute node.
- Run the example benchmark with the configured Ansible.

**Important: in the menu above, make sure that the `Bash` kernel is selected.**
- At the top right, just before the hamburger menu icon (☰), the kernel name should read `Bash` instead of `Python`.
- To change the kernel, use the top menu: Kernel - Change Kernel - select Bash - click 'Select'.

To run each cell, you can use a shortcut (e.g., cmd + return on Mac) or use the ▶️ button in the menu bar.

## Starting remote memory on the compute node

On the 🖥️compute node, run the following code:
```bash
cd /opt/spirit/spirit-controller/ae/compute_node
./3.setup_memdev.sh
```

It will take a couple of minutes to set up the remote memory.

Expected output:
```bash
... (omitted)
make[2]: Leaving directory '/opt/spirit/spirit-controller/remote_mem/drivers/mind_ram'
make[1]: Leaving directory '/opt/spirit/linux-6.13'
Using server IP: 10.10.10.221
Using RDMA device: mlx5_3
Wait for the daemon to map the queue
Setting up swapspace version 1, size = 48 GiB (51539603456 bytes)
no label, UUID=c931f058-a39a-4356-bda8-c4c4afff0183
3
50
60
1
99
none
```
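The `Setting up swapspace` line in the expected output suggests that the remote memory is exposed as a swap device. As a quick sanity check, a minimal sketch like the following lists the active swap devices and their sizes. The helper name `list_swap_devices` is hypothetical and not part of the Spirit scripts, and the actual device name depends on your setup.

```bash
# Hypothetical helper (not part of the Spirit scripts): list active swap
# devices from a /proc/swaps-style file. Defaults to the real /proc/swaps.
list_swap_devices() {
  local swaps_file="${1:-/proc/swaps}"
  # Skip the header row, then print the device name and its size in KiB.
  awk 'NR > 1 { print $1, $3 }' "$swaps_file"
}

# Run the check only where /proc/swaps exists (i.e., on Linux).
if [ -r /proc/swaps ]; then
  list_swap_devices
fi
```

A 48 GiB swap device should report a size close to 50331648 KiB.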

## Set up an Ansible connection

We need to prepare an ssh key that will be used by Ansible to run experiments. On the 🗂️memory node (i.e., this node running this Jupyter notebook), run the following cell:

```bash
ssh-keygen -t ed25519 -N '' -f ~/.ssh/id_ed25519

echo "Your public key is:"
cat ~/.ssh/id_ed25519.pub
```

Example output:
```
Generating public/private ed25519 key pair.
Your identification has been saved in /users/sslee_cs/.ssh/id_ed25519
Your public key has been saved in /users/sslee_cs/.ssh/id_ed25519.pub
The key fingerprint is:
SHA256:MXii/YHarzmozFieCOeAESqLHkUUPU+VY1JbtxULiCA sslee_cs@node-1.sslee-cs-264825.mind-disagg-pg0.utah.cloudlab.us
The key's randomart image is:
+--[ED25519 256]--+
|   ooE .+oo..o o.|
|  .  o.+ =o.. + .|
|.  .  * *..  . . |
|...  o = o       |
|+  .. o S        |
|+o.  o . .       |
|*.o ... .        |
|o@... .o         |
|o.B.  oo.        |
+----[SHA256]-----+
Your public key is:
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIFHF4OD7aN/UW0yFVaOismfasxzViQIH5OavUGBuWtSq sslee_cs@node-1.sslee-cs-264825.mind-disagg-pg0.utah.cloudlab.us
```


The last line of the cell above should print a generated SSH public key (starting with `ssh-ed25519 `), which needs to be registered on the _compute node_.

**Note: You cannot use the prepopulated key shown above. Run the cell yourself to generate your own key.**

On the 🖥️ **compute node**, run the following command to register that key (keep the single quotes around it):
```bash
echo 'YOUR_GENERATED_PUBLIC_KEY' >> ~/.ssh/authorized_keys
```
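Running the `echo ... >>` command twice appends the key twice, and `sshd` with its default `StrictModes` setting can reject keys when `~/.ssh` or `authorized_keys` is writable by other users. The following is a minimal sketch of a more defensive registration step. The function name `register_key` is hypothetical, and the key string in the usage comment is a placeholder.

```bash
# Hypothetical helper: append a public key to authorized_keys only if it is
# not already present, and tighten permissions on the directory and file.
register_key() {
  local pubkey="$1"
  local auth_file="${2:-$HOME/.ssh/authorized_keys}"
  local ssh_dir
  ssh_dir="$(dirname "$auth_file")"

  mkdir -p "$ssh_dir"
  chmod 700 "$ssh_dir"
  touch "$auth_file"
  chmod 600 "$auth_file"

  # -q: quiet, -x: match the whole line, -F: treat the key as a fixed string.
  if ! grep -qxF -- "$pubkey" "$auth_file"; then
    printf '%s\n' "$pubkey" >> "$auth_file"
  fi
}

# Usage on the compute node (replace the placeholder with your own key):
# register_key 'YOUR_GENERATED_PUBLIC_KEY'
```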

After this step, this 🗂️memory node should be able to access the compute node as:

```bash
ssh -o StrictHostKeyChecking=no 10.10.10.201 -t exit
```

Expected output:
```
Connection to 10.10.10.201 closed.
```
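If the key was not registered correctly, a plain `ssh` call may fall back to a password prompt, which is awkward inside a notebook cell. The sketch below uses the standard OpenSSH options `BatchMode=yes` and `ConnectTimeout` so that the check fails quickly instead of prompting. The function name `check_ssh` and the `SSH_BIN` override are hypothetical and exist only for illustration.

```bash
# Hypothetical helper: verify that key-based login works without prompting.
# SSH_BIN is an illustrative override for the ssh binary (defaults to ssh).
check_ssh() {
  local host="$1"
  # BatchMode=yes makes ssh fail instead of asking for a password;
  # ConnectTimeout=5 gives up after five seconds if the host is unreachable.
  if "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
    echo "ok: $host accepts key-based login"
  else
    echo "failed: $host did not accept key-based login" >&2
    return 1
  fi
}

# Usage on the memory node:
# check_ssh 10.10.10.201
```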


We use Ansible to orchestrate the benchmark on the compute node. To make it run, you need to update the user name in Ansible's `hosts` file.

On the 🗂️**memory node**, update the `hosts` file so that it reflects your user name. The cell below prints the file's location and your user name.

```bash
export SPIRIT_PATH="/opt/spirit"
echo "location of hosts file:"
ls $SPIRIT_PATH/spirit-controller/scripts/disagg/ansible/hosts
echo "user name:" $(whoami)
```

Example output:
```
location of hosts file:
/opt/spirit/spirit-controller/scripts/disagg/ansible/hosts
user name: sslee_cs
```


_Update the `hosts` file above on the 🗂️memory node. Change only the user name; do not change the IP address._
```
[servers]
10.10.10.201 ansible_user=<your_user_name>
```
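Instead of editing the file by hand, the user name can be substituted with `sed`. The following is a minimal sketch, assuming the file contains `ansible_user=<value>` entries as in the snippet above and that the user name contains no `/` or `&` characters. The function name `set_ansible_user` is hypothetical.

```bash
# Hypothetical helper: set the ansible_user value in an Ansible hosts file.
# Only the text after "ansible_user=" changes; the IP address stays untouched.
set_ansible_user() {
  local hosts_file="$1" user="$2" tmp
  tmp="$(mktemp)"
  sed -E "s/(ansible_user=)[^[:space:]]+/\1${user}/" "$hosts_file" > "$tmp"
  # Write the result back into the original file so its permissions are kept.
  cat "$tmp" > "$hosts_file"
  rm -f "$tmp"
}

# Usage on the memory node:
# set_ansible_user "$SPIRIT_PATH/spirit-controller/scripts/disagg/ansible/hosts" "$(whoami)"
```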

## Running the example benchmark

Once you have updated the file, you can run the following command to start the example benchmark.

**Note: Please be patient. This will take approximately 3 to 4 hours to complete.**

---

#### 1) Testing the environment with static allocation

Running the cell below will start the example benchmark using static resource allocation. ⏰ _This step will take approximately 45 to 60 minutes to complete._

You can check the log (see the following cells) during the experiment to ensure the system is functioning correctly. Because notebook cells run one at a time, the log-checking cells cannot run while the benchmark cell is still executing; run their commands in a separate SSH session instead.

```bash
cd $SPIRIT_PATH/spirit-controller/scripts/disagg
./run_eval_2apps_static.sh
```

Expected output:
```
Global enforcer config: ../../sample_configs/global_2_apps.json
Resource config: configs/2apps/config_5g_5gbps_30sec.json
Resource picked config: configs/2apps/config_5g_5gbps_30sec_oracle.json
mkdir -p ../../logs
sudo docker run --rm -it -v $(pwd)/ansible:/ansible -v $(pwd)/../../sample_configs/global_2_apps.json:/config.json -p 8001:8000 -v $HOME/.ssh/id_ed25519:/root/.ssh/id_ed25519:ro -v $(pwd)/../../:/spirit-controller -e ANSIBLE_HOST_KEY_CHECKING=False ansible-docker ansible-playbook -i hosts run_full_env_static.yml -e "res_alloc_config=configs/2apps/config_5g_5gbps_30sec.json" -e "local_config_prefix=sample_configs/local_2_apps_no_mrc_vm"

PLAY [servers] *****************************************************************

TASK [Gathering Facts] *********************************************************
ok: [10.10.10.201]

TASK [Build the local enforcer] ************************************************
changed: [10.10.10.201]

TASK [Get list of 
... (omitted)
```

---

You can follow the progress of the benchmark in Ansible's output.
You can also check whether the containers on the 🖥️**compute node** are running:

```bash
docker ps
```

Expected output (note that `bench-mc-client-docker` starts after `TASK [Run the benchmark clients]`):
```
CONTAINER ID   IMAGE                    COMMAND                  CREATED          STATUS          PORTS     NAMES
3c82dd271383   bench-mc-client-docker   "./bench-mc-client -…"   7 seconds ago    Up 6 seconds              spirit_mc_client_2
d705fb35d6b3   memcached                "docker-entrypoint.s…"   37 seconds ago   Up 36 seconds             spirit_memcached_1
53d69399cb6e   stream-docker            "./stream 12"            11 minutes ago   Up 11 minutes             spirit_stream_1
```
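To avoid reading the table by eye, the container names can be compared against a list of expected names. The sketch below assumes the three container names shown in the expected output above. The function name `missing_containers` is hypothetical, while `docker ps --format '{{.Names}}'` is the standard Docker CLI way to print only the names.

```bash
# Hypothetical helper: print every expected container name that does not
# appear in the newline-separated list of running names ($1).
missing_containers() {
  local running="$1" name
  shift
  for name in "$@"; do
    # -x: whole-line match, -F: fixed string, -q: no output.
    printf '%s\n' "$running" | grep -qxF -- "$name" || echo "$name"
  done
}

# Usage on the compute node (prints nothing when all containers are up):
# missing_containers "$(docker ps --format '{{.Names}}')" \
#   spirit_stream_1 spirit_memcached_1 spirit_mc_client_2
```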

The evaluation results are stored in the log directory on the 🗂️**memory node**: `$SPIRIT_PATH/spirit-controller/res_allocation/logs`

Note that the logs are available after the resource allocator starts, i.e., after the Ansible script prints:
```bash
TASK [Run resource allocator] **************************************************
```
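While an experiment is running, it can help to follow the most recently written log file from a separate SSH session. The source does not state the log file names, so the sketch below simply picks the newest regular file under the log directory. The function name `latest_log` is hypothetical.

```bash
# Hypothetical helper: print the most recently modified regular file under
# a directory (searched recursively). Prints nothing if there are no files.
latest_log() {
  local log_dir="$1"
  # ls -t sorts the files found by modification time, newest first.
  find "$log_dir" -type f -exec ls -t {} + 2>/dev/null | head -n 1
}

# Usage on the memory node, in a separate SSH session:
# tail -f "$(latest_log /opt/spirit/spirit-controller/res_allocation/logs)"
```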

After each experiment finishes, the log files are moved to a sub-directory named after the resource allocation scheme.

For instance, for the static allocation, the sub-directory path is:
```
$SPIRIT_PATH/spirit-controller/res_allocation/logs/mindv2/static
```

```bash
export SPIRIT_PATH="/opt/spirit"
ls -al $SPIRIT_PATH/spirit-controller/res_allocation/logs/mindv2/static
```

Example output:
```
total 12
drwxr-xr-x 3 root root 4096 Aug  1 20:51 .
drwxr-xr-x 3 root root 4096 Aug  1 20:51 ..
drwxr-xr-x 3 root root 4096 Aug  1 20:51 App2_int_30.0sec
```


---

#### 2) Testing Spirit and other resource allocation methods.

Running the cell below runs the example benchmark with Spirit and the other resource allocation methods. ⏰ _This step will take more than 3 hours to complete._

```bash
cd $SPIRIT_PATH/spirit-controller/scripts/disagg
./run_eval_2apps_others.sh
```

Expected output:
```
Global enforcer config: ../../sample_configs/global_2_apps.json
Resource config: configs/2apps/config_5g_5gbps_30sec.json
Resource picked config: configs/2apps/config_5g_5gbps_30sec_oracle.json
mkdir -p ../../logs
sudo docker run --rm -it -v $(pwd)/ansible:/ansible -v $(pwd)/../../sample_configs/global_2_apps.json:/config.json -p 8001:8000 -v $HOME/.ssh/id_ed25519:/root/.ssh/id_ed25519:ro -v $(pwd)/../../:/spirit-controller -e ANSIBLE_HOST_KEY_CHECKING=False ansible-docker ansible-playbook -i hosts run_full_env.yml -e "res_alloc_config=configs/2apps/config_5g_5gbps_30sec.json" -e "local_config_prefix=sample_configs/local_2_apps_vm"

PLAY [servers] *****************************************************************

TASK [Gathering Facts] *********************************************************
ok: [10.10.10.201]

TASK [Build the local enforcer] ************************************************
changed: [10.10.10.201]

TASK [Get list of all Docker con
... (omitted)
```

You can check the logs with the following cell:

```bash
export SPIRIT_PATH="/opt/spirit"
ls -al $SPIRIT_PATH/spirit-controller/res_allocation/logs/mindv2/
```

Example output:
```
total 28
drwxr-xr-x 7 root     root            4096 Aug  2 00:11 .
drwxr-xr-x 3 sslee_cs mind-disagg-PG0 4096 Aug  2 00:11 ..
drwxr-xr-x 3 root     root            4096 Aug  1 23:37 fij-trade
drwxr-xr-x 3 root     root            4096 Aug  2 00:11 inc-trade
drwxr-xr-x 3 root     root            4096 Aug  1 23:03 oracle
drwxr-xr-x 3 root     root            4096 Aug  1 22:29 spirit
drwxr-xr-x 3 root     root            4096 Aug  1 20:51 static
```


Most evaluation results in the paper are derived by analyzing the collected logs.
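Before analyzing the logs, it may be worth confirming that every allocation scheme produced a result directory. The sketch below checks for the five sub-directory names that appear in the listing above (`static`, `spirit`, `oracle`, `fij-trade`, `inc-trade`). The function name `check_results` is hypothetical.

```bash
# Hypothetical helper: report which per-scheme result directories exist
# under the given base log directory.
check_results() {
  local base_dir="$1" scheme
  for scheme in static spirit oracle fij-trade inc-trade; do
    if [ -d "$base_dir/$scheme" ]; then
      echo "$scheme: present"
    else
      echo "$scheme: missing"
    fi
  done
}

# Run the check only if the log directory exists on this machine.
LOG_BASE="${SPIRIT_PATH:-/opt/spirit}/spirit-controller/res_allocation/logs/mindv2"
if [ -d "$LOG_BASE" ]; then
  check_results "$LOG_BASE"
fi
```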

---

<br>

Congratulations! You have successfully run the example benchmark with Spirit and other resource allocation methods. 🎉

See below for how to extend the example benchmark.

<br>

## MISC – Configuration files for experiments with more than two applications

We provide example configurations that support running more than two applications across up to four compute and memory nodes. However, due to resource limitations, you cannot run more than two applications on this CloudLab instance, as doing so may destabilize the system or even cause it to crash.

There are two types of configurations: (i) cluster setup and (ii) resource allocation setup.

- Cluster setup configurations in [this directory](https://github.com/yale-nova/spirit/tree/main/sample_configs) define how different applications are launched on the compute node(s).

- Resource allocation configurations in [this directory](https://github.com/yale-nova/spirit/tree/main/res_allocation/configs) specify the system-wide resource budgets, such as local DRAM size and network bandwidth.

For the resource allocation schemes, [this directory](https://github.com/yale-nova/spirit/tree/main/scripts/disagg/ansible) contains Ansible playbooks that define each experiment run, including which resource allocation algorithm to use.

Finally, the scripts that we used to run the other experiments in the paper are in [this directory](https://github.com/yale-nova/spirit/tree/main/scripts/disagg) (`run_eval_...`):
- `run_eval_1app.sh`: runs a single application to measure performance. It can be run across different resource allocations (e.g., motivation figures).
- `run_eval_2app.sh`: runs two applications to measure performance. It can be run across different allocation intervals (e.g., Figure 7 in the paper).
- `run_eval_4apps_dyn.sh`: runs four applications whose resource sensitivity changes dynamically (e.g., Figure 9 in the paper).
- `run_eval_6apps.sh`: runs six applications to measure performance (e.g., Figure 6 in the paper).

---

We hope you find these configurations useful 🚀. If you have any questions or need further assistance, please feel free to reach out to us 💬.