# Start Lease (for bare metal and instance created from images)

This notebook is adapted from ["Hello, Chameleon"](https://www.chameleoncloud.org/experiment/share/a10a1b51-51d7-4c6e-ba83-010a5cf759d6) by Fraida Fund.

In this notebook, we will create and reserve the Chameleon resources (volumes, leases, security groups, etc.) needed for this project.

# Reserve resources

Whenever you run an experiment on Chameleon, you will

1.  Open a Python notebook, which includes commands to reserve and configure the resources (VMs, bare metal servers, or networks) that you need for your experiment. Run these commands.
2.  Wait until the resources in your experiment are ready to log in.
3.  Log in to the resources and run your experiment (either by executing commands in the notebook, or by using SSH in a terminal and running commands in those SSH sessions).

First, we will need to initialize the environment - tell it what Chameleon project to associate our experiment with.

You should already be a part of a Chameleon project, which has a project ID in the form “CHI-XXXXX”. If you don’t know your project ID, you can find it by logging in to the Chameleon web portal, and checking your [dashboard](https://chameleoncloud.org/user/dashboard/). When you run the next cell, you will see a drop-down menu for selecting your project.

In [1]:
import chi, os, time, datetime
from chi import lease
from chi import server
from chi import context
from chi import hardware
from chi import network
from chi import storage
from chi.clients import cinder, nova, neutron
from chi.image import get_image


context.version = "1.0" 
context.choose_project()
context.choose_site(default="KVM@TACC")
username = os.getenv('USER') # all exp resources will have this suffix

Next, we’ll give our resource a name. Every resource in a project should have a unique name, so we will include a username, as well as a description of the experiment, in the name.

In [39]:
exp_name = "OpenMCP"  # Resource Prefix

server_name = f"{exp_name}-{username}"
lease_name = f"{exp_name}-{username}"

key_name = "id_rsa"  # Specify your Key-Pair that's in the selected chameleon project (to SSH to the instance)
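
One thing to watch for with this naming scheme: `os.getenv('USER')` returns `None` when the `USER` variable is not set, which would produce a name such as `OpenMCP-None`. The following is a minimal sketch of a fallback, not part of the original notebook; the helper name `make_resource_name` is hypothetical.

```python
import getpass
import os


def make_resource_name(prefix, user=None):
    """Build "<prefix>-<user>", falling back to getpass if USER is unset."""
    # Hypothetical helper for illustration; it is not part of python-chi.
    user = user or os.getenv("USER") or getpass.getuser()
    return f"{prefix}-{user}"


print(make_resource_name("OpenMCP", "alice"))  # OpenMCP-alice
```

Calling `make_resource_name(exp_name)` without a user would use the same `USER` variable as the cell above when it is set.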

## Create an instance

First, we will reserve an instance (bare metal or VM).

In [None]:
flavor_name = "m1.xxlarge"  # Choose the resource you want to reserve (e.g. m1.xxlarge, gpu_p100, etc.)
hours = 5 * 24  # As much time as you need

In [None]:
l = lease.Lease(lease_name, duration=datetime.timedelta(hours=hours))
l.add_flavor_reservation(id=chi.server.get_flavor_id(flavor_name), amount=1)
l.submit(idempotent=True)
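
The lease cell above converts `hours` into a `datetime.timedelta`. As a rough sketch of the arithmetic, the snippet below shows how long `5 * 24` hours is and when such a lease would end. The start time is an illustrative assumption; a real lease starts when it is submitted, and the names `example_hours`, `example_start`, and `example_end` are hypothetical.

```python
import datetime

example_hours = 5 * 24  # same value as `hours` in the notebook
duration = datetime.timedelta(hours=example_hours)

# Illustrative start time only (assumption, not taken from a real lease).
example_start = datetime.datetime(2025, 1, 1, 9, 0)
example_end = example_start + duration

print(duration.days)  # 5
print(example_end)    # 2025-01-06 09:00:00
```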

In [None]:
l.show()

Then we can launch it:

In [None]:
image_name = "CC-Ubuntu24.04-CUDA"  # CC-Ubuntu24.04 if you only need CPU
s = server.Server(
    name=server_name,
    image_name=image_name,
    flavor_name=l.get_reserved_flavors()[0].name
)
s.submit(idempotent=True)

Once the resource is allocated and ready, we will associate a network address with it, so that we can log in to the resource over the Internet using the SSH protocol.

In [None]:
s.associate_floating_ip()

In [None]:
reserved_fip = s.get_floating_ip()
print(reserved_fip)

There’s one more step before we can log in to the resource - by default, all connections to VM resources are blocked, as a security measure. We will need to add a “security group” that permits SSH connections to our project (if it does not already exist), then attach this security group to our VM resource.

In [None]:
sg_list = network.list_security_groups(name_filter="allow-ssh")
if sg_list: # allow-ssh already exists
    sg = sg_list[0]
else:       # create allow-ssh
    sg = network.SecurityGroup({"name": "allow-ssh", "description": "Enable SSH traffic on TCP port 22"})
    sg.add_rule("ingress", "tcp", 22)
    sg.submit()
s.add_security_group(sg.id)

In [None]:
# Add the other security groups that will be needed for the experiments

PORTS = [8080, 5900, 8501, 6080]
for PORT in PORTS:
    sg_extra_name = f"allow-{str(PORT)}"
    sg_list = network.list_security_groups(name_filter=sg_extra_name)
    if sg_list: # allow-{PORT} already exists
        sg = sg_list[0]
    else:       # create allow-{PORT}
        sg = network.SecurityGroup({"name": sg_extra_name, "description": f"Enable traffic on TCP port {str(PORT)}"})
        sg.add_rule("ingress", "tcp", PORT)
        sg.submit()
    s.add_security_group(sg.id)
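
Both security group cells follow the same "reuse it if it exists, otherwise create it" pattern, which is what makes them safe to re-run. The sketch below isolates that pattern using a plain dictionary as a stand-in for the project's security groups. It does not call the `chi` API, and `get_or_create` is a hypothetical helper.

```python
def get_or_create(name, registry, create):
    """Return the existing entry for `name`, or create and store one."""
    if name in registry:           # already exists: reuse it
        return registry[name]
    registry[name] = create(name)  # otherwise create it exactly once
    return registry[name]


# Stand-in for the project's security groups (a dict, not the chi API).
groups = {}
ports = [8080, 5900, 8501, 6080]
for _ in range(2):  # running the loop a second time creates nothing new
    for port in ports:
        get_or_create(f"allow-{port}", groups, lambda n: {"name": n})

print(len(groups))  # 4
```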

That’s all we need to do to prepare a resource to log in!

In [None]:
s.check_connectivity()
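
If you want an independent check that the SSH port is reachable, a TCP connection attempt with the standard library is one option. This is a minimal sketch and not a description of how `check_connectivity()` works internally; `port_is_open` is a hypothetical helper.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False
```

In this notebook you might call `port_is_open(reserved_fip, 22)` once the floating IP and the `allow-ssh` security group are in place.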

## Log in over SSH from local terminal

To log in to the VM over SSH from your local terminal:

-   open the terminal application *installed on your computer*,
-   run the cell below, which will print an SSH login command,
-   copy this command and, if needed, modify it as described after the cell,
-   paste it into your terminal and hit Enter.

In this case, you will specify the key location as part of the SSH command. These instructions assume that you have created a key pair with the name stored in `key_name` (`id_rsa` in this notebook), put the private key in the default `.ssh` subdirectory in your home directory, and uploaded the public key to the KVM@TACC web interface.

In [71]:
print(f"ssh -i ~/.ssh/{key_name} cc@{reserved_fip}")  # reserved_fip comes from s.get_floating_ip() above

ssh -i ~/.ssh/id_rsa cc@129.114.25.14


If your Chameleon key is in a different location, or has a different name, then you may need to modify the `~/.ssh/{key_name}` part of this command to point to *your* key.
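
If your key lives elsewhere, a small helper can build the command for any path. This is a sketch under assumptions: `ssh_command` is a hypothetical helper, and the example path is made up. It expands `~` before quoting, because a quoted `~` would not be expanded by the shell.

```python
import shlex
from pathlib import Path


def ssh_command(key_path, host, user="cc"):
    """Return an ssh command with the key path expanded and shell-quoted."""
    key = Path(key_path).expanduser()  # turn "~" into the home directory
    return f"ssh -i {shlex.quote(str(key))} {user}@{host}"


# Hypothetical key location, for illustration only.
print(ssh_command("/home/alice/.ssh/my_key", "129.114.25.14"))
```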

The first time you log in to each new host, your computer may display a warning similar to the following:

``` shell
The authenticity of host "129.114.26.xx (129.114.26.xx)" cannot be established.
ED25519 key fingerprint is SHA256:1fcbGrgLDdOeorauhz3CTyhmFqOHsrEWlu0TZ6yGoDM.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])?
```

and you will have to type the word *yes* and hit Enter to continue.

If you have specified your key path and other details correctly, it won’t ask you for a password when you log in to the resource. (It may ask for the passphrase for your private key if you’ve set one.)

# Inside the instance

## Install Docker

- First, SSH into the instance: `ssh -i <key-location> cc@<floating-ip>`
- Once you are logged in to the instance, run the following commands:

```bash
curl -sSL https://get.docker.com/ | sudo sh
sudo groupadd -f docker; sudo usermod -aG docker $USER
```

- Restart the instance so that the new `docker` group membership takes effect.
- To check that Docker has been installed, run:
```bash 
docker run hello-world
```

## Install the NVIDIA Container Toolkit

Run the following commands in the instance (on the host, not inside a container):

```bash
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
    
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```
To test that everything is working, run:
```bash
docker run --rm --gpus all ubuntu nvidia-smi
```
This should show the GPU information of the system.

The commands in this section install the NVIDIA Container Toolkit so that Docker containers can access the NVIDIA drivers for GPU processing.
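
As a quick sanity check from Python on the instance, you can confirm that the command-line tools used in this section are on the `PATH`. This is a minimal sketch; `missing_tools` is a hypothetical helper, and finding a tool on the `PATH` does not prove that GPU access from containers works.

```python
import shutil


def missing_tools(tools=("docker", "nvidia-smi", "nvidia-ctk")):
    """Return the names from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


print(missing_tools())  # an empty list means all three were found
```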

## Clone the Git Repository

Clone the repository and fetch its submodules:
```bash
git clone https://github.com/AguLeon/MCPWorld

# To get all the submodules
cd MCPWorld
git submodule update --init PC-Canary

git submodule update --init --recursive
```

## The Test Environment
To set up the test environment, follow the `README.md` in https://github.com/AguLeon/MCPWorld

# Delete resources

Chameleon is a shared facility, and it is important to be mindful of your resource usage and to “free” resources for use by other experimenters when you are finished with them. Your resource will be deleted automatically at the end of your lease, but if you finish sooner, you should delete the compute instance and the lease.

In the cell below, uncomment both lines of code, then run the cell to free

-   the VM and the network address you attached to it.
-   and the reservation.

Note that removing the resources will revoke your access to them, and all the information stored on them will be erased. Therefore, ensure that you have saved all your work before deleting the resources.

In [None]:
# s.delete()
# l.delete()

Alternatively, you can delete your instance using the GUI:

-   From the [Chameleon website](https://chameleoncloud.org/), click on “Experiment \> KVM@TACC” in the menu (since that is the site that our instance is on).
-   Select “Instances” from the menu on the left side.
-   Find your instance in the list. If the project that you are part of has many instances, you can filter by name to make it easier to find yours: change the filter criteria to “Instance Name”, put part of your instance name in the text input field, and click “Filter”.
-   Check the box next to *your* instance (make sure not to select someone else’s!)
-   and press the red “Delete Instances” button.

and you can similarly delete a lease using the GUI:

-   Select “Leases” from the menu on the left side.
-   Find your lease in the list.
-   Check the box next to *your* lease (make sure not to select someone else’s!)
-   and press the red “Delete Lease” button.