# Start Lease
This notebook is adapted from ["Hello, Chameleon"](https://www.chameleoncloud.org/experiment/share/a10a1b51-51d7-4c6e-ba83-010a5cf759d6) by Fraida Fund.

In this notebook, you will pick up where you left off after creating a Chameleon account, joining a Chameleon project, and preparing a key pair. Now, you will learn how to:

-   Reserve resources in Chameleon
-   Access your reserved resources over SSH
-   Execute commands on your resources
-   Retrieve files saved on Chameleon resources
-   Extend your Chameleon lease (in case you need more time) or delete it (in case you finish early)

## Reserve resources

Whenever you run an experiment on Chameleon, you will

1.  Open a Python notebook, which includes commands to reserve and configure the resources (VMs, bare metal servers, or networks) that you need for your experiment. Run these commands.
2.  Wait until the resources in your experiment are ready to log in.
3.  Log in to the resources and run your experiment (either by executing commands in the notebook, or by using SSH in a terminal and running commands in those SSH sessions).

Also, when you finish an experiment and have saved all the data somewhere safe, you will *delete* the resources in your experiment to free them for use by other experimenters.

In this exercise, we will reserve a single virtual machine on Chameleon, and practice logging in to execute commands on this VM.

First, we will need to initialize the environment - tell it what Chameleon project to associate our experiment with.

You should already be a part of a Chameleon project, which has a project ID in the form “CHI-XXXXX”. If you don’t know your project ID, you can find it by logging in to the Chameleon web portal, and checking your [dashboard](https://chameleoncloud.org/user/dashboard/). When you run the next cell, you will see a drop-down menu for selecting your project.

We will also indicate which Chameleon site we want to use. Since this experiment uses a virtual machine, the site will be KVM@TACC - the only Chameleon site that supports VMs.

In [1]:
import chi, os, time, datetime
from chi import lease
from chi import server
from chi import context
from chi import hardware
from chi import network
from chi import storage
from chi.clients import cinder, nova, neutron
from chi.image import get_image


context.version = "1.0" 
context.choose_project()
context.choose_site(default="KVM@TACC")
username = os.getenv('USER') # all exp resources will have this suffix


Next, we’ll give our resource a name. Every resource in a project should have a unique name, so we will include a username, as well as a description of the experiment, in the name.

In [39]:
exp_name = "MCPWorld_AP_Project_single_gpu"
server_name = f"{exp_name}-{username}"
lease_name = f"{exp_name}-{username}"

key_name = "id_rsa"  # Specify the Key Pair name (to SSH to the instance)

## Creating a volume
Since we will need a bigger volume to run all the models, we first create a bootable 250 GB volume from the OS image.

In [20]:
# Look up the CC-Ubuntu24.04-CUDA image and get its ID
image = get_image("CC-Ubuntu24.04-CUDA")
image_id = image.uuid

In [22]:
# Create a volume from the image using the cinder client directly
volume = cinder().volumes.create(
    size=250,                    
    name=lease_name,
    description="Ubuntu 24.04-CUDA bootable volume. For coding and development tasks",
    imageRef=image_id,          
    volume_type="ceph-ssd",
)

NOTE: Wait until the volume status is `available` before using it! Creating a volume from an image may take a while.
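One way to wait for the volume is to poll its status in a loop. The following is a minimal sketch, not part of the original notebook: the helper name `wait_for_status`, its parameters, and the timeout values are assumptions chosen for illustration. It takes any function that returns the current status, so it can be tried without a Chameleon connection.

```python
import time


def wait_for_status(get_status, target="available", failed=("error",),
                    timeout=600, interval=5, sleep=time.sleep):
    """Poll get_status() until it returns `target`.

    Raises RuntimeError if a status listed in `failed` is seen, and
    TimeoutError if `target` is not reached after about `timeout` seconds.
    """
    waited = 0
    while True:
        status = get_status()
        if status == target:
            return status
        if status in failed:
            raise RuntimeError(f"resource entered status {status!r}")
        if waited >= timeout:
            raise TimeoutError(f"still {status!r} after {timeout} s")
        sleep(interval)
        waited += interval


# Possible use in this notebook (assumes `volume` from the cell above):
# wait_for_status(lambda: cinder().volumes.get(volume.id).status)
```

Passing the status lookup as a function keeps the polling logic separate from the Cinder client, which makes the loop easy to reuse for other resources.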

Now we are ready to ask Chameleon to allocate a resource to us! For a VM, we specify the “flavor” or size of the resource (in terms of CPU, memory, and storage) and the operating system image that we want to have pre-installed.

First we will reserve the VM instance for 5 days, starting now:

In [5]:
flavor_name = "g1.h100.pci.1"  # Choose the resource you want to reserve
# flavor_name = "g1.h100.pci.4"
# flavor_name = "m1.small"
hours = 5 * 24  # As much time as you need

In [6]:
l = lease.Lease(lease_name, duration=datetime.timedelta(hours=hours))  
l.add_flavor_reservation(id=chi.server.get_flavor_id(flavor_name), amount=1)
l.submit(idempotent=True)

Waiting for lease to start...



Lease MCPWorld_AP_Project_single_gpu-arn8147_nyu_edu has reached status active


In [7]:
l.show()


Lease Details:
Name: MCPWorld_AP_Project_single_gpu-arn8147_nyu_edu
ID: 804a994b-f7ea-4773-b2fd-3cff66cfafd9
Status: ACTIVE
Start Date: 2026-01-30 19:34:00
End Date: 2026-02-04 19:34:00
User ID: e3daefa0fc353dc1d7aaa21f0af4b64aa299482d36fd32ef3332c3966d6e4667
Project ID: 13a1ac1ce275484caedc3394339486a1

Node Reservations:

Floating IP Reservations:

Network Reservations:

Flavor Reservations:
ID: 2bc156d1-376c-4ae8-a8ae-f64e3715c258, Status: active, Flavor: 2bc156d1-376c-4ae8-a8ae-f64e3715c258, Amount: 1

Events:
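As a quick sanity check of the lease dates shown above, the end date is simply the start date plus the requested duration. This small sketch reuses the `hours` value from the earlier cell and the start date printed in the lease details; it does not contact Chameleon.

```python
import datetime

hours = 5 * 24  # same value as in the reservation cell
duration = datetime.timedelta(hours=hours)

# Start date copied from the lease details above
start = datetime.datetime(2026, 1, 30, 19, 34)
end = start + duration

print(duration.days)  # 5
print(end)            # 2026-02-04 19:34:00
```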


Then we can launch the VM:

In [16]:
# We call nova().servers.create directly to create the server,
# because it makes it easy to boot from an existing volume.
help(nova().servers.create)

Help on method create in module novaclient.v2.servers:

create(name, image, flavor, meta=None, files=None, reservation_id=False, min_count=None, max_count=None, security_groups=None, userdata=None, key_name=None, availability_zone=None, block_device_mapping=None, block_device_mapping_v2=None, nics=None, scheduler_hints=None, config_drive=None, disk_config=None, admin_pass=None, access_ip_v4=None, access_ip_v6=None, description=None, tags=None, trusted_image_certificates=None, host=None, hypervisor_hostname=None, hostname=None) method of novaclient.v2.servers.ServerManager instance
    Create (boot) a new server.

    In order to create a server with pre-existing ports that contain a
    ``resource_request`` value, such as for guaranteed minimum bandwidth
    quality of service support, microversion ``2.72`` is required.

    :param name: Something to name the server.
    :param image: The :class:`Image` to boot with.
    :param flavor: The :class:`Flavor` to boot onto.
    :param meta:

In [63]:
# Get network
networks = neutron().list_networks(name='sharednet1')['networks']
network_id = networks[0]['id']

# Block device mapping
block_device_mapping_v2 = [{
    'boot_index': 0,
    'uuid': volume.id,  # You can replace this id with a pre-existing Volume ID as well!
    'source_type': 'volume',
    'destination_type': 'volume',
    'delete_on_termination': False,
    'volume_size': volume.size,  # Optional 
}]

# Get the reserved flavor from the lease; its name has the form reservation:<reservation_id>
reserved_flavor = l.get_reserved_flavors()[0]
reservation_id = reserved_flavor.id

s = nova().servers.create(
    name=server_name,
    image="",  # Empty string when booting from a volume
    flavor=reserved_flavor.name,  # e.g. reservation:<reservation_id>
    nics=[{"net-id": network_id}],
    key_name=key_name,  # Name of the key pair registered at this Chameleon site
    block_device_mapping_v2=block_device_mapping_v2,
)

print(f"Server created: {s.id}")

The python binding code in neutronclient is deprecated in favor of OpenstackSDK, please use that as this will be removed in a future release.


Server created: f8f1dd74-f1ae-4ca1-8f28-056f15cd9b9c


In [64]:
# Get free floating ip
neutron_client = neutron()
floating_ips = neutron_client.list_floatingips()['floatingips']
available = [fip for fip in floating_ips if fip['port_id'] is None]
floating_ip = available[0]
floating_ip



{'id': '23b06ca2-8751-4b95-946c-93499c193790',
 'tenant_id': '13a1ac1ce275484caedc3394339486a1',
 'floating_ip_address': '129.114.25.14',
 'floating_network_id': '69adad42-e10e-4e34-ab68-62cbe7fc23b1',
 'router_id': None,
 'port_id': None,
 'fixed_ip_address': None,
 'status': 'DOWN',
 'description': 'Cloud IP for arn8147',
 'port_details': None,
 'tags': [],
 'created_at': '2026-01-28T23:00:34Z',
 'updated_at': '2026-01-30T21:27:28Z',
 'revision_number': 4,
 'project_id': '13a1ac1ce275484caedc3394339486a1'}
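The cell above takes `available[0]`, which raises an `IndexError` when the project has no unassociated floating IP. A slightly more defensive selection could look like the sketch below. The helper name `pick_free_floating_ip` and the sample records are hypothetical; the records only mimic the `port_id` and `floating_ip_address` fields seen in the output above.

```python
def pick_free_floating_ip(floating_ips):
    """Return the first floating IP record not bound to a port, or None."""
    for fip in floating_ips:
        if fip.get("port_id") is None:
            return fip
    return None


# Hypothetical sample records with only the fields used here
sample = [
    {"floating_ip_address": "192.0.2.10", "port_id": "port-a"},
    {"floating_ip_address": "192.0.2.11", "port_id": None},
]

free = pick_free_floating_ip(sample)
print(free["floating_ip_address"] if free else "no free floating IP")
```

If the helper returns `None`, a new floating IP has to be allocated to the project before continuing.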

Once the resource is allocated and ready, we will associate a network address with it, so that we can log in to the resource over the Internet using the SSH protocol.

There’s one more step before we can log in to the resource - by default, all connections to VM resources are blocked, as a security measure. We will need to create a “security group” in our project that permits SSH connections (if one does not already exist), then attach this security group to our VM resource.

In [65]:
# Associate the floating IP with the server's first network port
port_id = s.interface_list()[0].port_id
neutron_client.update_floatingip(floating_ip['id'], {
    'floatingip': {
        'port_id': port_id,
    }
})

{'floatingip': {'id': '23b06ca2-8751-4b95-946c-93499c193790',
  'tenant_id': '13a1ac1ce275484caedc3394339486a1',
  'floating_ip_address': '129.114.25.14',
  'floating_network_id': '69adad42-e10e-4e34-ab68-62cbe7fc23b1',
  'router_id': '078757ca-5b95-4d54-a671-fe35d5ac37e3',
  'port_id': 'e548748b-b64c-492a-beea-db6e8a39b399',
  'fixed_ip_address': '10.56.2.41',
  'status': 'DOWN',
  'description': 'Cloud IP for arn8147',
  'port_details': {'name': '',
   'network_id': '50073c73-5817-49c3-8e3a-69b8c357e158',
   'mac_address': 'fa:16:3e:ff:4a:65',
   'admin_state_up': True,
   'status': 'ACTIVE',
   'device_id': 'f8f1dd74-f1ae-4ca1-8f28-056f15cd9b9c',
   'device_owner': 'compute:nova'},
  'tags': [],
  'created_at': '2026-01-28T23:00:34Z',
  'updated_at': '2026-01-30T21:35:08Z',
  'revision_number': 5,
  'project_id': '13a1ac1ce275484caedc3394339486a1'}}

In [66]:
nova_client = nova()
attached_sgs = [sg["name"] for sg in nova_client.servers.get(s.id).security_groups]  # Names of the security groups already attached to the server

In [67]:
sg_list = network.list_security_groups(name_filter="allow-ssh")
if sg_list: # allow-ssh already exists
    sg = sg_list[0]
else:       # create allow-ssh
    sg = network.SecurityGroup({"name": "allow-ssh", "description": "Enable SSH traffic on TCP port 22"})
    sg.add_rule("ingress", "tcp", 22)
    sg.submit()
if sg.name not in attached_sgs:
    nova_client.servers.add_security_group(s.id, sg.name)



In [68]:
# Add HTTP (only if an allow-http security group exists in the project)
sg_http_list = network.list_security_groups(name_filter="allow-http")
if sg_http_list and sg_http_list[0].name not in attached_sgs:
    nova_client.servers.add_security_group(s.id, sg_http_list[0].name)



In [69]:
# Extra ports needed by the demo
ports = [8080, 5900, 8501, 6080]
for port in ports:
    sg_extra_name = f"allow-{port}"
    sg_list = network.list_security_groups(name_filter=sg_extra_name)
    if sg_list: # allow-{port} already exists
        sg = sg_list[0]
    else:       # create allow-{port}
        sg = network.SecurityGroup({"name": sg_extra_name, "description": f"Enable traffic on TCP port {port}"})
        sg.add_rule("ingress", "tcp", port)
        sg.submit()
    if sg.name not in attached_sgs:
        nova_client.servers.add_security_group(s.id, sg.name)



> [!NOTE]
> The demo is accessed on port 6080 (VNC server) and port 8501 (Streamlit interface).
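The security group cells above all follow the same pattern: create the `allow-<port>` group if the project does not have it, then attach it if the server does not have it. The sketch below isolates that decision logic as a pure function so it can be checked without touching Chameleon. The function name `plan_security_groups` and the example project and server state are hypothetical.

```python
def plan_security_groups(ports, existing, attached):
    """Return (to_create, to_attach): group names to create and to attach."""
    names = [f"allow-{port}" for port in ports]
    to_create = [name for name in names if name not in existing]
    to_attach = [name for name in names if name not in attached]
    return to_create, to_attach


to_create, to_attach = plan_security_groups(
    ports=[8080, 5900, 8501, 6080],
    existing={"allow-ssh", "allow-8080"},  # hypothetical project state
    attached={"default", "allow-ssh"},     # hypothetical server state
)
print(to_create)  # ['allow-5900', 'allow-8501', 'allow-6080']
print(to_attach)  # ['allow-8080', 'allow-5900', 'allow-8501', 'allow-6080']
```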

That’s all we need to do to prepare a resource for login! The VM may still need a little time to boot before it accepts SSH connections.
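One way to tell when the VM is ready is to poll TCP port 22 until a connection succeeds. The following is a minimal sketch using only the Python standard library; the function name `wait_for_tcp` and the timeout values are assumptions, not part of the original notebook.

```python
import socket
import time


def wait_for_tcp(host, port, timeout=600, interval=5):
    """Return True once a TCP connection to (host, port) succeeds.

    Returns False if no connection succeeds before `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Open and immediately close a connection to test reachability
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Possible use in this notebook (assumes `floating_ip` from the cells above):
# wait_for_tcp(floating_ip["floating_ip_address"], 22)
```

A successful connection only shows that the port is open; the SSH login itself is tested in the next step.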

### Log in over SSH from Jupyter environment

One of the easiest ways to log in to your VM is to open a shell inside the Jupyter environment, and log in over SSH from that shell.

In the Chameleon JupyterHub environment, click File \> New \> Terminal. This will open another tab in the Jupyter environment, with a shell session.

Now, run this cell to get the SSH login command. Copy the output of the cell:

In [70]:
print(f"ssh cc@{floating_ip.get('floating_ip_address')}")

ssh cc@129.114.25.14


then switch to your terminal shell tab, paste the SSH login command, and hit Enter.

The first time you log in to each new host, you may see a warning similar to the following:

``` shell
The authenticity of host "129.114.26.xx (129.114.26.xx)" cannot be established.
ED25519 key fingerprint is SHA256:1fcbGrgLDdOeorauhz3CTyhmFqOHsrEWlu0TZ6yGoDM.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])?
```

and you will have to type the word *yes* and hit Enter to continue.

Then, you’ll be logged in! To validate that you are logged in to the remote host, and not running commands directly in the Jupyter shell environment, run

``` shell
hostname
```

and verify that the output corresponds to the server name we assigned (`server_name`), not the Jupyter host. (Characters that are not valid in hostnames, such as underscores, may appear changed.)

### Log in over SSH from local terminal

To log in to the VM over SSH from your local terminal, you will follow a similar process:

-   open the terminal application *installed on your computer*,
-   run the cell below, which will print an SSH login command,
-   copy this command and make any necessary modifications (if needed, as described in the following cell),
-   paste it into your terminal and hit Enter.

In this case, you will specify the key location as part of the SSH command. These instructions assume that, as described in the previous steps, you have created a key pair named `id_rsa` (the value of `key_name` set earlier), put it in the default `.ssh` subdirectory in your home directory, and uploaded it to the KVM@TACC web interface.

In [71]:
print(f"ssh -i ~/.ssh/{key_name} cc@{floating_ip.get('floating_ip_address')}")

ssh -i ~/.ssh/id_rsa cc@129.114.25.14


If your Chameleon key is in a different location, or has a different name, then you may need to modify the `~/.ssh/{key_name}` part of this command to point to *your* key.
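If `key_name` is empty when the cell runs, the printed command has no key file after `~/.ssh/` and will not work as intended. A small guard can catch this early. This is a sketch only; the helper name `ssh_command` is hypothetical, and the default user `cc` is taken from the commands above.

```python
def ssh_command(ip, key_name, user="cc"):
    """Build the SSH login command, refusing an empty key name."""
    if not key_name:
        raise ValueError("key_name is empty; set it to the name of your private key file")
    return f"ssh -i ~/.ssh/{key_name} {user}@{ip}"


print(ssh_command("129.114.25.14", "id_rsa"))  # ssh -i ~/.ssh/id_rsa cc@129.114.25.14
```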

The first time you log in to each new host, your computer may display a warning similar to the following:

``` shell
The authenticity of host "129.114.26.xx (129.114.26.xx)" cannot be established.
ED25519 key fingerprint is SHA256:1fcbGrgLDdOeorauhz3CTyhmFqOHsrEWlu0TZ6yGoDM.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])?
```

and you will have to type the word *yes* and hit Enter to continue.

If you have specified your key path and other details correctly, it won’t ask you for a password when you log in to the resource. (It may ask for the passphrase for your private key if you’ve set one.)

# Install Docker

- First, SSH into the instance: `ssh -i {key-location} cc@<floating-ip>`
- Once you are logged in to the instance, run the following commands:

```bash
curl -sSL https://get.docker.com/ | sudo sh
sudo groupadd -f docker; sudo usermod -aG docker $USER
```

- Restart the instance (or log out and log back in) so that the new group membership takes effect
- To check that Docker has been installed, run:
```bash 
docker run hello-world
```

### Installing NVIDIA container toolkit

Run the following commands on the instance (on the host, not inside a container):

```bash
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
    
sudo apt update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```
To test that everything is working, run:
```bash
docker run --rm --gpus all ubuntu nvidia-smi
```
This should print the GPU information of the system.

The steps above install the NVIDIA Container Toolkit and configure Docker to use it, so that containers can access the host's NVIDIA drivers and GPUs.

# Clone the Git Repository

Clone the repository and initialize its submodules:
```bash
git clone https://github.com/AguLeon/MCPWorld

# To get all the submodules
cd MCPWorld
git submodule update --init PC-Canary

git submodule update --init --recursive
```

## The Test Environment
To set up the test environment, follow the `README.md` in https://github.com/AguLeon/MCPWorld

## Delete resources

Chameleon is a shared facility, and it is important to be mindful of your resource usage and to “free” resources for use by other experimenters when you are finished with them. Your resource will be deleted automatically at the end of your lease, but if you finish sooner, you should delete the compute instance and the lease.

In the cell below, uncomment both lines of code, then run the cell to free

-   the VM and the network address you attached to it.
-   and the reservation.

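Note that removing the resources will revoke your access to them. The boot volume created earlier was attached with `delete_on_termination` set to `False`, so it is not deleted together with the server; delete it separately once you no longer need it. Ensure that you have saved all your work before deleting the resources.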

In [None]:
# s.delete()
# l.delete()

Alternatively, you can delete your instance using the GUI:

-   From the [Chameleon website](https://chameleoncloud.org/), click on “Experiment \> KVM@TACC” in the menu (since that is the site that our instance is on).
-   Select “Instances” from the menu on the left side.
-   Find your instance in the list. If the project that you are part of has many instances, you can filter by name to make it easier to find yours: change the filter criteria to “Instance Name”, put part of your instance name in the text input field, and click “Filter”.
-   Check the box next to *your* instance (make sure not to select someone else’s!)
-   and press the red “Delete Instances” button.

and you can similarly delete a lease using the GUI:

-   Select “Leases” from the menu on the left side.
-   Find your lease in the list.
-   Check the box next to *your* lease (make sure not to select someone else’s!)
-   and press the red “Delete Lease” button.