# Launching a Jupyter Notebook Manually on Chameleon
1. Go to the Chameleon Dashboard
Visit: https://www.chameleoncloud.org/

Sign in and choose your project allocation (if you’re in a project group, select the correct one)

2. Launch a Jupyter Environment
In the top menu, click "User Interfaces" → "Jupyter"

Choose your site (e.g., CHI@UC or CHI@TACC)

Click "Launch Jupyter"

Select a CUDA-enabled environment (like CC-Ubuntu24.04-CUDA or CC-Ubuntu20.04-CUDA)

Under Extra Options, you can specify:

-   Number of vCPUs (4–8 is fine)
-   RAM (16 GB+ recommended)
-   GPU (A100 if available and you plan to fine-tune)

3. Once It’s Launched
You’ll be redirected to a JupyterLab interface

Open a terminal inside JupyterLab

Clone your repo (or download notebooks directly):

    git clone https://github.com/your-username/your-repo.git
    cd your-repo


## Bring up a GPU server

At the beginning of the lease time, we will bring up our GPU server. We will use `python-chi`, the Python API for Chameleon, to provision our server.

We will execute the cells in this notebook inside the Chameleon Jupyter environment.

Run the following cell, and make sure the correct project is selected:

In [4]:
from chi import server, context, lease
import os

context.version = "1.0"
context.choose_project()
context.choose_site(default="CHI@UC")



Change the string in the following cell to reflect the name of *your* lease (**with your own net ID**), then run it to get your lease:


In [5]:
l = lease.get_lease("project32_finetune") # or llm_single_netID, or llm_multi_netID
l.show()


Lease Details:
Name: project32_finetune
ID: 76e1d149-5f25-4ad2-8f5c-7c8be8175e87
Status: ACTIVE
Start Date: 2025-04-30 14:00:00
End Date: 2025-04-30 23:00:00
User ID: 56d5c544b0f21fe6899983b50641e15cc9362aa5af4f0452f1c4818715e10e3f
Project ID: 7c0a7a1952e44c94aa75cae1ff5dc9b4

Node Reservations:
ID: a5680ee8-bca7-45a6-8247-99744be87fda, Status: active, Min: 1, Max: 1

Floating IP Reservations:

Network Reservations:

Events:



The status should show as “ACTIVE” now that we are past the lease start time.
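
As a quick sanity check on the lease window, the start and end timestamps can be compared against a given time. The sketch below is only illustrative: `lease_window_contains` is a hypothetical helper, not part of `python-chi`, and it assumes the timestamp layout printed in the lease details above, with all times in the same timezone.

```python
from datetime import datetime

# Timestamp layout as printed in the lease details above (an assumption,
# not a documented guarantee of the API).
LEASE_TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def lease_window_contains(start: str, end: str, now: datetime) -> bool:
    """Return True if `now` falls inside the [start, end) lease window."""
    start_dt = datetime.strptime(start, LEASE_TIME_FORMAT)
    end_dt = datetime.strptime(end, LEASE_TIME_FORMAT)
    return start_dt <= now < end_dt


# Values copied from the lease details shown above.
start, end = "2025-04-30 14:00:00", "2025-04-30 23:00:00"
inside = lease_window_contains(start, end, datetime(2025, 4, 30, 19, 15, 51))
before = lease_window_contains(start, end, datetime(2025, 4, 30, 13, 59, 59))
print(inside, before)  # True False
```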

The rest of this notebook can be executed without any interactions from you, so at this point, you can save time by clicking on this cell, then selecting Run \> Run Selected Cell and All Below from the Jupyter menu.

As the notebook executes, monitor its progress to make sure it does not get stuck on an execution error, and to see what it is doing!

We will use the lease to bring up a server with the `CC-Ubuntu24.04-CUDA` disk image. (Note that the reservation information is passed when we create the instance!) This typically takes about 10 minutes, but can take up to 20.

In [6]:
username = os.getenv('USER') # all exp resources will have this prefix
s = server.Server(
    f"node-llm-{username}",
    reservation_id=l.node_reservations[0]["id"],
    image_name="CC-Ubuntu24.04-CUDA"
)
s.submit(idempotent=True)

Waiting for server node-llm-ks7406_nyu_edu's status to become ACTIVE. This typically takes 10 minutes, but can take up to 20 minutes.



Server has moved to status ACTIVE


Attribute,node-llm-ks7406_nyu_edu
Id,650a117c-78ac-4158-a447-490c04587b81
Status,ACTIVE
Image Name,CC-Ubuntu24.04-CUDA
Flavor Name,baremetal
Addresses,sharednet1:  IP: 10.140.82.109 (v4)  Type: fixed  MAC: 14:23:f2:a3:f8:00
Network Name,sharednet1
Created At,2025-04-30T19:15:51Z
Keypair,ks7406_nyu_edu-jupyter
Reservation Id,a5680ee8-bca7-45a6-8247-99744be87fda
Host Id,c15c5d0cd98629a41c320d11364f137f4320899eed52f609fb88500c


Note: security groups are not used at Chameleon bare metal sites, so we do not have to configure any security groups on this instance.

Then, we’ll associate a floating IP with the instance, so that we can access it over SSH.


In [7]:
s.associate_floating_ip()
s.refresh()
s.check_connectivity()
s.refresh()
s.show(type="widget")

Checking connectivity to 192.5.87.113 port 22.



Connection successful


Attribute,node-llm-ks7406_nyu_edu
Id,650a117c-78ac-4158-a447-490c04587b81
Status,ACTIVE
Image Name,CC-Ubuntu24.04-CUDA
Flavor Name,baremetal
Addresses,sharednet1:  IP: 10.140.82.109 (v4)  Type: fixed  MAC: 14:23:f2:a3:f8:00  IP: 192.5.87.113 (v4)  Type: floating  MAC: 14:23:f2:a3:f8:00
Network Name,sharednet1
Created At,2025-04-30T19:15:51Z
Keypair,ks7406_nyu_edu-jupyter
Reservation Id,a5680ee8-bca7-45a6-8247-99744be87fda
Host Id,c15c5d0cd98629a41c320d11364f137f4320899eed52f609fb88500c
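
The connectivity check above reports success once port 22 on the floating IP accepts connections. A rough standalone equivalent, using only the Python standard library, might look like the sketch below. This is not how `python-chi` implements `check_connectivity`; `wait_for_port` and its timeout and interval defaults are hypothetical choices for illustration.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 300.0, interval: float = 5.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if the port is closed or unreachable.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example (substitute the floating IP of your own server for A.B.C.D):
# wait_for_port("A.B.C.D", 22)
```

Polling like this is useful because the SSH daemon may start some time after the instance reaches ACTIVE status.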


## Set up Docker with NVIDIA container toolkit

To use common deep learning frameworks like TensorFlow or PyTorch, we can run containers that have all the prerequisite libraries necessary for these frameworks. Here, we will set up the container framework.

In [8]:
s.execute("curl -sSL https://get.docker.com/ | sudo sh")
s.execute("sudo groupadd -f docker; sudo usermod -aG docker $USER")
s.execute("docker run hello-world")



# Executing docker install script, commit: 53a22f61c0628e58e1d6680b49e82993d304b449


+ sh -c apt-get -qq update >/dev/null
+ sh -c DEBIAN_FRONTEND=noninteractive apt-get -y -qq install ca-certificates curl >/dev/null
+ sh -c install -m 0755 -d /etc/apt/keyrings
+ sh -c curl -fsSL "https://download.docker.com/linux/ubuntu/gpg" -o /etc/apt/keyrings/docker.asc
+ sh -c chmod a+r /etc/apt/keyrings/docker.asc
+ sh -c echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu noble stable" > /etc/apt/sources.list.d/docker.list
+ sh -c apt-get -qq update >/dev/null
+ sh -c DEBIAN_FRONTEND=noninteractive apt-get -y -qq install docker-ce docker-ce-cli containerd.io docker-compose-plugin docker-ce-rootless-extras docker-buildx-plugin >/dev/null

Running kernel seems to be up-to-date.

The processor microcode seems to be up-to-date.

No services need to be restarted.

No containers need to be restarted.

No user sessions are running outdated binaries.

No VM guests are running outdated hypervisor (qemu) binaries on this host.
+ sh -c doc

Client: Docker Engine - Community
 Version:           28.1.1
 API version:       1.49
 Go version:        go1.23.8
 Git commit:        4eba377
 Built:             Fri Apr 18 09:52:14 2025
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          28.1.1
  API version:      1.49 (minimum version 1.24)
  Go version:       go1.23.8
  Git commit:       01f442b
  Built:            Fri Apr 18 09:52:14 2025
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.7.27
  GitCommit:        05044ec0a9a75232cad458027ca83437aae3f4da
 runc:
  Version:          1.2.5
  GitCommit:        v1.2.5-0-g59923ef
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0


To run Docker as a non-privileged user, consider setting up the
Docker daemon in rootless mode for your user:

    dockerd-rootless-setuptool.sh install

Visit https://docs.docker.com/go/rootless/ to learn about rootless mode.


T

Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
e6590344b1a5: Pulling fs layer
e6590344b1a5: Download complete
e6590344b1a5: Pull complete
Digest: sha256:c41088499908a59aae84b0a49c70e86f4731e588a737f1637e73c8c09d995654
Status: Downloaded newer image for hello-world:latest



Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (amd64)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/get-started/



<Result cmd='docker run hello-world' exited=0>

We will also install the NVIDIA container toolkit, with which we can access GPUs from inside our containers.


In [9]:
# get NVIDIA container toolkit
s.execute("curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list")
s.execute("sudo apt update")
s.execute("sudo apt-get install -y nvidia-container-toolkit")
s.execute("sudo nvidia-ctk runtime configure --runtime=docker")
s.execute("sudo systemctl restart docker")


deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://nvidia.github.io/libnvidia-container/stable/deb/$(ARCH) /
#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://nvidia.github.io/libnvidia-container/experimental/deb/$(ARCH) /






Get:1 https://nvidia.github.io/libnvidia-container/stable/deb/amd64  InRelease [1477 B]
Hit:2 https://download.docker.com/linux/ubuntu noble InRelease
Hit:3 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64  InRelease
Get:4 https://nvidia.github.io/libnvidia-container/stable/deb/amd64  Packages [18.6 kB]
Hit:5 http://security.ubuntu.com/ubuntu noble-security InRelease
Get:6 http://nova.clouds.archive.ubuntu.com/ubuntu noble InRelease [256 kB]
Get:7 http://nova.clouds.archive.ubuntu.com/ubuntu noble-updates InRelease [126 kB]
Get:8 http://nova.clouds.archive.ubuntu.com/ubuntu noble-backports InRelease [126 kB]
Fetched 528 kB in 1s (495 kB/s)
Reading package lists...
Building dependency tree...
Reading state information...
58 packages can be upgraded. Run 'apt list --upgradable' to see them.
Reading package lists...
Building dependency tree...
Reading state information...
The following additional packages will be installed:
  libnvidia-container-tools libnvidia-c

debconf: unable to initialize frontend: Dialog
debconf: (Dialog frontend will not work on a dumb terminal, an emacs shell buffer, or without a controlling terminal.)
debconf: falling back to frontend: Readline
debconf: unable to initialize frontend: Readline
debconf: (This frontend requires a controlling tty.)
debconf: falling back to frontend: Teletype
dpkg-preconfigure: unable to re-open stdin: 


Fetched 5849 kB in 2s (2656 kB/s)
Selecting previously unselected package libnvidia-container1:amd64.
(Reading database ... 113552 files and directories currently installed.)
Preparing to unpack .../libnvidia-container1_1.17.6-1_amd64.deb ...
Unpacking libnvidia-container1:amd64 (1.17.6-1) ...
Selecting previously unselected package libnvidia-container-tools.
Preparing to unpack .../libnvidia-container-tools_1.17.6-1_amd64.deb ...
Unpacking libnvidia-container-tools (1.17.6-1) ...
Selecting previously unselected package nvidia-container-toolkit-base.
Preparing to unpack .../nvidia-container-toolkit-base_1.17.6-1_amd64.deb ...
Unpacking nvidia-container-toolkit-base (1.17.6-1) ...
Selecting previously unselected package nvidia-container-toolkit.
Preparing to unpack .../nvidia-container-toolkit_1.17.6-1_amd64.deb ...
Unpacking nvidia-container-toolkit (1.17.6-1) ...
Setting up nvidia-container-toolkit-base (1.17.6-1) ...
Setting up libnvidia-container1:amd64 (1.17.6-1) ...
Setting up lib

debconf: unable to initialize frontend: Dialog
debconf: (Dialog frontend will not work on a dumb terminal, an emacs shell buffer, or without a controlling terminal.)
debconf: falling back to frontend: Readline
debconf: unable to initialize frontend: Readline
debconf: (This frontend requires a controlling tty.)
debconf: falling back to frontend: Teletype

Running kernel seems to be up-to-date.

The processor microcode seems to be up-to-date.

No services need to be restarted.

No containers need to be restarted.

No user sessions are running outdated binaries.

No VM guests are running outdated hypervisor (qemu) binaries on this host.
time="2025-04-30T19:28:58Z" level=info msg="Config file does not exist; using empty config"
time="2025-04-30T19:28:58Z" level=info msg="Wrote updated config to /etc/docker/daemon.json"
time="2025-04-30T19:28:58Z" level=info msg="It is recommended that docker daemon be restarted."


<Result cmd='sudo systemctl restart docker' exited=0>
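
The log above shows `nvidia-ctk` writing an updated config to `/etc/docker/daemon.json`. One way to confirm that the NVIDIA runtime was registered is to inspect that file's contents. The sketch below assumes the file contains a top-level `runtimes` object with an `nvidia` entry; the sample JSON is illustrative, not captured output, and `has_nvidia_runtime` is a hypothetical helper.

```python
import json


def has_nvidia_runtime(daemon_json_text: str) -> bool:
    """Return True if the Docker daemon config text registers a runtime named 'nvidia'."""
    config = json.loads(daemon_json_text)
    return "nvidia" in config.get("runtimes", {})


# Illustrative config (an assumption about the file's shape, not captured output).
sample = '{"runtimes": {"nvidia": {"args": [], "path": "nvidia-container-runtime"}}}'
print(has_nvidia_runtime(sample))  # True
print(has_nvidia_runtime("{}"))    # False
```

On the server, the text to check could come from running `cat /etc/docker/daemon.json`.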

In the following cell, we will verify that we can see our NVIDIA GPUs from inside a container, by passing `--gpus all`. (The `--rm` flag says to clean up the container and remove its filesystem when it finishes running.)


In [10]:
s.execute("docker run --rm --gpus all ubuntu nvidia-smi")

Unable to find image 'ubuntu:latest' locally
latest: Pulling from library/ubuntu
2726e237d1a3: Pulling fs layer
2726e237d1a3: Verifying Checksum
2726e237d1a3: Download complete
2726e237d1a3: Pull complete
Digest: sha256:1e622c5f073b4f6bfad6632f2616c7f59ef256e96fe78bf6a595d1dc4376ac02
Status: Downloaded newer image for ubuntu:latest


Wed Apr 30 19:29:23 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.05              Driver Version: 560.35.05      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA A100 80GB PCIe          Off |   00000000:27:00.0 Off |                    0 |
| N/A   63C    P0             65W /  300W |       1MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
                                                

<Result cmd='docker run --rm --gpus all ubuntu nvidia-smi' exited=0>
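
The table above is meant for reading by eye. For scripted checks, `nvidia-smi` also supports a CSV query mode, e.g. `nvidia-smi --query-gpu=name,memory.total --format=csv,noheader`. The sketch below parses that kind of output. The sample line is an assumed CSV rendering of the GPU shown above, and `parse_gpu_csv` is a hypothetical helper.

```python
def parse_gpu_csv(csv_text: str) -> list[dict]:
    """Parse `nvidia-smi --query-gpu=name,memory.total --format=csv,noheader` output."""
    gpus = []
    for line in csv_text.strip().splitlines():
        # Split on the last comma: the first field is the name, the second the memory size.
        name, memory = (field.strip() for field in line.rsplit(",", 1))
        gpus.append({"name": name, "memory_total": memory})
    return gpus


# Sample line built from the table above (assumed CSV rendering of the same GPU).
sample = "NVIDIA A100 80GB PCIe, 81920 MiB\n"
print(parse_gpu_csv(sample))
# [{'name': 'NVIDIA A100 80GB PCIe', 'memory_total': '81920 MiB'}]
```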

## Pull and start container

Let’s pull the container:


In [11]:
s.execute("docker pull quay.io/jupyter/pytorch-notebook:cuda12-pytorch-2.5.1")


cuda12-pytorch-2.5.1: Pulling from jupyter/pytorch-notebook
54609b48ebc1: Pulling fs layer
1bf84e16ee78: Pulling fs layer
70c528583d48: Pulling fs layer
4f4fb700ef54: Pulling fs layer
d20e2bbc3444: Pulling fs layer
e1236ab05074: Pulling fs layer
eb55310ee8e5: Pulling fs layer
247e1eb593d5: Pulling fs layer
1ac7d388a98d: Pulling fs layer
f27656f139fb: Pulling fs layer
ac0c7c38d6ad: Pulling fs layer
c48c65e32957: Pulling fs layer
d79a028c87d5: Pulling fs layer
1ac7d388a98d: Waiting
f27656f139fb: Waiting
e1236ab05074: Waiting
ac0c7c38d6ad: Waiting
eb55310ee8e5: Waiting
4f4fb700ef54: Waiting
d20e2bbc3444: Waiting
c48c65e32957: Waiting
97452dad12c2: Pulling fs layer
aad099df5a3e: Pulling fs layer
aad099df5a3e: Waiting
41e13f849bef: Pulling fs layer
3734b2c4ea95: Pulling fs layer
5d39b32f56c0: Pulling fs layer
41e13f849bef: Waiting
dfe267ba30ed: Pulling fs layer
92e65b90a905: Pulling fs layer
bc6b6634a9a2: Pulling fs layer
35eb378f4751: Pulling fs layer
3734b2c4ea95: Waiting
be517295261b: Pu

<Result cmd='docker pull quay.io/jupyter/pytorch-notebook:cuda12-pytorch-2.5.1' exited=0>

and get it running:

In [12]:
s.execute("docker run -d -p 8888:8888 --gpus all --name torchnb quay.io/jupyter/pytorch-notebook:cuda12-pytorch-2.5.1")


2682a5a2dfcb94ee58af188afccd2c48753293d93f28ee6c5a5e596b6d61afde


<Result cmd='docker run -d -p 8888:8888 --gpus all --name torchnb quay.io/jupyter/pytorch-notebook:cuda12-pytorch-2.5.1' exited=0>
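
To make the flags in the `docker run` command above easier to follow, here is a small sketch that assembles the same command line piece by piece, with a comment on each option. `jupyter_run_command` is a hypothetical helper written for this explanation. It only builds a string and does not start anything.

```python
import shlex


def jupyter_run_command(image: str, name: str = "torchnb", port: int = 8888) -> str:
    """Assemble the `docker run` command line used to start the notebook container."""
    args = [
        "docker", "run",
        "-d",                    # detach: run the container in the background
        "-p", f"{port}:{port}",  # publish the container port on the same host port
        "--gpus", "all",         # expose all GPUs via the NVIDIA container toolkit
        "--name", name,          # fixed name, so `docker logs <name>` works later
        image,
    ]
    return shlex.join(args)


cmd = jupyter_run_command("quay.io/jupyter/pytorch-notebook:cuda12-pytorch-2.5.1")
print(cmd)
```

The resulting string could be passed to `s.execute(...)` in place of the literal command.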

There’s one more thing we must do before we can use our Jupyter server. Rather than expose the Jupyter server to the Internet, we are going to set up an SSH tunnel from our local terminal to our server, and access the service through that tunnel.

Here’s how it works: In your *local* terminal, run

    ssh -L 8888:127.0.0.1:8888 -i ~/.ssh/id_rsa_chameleon cc@A.B.C.D

where,

-   instead of `~/.ssh/id_rsa_chameleon`, substitute the path to your key
-   and instead of `A.B.C.D`, substitute the floating IP associated with your server

This will configure the SSH session so that when you connect to port 8888 locally, it will be forwarded over the SSH tunnel to port 8888 on the host at the other end of the SSH connection.
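
If you prefer to build the tunnel command in a script on your local machine, the pieces described above can be assembled as in this sketch. `tunnel_args` is a hypothetical helper, and the default user `cc` and port 8888 are taken from the command shown above. The returned list is suitable for `subprocess.run`.

```python
import os


def tunnel_args(floating_ip: str, key_path: str, port: int = 8888, user: str = "cc") -> list[str]:
    """Build the ssh argument list that forwards local `port` to the same port on the server."""
    return [
        "ssh",
        "-L", f"{port}:127.0.0.1:{port}",   # local port -> 127.0.0.1:port as seen from the server
        "-i", os.path.expanduser(key_path),  # private key; "~" is expanded here, not by a shell
        f"{user}@{floating_ip}",
    ]


# Substitute your own key path and the floating IP of your server.
print(tunnel_args("192.5.87.113", "~/.ssh/id_rsa_chameleon"))
```

Passing the list to `subprocess.run` bypasses the shell, which is why the `~` in the key path is expanded explicitly.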

SSH tunneling is a convenient way to access services on a remote machine when you don’t necessarily want to expose those services to the Internet (for example: if they are not secured from unauthorized access).

Finally, run the following cell to view the container logs:


In [15]:
s.execute("docker logs torchnb")

Entered start.sh with args: start-notebook.py
Running hooks in: /usr/local/bin/start-notebook.d as uid: 1000 gid: 100
Done running hooks in: /usr/local/bin/start-notebook.d
Running hooks in: /usr/local/bin/before-notebook.d as uid: 1000 gid: 100
Sourcing shell script: /usr/local/bin/before-notebook.d/10activate-conda-env.sh
Done running hooks in: /usr/local/bin/before-notebook.d
Executing the command: start-notebook.py


[I 2025-04-30 19:33:23.965 ServerApp] jupyter_lsp | extension was successfully linked.
[I 2025-04-30 19:33:23.968 ServerApp] jupyter_server_mathjax | extension was successfully linked.
[I 2025-04-30 19:33:23.970 ServerApp] jupyter_server_terminals | extension was successfully linked.
[I 2025-04-30 19:33:23.973 ServerApp] jupyterlab | extension was successfully linked.
[I 2025-04-30 19:33:23.973 ServerApp] jupyterlab_git | extension was successfully linked.
[I 2025-04-30 19:33:23.975 ServerApp] nbclassic | extension was successfully linked.
[I 2025-04-30 19:33:23.975 ServerApp] nbdime | extension was successfully linked.
[I 2025-04-30 19:33:23.978 ServerApp] notebook | extension was successfully linked.
[I 2025-04-30 19:33:23.980 ServerApp] Writing Jupyter server cookie secret to /home/jovyan/.local/share/jupyter/runtime/jupyter_cookie_secret
[I 2025-04-30 19:33:24.131 ServerApp] notebook_shim | extension was successfully linked.
[I 2025-04-30 19:33:24.145 ServerApp] notebook_shim | ext

<Result cmd='docker logs torchnb' exited=0>

Look for the line of output in the form:

    http://127.0.0.1:8888/lab?token=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

and copy it. With the SSH tunnel open, paste it into a browser on your local machine to reach the Jupyter server.
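
Since the logs are long, the URL can also be picked out programmatically. The sketch below searches log text for a line of the form shown above. It assumes the token consists of word characters; the sample log is made up for illustration, and `find_jupyter_url` is a hypothetical helper.

```python
import re
from typing import Optional

# Matches the tokenized JupyterLab URL in the form shown above.
TOKEN_URL = re.compile(r"http://127\.0\.0\.1:8888/lab\?token=\w+")


def find_jupyter_url(log_text: str) -> Optional[str]:
    """Return the first tokenized JupyterLab URL found in the log text, or None."""
    match = TOKEN_URL.search(log_text)
    return match.group(0) if match else None


# Made-up log excerpt with a placeholder token, for illustration only.
sample_log = """
[I ServerApp] Jupyter Server is running at:
[I ServerApp]     http://127.0.0.1:8888/lab?token=0123456789abcdef
"""
print(find_jupyter_url(sample_log))
# http://127.0.0.1:8888/lab?token=0123456789abcdef
```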