# Launch and set up a VM instance with python-chi

In [10]:
from chi import server, context
import chi, os, time, datetime

context.version = "1.0" 
context.choose_project()
context.choose_site(default="KVM@TACC")


### Bring up an m1.xxlarge flavor server with the CC-Ubuntu24.04 disk image

In [11]:
username = os.getenv('USER') 
s = server.Server(
    "node-data-pipeline-project38",
    image_name="CC-Ubuntu24.04",
    flavor_name="m1.xxlarge",
    key_name="id_rsa_chameleon_project_g38"
)
s.submit(idempotent=True)

Waiting for server node-data-pipeline-project38's status to become ACTIVE. This typically takes 10 minutes, but can take up to 20 minutes.



Server has moved to status ACTIVE


| Attribute | node-data-pipeline-project38 |
| --- | --- |
| Id | 2d4674c4-7a91-4cbb-9e51-97c2911d0dc8 |
| Status | ACTIVE |
| Image Name | CC-Ubuntu24.04 |
| Flavor Name | m1.xxlarge |
| Addresses | sharednet1: IP: 10.56.2.66 (v4) Type: fixed MAC: fa:16:3e:99:ad:52; IP: 129.114.25.229 (v4) Type: floating MAC: fa:16:3e:99:ad:52 |
| Network Name | sharednet1 |
| Created At | 2025-05-08T01:41:20Z |
| Keypair | id_rsa_chameleon_project_g38 |
| Reservation Id | |
| Host Id | be7935f5d797055105aa0c531a23cd1ba738cda919948be6fb1dfe55 |


<chi.server.Server at 0x7ff025570520>

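Here we associate a floating IP with the server we just created. If the server already has a floating IP (as the Addresses row above shows for this run), the call may fail with a ResourceError like the one below; in that case the existing address can be used as-is.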

In [13]:
s.associate_floating_ip()

ResourceError: None of the ports can route to floating ip 129.114.24.223 on server 2d4674c4-7a91-4cbb-9e51-97c2911d0dc8

In the output below, we will make a note of the floating IP that has been assigned to our instance (in the “Addresses” row).

In [4]:
s.refresh()
s.show(type="widget")

| Attribute | node-data-pipeline-project38 |
| --- | --- |
| Id | 2d4674c4-7a91-4cbb-9e51-97c2911d0dc8 |
| Status | ACTIVE |
| Image Name | CC-Ubuntu24.04 |
| Flavor Name | m1.xxlarge |
| Addresses | sharednet1: IP: 10.56.2.66 (v4) Type: fixed MAC: fa:16:3e:99:ad:52; IP: 129.114.25.229 (v4) Type: floating MAC: fa:16:3e:99:ad:52 |
| Network Name | sharednet1 |
| Created At | 2025-05-08T01:41:20Z |
| Keypair | id_rsa_chameleon_project_g38 |
| Reservation Id | |
| Host Id | be7935f5d797055105aa0c531a23cd1ba738cda919948be6fb1dfe55 |
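
Rather than copying the floating IP by eye, it can be pulled out of the Addresses text. The following is a minimal sketch that only parses a string in the layout shown above; `floating_ips` is a hypothetical helper name, not part of python-chi, and the layout of the Addresses row is assumed to stay as in this output.

```python
import re

def floating_ips(addresses: str) -> list[str]:
    """Return every IPv4 address labelled 'Type: floating' in an Addresses string."""
    pattern = re.compile(
        r"IP:\s*(\d{1,3}(?:\.\d{1,3}){3})\s*\(v4\)\s*Type:\s*floating"
    )
    return pattern.findall(addresses)

# Addresses text copied from the output above.
addresses = (
    "sharednet1:  IP: 10.56.2.66 (v4)  Type: fixed  MAC: fa:16:3e:99:ad:52  "
    "IP: 129.114.25.229 (v4)  Type: floating  MAC: fa:16:3e:99:ad:52"
)
print(floating_ips(addresses))
```

For the output above this prints a one-element list containing 129.114.25.229; the fixed address 10.56.2.66 is skipped because its type is not floating.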


The following security groups will be created (if they do not already exist in our project) and then added to our server:

In [5]:
security_groups = [
  {'name': "allow-ssh", 'port': 22, 'description': "Enable SSH traffic on TCP port 22"},
  {'name': "allow-8888", 'port': 8888, 'description': "Enable TCP port 8888 (used by Jupyter)"},
  {'name': "allow-8000", 'port': 8000, 'description': "Enable TCP port 8000 (used by MLFlow)"},
  {'name': "allow-9000", 'port': 9000, 'description': "Enable TCP port 9000 (used by MinIO API)"},
  {'name': "allow-9001", 'port': 9001, 'description': "Enable TCP port 9001 (used by MinIO Web UI)"}
]

In [6]:
# configure openstacksdk for actions unsupported by python-chi
os_conn = chi.clients.connection()
nova_server = chi.nova().servers.get(s.id)

for sg in security_groups:
    # create the group and its single-port TCP rule only if the group does not exist yet
    if not os_conn.get_security_group(sg['name']):
        os_conn.create_security_group(sg['name'], sg['description'])
        os_conn.create_security_group_rule(
            sg['name'],
            port_range_min=sg['port'],
            port_range_max=sg['port'],
            protocol='tcp',
            remote_ip_prefix='0.0.0.0/0',
        )

    nova_server.add_security_group(sg['name'])

print(f"updated security groups: {[group.name for group in nova_server.list_security_group()]}")


updated security groups: ['allow-8000', 'allow-8888', 'allow-9000', 'allow-9001', 'allow-ssh', 'default']
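
Each rule created above opens exactly one TCP port, because the same port is passed as both `port_range_min` and `port_range_max`. As a sketch of that mapping, the helper below turns one entry of `security_groups` into the keyword arguments used in the cell above. `rule_kwargs` is a hypothetical name introduced for illustration, and the port range check is an added assumption, not something the original cell does.

```python
def rule_kwargs(spec: dict) -> dict:
    """Build keyword arguments for one single-port TCP ingress rule."""
    port = spec["port"]
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid TCP port for {spec['name']}: {port}")
    return {
        "port_range_min": port,
        "port_range_max": port,            # same value: the range covers one port
        "protocol": "tcp",
        "remote_ip_prefix": "0.0.0.0/0",   # any IPv4 source, as in the cell above
    }

example = {"name": "allow-ssh", "port": 22,
           "description": "Enable SSH traffic on TCP port 22"}
print(rule_kwargs(example))
```

Inside the loop, the result could be passed along as `os_conn.create_security_group_rule(sg['name'], **rule_kwargs(sg))`.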


In [14]:
s.refresh()
s.check_connectivity()

Checking connectivity to 129.114.25.229 port 22.



Connection successful
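
The same kind of check can be repeated for the other ports opened by the security groups. The sketch below uses only the Python standard library and is not how python-chi implements `check_connectivity`; `port_is_open` is a hypothetical helper. To stay runnable anywhere, the demonstration targets a throwaway listener on localhost instead of the real server.

```python
import socket

def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Demonstration against a temporary local listener on an OS-assigned port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
demo_port = listener.getsockname()[1]

print(port_is_open("127.0.0.1", demo_port))   # listener is up
listener.close()
print(port_is_open("127.0.0.1", demo_port))   # listener is gone
```

Against the instance, the call would look like `port_is_open("129.114.25.229", 22)`. Ports 8888, 8000, 9000 and 9001 will only report open once the corresponding services are actually running on the server.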


### Retrieve code and notebooks on the instance

In [8]:
s.execute("git clone https://github.com/exploring-curiosity/MLOps.git")

Cloning into 'MLOps'...


<Result cmd='git clone https://github.com/exploring-curiosity/MLOps.git' exited=0>

### Set up Docker

In [9]:
s.execute("curl -sSL https://get.docker.com/ | sudo sh")
s.execute("sudo groupadd -f docker; sudo usermod -aG docker $USER")

# Executing docker install script, commit: 53a22f61c0628e58e1d6680b49e82993d304b449


+ sh -c apt-get -qq update >/dev/null
+ sh -c DEBIAN_FRONTEND=noninteractive apt-get -y -qq install ca-certificates curl >/dev/null

Running kernel seems to be up-to-date.

Restarting services...
 systemctl restart packagekit.service

No containers need to be restarted.

No user sessions are running outdated binaries.

No VM guests are running outdated hypervisor (qemu) binaries on this host.
+ sh -c install -m 0755 -d /etc/apt/keyrings
+ sh -c curl -fsSL "https://download.docker.com/linux/ubuntu/gpg" -o /etc/apt/keyrings/docker.asc
+ sh -c chmod a+r /etc/apt/keyrings/docker.asc
+ sh -c echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu noble stable" > /etc/apt/sources.list.d/docker.list
+ sh -c apt-get -qq update >/dev/null
+ sh -c DEBIAN_FRONTEND=noninteractive apt-get -y -qq install docker-ce docker-ce-cli containerd.io docker-compose-plugin docker-ce-rootless-extras docker-buildx-plugin >/dev/null

Running kernel seems to be up-to

Client: Docker Engine - Community
 Version:           28.1.1
 API version:       1.49
 Go version:        go1.23.8
 Git commit:        4eba377
 Built:             Fri Apr 18 09:52:14 2025
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          28.1.1
  API version:      1.49 (minimum version 1.24)
  Go version:       go1.23.8
  Git commit:       01f442b
  Built:            Fri Apr 18 09:52:14 2025
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.7.27
  GitCommit:        05044ec0a9a75232cad458027ca83437aae3f4da
 runc:
  Version:          1.2.5
  GitCommit:        v1.2.5-0-g59923ef
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0


To run Docker as a non-privileged user, consider setting up the
Docker daemon in rootless mode for your user:

    dockerd-rootless-setuptool.sh install

Visit https://docs.docker.com/go/rootless/ to learn about rootless mode.



<Result cmd='sudo groupadd -f docker; sudo usermod -aG docker $USER' exited=0>

### Open an SSH session

Open an SSH session on your server. From your local terminal, run

```
ssh -i ~/.ssh/id_rsa_chameleon_project_g38 cc@A.B.C.D
```
where

- in place of ~/.ssh/id_rsa_chameleon_project_g38, substitute the path to your own key that you uploaded to KVM@TACC
- in place of A.B.C.D, use the floating IP address associated with your instance.

# Object storage using the Horizon GUI
Create an object storage container from the OpenStack Horizon GUI as follows.

Open the GUI for CHI@UC. From the Chameleon website:
- click “Experiment” > “CHI@UC”
- log in if prompted to do so
- check the project drop-down menu near the top left (which shows e.g. “CHI-XXXXXX”), and make sure the correct project is selected
- in the menu sidebar on the left side, click “Object Store” > “Containers” and then “Create Container”. You will be prompted to set up your container step by step using a graphical “wizard”.

Specify the name as object-persist-project38.
Leave other settings at their defaults, and click “Submit”.

### Use ```rclone``` and authenticate to object store from a compute instance

On the site where you created your container (CHI@UC), create an application credential and download its openrc file for future use. The credential allows an application, in our case rclone, to authenticate to the Chameleon object store service.

On the compute instance, install ```rclone```:

``` bash
#run on node-data-pipeline-project38
curl https://rclone.org/install.sh | sudo bash
```

### Modify the configuration file for FUSE

``` bash
#run on node-data-pipeline-project38
#this line makes sure user_allow_other is un-commented in /etc/fuse.conf
sudo sed -i '/^#user_allow_other/s/^#//' /etc/fuse.conf
```

Next, create a configuration file for rclone with the ID and secret from the application credential you just generated:

``` bash
#Run on node-data-pipeline-project38
mkdir -p ~/.config/rclone
nano  ~/.config/rclone/rclone.conf
```

Enter the following configuration, substituting your own user ID and the application credential ID and secret for the placeholders:

``` ini
[chi_uc]
type = swift
user_id = YOUR_USER_ID
application_credential_id = APP_CRED_ID
application_credential_secret = APP_CRED_SECRET
auth = https://chi.uc.chameleoncloud.org:5000/v3
region = CHI@UC
```
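
If you prefer not to edit the file by hand, the same section can be generated programmatically. The sketch below renders the `[chi_uc]` remote with Python's standard `configparser`; `render_rclone_conf` is a hypothetical helper, and the three arguments stand for the placeholders in the file above. Interpolation is disabled on the assumption that a credential secret could contain a `%` character.

```python
import configparser
import io

def render_rclone_conf(user_id: str, cred_id: str, cred_secret: str) -> str:
    """Render the [chi_uc] remote shown above as rclone.conf text."""
    config = configparser.ConfigParser(interpolation=None)
    config["chi_uc"] = {
        "type": "swift",
        "user_id": user_id,
        "application_credential_id": cred_id,
        "application_credential_secret": cred_secret,
        "auth": "https://chi.uc.chameleoncloud.org:5000/v3",
        "region": "CHI@UC",
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()

# Placeholder values only; substitute your own credential before use.
print(render_rclone_conf("YOUR_USER_ID", "APP_CRED_ID", "APP_CRED_SECRET"))
```

The returned text could then be written to `~/.config/rclone/rclone.conf`. Since the file holds a secret, restricting it to the owner (mode 600) is a reasonable precaution.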

To test the configuration, run

``` bash
#Run on node-data-pipeline-project38
rclone lsd chi_uc:
```

Verify that your container is listed; this confirms that rclone can authenticate to the object store.

### Upload ```kaggle.json``` to the Instance

``` bash
#Run on local terminal
scp -i ~/.ssh/id_rsa_chameleon_project_g38 path/to/kaggle.json cc@your-floating-ip:~/kaggle.json
#ex: scp -i ~/.ssh/id_rsa_chameleon_project_g38 ~/.kaggle/kaggle.json cc@129.114.25.229:~/kaggle.json
```

### Move and Secure the File on the Instance

``` bash
#Run on node-data-pipeline-project38
mkdir -p ~/.kaggle
mv ~/kaggle.json ~/.kaggle/kaggle.json
chmod 600 ~/.kaggle/kaggle.json
```
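
The three shell commands above create the directory, move the file, and restrict it to owner read/write. A minimal Python sketch of the same steps follows; `install_credentials` is a hypothetical helper, and the demonstration runs in a temporary directory with a placeholder file instead of a real API token.

```python
import os
import stat
import tempfile
from pathlib import Path

def install_credentials(source: Path, target: Path) -> int:
    """Move source to target, restrict it to mode 600, and return the mode bits."""
    target.parent.mkdir(parents=True, exist_ok=True)   # mkdir -p
    os.replace(source, target)                         # mv (same filesystem)
    os.chmod(target, 0o600)                            # chmod 600
    return stat.S_IMODE(target.stat().st_mode)

# Demonstration with a placeholder file, not real credentials.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "kaggle.json"
    src.write_text('{"username": "example", "key": "example"}')
    mode = install_credentials(src, Path(tmp) / ".kaggle" / "kaggle.json")
    print(oct(mode))
```

On the instance, the call would be `install_credentials(Path.home() / "kaggle.json", Path.home() / ".kaggle" / "kaggle.json")`.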

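### Install pip and venv support

``` bash
#Run on node-data-pipeline-project38
#python3-venv provides the venv module used in the next step (Ubuntu packages it separately)
sudo apt-get update && sudo apt-get install -y python3-pip python3-venv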
```

### Create a virtual environment and install the Kaggle CLI
``` bash
#Run on node-data-pipeline-project38
python3 -m venv ~/.venv/kaggle
source ~/.venv/kaggle/bin/activate
pip install kaggle
```

### Test Kaggle CLI
Assuming you already placed kaggle.json in ~/.kaggle/kaggle.json:
``` bash
#Run on node-data-pipeline-project38
kaggle competitions list | head -n 10
```

You should see a list of Kaggle competitions.

### Download & unzip dataset
``` bash
#Run on node-data-pipeline-project38
mkdir -p ~/Data && cd ~/Data
kaggle competitions download -c birdclef-2025
unzip -qq birdclef-2025.zip -d birdclef-2025
cd birdclef-2025
```

### Sample 10% of train_soundscapes (for simulated online data later)
``` bash 
#Run on node-data-pipeline-project38
mkdir -p production_sample
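total=$(ls train_soundscapes/*.ogg | wc -l)
sample_count=$((total / 10))
# take the last 10% of the files in sorted order (a deterministic subset, not a random draw)
ls -1 train_soundscapes/*.ogg | sort | tail -n "$sample_count" | while IFS= read -r f; do
    cp "$f" production_sample/
done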
```
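
Note that the shell snippet does not draw a random sample: it sorts the file names and copies the last tenth. A Python sketch of the same selection rule is given below, assuming plain name ordering (which can differ from the locale-dependent order of `sort`); `copy_tail_sample` is a hypothetical helper, and the demonstration uses empty placeholder files in a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path

def copy_tail_sample(src_dir: Path, dst_dir: Path, denominator: int = 10) -> list[str]:
    """Copy the last 1/denominator of the sorted .ogg files from src_dir to dst_dir."""
    files = sorted(src_dir.glob("*.ogg"), key=lambda p: p.name)
    sample_count = len(files) // denominator       # integer division, like $((total / 10))
    selected = files[len(files) - sample_count:]   # empty when sample_count is 0
    dst_dir.mkdir(parents=True, exist_ok=True)
    for path in selected:
        shutil.copy(path, dst_dir / path.name)
    return [path.name for path in selected]

# Demonstration on 25 empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "train_soundscapes"
    src.mkdir()
    for i in range(25):
        (src / f"clip_{i:03d}.ogg").touch()
    print(copy_tail_sample(src, Path(tmp) / "production_sample"))
```

With 25 files the sample size is 2, so the two alphabetically last files (`clip_023.ogg` and `clip_024.ogg`) are copied. If a random subset is wanted instead, `random.sample(files, sample_count)` with a fixed seed would be one alternative.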

### Upload data to object store (using ```rclone```)

``` bash
#Run on node-data-pipeline-project38
rclone copy train_audio chi_uc:object-persist-project38/raw/train_audio --progress
rclone copy train.csv chi_uc:object-persist-project38/raw/
rclone copy taxonomy.csv chi_uc:object-persist-project38/raw/
rclone copy production_sample chi_uc:object-persist-project38/raw/production/train_soundscapes_subset --progress
```
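
When the upload needs to be repeated, it can help to keep the source-to-destination mapping in one place. The sketch below only builds and prints the four `rclone copy` command lines used above; `UPLOADS` and `rclone_copy_argv` are hypothetical names, and nothing is executed.

```python
import shlex

REMOTE = "chi_uc:object-persist-project38"

# (local path, path inside the container, show progress?)
UPLOADS = [
    ("train_audio", "raw/train_audio", True),
    ("train.csv", "raw/", False),
    ("taxonomy.csv", "raw/", False),
    ("production_sample", "raw/production/train_soundscapes_subset", True),
]

def rclone_copy_argv(local: str, remote_path: str, progress: bool) -> list[str]:
    """Build the argument list for one `rclone copy` invocation."""
    argv = ["rclone", "copy", local, f"{REMOTE}/{remote_path}"]
    if progress:
        argv.append("--progress")
    return argv

for local, remote_path, progress in UPLOADS:
    print(shlex.join(rclone_copy_argv(local, remote_path, progress)))
```

To actually run the uploads from Python on the instance, each list could be passed to `subprocess.run(argv, check=True)` from inside the `birdclef-2025` directory.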