# Runpod Dev Environment Setup

SSH uses public-key cryptography to establish trust between a client and a server. Each key pair has two halves: the **public key** (`.pub`) acts as an access *verifier* and is installed on target systems (like Runpod nodes), while the **private key** stays with the client and is used to prove its identity.

Below, we create two key pairs and install their public keys on Runpod and GitHub. The private key for Runpod stays on the local machine, while the private key for GitHub is copied to each pod via `scp`. This gives every new pod read and write access to your GitHub repositories.

![](./img/runpod-ssh.png)

## Give local SSH access to pods

1. Generate SSH keys for Runpod (no passphrase):
```{.bash filename="$ (local)"}
ssh-keygen -t ed25519 -C "runpod" -f ~/.ssh/runpod -N ""
cat ~/.ssh/runpod.pub
```
2. Paste the printed public key into **Settings** > **SSH Public Keys** in Runpod. If other keys are already listed, put each key on its own line.
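
As a quick sanity check on the public/private pairing described above, `ssh-keygen -y` re-derives the public key from a private key. The sketch below does this for a throwaway key pair in a temporary directory; the `demo` file name and comment are illustrative only, and nothing in `~/.ssh` is touched.

```bash
# Work in a throwaway directory so real keys in ~/.ssh are untouched.
tmpdir=$(mktemp -d)
ssh-keygen -q -t ed25519 -C "demo" -f "$tmpdir/demo" -N ""

# The .pub file holds "<type> <base64-key> <comment>"; keep type and key only.
stored=$(cut -d' ' -f1,2 "$tmpdir/demo.pub")

# -y reads a private key and prints the matching public key.
derived=$(ssh-keygen -y -f "$tmpdir/demo" | cut -d' ' -f1,2)

if [ "$stored" = "$derived" ]; then
    echo "public key matches private key"
fi
rm -rf "$tmpdir"
```

To check the real pair, point the same two `cut` pipelines at `~/.ssh/runpod.pub` and `~/.ssh/runpod`.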

## Give pods SSH access to GitHub
1. Generate SSH keys to access GitHub from each pod (no passphrase):
```{.bash filename="$ (local)"}
ssh-keygen -t ed25519 -C "runpod-github" -f ~/.ssh/runpod_github -N ""
cat ~/.ssh/runpod_github.pub
```
2. Add the public SSH key to your GitHub account under **Settings** > **SSH and GPG Keys**:

![](./img/tooling-github-ssh.png)

3. Run the following script, then copy your pod's **SSH over exposed TCP** [connection string]{.underline}[^connstring] and paste it at the input prompt. Verify the printed **pod IP** and **port number** before proceeding.
```{.bash filename="$ (local)"}
printf "\nSSH over exposed TCP connection string:\n"
read -r RUNPOD_SSH

RUNPOD_IP=$(echo "$RUNPOD_SSH" | sed -E 's/.*@([0-9.]+).*/\1/')
RUNPOD_PORT=$(echo "$RUNPOD_SSH" | sed -E 's/.*-p ([0-9]+).*/\1/')
echo ""
echo "IP:   $RUNPOD_IP"
echo "Port: $RUNPOD_PORT"
```

4. Run the following script. This copies the GitHub private SSH key to the pod, adds it to the SSH agent there, and opens a shell on the pod. You should now be connected to your pod! 🫛 To check that you have access to your repositories, run `ssh -T git@github.com` on the pod.
```{.bash filename="$ (local)"}
ssh-keyscan -p $RUNPOD_PORT $RUNPOD_IP >> ~/.ssh/known_hosts                    # <0>

scp -P $RUNPOD_PORT -i ~/.ssh/runpod \
    ~/.ssh/runpod_github root@$RUNPOD_IP:/root/.ssh/runpod_github

ssh -t -p $RUNPOD_PORT -i ~/.ssh/runpod root@$RUNPOD_IP '
    # Set secure permissions for SSH directory and key
    chmod 700 ~/.ssh                                                            # <1>
    chmod 600 ~/.ssh/runpod_github                                              # <2>

    # Add GitHub to known_hosts and set permissions
    ssh-keyscan github.com >> ~/.ssh/known_hosts                                # <3>
    chmod 644 ~/.ssh/known_hosts                                                # <4>

    # Start SSH agent and load the key
    eval "$(ssh-agent -s)" > /dev/null;                                         # <5>
    ssh-add ~/.ssh/runpod_github                                                # <6>

    # Launch interactive shell to keep the session alive
    cd ~ && exec bash -l                                                        # <7>
'
```
0. Append pod's public SSH host key to local `known_hosts` to skip authenticity prompts.
1. Permissions: only owner can read, write, or execute directory.
2. Permissions: only owner can read or write on the private key file.
3. Append GitHub's public SSH host key to pod `known_hosts` to avoid authenticity prompts.
4. Permissions: `known_hosts` owner can read/write, others have read-only access.
5. Start the SSH agent in the background and export `SSH_AUTH_SOCK` and `SSH_AGENT_PID` into the current shell.
6. Add GitHub private key to SSH agent. 
7. Once the setup completes, `cd ~ && exec bash -l` changes the working directory to home (`~`, i.e. `/root`) and then replaces the current process with a new login shell (`-l`). The new shell inherits the environment of the one it replaces, including the variables exported by `ssh-agent`. The `-t` flag on the `ssh` command allocates a pseudo-terminal so that this shell is interactive.
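
The `sed` expressions in step 3 echo the input back unchanged when the pattern does not match, so a malformed paste can go unnoticed. A stricter variant, as a minimal sketch: `parse_runpod_ssh` is a hypothetical helper (it is not part of `runpod.sh`) and assumes an IPv4 address and a `-p <port>` flag, as in the footnoted example.

```bash
# Hypothetical helper: parse a connection string of the form
# "ssh root@<ipv4> -p <port> ..." and fail if either field is missing.
parse_runpod_ssh() {
    # -n with the trailing "p" prints only when the pattern matched,
    # so a malformed string yields an empty value instead of the raw input.
    RUNPOD_IP=$(printf '%s\n' "$1" | sed -nE 's/.*@([0-9]+(\.[0-9]+){3}).*/\1/p')
    RUNPOD_PORT=$(printf '%s\n' "$1" | sed -nE 's/.*-p[[:space:]]+([0-9]+).*/\1/p')
    [ -n "$RUNPOD_IP" ] && [ -n "$RUNPOD_PORT" ]
}

if parse_runpod_ssh "ssh root@63.141.33.33 -p 22011 -i ~/.ssh/id_ed25519"; then
    echo "IP:   $RUNPOD_IP"
    echo "Port: $RUNPOD_PORT"
else
    echo "Could not parse connection string" >&2
fi
```

Returning a non-zero status lets a calling script stop before `scp` or `ssh` is run with an empty host or port.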

:::{.callout-note}
The `ssh-agent` process persists in the background once started, holding your loaded keys in memory. What does not persist are the environment variables (`SSH_AUTH_SOCK`, `SSH_AGENT_PID`) that tell your shell how to communicate with that agent. When you open a new shell or SSH session, those variables are not set automatically, so your shell can’t find the running agent unless you restore them, either by re-exporting the saved values or by starting a new agent with `eval "$(ssh-agent -s)"` and running `ssh-add` again. This is why we want all setup and the SSH connection to happen in one continuous session.
:::
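
One way to see the situation described in the note is to test whether the current shell can reach an agent at all. This is a minimal sketch; `agent_reachable` is a hypothetical helper name, not something the setup script defines.

```bash
# Hypothetical helper: succeeds only if this shell can see an agent socket.
agent_reachable() {
    # SSH_AUTH_SOCK must be set and point at an existing Unix socket.
    [ -n "${SSH_AUTH_SOCK:-}" ] && [ -S "$SSH_AUTH_SOCK" ]
}

if agent_reachable; then
    ssh-add -l        # list the keys the agent currently holds
else
    echo "No agent visible in this shell; start one and re-add the key:"
    echo '  eval "$(ssh-agent -s)" && ssh-add ~/.ssh/runpod_github'
fi
```

In a fresh SSH session to the pod, the `else` branch is the expected outcome, even if an agent started earlier is still running.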

[^connstring]: e.g. `ssh root@63.141.33.33 -p 22011 -i ~/.ssh/id_ed25519`


All of these commands are collected in a single script ([-@lst-runpod]) that you can run to connect to a pod:

```{.bash filename="$ (local)"}
chmod +x ./scripts/runpod.sh
./scripts/runpod.sh
```

## Virtual environment

Inside the pod, we can now clone repositories. For example:

```{.bash filename="$ (root@runpod)"}
git clone git@github.com:particle1331/ai-notebooks.git
```

Our recommended approach is to use `uv` to build a venv at `.venv` that is synced with `uv.lock`. This environment can then be used as a Jupyter kernel [in VS Code](https://code.visualstudio.com/docs/datascience/jupyter-kernel-management) or to run scripts with [`uv run`](https://docs.astral.sh/uv/concepts/projects/run/#running-scripts). The required Python version is also specified in the Makefile ([-@lst-Makefile]):

```{.bash filename="$ (root@runpod)"}
make venv
```

Alternatively, you can install packages with `pip` instead of syncing from `uv.lock`:
```{.bash filename="$ (root@runpod)"}
# install required python version, activate venv
make uv
uv python install 3.13
uv venv .venv && source .venv/bin/activate
curl -sS https://bootstrap.pypa.io/get-pip.py | .venv/bin/python

# install requirements on venv
make requirements
pip install -r requirements.txt
pip install -e .
```
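
Since `pip` installs into whichever environment its interpreter belongs to, it can help to confirm that the venv is active before installing. A minimal sketch, assuming the venv was activated with `source .venv/bin/activate` (which sets `VIRTUAL_ENV`); `in_project_venv` is a hypothetical helper name.

```bash
# Hypothetical helper: succeeds only when `python` resolves inside the active venv.
in_project_venv() {
    [ -n "${VIRTUAL_ENV:-}" ] && \
        [ "$(command -v python)" = "$VIRTUAL_ENV/bin/python" ]
}

if in_project_venv; then
    echo "venv active: $VIRTUAL_ENV"
else
    echo "no active venv; run: source .venv/bin/activate" >&2
fi
```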

:::{.callout-note}
Some environments can cause `uv` to fail, such as AzureML, where compute instances have network-mounted or ephemeral filesystems. On short-lived pods, you don't necessarily need to set up a perfect environment or maintain a clean state. On the other hand, reproducibility benefits from a controlled, well-defined environment. As usual, the level of precision needed depends on the scope of the project.
:::

## Quarto docs

Check [here](https://github.com/particle1331/ai-notebooks/blob/main/.github/workflows/publish.yml) for the Quarto version that we're using. Adjust the following variable accordingly:
```{.bash filename="$ (root@runpod)"}
export QUARTO_VERSION=1.7.32  # may be outdated
wget -q https://github.com/quarto-dev/quarto-cli/releases/download/v${QUARTO_VERSION}/quarto-${QUARTO_VERSION}-linux-amd64.deb
dpkg -i quarto-${QUARTO_VERSION}-linux-amd64.deb && rm quarto-${QUARTO_VERSION}-linux-amd64.deb
```
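
Before downloading, it may be worth checking whether the wanted version is already installed, for example when reusing a pod. A minimal sketch, assuming `quarto --version` prints just the version number; `needs_quarto_install` is a hypothetical helper name.

```bash
QUARTO_VERSION=${QUARTO_VERSION:-1.7.32}  # same caveat as above: may be outdated

# Hypothetical helper: succeeds when quarto is absent or reports another version.
needs_quarto_install() {
    local want="$1" have
    have=$(quarto --version 2>/dev/null) || return 0   # quarto not installed
    [ "$have" != "$want" ]
}

if needs_quarto_install "$QUARTO_VERSION"; then
    echo "quarto ${QUARTO_VERSION} not found; run the install commands above"
else
    echo "quarto ${QUARTO_VERSION} already installed"
fi
```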

Preview the docs, then [port-forward](https://code.visualstudio.com/docs/remote/ssh#_forwarding-a-port-creating-ssh-tunnel) using VS Code:
```{.bash filename="$ (root@runpod)"}
make docs
```

## Appendix: Tmux

Tmux allows you to open multiple windows in a **single SSH session** to a remote server, without needing to log in separately for each window. You can also detach from a tmux session and reconnect later to resume your work exactly where you left off, without having to log in again or restart your processes. This means you can run several tasks in parallel, switch between them easily, and keep them running even if your connection drops.

For convenience, I use [byobu](https://www.byobu.org/downloads). In this section, I'll list commands and workflows I found useful.

| Command | Function |
| :--: | :-- |
| Shift + F1, F1 | Display help |
| Ctrl + F2 | Create new split vertically |
| Shift + F2 | Create new split horizontally |
| F2 | Create new window |
| F8 | Rename the current window |
| Ctrl + F6 | Kill a focused split |
| F3 / F4 | Switch between windows |
| Shift + F3 / F4 | Switch between splits |
| F6 | Detach session |
| Shift + F9 | Run command on all splits. See @fig-byobu-command-splits. |
: Byobu commands {tbl-colwidths="[30,70]"}

<br>

Running the same command on multiple splits (very useful):

![Running a dynamic command on 3 splits.](./img/byobu-commands.png){#fig-byobu-command-splits}

**Resuming.** When working on a remote server, you might lose your connection unexpectedly, or you might want to disconnect while keeping processes running. To disconnect intentionally, press F6 in Byobu to detach the session and log out. In either case, the session keeps running on the server. When you SSH into the server again, run `byobu` to restore it. This is very useful!

![Resuming a detached session.](./img/byobu-detached.png){#fig-byobu-detached}
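
The same detach-and-resume workflow can be sketched with plain tmux, assuming Byobu is running on its tmux backend. `resume_session` is a hypothetical helper and `main` is an arbitrary session name.

```bash
# Hypothetical helper: attach to a named tmux session, creating it if needed.
resume_session() {
    local name="${1:-main}"          # "main" is an arbitrary default name
    if [ -n "${TMUX:-}" ]; then
        echo "already inside tmux; not nesting sessions" >&2
        return 1
    fi
    # -A makes new-session attach when the session already exists.
    tmux new-session -A -s "$name"
}
```

Calling `resume_session` after each login either reattaches to the running session or starts a fresh one under the same name.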

## Appendix: Code listings

::: {#lst-Makefile lst-cap="Makefile for the project."}
```{.bash filename=Makefile}
{{< include "../../Makefile" >}}
```
:::

::: {#lst-runpod lst-cap="Script to SSH to Runpod instance."}
```{.bash filename="./scripts/runpod.sh"}
{{< include "../../scripts/runpod.sh" >}}
```
:::