<a href="https://colab.research.google.com/github/SugarC21/colab_Open_WebUI/blob/main/Colab_Open_WebUI.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Open WebUI + Ollama in Colab**

This notebook installs and starts **Open WebUI** (requires Python 3.11) and **Ollama** in Google Colab, using **ngrok** for public tunneling.

### **Key Features**
1. **Optionally** mount Google Drive to install both **Open WebUI** and **Ollama** in a persistent location, so you don’t have to reinstall them each session.
2. **Optional** ngrok authentication token for stable subdomains or other premium ngrok features.
3. Only **Open WebUI** (port 8081) is exposed publicly; **Ollama** (port 11434, its default) stays local.

### **Important Notes**
- **GPU Usage**: Ollama can use an NVIDIA GPU on Linux when the Colab runtime provides one; on a CPU-only runtime it falls back to **CPU** inference.
- **Colab Resource Constraints**: CPU-based inference can be slow for large models. Consider smaller or more efficient models.
- **ngrok Auth**: If you need stable subdomains, set `use_ngrok_auth = True` and provide your token in `os.environ['ngrok_auth_token']`.
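
Whether the current runtime has an NVIDIA GPU can be checked before choosing a model size. The following is a minimal sketch, assuming `nvidia-smi` is on `PATH` on GPU runtimes; the helper name `nvidia_gpu_available` is made up for illustration and is not part of the notebook.

```python
import shutil
import subprocess

def nvidia_gpu_available() -> bool:
    """Return True if `nvidia-smi` is installed and exits successfully."""
    smi = shutil.which("nvidia-smi")
    if smi is None:
        return False
    try:
        result = subprocess.run(
            [smi],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0

print("NVIDIA GPU detected:", nvidia_gpu_available())
```

If this prints `False`, smaller models are likely the more practical choice.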


In [None]:
#@title **Setup**
#@markdown 1. Toggle `use_gdrive` to install (and persist) **both** Open WebUI and Ollama on Google Drive.
#@markdown 2. Toggle `use_ngrok_auth` to use an ngrok auth token (stable subdomain, etc.).

use_gdrive = True #@param {type:"boolean"}
use_ngrok_auth = False #@param {type:"boolean"}

import os

BASE_PATH = "/content"
if use_gdrive:
    from google.colab import drive
    drive.mount('/content/drive')
    BASE_PATH = "/content/drive/MyDrive/Open-WebUI"  # We'll store the environment & Ollama here
    os.makedirs(BASE_PATH, exist_ok=True)

print("Using base path:", BASE_PATH)
print("Using ngrok auth token:", use_ngrok_auth)

### **Brief Instruction for ngrok Authentication Token**
- If you want a **stable** or **custom subdomain** (or other features), you need an [ngrok](https://ngrok.com/) account with an auth token.
- **Storing the Token in a Colab Secret**:
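  1. Open the **Secrets** panel (key icon in the left sidebar; UI can vary).
  2. Create a new secret named `ngrok_auth_token` with your token as the value, and enable notebook access for it.
  3. Once saved, retrieve it with `from google.colab import userdata` and `userdata.get('ngrok_auth_token')`. Secrets are not exported as environment variables automatically, so assign the value to `os.environ['ngrok_auth_token']` for the tunnel cell to pick it up.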
- Alternatively, you can do:
  ```bash
  %env ngrok_auth_token=YOUR_TOKEN_HERE
  ```
- Toggle `use_ngrok_auth = True` in the form above if you want to use the token.

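If no token is provided (or if `use_ngrok_auth` is **False**), the notebook requests a random temporary tunnel for testing. Recent ngrok agents may reject unauthenticated tunnels, in which case a free account token is needed.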
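
The token lookup described above can be sketched as a small helper that checks the environment first and then falls back to a Colab secret. This is an illustrative sketch: `resolve_ngrok_token` and its `secret_getter` parameter are hypothetical names, and the `google.colab.userdata` import only succeeds inside Colab.

```python
import os

def resolve_ngrok_token(env=os.environ, secret_getter=None, name="ngrok_auth_token"):
    """Return the token from `env`, else from `secret_getter`, else ''."""
    token = env.get(name, "")
    if token:
        return token
    if secret_getter is None:
        try:
            # Only importable inside a Colab runtime
            from google.colab import userdata
            secret_getter = userdata.get
        except ImportError:
            return ""
    try:
        return secret_getter(name) or ""
    except Exception:
        # Secret missing, or notebook access not granted
        return ""

token = resolve_ngrok_token()
if token:
    # Make the token visible to later cells that read os.environ
    os.environ["ngrok_auth_token"] = token
```

Passing `secret_getter` explicitly keeps the helper usable outside Colab, for example with a function that reads the token from a file.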

## **Install Dependencies**
1. Update apt.
2. Install Python 3.11, venv, and dev packages.
3. Install system tools (`pciutils`, `lshw`) used for hardware detection.

*(This step might take a few minutes.)*
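
After the install cell finishes, it can be worth confirming that the expected executables are actually on `PATH`. The following is a minimal sketch: `missing_tools` is a hypothetical helper, and it assumes `lspci` is the command provided by the `pciutils` package.

```python
import shutil

def missing_tools(tools):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]

missing = missing_tools(["python3.11", "lspci", "lshw"])
print("Missing tools:", missing or "none")
```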

In [None]:
!sudo apt-get update -y
!sudo apt-get install -y python3.11 python3.11-venv python3.11-dev pciutils lshw

## **Install Ollama**
- If **use_gdrive** is **True**, we will download the Linux x86_64 tarball into Drive and run it directly from there, so it persists.
- Otherwise, we run the **official install script** (`curl -fsSL https://ollama.com/install.sh | sh`), installing Ollama system-wide in the ephemeral Colab VM.

**Note**: On Linux, Ollama uses an NVIDIA GPU when one is available and falls back to **CPU** otherwise.

In [None]:
if use_gdrive:
    print("Persisting Ollama to Drive...")
    import os
    ollama_dir = os.path.join(BASE_PATH, 'ollama')
    ollama_bin = os.path.join(ollama_dir, 'ollama')

    if os.path.exists(ollama_bin):
        print("Ollama binary already exists in Drive.")
        print("To update, remove or overwrite the folder or manually download a newer version.")
    else:
        print("Downloading Ollama (Linux x86_64) into Drive...")
        os.makedirs(ollama_dir, exist_ok=True)

        # Example pinned version. The tag and asset name below are examples only:
        # check the Ollama releases page for a current tag and the exact tarball name.
        OLLAMA_VERSION = "v0.0.16"
        DOWNLOAD_URL = f"https://github.com/jmorganca/ollama/releases/download/{OLLAMA_VERSION}/ollama-{OLLAMA_VERSION}-Linux-x86_64.tar.gz"

        # Download & extract
        !wget -q "$DOWNLOAD_URL" -O /tmp/ollama.tar.gz
        !tar -xzf /tmp/ollama.tar.gz -C "$ollama_dir" --strip-components 1
        !chmod +x "$ollama_bin"
        print("Ollama persisted to Drive at:", ollama_bin)

else:
    print("Installing Ollama system-wide (ephemeral)")
    !curl -fsSL https://ollama.com/install.sh | sh
    print("Ollama installed for this session.")

## **Set Up Virtual Environment & Install Open WebUI**
We’ll use Python 3.11 in a virtual environment. If you enabled Google Drive, everything goes into `/content/drive/MyDrive/Open-WebUI`. Otherwise, it goes into `/content`.

This includes the **Open WebUI** Python package and its dependencies.
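
Because Open WebUI requires Python 3.11, it may help to confirm which version the virtual environment's interpreter reports. This is a sketch only: `interpreter_version` is a hypothetical helper, and `sys.executable` stands in for the venv interpreter so the snippet runs anywhere. In the notebook, the path would be `os.path.join(venv_path, "bin", "python")`.

```python
import subprocess
import sys

def interpreter_version(python_exe):
    """Ask the given interpreter for its (major, minor) version."""
    out = subprocess.run(
        [python_exe, "-c",
         "import sys; print(sys.version_info.major, sys.version_info.minor)"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    major, minor = out.split()
    return int(major), int(minor)

# Stand-in interpreter; replace with the venv's python path in the notebook
print(interpreter_version(sys.executable))
```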

In [None]:
import os

venv_path = os.path.join(BASE_PATH, "venv")

if not os.path.exists(venv_path):
    print("Creating a new Python 3.11 virtual environment...")
    !python3.11 -m venv "$venv_path"

print("Upgrading pip in the virtual environment...")
!"$venv_path/bin/python" -m pip install --upgrade pip

print("Installing Open WebUI...")
!"$venv_path/bin/pip" install open-webui

print("Open WebUI installation complete.")

## **Create a Script to Start Both Servers (Ollama & Open WebUI)**
We will run:
- **Ollama** on port **11434** (its default) in one thread
- **Open WebUI** on port **8081** in another

### Where does Ollama run?
- If `use_gdrive` is **True**, we run the binary from `BASE_PATH/ollama/ollama`.
- Otherwise, we call the system-wide `ollama` command installed via the shell script.

All output is suppressed for a cleaner notebook.
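
Suppressing all output keeps the notebook tidy but also hides startup errors. As an alternative, the output could be redirected to a log file. The following is a sketch under that assumption: `start_logged` and the log path are hypothetical and not part of the notebook's script.

```python
import os
import subprocess
import sys
import tempfile

def start_logged(cmd, log_path):
    """Start `cmd` in the background, appending stdout and stderr to `log_path`."""
    log_file = open(log_path, "ab")
    proc = subprocess.Popen(cmd, stdout=log_file, stderr=subprocess.STDOUT)
    log_file.close()  # the child process keeps its own copy of the descriptor
    return proc

# Demo with a short-lived stand-in process instead of a real server
log_path = os.path.join(tempfile.gettempdir(), "demo_server.log")
proc = start_logged([sys.executable, "-c", "print('server started')"], log_path)
proc.wait()
with open(log_path) as f:
    print(f.read())
```

The same pattern would apply to `ollama serve` or `open-webui serve`, with the log file inspected when a server fails to come up.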

In [None]:
import os

server_script_path = os.path.join(BASE_PATH, 'start_servers.py')
if use_gdrive:
    ollama_bin = os.path.join(BASE_PATH, 'ollama', 'ollama')
else:
    ollama_bin = 'ollama'  # ephemeral system-wide

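# Absolute path, so the script works regardless of the current working directory
webui_bin = os.path.join(BASE_PATH, 'venv', 'bin', 'open-webui')

script_content = f'''\
import subprocess
import threading
import time

OLLAMA_CMD = "{ollama_bin}"
WEBUI_CMD = "{webui_bin}"

def start_ollama():
    # `ollama serve` listens on 127.0.0.1:11434 unless OLLAMA_HOST is set
    subprocess.run([OLLAMA_CMD, 'serve'], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)

def start_open_webui():
    subprocess.run([WEBUI_CMD, 'serve', '--port', '8081'], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)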

threading.Thread(target=start_ollama).start()
time.sleep(5)  # Wait a bit to ensure Ollama is up
threading.Thread(target=start_open_webui).start()
'''

with open(server_script_path, 'w') as f:
    f.write(script_content)

print(f"Created script at: {server_script_path}")

## **Start Servers & Expose the Open WebUI Port via ngrok**
1. **Run `start_servers.py`** in the background.
2. Wait ~20 seconds to ensure the processes have started.
3. **Connect** to port **8081** (Open WebUI) using **ngrok**.
4. Ollama runs locally on port **11434** and is **not** exposed (no public URL).

### **Output**
- Only the **Open WebUI** tunnel link is printed.
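
A fixed wait can be too short on a slow runtime or needlessly long on a fast one. One alternative is to poll the port until it accepts connections. This is a minimal sketch: `wait_for_port` is a hypothetical helper, not something the notebook defines.

```python
import socket
import time

def wait_for_port(port, host="127.0.0.1", timeout=60.0, interval=0.5):
    """Poll until a TCP connection to host:port succeeds or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet; retry shortly
            time.sleep(interval)
    return False
```

Calling `wait_for_port(8081)` before opening the tunnel would return `True` as soon as Open WebUI is listening, and `False` if it never comes up within the timeout.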

In [None]:
!pip install pyngrok --quiet

import time
from pyngrok import ngrok
import os

# If user wants to use an auth token, set it from the environment.
if use_ngrok_auth:
    token = os.environ.get('ngrok_auth_token', '')
    if token:
        ngrok.set_auth_token(token)
        print("ngrok authentication token set!")
    else:
        print("No 'ngrok_auth_token' found in environment. Proceeding without an auth token.")

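print("Starting servers...")
import subprocess
# Popen returns immediately, so the servers keep running in the background
server_proc = subprocess.Popen([os.path.join(venv_path, "bin", "python"), server_script_path])
time.sleep(20)  # give them time to start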

print("\nConnecting to Open WebUI on port 8081 via ngrok...")
webui_tunnel = ngrok.connect(8081, "http")
print("Open WebUI URL:", webui_tunnel.public_url)

# Ollama is deliberately NOT tunneled: it stays reachable only from inside this VM.

print("\nAll set! Use the Open WebUI URL above to access your interface.")
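
To confirm from inside the VM that Ollama itself is up, its HTTP API can be queried locally. The following sketch uses Ollama's `/api/tags` endpoint, which lists installed models. `list_ollama_models` is a hypothetical helper, and the base URL passed to it is an assumption that must match the port Ollama actually listens on.

```python
import json
import urllib.error
import urllib.request

def list_ollama_models(base_url, timeout=5.0):
    """Return model names reported by /api/tags, or None if Ollama is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return None
    return [model.get("name") for model in payload.get("models", [])]
```

For example, `list_ollama_models("http://127.0.0.1:11434")` would return an empty list on a fresh install with no models pulled, and `None` if the server is not running.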