# Ollama Gradio WebUI with vLLM - API Server Setup

This notebook runs the Ollama Gradio WebUI as an API server on Kaggle and exposes it to the public internet through an ngrok tunnel.

## 1. Setup Environment

In [None]:
# Clone the repository
!git clone https://github.com/infovpcs/ollama-gradio-webui.git
%cd ollama-gradio-webui

In [None]:
# Install dependencies
!sh install_deps.sh

# Install ngrok
!pip install pyngrok

## 2. Start the API Server with ngrok

In [None]:
from pyngrok import ngrok, conf
import os

# Set environment variables for Kaggle
os.environ["GRADIO_SERVER_NAME"] = "0.0.0.0"
os.environ["GRADIO_SERVER_PORT"] = "7860"
os.environ["OLLAMA_USE_VLLM"] = "true"
os.environ["KAGGLE_API_MODE"] = "true"

# Set your ngrok authtoken (recent ngrok versions typically require one, even on free accounts)
# conf.get_default().auth_token = "your_ngrok_authtoken"

# Create ngrok tunnel
ngrok_tunnel = ngrok.connect(addr="localhost:7860")
print(f"\n🔗 API Server is accessible at: {ngrok_tunnel.public_url}")
print(f"\n✅ API Endpoints:")
print(f"   - Chat API: {ngrok_tunnel.public_url}/api/chat")
print(f"   - React Agent API: {ngrok_tunnel.public_url}/api/react_agent")

In [None]:
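# Start the Ollama server and the vLLM app in the background.
# A trailing `&` is not supported by `!` in notebook kernels, so use subprocess instead.
import subprocess
server_process = subprocess.Popen(["sh", "start_vllm.sh"])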
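
The server may need some time to come up, so requests sent immediately after the cell above can fail. The following is a minimal sketch of a readiness check, assuming the app listens on port 7860 as configured earlier. `wait_for_port` is a hypothetical helper, not part of the repository.

```python
import socket
import time


def wait_for_port(host="localhost", port=7860, timeout=300.0, interval=2.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet (or the connect timed out); try again.
            time.sleep(interval)
    return False
```

Calling `wait_for_port()` before the test cells gives a rough signal that the app has started. An open port only shows that a process is listening; the model itself may still be loading.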

## 3. Using the API Endpoints

Now you can use the API endpoints to interact with the Ollama models. Here are examples using `requests`:

In [None]:
import requests
import json

# Uses the tunnel created above; replace with your ngrok URL when running elsewhere
API_BASE_URL = ngrok_tunnel.public_url

# Example: Using the Chat API
def chat_with_ollama(message, model="qwen3:vpcs-vllm", enable_context=True):
    response = requests.post(
        f"{API_BASE_URL}/api/chat",
        json={
            "message": message,
            "model": model,
            "enable_context": enable_context
        },
        timeout=300  # generation can be slow; avoid hanging forever
    )
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()

# Example: Using the React Agent API
def react_agent_query(query, odoo_version="18.0", model="qwen3:vpcs-vllm"):
    response = requests.post(
        f"{API_BASE_URL}/api/react_agent",
        json={
            "query": query,
            "odoo_version": odoo_version,
            "model": model
        },
        timeout=300  # generation can be slow; avoid hanging forever
    )
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()
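
Requests through a tunnel to a freshly started model server can fail transiently. The following is a minimal sketch of a retry wrapper with exponential backoff. `call_with_retries` is a hypothetical helper, and catching every `Exception` is a simplification; in practice you would likely narrow it to `requests.exceptions.RequestException`.

```python
import time


def call_with_retries(func, *args, attempts=3, delay=2.0, **kwargs):
    """Call func(*args, **kwargs), retrying on any exception with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            if attempt == attempts:
                raise  # out of retries: surface the last error
            wait = delay * 2 ** (attempt - 1)  # delay, 2*delay, 4*delay, ...
            print(f"Attempt {attempt} failed ({exc}); retrying in {wait:.1f}s")
            time.sleep(wait)
```

For example, `call_with_retries(chat_with_ollama, "Tell me about Odoo 18 features")` would retry the chat call up to three times before raising the last error.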

In [None]:
# Test the Chat API
response = chat_with_ollama("Tell me about Odoo 18 features")
print(json.dumps(response, indent=2))

In [None]:
# Test the React Agent API
response = react_agent_query("Create a simple Odoo module for task management")
print(json.dumps(response, indent=2))

## 4. Integration with External Applications

You can now use the ngrok URL to integrate with external applications such as:
- Mobile apps
- Web applications
- Other notebooks or scripts
- Automation tools

The API remains accessible only while this notebook session is running, which is bounded by Kaggle's session time limit (12 hours for CPU and GPU sessions at the time of writing).
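
An external script does not need `pyngrok` or this repository, only the public URL. The following is a minimal sketch of such a client using only the standard library. The environment variable `OLLAMA_WEBUI_URL` and the helper `post_json` are assumptions made for illustration; the endpoint paths and payload fields are the ones used earlier in this notebook.

```python
import json
import os
import urllib.request

# Hypothetical environment variable holding the ngrok URL printed by the notebook.
API_BASE_URL = os.environ.get("OLLAMA_WEBUI_URL", "http://localhost:7860")


def post_json(path, payload, base_url=API_BASE_URL, timeout=120):
    """POST a JSON payload to base_url + path and return the decoded JSON reply."""
    request = urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.loads(reply.read().decode("utf-8"))
```

A call such as `post_json("/api/chat", {"message": "Hello", "model": "qwen3:vpcs-vllm", "enable_context": True})` mirrors `chat_with_ollama` above. The shape of the JSON reply is not documented here, so inspect it before relying on specific keys.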

In [None]:
# Keep the notebook running so the server and tunnel stay alive
import time
while True:
    print("API server is running... Interrupt the kernel to stop")
    time.sleep(600)  # Print a status line every 10 minutes
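
As an alternative to the loop above, the following is a minimal sketch of a keep-alive loop that stops once a health check fails instead of sleeping unconditionally. `keep_alive` and its `is_healthy` callback are hypothetical names; the callback could, for example, attempt `socket.create_connection(("localhost", 7860))` and return whether it succeeded.

```python
import time


def keep_alive(is_healthy, interval=600, max_checks=None):
    """Poll is_healthy() every `interval` seconds; return the number of checks made.

    Stops when is_healthy() returns False or after max_checks checks (None = no limit).
    """
    checks = 0
    while max_checks is None or checks < max_checks:
        checks += 1
        if not is_healthy():
            print(f"Health check {checks} failed; stopping.")
            break
        print(f"Health check {checks} passed; API server is running.")
        time.sleep(interval)
    return checks
```

Stopping on the first failed check makes a crashed server visible in the notebook output rather than leaving a tunnel that points at nothing.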