# Chartelier Test with Small Models on Google Colab

This notebook tests Chartelier with smaller, faster-loading models for quick validation.

## Prerequisites
- Google Colab (Free tier is sufficient)
- GPU runtime enabled (any GPU type)
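
Before installing anything, it can help to confirm that the runtime actually exposes a GPU. The snippet below is a minimal sketch, assuming `nvidia-smi` is on the PATH in GPU runtimes (usually the case on Colab). `gpu_names` is a hypothetical helper and not part of Chartelier.

```python
import shutil
import subprocess


def gpu_names():
    """Return the GPU names reported by nvidia-smi, or [] if none are visible."""
    # Assumption: nvidia-smi is only present when a GPU runtime is attached
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


names = gpu_names()
print("GPU(s):", names if names else "none detected; enable a GPU runtime")
```

If the list is empty, switch the runtime type to GPU before continuing.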

## Step 1: Clone Repository and Quick Setup

In [None]:
# Clone the Chartelier repository
!git clone https://github.com/sog4be/chartelier.git
%cd chartelier

# Switch to the branch with Colab support
!git checkout feature/gpt-oss-20b-colab-support

In [None]:
# Quick setup - install only essential dependencies
!pip install -q 'numpy<2.0'
!pip install -q vllm
!pip install -q transformers accelerate
!pip install -q -e .

print("✅ Dependencies installed")

# Verify vLLM
import vllm

print(f"vLLM version: {vllm.__version__}")

## Step 2: Start vLLM Server with Small Model

We'll use **Qwen/Qwen2-0.5B-Instruct** - a tiny but capable model that loads quickly.
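
The start-up cell in this step polls the server's `/health` endpoint until it responds. That wait loop can be sketched as a small reusable helper. `wait_until_ready` and its parameters are hypothetical names for illustration. The probe is passed in as a callable, so the helper does not depend on any particular HTTP client.

```python
import time


def wait_until_ready(probe, timeout_s=60.0, interval_s=1.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Call probe() until it returns True or the timeout elapses.

    Returns the number of attempts on success, or None on timeout.
    Exceptions raised by probe() count as "not ready yet".
    """
    deadline = clock() + timeout_s
    attempts = 0
    while True:
        attempts += 1
        try:
            if probe():
                return attempts
        except Exception:
            # Typically a refused connection while the server is still loading
            pass
        if clock() >= deadline:
            return None
        sleep(interval_s)


# Demo with a fake probe that becomes ready on the third call
_calls = {"n": 0}


def _fake_probe():
    _calls["n"] += 1
    return _calls["n"] >= 3


demo_attempts = wait_until_ready(_fake_probe, sleep=lambda s: None)
print("ready after attempts:", demo_attempts)
```

Against the real server, the probe could be something like `lambda: requests.get("http://localhost:8000/health", timeout=1).status_code == 200`. A longer `timeout_s` may be needed on the first run, when the model weights still have to be downloaded.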

In [None]:
# Start vLLM server with small model
import subprocess
import time
import requests
from threading import Thread

# Kill any existing server
!pkill -f vllm.entrypoints.openai.api_server || true

# Server configuration
MODEL = "Qwen/Qwen2-0.5B-Instruct"  # Small 0.5B parameter model
# Alternative options:
# MODEL = "facebook/opt-125m"  # Even smaller (125M)
# MODEL = "microsoft/phi-2"  # Small but powerful (2.7B)


def start_server():
    """Run the vLLM OpenAI-compatible API server; blocks until the process exits."""
    cmd = [
        "python",
        "-m",
        "vllm.entrypoints.openai.api_server",
        "--model",
        MODEL,
        "--host",
        "0.0.0.0",
        "--port",
        "8000",
        "--max-model-len",
        "2048",
        "--dtype",
        "half",
        "--trust-remote-code",
    ]
    subprocess.run(cmd)


# Start server in background thread
server_thread = Thread(target=start_server, daemon=True)
server_thread.start()

print(f"🚀 Starting vLLM server with {MODEL}...")
print("⏳ This should take less than 1 minute...")

# Wait for server to be ready
max_wait = 60  # 1 minute max
for i in range(max_wait):
    try:
        response = requests.get("http://localhost:8000/health", timeout=1)
        if response.status_code == 200:
            print(f"\n✅ Server is ready! (took {i + 1} seconds)")
            break
    except requests.RequestException:
        # Server is not accepting connections yet; keep polling
        pass

    if i % 10 == 9:
        print(f"Still waiting... ({i + 1}s)")
    time.sleep(1)
else:
    print("❌ Server failed to start within 60 seconds")

# Check if model is loaded
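try:
    response = requests.get("http://localhost:8000/v1/models", timeout=5)
    if response.status_code == 200:
        models = response.json().get("data", [])
        if models:
            print(f"✅ Model loaded: {models[0]['id']}")
    else:
        print(f"⚠️ Could not verify model (HTTP {response.status_code})")
except (requests.RequestException, ValueError):
    # Connection problems or a response body that is not valid JSON
    print("⚠️ Could not verify model")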

## Step 3: Test the Server
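
The test cell in this step reads the reply from `choices[0].message.content` in the OpenAI-style response. Small models and misconfigured servers sometimes return a different shape (for example an error object), so a slightly more defensive extraction can be useful. This is a sketch only. `extract_reply` is a hypothetical helper, and the sample payload is hand-written rather than captured from a real server.

```python
def extract_reply(payload):
    """Return the first choice's message text, or None if the shape is unexpected."""
    if not isinstance(payload, dict):
        return None
    choices = payload.get("choices")
    if not isinstance(choices, list) or not choices:
        return None
    first = choices[0]
    if not isinstance(first, dict):
        return None
    message = first.get("message") or {}
    content = message.get("content") if isinstance(message, dict) else None
    return content.strip() if isinstance(content, str) else None


# Hand-written sample in the OpenAI chat-completions shape
sample = {"choices": [{"message": {"role": "assistant", "content": " 4 "}}]}
print(extract_reply(sample))
```

Returning `None` instead of raising keeps a notebook cell readable when the server answers with an error body.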

In [None]:
# Quick test of the server
import requests
import json

url = "http://localhost:8000/v1/chat/completions"
headers = {"Content-Type": "application/json"}
data = {
    "model": MODEL,
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is 2+2?"},
    ],
    "temperature": 0.0,
    "max_tokens": 50,
}

try:
    response = requests.post(url, headers=headers, json=data, timeout=10)
    if response.status_code == 200:
        result = response.json()
        print("✅ Server test successful!")
        print(f"Response: {result['choices'][0]['message']['content']}")
    else:
        print(f"❌ Server error: {response.status_code}")
        print(response.text)
except Exception as e:
    print(f"❌ Failed to connect: {e}")

## Step 4: Configure Environment for Chartelier

In [None]:
# Set environment variables
import os

os.environ["CHARTELIER_LLM_MODEL"] = MODEL
os.environ["CHARTELIER_LLM_API_BASE"] = "http://localhost:8000/v1"
os.environ["CHARTELIER_LLM_API_KEY"] = "dummy"
os.environ["CHARTELIER_LLM_TIMEOUT"] = "30"

print("✅ Environment configured:")
print(f"   Model: {MODEL}")
print(f"   API Base: {os.environ['CHARTELIER_LLM_API_BASE']}")
print(f"   Timeout: {os.environ['CHARTELIER_LLM_TIMEOUT']}s")

## Step 5: Run Chartelier Test
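
The test in this step passes the `data` field of the image content straight to `IPython.display.SVG`. MCP image content is commonly base64-encoded, and it is an assumption here that the payload may arrive either as raw SVG markup or as base64 text. The sketch below normalizes both cases. `to_svg_text` is a hypothetical helper and not part of Chartelier.

```python
import base64


def to_svg_text(data):
    """Return SVG markup from either raw SVG text or base64-encoded SVG.

    Returns None if the input is neither.
    """
    if data.lstrip().startswith("<"):
        return data  # Already looks like markup
    try:
        decoded = base64.b64decode(data, validate=True).decode("utf-8")
    except ValueError:
        # Covers invalid base64 (binascii.Error) and invalid UTF-8
        return None
    return decoded if decoded.lstrip().startswith("<") else None


raw_svg = '<svg xmlns="http://www.w3.org/2000/svg"></svg>'
encoded_svg = base64.b64encode(raw_svg.encode("utf-8")).decode("ascii")
print(to_svg_text(encoded_svg) == raw_svg)
```

If the chart fails to render, running the returned string through a helper like this before calling `SVG(data=...)` is one thing to try.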

In [None]:
# Chartelier test
import json
import sys
from pathlib import Path

# Add src to path
sys.path.insert(0, "/content/chartelier/src")

from chartelier.interfaces.mcp.handler import MCPHandler
from chartelier.interfaces.mcp.protocol import JSONRPCRequest, MCPMethod
from chartelier.infra.llm_client import LLMSettings


def test_chartelier():
    print("=" * 60)
    print("🧪 Chartelier Test with Small Model")
    print("=" * 60)

    # Configuration
    settings = LLMSettings()
    print(f"\n✅ Configuration:")
    print(f"   Model: {settings.model}")
    print(f"   API Base: {settings.api_base}")
    print(f"   Timeout: {settings.timeout}s")

    # Create handler
    handler = MCPHandler()
    print("✅ MCP handler created")

    # Simple test data
    csv_data = """month,sales,category
2024-01,1000,Product A
2024-02,1200,Product A
2024-03,1100,Product A
2024-04,1300,Product A
2024-01,800,Product B
2024-02,900,Product B
2024-03,950,Product B
2024-04,1050,Product B"""

    # Create request
    request = JSONRPCRequest(
        id=1,
        method=MCPMethod.TOOLS_CALL,
        params={
            "name": "chartelier_visualize",
            "arguments": {
                "data": csv_data,
                "query": "Show monthly sales trends for Product A and Product B as a line chart",
                "options": {
                    "format": "svg",
                    "width": 800,
                    "height": 600,
                },
            },
        },
    )

    print("\n✅ Request prepared")
    print(f"📈 Data: {len(csv_data.splitlines()) - 1} rows")
    print(f"📝 Query: 'Show monthly sales trends'")
    print(f"🎨 Format: SVG (800x600)")

    print("\n" + "=" * 60)
    print(f"⚠️  Using small model: {MODEL}")
    print("Note: Results may vary with smaller models")
    print("=" * 60)

    print("\n⏳ Processing visualization request...")

    try:
        response_str = handler.handle_message(json.dumps(request.model_dump()))
        response = json.loads(response_str)

        if response.get("result", {}).get("isError"):
            print("\n❌ Visualization failed")
            error_msg = response["result"]["content"][0]["text"]
            print(f"   Error: {error_msg}")

            if "structuredContent" in response["result"]:
                error = response["result"]["structuredContent"].get("error", {})
                print(f"   Code: {error.get('code')}")
                if error.get("hint"):
                    print(f"   Hint: {error.get('hint')}")
            return None
        else:
            print("\n✅ Visualization successful!")

            result = response["result"]

            # Check content
            if "content" in result and len(result["content"]) > 0:
                content = result["content"][0]
                if content["type"] == "image":
                    print(f"\n📊 Chart generated:")
                    print(f"   MIME type: {content.get('mimeType', 'unknown')}")
                    print(f"   Data size: {len(content.get('data', ''))} characters")

                    # Show metadata
                    if "structuredContent" in result and "metadata" in result["structuredContent"]:
                        metadata = result["structuredContent"]["metadata"]
                        print(f"\n📊 Metadata:")
                        print(f"   Pattern: {metadata.get('pattern_id')}")
                        print(f"   Template: {metadata.get('template_id')}")

                        if metadata.get("stats", {}).get("duration_ms"):
                            duration = metadata["stats"]["duration_ms"]
                            total = duration.get("total", 0)
                            print(f"   Processing time: {total:.0f}ms")

                    # Return SVG
                    if "svg" in content.get("mimeType", ""):
                        return content["data"]

    except Exception as e:
        print(f"\n❌ Unexpected error: {e}")
        import traceback

        traceback.print_exc()
        return None

    return None


# Run test
svg_data = test_chartelier()

# Display result
if svg_data:
    from IPython.display import SVG, display

    print("\n📊 Displaying generated chart:")
    display(SVG(data=svg_data))

    # Save to file
    with open("/content/output.svg", "w", encoding="utf-8") as f:
        f.write(svg_data)
    print("\n💾 Chart saved to: /content/output.svg")
else:
    print("\n❌ Chart generation failed")
    print("\nTroubleshooting:")
    print("1. Check if server is running (Step 2)")
    print("2. Try a different model")
    print("3. Check server logs for errors")

## Alternative Models to Try

If the above doesn't work, try these alternative models:

In [None]:
# List of small models you can try
models = [
    "facebook/opt-125m",  # 125M - Tiny, fast
    "facebook/opt-350m",  # 350M - Still small
    "microsoft/phi-2",  # 2.7B - Small but capable
    "Qwen/Qwen2-0.5B-Instruct",  # 0.5B - Good balance
    "Qwen/Qwen2-1.5B-Instruct",  # 1.5B - Better quality
]

print("Available small models for testing:")
for i, model in enumerate(models, 1):
    print(f"{i}. {model}")

print("\nTo use a different model:")
print("1. Stop the current server (run the Cleanup cell)")
print("2. Change MODEL variable in Step 2")
print("3. Re-run from Step 2")

## Cleanup

In [None]:
# Stop the server
!pkill -f vllm.entrypoints.openai.api_server || true
print("✅ Server stopped")

## Troubleshooting

### Common Issues:

1. **Server won't start**: Check the output of the Step 2 cell for errors, then try a smaller model (for example `facebook/opt-125m`)
2. **Low quality results**: Expected with small models, which may struggle with complex reasoning
3. **Out of memory**: Restart the runtime and use a smaller model
4. **Connection refused**: Make sure the Step 2 cell reported that the server is ready before running later steps
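
For the "connection refused" case, a plain TCP check can tell "nothing is listening on the port" apart from "the server is up but returning errors". A minimal sketch, assuming the server uses port 8000 as configured in Step 2. `port_open` is a hypothetical helper.

```python
import socket


def port_open(host="127.0.0.1", port=8000, timeout_s=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Refused, timed out, or unreachable
        return False


print("vLLM port reachable:", port_open())
```

If this prints `False`, the server process has most likely exited or never started, so re-run Step 2 and watch its output.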

### Tips:
- Small models work best with simple, clear queries
- For production use, larger models (7B+) are recommended
- This notebook is for testing Chartelier's functionality, not model quality