# Open vs Closed Models: Gemini vs Local Comparison

## What you will learn
- How to compare outputs from a local open-weight model and Gemini using a simple scoring rubric.
- How to weigh the tradeoffs between the two: quality, latency, cost (by proxy), and privacy.

Expected runtime: 30-45 minutes
Expected cost: Gemini free tier plus local runtime resources.


In [None]:
# openpyxl is the engine pandas uses for the .xlsx export later in this notebook.
%pip -q install -U google-genai openpyxl

In [None]:
!sudo apt -qq update
!sudo apt -qq install -y pciutils zstd
!curl -fsSL https://ollama.com/install.sh | sh

import time
import subprocess
import pandas as pd

LOCAL_MODEL = 'qwen3:4b'
MODEL_ID = 'gemini-3-flash-preview'

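def start_ollama_service():
    # Launch the Ollama server in the background; `ollama pull` and `ollama run` need it running.
    subprocess.Popen(['ollama', 'serve'])
    # Fixed pause so the server can come up; increase it if the pull cannot connect.
    time.sleep(5)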

def pull_local_model(model_name: str):
    try:
        subprocess.run(['ollama', 'pull', model_name], check=True)
        return True
    except Exception as e:
        print('Could not pull local model:', e)
        return False

start_ollama_service()
LOCAL_AVAILABLE = pull_local_model(LOCAL_MODEL)
print('LOCAL_AVAILABLE =', LOCAL_AVAILABLE)


In [None]:
GEMINI_AVAILABLE = False
GEMINI_ERROR = None

try:
    from google import genai
    import os
    HARDCODED_GOOGLE_API_KEY = None
    api_key = HARDCODED_GOOGLE_API_KEY or os.getenv('GOOGLE_API_KEY')
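    if not api_key:
        try:
            # Colab secrets are only available inside Colab; fall through quietly elsewhere.
            from google.colab import userdata
            api_key = userdata.get('GOOGLE_API_KEY')
        except Exception:
            api_key = None
    if not api_key:
        raise ValueError('Set HARDCODED_GOOGLE_API_KEY, the GOOGLE_API_KEY environment variable, or a Colab secret named GOOGLE_API_KEY')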
    client = genai.Client(api_key=api_key)
    GEMINI_AVAILABLE = True
except Exception as e:
    GEMINI_ERROR = str(e)
    print('Gemini unavailable:', GEMINI_ERROR)


In [None]:
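def run_local(prompt: str):
    """Run the prompt through the local Ollama model and time the call."""
    if not LOCAL_AVAILABLE:
        return {'ok': False, 'model': LOCAL_MODEL, 'output': None, 'latency_s': None, 'error': 'Local model unavailable'}
    # The first call typically also includes model load time, so its latency is inflated.
    start = time.perf_counter()
    try:
        proc = subprocess.run(['ollama', 'run', LOCAL_MODEL, prompt], capture_output=True, text=True, check=True)
        return {'ok': True, 'model': LOCAL_MODEL, 'output': proc.stdout.strip(), 'latency_s': round(time.perf_counter() - start, 2), 'error': None}
    except subprocess.CalledProcessError as e:
        # str(e) only reports the exit code; stderr carries Ollama's actual message.
        return {'ok': False, 'model': LOCAL_MODEL, 'output': None, 'latency_s': None, 'error': (e.stderr or str(e)).strip()}
    except Exception as e:
        return {'ok': False, 'model': LOCAL_MODEL, 'output': None, 'latency_s': None, 'error': str(e)}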


def run_gemini(prompt: str):
    start = time.perf_counter()
    if not GEMINI_AVAILABLE:
        return {'ok': False, 'model': MODEL_ID, 'output': None, 'latency_s': None, 'error': GEMINI_ERROR}
    try:
        response = client.models.generate_content(model=MODEL_ID, contents=prompt)
        return {'ok': True, 'model': MODEL_ID, 'output': response.text, 'latency_s': round(time.perf_counter() - start, 2), 'error': None}
    except Exception as e:
        return {'ok': False, 'model': MODEL_ID, 'output': None, 'latency_s': None, 'error': str(e)}


In [None]:
TASKS = [
    'A Danish mid-sized B2B company wants to deploy a customer-support AI assistant. Give a one-paragraph recommendation on whether to start with an open-weight model or a closed model, and why.',
    'You are advising an enterprise buyer in Denmark. In one paragraph, compare GPU vs CPU choices for running AI inference in production, focusing on cost, latency, and scalability tradeoffs.',
    'In one paragraph, propose a simple pilot plan for evaluating two LLM vendors for internal knowledge search in a regulated enterprise setting (success metrics, timeline, and key risks).'
]

local_rows = []
for task in TASKS:
    local_res = run_local(task)
    local_rows.append({
        'task': task,
        'model': local_res['model'],
        'output': local_res['output'],
        'latency_s': local_res['latency_s'],
        'error': local_res['error']
    })

local_df = pd.DataFrame(local_rows)
print(f'Local run completed for {len(local_df)} tasks.')
local_df



In [None]:
if 'TASKS' not in globals() or 'local_df' not in globals():
    raise RuntimeError('Run the local-task cell first to create TASKS and local_df.')

gemini_rows = []
failed_gemini_rows = []

for task in TASKS:
    gem_res = run_gemini(task)
    if gem_res['ok']:
        gemini_rows.append({
            'task': task,
            'model': gem_res['model'],
            'output': gem_res['output'],
            'latency_s': gem_res['latency_s'],
            'error': gem_res['error']
        })
    else:
        failed_gemini_rows.append({
            'task': task,
            'model': MODEL_ID,
            'error': gem_res['error']
        })

gemini_df = pd.DataFrame(gemini_rows)
failed_gemini_df = pd.DataFrame(failed_gemini_rows)

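if not failed_gemini_df.empty:
    print('Some Gemini calls failed. Re-run this Gemini cell to retry (it re-sends every task).')
    # A bare expression inside an if-block is not rendered, so display the table explicitly.
    from IPython.display import display
    display(failed_gemini_df)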
else:
    print('Gemini run completed for all tasks.')

comparison_df = pd.concat([local_df, gemini_df], ignore_index=True)
print(f'Rows included in comparison_df: {len(comparison_df)} (Local: {len(local_df)}, Gemini successful: {len(gemini_df)})')
comparison_df



In [None]:
comparison_df['quality_score_1_to_5'] = ''
comparison_df['factuality_score_1_to_5'] = ''
comparison_df['notes'] = ''

export_path = 'comparison_df_student_ratings.xlsx'
comparison_df.to_excel(export_path, index=False)
print(f'Saved {export_path}')

if 'failed_gemini_df' in globals() and not failed_gemini_df.empty:
    print('Note: Failed Gemini rows were excluded from the exported XLSX. Re-run the Gemini cell if needed.')

try:
    from google.colab import files
    files.download(export_path)
except Exception:
    print('If not running in Colab, download the file from the notebook working directory.')

comparison_df



## Checkpoint
- Fill in the scoring columns for each output.
- Identify one task where local wins and one where Gemini wins.
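
Once the scoring columns are filled in, the per-model comparison can be summarized with a few lines of Python. The following is a minimal sketch, not part of the original notebook: `summarize_ratings` and `to_number` are hypothetical helper names, and the sketch assumes each row is a dict with the `model`, `quality_score_1_to_5`, `factuality_score_1_to_5`, and `latency_s` keys used in `comparison_df`.

```python
import math
from collections import defaultdict
from statistics import mean

# Column names taken from the export cell above.
SCORE_COLUMNS = ("quality_score_1_to_5", "factuality_score_1_to_5")


def to_number(value):
    """Convert a cell to float; blank, None, NaN, or non-numeric cells become None."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(number) else number


def summarize_ratings(rows):
    """Average each score column and latency per model, skipping unusable cells."""
    collected = defaultdict(lambda: defaultdict(list))
    for row in rows:
        for column in SCORE_COLUMNS + ("latency_s",):
            number = to_number(row.get(column))
            if number is None:
                continue
            # Scores outside the 1-5 rubric range are treated as data-entry errors.
            if column in SCORE_COLUMNS and not 1 <= number <= 5:
                continue
            collected[row["model"]][column].append(number)
    return {
        model: {column: round(mean(values), 2) for column, values in columns.items()}
        for model, columns in collected.items()
    }
```

With pandas, the rated spreadsheet could be loaded and passed in as `summarize_ratings(pd.read_excel('comparison_df_student_ratings.xlsx').to_dict('records'))`. With only three tasks the averages are indicative at best, so read them alongside your notes rather than as a verdict.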

## Reflection
- Which model would you choose for: (a) sensitive data, (b) fastest setup, (c) predictable cost?
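
For the predictable-cost question, it can help to put rough numbers on each option. The sketch below is one possible cost proxy, not a pricing model: both function names are hypothetical, the prices are inputs you must supply from your own provider or hardware bill, and the 4-characters-per-token default is only a crude heuristic for English text.

```python
def hosted_cost_proxy(prompt: str, output: str, price_per_1k_tokens: float,
                      chars_per_token: float = 4.0) -> float:
    """Rough hosted-API cost: estimated tokens times a per-1k-token price you supply.

    chars_per_token is an assumed heuristic, not a real tokenizer count.
    """
    estimated_tokens = (len(prompt) + len(output)) / chars_per_token
    return estimated_tokens / 1000 * price_per_1k_tokens


def local_cost_proxy(latency_s: float, machine_cost_per_hour: float) -> float:
    """Rough local cost: share of an hourly machine cost consumed by one call."""
    return latency_s / 3600 * machine_cost_per_hour
```

The two proxies scale differently: the hosted figure grows with text volume, while the local figure grows with wall-clock time on hardware you pay for whether or not it is busy. That difference is worth keeping in mind when you answer (c).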

## Troubleshooting
- If the local model pull fails, check that the runtime is still connected and has enough free memory and disk space for the model, then re-run the setup cell.
- If Gemini fails, re-run the Gemini cell; failed Gemini rows are not exported to XLSX.
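
If Gemini failures look transient, an alternative to re-running the whole cell by hand is to retry each call a few times with a growing pause. This is a minimal sketch under assumptions: `retry_until_ok` is a hypothetical helper, and it only relies on the `ok` key that `run_gemini` and `run_local` already return.

```python
import time


def retry_until_ok(call, attempts=3, base_delay_s=2.0, sleep=time.sleep):
    """Call `call()` until its result dict has ok=True or the attempts run out.

    The pause doubles after each failure (base_delay_s, 2x, 4x, ...).
    `sleep` is injectable so the helper can be tested without waiting.
    """
    result = None
    for attempt in range(attempts):
        result = call()
        if result.get("ok"):
            return result
        # No pause after the final attempt; the last failure is returned as-is.
        if attempt < attempts - 1:
            sleep(base_delay_s * 2 ** attempt)
    return result
```

In the Gemini cell this could be used as `gem_res = retry_until_ok(lambda: run_gemini(task))`. Note that the recorded `latency_s` then reflects only the final attempt, not the time spent on earlier failures.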

