## Pydantic AI Adapter in Action

#### Setup Path and Reload Modules

In [1]:
import os
import sys

# Add src to path
sys.path.append(os.path.abspath("../src"))

# Load environment variables
from dotenv import load_dotenv
load_dotenv('/home/timmy/RepoAI_AI/.env', override=True)

# Enable autoreload
%load_ext autoreload
%autoreload 2

print("✓ Environment configured")

✓ Environment configured


In [21]:
# Force reload modules
import sys

# Remove cached modules so the next import picks up the latest code
for module_name in ('repoai.llm.pydantic_ai_adapter', 'repoai.llm.router'):
    sys.modules.pop(module_name, None)

print("✓ Modules cleared from cache")

✓ Modules cleared from cache


#### Setup Logging

In [3]:
from repoai.utils.logger import setup_logging
import logging

# Set up at INFO level to see the fallback attempts and model selections
setup_logging(level=logging.INFO, use_colors=True)

print("✓ Logging configured")

✓ Logging configured


#### Import Required Modules

In [22]:
from repoai.llm.pydantic_ai_adapter import PydanticAIAdapter
from repoai.llm.model_roles import ModelRole
from pydantic import BaseModel
from typing import List

print("✓ Modules imported")

✓ Modules imported


#### Test Schemas for Structured Output

In [23]:
class CodeSuggestion(BaseModel):
    language: str
    code: str
    explanation: str

class TaskList(BaseModel):
    tasks: List[str]
    priority: str

class APIExplanation(BaseModel):
    concept: str
    definition: str
    key_features: List[str]
    example_use_case: str

print("✓ Schemas defined")

✓ Schemas defined


#### Initialize Adapter

In [24]:
adapter = PydanticAIAdapter()
print("✓ PydanticAIAdapter initialized")
print("\nRouter configuration:")
print(adapter.router.get_config_summary())

2025-10-24 20:15:32 | INFO     | repoai.llm.router | ModelRouter Initialized.
✓ PydanticAIAdapter initialized

Router configuration:
{'INTAKE': {'primary': 'deepseek/deepseek-chat-v3.1', 'fallbacks': ['alibaba/qwen-max', 'claude-sonnet-4-5-20250929'], 'total_models': 3}, 'PLANNER': {'primary': 'deepseek/deepseek-reasoner-v3.1', 'fallbacks': ['alibaba/qwen3-next-80b-a3b-thinking', 'claude-opus-4-20250514'], 'total_models': 3}, 'PR_NARRATOR': {'primary': 'deepseek/deepseek-chat-v3.1', 'fallbacks': ['claude-haiku-4-5-20251001', 'alibaba/qwen3-235b-a22b-thinking-2507'], 'total_models': 3}, 'CODER': {'primary': 'alibaba/qwen3-coder-480b-a35b-instruct', 'fallbacks': ['Qwen/Qwen2.5-Coder-32B-Instruct', 'deepseek/deepseek-chat-v3.1', 'claude-opus-4-1-20250805'], 'total_models': 4}, 'EMBEDDING': {'primary': 'bge-small', 'fallbacks': [], 'total_models': 1}}
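
The single-line dictionary above is hard to scan. Below is a minimal sketch for printing it one role per line, assuming `get_config_summary()` returns a plain dict shaped like the output shown (role name mapped to `primary`, `fallbacks`, and `total_models`). The `format_summary` helper is hypothetical and not part of the adapter; the literal copies two roles from the output rather than calling the adapter.

```python
def format_summary(summary: dict) -> str:
    """Render a router config summary as one line per role."""
    lines = []
    for role, cfg in summary.items():
        # Roles without fallbacks (e.g. EMBEDDING) get an explicit marker
        fallbacks = ", ".join(cfg["fallbacks"]) or "(none)"
        lines.append(f"{role:<12} primary={cfg['primary']}  fallbacks={fallbacks}")
    return "\n".join(lines)


# Two entries copied from the printed summary above
summary = {
    "INTAKE": {
        "primary": "deepseek/deepseek-chat-v3.1",
        "fallbacks": ["alibaba/qwen-max", "claude-sonnet-4-5-20250929"],
        "total_models": 3,
    },
    "EMBEDDING": {"primary": "bge-small", "fallbacks": [], "total_models": 1},
}

print(format_summary(summary))
```

In the notebook itself, `format_summary(adapter.router.get_config_summary())` would be the equivalent call, provided the returned structure matches this assumption.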

#### Raw Completion

In [17]:
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Explain what Python is in 2 sentences."}
]

output = await adapter.run_raw_async(
    ModelRole.INTAKE, 
    messages, 
    max_output_tokens=100
)

print(f"\nRaw Output: \n{output}\n✓ Length: {len(output)} characters")

2025-10-24 20:00:38 | INFO     | repoai.llm.pydantic_ai_adapter | Starting raw completion: role=INTAKE, fallback=True
2025-10-24 20:00:38 | INFO     | repoai.llm.pydantic_ai_adapter | Attempting raw completion 1/3: model=deepseek/deepseek-chat-v3.1
2025-10-24 20:00:43 | INFO     | repoai.llm.pydantic_ai_adapter | Raw completion succeeded on attempt 1: 4785.54 ms, output length=266

Raw Output: 
Python is a high-level, general-purpose programming language known for its clear and readable syntax. It is widely used for web development, data analysis, artificial intelligence, and scientific computing due to its extensive collection of libraries and frameworks.
✓ Length: 266 characters

#### Structured Output (JSON)

In [17]:
messages = [
    {"role": "system", "content": "You are a coding expert."},
    {"role": "user", "content": "Write a Python function to calculate the nth Fibonacci number using recursion."}
]

result = await adapter.run_json_async(
    role=ModelRole.CODER,
    schema=CodeSuggestion,
    messages=messages,
    max_output_tokens=500
)

print(f"\nResult Type: {type(result)}")
print(f"✓ Is CodeSuggestion? {isinstance(result, CodeSuggestion)}")
print(f"\nLanguage: {result.language}")
print(f"\nCode:\n{result.code}")
print(f"\nExplanation:\n{result.explanation}")

2025-10-24 16:45:35 | INFO     | repoai.llm.pydantic_ai_adapter | Starting JSON completion: role= CODER, Schema= CodeSuggestion, use_fallback= True
2025-10-24 16:45:35 | INFO     | repoai.llm.pydantic_ai_adapter | Attempting JSON completion 1/4: Model: OpenAIChatModel(), role: CODER
2025-10-24 16:45:45 | INFO     | repoai.llm.pydantic_ai_adapter | JSON completion succeeded on attempt 1: 9548.47 ms

Result Type: <class '__main__.CodeSuggestion'>
✓ Is CodeSuggestion? True

Language: Python

Code:
def fibonacci(n):
    if n <= 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)

Explanation:
This recursive function calculates the nth Fibonacci number by breaking down the problem into smaller subproblems. The base cases handle n=0 (returns 0) and n=1 (returns 1). For larger values, it recursively calls itself to compute the sum of the two preceding Fibonacci numbers. Note that this implementation has exponential time complexity

#### Raw Streaming

In [18]:
messages = [
    {"role": "system", "content": "You are a creative writer."},
    {"role": "user", "content": "Write a haiku about artificial intelligence."}
]

print("\nStreaming output:")

chunk_count = 0
async for chunk in adapter.stream_raw_async(
    ModelRole.INTAKE, 
    messages, 
    max_output_tokens=100
):
    print(chunk, end='', flush=True)
    chunk_count += 1

print("\n" + "-" * 40)
print(f"✓ Received {chunk_count} chunks")


Streaming output:
2025-10-24 16:46:22 | INFO     | repoai.llm.pydantic_ai_adapter | Starting raw streaming completion: role=INTAKE
2025-10-24 16:46:22 | INFO     | repoai.llm.pydantic_ai_adapter | Attempting streaming 1/3: model=OpenAIChatModel()
Silicon mind dreams,
On a stream of ones and zeros,
New thoughts take their first breath.
2025-10-24 16:46:25 | INFO     | repoai.llm.pydantic_ai_adapter | Streaming succeeded on attempt 1

----------------------------------------
✓ Received 7 chunks
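
The cell above prints chunks as they arrive. When the full text is also needed afterwards, the chunks can be collected and joined. This is a minimal sketch of that pattern, using a hypothetical `fake_stream` async generator in place of `adapter.stream_raw_async` so it runs without any model access; the chunk contents are made up for illustration.

```python
import asyncio
from typing import AsyncIterator, List, Tuple


async def fake_stream() -> AsyncIterator[str]:
    """Hypothetical stand-in for a streaming call: yields text chunks."""
    for chunk in ["Hel", "lo", ", ", "world"]:
        await asyncio.sleep(0)  # hand control back to the loop, as a network read would
        yield chunk


async def collect(stream: AsyncIterator[str]) -> Tuple[str, int]:
    """Join streamed chunks into the full text and count them."""
    parts: List[str] = []
    async for chunk in stream:
        parts.append(chunk)
    return "".join(parts), len(parts)


text, count = asyncio.run(collect(fake_stream()))
print(repr(text), count)
```

Inside a notebook, where an event loop is already running, `text, count = await collect(...)` would replace the `asyncio.run` call.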


#### Different Model Roles

In [19]:
test_cases = [
    (ModelRole.INTAKE, "What is machine learning?"),
    (ModelRole.PLANNER, "Create a plan for building a todo app"),
    (ModelRole.CODER, "Write a hello world function in Python"),
]

for i, (role, prompt) in enumerate(test_cases, 1):
    print(f"\nTest {i}/{len(test_cases)}: {role.name}")
    print("-" * 40)
    
    output = await adapter.run_raw_async(
        role, 
        [{"role": "user", "content": prompt}],
        max_output_tokens=80
    )
    
    print(f"Prompt: {prompt}")
    print(f"Response: {output[:100]}...")
    print()


Test 1/3: INTAKE
----------------------------------------
2025-10-24 16:48:26 | INFO     | repoai.llm.pydantic_ai_adapter | Starting raw completion: role=INTAKE, fallback=True
2025-10-24 16:48:26 | INFO     | repoai.llm.pydantic_ai_adapter | Attempting raw completion 1/3: model=OpenAIChatModel()
2025-10-24 16:49:12 | INFO     | repoai.llm.pydantic_ai_adapter | Raw completion succeeded on attempt 1: 46062.03 ms, output length=5683
Prompt: What is machine learning?
Response: Of course! Here is a comprehensive yet easy-to-understand explanation of what machine learning is.

...


Test 2/3: PLANNER
----------------------------------------
2025-10-24 16:49:12 | INFO     | repoai.llm.pydantic_ai_adapter | Starting raw completion: role=PLANNER, fallback=True
2025-10-24 16:49:12 | INFO     | repoai.llm.pydantic_ai_adapter | Attempting raw completion 1/3: model=OpenAIChatMod

#### Complex Schema

In [20]:
messages = [
    {"role": "system", "content": "You are a technical educator."},
    {"role": "user", "content": "Explain REST APIs"}
]

result = await adapter.run_json_async(
    role=ModelRole.PLANNER,
    schema=APIExplanation,
    messages=messages,
    max_output_tokens=400
)

print(f"\nConcept: {result.concept}")
print(f"\nDefinition:\n{result.definition}")
print(f"\nKey Features:")
for i, feature in enumerate(result.key_features, 1):
    print(f"  {i}. {feature}")
print(f"\nExample Use Case:\n{result.example_use_case}")

2025-10-24 16:53:59 | INFO     | repoai.llm.pydantic_ai_adapter | Starting JSON completion: role= PLANNER, Schema= APIExplanation, use_fallback= True
2025-10-24 16:53:59 | INFO     | repoai.llm.pydantic_ai_adapter | Attempting JSON completion 1/3: Model: OpenAIChatModel(), role: PLANNER
2025-10-24 16:54:32 | INFO     | repoai.llm.pydantic_ai_adapter | JSON completion succeeded on attempt 1: 33076.75 ms

Concept: REST APIs

Definition:
REST (Representational State Transfer) is an architectural style for designing networked applications that uses standard HTTP methods to interact with resources identified by URIs, enabling stateless and scalable communication between clients and servers.

Key Features:
  1. Stateless: Each request from client to server must contain all information needed, with no session state stored on the server.
  2. Client-Server: 

#### Fallback Behaviour

In [21]:
messages = [
    {"role": "user", "content": "Count from 1 to 5"}
]

print("\nWith fallback enabled (default):")
output = await adapter.run_raw_async(
    ModelRole.INTAKE,
    messages,
    max_output_tokens=50,
    use_fallback=True
)
print(f"✓ Response: {output}")

print("\n" + "=" * 60)
print("Without fallback (primary model only):")
output = await adapter.run_raw_async(
    ModelRole.INTAKE,
    messages,
    max_output_tokens=50,
    use_fallback=False
)
print(f"✓ Response: {output}")


With fallback enabled (default):
2025-10-24 16:56:00 | INFO     | repoai.llm.pydantic_ai_adapter | Starting raw completion: role=INTAKE, fallback=True
2025-10-24 16:56:00 | INFO     | repoai.llm.pydantic_ai_adapter | Attempting raw completion 1/3: model=OpenAIChatModel()
2025-10-24 16:56:03 | INFO     | repoai.llm.pydantic_ai_adapter | Raw completion succeeded on attempt 1: 2815.28 ms, output length=13
✓ Response: 1, 2, 3, 4, 5

Without fallback (primary model only):
2025-10-24 16:56:03 | INFO     | repoai.llm.pydantic_ai_adapter | Starting raw completion: role=INTAKE, fallback=False
2025-10-24 16:56:03 | INFO     | repoai.llm.pydantic_ai_adapter | Using primary model: OpenAIChatModel()
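
The logs suggest that, with fallback enabled, the adapter walks an ordered list of models (`1/3`, `1/4`) and stops at the first success, while `use_fallback=False` restricts it to the primary model. The sketch below illustrates that general pattern only; it is an assumption about the behaviour, not the adapter's actual implementation, and `run_with_fallback`, `flaky_call`, and the model names are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Sequence


async def run_with_fallback(
    model_ids: Sequence[str],
    call: Callable[[str], Awaitable[str]],
    use_fallback: bool = True,
) -> str:
    """Try each model in order and return the first successful output."""
    candidates = list(model_ids) if use_fallback else list(model_ids[:1])
    last_error: Optional[Exception] = None
    for model_id in candidates:
        try:
            return await call(model_id)
        except Exception as exc:  # a real adapter would likely catch narrower errors
            last_error = exc
    raise RuntimeError(f"All {len(candidates)} model(s) failed") from last_error


async def flaky_call(model_id: str) -> str:
    """Hypothetical model call in which the primary model always fails."""
    if model_id == "primary-model":
        raise ConnectionError("primary unavailable")
    return f"answer from {model_id}"


result = asyncio.run(
    run_with_fallback(["primary-model", "backup-model"], flaky_call)
)
print(result)
```

With `use_fallback=False` the same call raises `RuntimeError`, because only `primary-model` is attempted.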

#### Performance Test

In [22]:
import time

test_cases = [
    (ModelRole.INTAKE, {"role": "user", "content": "What is Python?"}),
    (ModelRole.CODER, {"role": "user", "content": "Write a hello function"}),
    (ModelRole.PLANNER, {"role": "user", "content": "Plan a simple web app"}),
]

print("\nRunning performance test...\n")
start = time.perf_counter()  # monotonic clock, better suited to measuring intervals

for i, (role, message) in enumerate(test_cases, 1):
    call_start = time.perf_counter()
    output = await adapter.run_raw_async(role, [message], max_output_tokens=60)
    call_duration = (time.perf_counter() - call_start) * 1000
    
    print(f"Call {i} ({role.name}):")
    print(f"  Duration: {call_duration:.0f}ms")
    print(f"  Output: {output[:60]}...")
    print()

elapsed = time.perf_counter() - start
print(f"✓ Completed {len(test_cases)} calls in {elapsed:.2f}s")
print(f"  Average: {elapsed/len(test_cases):.2f}s per call")


Running performance test...

2025-10-24 16:56:48 | INFO     | repoai.llm.pydantic_ai_adapter | Starting raw completion: role=INTAKE, fallback=True
2025-10-24 16:56:48 | INFO     | repoai.llm.pydantic_ai_adapter | Attempting raw completion 1/3: model=OpenAIChatModel()
2025-10-24 16:57:33 | INFO     | repoai.llm.pydantic_ai_adapter | Raw completion succeeded on attempt 1: 44443.97 ms, output length=5408
Call 1 (INTAKE):
  Duration: 44447ms
  Output: Of course! Here is a comprehensive overview of what Python i...

2025-10-24 16:57:33 | INFO     | repoai.llm.pydantic_ai_adapter | Starting raw completion: role=CODER, fallback=True
2025-10-24 16:57:33 | INFO     | repoai.llm.pydantic_ai_adapter | Attempting raw completion 1/4: model=OpenAIChatModel()
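
The loop above awaits each call in turn, so the total time is roughly the sum of the individual latencies. Independent calls could in principle be issued concurrently with `asyncio.gather`. The sketch below shows the idea with a hypothetical `fake_call` coroutine that sleeps instead of contacting a model; whether the adapter and the providers behind it tolerate concurrent requests is not established here.

```python
import asyncio
import time
from typing import List, Tuple


async def fake_call(name: str, delay: float) -> str:
    """Hypothetical stand-in for a model call; sleeps instead of calling a model."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def run_concurrently() -> Tuple[List[str], float]:
    start = time.perf_counter()
    # gather returns results in argument order, regardless of finish order
    outputs = await asyncio.gather(
        fake_call("INTAKE", 0.05),
        fake_call("CODER", 0.03),
        fake_call("PLANNER", 0.04),
    )
    return list(outputs), time.perf_counter() - start


outputs, elapsed = asyncio.run(run_concurrently())
print(outputs, f"{elapsed:.2f}s")
```

Here the elapsed time should be close to the slowest call rather than the sum of all three.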

#### All Model Information

In [25]:
# Get model for custom agent creation
spec = adapter.get_spec(ModelRole.CODER)
print(f"\nModel for CODER role: {spec.model_id}")

# Get all model IDs with fallback
model_ids = adapter.get_model_ids_with_fallback(ModelRole.INTAKE)
print(f"\nAll model IDs for INTAKE role:")
for i, model_id in enumerate(model_ids, 1):
    print(f"  {i}. {model_id}")

# Get model settings
settings = adapter.get_model_settings(ModelRole.PLANNER)
print(f"\nSettings for PLANNER role:")
print(f"  Temperature: {settings['temperature']}")
print(f"  Max Tokens: {settings['max_tokens']}")

# Get full spec
spec = adapter.get_spec(ModelRole.CODER)
print(f"\nFull spec for CODER:")
print(f"  Model ID: {spec.model_id}")
print(f"  Provider: {spec.provider}")
print(f"  Temperature: {spec.temperature}")
print(f"  Max Output Tokens: {spec.max_output_tokens}")
print(f"  JSON Mode: {spec.json_mode}")


Model for CODER role: alibaba/qwen3-coder-480b-a35b-instruct

All model IDs for INTAKE role:
  1. deepseek/deepseek-chat-v3.1
  2. alibaba/qwen-max
  3. claude-sonnet-4-5-20250929

Settings for PLANNER role:
  Temperature: 0.3
  Max Tokens: 4096

Full spec for CODER:
  Model ID: alibaba/qwen3-coder-480b-a35b-instruct
  Provider: Qwen
  Temperature: 0.2
  Max Output Tokens: 2048
  JSON Mode: False
