# Sophisticated Content-Aware Model Selection

This notebook demonstrates an advanced approach to model selection that looks beyond message count to content complexity (detected via keywords) and total conversation length.

## Key Concepts
- **Multi-Factor Analysis**: Content complexity, length, and message count
- **Keyword Detection**: Automatic complexity recognition
- **Smart Decision Making**: Context-aware model selection

## Selection Factors
1. **Keyword Analysis**: Looks for words like "analysis", "research", "comprehensive"
2. **Content Length**: Long conversations often need better reasoning
3. **Message Count**: Many exchanges suggest complex discussion

## Algorithm Logic
Use GPT-OSS if ANY of these conditions are true:
- Total characters > 3000
- Contains complexity keywords
- Message count > 8

Otherwise, use Qwen3.
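
The rule above can be sketched as a pure function over plain strings, which makes the thresholds easy to check without a running model. The name `choose_model` and the use of raw strings instead of LangChain message objects are assumptions made for illustration, and the keyword list is abbreviated.

```python
# Abbreviated keyword list, for illustration only.
COMPLEX_KEYWORDS = ("analysis", "research", "comprehensive")


def choose_model(texts: list[str]) -> str:
    """Return "gpt-oss" if any trigger fires, otherwise "qwen3"."""
    total_length = sum(len(t) for t in texts)
    has_keyword = any(kw in t.lower() for t in texts for kw in COMPLEX_KEYWORDS)
    if total_length > 3000 or has_keyword or len(texts) > 8:
        return "gpt-oss"
    return "qwen3"


print(choose_model(["Hello there"]))                      # no trigger fires
print(choose_model(["I need a comprehensive analysis"]))  # keyword trigger
print(choose_model(["x" * 3001]))                         # length trigger
print(choose_model(["hi"] * 9))                           # message-count trigger
```

Both thresholds are strict: exactly 3000 characters or exactly 8 messages still selects Qwen3.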

## Prerequisites

Make sure you have the required packages installed and both models available:

```bash
pip install langchain langchain-ollama langchain-community langchain-core langgraph pydantic
ollama serve  # start the server first if it is not already running (use a separate terminal)
ollama pull qwen3
ollama pull gpt-oss
```

In [1]:
# Import required modules
from langchain_ollama import ChatOllama
from langchain.agents import create_agent, AgentState
from langgraph.runtime import Runtime
import tools  # assumed to be a project-local module that provides web_search

## Intelligent Model Selection Function

This function inspects the conversation state and returns a tool-bound model: GPT-OSS if any of the three triggers fires, otherwise Qwen3.

In [2]:
# Define tool list
tool_list = [tools.web_search]

def intelligent_model_select(state: AgentState, runtime: Runtime) -> ChatOllama:
    """Intelligent model selection based on multiple factors."""
    messages = state["messages"]
    message_count = len(messages)
    
    # Factor 1: Calculate total content length
    total_length = sum(
        len(str(msg.content)) 
        for msg in messages 
        if hasattr(msg, 'content') and msg.content
    )
    
    # Factor 2: Check for complexity keywords
    complex_keywords = [
        "analysis", "research", "detailed", "comprehensive", 
        "complex", "strategy", "evaluate", "compare",
        "investigate", "thorough", "in-depth", "sophisticated"
    ]
    
    has_complex_content = any(
        keyword in str(msg.content).lower()
        for msg in messages
        if hasattr(msg, 'content') and msg.content  # skip empty messages once per message
        for keyword in complex_keywords
    )
    
    # Multi-factor decision logic
    if total_length > 3000 or has_complex_content or message_count > 8:
        print(f"  GPT-OSS selected: {message_count} msgs, {total_length} chars, complex_keywords: {has_complex_content}")
        return ChatOllama(
            model="gpt-oss", 
            temperature=0.0, 
            num_predict=2500
        ).bind_tools(tool_list)
    else:
        print(f"  Qwen3 selected: {message_count} msgs, {total_length} chars, complex_keywords: {has_complex_content}")
        return ChatOllama(
            model="qwen3", 
            temperature=0.1, 
            num_predict=1000
        ).bind_tools(tool_list)

print("Intelligent model selection function defined!")
print("Factors: content length, complexity keywords, message count")

Intelligent model selection function defined!
Factors: content length, complexity keywords, message count


## Creating the Smart Agent

Create an agent that uses our intelligent model selection:

In [3]:
# Create agent with intelligent model selection
agent = create_agent(intelligent_model_select, tools=tool_list)

print("Smart agent created successfully!")
print("This agent analyzes content complexity to choose the best model")

Smart agent created successfully!
This agent analyzes content complexity to choose the best model


## Test 1: Simple Query (Should Use Qwen3)

Let's test with a simple query that should trigger Qwen3:

In [4]:
print("=== Testing Simple Query (Should Use Qwen3) ===")

result1 = agent.invoke({
    "messages": "Hello there"
})

print("\nSimple query result processed successfully")

=== Testing Simple Query (Should Use Qwen3) ===
  Qwen3 selected: 1 msgs, 11 chars, complex_keywords: False

Simple query result processed successfully


## Test 2: Complex Query with Keywords (Should Use GPT-OSS)

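Now let's test with a query containing complexity keywords. The selector runs before every model call, so the output shows one selection line per agent step as tool calls and their results are added to the conversation: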

In [5]:
print("=== Testing Complex Query with Keywords (Should Use GPT-OSS) ===")

complex_query = "I need a comprehensive analysis and detailed research on market strategies for AI companies"

result2 = agent.invoke({
    "messages": complex_query
})

print("\nNotice: Complex keywords triggered GPT-OSS even for a single message")

=== Testing Complex Query with Keywords (Should Use GPT-OSS) ===
  GPT-OSS selected: 1 msgs, 91 chars, complex_keywords: True
  GPT-OSS selected: 3 msgs, 1677 chars, complex_keywords: True
  GPT-OSS selected: 5 msgs, 3310 chars, complex_keywords: True
  GPT-OSS selected: 7 msgs, 4847 chars, complex_keywords: True

Notice: Complex keywords triggered GPT-OSS even for a single message


## Test 3: Long Content (Should Use GPT-OSS)

Test with a message that exceeds the character threshold:

In [6]:
print("=== Testing Long Content (Should Use GPT-OSS) ===")

# Create a long message that exceeds 3000 characters
long_message = "Please help me understand this topic. " * 200  # 38 chars * 200 = 7600 chars

result3 = agent.invoke({
    "messages": long_message
})

print(f"\nLong content ({len(long_message)} chars, above the 3000-char threshold) triggered GPT-OSS")

=== Testing Long Content (Should Use GPT-OSS) ===
  GPT-OSS selected: 1 msgs, 7600 chars, complex_keywords: False

Long content (7600 chars, above the 3000-char threshold) triggered GPT-OSS


## Real-World Benefits

### Immediate Complex Query Handling
- A single complex question immediately gets the better model
- No need to wait for conversation to grow
- Better user experience for demanding tasks

### Efficient Resource Usage
- Short, simple exchanges stay on the lighter Qwen3 model
- The larger model is used only when a trigger fires
- Lower cost on routine queries without giving up quality on demanding ones

### Context-Aware Decisions
- Content analysis provides better context than message count alone
- Keyword detection catches complexity early
- Combining factors reduces false negatives: a request missed by one trigger can still be caught by another

## Customization Examples

### Industry-Specific Keywords

In [None]:
# Example: Legal industry keywords
legal_keywords = [
    "contract", "litigation", "compliance", "regulation",
    "precedent", "statute", "brief", "discovery"
]

# Example: Medical industry keywords
medical_keywords = [
    "diagnosis", "treatment", "symptoms", "pathology",
    "clinical", "therapeutic", "pharmaceutical", "protocol"
]

# Example: Financial industry keywords
financial_keywords = [
    "portfolio", "investment", "risk", "valuation",
    "derivatives", "compliance", "audit", "forecast"
]

print("Industry-specific keyword sets:")
print(f"Legal: {', '.join(legal_keywords[:4])}...")
print(f"Medical: {', '.join(medical_keywords[:4])}...")
print(f"Financial: {', '.join(financial_keywords[:4])}...")
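
One way to plug a domain set into a selector is to merge it with the general keywords and compile a single case-insensitive pattern. This is a sketch rather than part of the notebook's selector: `build_keyword_pattern` is a hypothetical helper, and the whole-word matching it uses is stricter than the plain substring test used earlier (so "brief" does not match "briefly").

```python
import re


def build_keyword_pattern(*keyword_sets):
    """Compile one regex that matches any keyword as a whole word."""
    keywords = sorted({kw.lower() for ks in keyword_sets for kw in ks})
    if not keywords:
        raise ValueError("at least one keyword is required")
    alternation = "|".join(re.escape(kw) for kw in keywords)
    return re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)


general = ["analysis", "research", "comprehensive"]
legal = ["contract", "litigation", "compliance", "brief"]
pattern = build_keyword_pattern(general, legal)

print(bool(pattern.search("Please review this CONTRACT")))  # whole-word match
print(bool(pattern.search("Answer briefly, please")))       # "brief" inside "briefly"
```

Whether whole-word or substring matching is the better fit depends on the domain: substring matching also catches inflected forms such as "researching", while whole-word matching avoids accidental hits inside longer words.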

### Advanced Thresholds

In [None]:
def advanced_model_selector(state: AgentState, runtime: Runtime) -> ChatOllama:
    """Example of more sophisticated selection logic."""
    messages = state["messages"]
    
    # Calculate various metrics
    message_count = len(messages)
    total_length = sum(len(str(msg.content)) for msg in messages if hasattr(msg, 'content') and msg.content)
    avg_message_length = total_length / message_count if message_count > 0 else 0
    
    # Advanced keyword categories
    complexity_keywords = ["analysis", "research", "comprehensive"]
    urgency_keywords = ["urgent", "immediate", "asap", "critical"]
    technical_keywords = ["algorithm", "implementation", "architecture", "optimization"]
    
    # Check for different types of complexity
    content_text = " ".join(str(msg.content) for msg in messages if hasattr(msg, 'content') and msg.content).lower()
    
    has_complexity = any(kw in content_text for kw in complexity_keywords)
    has_urgency = any(kw in content_text for kw in urgency_keywords)
    has_technical = any(kw in content_text for kw in technical_keywords)
    
    # Scoring system
    score = 0
    score += message_count * 0.5  # Each message adds 0.5 points
    score += total_length / 1000  # Each 1000 chars adds 1 point
    score += 3 if has_complexity else 0  # Complexity adds 3 points
    score += 2 if has_technical else 0   # Technical adds 2 points
    score += 1 if has_urgency else 0     # Urgency adds 1 point
    
    # Decision based on score
    if score >= 5.0:  # Threshold for GPT-OSS
        print(f"  GPT-OSS selected (score: {score:.1f})")
        return ChatOllama(model="gpt-oss", temperature=0.0, num_predict=2500).bind_tools(tool_list)
    else:
        print(f"  Qwen3 selected (score: {score:.1f})")
        return ChatOllama(model="qwen3", temperature=0.1, num_predict=1000).bind_tools(tool_list)

print("Advanced scoring-based model selector defined!")
print("Uses weighted scoring: messages + length + keyword categories")
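
To see how the weights interact, the scoring arithmetic can be pulled out into a standalone function and evaluated by hand. `complexity_score` is a hypothetical helper that mirrors the weights above; it takes precomputed flags rather than messages.

```python
def complexity_score(message_count, total_length, has_complexity=False,
                     has_technical=False, has_urgency=False):
    """Weighted score using the same weights as advanced_model_selector."""
    score = message_count * 0.5          # 0.5 points per message
    score += total_length / 1000         # 1 point per 1000 characters
    score += 3 if has_complexity else 0  # complexity keywords
    score += 2 if has_technical else 0   # technical keywords
    score += 1 if has_urgency else 0     # urgency keywords
    return score


# One 91-character message with a complexity keyword:
# 0.5 + 0.091 + 3 = 3.591, below the 5.0 threshold.
print(f"{complexity_score(1, 91, has_complexity=True):.3f}")

# Three messages totalling 1677 characters with the same flag:
# 1.5 + 1.677 + 3 = 6.177, at or above the 5.0 threshold.
print(f"{complexity_score(3, 1677, has_complexity=True):.3f}")
```

Unlike the OR rule in `intelligent_model_select`, a single complexity keyword in a short first message does not reach the 5.0 threshold on its own under these weights.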

## Best Practices

### 1. Monitor and Adjust
- Track model selection decisions
- Analyze which factors trigger switches most often
- Adjust thresholds based on usage patterns
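
A minimal sketch of such tracking, assuming the selector is modified to call a hypothetical `record_decision` helper each time it picks a model. It uses only the standard library and the thresholds from `intelligent_model_select`.

```python
from collections import Counter

trigger_counts = Counter()  # how often each trigger fired
model_counts = Counter()    # how often each model was chosen


def record_decision(total_length, has_complex_content, message_count):
    """Record which model was picked and which triggers caused it."""
    triggers = []
    if total_length > 3000:
        triggers.append("length")
    if has_complex_content:
        triggers.append("keywords")
    if message_count > 8:
        triggers.append("message_count")
    model = "gpt-oss" if triggers else "qwen3"
    model_counts[model] += 1
    trigger_counts.update(triggers)
    return model


# Replay the three test cases from this notebook.
record_decision(11, False, 1)
record_decision(91, True, 1)
record_decision(7600, False, 1)
print(dict(model_counts))
print(dict(trigger_counts))
```

Reviewing these counters over real traffic shows which threshold is doing most of the work and which might be set too low.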

### 2. Domain-Specific Tuning
- Create keyword sets for your specific domain
- Adjust character thresholds based on typical content
- Consider user behavior patterns

### 3. Performance Optimization
- Cache keyword lookups for frequently used terms
- Use efficient string matching algorithms
- Consider preprocessing for better performance
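
Because the selector runs before every model call, earlier messages are rescanned on each step. One option, sketched here under the assumption that message text is available as a plain string, is to compile the keywords once and memoize the per-message check with `functools.lru_cache`. `text_has_keyword` is a hypothetical helper name, and the keyword list is abbreviated.

```python
import re
from functools import lru_cache

# Abbreviated keyword list, compiled once at import time.
KEYWORD_PATTERN = re.compile(
    r"analysis|research|comprehensive|detailed", re.IGNORECASE
)


@lru_cache(maxsize=1024)
def text_has_keyword(text: str) -> bool:
    """Return True if the text contains any keyword; results are memoized."""
    return KEYWORD_PATTERN.search(text) is not None


history = ["Hello there", "Give me a detailed plan"]
print(any(text_has_keyword(t) for t in history))  # first pass scans both texts
print(any(text_has_keyword(t) for t in history))  # second pass is served from the cache
print(text_has_keyword.cache_info())
```

For short conversations the saving is negligible; this only starts to matter with long histories or large keyword sets.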

### 4. User Experience
- Provide feedback about model switches (if appropriate)
- Ensure smooth transitions between models
- Handle edge cases gracefully
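
One edge case worth handling: LangChain message `content` can be a list of content blocks rather than a plain string, in which case `str(msg.content)` measures the printed form of the list instead of the text. A hedged sketch of a normalizing helper follows; `message_text` is a hypothetical name, and the `{"type": "text", "text": ...}` block shape is assumed from the common text-block format.

```python
def message_text(content) -> str:
    """Flatten message content (a string or a list of blocks) to plain text."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        parts = []
        for block in content:
            if isinstance(block, str):
                parts.append(block)
            elif isinstance(block, dict) and block.get("type") == "text":
                parts.append(str(block.get("text", "")))
        return " ".join(parts)
    return ""  # None or an unexpected type counts as empty


blocks = [
    {"type": "text", "text": "Compare these two charts"},
    {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
]
print(len(str(blocks)))           # length of the list's printed form
print(len(message_text(blocks)))  # length of the text the user actually wrote
```

Using a helper like this in place of `str(msg.content)` keeps the character threshold tied to actual text rather than to serialized metadata.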

### 5. Cost Management
- Monitor actual cost savings vs. performance gains
- Track false positives (unnecessary upgrades)
- Balance cost optimization with user satisfaction

## Conclusion

This multi-factor approach provides:
- **Better accuracy** in model selection than message count alone
- **Immediate upgrades** for complex queries, even on the first message
- **Cost efficiency**, since simple exchanges stay on the lighter model
- **Flexibility** for domain-specific customization
- **Scalability** for various use cases