# Dynamic Model Selection (Qwen3 → GPT-OSS)

This notebook demonstrates a cost-optimization strategy where the agent automatically switches between models based on conversation length, used here as a simple proxy for complexity.

## Key Concepts
- **Cost Optimization**: Use the cheaper model whenever it is sufficient
- **Performance Scaling**: Switch to the stronger model for complex tasks
- **Automatic Decision-Making**: No manual switching required

## Selection Logic
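- **< 10 messages**: Use Qwen3 (fast, efficient)
- **≥ 10 messages**: Use GPT-OSS (better reasoning, longer context)

The count covers every message in the agent state, so AI replies and tool results count alongside user messages.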
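
The rule above can be written as a small pure function, which makes the boundary at 10 easy to check without starting Ollama. This is only a sketch: `pick_model_name` and its `threshold` parameter are hypothetical names that do not appear elsewhere in the notebook.

```python
def pick_model_name(message_count: int, threshold: int = 10) -> str:
    """Return the model name the length-based rule would choose."""
    if message_count < 0:
        raise ValueError("message_count must be non-negative")
    # Strictly below the threshold stays on the cheaper model
    return "qwen3" if message_count < threshold else "gpt-oss"


# Check both sides of the boundary
for n in (1, 9, 10, 13):
    print(n, pick_model_name(n))
```

Keeping the decision separate from model construction also lets the threshold be tuned in one place.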

## Real-World Applications
- Customer service bots (simple queries → Qwen3, complex issues → GPT-OSS)
- Research assistants (quick facts → Qwen3, analysis → GPT-OSS)

In [None]:
# ollama pull qwen3    <- the smaller, cheaper model
# ollama pull gpt-oss  <- the higher-end model with stronger reasoning

In [1]:
# Import required modules
from langchain_ollama import ChatOllama
from langchain.agents import create_agent, AgentState
from langgraph.runtime import Runtime
import tools  # local module that provides web_search and analyze_data

## Model Selection Function

This function automatically chooses between Qwen3 and GPT-OSS based on conversation length:

In [2]:
# Define tool list for both models
tool_list = [tools.web_search, tools.analyze_data]

def select_model(state: AgentState, runtime: Runtime) -> ChatOllama:
    """Choose between Qwen3 and GPT-OSS based on conversation length."""
    messages = state["messages"]
    message_count = len(messages)
    
    if message_count < 10:
        print(f"  Using Qwen3 for {message_count} messages (efficient)")
        return ChatOllama(model="qwen3", temperature=0.1).bind_tools(tool_list)
    else:
        print(f"  Switching to GPT-OSS for {message_count} messages (advanced)")
        return ChatOllama(model="gpt-oss", temperature=0.0, num_predict=2000).bind_tools(tool_list)

print("Model selection function defined!")
print("Logic: < 10 messages = Qwen3, >= 10 messages = GPT-OSS")

Model selection function defined!
Logic: < 10 messages = Qwen3, >= 10 messages = GPT-OSS
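
As written, `select_model` constructs a new `ChatOllama` and re-binds the tools on every model call. Whether that matters in practice depends on the client, but one possible refinement is to build each configured model once and reuse it. The sketch below shows the idea with stand-in factories so it runs without Ollama; `ModelCache` and `make_factory` are hypothetical helpers, not part of LangChain.

```python
from typing import Callable, Dict


class ModelCache:
    """Build each named model on first use and reuse it afterwards."""

    def __init__(self, factories: Dict[str, Callable[[], object]]):
        self._factories = factories
        self._instances: Dict[str, object] = {}

    def get(self, name: str) -> object:
        if name not in self._instances:
            self._instances[name] = self._factories[name]()
        return self._instances[name]


# Stand-in factories that record each build. In the notebook a factory
# could instead be something like:
#   lambda: ChatOllama(model="qwen3", temperature=0.1).bind_tools(tool_list)
build_log = []


def make_factory(name: str) -> Callable[[], object]:
    def build() -> object:
        build_log.append(name)
        return {"model": name}
    return build


cache = ModelCache({
    "qwen3": make_factory("qwen3"),
    "gpt-oss": make_factory("gpt-oss"),
})

cache.get("qwen3")
cache.get("qwen3")    # reused, no second build
cache.get("gpt-oss")
print(build_log)      # ['qwen3', 'gpt-oss']
```

With a cache like this, a selection function would only decide on a name and return `cache.get(name)`.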


## Creating the Dynamic Agent

Create an agent that uses our dynamic model selection function:

In [3]:
# Create agent with dynamic model selection
agent = create_agent(select_model, tools=tool_list)

print("Dynamic agent created successfully!")
print("This agent will automatically switch models based on conversation complexity")

Dynamic agent created successfully!
This agent will automatically switch models based on conversation complexity


## Test 1: Short Conversation (Qwen3)

Let's test with a simple query that should use Qwen3:

In [4]:
print("=== Testing Short Conversation (Should Use Qwen3) ===")

result1 = agent.invoke({
    "messages": "Search for AI news"
})

print(f"\nShort conversation result: {result1['messages'][-1].content}")

=== Testing Short Conversation (Should Use Qwen3) ===
  Using Qwen3 for 1 messages (efficient)
  Using Qwen3 for 3 messages (efficient)

Short conversation result: <think>
Okay, let me process these search results for AI news. The user asked for AI news, so I need to filter out any irrelevant links. The first result is about Air New Zealand, which doesn't seem related to AI. The other four links are from MIT News, Nature, Reuters, and CNBC, which are all credible sources.

Looking at the second result, MIT News has an article about a new generative AI approach for predicting chemical reactions. That's definitely relevant. The third link from Nature discusses AI in materials discovery, which is a hot topic. The fourth one is about Bollywood stars and AI, which might be more about AI's impact on entertainment and privacy. The fifth link from CNBC talks about Microsoft's AI chips, which is tech news related to AI infrastructure.

I should summarize each of these, making sure to highlight 

## Test 2: Long Conversation (GPT-OSS)

Now let's simulate a longer conversation that should trigger GPT-OSS:

In [5]:
print("=== Testing Long Conversation (Should Use GPT-OSS) ===")

# Simulate conversation state with many messages
long_messages = "This is message number 12 in our conversation. I need complex analysis."

# Create a new agent instance for this test
agent_with_history = create_agent(select_model, tools=tool_list)

result2 = agent_with_history.invoke({
    "messages": [f"Message {i}" for i in range(12)] + [long_messages]
})

print("\nLong conversation triggered model switch to GPT-OSS")

=== Testing Long Conversation (Should Use GPT-OSS) ===
  Switching to GPT-OSS for 13 messages (advanced)

Long conversation triggered model switch to GPT-OSS
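
The simulated history above consists only of bare strings, which are treated as user messages. A history closer to a real conversation alternates user and assistant turns. The sketch below builds one as `(role, content)` tuples, assuming the agent's message handling accepts that form as LangChain chat models generally do; `build_history` is a hypothetical helper.

```python
def build_history(turns: int) -> list[tuple[str, str]]:
    """Alternate user and assistant turns as (role, content) tuples."""
    history = []
    for i in range(turns):
        role = "user" if i % 2 == 0 else "assistant"
        history.append((role, f"Message {i}"))
    return history


history = build_history(12)
history.append(("user", "I need complex analysis."))

print(len(history))  # 13, the same count as in the test above
```

Under that assumption, the result could be passed as `{"messages": history}` in place of the list of strings.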


## Interactive Demo

Let's run a scripted demo that replays a growing conversation so you can watch the model switch:

In [7]:
def demo_conversation_progression():
    """Demonstrate how the agent switches models as conversation grows."""
    conversation_messages = []
    
    # Simulate a growing conversation
    test_messages = [
        "Hello", "How are you?", "What's the weather?", "Tell me about AI",
        "Explain machine learning", "What about deep learning?", "Show me examples",
        "How does this work?", "Give me more details", "I need comprehensive analysis",
        "Please provide research data", "Analyze this thoroughly"
    ]
    
    # One agent is enough: select_model runs again on every model call
    agent = create_agent(select_model, tools=tool_list)

    for i, message in enumerate(test_messages, 1):
        conversation_messages.append(message)
        count = len(conversation_messages)

        print(f"\n=== Message {i}: '{message}' ===")

        # Predict the model from the user-message count alone; the agent may
        # still switch mid-turn once tool messages push the state past 10
        switched = count >= 10
        if switched:
            print(f"Would use GPT-OSS ({count} messages) - SWITCHED!")
        else:
            print(f"Would use Qwen3 ({count} messages)")

        # Run the agent on the conversation so far
        agent.invoke({"messages": conversation_messages})

        if switched:
            break  # Stop demo after switch

demo_conversation_progression()


=== Message 1: 'Hello' ===
Would use Qwen3 (1 messages)
  Using Qwen3 for 1 messages (efficient)

=== Message 2: 'How are you?' ===
Would use Qwen3 (2 messages)
  Using Qwen3 for 2 messages (efficient)

=== Message 3: 'What's the weather?' ===
Would use Qwen3 (3 messages)
  Using Qwen3 for 3 messages (efficient)

=== Message 4: 'Tell me about AI' ===
Would use Qwen3 (4 messages)
  Using Qwen3 for 4 messages (efficient)

=== Message 5: 'Explain machine learning' ===
Would use Qwen3 (5 messages)
  Using Qwen3 for 5 messages (efficient)
  Using Qwen3 for 9 messages (efficient)

=== Message 6: 'What about deep learning?' ===
Would use Qwen3 (6 messages)
  Using Qwen3 for 6 messages (efficient)

=== Message 7: 'Show me examples' ===
Would use Qwen3 (7 messages)
  Using Qwen3 for 7 messages (efficient)
  Using Qwen3 for 9 messages (efficient)

=== Message 8: 'How does this work?' ===
Would use Qwen3 (8 messages)
  Using Qwen3 for 8 messages (efficient)
  Switching to GPT-OSS for 10 messages
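
The log shows `select_model` running more than once within a single `invoke`, with the count jumping (1 to 3 in Test 1, 8 to 10 here). This is consistent with the agent loop appending the model's tool-call message and each tool result before calling the model again, so the switch can happen mid-turn. The sketch below is a simplified simulation of that bookkeeping, not the agent's actual implementation; `counts_per_model_call` and `first_switch` are hypothetical helpers, and a single round of tool calls per turn is assumed.

```python
THRESHOLD = 10  # same cutoff as select_model


def counts_per_model_call(history_len: int, tool_calls: int) -> list[int]:
    """Message counts seen at each model call during one invoke.

    Assumes one round of tool use: the first model call sees the history,
    then one AI message plus one result per tool call is appended before
    the model is called again for the final answer.
    """
    counts = [history_len]
    if tool_calls > 0:
        counts.append(history_len + 1 + tool_calls)
    return counts


def first_switch(counts: list[int]):
    """Index of the first model call at or past the threshold, else None."""
    for index, count in enumerate(counts):
        if count >= THRESHOLD:
            return index
    return None


print(counts_per_model_call(1, 1))                # [1, 3]
print(counts_per_model_call(8, 1))                # [8, 10]
print(first_switch(counts_per_model_call(8, 1)))  # 1: second call switches
```

This explains why the demo's prediction, based on user messages alone, can lag behind the model the agent actually selects.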