# Day 11: Advanced Patterns - Reasoning and Thinking Models

## Unlocking Advanced Capabilities

Today you'll learn about advanced agent patterns including reasoning models!

### What's Special About Reasoning Models?

**Regular LLMs**: Think → Answer (fast but less accurate)

**Reasoning models** (QwQ, Qwen3 with thinking): Think → Reason → Verify → Answer (slower but more accurate)

**Example:**
```
Regular: "What's 15 factorial?"
→ "1307674368000" (guessed)

Reasoning: "What's 15 factorial?"
→ <think>I need to calculate 15! step by step...
   15 * 14 = 210
   210 * 13 = 2730...
   Final: 1307674368000</think>
→ "The answer is 1,307,674,368,000" (verified)
```
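
The arithmetic in this trace can be checked directly. The following is a minimal sketch using only the standard library (it is not part of the official examples): it multiplies step by step, as the `<think>` block does, and compares the result against `math.factorial`.

```python
import math

# Multiply 15 * 14 * ... * 1 one factor at a time, mirroring the trace above.
running = 15
steps = []
for factor in range(14, 0, -1):
    running *= factor
    steps.append(running)

print(steps[:2])                       # first two intermediate products
print(running == math.factorial(15))   # cross-check against the stdlib
print(f"{running:,}")                  # final value with thousands separators
```

The first two intermediate products match the `210` and `2730` shown in the trace.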

### Today's Topics:
1. **QwQ-32B reasoning model** - Deep thinking for complex tasks
2. **enable_thinking parameter** - Control thinking mode
3. **thought_in_content** - Handling thinking in different APIs
4. **fncall_prompt_type** - Tool calling templates
5. **Real examples** - From official assistant_qwq.py and assistant_qwen3.py

Let's explore advanced patterns! 🧠

---
## Part 1: Setup

In [None]:
import os

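# Never hardcode API keys in a notebook. Set the key in your shell first,
# e.g. `export FIREWORKS_API_KEY=...`, and read it from the environment here.
if 'FIREWORKS_API_KEY' not in os.environ:
    raise RuntimeError('Set the FIREWORKS_API_KEY environment variable first')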

llm_cfg = {
    'model': 'accounts/fireworks/models/qwen3-235b-a22b-thinking-2507',
    'model_server': 'https://api.fireworks.ai/inference/v1',
    'api_key': os.environ['FIREWORKS_API_KEY'],
    'generate_cfg': {'max_tokens': 32768}
}

print('✅ Configured')

---
## Part 2: QwQ-32B Reasoning Model

### What is QwQ?

**QwQ-32B** is a reasoning model that shows its thinking process.

From official `assistant_qwq.py`:

```python
llm_cfg = {
    'model': 'qwq-32b',
    'model_type': 'qwen_dashscope',
    'generate_cfg': {
        'fncall_prompt_type': 'nous',  # Recommended for QwQ
    }
}
```

### Key Configuration for Reasoning Models

1. **fncall_prompt_type: 'nous'**
   - Better for Qwen2.5+ and QwQ
   - Supports parallel function calls
   - Default in newer versions

2. **thought_in_content**
   - Use when old vLLM doesn't support `reasoning_content` field
   - When thinking is mixed with answer: `<think>...</think>answer`
   - Don't use when thinking is separate field

In [None]:
from qwen_agent.agents import Assistant

# Our Fireworks model supports reasoning
# It's Qwen3-235B-A22B-Thinking-2507 which has thinking capability

reasoning_bot = Assistant(
    llm=llm_cfg,
    name='Reasoning Assistant',
    system_message='You solve problems step-by-step. Show your reasoning.'
)

print("✅ Created reasoning assistant")

In [None]:
# Test with a problem that benefits from reasoning
messages = [{
    'role': 'user',
    'content': 'If Alice has twice as many apples as Bob, and Bob has 3 more apples than Charlie, and Charlie has 5 apples, how many apples does Alice have?'
}]

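print("Question: Complex word problem\n")
# run() streams incremental snapshots of the reply, so keep the last one
# instead of stopping at the first (partial) snapshot.
final_response = []
for response in reasoning_bot.run(messages):
    final_response = response

if final_response:
    answer = final_response[-1].get('content', '')
    print(f"Answer: {answer}\n")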

---
## Part 3: enable_thinking Parameter

### Different API Approaches

From official `assistant_qwen3.py`, there are 3 ways to configure thinking:

#### Approach 1: DashScope Native API
```python
llm_cfg = {
    'model': 'qwen3-235b-a22b',
    'model_type': 'qwen_dashscope',
    'generate_cfg': {
        'enable_thinking': True,  # Enable thinking mode
    }
}
```

#### Approach 2: DashScope OpenAI-Compatible API
```python
llm_cfg = {
    'model': 'qwen3-235b-a22b',
    'model_server': 'https://dashscope.aliyuncs.com/compatible-mode/v1',
    'generate_cfg': {
        'extra_body': {
            'enable_thinking': False  # Control via extra_body
        }
    }
}
```

#### Approach 3: vLLM/SGLang Self-Hosted
```python
llm_cfg = {
    'model': 'Qwen/Qwen3-32B',
    'model_server': 'http://localhost:8000/v1',
    'generate_cfg': {
        'extra_body': {
            'chat_template_kwargs': {'enable_thinking': False}
        }
    }
}
```
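
The three approaches differ only in where the `enable_thinking` flag is nested. As a rough sketch, a hypothetical helper `thinking_generate_cfg` could build the matching `generate_cfg` for each. This helper is not part of Qwen-Agent, and the backend labels are made up for illustration.

```python
def thinking_generate_cfg(backend: str, enabled: bool) -> dict:
    """Return a generate_cfg dict that toggles thinking for one backend.

    `backend` is a hypothetical label: 'dashscope', 'dashscope_openai' or 'vllm'.
    """
    if backend == 'dashscope':
        # Approach 1: flag sits directly in generate_cfg
        return {'enable_thinking': enabled}
    if backend == 'dashscope_openai':
        # Approach 2: flag is passed through extra_body
        return {'extra_body': {'enable_thinking': enabled}}
    if backend == 'vllm':
        # Approach 3: flag is forwarded to the chat template
        return {'extra_body': {'chat_template_kwargs': {'enable_thinking': enabled}}}
    raise ValueError(f'unknown backend: {backend!r}')


print(thinking_generate_cfg('vllm', False))
```

The returned dict would be placed under the `'generate_cfg'` key of `llm_cfg`, next to the `model` and `model_server` entries shown in the approaches.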

In [None]:
# Example: Using our Fireworks Qwen3 thinking model
# The "thinking" variant named in the model ID reasons by default

print("Our model configuration:")
print(f"Model: {llm_cfg['model']}")
print("\nThis model has built-in thinking capability!")
print("The 'thinking' in the name marks the reasoning variant; '2507' is the release version tag.")
print("\nNo need to explicitly enable thinking - it's already optimized for reasoning tasks.")

---
## Part 4: thought_in_content Parameter

### When to Use It?

From `assistant_qwq.py` comments:

```python
'generate_cfg': {
    # This parameter needs to be passed when:
    # 1. Using reasoning model (e.g. qwq-32b)
    # 2. Deployed with old vLLM that doesn't support reasoning_content field
    # 3. Content format is: `<think>thought</think>answer`
    
    # 'thought_in_content': True,  # Uncomment if needed
}
```

### Two Formats:

**Format 1: Separate fields** (modern)
```json
{
  "reasoning_content": "Let me think...",
  "content": "The answer is 42"
}
```
→ Don't use `thought_in_content`

**Format 2: Mixed in content** (old vLLM)
```json
{
  "content": "<think>Let me think...</think>The answer is 42"
}
```
→ Use `thought_in_content: True`
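
To make the mixed format concrete, here is a minimal sketch of separating the thought from the answer in a `<think>thought</think>answer` string. The helper `split_thought` is hypothetical and only illustrates the idea; it is not how Qwen-Agent implements `thought_in_content` internally.

```python
import re
from typing import Tuple

# Matches the first <think>...</think> span, including newlines inside it.
THINK_RE = re.compile(r'<think>(.*?)</think>', re.DOTALL)


def split_thought(content: str) -> Tuple[str, str]:
    """Split '<think>thought</think>answer' into (thought, answer)."""
    match = THINK_RE.search(content)
    if match is None:
        # No thinking block: everything is the answer.
        return '', content.strip()
    thought = match.group(1).strip()
    answer = (content[:match.start()] + content[match.end():]).strip()
    return thought, answer


thought, answer = split_thought('<think>Let me think...</think>The answer is 42')
print(thought)
print(answer)
```

The two returned strings correspond to the `reasoning_content` and `content` fields of the separate-fields format.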

---
## Part 5: fncall_prompt_type - Tool Calling Templates

### Available Templates

1. **'nous'** (recommended for Qwen2.5+, QwQ)
   - Better tool calling
   - Parallel function calls
   - Default in newer versions

2. **'qwen'** (legacy)
   - Original Qwen format
   - Still works but 'nous' is better

From official examples:
```python
# QwQ and Qwen3 examples use:
'generate_cfg': {
    'fncall_prompt_type': 'nous'
}
```
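
For intuition about parallel function calls, the sketch below parses a model reply that contains two tool calls. It assumes the Hermes-style wire format in which each call is a JSON object wrapped in `<tool_call>...</tool_call>` tags. That format, the tool names, and the helper `parse_tool_calls` are assumptions for illustration. Qwen-Agent does this parsing for you when `fncall_prompt_type` is `'nous'`.

```python
import json
import re

# Assumed wire format: one JSON object per <tool_call>...</tool_call> block.
TOOL_CALL_RE = re.compile(r'<tool_call>(.*?)</tool_call>', re.DOTALL)


def parse_tool_calls(text: str) -> list:
    """Return every tool call found in `text` as a dict."""
    return [json.loads(raw) for raw in TOOL_CALL_RE.findall(text)]


# Hypothetical reply that requests two tools in one turn.
sample = (
    '<tool_call>\n{"name": "get_weather", "arguments": {"city": "Paris"}}\n</tool_call>\n'
    '<tool_call>\n{"name": "get_time", "arguments": {"timezone": "Europe/Paris"}}\n</tool_call>'
)

for call in parse_tool_calls(sample):
    print(call['name'], call['arguments'])
```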

In [None]:
# Example: Assistant with tools using 'nous' template
tool_bot = Assistant(
    llm=llm_cfg,
    function_list=['code_interpreter'],
    system_message='You are a helpful coding assistant.'
)

print("✅ Created assistant with code_interpreter")
print("   Using default fncall_prompt_type (optimized for Qwen3)\n")

# Test it
messages = [{'role': 'user', 'content': 'Calculate the sum of squares from 1 to 10'}]

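# run() streams growing snapshots of the message list; keep only the final one
final_response = []
for response in tool_bot.run(messages):
    final_response = response

for msg in final_response:
    if msg.get('function_call'):
        print(f"🔧 Calling: {msg['function_call']['name']}")
    elif msg.get('role') == 'assistant' and msg.get('content'):
        print(f"Result: {msg['content'][:150]}\n")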

---
## Part 6: Real Example from Official Code

### QwQ Image Generation Demo

From `assistant_qwq.py` - complete working example:

In [None]:
# Adapted from assistant_qwq.py
import urllib.parse
import json
from qwen_agent.tools.base import BaseTool, register_tool

# Register image generation tool (from official example)
@register_tool('image_gen')
class ImageGen(BaseTool):
    description = 'AI painting (image generation) service'
    parameters = [{
        'name': 'prompt',
        'type': 'string',
        'description': 'Image description in English',
        'required': True
    }]
    
    def call(self, params, **kwargs):
        prompt = json.loads(params)['prompt']
        prompt = urllib.parse.quote(prompt)
        return json.dumps({'image_url': f'https://image.pollinations.ai/prompt/{prompt}'})

print("✅ Registered image_gen tool")

In [None]:
# Create QwQ-style assistant (adapted for Fireworks)
qwq_bot = Assistant(
    llm=llm_cfg,
    function_list=['image_gen'],
    name='Reasoning Image Generator',
    description='I think through image generation requests carefully'
)

print("✅ Created reasoning image generator\n")

# Test (from official example pattern)
messages = [{'role': 'user', 'content': 'Draw a cat and a dog playing together'}]

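# run() streams growing snapshots of the message list; keep only the final one
final_response = []
for response in qwq_bot.run(messages):
    final_response = response

for msg in final_response:
    if msg.get('function_call'):
        print(f"🎨 Generating image with: {msg['function_call']['name']}")
        print(f"   Prompt: {msg['function_call']['arguments'][:100]}...\n")
    elif msg.get('role') == 'assistant' and msg.get('content'):
        print(f"Response: {msg['content'][:200]}\n")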

---
## Part 7: Qwen3 with MCP Tools

### MCP Integration Example

From `assistant_qwen3.py`:

In [None]:
# Example MCP configuration from assistant_qwen3.py
mcp_config = {
    'mcpServers': {
        'time': {
            'command': 'uvx',
            'args': ['mcp-server-time', '--local-timezone=Asia/Shanghai']
        },
        'fetch': {
            'command': 'uvx',
            'args': ['mcp-server-fetch']
        }
    }
}

print("MCP Tools Configuration (from official example):")
print(json.dumps(mcp_config, indent=2))
print("\n✅ This shows how to integrate external MCP servers")
print("   (Requires `uvx`, part of the uv package manager, to launch these servers)")

# With MCP, you could create:
# bot = Assistant(
#     llm=llm_cfg,
#     function_list=[mcp_config, 'code_interpreter']
# )

---
## Part 8: Comparison Table

### When to Use What?

| Feature | Regular Assistant | Reasoning Assistant |
|---------|------------------|--------------------|
| **Speed** | Fast | Slower |
| **Accuracy** | Good | Excellent |
| **Best for** | Simple tasks | Complex reasoning |
| **Cost** | Lower | Higher |
| **Thinking shown** | No | Yes (optional) |
| **Use when** | Speed matters | Accuracy critical |

### Configuration Quick Reference

```python
# For most tasks (fast)
llm_cfg = {'model': 'qwen3-32b'}

# For reasoning tasks (accurate)
llm_cfg = {
    'model': 'qwq-32b',
    'generate_cfg': {'fncall_prompt_type': 'nous'}
}

# Enable thinking (DashScope)
llm_cfg = {
    'model': 'qwen3-235b-a22b',
    'model_type': 'qwen_dashscope',
    'generate_cfg': {'enable_thinking': True}
}
```

---
## Summary

- ✅ **QwQ-32B** - Reasoning model for complex tasks
- ✅ **enable_thinking** - Control thinking mode (3 API approaches)
- ✅ **thought_in_content** - For old vLLM deployments
- ✅ **fncall_prompt_type** - 'nous' recommended for Qwen2.5+/QwQ
- ✅ **All from official examples** - assistant_qwq.py & assistant_qwen3.py
- ✅ **MCP integration** - External tool servers

### Key Takeaways:

1. **Reasoning models think step-by-step** - Better for complex tasks
2. **enable_thinking** - Set differently for each API (DashScope native, OpenAI-compatible, vLLM/SGLang)
3. **fncall_prompt_type: 'nous'** - Use for modern models
4. **thought_in_content** - Only for old vLLM
5. **MCP tools** - Extend with external servers

### From Official Examples:
- `assistant_qwq.py` - QwQ reasoning model
- `assistant_qwen3.py` - Qwen3 with MCP and thinking
- The configurations in this notebook are adapted from these two files

**Tomorrow**: GUI Development with WebUI! 🎨