# Day 2: Message Schema & Communication

## Understanding How Agents Communicate

### Today's Learning Objectives:
1. Deep dive into the `Message` class structure
2. Understand different role types and their purposes
3. Work with `ContentItem` for multimodal messages
4. Learn about `FunctionCall` messages (preview for Day 6)
5. Build complex, multi-modal conversation histories
6. Understand `reasoning_content` for advanced reasoning models

### Prerequisites:
- Completed Day 1
- Qwen-Agent installed and configured
- API key set up

### Time Required: 1.5-2 hours

---

## Part 1: Why Messages Matter

### The Communication Protocol

Everything in Qwen-Agent flows through **Messages**. They are:
- The **input** to agents (what you say)
- The **output** from agents (what they respond with)
- The **history** of conversations (context)
- The **mechanism** for tool calls (function execution)

### Message Flow Diagram:

```
User Creates Message
        |
        v
[Message(role='user', content='...')]
        |
        v
    Agent.run(messages)
        |
        v
    LLM processes
        |
        v
[Message(role='assistant', content='...'),
 Message(role='assistant', function_call=...)]
        |
        v
    User receives response
```

### Key Insight:
Messages are **NOT** just strings. They are structured data with:
- **role**: Who is speaking
- **content**: What is being said (can be text, image, etc.)
- **metadata**: Additional information (name, function calls, etc.)
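
To make these three parts concrete, here is a minimal sketch that writes a message as a plain dictionary (a form this tutorial later shows Qwen-Agent accepting). The `summarize_message` helper is hypothetical and exists only for this illustration; it is not part of Qwen-Agent.

```python
# Illustrative only: a message as structured data, written as a plain dict.
message = {
    'role': 'function',                 # who is speaking
    'content': '{"temperature": 72}',   # what is being said
    'name': 'get_weather',              # metadata: which tool produced it
}


def summarize_message(msg: dict) -> str:
    """Return a one-line summary: role, optional name, and a content preview."""
    speaker = msg['role']
    if msg.get('name'):
        speaker += f" ({msg['name']})"
    content = msg.get('content') or ''
    # Content may be a string or a list of content items.
    if isinstance(content, str):
        preview = content
    else:
        preview = f'<{len(content)} content items>'
    return f'{speaker}: {preview[:40]}'


print(summarize_message(message))
```

Reading the role, name, and content separately is what distinguishes a message from a bare string: the same text can mean different things depending on who said it.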

## Part 2: Message Class Structure

### Source Code Location:
`/qwen_agent/llm/schema.py` (lines 132-165)

### Message Class Definition:

```python
class Message(BaseModelCompatibleDict):
    role: str                                      # Required: 'user', 'assistant', 'system', 'function'
    content: Union[str, List[ContentItem]]         # Required: Message content
    reasoning_content: Optional[Union[str, List[ContentItem]]] = None  # For reasoning models
    name: Optional[str] = None                     # Agent/tool name
    function_call: Optional[FunctionCall] = None   # Tool invocation data
    extra: Optional[dict] = None                   # Additional metadata
```

Let's explore each field in detail.

## Part 3: The `role` Field

### Four Role Types (from schema.py lines 26-29):

```python
SYSTEM = 'system'      # Instructions for the agent
USER = 'user'          # Input from human/user
ASSISTANT = 'assistant'  # Response from agent/LLM
FUNCTION = 'function'   # Tool execution results
```

### Role Descriptions:

| Role | Purpose | Who Creates It | Example |
|------|---------|----------------|----------|
| `system` | Give instructions/context to agent | Developer | "You are a helpful assistant" |
| `user` | User queries and inputs | User/Application | "What's the weather?" |
| `assistant` | Agent responses | Agent/LLM | "The weather is sunny" |
| `function` | Tool execution results | Agent (after running tool) | `{"temperature": 72}` |
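
Since the role is a free-form string, a mistyped role is easy to miss. The sketch below checks a dict-style history against the four roles in the table. `KNOWN_ROLES` and `find_invalid_roles` are hypothetical names introduced here, not Qwen-Agent APIs.

```python
# Role strings as listed in the table above.
KNOWN_ROLES = {'system', 'user', 'assistant', 'function'}


def find_invalid_roles(messages: list) -> list:
    """Return (index, role) pairs for messages whose role is not a known role."""
    problems = []
    for index, msg in enumerate(messages):
        role = msg.get('role')
        if role not in KNOWN_ROLES:
            problems.append((index, role))
    return problems


history = [
    {'role': 'system', 'content': 'You are a helpful assistant'},
    {'role': 'user', 'content': "What's the weather?"},
    {'role': 'tool', 'content': '{"temperature": 72}'},  # not one of the four roles
]
print(find_invalid_roles(history))
```

Running a check like this before sending a history to an agent can surface a wrong role early instead of as a confusing downstream error.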

In [None]:
# ================================================
# FIREWORKS API CONFIGURATION
# ================================================
import os

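# Set API credentials.
# Never commit a real key to a notebook: export FIREWORKS_API_KEY in your shell,
# or replace the placeholder below for local testing only.
os.environ.setdefault('FIREWORKS_API_KEY', 'YOUR_FIREWORKS_API_KEY')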

# Standard configuration for Fireworks Qwen3-235B-A22B-Thinking
llm_cfg_fireworks = {
    'model': 'accounts/fireworks/models/qwen3-235b-a22b-thinking-2507',
    'model_server': 'https://api.fireworks.ai/inference/v1',
    'api_key': os.environ['FIREWORKS_API_KEY'],
    'generate_cfg': {
        'max_tokens': 32768,
        'temperature': 0.6,
    }
}

# Use this as default llm_cfg
llm_cfg = llm_cfg_fireworks

print('✅ Configured for Fireworks API')
print(f"   Model: {llm_cfg['model']}")
print(f"   Max tokens: {llm_cfg['generate_cfg']['max_tokens']:,}")


In [None]:
# Import the Message class
from qwen_agent.llm.schema import Message, SYSTEM, USER, ASSISTANT, FUNCTION

# Example 1: System message
system_msg = Message(
    role=SYSTEM,
    content='You are a helpful AI assistant specialized in Python programming.'
)

# Example 2: User message
user_msg = Message(
    role=USER,
    content='How do I read a file in Python?'
)

# Example 3: Assistant message
assistant_msg = Message(
    role=ASSISTANT,
    content='You can use the `open()` function with a context manager...'
)

# Example 4: Function message (we'll learn more about this on Day 6)
function_msg = Message(
    role=FUNCTION,
    content='{"result": "File read successfully"}',
    name='read_file'  # Name of the tool that was executed
)

print("System Message:")
print(system_msg)
print("\nUser Message:")
print(user_msg)
print("\nAssistant Message:")
print(assistant_msg)
print("\nFunction Message:")
print(function_msg)

### Message as Dict:

The `Message` class extends `BaseModelCompatibleDict`, which means you can use it like a dictionary:

In [None]:
# Create a message
msg = Message(role='user', content='Hello!')

# Access like a dict
print(f"Role (dict syntax): {msg['role']}")
print(f"Content (dict syntax): {msg['content']}")

# Access like an object
print(f"Role (object syntax): {msg.role}")
print(f"Content (object syntax): {msg.content}")

# Use .get() method (like dict)
print(f"Name (with default): {msg.get('name', 'Anonymous')}")

# Convert to dict
msg_dict = msg.model_dump()
print(f"\nAs dictionary: {msg_dict}")
print(f"Type: {type(msg_dict)}")

### Why Dict Compatibility?

This dual interface allows you to:
1. Use dictionary syntax when working with message lists
2. Use object syntax for cleaner code
3. Easily serialize to JSON
4. Maintain backward compatibility

**In practice, you'll see both styles:**
```python
# Simple dict style (common in examples)
messages = [{'role': 'user', 'content': 'Hi'}]

# Message object style (more features)
messages = [Message(role='user', content='Hi')]

# Both work! Qwen-Agent handles conversion internally
```
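
When a list mixes both styles, it can help to normalize everything to plain dicts before logging or saving. The following is a rough sketch, assuming only that object-style messages expose `model_dump()` as used earlier in this tutorial. `to_plain_dict` is a hypothetical helper, and `FakeMessage` is a stand-in class so the block runs without Qwen-Agent installed.

```python
def to_plain_dict(msg) -> dict:
    """Return a plain dict for either a dict or an object exposing model_dump()."""
    if isinstance(msg, dict):
        return dict(msg)
    if hasattr(msg, 'model_dump'):
        return msg.model_dump()
    raise TypeError(f'Unsupported message type: {type(msg).__name__}')


class FakeMessage:
    """Stand-in for Message: only mimics the model_dump() method."""

    def __init__(self, role, content):
        self.role = role
        self.content = content

    def model_dump(self):
        return {'role': self.role, 'content': self.content}


mixed = [{'role': 'user', 'content': 'Hi'}, FakeMessage('assistant', 'Hello!')]
normalized = [to_plain_dict(m) for m in mixed]
print(normalized)
```

With real `Message` objects the same function should apply unchanged, since it only relies on `model_dump()`.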

## Part 4: The `content` Field - Simple Text

### Content can be a string:

```python
content: Union[str, List[ContentItem]]
```

For simple text-only messages, `content` is just a string.
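
Code that consumes messages often has to cope with both shapes of `content`. One option, sketched here with plain dicts, is to wrap a string as a single text item so the rest of the code only deals with lists. `as_content_items` is a hypothetical helper, and the `{'text': ...}` item shape is assumed to match how content items serialize.

```python
def as_content_items(content) -> list:
    """Return content as a list of items, wrapping a plain string as one text item."""
    if isinstance(content, str):
        return [{'text': content}]
    return list(content)  # already a list of items; return a shallow copy


print(as_content_items('What is the capital of France?'))
print(as_content_items([{'text': 'What is this?'}, {'image': 'https://example.com/image.jpg'}]))
```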

In [None]:
# Simple text content
simple_msg = Message(
    role='user',
    content='What is the capital of France?'
)

print(f"Content type: {type(simple_msg.content)}")
print(f"Content value: {simple_msg.content}")

# Multi-line text content
multiline_msg = Message(
    role='user',
    content="""Please help me with:
    1. Understanding Python decorators
    2. Writing better code
    3. Optimizing performance
    """
)

print(f"\nMultiline content:\n{multiline_msg.content}")

## Part 5: The `ContentItem` Class - Multimodal Content

### Source Code Location:
`/qwen_agent/llm/schema.py` (lines 80-130)

### ContentItem Definition:

```python
class ContentItem(BaseModelCompatibleDict):
    text: Optional[str] = None
    image: Optional[str] = None          # URL or base64
    file: Optional[str] = None           # File path or URL
    audio: Optional[Union[str, dict]] = None
    video: Optional[Union[str, list]] = None
```

### **Important Rule**: Exactly ONE field must be provided

The validator (lines 95-111) ensures mutual exclusivity:
```python
if provided_fields != 1:
    raise ValueError("Exactly one of 'text', 'image', 'file', 'audio', or 'video' must be provided.")
```
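
The rule can be reproduced in a few lines of standalone Python. This is a sketch of the idea, not the library's validator: it assumes a field counts as "provided" when it is not `None`, and `check_exactly_one` is a hypothetical name.

```python
CONTENT_FIELDS = ('text', 'image', 'file', 'audio', 'video')


def check_exactly_one(**fields) -> str:
    """Return the name of the single provided content field, or raise ValueError."""
    # Assumption: "provided" means the value is not None.
    provided = [name for name in CONTENT_FIELDS if fields.get(name) is not None]
    if len(provided) != 1:
        raise ValueError(
            "Exactly one of 'text', 'image', 'file', 'audio', or 'video' must be provided."
        )
    return provided[0]


print(check_exactly_one(text='Hello'))  # one field: accepted

try:
    check_exactly_one(text='Hello', image='image.jpg')  # two fields: rejected
except ValueError as err:
    print(f'Rejected: {err}')
```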

In [None]:
from qwen_agent.llm.schema import ContentItem

# Example 1: Text ContentItem
text_item = ContentItem(text='Hello, world!')
print(f"Text item: {text_item}")
print(f"Type: {text_item.type}")
print(f"Value: {text_item.value}")

# Example 2: Image ContentItem (URL)
image_item = ContentItem(
    image='https://example.com/image.jpg'
)
print(f"\nImage item: {image_item}")
print(f"Type: {image_item.type}")
print(f"Value: {image_item.value}")

# Example 3: File ContentItem
file_item = ContentItem(
    file='/path/to/document.pdf'
)
print(f"\nFile item: {file_item}")

# This would FAIL (can't have both text and image):
# bad_item = ContentItem(text='Hello', image='image.jpg')  # ValueError!

### Useful ContentItem Methods:

```python
item.type          # Returns 'text', 'image', 'file', 'audio', or 'video'
item.value         # Returns the actual value (string or dict)
item.get_type_and_value()  # Returns tuple (type, value)
```

In [None]:
# Demonstrate methods
item = ContentItem(text='Sample text')

print(f"Type property: {item.type}")
print(f"Value property: {item.value}")
print(f"Type and value method: {item.get_type_and_value()}")

# Practical usage: checking content type
def process_content(item: ContentItem):
    """Process different content types"""
    if item.type == 'text':
        return f"Processing text: {item.value[:50]}..."
    elif item.type == 'image':
        return f"Processing image from: {item.value}"
    elif item.type == 'file':
        return f"Processing file: {item.value}"
    else:
        return f"Processing {item.type}"

# Test with different types
text_content = ContentItem(text='This is a long piece of text that needs processing')
image_content = ContentItem(image='https://example.com/photo.jpg')

print(f"\n{process_content(text_content)}")
print(process_content(image_content))

## Part 6: Multimodal Messages - Text + Images

### When content is a List[ContentItem]:

For messages that combine text and images (like vision models), `content` becomes a list:

In [None]:
# Create a multimodal message (text + image)
multimodal_msg = Message(
    role='user',
    content=[
        ContentItem(text='What is in this image?'),
        ContentItem(image='https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen2-VL/demo_small.jpg')
    ]
)

print("Multimodal Message:")
print(f"Role: {multimodal_msg.role}")
print(f"Content type: {type(multimodal_msg.content)}")
print(f"Number of items: {len(multimodal_msg.content)}")

# Iterate through content items
for i, item in enumerate(multimodal_msg.content):
    print(f"\nItem {i}:")
    print(f"  Type: {item.type}")
    print(f"  Value: {item.value}")

### Testing with a Vision Model (if available):

**Note**: This requires a vision-language model like Qwen-VL.

In [None]:
# Example with Qwen-VL (uncomment if you have access)
from qwen_agent.agents import Assistant

# Configure for vision model
vl_cfg = {
    'model': 'qwen-vl-max',  # Vision-language model
    'model_type': 'qwenvl_dashscope'
}

# Create vision agent
# Uncomment to test (requires DashScope access to VL models):
# vision_bot = Assistant(llm=vl_cfg)

# Create multimodal message
# messages = [
#     Message(
#         role='user',
#         content=[
#             ContentItem(text='Describe this image in detail'),
#             ContentItem(image='https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen2-VL/demo_small.jpg')
#         ]
#     )
# ]

# Get response
# response = vision_bot.run_nonstream(messages=messages)
# for msg in response:
#     if msg['role'] == 'assistant':
#         print(msg['content'])

print("Vision model example code ready (commented out).")
print("Uncomment to test if you have access to qwen-vl-max.")

## Part 7: The `reasoning_content` Field

### What is reasoning_content?

Advanced reasoning models (like QwQ-32B) show their "thinking process" separately from the final answer.

```python
class Message:
    content: str                      # Final answer
    reasoning_content: Optional[str]  # Thinking process
```

### Example Response Structure:

```
Message(
    role='assistant',
    reasoning_content='Let me think... First I need to...',
    content='The answer is 42.'
)
```

### Visual Representation:

```
User: "Solve 2x + 5 = 15"

Agent:
  reasoning_content: "I need to isolate x.
                      First, subtract 5 from both sides: 2x = 10
                      Then divide by 2: x = 5"
  
  content: "x = 5"
```

In [None]:
# Simulate a reasoning model response
reasoning_msg = Message(
    role='assistant',
    reasoning_content="Let me analyze this step by step:\n1. The question asks about France\n2. France is in Europe\n3. The capital is Paris",
    content='The capital of France is Paris.'
)

print("Reasoning Model Response:")
print("=" * 60)
print(f"\n🤔 Thinking Process:\n{reasoning_msg.reasoning_content}")
print(f"\n💡 Final Answer:\n{reasoning_msg.content}")
print("=" * 60)

# Check if reasoning content exists
if reasoning_msg.get('reasoning_content'):
    print("\n✅ This message includes reasoning!")
else:
    print("\n❌ No reasoning content")

### When to use reasoning_content:

1. **QwQ-32B and other reasoning models**
2. **Complex problem-solving** where showing work is valuable
3. **Educational contexts** where explanation matters
4. **Debugging agent decisions**

For normal models (Qwen-Max, etc.), this field is usually `None`.
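
Because the field may be missing or `None`, display code should treat it as optional. A minimal sketch with dict-style messages follows; `reasoning_and_answer` is a hypothetical helper.

```python
def reasoning_and_answer(msg: dict) -> tuple:
    """Return (reasoning, answer); reasoning is None when absent or empty."""
    reasoning = msg.get('reasoning_content') or None
    answer = msg.get('content') or ''
    return reasoning, answer


with_reasoning = {
    'role': 'assistant',
    'reasoning_content': 'Subtract 5 from both sides: 2x = 10. Divide by 2: x = 5.',
    'content': 'x = 5',
}
without_reasoning = {'role': 'assistant', 'content': 'x = 5'}

for msg in (with_reasoning, without_reasoning):
    reasoning, answer = reasoning_and_answer(msg)
    print(f'reasoning={reasoning!r}, answer={answer!r}')
```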

## Part 8: The `FunctionCall` Class (Preview)

### Source Code Location:
`/qwen_agent/llm/schema.py` (lines 69-78)

### FunctionCall Definition:

```python
class FunctionCall(BaseModelCompatibleDict):
    name: str        # Tool name to execute
    arguments: str   # JSON string of parameters
```

### When Agents Use Tools:

Instead of generating text, the LLM generates a function call:

```
User: "What's the weather in Tokyo?"

Agent generates:
  Message(
      role='assistant',
      content='',
      function_call=FunctionCall(
          name='weather_api',
          arguments='{"location": "Tokyo"}'
      )
  )
```
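
Note that `arguments` is a JSON string, not a dict, so a tool runner has to decode it before use, and model-generated JSON may occasionally be malformed. The sketch below decodes defensively using only the standard library; `parse_arguments` is a hypothetical helper operating on a dict-style function call.

```python
import json


def parse_arguments(function_call: dict) -> dict:
    """Decode the JSON string in function_call['arguments'] into a dict."""
    raw = function_call.get('arguments') or '{}'
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as exc:
        name = function_call.get('name')
        raise ValueError(f'Arguments for {name!r} are not valid JSON') from exc
    if not isinstance(parsed, dict):
        raise ValueError('Arguments must decode to a JSON object')
    return parsed


call = {'name': 'weather_api', 'arguments': '{"location": "Tokyo"}'}
print(parse_arguments(call))
```

Raising a clear `ValueError` keeps the failure close to its cause instead of letting a half-parsed argument reach the tool.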

We'll dive deep into this on **Day 6: Function Calling**. For now, just understand the structure:

In [None]:
from qwen_agent.llm.schema import FunctionCall
import json

# Example 1: Simple function call
func_call = FunctionCall(
    name='get_weather',
    arguments='{"city": "Tokyo", "units": "celsius"}'
)

print("Function Call:")
print(f"Name: {func_call.name}")
print(f"Arguments (string): {func_call.arguments}")
print(f"Arguments (parsed): {json.loads(func_call.arguments)}")

# Example 2: Message with function call
tool_msg = Message(
    role='assistant',
    content='',  # Often empty when making function call
    function_call=func_call
)

print(f"\nMessage with function call:")
print(tool_msg)

# Check if message has function call
if tool_msg.get('function_call'):
    print(f"\n✅ Agent wants to use tool: {tool_msg.function_call.name}")
else:
    print("\n❌ No function call in this message")

## Part 9: The `name` Field

### Purpose:
The `name` field identifies the speaker in specific contexts:

1. **Function messages**: Which tool generated this result
2. **Multi-agent systems**: Which agent is speaking
3. **Named personas**: For role-playing scenarios

### Examples:

In [None]:
# Example 1: Function result with name
function_result = Message(
    role='function',
    name='web_search',
    content='{"results": ["Tokyo weather: Sunny, 25°C"]}'
)

print("Function Result:")
print(f"Role: {function_result.role}")
print(f"Name (tool): {function_result.name}")
print(f"Content: {function_result.content}")

# Example 2: Multi-agent conversation (preview for Day 10)
agent1_msg = Message(
    role='assistant',
    name='CodeExpert',
    content='I recommend using a list comprehension for efficiency.'
)

agent2_msg = Message(
    role='assistant',
    name='SecurityExpert',
    content='Make sure to validate all user inputs first.'
)

print(f"\n{agent1_msg.name}: {agent1_msg.content}")
print(f"{agent2_msg.name}: {agent2_msg.content}")

## Part 10: Building Complex Conversations

### Realistic Conversation Flow:

Let's build a conversation that demonstrates various message types:

In [None]:
# Build a realistic conversation
conversation = [
    # 1. System message sets the context
    Message(
        role='system',
        content='You are a helpful data analyst assistant.'
    ),
    
    # 2. User asks a question
    Message(
        role='user',
        content='I need to analyze sales data from Q1 2024.'
    ),
    
    # 3. Assistant asks for clarification
    Message(
        role='assistant',
        content='I can help with that! What specific metrics are you interested in?'
    ),
    
    # 4. User provides more details
    Message(
        role='user',
        content='Total revenue and top 5 products.'
    ),
    
    # 5. Assistant decides to use a tool (simulated)
    Message(
        role='assistant',
        content='',
        function_call=FunctionCall(
            name='analyze_sales',
            arguments='{"period": "Q1 2024", "metrics": ["revenue", "top_products"]}'
        )
    ),
    
    # 6. Tool returns results
    Message(
        role='function',
        name='analyze_sales',
        content='{"total_revenue": 1250000, "top_products": ["ProductA", "ProductB", "ProductC", "ProductD", "ProductE"]}'
    ),
    
    # 7. Assistant presents results
    Message(
        role='assistant',
        content='Based on the Q1 2024 data:\n- Total Revenue: $1,250,000\n- Top 5 Products: ProductA, ProductB, ProductC, ProductD, ProductE'
    )
]

# Display the conversation
print("Complete Conversation Flow:")
print("=" * 80)

for i, msg in enumerate(conversation, 1):
    role_emoji = {
        'system': '⚙️',
        'user': '👤',
        'assistant': '🤖',
        'function': '🔧'
    }
    
    emoji = role_emoji.get(msg.role, '❓')
    print(f"\n{i}. {emoji} {msg.role.upper()}", end='')
    
    if msg.get('name'):
        print(f" ({msg.name})", end='')
    print(":")
    
    if msg.get('function_call'):
        print(f"   [Calling tool: {msg.function_call.name}]")
        print(f"   [Arguments: {msg.function_call.arguments}]")
    elif msg.content:
        # Indent content
        for line in msg.content.split('\n'):
            print(f"   {line}")

print("\n" + "=" * 80)

## Part 11: Message Utilities

### Helper Functions for Working with Messages:

In [None]:
def count_messages_by_role(messages, role):
    """Count messages of a specific role"""
    return sum(1 for msg in messages if msg.get('role') == role)

def get_last_user_message(messages):
    """Get the most recent user message"""
    for msg in reversed(messages):
        if msg.get('role') == 'user':
            return msg
    return None

def extract_text_content(message):
    """Extract text from message content (handles both str and List[ContentItem])"""
    content = message.get('content', '')
    
    if isinstance(content, str):
        return content
    elif isinstance(content, list):
        # Extract text from ContentItems
        texts = [item.value for item in content if item.type == 'text']
        return ' '.join(texts)
    return ''

def has_function_call(message):
    """Check if message contains a function call"""
    return message.get('function_call') is not None

def is_multimodal(message):
    """Check if message contains non-text content"""
    content = message.get('content', '')
    if isinstance(content, list):
        return any(item.type != 'text' for item in content)
    return False

# Test the utilities
print("Message Utilities Demo:")
print(f"User messages: {count_messages_by_role(conversation, 'user')}")
print(f"Assistant messages: {count_messages_by_role(conversation, 'assistant')}")
print(f"Function messages: {count_messages_by_role(conversation, 'function')}")

last_user = get_last_user_message(conversation)
if last_user:
    print(f"\nLast user message: {extract_text_content(last_user)}")

# Check for function calls
func_calls = [msg for msg in conversation if has_function_call(msg)]
print(f"\nMessages with function calls: {len(func_calls)}")

# Create a multimodal message to test
mm_msg = Message(
    role='user',
    content=[
        ContentItem(text='Describe this'),
        ContentItem(image='image.jpg')
    ]
)
print(f"Multimodal message: {is_multimodal(mm_msg)}")
print(f"Text-only message: {is_multimodal(conversation[1])}")

## Part 12: Working with Real Agents

### Sending and Receiving Messages:

In [None]:
from qwen_agent.agents import Assistant

# Create agent using llm_cfg from the Fireworks configuration cell above
bot = Assistant(llm=llm_cfg)

# Build messages using Message objects
messages = [
    Message(
        role='system',
        content='You are a concise assistant. Keep responses under 50 words.'
    ),
    Message(
        role='user',
        content='Explain what a REST API is.'
    )
]

print("Sending messages to agent...\n")

# Get response: bot.run() streams progressively longer message lists,
# so keep only the last (complete) one
response = None
for resp in bot.run(messages=messages):
    response = resp

# Examine the response structure
print("Response Analysis:")
print(f"Response type: {type(response)}")
print(f"Number of messages: {len(response)}")

for i, msg in enumerate(response):
    print(f"\nMessage {i}:")
    print(f"  Role: {msg.get('role')}")
    print(f"  Content type: {type(msg.get('content'))}")
    
    if msg.get('role') == 'assistant':
        content = extract_text_content(msg)
        # Show just the last 200 chars (thinking model may have long responses)
        if len(content) > 200:
            print(f"  Content (excerpt): ...{content[-200:]}")
        else:
            print(f"  Content: {content}")
        print(f"  Word count: {len(content.split())}")

## Part 13: Message Serialization

### Converting Messages to/from JSON:

In [None]:
import json

# Create a message
msg = Message(
    role='user',
    content='Hello, world!',
    extra={'timestamp': '2024-01-15T10:30:00'}
)

# Method 1: model_dump() - to dict
msg_dict = msg.model_dump()
print("As dictionary:")
print(msg_dict)

# Method 2: model_dump_json() - to JSON string
msg_json = msg.model_dump_json()
print("\nAs JSON string:")
print(msg_json)

# Method 3: Manual JSON serialization
msg_json_manual = json.dumps(msg.model_dump(), indent=2)
print("\nAs formatted JSON:")
print(msg_json_manual)

# Deserialize back to Message
loaded_dict = json.loads(msg_json)
reconstructed_msg = Message(**loaded_dict)
print("\nReconstructed message:")
print(reconstructed_msg)
print(f"Equal to original: {msg.model_dump() == reconstructed_msg.model_dump()}")

### Saving/Loading Conversations:

In [None]:
import json

# Create a conversation
convo = [
    Message(role='user', content='Hi there!'),
    Message(role='assistant', content='Hello! How can I help?'),
    Message(role='user', content='Tell me a joke.'),
]

# Save to file
def save_conversation(messages, filename):
    """Save conversation to JSON file"""
    with open(filename, 'w') as f:
        json.dump(
            [msg.model_dump() for msg in messages],
            f,
            indent=2
        )

def load_conversation(filename):
    """Load conversation from JSON file"""
    with open(filename, 'r') as f:
        data = json.load(f)
        return [Message(**msg_dict) for msg_dict in data]

# Save
save_conversation(convo, 'conversation.json')
print("✅ Conversation saved to conversation.json")

# Load
loaded_convo = load_conversation('conversation.json')
print(f"✅ Loaded {len(loaded_convo)} messages")

for msg in loaded_convo:
    print(f"  {msg.role}: {msg.content}")

---
## 🎯 Summary of Real Examples

The four examples after this summary are **working code** with **real outputs**. They show:

### ✅ What We Covered:

1. **Real Fireworks API Calls**
   - Qwen3-235B-Thinking model in action
   - Understanding how reasoning appears in responses
   - API response structure analysis

2. **Multimodal Messages**
   - Combining text + images with `ContentItem`
   - Real image URLs from Qwen's demo dataset
   - Serialization/deserialization of complex messages

3. **Extra Field Metadata**
   - Storing custom data (timestamps, user IDs, etc.)
   - Real-world metadata patterns
   - Perfect serialization support

4. **Complete Multi-Turn Conversation**
   - Building conversation history
   - Passing full context to LLM
   - How the assistant remembers previous messages

### 💡 Key Insights:

| Feature | With Fireworks API | With Native Qwen (DashScope) |
|---------|-------------------|------------------------------|
| **Thinking Model** | Reasoning in `content` field | Separate `reasoning_content` |
| **Message Format** | Standard OpenAI-compatible | Full Qwen schema support |
| **Multimodal** | Text + metadata only | Text + images + video + audio |

### 🚀 You're Now Ready To:

- ✅ Build complex conversations with proper message structure
- ✅ Handle multimodal content (text, images, files)
- ✅ Extract and use metadata from messages
- ✅ Work with thinking models and their reasoning output
- ✅ Serialize/deserialize messages for storage

---

In [38]:
# ========================================
# EXAMPLE 4: Complete Multi-Turn Conversation
# ========================================
# This shows how messages flow in a REAL conversation

print("🔄 COMPLETE MULTI-TURN CONVERSATION")
print("="*70)

# Start with empty conversation
conversation = []

# Turn 1: System message
conversation.append(Message(
    role='system',
    content='You are a helpful assistant. Keep responses under 30 words.'
))

# Turn 2: User asks first question
conversation.append(Message(
    role='user',
    content='What is 15 * 23?'
))

print("\n👤 USER: What is 15 * 23?")
print("⏳ Calling API...")

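# Get response from LLM
# (imports repeated here so this cell runs on its own)
import os
from qwen_agent.llm import get_chat_model

llm_short = get_chat_model({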
    'model': 'accounts/fireworks/models/qwen3-235b-a22b-thinking-2507',
    'model_server': 'https://api.fireworks.ai/inference/v1',
    'api_key': os.environ['FIREWORKS_API_KEY'],
    'generate_cfg': {'max_tokens': 200, 'temperature': 0.6}
})

responses = []
for resp in llm_short.chat(messages=conversation):
    responses = resp

# Add assistant response to conversation
conversation.extend(responses)

assistant_msg = responses[-1]
print(f"🤖 ASSISTANT: {assistant_msg['content'][:100]}...")

# Turn 3: User asks follow-up
conversation.append(Message(
    role='user',
    content='Now multiply that by 2'
))

print(f"\n👤 USER: Now multiply that by 2")
print("⏳ Calling API with full context...")

# Get second response (with full history)
responses2 = []
for resp in llm_short.chat(messages=conversation):
    responses2 = resp

conversation.extend(responses2)

assistant_msg2 = responses2[-1]
print(f"🤖 ASSISTANT: {assistant_msg2['content'][:100]}...")

# Show conversation structure
print(f"\n📊 FINAL CONVERSATION STATE:")
print(f"   Total messages: {len(conversation)}")
for i, msg in enumerate(conversation):
    role_emoji = {'system': '⚙️', 'user': '👤', 'assistant': '🤖'}
    emoji = role_emoji.get(msg.get('role'), '❓')
    content_preview = msg.get('content', '')[:50]
    print(f"   {i+1}. {emoji} {msg.get('role')}: {content_preview}...")

print("\n✅ Multi-turn conversation complete!")
print("💡 Key takeaway: Each API call receives the FULL conversation history")

🔄 COMPLETE MULTI-TURN CONVERSATION

👤 USER: What is 15 * 23?
⏳ Calling API...
🤖 ASSISTANT: Okay, the user is asking for the product of 15 and 23. Let me calculate that. 15 times 20 is 300, an...

👤 USER: Now multiply that by 2
⏳ Calling API with full context...
🤖 ASSISTANT: Okay, the user just asked to multiply the previous result by 2. Last time I calculated 15 * 23 = 345...

📊 FINAL CONVERSATION STATE:
   Total messages: 5
   1. ⚙️ system: You are a helpful assistant. Keep responses under ...
   2. 👤 user: What is 15 * 23?...
   3. 🤖 assistant: Okay, the user is asking for the product of 15 and...
   4. 👤 user: Now multiply that by 2...
   5. 🤖 assistant: Okay, the user just asked to multiply the previous...

✅ Multi-turn conversation complete!
💡 Key takeaway: Each API call receives the FULL conversation history



In [39]:
# ========================================
# EXAMPLE 3: Using the `extra` Field for Metadata
# ========================================
from datetime import datetime

# The `extra` field can store ANY additional metadata
msg_with_metadata = Message(
    role='user',
    content='What is the weather today?',
    extra={
        'timestamp': datetime.now().isoformat(),
        'user_id': 'user_12345',
        'session_id': 'session_abc',
        'ip_address': '192.168.1.1',
        'language': 'en',
        'app_version': '1.2.3',
        'custom_data': {
            'priority': 'high',
            'category': 'weather',
            'tags': ['urgent', 'forecast']
        }
    }
)

print("🏷️  EXTRA FIELD - METADATA EXAMPLE")
print("="*70)
print(f"\nMessage role: {msg_with_metadata.role}")
print(f"Message content: {msg_with_metadata.content}")
print(f"\n📋 Extra metadata:")
print(json.dumps(msg_with_metadata.extra, indent=2))

# Access extra fields
print(f"\n🔍 Accessing extra fields:")
print(f"   User ID: {msg_with_metadata.extra['user_id']}")
print(f"   Timestamp: {msg_with_metadata.extra['timestamp']}")
print(f"   Priority: {msg_with_metadata.extra['custom_data']['priority']}")

# Use cases for extra field:
print(f"\n💡 Common uses for `extra` field:")
print("   • Timestamps for logging")
print("   • User/session tracking")
print("   • A/B testing flags")
print("   • Custom application data")
print("   • Debugging information")
print("   • Analytics metadata")

# Serialize with extra
serialized = msg_with_metadata.model_dump()
print(f"\n✅ Serializes perfectly with all metadata intact")
print(f"   Total fields in serialized: {len(serialized)}")

🏷️  EXTRA FIELD - METADATA EXAMPLE

Message role: user
Message content: What is the weather today?

📋 Extra metadata:
{
  "timestamp": "2025-11-13T23:15:37.777698",
  "user_id": "user_12345",
  "session_id": "session_abc",
  "ip_address": "192.168.1.1",
  "language": "en",
  "app_version": "1.2.3",
  "custom_data": {
    "priority": "high",
    "category": "weather",
    "tags": [
      "urgent",
      "forecast"
    ]
  }
}

🔍 Accessing extra fields:
   User ID: user_12345
   Timestamp: 2025-11-13T23:15:37.777698
   Priority: high

💡 Common uses for `extra` field:
   • Timestamps for logging
   • User/session tracking
   • A/B testing flags
   • Custom application data
   • Debugging information
   • Analytics metadata

✅ Serializes perfectly with all metadata intact
   Total fields in serialized: 3



In [40]:
# ========================================
# EXAMPLE 2: Multimodal Messages with REAL Images
# ========================================
from qwen_agent.llm.schema import Message, ContentItem

# Using a REAL publicly accessible image
real_image_url = 'https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen2-VL/demo_small.jpg'

# Create multimodal message
multimodal_msg = Message(
    role='user',
    content=[
        ContentItem(text='What objects do you see in this image?'),
        ContentItem(image=real_image_url)
    ]
)

print("🖼️  MULTIMODAL MESSAGE EXAMPLE")
print("="*70)
print(f"\nRole: {multimodal_msg.role}")
print(f"Content is a list: {isinstance(multimodal_msg.content, list)}")
print(f"Number of content items: {len(multimodal_msg.content)}")

for i, item in enumerate(multimodal_msg.content):
    print(f"\n📌 Item {i}:")
    print(f"   Type: {item.type}")
    if item.type == 'text':
        print(f"   Text: {item.value}")
    elif item.type == 'image':
        print(f"   Image URL: {item.value}")
        print(f"   URL length: {len(item.value)} chars")

# Show how to serialize/deserialize
msg_dict = multimodal_msg.model_dump()
print(f"\n📦 Serialized to dict:")
print(json.dumps(msg_dict, indent=2)[:400] + "...")

# Reconstruct from dict
reconstructed = Message(**msg_dict)
print(f"\n✅ Successfully reconstructed from dict")
print(f"   Equal to original: {msg_dict == reconstructed.model_dump()}")

🖼️  MULTIMODAL MESSAGE EXAMPLE

Role: user
Content is a list: True
Number of content items: 2

📌 Item 0:
   Type: text
   Text: What objects do you see in this image?

📌 Item 1:
   Type: image
   Image URL: https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen2-VL/demo_small.jpg
   URL length: 71 chars

📦 Serialized to dict:
{
  "role": "user",
  "content": [
    {
      "text": "What objects do you see in this image?"
    },
    {
      "image": "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen2-VL/demo_small.jpg"
    }
  ]
}...

✅ Successfully reconstructed from dict
   Equal to original: True



### 📝 Important Finding: reasoning_content with Fireworks API

**Key Discovery**: The Qwen3-235B-Thinking model on Fireworks API includes reasoning **within the `content` field**, NOT as separate `reasoning_content`.

```python
# What we get:
Message(
    role='assistant',
    content='Let me think... [reasoning] ... The answer is 4 books.',  # All in one
    reasoning_content=None  # ❌ Not populated
)

# vs. Native Qwen API (DashScope):
Message(
    role='assistant',
    reasoning_content='Let me think... [step by step]',  # ✅ Separate
    content='The answer is 4 books.'
)
```

This is an **API implementation difference**, not a bug. The reasoning is still present; it simply arrives in `content` instead of `reasoning_content`, so code that reads only `reasoning_content` will see nothing.
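
If you need the reasoning and the answer separately, you can split `content` yourself. The sketch below assumes the provider marks the end of the reasoning with a `</think>` tag. That delimiter is an assumption, so inspect your own raw output first. `split_reasoning` is a hypothetical helper, not a Qwen-Agent function.

```python
from typing import Optional, Tuple

THINK_END = '</think>'  # assumed delimiter; verify against your provider's output


def split_reasoning(content: str, marker: str = THINK_END) -> Tuple[Optional[str], str]:
    """Split inline reasoning from the final answer.

    Returns (reasoning, answer). If the marker is absent, reasoning is None
    and the content is returned unchanged.
    """
    if marker not in content:
        return None, content
    # Split at the last marker so everything before it counts as reasoning.
    reasoning, _, answer = content.rpartition(marker)
    # Drop an opening tag if the provider includes one.
    reasoning = reasoning.replace('<think>', '').strip()
    return reasoning, answer.strip()


raw = 'Let me think... 50 / 12 = 4 remainder 2.</think>The answer is 4 books.'
reasoning, answer = split_reasoning(raw)
print(reasoning)
print(answer)
```

If the response is cut off by `max_tokens` before the marker appears, the function returns the whole text as the answer, so a longer token budget may be needed.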

In [42]:
# ========================================
# EXAMPLE 1: Real API Call with Thinking Model
# ========================================
from qwen_agent.llm import get_chat_model
from qwen_agent.llm.schema import Message
import json
import os  # needed for os.environ below

# Configure with SHORT max_tokens to see concise output
llm_cfg_short = {
    'model': 'accounts/fireworks/models/qwen3-235b-a22b-thinking-2507',
    'model_server': 'https://api.fireworks.ai/inference/v1',
    'api_key': os.environ['FIREWORKS_API_KEY'],
    'generate_cfg': {
        'max_tokens': 500,  # Shorter for demo
        'temperature': 0.6,
    }
}

# Get LLM instance
llm = get_chat_model(llm_cfg_short)

# Create a simple math problem
messages = [
    {'role': 'system', 'content': 'You are a helpful math tutor. Explain your reasoning step by step.'},
    {'role': 'user', 'content': 'Solve: If a book costs $12 and I have $50, how many books can I buy?'}
]

print("🔥 Making REAL API call to Fireworks...")
print("Question: If a book costs $12 and I have $50, how many books can I buy?")
print("\n" + "="*70)

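# Call the API. With stream=True, each yielded item is the response list so far,
# so after the loop `responses` holds the complete response.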
responses = []
for response in llm.chat(messages=messages, stream=True):
    responses = response

# Extract the final message
if responses:
    final_msg = responses[-1]
    
    print("\n📊 RESPONSE STRUCTURE:")
    print(f"  Role: {final_msg.get('role')}")
    print(f"  Has content: {bool(final_msg.get('content'))}")
    print(f"  Has reasoning_content: {bool(final_msg.get('reasoning_content'))}")
    print(f"  Content type: {type(final_msg.get('content'))}")
    
    # Show reasoning if present
    if final_msg.get('reasoning_content'):
        reasoning = final_msg['reasoning_content']
        print(f"\n🤔 THINKING PROCESS ({len(reasoning)} chars):")
        print("─" * 70)
        print(reasoning[:500] + "..." if len(reasoning) > 500 else reasoning)
    
    # Show final answer
    if final_msg.get('content'):
        content = final_msg['content']
        print(f"\n💡 FINAL ANSWER ({len(content)} chars):")
        print("─" * 70)
        print(content[:300] + "..." if len(content) > 300 else content)
    
    print("\n" + "="*70)
    print("✅ Real API call complete!")

🔥 Making REAL API call to Fireworks...
Question: If a book costs $12 and I have $50, how many books can I buy?


📊 RESPONSE STRUCTURE:
  Role: assistant
  Has content: True
  Has reasoning_content: False
  Content type: <class 'str'>

💡 FINAL ANSWER (864 chars):
──────────────────────────────────────────────────────────────────────
Okay, let's see. The problem is: If a book costs $12 and I have $50, how many books can I buy? Alright, so I need to figure out how many times 12 goes into 50. That sounds like a division problem. Let me write that down.

First, total money I have is $50. Each book is $12. So the number of books I c...

✅ Real API call complete!

🔍 Full response keys: ['role', 'content']



---
## 🔥 Part 15: REAL EXAMPLES with Fireworks Thinking Model

### Now let's see ACTUAL API calls and REAL outputs!

This section demonstrates:
1. ✅ **Real Fireworks API calls** with Qwen3-235B-Thinking
2. ✅ **Actual reasoning extraction** from thinking models
3. ✅ **How to pass reasoning** back in conversations
4. ✅ **Multimodal messages** with real images
5. ✅ **Message extra fields** and metadata
6. ✅ **Complete conversation** with all message types

All cells below have **saved outputs** so you can see exactly what happens!

## Part 14: Practice Exercises

### Exercise 1: Create a Multimodal Message
Build a message that combines text and an image URL.

In [None]:
# TODO: Create a multimodal message
# Requirements:
# 1. Use ContentItem for both text and image
# 2. Text should ask a question about the image
# 3. Image URL can be any valid URL

# Your code here:
multimodal_exercise = None

# Test:
# print(multimodal_exercise)
# print(f"Is multimodal: {is_multimodal(multimodal_exercise)}")

### Exercise 2: Build a Conversation Parser
Create a function that analyzes a conversation and returns statistics.

In [None]:
# TODO: Implement conversation_stats()
# Should return:
# - Total messages
# - Messages per role
# - Number of function calls
# - Average message length
# - Has system message?

def conversation_stats(messages):
    """Analyze a conversation"""
    # Your code here
    pass

# Test with the conversation we built earlier:
# stats = conversation_stats(conversation)
# print(stats)

### Exercise 3: Message Filter
Filter a conversation to show only specific types of messages.

In [None]:
# TODO: Implement message filters

def filter_by_role(messages, role):
    """Return only messages with specific role"""
    # Your code here
    pass

def filter_function_calls(messages):
    """Return only messages with function calls"""
    # Your code here
    pass

def filter_multimodal(messages):
    """Return only multimodal messages"""
    # Your code here
    pass

# Test:
# user_msgs = filter_by_role(conversation, 'user')
# print(f"User messages: {len(user_msgs)}")

## Part 15: Key Takeaways

### What You Learned Today:

1. **Message Structure**
   - `role`: Who is speaking (system/user/assistant/function)
   - `content`: What is being said (str or List[ContentItem])
   - `reasoning_content`: Thinking process (for reasoning models)
   - `name`: Speaker identifier
   - `function_call`: Tool invocation data
   - `extra`: Additional metadata

2. **ContentItem for Multimodal**
   - Exactly ONE of: text, image, file, audio, video
   - Use `.type` and `.value` properties
   - Combine in lists for multimodal messages

3. **FunctionCall Structure**
   - `name`: Tool to execute
   - `arguments`: JSON string parameters
   - Enables agent tool use

4. **Message Utilities**
   - Dict-compatible interface
   - Serialization to/from JSON
   - Helper functions for analysis

### Common Patterns:

```python
# Pattern 1: Simple text message
msg = Message(role='user', content='Hello')

# Pattern 2: Multimodal message
msg = Message(
    role='user',
    content=[
        ContentItem(text='What is this?'),
        ContentItem(image='url')
    ]
)

# Pattern 3: Function call message
msg = Message(
    role='assistant',
    content='',
    function_call=FunctionCall(name='tool', arguments='{}')
)

# Pattern 4: Function result message
msg = Message(
    role='function',
    name='tool_name',
    content='result data'
)
```
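
Patterns 3 and 4 work as a pair, and `arguments` is a JSON string, not a dict. The sketch below uses plain dicts in place of `Message` and `FunctionCall` so that it runs without Qwen-Agent installed. The tool name `get_weather` and its parameters are hypothetical.

```python
import json

# Hypothetical tool parameters, for illustration only.
args = {'city': 'Beijing', 'unit': 'celsius'}

# Assistant message requesting a tool call (dict form of Pattern 3).
call_msg = {
    'role': 'assistant',
    'content': '',
    'function_call': {
        'name': 'get_weather',
        'arguments': json.dumps(args),  # stored as a JSON string
    },
}

# On the receiving side, decode the string before using the parameters.
decoded = json.loads(call_msg['function_call']['arguments'])
print(decoded['city'])  # Beijing

# Function result message (dict form of Pattern 4); name matches the call.
result_msg = {
    'role': 'function',
    'name': call_msg['function_call']['name'],
    'content': json.dumps({'temperature': 21}),
}
```

Keeping `name` identical in the call and the result lets you match each result to the call that produced it.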

## Part 16: Next Steps

### Tomorrow (Day 3): LLM Integration
We'll explore:
- BaseChatModel interface
- Different model backends (DashScope, vLLM, Ollama)
- Generation parameters
- Streaming internals
- Token management
- Direct LLM usage (without agents)

### Homework:
1. Create a conversation with at least 5 turns
2. Build a multimodal message with your own image
3. Implement the exercise functions above
4. Read the source: `/qwen_agent/llm/schema.py`
5. Experiment with reasoning_content (if you have QwQ access)

### Resources:
- [Schema Source Code](../qwen_agent/llm/schema.py)
- [Pydantic Documentation](https://docs.pydantic.dev/) - Message uses Pydantic
- [OpenAI Message Format](https://platform.openai.com/docs/api-reference/chat) - Similar structure

---

## 🎉 Day 2 Complete!

You now understand:
- ✅ Message structure and fields
- ✅ Role types and their purposes
- ✅ ContentItem for multimodal content
- ✅ FunctionCall basics
- ✅ Message serialization and utilities

See you tomorrow for Day 3! 🚀