# 09 — Thinking in Tool-Use (DeepSeek V3.2)

> **Purpose:** Implement the "reasoning while calling tools" paradigm. The model maintains its thinking thread across multiple tool invocations.

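**Key insight:** In a conventional tool-calling loop, the model's chain of thought is dropped once a tool returns, so it has to rebuild its plan from the visible conversation alone. DeepSeek V3.2 carries the reasoning thread through each tool call instead.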

| Approach | After Tool Call | Reasoning Continuity |
|----------|-----------------|---------------------|
| Traditional | Model restarts | ❌ Lost |
| **Thinking-in-Tool-Use** | Resume reasoning | ✅ Preserved |

---

In [None]:
import json
import re
from typing import Dict, List, Optional, Callable, Any
from dataclasses import dataclass, field
from enum import Enum

## 1. DeepSeek V3.2 Architecture Overview

```
┌─────────────────────────────────────────────────────────────────┐
│                    DeepSeek V3.2                                 │
│                    671B params (37B active)                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│    ┌─────────────┐     ┌─────────────┐     ┌─────────────┐      │
│    │   THINK     │ ──► │   TOOL      │ ──► │   THINK     │      │
│    │  (reason)   │     │  (call API) │     │  (continue) │      │
│    └─────────────┘     └─────────────┘     └─────────────┘      │
│          │                   │                   │               │
│          └───────────────────┴───────────────────┘               │
│                    Reasoning Thread Preserved                    │
│                                                                  │
├─────────────────────────────────────────────────────────────────┤
│    Trained on: 1,800+ environments, 85,000+ instructions        │
│    Modes: Thinking (CoT) | Non-Thinking (Direct)                │
│    Context: 128K tokens (DSA attention)                         │
└─────────────────────────────────────────────────────────────────┘
```

## 2. Tool Definition and Registry

In [None]:
@dataclass
class ToolDefinition:
    """Definition of an available tool."""
    name: str
    description: str
    parameters: Dict[str, Any]
    required: List[str] = field(default_factory=list)
    
    def to_schema(self) -> Dict:
        """Convert to OpenAI-style function schema."""
        return {
            "type": "function",
            "function": {
                "name": self.name,
                "description": self.description,
                "parameters": {
                    "type": "object",
                    "properties": self.parameters,
                    "required": self.required,
                }
            }
        }


class ToolRegistry:
    """Registry of available tools."""
    
    def __init__(self):
        self.tools: Dict[str, ToolDefinition] = {}
        self.executors: Dict[str, Callable] = {}
    
    def register(self, definition: ToolDefinition, 
                 executor: Callable) -> None:
        """Register a tool with its executor."""
        self.tools[definition.name] = definition
        self.executors[definition.name] = executor
    
    def get_schemas(self) -> List[Dict]:
        """Get all tool schemas for LLM."""
        return [t.to_schema() for t in self.tools.values()]
    
    def execute(self, name: str, arguments: Dict) -> str:
        """Execute a tool and return result."""
        if name not in self.executors:
            return f"Error: Unknown tool '{name}'"
        try:
            result = self.executors[name](**arguments)
            return json.dumps(result) if not isinstance(result, str) else result
        except Exception as e:
            return f"Error executing {name}: {str(e)}"

In [None]:
# Register example tools

registry = ToolRegistry()

# Calculator tool
registry.register(
    ToolDefinition(
        name="calculator",
        description="Perform arithmetic calculations",
        parameters={
            "expression": {"type": "string", "description": "Math expression to evaluate"}
        },
        required=["expression"]
    ),
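    # WARNING: eval() executes arbitrary Python. Acceptable for this demo only;
    # never pass it model-generated or user-supplied input in production.
    lambda expression: {"result": eval(expression)}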
)

# Search tool
registry.register(
    ToolDefinition(
        name="search",
        description="Search for information on a topic",
        parameters={
            "query": {"type": "string", "description": "Search query"}
        },
        required=["query"]
    ),
    lambda query: {"results": [f"Result for: {query}"]}
)

# Get weather tool
registry.register(
    ToolDefinition(
        name="get_weather",
        description="Get current weather for a location",
        parameters={
            "location": {"type": "string", "description": "City name"}
        },
        required=["location"]
    ),
    lambda location: {"temperature": 22, "condition": "sunny", "location": location}
)

print(f"Registered {len(registry.tools)} tools:")
for name in registry.tools:
    print(f"  - {name}")
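
The calculator above relies on `eval`, which the inline comment already flags as unsuitable for production. One possible replacement is sketched below: it walks the parsed syntax tree and accepts only numeric literals and a small whitelist of operators. The name `safe_arithmetic` is hypothetical and not part of the original notebook, and exponentiation is left out on purpose because very large powers can stall the interpreter.

```python
import ast
import operator

# Whitelisted operators; any other syntax node is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_arithmetic(expression: str):
    """Evaluate a purely arithmetic expression without calling eval()."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Accept plain int/float literals only (bool and str are rejected).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression element: {type(node).__name__}")

    return _eval(ast.parse(expression, mode="eval"))


print(safe_arithmetic("25 * 9/5 + 32"))
```

To use it with the registry, the calculator executor could become `lambda expression: {"result": safe_arithmetic(expression)}`; names, function calls, and attribute access then raise `ValueError`, which `ToolRegistry.execute` reports as an error string.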

## 3. Thinking Thread Management

The key innovation: preserve reasoning state across tool calls.

In [None]:
class ThinkingMode(Enum):
    """Model operating mode."""
    THINKING = "thinking"      # Full CoT reasoning
    NON_THINKING = "direct"    # Direct response


@dataclass
class ThinkingState:
    """State of the reasoning thread."""
    thoughts: List[str] = field(default_factory=list)
    tool_calls: List[Dict] = field(default_factory=list)
    tool_results: List[Dict] = field(default_factory=list)
    is_complete: bool = False
    final_answer: Optional[str] = None
    
    def add_thought(self, thought: str) -> None:
        """Add a reasoning step."""
        self.thoughts.append(thought)
    
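    def add_tool_call(self, tool_name: str, arguments: Dict) -> None:
        """Record a tool call against the most recent thought."""
        self.tool_calls.append({
            "tool": tool_name,
            "arguments": arguments,
            # 0-based index of the latest thought. get_context() matches on
            # this index, so the call is printed under the thought that
            # triggered it rather than under the following one.
            "step": len(self.thoughts) - 1
        })
    
    def add_tool_result(self, tool_name: str, result: str) -> None:
        """Record a tool result against the most recent thought."""
        self.tool_results.append({
            "tool": tool_name,
            "result": result,
            "step": len(self.thoughts) - 1
        })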
    
    def get_context(self) -> str:
        """Get full reasoning context for continuation."""
        lines = []
        
        for i, thought in enumerate(self.thoughts):
            lines.append(f"[Thought {i+1}] {thought}")
            
            # Add any tool calls/results at this step
            for tc in self.tool_calls:
                if tc['step'] == i:
                    lines.append(f"  → Tool: {tc['tool']}({tc['arguments']})")
            
            for tr in self.tool_results:
                if tr['step'] == i:
                    lines.append(f"  ← Result: {tr['result']}")
        
        return "\n".join(lines)
    
    def __str__(self) -> str:
        return self.get_context()

## 4. Thinking-Aware Agent

Agent that maintains reasoning across tool invocations.

In [None]:
class ThinkingAgent:
    """
    Agent that maintains reasoning thread across tool calls.
    
    Inspired by DeepSeek V3.2's thinking-in-tool-use approach.
    """
    
    def __init__(self,
                 tool_registry: ToolRegistry,
                 mode: ThinkingMode = ThinkingMode.THINKING,
                 max_steps: int = 10):
        
        self.registry = tool_registry
        self.mode = mode
        self.max_steps = max_steps
    
    def think(self, query: str) -> ThinkingState:
        """
        Process a query with maintained reasoning.
        
        Args:
            query: User's question/task
        
        Returns:
            ThinkingState with full reasoning trace
        """
        state = ThinkingState()
        
        # Initial thought
        state.add_thought(f"I need to answer: {query}")
        
        # Simulate reasoning loop
        step = 0
        while step < self.max_steps and not state.is_complete:
            step += 1
            
            # Decide: think more, call tool, or answer
            action = self._decide_action(query, state)
            
            if action['type'] == 'think':
                state.add_thought(action['content'])
            
            elif action['type'] == 'tool_call':
                # Record the decision to call tool
                state.add_thought(
                    f"I need to use {action['tool']} to {action['reason']}"
                )
                state.add_tool_call(action['tool'], action['arguments'])
                
                # Execute tool
                result = self.registry.execute(
                    action['tool'], action['arguments']
                )
                state.add_tool_result(action['tool'], result)
                
                # Continue reasoning with result
                state.add_thought(
                    f"The {action['tool']} returned: {result}. "
                    f"I can now continue my reasoning."
                )
            
            elif action['type'] == 'answer':
                state.add_thought("I now have enough information to answer.")
                state.final_answer = action['content']
                state.is_complete = True
        
        return state
    
    def _decide_action(self, query: str, state: ThinkingState) -> Dict:
        """
        Decide next action based on current state.
        
        In production: this would be the LLM inference call.
        Here: simplified simulation.
        """
        num_thoughts = len(state.thoughts)
        num_tool_calls = len(state.tool_calls)
        
        # Simulation logic
        if "weather" in query.lower() and num_tool_calls == 0:
            # Extract location (simplified)
            location = "London"  # Would be extracted by LLM
            return {
                'type': 'tool_call',
                'tool': 'get_weather',
                'arguments': {'location': location},
                'reason': 'get current weather data'
            }
        
        elif "calculate" in query.lower() and num_tool_calls == 0:
            # Extract expression (simplified)
            expr = "2 + 2"  # Would be extracted by LLM
            return {
                'type': 'tool_call',
                'tool': 'calculator',
                'arguments': {'expression': expr},
                'reason': 'perform the calculation'
            }
        
        elif num_thoughts < 3:
            return {
                'type': 'think',
                'content': f"Analyzing the query (step {num_thoughts + 1})..."
            }
        
        else:
            return {
                'type': 'answer',
                'content': f"Based on my reasoning, here is the answer to '{query}'"
            }

In [None]:
# TEST: Thinking agent with tool use

agent = ThinkingAgent(registry, mode=ThinkingMode.THINKING)

# Query requiring tool use
state = agent.think("What's the weather in London?")

print("Reasoning Trace:")
print("=" * 50)
print(state.get_context())
print("=" * 50)
print(f"\nFinal Answer: {state.final_answer}")
print(f"Tool Calls Made: {len(state.tool_calls)}")
print(f"Reasoning Steps: {len(state.thoughts)}")
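
The agent above simulates its decisions with keyword checks. As a complement, here is a minimal sketch of the same loop written against OpenAI-style chat messages, which is where the reasoning thread lives when a real model is involved. Everything in it is an assumption for illustration: `fake_model` is a hypothetical stand-in for the inference call, `TOOLS` is a tiny local table rather than the `ToolRegistry`, and the `reasoning_content` field name is taken from the response format discussed in section 7.

```python
import json
from typing import Callable, Dict, List


def fake_model(messages: List[Dict]) -> Dict:
    """Hypothetical stand-in for an LLM call; returns one assistant message."""
    if not any(m["role"] == "tool" for m in messages):
        # First pass: reason, then request a tool.
        return {
            "role": "assistant",
            "reasoning_content": "I need live weather data before I can answer.",
            "content": "",
            "tool_calls": [{
                "id": "call_1",
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "arguments": json.dumps({"location": "London"}),
                },
            }],
        }
    # Second pass: a tool result is present, so produce the final answer.
    return {
        "role": "assistant",
        "reasoning_content": "The tool result has the temperature, so I can answer.",
        "content": "It is 22°C and sunny in London.",
    }


# Hypothetical tool table; a real agent would use the ToolRegistry instead.
TOOLS: Dict[str, Callable[..., Dict]] = {
    "get_weather": lambda location: {
        "temperature": 22, "condition": "sunny", "location": location,
    },
}


def run_turn(user_query: str, model: Callable[[List[Dict]], Dict],
             max_steps: int = 5) -> List[Dict]:
    """Run one user turn, keeping every assistant message in the history."""
    messages: List[Dict] = [{"role": "user", "content": user_query}]
    for _ in range(max_steps):
        reply = model(messages)
        # Append the whole reply, reasoning_content included, so the next
        # model call can see the earlier reasoning.
        messages.append(reply)
        tool_calls = reply.get("tool_calls") or []
        if not tool_calls:
            break  # No tool requested: this reply is the final answer.
        for call in tool_calls:
            fn = call["function"]
            result = TOOLS[fn["name"]](**json.loads(fn["arguments"]))
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(result),
            })
    return messages


history = run_turn("What's the weather in London?", fake_model)
for message in history:
    print(message["role"], "|", message.get("reasoning_content", ""))
```

Whether a given provider expects `reasoning_content` to be sent back between tool calls, and when it should be dropped, is worth confirming in that provider's documentation before relying on this pattern.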

## 5. Multi-Step Tool Orchestration

Handle complex tasks requiring multiple tool calls.

In [None]:
class MultiStepOrchestrator:
    """
    Orchestrate complex multi-step tool interactions.
    
    DeepSeek V3.2 can handle 100+ sequential tool calls
    while maintaining reasoning coherence.
    """
    
    def __init__(self, agent: ThinkingAgent):
        self.agent = agent
    
    def execute_plan(self, plan: List[Dict]) -> ThinkingState:
        """
        Execute a multi-step plan.
        
        Args:
            plan: List of steps, each with 'action' and 'params'
        
        Returns:
            ThinkingState with full execution trace
        """
        state = ThinkingState()
        state.add_thought(f"Executing plan with {len(plan)} steps")
        
        for i, step in enumerate(plan):
            state.add_thought(f"Step {i+1}: {step.get('description', 'Executing...')}")
            
            if step['action'] == 'tool_call':
                tool_name = step['tool']
                arguments = step['arguments']
                
                state.add_tool_call(tool_name, arguments)
                result = self.agent.registry.execute(tool_name, arguments)
                state.add_tool_result(tool_name, result)
                
                # Integrate result into reasoning (long results are truncated)
                preview = result if len(result) <= 100 else f"{result[:100]}..."
                state.add_thought(f"Step {i+1} complete. Result: {preview}")
            
            elif step['action'] == 'reason':
                state.add_thought(step['content'])
        
        state.add_thought("Plan execution complete.")
        state.is_complete = True
        
        return state

In [None]:
# TEST: Multi-step orchestration

orchestrator = MultiStepOrchestrator(agent)

plan = [
    {
        'action': 'tool_call',
        'description': 'Search for Python tutorials',
        'tool': 'search',
        'arguments': {'query': 'Python tutorial 2025'}
    },
    {
        'action': 'reason',
        'content': 'Now I need to analyze the search results'
    },
    {
        'action': 'tool_call',
        'description': 'Calculate estimated learning time',
        'tool': 'calculator',
        'arguments': {'expression': '30 * 7'}  # 30 min/day for 7 days
    },
    {
        'action': 'reason',
        'content': 'Based on search and calculation, I can provide recommendations'
    },
]

result = orchestrator.execute_plan(plan)

print("Multi-Step Execution:")
print("=" * 50)
print(result.get_context())
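
`execute_plan` assumes each step is well formed: an unknown `action` is silently skipped, and a missing `tool` or `arguments` key raises `KeyError` partway through the run. A small pre-flight check, sketched below, can catch such problems before any tool is executed. `validate_plan` is a hypothetical helper, not part of the original notebook; it reads the OpenAI-style schemas that `ToolRegistry.get_schemas()` produces.

```python
from typing import Dict, List


def validate_plan(plan: List[Dict], schemas: List[Dict]) -> List[str]:
    """Return a list of problems; an empty list means the plan looks runnable."""
    # Map each tool name to the set of arguments its schema marks as required.
    required_by_tool = {
        s["function"]["name"]: set(s["function"]["parameters"].get("required", []))
        for s in schemas
    }
    problems: List[str] = []
    for i, step in enumerate(plan, start=1):
        action = step.get("action")
        if action == "reason":
            if "content" not in step:
                problems.append(f"step {i}: 'reason' step has no content")
        elif action == "tool_call":
            tool = step.get("tool")
            if tool not in required_by_tool:
                problems.append(f"step {i}: unknown tool {tool!r}")
                continue
            missing = required_by_tool[tool] - set(step.get("arguments", {}))
            if missing:
                problems.append(f"step {i}: missing arguments {sorted(missing)}")
        else:
            problems.append(f"step {i}: unknown action {action!r}")
    return problems


# Example schema in the same shape as ToolDefinition.to_schema() output.
schemas = [{
    "type": "function",
    "function": {
        "name": "search",
        "description": "Search for information on a topic",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

good_plan = [
    {"action": "tool_call", "tool": "search", "arguments": {"query": "Python"}},
    {"action": "reason", "content": "Summarize the results"},
]
bad_plan = [
    {"action": "tool_call", "tool": "search", "arguments": {}},
    {"action": "sleep"},
]

print(validate_plan(good_plan, schemas))
print(validate_plan(bad_plan, schemas))
```

With the notebook's own objects, the call would be `validate_plan(plan, registry.get_schemas())` ahead of `orchestrator.execute_plan(plan)`.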

## 6. API Integration Pattern

How to enable thinking mode, both through the hosted DeepSeek API and through a self-hosted vLLM deployment.

In [None]:
def create_deepseek_request(
    messages: List[Dict],
    tools: List[Dict],
    thinking_enabled: bool = True,
) -> Dict:
    """
    Create DeepSeek V3.2 API request with thinking mode.
    
    Args:
        messages: Conversation history
        tools: Tool definitions
        thinking_enabled: Whether to use thinking mode
    
    Returns:
        API request payload
    """
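    request = {
        # Assumption: the hosted API serves V3.2 under the generic model ID
        # "deepseek-chat"; confirm against the provider's current model list.
        "model": "deepseek-chat",
        "messages": messages,
        "tools": tools,
        "tool_choice": "auto",
    }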
    
    if thinking_enabled:
        # DeepSeek API thinking mode
        request["thinking"] = {"type": "enabled"}
    
    return request


def create_vllm_request(
    messages: List[Dict],
    tools: List[Dict],
    thinking_enabled: bool = True,
) -> Dict:
    """
    Create vLLM request with thinking mode.
    
    Args:
        messages: Conversation history
        tools: Tool definitions
        thinking_enabled: Whether to use thinking mode
    
    Returns:
        vLLM request payload
    """
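    request = {
        # Must match the model name your vLLM server was launched with.
        # The Hugging Face ID for V3.2 is used here (V3-0324 is an earlier
        # release).
        "model": "deepseek-ai/DeepSeek-V3.2",
        "messages": messages,
        "tools": tools,
    }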
    
    if thinking_enabled:
        # vLLM thinking mode via chat template
        request["chat_template_kwargs"] = {"thinking": True}
    
    return request

In [None]:
# Example API request

messages = [
    {"role": "user", "content": "What's the weather in Tokyo and convert 25°C to °F?"}
]

tools = registry.get_schemas()

# DeepSeek API request
deepseek_request = create_deepseek_request(messages, tools, thinking_enabled=True)

print("DeepSeek V3.2 API Request:")
print(json.dumps(deepseek_request, indent=2))
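
The payload above is only a dictionary; nothing has been sent yet. The sketch below shows one way to wrap such a payload in an HTTP request using only the standard library. It builds the request without sending it, so it runs offline. The endpoint URL, the bearer-token header, the placeholder key, and the `build_http_request` helper are assumptions for illustration and should be checked against the provider's API reference.

```python
import json
import urllib.request
from typing import Dict

# Assumed endpoint for an OpenAI-compatible chat completions API.
DEFAULT_URL = "https://api.deepseek.com/chat/completions"


def build_http_request(payload: Dict, api_key: str,
                       url: str = DEFAULT_URL) -> urllib.request.Request:
    """Wrap a chat payload in a POST request (hypothetical helper)."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


payload = {
    "model": "<model-id>",  # placeholder: substitute your provider's model ID
    "messages": [{"role": "user", "content": "What's the weather in Tokyo?"}],
    "thinking": {"type": "enabled"},
}
req = build_http_request(payload, api_key="sk-placeholder")

print(req.get_method(), req.full_url)

# Sending is left commented out because it needs network access and a real key:
# with urllib.request.urlopen(req, timeout=60) as resp:
#     response = json.load(resp)
```

In practice an OpenAI-compatible client library would usually replace this helper; the point here is only to show where the `thinking` field sits in the request body.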

## 7. Response Parsing

Parse DeepSeek V3.2 responses with separate reasoning and content.

In [None]:
@dataclass
class ThinkingResponse:
    """Parsed response with thinking content."""
    reasoning_content: str  # Internal chain-of-thought
    content: str            # Final answer
    tool_calls: List[Dict]  # Any tool calls made
    

def parse_thinking_response(response: Dict) -> ThinkingResponse:
    """
    Parse DeepSeek V3.2 response.
    
    The model separates:
    - reasoning_content: Internal thinking (can be exposed)
    - content: Final response to user
    """
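    # Guard against an empty "choices" list as well as a missing key.
    choices = response.get("choices") or [{}]
    message = choices[0].get("message") or {}
    
    # Fields can be present but null (for example, content on a tool-call
    # turn), so fall back with `or` instead of relying on .get() defaults.
    return ThinkingResponse(
        reasoning_content=message.get("reasoning_content") or "",
        content=message.get("content") or "",
        tool_calls=message.get("tool_calls") or [],
    )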


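# Simulated response. For brevity it folds a multi-turn exchange into one
# message; a real tool-call turn typically carries tool_calls with empty
# content, and the final answer arrives in a later response.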
simulated_response = {
    "choices": [{
        "message": {
            "reasoning_content": (
                "I need to get weather for Tokyo. "
                "Then convert temperature. "
                "Formula is °F = °C × 9/5 + 32."
            ),
            "content": "The weather in Tokyo is sunny, 25°C (77°F).",
            "tool_calls": [
                {"function": {"name": "get_weather", "arguments": '{"location": "Tokyo"}'}},
                {"function": {"name": "calculator", "arguments": '{"expression": "25 * 9/5 + 32"}'}},
            ]
        }
    }]
}

parsed = parse_thinking_response(simulated_response)

print("Parsed Response:")
print(f"  Reasoning: {parsed.reasoning_content}")
print(f"  Answer: {parsed.content}")
print(f"  Tool Calls: {len(parsed.tool_calls)}")
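
Parsing only counts the tool calls. To act on them, each call's `arguments` field, which arrives as a JSON string rather than a dictionary, has to be decoded before the executor is invoked. The sketch below shows one way to do that defensively. `dispatch_tool_calls` and the `executors` table are hypothetical names introduced for illustration; the message shape follows the simulated response above.

```python
import json
from typing import Callable, Dict, List


def dispatch_tool_calls(tool_calls: List[Dict],
                        executors: Dict[str, Callable[..., object]]) -> List[Dict]:
    """Decode and run each tool call, collecting results or error notes."""
    results: List[Dict] = []
    for call in tool_calls:
        fn = call.get("function", {})
        name = fn.get("name", "")
        try:
            # Arguments are a JSON-encoded string, not a dict.
            arguments = json.loads(fn.get("arguments") or "{}")
        except json.JSONDecodeError as exc:
            results.append({"tool": name, "error": f"invalid JSON arguments: {exc}"})
            continue
        if name not in executors:
            results.append({"tool": name, "error": "unknown tool"})
            continue
        results.append({"tool": name, "result": executors[name](**arguments)})
    return results


# Hypothetical executor table with a single stubbed tool.
executors = {
    "get_weather": lambda location: {
        "temperature": 25, "condition": "sunny", "location": location,
    },
}

calls = [
    {"function": {"name": "get_weather", "arguments": '{"location": "Tokyo"}'}},
    {"function": {"name": "get_weather", "arguments": "{bad"}},
    {"function": {"name": "unknown_tool", "arguments": "{}"}},
]

for outcome in dispatch_tool_calls(calls, executors):
    print(outcome)
```

Malformed arguments and unknown tool names are reported as entries in the result list instead of raising, so the model can be shown the error and given a chance to correct the call.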

## 8. Summary: Thinking in Tool-Use

| Aspect | DeepSeek V3.2 Approach |
|--------|------------------------|
| **Reasoning Persistence** | Thread maintained across tool calls |
| **Mode Toggle** | `"thinking": {"type": "enabled"}` in the request payload |
| **Output Structure** | Separate `reasoning_content` + `content` |
| **Training Data** | 1,800+ environments, 85,000+ instructions |
| **Max Tool Calls** | 100+ sequential without context loss |

### Key Implementation Points

1. **Preserve state** between tool calls
2. **Separate reasoning** from final output
3. **Plan before action** using CoT
4. **Integrate results** back into reasoning thread

---
**Tier 3 Section 09 Complete!**