
# Stage 3-1: Function Calling Format & Minimal ReAct

**Goals:**
- Implement OpenAI-style function calling JSON output
- Build tool registry with pydantic schema validation
- Design minimal ReAct pattern (Thought→Action→Observation)
- Ensure safe tool execution with whitelist

**Prerequisites:**
- Stage 1 LLMAdapter working
- Basic understanding of JSON schema and pydantic


In [None]:
# Stage 3: Function Calling Format & Minimal ReAct
# nb20_function_calling_format.ipynb

# %% Cell 1: Shared Cache Bootstrap
import os, pathlib, torch
import sys
from datetime import datetime

# Shared cache configuration (copy into every notebook)
AI_CACHE_ROOT = os.getenv("AI_CACHE_ROOT", "../ai_warehouse/cache")

for k, v in {
    "HF_HOME": f"{AI_CACHE_ROOT}/hf",
    "TRANSFORMERS_CACHE": f"{AI_CACHE_ROOT}/hf/transformers",
    "HF_DATASETS_CACHE": f"{AI_CACHE_ROOT}/hf/datasets",
    "HUGGINGFACE_HUB_CACHE": f"{AI_CACHE_ROOT}/hf/hub",
    "TORCH_HOME": f"{AI_CACHE_ROOT}/torch",
}.items():
    os.environ[k] = v
    pathlib.Path(v).mkdir(parents=True, exist_ok=True)
print("[Cache]", AI_CACHE_ROOT, "| GPU:", torch.cuda.is_available())

In [None]:
# %% Cell 2: Import and Setup
import json, re, ast, operator
from typing import Dict, List, Any, Optional, Union
from pydantic import BaseModel, Field, validator
from dataclasses import dataclass
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

# Initialize LLM (lightweight for demo)
MODEL_ID = "Qwen/Qwen2.5-7B-Instruct"
print(f"Loading {MODEL_ID}...")

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    device_map="auto",
    torch_dtype="auto",
    # Low VRAM option. Passing load_in_4bit=True directly to from_pretrained
    # is deprecated; wrap it in a BitsAndBytesConfig instead.
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)
print(f"Model loaded on: {model.device}")

In [None]:
# %% Cell 3: Tool Schema Definition with Pydantic
class CalculatorArgs(BaseModel):
    """Safe arithmetic calculator tool arguments"""

    expression: str = Field(
        ..., description="Mathematical expression to evaluate (safe operators only)"
    )

    @validator("expression")
    def validate_expression(cls, v):
        # Basic safety: only allow numbers, basic operators, parentheses
        allowed_chars = set("0123456789+-*/().,e ")
        if not all(c in allowed_chars for c in v.replace(" ", "")):
            raise ValueError("Expression contains unsafe characters")
        return v


class SearchArgs(BaseModel):
    """Web search tool arguments"""

    query: str = Field(..., description="Search query string")
    max_results: int = Field(5, description="Maximum number of results to return")


class FileReadArgs(BaseModel):
    """File reading tool arguments"""

    filepath: str = Field(..., description="Path to file to read")

    @validator("filepath")
    def validate_filepath(cls, v):
        # Security: only allow certain paths
        if not v.startswith(("data/", "configs/", "outs/")):
            raise ValueError("File path not in allowed directories")
        return v


# Tool registry mapping
TOOL_SCHEMAS = {
    "calculator": CalculatorArgs,
    "web_search": SearchArgs,
    "file_read": FileReadArgs,
}

print("✓ Tool schemas registered:", list(TOOL_SCHEMAS.keys()))

In [None]:
# %% Cell 4: Tool Implementation Functions
def execute_calculator(expression: str) -> str:
    """Execute safe arithmetic calculation"""
    try:
        # Parse and validate AST for safety
        tree = ast.parse(expression, mode="eval")

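        # Only allow safe node types. Operators are listed explicitly so that
        # bitwise and other unlisted operators are rejected. ast.Num is
        # omitted: it is deprecated and ast.Constant already covers numbers.
        allowed_nodes = (
            ast.Expression,
            ast.BinOp,
            ast.UnaryOp,
            ast.Constant,
        )
        allowed_ops = (ast.Add, ast.Sub, ast.Mult, ast.Div, ast.Pow, ast.USub, ast.UAdd)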

        for node in ast.walk(tree):
            if not isinstance(node, allowed_nodes + allowed_ops):
                raise ValueError(f"Unsafe operation: {type(node).__name__}")

        # Evaluate with builtins disabled as an extra layer of defense
        result = eval(compile(tree, "<string>", "eval"), {"__builtins__": {}}, {})
        return f"Result: {result}"

    except Exception as e:
        return f"Error: {str(e)}"


def execute_web_search(query: str, max_results: int = 5) -> str:
    """Mock web search (replace with real implementation)"""
    # Placeholder: a real implementation could use a library such as duckduckgo-search
    mock_results = [
        f"Result {i+1}: {query} related content..." for i in range(min(max_results, 3))
    ]
    return f"Search results for '{query}':\n" + "\n".join(mock_results)


def execute_file_read(filepath: str) -> str:
    """Read file content safely"""
    try:
        # Additional security check
        if not filepath.startswith(("data/", "configs/", "outs/")):
            return "Error: File path not allowed"

        # Mock file reading (replace with actual file operations)
        return f"Content of {filepath}: [Mock file content...]"

    except Exception as e:
        return f"Error reading file: {str(e)}"


# Tool execution registry
TOOL_EXECUTORS = {
    "calculator": execute_calculator,
    "web_search": execute_web_search,
    "file_read": execute_file_read,
}

print("✓ Tool executors registered")
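
The calculator above validates the AST and then hands the expression to `eval`. A possible alternative, sketched here under assumptions, walks the tree and computes the result itself so that `eval` is never called. The function name `safe_eval_arithmetic` and the `MAX_EXPONENT` limit are illustrative and not part of the notebook.

```python
import ast
import operator
from typing import Union

# Map each permitted AST operator type to the function that implements it.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}

MAX_EXPONENT = 100  # assumed limit to keep ** from producing huge numbers


def safe_eval_arithmetic(expression: str) -> Union[int, float]:
    """Evaluate +, -, *, /, ** and parentheses without calling eval()."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant):
            # bool is a subclass of int, so reject it explicitly
            if isinstance(node.value, bool) or not isinstance(node.value, (int, float)):
                raise ValueError(f"Unsupported constant: {node.value!r}")
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            left, right = _eval(node.left), _eval(node.right)
            if isinstance(node.op, ast.Pow) and abs(right) > MAX_EXPONENT:
                raise ValueError("Exponent too large")
            return _BINARY_OPS[type(node.op)](left, right)
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported syntax: {type(node).__name__}")

    return _eval(ast.parse(expression, mode="eval"))


print(safe_eval_arithmetic("25 * 4 + 17"))  # 117
```

Anything outside the two operator tables (function calls, names, comparisons, bitwise operators) falls through to the final `ValueError`. The exponent cap only limits a single `**` step, so chained powers can still grow large; treat it as a starting point rather than a complete resource limit.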

In [None]:
# %% Cell 5: Function Calling Prompt Template
def create_function_calling_prompt(user_query: str, available_tools: List[str]) -> str:
    """Create prompt for function calling with specific JSON format"""

    tool_descriptions = []
    for tool_name in available_tools:
        if tool_name == "calculator":
            tool_descriptions.append(
                """
- calculator: Perform mathematical calculations
  Parameters: {"expression": "math expression using +,-,*,/,(),numbers"}"""
            )
        elif tool_name == "web_search":
            tool_descriptions.append(
                """
- web_search: Search the web for information
  Parameters: {"query": "search terms", "max_results": 5}"""
            )
        elif tool_name == "file_read":
            tool_descriptions.append(
                """
- file_read: Read file contents
  Parameters: {"filepath": "path/to/file"}"""
            )

    tools_text = "\n".join(tool_descriptions)

    prompt = f"""You are a helpful assistant that can use tools to answer questions.

Available tools:
{tools_text}

When you need to use a tool, respond ONLY with valid JSON in this exact format:
{{"tool": "tool_name", "args": {{"param1": "value1", "param2": "value2"}}}}

If you don't need tools, respond normally.

User question: {user_query}"""

    return prompt
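
The `if/elif` chain above needs a new branch for every tool. One way to avoid that, shown as a sketch, is to keep the description text in a lookup table and render it in a loop. `TOOL_DESCRIPTIONS` and `render_tool_descriptions` are hypothetical names introduced for illustration; the summaries and example parameters are copied from the prompt cell.

```python
import json
from typing import Any, Dict, List

# Hypothetical metadata table: one entry per tool, holding the text the
# prompt needs. Contents mirror the hard-coded strings in the prompt cell.
TOOL_DESCRIPTIONS: Dict[str, Dict[str, Any]] = {
    "calculator": {
        "summary": "Perform mathematical calculations",
        "example_args": {"expression": "math expression using +,-,*,/,(),numbers"},
    },
    "web_search": {
        "summary": "Search the web for information",
        "example_args": {"query": "search terms", "max_results": 5},
    },
    "file_read": {
        "summary": "Read file contents",
        "example_args": {"filepath": "path/to/file"},
    },
}


def render_tool_descriptions(available_tools: List[str]) -> str:
    """Build the 'Available tools' text for the requested tool names."""
    lines = []
    for name in available_tools:
        meta = TOOL_DESCRIPTIONS.get(name)
        if meta is None:
            continue  # unknown names are skipped, as in the if/elif version
        lines.append(f"- {name}: {meta['summary']}")
        lines.append(f"  Parameters: {json.dumps(meta['example_args'])}")
    return "\n".join(lines)


print(render_tool_descriptions(["calculator", "file_read"]))
```

Adding a tool then means adding one dictionary entry; the prompt function itself does not change.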

In [None]:
# %% Cell 6: JSON Parsing and Validation
def parse_tool_call(response: str) -> Optional[Dict[str, Any]]:
    """Extract and validate tool call from LLM response"""

    # Try to find JSON in response
    json_match = re.search(r"\{.*\}", response, re.DOTALL)
    if not json_match:
        return None

    try:
        # Parse JSON
        tool_call = json.loads(json_match.group())

        # Validate structure
        if not isinstance(tool_call, dict):
            return None
        if "tool" not in tool_call or "args" not in tool_call:
            return None

        tool_name = tool_call["tool"]
        args = tool_call["args"]

        # Validate tool exists
        if tool_name not in TOOL_SCHEMAS:
            return None

        # Validate arguments with pydantic
        schema_class = TOOL_SCHEMAS[tool_name]
        validated_args = schema_class(**args)

        return {"tool": tool_name, "args": validated_args.dict()}

    except (json.JSONDecodeError, ValueError, TypeError) as e:
        print(f"Tool call parsing/validation error: {e}")
        return None
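
The greedy pattern `\{.*\}` matches from the first `{` to the last `}` in the response, so a reply that mixes prose containing braces with the tool call will fail to parse. A more tolerant approach, sketched below, tries `json.JSONDecoder.raw_decode` at each `{` until one position yields a JSON object. The helper name `extract_first_json_object` is an assumption for illustration.

```python
import json
from typing import Any, Dict, Optional


def extract_first_json_object(text: str) -> Optional[Dict[str, Any]]:
    """Return the first decodable JSON object embedded in text, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value beginning at `start` and
            # ignores whatever text follows it.
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        start = text.find("{", start + 1)
    return None


reply = 'Sure {see below}: {"tool": "calculator", "args": {"expression": "1 + 2"}} Done.'
print(extract_first_json_object(reply))
```

If the outer object is truncated, this helper may return a nested object instead, so the `"tool"`/`"args"` structure check and the pydantic validation are still needed afterwards.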

In [None]:
# %% Cell 7: Tool Execution with Safety
def execute_tool_safely(tool_call: Dict[str, Any]) -> str:
    """Execute tool call with safety checks"""

    tool_name = tool_call["tool"]
    args = tool_call["args"]

    # Check if tool exists in whitelist
    if tool_name not in TOOL_EXECUTORS:
        return f"Error: Tool '{tool_name}' not found in whitelist"

    # Get executor function
    executor = TOOL_EXECUTORS[tool_name]

    try:
        # Arguments were validated against the tool's pydantic schema, whose
        # field names match the executor's parameter names, so they can be
        # unpacked directly instead of dispatching per tool.
        return executor(**args)

    except Exception as e:
        return f"Tool execution error: {str(e)}"

In [None]:
# %% Cell 8: Minimal ReAct Loop
def minimal_react_loop(user_query: str, max_iterations: int = 3) -> str:
    """Minimal ReAct: Thought → Action → Observation loop"""

    available_tools = ["calculator", "web_search", "file_read"]
    conversation_history = []

    for iteration in range(max_iterations):
        print(f"\n--- Iteration {iteration + 1} ---")

        # Create prompt with context
        if conversation_history:
            context = "\n".join(conversation_history)
            full_prompt = (
                f"Previous context:\n{context}\n\n"
                + create_function_calling_prompt(user_query, available_tools)
            )
        else:
            full_prompt = create_function_calling_prompt(user_query, available_tools)

        # Generate response
        messages = [{"role": "user", "content": full_prompt}]
        inputs = tokenizer.apply_chat_template(
            messages, return_tensors="pt", add_generation_prompt=True
        ).to(model.device)

        with torch.no_grad():
            outputs = model.generate(
                inputs,
                max_new_tokens=256,
                temperature=0.3,
                do_sample=True,
                pad_token_id=tokenizer.eos_token_id,
            )

        response = tokenizer.decode(
            outputs[0][inputs.shape[1] :], skip_special_tokens=True
        )
        print(f"LLM Response: {response}")

        # Check if response contains tool call
        tool_call = parse_tool_call(response)

        if tool_call:
            print(f"Tool Call Detected: {tool_call}")

            # Execute tool
            observation = execute_tool_safely(tool_call)
            print(f"Tool Result: {observation}")

            # Add to conversation history
            conversation_history.append(f"Action: {tool_call}")
            conversation_history.append(f"Observation: {observation}")

            # Continue loop for potential follow-up
            continue
        else:
            # No tool call - this might be the final answer
            print(f"Final Answer: {response}")
            return response

    return "Maximum iterations reached. Please try a simpler query."
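
Because the loop above calls the model directly, its control flow can only be exercised with the model loaded. As a sketch of one way to check the logic on its own, the version below takes the generate, parse, and execute steps as parameters and runs them against scripted stand-ins. All names here (`run_react_loop`, `fake_generate`, `fake_parse`, `fake_execute`) are hypothetical.

```python
import json
from typing import Any, Callable, Dict, List, Optional

ToolCall = Dict[str, Any]


def run_react_loop(
    user_query: str,
    generate: Callable[[str, List[str]], str],
    parse: Callable[[str], Optional[ToolCall]],
    execute: Callable[[ToolCall], str],
    max_iterations: int = 3,
) -> str:
    """Same control flow as the notebook loop, with the steps injected."""
    history: List[str] = []
    for _ in range(max_iterations):
        response = generate(user_query, history)
        tool_call = parse(response)
        if tool_call is None:
            return response  # no tool call: treat as the final answer
        observation = execute(tool_call)
        history.append(f"Action: {tool_call}")
        history.append(f"Observation: {observation}")
    return "Maximum iterations reached."


# Scripted stand-ins for the model and tools (illustrative only).
def fake_generate(query: str, history: List[str]) -> str:
    if not history:
        return '{"tool": "calculator", "args": {"expression": "15 + 27"}}'
    return "15 + 27 = 42, which is greater than 40."


def fake_parse(response: str) -> Optional[ToolCall]:
    try:
        call = json.loads(response)
    except json.JSONDecodeError:
        return None
    return call if isinstance(call, dict) and "tool" in call else None


def fake_execute(call: ToolCall) -> str:
    return "Result: 42"


print(run_react_loop("Is 15 + 27 greater than 40?", fake_generate, fake_parse, fake_execute))
```

The first iteration produces a tool call and records an observation; the second sees the history, answers in plain text, and ends the loop.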

In [None]:
# %% Cell 9: Smoke Test
print("=== Function Calling Smoke Test ===")

# Test 1: Simple calculation
print("\n1. Testing Calculator:")
calc_prompt = create_function_calling_prompt("What is 25 * 4 + 17?", ["calculator"])
print("Prompt:", calc_prompt[:100] + "...")

# Test tool call parsing
sample_json = '{"tool": "calculator", "args": {"expression": "25 * 4 + 17"}}'
parsed = parse_tool_call(sample_json)
print("Parsed tool call:", parsed)

if parsed:
    result = execute_tool_safely(parsed)
    print("Execution result:", result)

# Test 2: Invalid tool call handling
print("\n2. Testing Invalid JSON:")
invalid_json = '{"invalid": "format"}'
parsed_invalid = parse_tool_call(invalid_json)
print("Invalid parse result:", parsed_invalid)

# Test 3: Mini ReAct loop
print("\n3. Testing Mini ReAct Loop:")
try:
    react_result = minimal_react_loop(
        "Calculate 15 + 27 and tell me if it's greater than 40"
    )
    print(f"ReAct Final Result: {react_result}")
except Exception as e:
    print(f"ReAct Error: {e}")

print("\n✓ Smoke test completed")

## Key Parameters & Low-VRAM Options

**Model Loading:**
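- `load_in_4bit=True` - Stores weights in 4-bit precision, cutting weight memory to roughly a quarter of fp16 (activations and the KV cache are not reduced)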
- `device_map="auto"` - Automatic GPU/CPU distribution
- Alternative: Use `llama.cpp` with GGUF for even lower memory

**Generation Parameters:**
- `max_new_tokens=256` - Short responses for tool calls
- `temperature=0.3` - More deterministic for JSON generation
- `do_sample=True` - Slight randomness to avoid repetition

**Safety Features:**
- AST parsing for calculator expressions
- File path whitelist validation
- Tool name whitelist checking
- JSON schema validation with pydantic
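
One caveat on the file path whitelist: a plain `startswith("data/")` check accepts paths such as `data/../../etc/passwd`. A stricter variant, sketched here under the assumption that the allowed directories live under a single base directory, resolves the path first and then checks containment. `is_path_allowed` and `ALLOWED_DIRS` are illustrative names.

```python
from pathlib import Path

ALLOWED_DIRS = ("data", "configs", "outs")  # same directories as the notebook


def is_path_allowed(filepath: str, base_dir: str = ".") -> bool:
    """True only if filepath resolves to a location inside an allowed directory."""
    base = Path(base_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks; joining an
    # absolute filepath onto base simply yields that absolute path.
    target = (base / filepath).resolve()
    for name in ALLOWED_DIRS:
        allowed_root = (base / name).resolve()
        if target == allowed_root or allowed_root in target.parents:
            return True
    return False


print(is_path_allowed("data/train.csv"))         # True
print(is_path_allowed("data/../../etc/passwd"))  # False
```

The check works on non-existent paths as well, so it can run before any file is opened.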

## When to Use This

- **Function Calling Applications**: When LLM needs to interact with external tools
- **Agent Systems**: As building block for ReAct/Plan-Execute patterns  
- **API Integration**: Structured way to call external services
- **Code Generation**: When LLM needs to generate executable commands

## Next Steps

- Replace the mock web search and file read tools with real implementations
- Implement retry mechanisms for failed tool calls
- Add conversation memory and context management
- Build proper error handling and logging
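
For the context management item, note that the ReAct loop appends two entries per iteration and re-sends all of them in every prompt. A simple mitigation, sketched here as an assumption rather than the notebook's method, keeps only the most recent entries that fit a character budget. `trim_history` and its `max_chars` default are hypothetical.

```python
from typing import List


def trim_history(history: List[str], max_chars: int = 2000) -> List[str]:
    """Keep the most recent entries whose combined length fits max_chars."""
    kept: List[str] = []
    total = 0
    for entry in reversed(history):
        # Always keep at least the newest entry, even if it is oversized.
        if kept and total + len(entry) > max_chars:
            break
        kept.append(entry)
        total += len(entry)
    kept.reverse()  # restore chronological order
    return kept


history = [
    "Action: a",
    "Observation: " + "x" * 50,
    "Action: b",
    "Observation: y",
]
print(trim_history(history, max_chars=40))
```

A character budget is only a rough proxy for tokens, and this version may split an Action from its Observation; counting tokens with the tokenizer and trimming in pairs would be natural refinements.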


## Core Code Highlights

**1. Pydantic Schema Validation**
```python
class CalculatorArgs(BaseModel):
    expression: str = Field(..., description="Mathematical expression")
    
    @validator('expression')
    def validate_expression(cls, v):
        allowed_chars = set('0123456789+-*/().,e ')
        if not all(c in allowed_chars for c in v.replace(' ', '')):
            raise ValueError("Expression contains unsafe characters")
        return v
```

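**2. Tool Call JSON Parsing**
```python
def parse_tool_call(response: str) -> Optional[Dict[str, Any]]:
    # Condensed excerpt; the full version in Cell 6 adds error handling
    json_match = re.search(r'\{.*\}', response, re.DOTALL)
    if not json_match:
        return None

    tool_call = json.loads(json_match.group())
    tool_name, args = tool_call["tool"], tool_call["args"]
    schema_class = TOOL_SCHEMAS[tool_name]  # KeyError for unregistered tools
    validated_args = schema_class(**args)   # raises on invalid arguments
    return {"tool": tool_name, "args": validated_args.dict()}
```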

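**3. Minimal ReAct Loop**
```python
def minimal_react_loop(user_query: str, max_iterations: int = 3) -> str:
    conversation_history = []
    for iteration in range(max_iterations):
        # Generate → Parse → Execute → Observe → Continue
        # generate_response is a placeholder for the prompt-building and
        # model.generate code in Cell 8, not a function defined in the notebook
        response = generate_response(user_query, conversation_history)
        tool_call = parse_tool_call(response)
        if not tool_call:
            return response  # no tool call: treat as the final answer
        observation = execute_tool_safely(tool_call)
        conversation_history.append(f"Action: {tool_call}")
        conversation_history.append(f"Observation: {observation}")
    return "Maximum iterations reached. Please try a simpler query."
```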

## Smoke Test Cell

End-to-end test of the tool-calling flow:
1. Calculator tool call and execution
2. Invalid JSON handling
3. A run of the simplified ReAct loop



## Stage 3-1 Summary

**✅ Completed:**
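- Implemented an OpenAI-style function calling JSON format
- Pydantic validation and safety checks for tool arguments
- Basic tool registry and execution framework
- Minimal ReAct pattern (Thought → Action → Observation)

**🔑 Core Concepts:**
- **JSON Schema Validation**: pydantic ensures tool arguments are well-formed
- **Tool Registry Pattern**: an extensible mechanism for registering and dispatching tools
- **Safe Execution**: AST parsing, path whitelisting, and similar safeguards
- **ReAct Loop**: an iterative reason-and-act pattern

**⚠️ Pitfalls:**
- Handling JSON parse failures: LLM output format is unstable
- Tool execution safety: malicious input can put the system at risk
- Memory management: conversation history accumulates across ReAct iterations

**➡️ Next Actions:**
- **nb21**: Implement a safe calculator (AST restrictions and math functions)
- **nb22**: DuckDuckGo search tool with rate limiting
- **nb23**: HTML content extraction (trafilatura integration)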
