# Advanced Prompting with Reasoning Mode

Amazon Nova 2 supports a reasoning mode that enables the model to perform internal chain-of-thought before generating a response. This improves accuracy on tasks that require multi-step logic, planning, and complex tool orchestration.

This notebook covers techniques from the [Advanced Prompting Techniques](https://docs.aws.amazon.com/nova/latest/nova2-userguide/advanced-prompting-techniques.html) guide:

1. **Reasoning mode**: when to use it, how to enable it, and the top-down approach
2. **Tool calling best practices**: inference parameters, schema quality, system prompt structure
3. **Tool calling with reasoning mode**: combining reasoning with tool use for complex workflows

## Setup

In [None]:
import boto3
import json
import time
from IPython.display import display, Markdown, HTML

In [None]:
%store -r MODEL_ID
%store -r region_name

client = boto3.client("bedrock-runtime", region_name=region_name)

## 1. Reasoning Mode

Reasoning mode lets the model allocate internal "thinking" tokens before producing a response. This is particularly effective for:

- **Multi-step reasoning**: math proofs, algorithm design, logical deductions
- **Cross-referencing**: comparing information across documents or data sources
- **Error-prone calculations**: financial modeling, statistical analysis
- **Planning with constraints**: resource allocation, scheduling, dependency management
- **Complex classifications**: nuanced categorization requiring multiple criteria
- **Tool calling scenarios**: deciding which tools to invoke and in what order

### Enabling Reasoning Mode
Reasoning is enabled per request by passing `reasoningConfig` with `"type": "enabled"` in `additionalModelRequestFields`. Keep the following in mind:

- Extended thinking OFF (default): Amazon Nova 2 operates with efficient latent reasoning, optimal for everyday tasks and high-volume applications.
- Assistant prefilling is not supported.
- System prompts are supported.
- Reasoning content is redacted in Nova 2 Lite (shown as `[REDACTED]`), but you are still charged for reasoning tokens.

You can control reasoning depth with `maxReasoningEffort`, which accepts `low`, `medium`, or `high`.

- `temperature`, `topP`, and `topK` cannot be used when `maxReasoningEffort` is set to `high`. Combining them causes an error.
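
The restriction on sampling parameters at `high` effort can be enforced before a request is sent. The following is a minimal sketch, not an official pattern: `build_reasoning_kwargs` is a hypothetical helper (it is not part of boto3), and it assumes the `reasoningConfig` shape used in this notebook. It only strips keys from the `inferenceConfig` dictionary passed to it.

```python
SAMPLING_KEYS = ("temperature", "topP", "topK")


def build_reasoning_kwargs(effort, inference_config=None):
    """Build keyword arguments for client.converse() with reasoning enabled.

    Hypothetical helper: removes sampling parameters when effort is "high",
    because combining them with high effort causes an error.
    """
    if effort not in ("low", "medium", "high"):
        raise ValueError(f"Unsupported maxReasoningEffort: {effort!r}")

    # Copy so the caller's dictionary is never mutated
    config = dict(inference_config or {})
    if effort == "high":
        for key in SAMPLING_KEYS:
            config.pop(key, None)

    kwargs = {
        "additionalModelRequestFields": {
            "reasoningConfig": {"type": "enabled", "maxReasoningEffort": effort}
        }
    }
    if config:
        kwargs["inferenceConfig"] = config
    return kwargs


base = {"temperature": 0.7, "topP": 0.9, "maxTokens": 10000}
low = build_reasoning_kwargs("low", base)
high = build_reasoning_kwargs("high", base)
print(low["inferenceConfig"])   # sampling parameters kept
print(high["inferenceConfig"])  # only maxTokens remains
```

The result can be unpacked into a call, for example `client.converse(modelId=MODEL_ID, messages=messages, **build_reasoning_kwargs("high", base))`.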

In [3]:
# Enable reasoning mode for a network capacity planning problem
response = client.converse(
    modelId=MODEL_ID,
    messages=[{
        "role": "user",
        "content": [{"text": """AnyCompany Telecom needs to allocate bandwidth across 4 regions.

Available bandwidth: 100 Gbps total
Region A: 15,000 subscribers, peak usage 8-10 PM, requires low latency for VoIP
Region B: 8,000 subscribers, steady usage, mostly streaming
Region C: 25,000 subscribers, business district, peak 9 AM-5 PM
Region D: 5,000 subscribers, rural, cost-sensitive

Constraints:
- Each region must get at least 10 Gbps
- VoIP regions need 2x the per-subscriber bandwidth
- Business districts need guaranteed 99.9% uptime allocation

What is the optimal bandwidth allocation?"""}]
    }],
    inferenceConfig={
        "temperature": 0.7,
        "topP": 0.9,
        "maxTokens": 10000
    },
    additionalModelRequestFields={
        "reasoningConfig": {
            "type": "enabled",
            "maxReasoningEffort": "low"
        }
    }
)

# Extract and display the response
for block in response["output"]["message"]["content"]:
    if "text" in block:
        display(Markdown(block["text"]))

usage = response["usage"]
print(f"\nInput tokens: {usage['inputTokens']} | Output tokens: {usage['outputTokens']}")

### Optimal Bandwidth Allocation for AnyCompany Telecom

#### **Step 1: Apply Minimum Allocation**
Each region must receive **at least 10 Gbps**.  
- **Total minimum allocation**: \(4 \text{ regions} \times 10 \text{ Gbps} = 40 \text{ Gbps}\)  
- **Remaining bandwidth**: \(100 \text{ Gbps} - 40 \text{ Gbps} = 60 \text{ Gbps}\)

---

#### **Step 2: Adjust for Special Constraints**
We allocate the remaining 60 Gbps based on **weighted subscriber needs**, prioritizing:
1. **Region A (VoIP)**: Requires **2× per-subscriber bandwidth** due to low-latency VoIP needs.
2. **Region C (Business District)**: Needs **guaranteed 99.9% uptime** (prioritized but not double-weighted).
3. **Region B (Streaming)**: Steady usage, normal weighting.
4. **Region D (Rural, Cost-Sensitive)**: Minimal extra allocation beyond the 10 Gbps floor.

##### **Weighted Subscriber Calculation**
- **Region A**: \(15,000 \text{ subscribers} \times 2 = 30,000\) (VoIP 2× factor)  
- **Region B**: \(8,000 \text{ subscribers} \times 1 = 8,000\)  
- **Region C**: \(25,000 \text{ subscribers} \times 1 = 25,000\) (business uptime = priority, but no extra multiplier)  
- **Region D**: **No extra allocation** (cost-sensitive; stays at minimum 10 Gbps)

**Total weight**: \(30,000 + 8,000 + 25,000 = 63,000\)

---

#### **Step 3: Allocate Remaining 60 Gbps**
Distribute based on weights (Region D gets nothing extra):

| Region | Weight | Allocation Formula | Bandwidth Added (Gbps) | **Total Allocation (Gbps)** |
|--------|--------|--------------------|-------------------------|------------------------------|
| **A**  | 30,000 | \(\frac{30,000}{63,000} \times 60\) | **28.57** | \(10 + 28.57 = \textbf{38.57}\) |
| **B**  | 8,000  | \(\frac{8,000}{63,000} \times 60\)  | **7.62**  | \(10 + 7.62 = \textbf{17.62}\) |
| **C**  | 25,000 | \(\frac{25,000}{63,000} \times 60\) | **23.81** | \(10 + 23.81 = \textbf{33.81}\) |
| **D**  | -      | -                  | **0**    | \(10 + 0 = \textbf{10.00}\)   |

---

#### **Step 4: Verify Totals and Constraints**
- **Total allocated**: \(38.57 + 17.62 + 33.81 + 10.00 = 100 \text{ Gbps}\) ✅  
- **Minimums met**: All regions ≥10 Gbps ✅  
- **VoIP (Region A)**: Received **2× weighting** in allocation ✅  
- **Business uptime (Region C)**: Guaranteed dedicated bandwidth (33.81 Gbps) ✅  
- **Cost-sensitive (Region D)**: Kept at minimum 10 Gbps to reduce costs ✅  

---

### **Final Allocation**
| Region | Subscribers | Allocation (Gbps) | Notes |
|--------|-------------|-------------------|-------|
| **A**  | 15,000      | **38.57**         | VoIP (low latency), 2× per-subscriber weighting |
| **B**  | 8,000       | **17.62**         | Steady streaming usage |
| **C**  | 25,000      | **33.81**         | Business district (99.9% uptime guaranteed) |
| **D**  | 5,000       | **10.00**         | Rural, cost-sensitive (minimum only) |

### **Key Rationale**
- **Region A** gets the largest share due to VoIP’s strict low-latency requirements (2× per-subscriber priority).  
- **Region C** (business district) receives substantial bandwidth to ensure uptime and productivity during peak hours.  
- **Region B** gets moderate allocation for streaming.  
- **Region D** stays at the minimum to respect cost sensitivity while ensuring basic connectivity.  

This allocation **optimally balances demand, constraints, and cost** across all regions.


Input tokens: 238 | Output tokens: 2573
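
The arithmetic in the model's answer can be checked independently. The sketch below reproduces the weighted split using the figures from the prompt and the weighting the model chose (2× for Region A, no extra share for Region D). It confirms that the numbers are internally consistent; it does not show that this weighting is the optimal one.

```python
# Inputs from the prompt and the model's stated weighting scheme
TOTAL_GBPS = 100
FLOOR_GBPS = 10
weights = {
    "A": 15_000 * 2,  # VoIP region: 2x per-subscriber weighting
    "B": 8_000,
    "C": 25_000,
    "D": 0,           # kept at the minimum, per the model's answer
}

# Bandwidth left after every region receives the 10 Gbps floor
remaining = TOTAL_GBPS - FLOOR_GBPS * len(weights)
total_weight = sum(weights.values())

# Floor plus a share of the remainder proportional to each weight
allocation = {
    region: FLOOR_GBPS + remaining * weight / total_weight
    for region, weight in weights.items()
}

for region, gbps in allocation.items():
    print(f"Region {region}: {gbps:.2f} Gbps")
print(f"Total: {sum(allocation.values()):.2f} Gbps")
```

The printed values match the Final Allocation table (38.57, 17.62, 33.81, and 10.00 Gbps).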


## 2. Tool Calling Best Practices

When using tools with Amazon Nova 2, follow these guidelines for reliable results.

### Inference Parameters

| Mode | Temperature | Top P |
| --- | --- | --- |
| Non-reasoning | 0.7 | 0.9 |
| Reasoning enabled | 1.0 | 0.9 |

### Tool Schema Quality

Well-crafted tool schemas significantly improve tool selection and parameter extraction:

- **Tool description**: 20–50 words explaining what the tool does and when to use it
- **Parameter descriptions**: ~10 words each, with format hints and valid ranges
- Use the tool's actual name in prompts; avoid XML tags or pythonic references
- Mark parameters as `required` only when truly necessary
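
These guidelines can be turned into a quick automated check that runs before a schema is sent to the model. The sketch below is illustrative only: `lint_tool_spec` is a hypothetical helper, the 20–50 word range comes from the guidance above, and counting words by splitting on whitespace is a rough approximation.

```python
def lint_tool_spec(tool_spec, min_words=20, max_words=50):
    """Return a list of warnings for one Converse-style toolSpec dictionary."""
    warnings = []
    name = tool_spec.get("name", "<unnamed>")

    # Tool description length, measured in whitespace-separated words
    words = len(tool_spec.get("description", "").split())
    if not min_words <= words <= max_words:
        warnings.append(
            f"{name}: description has {words} words, expected {min_words}-{max_words}"
        )

    schema = tool_spec.get("inputSchema", {}).get("json", {})
    properties = schema.get("properties", {})

    # Every parameter should carry its own description
    for param, spec in properties.items():
        if not spec.get("description"):
            warnings.append(f"{name}.{param}: missing description")

    # Every required parameter must actually be defined
    for param in schema.get("required", []):
        if param not in properties:
            warnings.append(f"{name}: required parameter {param!r} is not defined")

    return warnings


good_spec = {
    "name": "check_account_status",
    "description": (
        "Look up a customer account to retrieve current plan details, billing "
        "status, and service health. Use this when the customer asks about "
        "their account, plan, or billing."
    ),
    "inputSchema": {"json": {
        "type": "object",
        "properties": {"account_id": {
            "type": "string",
            "description": "Customer account ID, format: AC-XXXXX",
        }},
        "required": ["account_id"],
    }},
}

# Deliberately weak schema: short description, undocumented and undefined parameters
bad_spec = {
    "name": "get_status",
    "description": "Gets status.",
    "inputSchema": {"json": {
        "type": "object",
        "properties": {"id": {"type": "string"}},
        "required": ["account_id"],
    }},
}

print(lint_tool_spec(good_spec))
for warning in lint_tool_spec(bad_spec):
    print(warning)
```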

### System Prompt Structure

For tool-heavy applications, structure your system prompt with dedicated sections:

```
## Role
Define the agent's persona and capabilities.

## Tool Usage
Explain when and how to use each tool. Include ordering preferences.

## Error Handling
Define retry logic, fallback behavior, and how to communicate failures.
```

### Tool Call Ordering

When both built-in tools (Code Interpreter, Web Grounding) and custom tools are available, Nova 2 calls built-in tools first. Design your workflows with this ordering in mind.

In [4]:
# Example: Well-structured tool schemas for a telecom support agent
telecom_tools = {
    "tools": [
        {
            "toolSpec": {
                "name": "check_account_status",
                "description": "Look up a customer account to retrieve current plan details, billing status, and service health. Use this when the customer asks about their account, plan, or billing.",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {
                            "account_id": {
                                "type": "string",
                                "description": "Customer account ID, format: AC-XXXXX"
                            }
                        },
                        "required": ["account_id"]
                    }
                }
            }
        },
        {
            "toolSpec": {
                "name": "run_network_diagnostic",
                "description": "Run a diagnostic test on the customer connection to check signal strength, latency, and packet loss. Use this when the customer reports connectivity or performance issues.",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {
                            "account_id": {
                                "type": "string",
                                "description": "Customer account ID, format: AC-XXXXX"
                            },
                            "test_type": {
                                "type": "string",
                                "enum": ["full", "quick", "latency_only"],
                                "description": "Diagnostic depth: full (3min), quick (30s), or latency_only (10s)"
                            }
                        },
                        "required": ["account_id"]
                    }
                }
            }
        },
        {
            "toolSpec": {
                "name": "create_support_ticket",
                "description": "Create a support ticket for issues that require technician follow-up or escalation. Use this after diagnostics confirm an issue that cannot be resolved remotely.",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {
                            "account_id": {
                                "type": "string",
                                "description": "Customer account ID, format: AC-XXXXX"
                            },
                            "priority": {
                                "type": "string",
                                "enum": ["low", "medium", "high", "critical"],
                                "description": "Ticket priority based on impact severity"
                            },
                            "issue_summary": {
                                "type": "string",
                                "description": "Brief description of the issue, max 200 chars"
                            }
                        },
                        "required": ["account_id", "priority", "issue_summary"]
                    }
                }
            }
        }
    ]
}

# System prompt with Tool Usage and Error Handling sections
system_prompt = """You are a technical support agent for AnyCompany Telecom.

## Tool Usage
- Always check the customer's account status first using check_account_status
- If the customer reports a connectivity issue, run run_network_diagnostic with test_type="quick" first
- Only create a support ticket via create_support_ticket if the diagnostic confirms an unresolvable issue
- Summarize findings to the customer after each tool call

## Error Handling
- If a tool call fails, inform the customer and suggest an alternative (e.g., call support at 1-800-555-0100)
- Never retry a failed tool call more than once
- If account_id is not provided, ask the customer for it before proceeding
"""

# Customer message that should trigger tool use
response = client.converse(
    modelId=MODEL_ID,
    system=[{"text": system_prompt}],
    messages=[{
        "role": "user",
        "content": [{"text": "Hi, my account is AC-78234 and my internet has been dropping every few hours for the past two days."}]
    }],
    toolConfig=telecom_tools,
    # Non-reasoning defaults from the Inference Parameters table
    inferenceConfig={"maxTokens": 1024, "temperature": 0.7, "topP": 0.9}
)

# Show what tool the model chose
for block in response["output"]["message"]["content"]:
    if "toolUse" in block:
        tool = block["toolUse"]
        print(f"Tool selected: {tool['name']}")
        print(f"Input: {json.dumps(tool['input'], indent=2)}")
    elif "text" in block:
        print(f"Text: {block['text']}")

Tool selected: check_account_status
Input: {
  "account_id": "AC-78234"
}


## 3. Tool Calling with Reasoning Mode

Combining reasoning mode with tool use is especially powerful for scenarios where the model needs to:
- Decide which of several tools to call
- Determine the correct order of operations
- Synthesize results from multiple tool calls

When reasoning is enabled, the model internally plans its tool strategy before making the first call. This reduces errors in multi-tool workflows.

Let's walk through a complete multi-turn tool use loop with reasoning enabled.

In [5]:
import random

# Simulated tool implementations
def check_account_status(account_id):
    return {
        "account_id": account_id,
        "plan": "Fiber Pro 500",
        "status": "active",
        "monthly_fee": 89.99,
        "last_payment": "2026-01-15",
        "equipment": "Router X500, ONT-2000"
    }

def run_network_diagnostic(account_id, test_type="quick"):
    return {
        "account_id": account_id,
        "test_type": test_type,
        "signal_strength": "-18 dBm (good)",
        "latency": "12ms",
        "packet_loss": "8.2%",
        "status": "degraded",
        "recommendation": "Elevated packet loss detected. Likely cause: faulty ONT or upstream splitter issue."
    }

def create_support_ticket(account_id, priority, issue_summary):
    ticket_id = f"TK-2026-02-{random.randint(1000,9999)}"
    return {
        "ticket_id": ticket_id,
        "account_id": account_id,
        "priority": priority,
        "summary": issue_summary,
        "status": "created",
        "eta": "24-48 hours for technician visit"
    }

# Map tool names to functions
tool_functions = {
    "check_account_status": check_account_status,
    "run_network_diagnostic": run_network_diagnostic,
    "create_support_ticket": create_support_ticket,
}

print("Tool functions registered.")

Tool functions registered.


In [6]:
# Multi-turn tool use loop with reasoning mode
messages = [{
    "role": "user",
    "content": [{"text": "My account is AC-78234. My internet keeps dropping and I need this fixed ASAP. I work from home and have client calls all day."}]
}]

system_prompt = """You are a technical support agent for AnyCompany Telecom.

## Tool Usage
- Always check the customer's account status first using check_account_status
- If the customer reports a connectivity issue, run run_network_diagnostic
- Only create a support ticket via create_support_ticket if the diagnostic confirms an issue
- After all tool calls, provide a clear summary to the customer

## Error Handling
- If a tool fails, inform the customer and suggest calling 1-800-555-0100
"""

max_turns = 5
turn = 0

while turn < max_turns:
    turn += 1
    print(f"\n--- Turn {turn} ---")

    response = client.converse(
        modelId=MODEL_ID,
        system=[{"text": system_prompt}],
        messages=messages,
        toolConfig=telecom_tools,
        inferenceConfig={"maxTokens": 2048, "temperature": 1, "topP": 0.9},
        additionalModelRequestFields={
            "reasoningConfig": {
                "type": "enabled",
                "maxReasoningEffort": "medium"
            }
        }
    )

    assistant_message = response["output"]["message"]
    messages.append(assistant_message)

    # Check if the model wants to call a tool
    tool_use_blocks = [
        block["toolUse"]
        for block in assistant_message["content"]
        if "toolUse" in block
    ]

    if not tool_use_blocks:
        # No tool call: the model is done, so display the final response
        for block in assistant_message["content"]:
            if "text" in block:
                display(Markdown(block["text"]))
        break

    # Execute each tool call and collect results
    tool_results = []
    for tool_call in tool_use_blocks:
        tool_name = tool_call["name"]
        tool_input = tool_call["input"]
        tool_id = tool_call["toolUseId"]

        print(f"Calling: {tool_name}({json.dumps(tool_input)})")

        # Execute the tool
        func = tool_functions[tool_name]
        result = func(**tool_input)
        print(f"Result: {json.dumps(result, indent=2)}")

        tool_results.append({
            "toolResult": {
                "toolUseId": tool_id,
                "content": [{"json": result}],
                "status": "success"
            }
        })

    # Add tool results to conversation
    messages.append({"role": "user", "content": tool_results})

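else:
    # while/else: runs only when the loop ends without a break,
    # so a final answer on the last allowed turn is not misreported
    print("\nMax turns reached.")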


--- Turn 1 ---
Calling: check_account_status({"account_id": "AC-78234"})
Result: {
  "account_id": "AC-78234",
  "plan": "Fiber Pro 500",
  "status": "active",
  "monthly_fee": 89.99,
  "last_payment": "2026-01-15",
  "equipment": "Router X500, ONT-2000"
}

--- Turn 2 ---
Calling: run_network_diagnostic({"test_type": "full", "account_id": "AC-78234"})
Result: {
  "account_id": "AC-78234",
  "test_type": "full",
  "signal_strength": "-18 dBm (good)",
  "latency": "12ms",
  "packet_loss": "8.2%",
  "status": "degraded",
  "recommendation": "Elevated packet loss detected. Likely cause: faulty ONT or upstream splitter issue."
}

--- Turn 3 ---
Calling: create_support_ticket({"priority": "high", "issue_summary": "Elevated packet loss (8.2%) detected. Likely faulty ONT-2000 or upstream splitter requiring technician dispatch.", "account_id": "AC-78234"})
Result: {
  "ticket_id": "TK-2026-02-6660",
  "account_id": "AC-78234",
  "priority": "high",
  "summary": "Elevated packet loss (8.2%) det

I've confirmed the issue with your connection and escalated it immediately. Here's your status update:

**Account Status:**  
✅ Active Fiber Pro 500 plan (AC-78234)  
**Equipment:** Router X500 and ONT-2000  

**Diagnostic Results:**  
🔴 **Critical Issue Found** - 8.2% packet loss detected  
📡 Signal strength: Good (-18 dBm)  
⏱️ Latency: Normal (12ms)  

**Action Taken:**  
📅 **High-Priority Ticket Created** (TK-2026-02-6660)  
📌 Technician dispatch scheduled within **24-48 hours**  
📝 Issue: Potential faulty ONT-2000 or upstream splitter  

**Next Steps:**  
1. A technician will visit to inspect/replace your ONT and splitter  
2. You'll receive a confirmation email with the scheduled appointment time  
3. For urgent updates, call 1-800-555-0100 and reference ticket #TK-2026-02-6660  

Your connection should stabilize once the hardware is repaired. We apologize for the disruption to your work-from-home schedule and appreciate your patience while we resolve this.
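
The loop above calls each tool function directly, so an unknown tool name or an unexpected argument would raise an exception and end the conversation. The sketch below shows one way to report failures back to the model instead, which lets the Error Handling section of the system prompt take effect. `execute_tool_call` is a hypothetical helper and `demo_tools` is a stand-in registry; the only API detail assumed is that a Converse `toolResult` block can carry `"status": "error"`.

```python
def execute_tool_call(tool_call, tool_functions):
    """Run one toolUse block and wrap the outcome as a toolResult block."""
    tool_id = tool_call["toolUseId"]
    name = tool_call["name"]

    def as_result(content, status):
        return {"toolResult": {"toolUseId": tool_id, "content": content, "status": status}}

    # The model may request a tool that was never registered
    func = tool_functions.get(name)
    if func is None:
        return as_result([{"text": f"Unknown tool: {name}"}], "error")

    try:
        result = func(**tool_call["input"])
    except Exception as exc:  # e.g. TypeError from unexpected arguments
        return as_result([{"text": f"Tool {name} failed: {exc}"}], "error")

    return as_result([{"json": result}], "success")


# Stand-in registry used only for this demonstration
demo_tools = {"echo": lambda text: {"echo": text}}

ok = execute_tool_call(
    {"toolUseId": "t1", "name": "echo", "input": {"text": "hi"}}, demo_tools
)
unknown = execute_tool_call(
    {"toolUseId": "t2", "name": "missing", "input": {}}, demo_tools
)
bad_args = execute_tool_call(
    {"toolUseId": "t3", "name": "echo", "input": {"wrong": 1}}, demo_tools
)

for outcome in (ok, unknown, bad_args):
    print(outcome["toolResult"]["toolUseId"], outcome["toolResult"]["status"])
```

In the loop, the per-call body could then be reduced to `tool_results = [execute_tool_call(tc, tool_functions) for tc in tool_use_blocks]`.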

## When to Consider Sub-Agents

As your tool count grows, consider splitting tools across specialized sub-agents when:

- **Tool count exceeds ~20**: a single agent may struggle to select the right tool
- **Distinct functional domains**: e.g., billing tools vs. network diagnostic tools vs. provisioning tools
- **Complex schemas**: tools with many parameters benefit from a focused agent that understands the domain
- **Long conversations (>15–20 turns)**: context window pressure increases; sub-agents keep each conversation focused

A common pattern is an orchestrator agent that routes requests to domain-specific sub-agents, each with their own tool set and system prompt.
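
The routing step of that pattern can be sketched as follows. Everything here is hypothetical: the `SubAgent` structure, the agent names, and the keyword lists are illustrative, and keyword matching is a simple stand-in for the orchestrator, which in practice would more likely be a model call that picks the sub-agent.

```python
from dataclasses import dataclass, field


@dataclass
class SubAgent:
    """A domain-specific agent with its own system prompt and tool subset."""
    name: str
    system_prompt: str
    tool_names: list = field(default_factory=list)
    keywords: tuple = ()


SUB_AGENTS = [
    SubAgent(
        name="billing",
        system_prompt="You handle billing questions for AnyCompany Telecom.",
        tool_names=["check_account_status"],
        keywords=("bill", "invoice", "payment", "charge"),
    ),
    SubAgent(
        name="network",
        system_prompt="You diagnose connectivity issues for AnyCompany Telecom.",
        tool_names=["run_network_diagnostic", "create_support_ticket"],
        keywords=("internet", "dropping", "slow", "latency", "outage"),
    ),
]


def route(message, agents=SUB_AGENTS):
    """Return the best-matching sub-agent, or None when nothing matches."""
    text = message.lower()
    # Score each agent by how many of its keywords appear in the message
    scored = [(sum(kw in text for kw in agent.keywords), agent) for agent in agents]
    best_score, best_agent = max(scored, key=lambda pair: pair[0])
    return best_agent if best_score > 0 else None


for msg in ("My internet keeps dropping", "I was charged twice on my bill", "Hello"):
    agent = route(msg)
    print(msg, "->", agent.name if agent else "ask a clarifying question")
```

Each selected sub-agent would then run its own tool loop with only the tools listed in `tool_names`, which keeps the tool set per request small.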

## Conclusion

This notebook covered advanced prompting techniques for Amazon Nova 2:

- **Reasoning mode** improves accuracy on complex, multi-step tasks by letting the model think internally before responding
- **Tool calling** benefits from well-crafted schemas (20–50 word descriptions), structured system prompts, and appropriate inference parameters
- **Reasoning + tool use** enables the model to plan multi-tool workflows before executing them

For more details, see the [Advanced Prompting Techniques](https://docs.aws.amazon.com/nova/latest/nova2-userguide/advanced-prompting-techniques.html) documentation.