# Let's Unlock the Power of Interleaved Thinking with Claude 4!

Hey there, fellow AI enthusiasts! I'm thrilled to take you on a journey through one of Claude 4's most game-changing capabilities - Interleaved Thinking. Trust me, when I first tested this feature, I knew Anthropic was onto something special that would fundamentally transform how Claude reasons through complex problems.

In this notebook, we'll demystify this powerful feature together, break down a real implementation, and explore how you can leverage it in your own applications. By the end, you'll have a new tool in your AI toolkit that will take your Claude integrations to the next level!

## Getting Our Environment Ready

First things first - let's set up our Python environment. I always start with good imports because, well, trying to code without them is like showing up to a construction site without any tools. Not a great way to start your day!

In [None]:
!pip install boto3 botocore awscli --upgrade --quiet

In [None]:
# Import necessary libraries
import boto3
import json
import time
import copy
import uuid
from botocore.config import Config
from datetime import datetime, timedelta

REGION_NAME="us-west-2"

# Configure Bedrock client with appropriate region
config = Config(
    region_name=REGION_NAME,  # Change to your preferred region
    retries={
        'max_attempts': 15,
        'mode': 'adaptive'
    }
)

# Initialize Bedrock Runtime client
bedrock_rt = boto3.client('bedrock-runtime', config=config)

print("Bedrock client initialized successfully!")

## What Makes Interleaved Thinking So Revolutionary?

Before we dive into code, let's understand what makes this feature so special. If you've worked with LLMs and tools before, you know the traditional pattern: ask a question, get a response with a tool call, get the result, then get an answer. Functional, but limited.

Interleaved Thinking turns this on its head by enabling Claude to think between tool calls - creating a much more human-like reasoning process.

When I first saw this capability in action, I was reminded of how we humans tackle complex problems. We don't just follow a linear path - we gather information, reflect on it, seek more details based on our reflections, and gradually build towards a solution. That's exactly what Claude can now do!

Here's how traditional tool use compares to Interleaved Thinking:

**The Old Way (Traditional Tool Use)**:
```
User Query → Claude's Response with Tool Call → Tool Result → Claude's Final Answer
```

**The Game-Changing New Way (Interleaved Thinking)**:
```
User Query → Claude's Initial Thinking → Tool Call 1 → 
Tool Result 1 → Claude's Intermediate Thinking → Tool Call 2 →
Tool Result 2 → Claude's Final Thinking → Final Answer
```

This might seem like a subtle difference, but I promise you - the impact on complex reasoning tasks is profound. It's like upgrading from checkers to chess!
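To make the second flow concrete, here is a rough sketch of what the message history might look like in the Bedrock Converse format after one tool round-trip. The block keys (`reasoningContent`, `toolUse`, `toolResult`) are the ones handled later in this notebook; the text, the tool ID, and the signature value are placeholders I made up for illustration, and `block_kinds` is a hypothetical helper.

```python
# Illustrative only: a hand-written conversation history, not real model output.
example_history = [
    {"role": "user", "content": [{"text": "Which product earns the most on 200 units?"}]},
    {
        "role": "assistant",
        "content": [
            # Thinking that happens BEFORE the first tool call
            {"reasoningContent": {"reasoningText": {
                "text": "I need the product list first.",
                "signature": "placeholder-signature",
            }}},
            {"toolUse": {
                "toolUseId": "tool-1",  # placeholder ID
                "name": "database_query",
                "input": {"query_type": "products"},
            }},
        ],
    },
    {
        "role": "user",
        "content": [{"toolResult": {
            "toolUseId": "tool-1",
            "content": [{"text": "(mock product rows)"}],
            "status": "success",
        }}],
    },
]


def block_kinds(history):
    """Return the sequence of (role, block type) pairs in a conversation."""
    return [(msg["role"], next(iter(block))) for msg in history for block in msg["content"]]


for role, kind in block_kinds(example_history):
    print(f"{role:>9}: {kind}")
```

On the next API call, Claude would think again about the tool result before deciding whether to call another tool or answer.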

## Picking the Right Model for the Job

Just like selecting the right tool for a specific home improvement project, we need to pick the appropriate Claude 4 model for our task:

In [None]:
# Select the appropriate Claude 4 model
# MODEL_ID = "us.anthropic.claude-opus-4-20250514-v1:0"  # Larger, more capable model
MODEL_ID = "us.anthropic.claude-sonnet-4-20250514-v1:0"  # Balanced model for most use cases

print(f"Using Claude model: {MODEL_ID}")

I typically start with Sonnet for these demonstrations - it's like our Swiss Army knife of models. Powerful enough for complex tasks but more efficient than Opus. If you're dealing with especially intricate reasoning chains, you might want to upgrade to Opus, which would be like bringing out the industrial-grade equipment!

## Building Our First Tool - The Calculator

Now, let's create our first tool for Claude to use. Think of tool definitions as interfaces - we're creating a contract that defines how Claude can interact with external functions.

Our first tool is a calculator. Nothing fancy, but essential for our revenue calculation example:

In [None]:
# Define calculator tool in Bedrock format
calculator_tool = {
    "toolSpec": {
        "name": "calculator",
        "description": "Perform mathematical calculations",
        "inputSchema": {
            "json": {
                "type": "object",
                "properties": {
                    "expression": {
                        "type": "string",
                        "description": "Mathematical expression to evaluate"
                    }
                },
                "required": ["expression"]
            }
        }
    }
}

print("Calculator tool defined!")

I've found that clear, concise tool descriptions are crucial. Claude will use this description to decide when and how to use the tool, so think of it as writing documentation for a very eager intern!
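If you end up defining many tools, a small builder can keep the nesting consistent. This is a minimal sketch, assuming you want the same `toolSpec` shape as the calculator above; `make_tool_spec` is a hypothetical helper of my own, not part of boto3.

```python
def make_tool_spec(name, description, properties, required=None):
    """Build a tool definition in the Converse `toolSpec` shape used above.

    Hypothetical convenience helper; boto3 does not provide it.
    """
    return {
        "toolSpec": {
            "name": name,
            "description": description,
            "inputSchema": {
                "json": {
                    "type": "object",
                    "properties": properties,
                    "required": list(required or []),
                }
            },
        }
    }


# Rebuild the calculator definition with the helper
calculator_tool_sketch = make_tool_spec(
    name="calculator",
    description="Perform mathematical calculations",
    properties={
        "expression": {
            "type": "string",
            "description": "Mathematical expression to evaluate",
        }
    },
    required=["expression"],
)
```

The result has the same structure as the hand-written `calculator_tool`, so either form can go into the tool configuration.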

## Our Second Tool - The Database Query

For our revenue calculation example, we'll need a second tool - a database query function that can retrieve product information:

In [None]:
# Define database query tool in Bedrock format
database_tool = {
    "toolSpec": {
        "name": "database_query",
        "description": "Query product database to get product information, inventory, sales, or category data",
        "inputSchema": {
            "json": {
                "type": "object",
                "properties": {
                    "query_type": {
                        "type": "string",
                        "description": "Type of query to execute - options: 'products', 'inventory', 'sales', 'categories', 'costs', or a simple keyword"
                    }
                },
                "required": ["query_type"]
            }
        }
    }
}

print("Database query tool defined!")

In real-world scenarios, this would likely connect to your actual data store. For this demo, we'll use mock data, but the pattern remains the same. I've spent countless debugging sessions tracking down tool definition issues, so trust me when I say - invest time in getting these definitions right!
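One way to catch definition problems early is to check the input Claude sends against the schema before executing anything. The sketch below is deliberately small: `validate_tool_input` is a hypothetical helper that only checks required keys and string-typed properties, not a full JSON Schema validator.

```python
def validate_tool_input(tool, tool_input):
    """Return a list of problems found in `tool_input` (empty list means OK).

    Hypothetical helper: checks only required keys and string-typed properties.
    """
    schema = tool["toolSpec"]["inputSchema"]["json"]
    properties = schema.get("properties", {})
    problems = []
    for key in schema.get("required", []):
        if key not in tool_input:
            problems.append(f"missing required field: {key}")
    for key, value in tool_input.items():
        spec = properties.get(key)
        if spec is None:
            problems.append(f"unexpected field: {key}")
        elif spec.get("type") == "string" and not isinstance(value, str):
            problems.append(f"field {key} should be a string")
    return problems


# A trimmed copy of the database tool, repeated so the sketch runs on its own
database_tool_sketch = {
    "toolSpec": {
        "name": "database_query",
        "description": "Query product database",
        "inputSchema": {
            "json": {
                "type": "object",
                "properties": {"query_type": {"type": "string"}},
                "required": ["query_type"],
            }
        },
    }
}

print(validate_tool_input(database_tool_sketch, {"query_type": "products"}))
print(validate_tool_input(database_tool_sketch, {"table": 42}))
```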

## Tool Execution Helper Functions

For this notebook, we'll be using several helper functions from a separate `utils.py` file that handle the execution of tools when called by Claude. Let's import them:

In [None]:
# Import tool execution functions from utils.py
from utils import execute_calculator, execute_database_query, execute_tool

# Let's review what these functions do
print("Tool execution helpers loaded from utils.py")

### Helper Functions Overview

I've extracted these functions into a separate file to keep our notebook focused on the interleaved thinking concept. Here's what each one does:

1. **execute_calculator(input_data)**
   - This is our mathematical Swiss Army knife - it takes expressions from Claude and evaluates them
   - Handles things like currency formatting (those pesky decimal places!)
   - Returns results as nicely formatted strings

2. **execute_database_query(input_data)**
   - Our mock database - in production, you'd swap this with your actual database connector
   - Can return different types of product data depending on the query parameter
   - I've structured the return data to mimic what you'd typically get from a real product database

3. **execute_tool(tool_name, tool_params)**
   - The traffic director - routes tool calls to the right execution function
   - A simple pattern that scales well as you add more tools

Feel free to peek at the implementation in `utils.py` if you're curious about the details. I've added plenty of comments there to guide you through the code.
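If you don't have `utils.py` at hand, the routing pattern can be sketched as follows. To be clear, this is not the contents of `utils.py`: every `sketch_*` name is hypothetical, the database handler returns a placeholder string rather than the real mock data, and the calculator is a restricted evaluator I chose for the sketch (it only supports `+`, `-`, `*`, `/` and does no currency formatting).

```python
import ast
import operator

# Operators the sketch calculator is willing to evaluate
_ALLOWED_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node):
    """Recursively evaluate a parsed arithmetic expression."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _ALLOWED_BINARY:
        return _ALLOWED_BINARY[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_evaluate(node.operand)
    raise ValueError("unsupported expression")


def sketch_execute_calculator(input_data):
    """Evaluate input_data['expression'] and return the result as a string."""
    tree = ast.parse(input_data["expression"], mode="eval")
    return str(_evaluate(tree.body))


def sketch_execute_database_query(input_data):
    """Return a placeholder string; the real mock data lives in utils.py."""
    return "placeholder rows for query_type=" + input_data["query_type"]


# Map tool names to handlers; adding a tool means adding one entry here
_SKETCH_TOOLS = {
    "calculator": sketch_execute_calculator,
    "database_query": sketch_execute_database_query,
}


def sketch_execute_tool(tool_name, tool_params):
    """Route a tool call to its handler, mirroring the dispatcher described above."""
    handler = _SKETCH_TOOLS.get(tool_name)
    if handler is None:
        return f"Unknown tool: {tool_name}"
    return handler(tool_params)


print(sketch_execute_tool("calculator", {"expression": "(25 - 10) * 200"}))
```

Parsing the expression with `ast` instead of calling `eval` is a design choice for the sketch: the expression comes from the model, so it seems safer to accept only plain arithmetic.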

## The Heart of Interleaved Thinking

Now we're getting to the good stuff! Here's the complete function that demonstrates interleaved thinking. I've structured this function in four logical sections to make it easier to understand:

### The Four Key Sections

**Section 1: Setup and Initialization**  
This is where we prepare everything needed for our interleaved thinking demonstration. We define our user query (a profit calculation that requires multiple tools), set up our tool configuration, and initialize our conversation structure. I also create lists to track thinking blocks, tool uses, and results across iterations - critical for maintaining Claude's reasoning flow.

**Section 2: Main Interleaved Thinking Loop**  
This section contains the core loop that drives our interleaved thinking process. For each iteration, we make an API call to Claude with our conversation context and special configuration parameters. The magic happens in `additionalModelRequestFields`: the `anthropic_beta` flag `interleaved-thinking-2025-05-14` turns on interleaved thinking, while `reasoning_config` enables extended thinking and gives Claude a generous token budget (16,000 tokens here) for its reasoning process.

**Section 3: Processing Claude's Response**  
Here we parse Claude's response to identify and handle the different content block types: thinking blocks (`reasoningContent`), tool calls (`toolUse`), and text responses (`text`). Each type needs its own handling: **thinking blocks** must be preserved, **tool calls** must be executed, and **text responses** are displayed to the user. This processing stage is where we actually see the interleaved thinking in action.

**Section 4: Loop Control and Conversation Maintenance**  
The final section manages the conversation flow. We check if Claude is finished (no more tool calls), prepare for the next iteration by preserving `thinking blocks`, and add tool results to the conversation. This critical section ensures Claude's reasoning continuity between iterations - without it, Claude would start from scratch each time instead of building on its previous thoughts.
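Sections 3 and 4 boil down to sorting content blocks by type and stopping when no tool calls remain. Here is a minimal sketch of that idea on its own, using a hand-written message instead of a real API response; `split_content_blocks` and `is_finished` are hypothetical names, not functions from the notebook or from boto3.

```python
def split_content_blocks(assistant_message):
    """Group an assistant message's content blocks by type."""
    groups = {"thinking": [], "tool_uses": [], "texts": []}
    for block in assistant_message.get("content", []):
        if "reasoningContent" in block:
            groups["thinking"].append(block)  # keep the whole block for replay
        elif "toolUse" in block:
            groups["tool_uses"].append(block["toolUse"])
        elif "text" in block:
            groups["texts"].append(block["text"])
    return groups


def is_finished(assistant_message):
    """Claude is done when the message requests no further tool calls."""
    return not split_content_blocks(assistant_message)["tool_uses"]


# Hand-written stand-ins for two assistant turns
mid_turn = {
    "role": "assistant",
    "content": [
        {"reasoningContent": {"reasoningText": {"text": "Need costs next."}}},
        {"toolUse": {"toolUseId": "t1", "name": "database_query",
                     "input": {"query_type": "costs"}}},
    ],
}
final_turn = {"role": "assistant", "content": [{"text": "<answer>...</answer>"}]}

print(is_finished(mid_turn), is_finished(final_turn))
```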

Let's look at the full implementation:

In [None]:
user_query = """
You are tasked with determining which product from our database would generate the highest profit margin if we sold 200 units. You will need to explain your reasoning step by step. To assist you in this task, you have access to two tools:

1. A database_query tool to look up products
2. A calculator to ensure accuracy in your calculations

IMPORTANT: Use parallel tool calls to be efficient providing your response

Follow these steps to complete the task:
1. Query the database to retrieve all relevant product information, including product name, cost, and selling price.
2. For each product, calculate the profit margin for selling 200 units using the calculator tool. The profit margin is (selling price - cost) * 200.
3. Compare the profit margins for all products to determine which one generates the highest profit margin.
4. Explain your reasoning step by step, showing your calculations and comparisons.
5. Present your final answer, stating which product would generate the highest profit margin if we sold 200 units, and provide the calculated profit margin.

Your final output should be structured as follows:

<analysis>
Step-by-step reasoning and calculations
</analysis>

<answer>
The product that would generate the highest profit margin if we sold 200 units is [Product Name], with a profit margin of $[Amount]. This is because [brief explanation].
</answer>

Remember to show all your work in the <analysis> section, including database queries, calculations, and comparisons. Your final answer in the <answer> section should be concise and directly address the question.
"""

In [None]:
def demonstrate_bedrock_interleaved_thinking_loop():
    """
    Demonstrate interleaved thinking using Amazon Bedrock with a loop implementation
    that preserves the reasoning flow between tool calls.
    """
    #--------------------------------------------------
    # SECTION 1: Setup and initialization
    #--------------------------------------------------
    print("\n" + "=" * 80)
    print("🧠 BEDROCK INTERLEAVED THINKING: REVENUE CALCULATION 🧠".center(80))
    print("=" * 80)
    
    print("\n📋 WHAT IS INTERLEAVED THINKING?")
    print("-" * 80)
    print("Interleaved thinking allows Claude to:")
    print("1. Process a query and respond with initial thoughts")
    print("2. Request a tool call (like database query or calculator)")
    print("3. Receive the tool result and think about it before deciding next steps")
    print("4. Request additional tools as needed based on previous results")
    print("5. Synthesize all the gathered information into a comprehensive answer")
    print("-" * 80)
    
    
    print("\n📊 INITIAL QUERY")
    print("-" * 80)
    print(user_query)
    print("-" * 80 + "\n")
    
    # Configure tools
    tools = [calculator_tool, database_tool]
    tool_config = {"tools": tools}
    
    # Initialize conversation with just the user query
    conversation = [
        {
            "role": "user",
            "content": [{"text": user_query}]
        }
    ]
    
    # Storage for tracking state across iterations
    all_thinking_blocks = []
    all_tool_use_blocks = []
    all_tool_results = []
    
    #--------------------------------------------------
    # SECTION 2: Main interleaved thinking loop
    #--------------------------------------------------
    # Loop variables
    max_iterations = 5
    iteration = 0
    
    print("Starting interleaved thinking process...\n")
    
    while iteration < max_iterations:
        iteration += 1
        
        print("\n" + "=" * 80)
        print(f"🔄 INTERLEAVED THINKING ITERATION #{iteration}".center(80))
        print("=" * 80 + "\n")
        
        if iteration > 1:
            print("Claude is analyzing results from previous tool calls and deciding next steps...\n")
        
        # Make Bedrock API call
        try:
            print(f"Making API call for iteration #{iteration}...\n")
            
            # The key part: the anthropic_beta flag enables interleaved thinking,
            # and reasoning_config turns on extended thinking with a token budget
            response = bedrock_rt.converse(
                modelId=MODEL_ID,
                messages=conversation,
                toolConfig=tool_config,
                additionalModelRequestFields={
                    "anthropic_beta": ["interleaved-thinking-2025-05-14"],
                    "max_tokens": 20000,
                    "reasoning_config": {
                        "type": "enabled", 
                        "budget_tokens": 16000
                    }
                }
            )
            
            #--------------------------------------------------
            # SECTION 3: Processing Claude's response
            #--------------------------------------------------
            # Process Claude's response
            assistant_message = response["output"]["message"]
            
            # Extract different content types
            iteration_thinking_blocks = []
            iteration_tool_use_blocks = []
            text_content = ""
            has_tool_calls = False
            
            # Process each content item
            for content_item in assistant_message.get("content", []):
                if "reasoningContent" in content_item:
                    # Store thinking block
                    iteration_thinking_blocks.append(content_item)
                    all_thinking_blocks.append(content_item)
                    
                    # Display thinking (a redacted block may carry no readable reasoningText)
                    reasoning_text = content_item["reasoningContent"].get("reasoningText", {}).get("text", "[redacted thinking]")
                    print("\n🧠 CLAUDE'S THINKING:")
                    print("~" * 80)
                    print(reasoning_text)
                    print("~" * 80)
                    
                elif "toolUse" in content_item:
                    # Store tool use block
                    iteration_tool_use_blocks.append(content_item)
                    all_tool_use_blocks.append(content_item)
                    has_tool_calls = True
                    
                    # Display tool request
                    tool_use = content_item["toolUse"]
                    tool_name = tool_use["name"]
                    tool_params = tool_use["input"]
                    tool_id = tool_use["toolUseId"]
                    
                    print("\n🛠️ CLAUDE REQUESTED TOOL:")
                    print("-" * 80)
                    print(f"Tool: {tool_name}")
                    print(f"Input: {json.dumps(tool_params, indent=2)}")
                    print(f"Tool ID: {tool_id}")
                    print("-" * 80)
                    
                    # Execute and show the actual tool result
                    tool_result = execute_tool(tool_name, tool_params)
                    print("\n📊 TOOL EXECUTION RESULT:")
                    print("-" * 80)
                    print(tool_result)
                    print("-" * 80)
                    
                    # Store tool result for later reference
                    all_tool_results.append({
                        "toolUseId": tool_id,
                        "content": [{"text": tool_result}],
                        "status": "success"
                    })
                    
                elif "text" in content_item:
                    text_content = content_item["text"]
                    print("\n💬 CLAUDE'S RESPONSE:")
                    print("-" * 80)
                    print(text_content)
                    print("-" * 80)
            
            #--------------------------------------------------
            # SECTION 4: Loop control and conversation maintenance
            #--------------------------------------------------
            # If no tool calls, we're done
            if not has_tool_calls:
                print("\n✅ CLAUDE HAS COMPLETED THE ANALYSIS!")
                print("No further tool calls requested in this iteration.")
                break
                
            # Prepare next iteration of the conversation
            # CRITICAL: We preserve thinking blocks to maintain reasoning flow
            
            # First, append the current assistant message
            conversation.append(assistant_message)
            
            # Create tool results message using toolResult blocks
            tool_results_message = {
                "role": "user",
                "content": []
            }
            
            # Only add most recent tool results
            for tool_result in all_tool_results[-len(iteration_tool_use_blocks):]:
                tool_results_message["content"].append({"toolResult": tool_result})
            
            # Add tool results to conversation
            conversation.append(tool_results_message)
            
        except Exception as e:
            print(f"\nError in iteration {iteration}: {str(e)}")
            break
    
    print("\n" + "=" * 80)
    print("🏁 BEDROCK INTERLEAVED THINKING DEMONSTRATION COMPLETE 🏁".center(80))
    print("=" * 80 + "\n")

When I first implemented this pattern, getting these four sections to work together seamlessly was the key to success. The most critical insight was realizing how important it is to preserve those thinking blocks between iterations. Without that, Claude couldn't build on its previous reasoning, and the whole interleaved thinking advantage would be lost.
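The preservation step can be isolated into a few lines. In this sketch the assistant message is appended exactly as received, thinking blocks included, and followed by one user message holding the tool results. `next_turn_messages` and the `run_tool` callback are hypothetical names; in the notebook, `execute_tool` would play the role of `run_tool`.

```python
def next_turn_messages(assistant_message, run_tool):
    """Return the two messages to append before the next API call.

    The assistant message is passed through untouched so that its thinking
    blocks go back to the model exactly as they were received.
    """
    results = []
    for block in assistant_message.get("content", []):
        if "toolUse" in block:
            tool_use = block["toolUse"]
            output = run_tool(tool_use["name"], tool_use["input"])
            results.append({"toolResult": {
                "toolUseId": tool_use["toolUseId"],
                "content": [{"text": str(output)}],
                "status": "success",
            }})
    return [assistant_message, {"role": "user", "content": results}]


# Hand-written assistant turn and a stub tool runner for illustration
assistant_turn = {
    "role": "assistant",
    "content": [
        {"reasoningContent": {"reasoningText": {"text": "Compute the profit."}}},
        {"toolUse": {"toolUseId": "t1", "name": "calculator",
                     "input": {"expression": "15 * 200"}}},
    ],
}
conversation_sketch = []
conversation_sketch.extend(next_turn_messages(assistant_turn, lambda name, params: "3000"))
print([message["role"] for message in conversation_sketch])
```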

## Run the Demonstration

Now let's run the demonstration and see interleaved thinking in action.
When you run this cell, you'll see the entire process unfold - Claude thinking through the problem, requesting tools, analyzing results, and building towards a comprehensive answer. It's like watching a master chef work through a complex recipe!

In [None]:
# Run the demonstration
demonstrate_bedrock_interleaved_thinking_loop()

## What Makes Interleaved Thinking So Powerful?

Now that we've seen the implementation, let's reflect on why this capability is a game-changer:

### The Power of Preserved Reasoning

Interleaved thinking transforms Claude from a one-shot tool user into a multi-step reasoning engine. It's the difference between asking someone to solve a problem in one go versus allowing them to work through it step by step.

When I first started implementing this pattern, I was amazed at how much more sophisticated Claude's reasoning became. Problems that previously required multiple separate interactions could now be solved in a single, coherent reasoning flow.

### Why Would You Use Interleaved Thinking?

Based on my experience implementing this with various clients, these are the use cases where interleaved thinking truly shines:

1. **Multi-step Analysis**: When solving problems requires gathering different pieces of information and performing calculations on them
2. **Dynamic Decision Trees**: When the next step depends on the results of previous steps
3. **Complex Data Exploration**: When Claude needs to iteratively explore data to find patterns or insights
4. **Situations Requiring Reflection**: When Claude needs to "think about" tool results before deciding what to do next

The real power comes when Claude can say, "Based on these results, I should now check X" - that dynamic adaptation to intermediate results is where the magic happens.

### Implementation Considerations

A few tips from the trenches:

1. **Token Budgets**: Be generous with your thinking token budget for complex problems
2. **Prompt Engineering**: Explicitly ask Claude to explain its reasoning steps
3. **Block Preservation**: Always preserve thinking blocks between API calls
4. **Error Handling**: Add robust error handling - multi-step reasoning chains can fail in more ways
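For the error-handling tip, one option is to catch tool failures and report them back to Claude as an error result instead of aborting the loop, so Claude can decide how to recover. This is a minimal sketch under that assumption; `safe_tool_result` and `run_tool` are hypothetical names, and the broad `except` is a demo shortcut you would likely narrow in production.

```python
def safe_tool_result(tool_use, run_tool):
    """Run one tool call and always return a toolResult payload."""
    try:
        output = run_tool(tool_use["name"], tool_use["input"])
        status = "success"
    except Exception as exc:  # deliberately broad for a demo
        output = f"Tool '{tool_use['name']}' failed: {exc}"
        status = "error"
    return {
        "toolUseId": tool_use["toolUseId"],
        "content": [{"text": str(output)}],
        "status": status,
    }


def flaky_tool(name, params):
    """Stub tool runner that fails on division by zero."""
    return 1 / params["divisor"]


good = safe_tool_result({"toolUseId": "t1", "name": "divide", "input": {"divisor": 4}}, flaky_tool)
bad = safe_tool_result({"toolUseId": "t2", "name": "divide", "input": {"divisor": 0}}, flaky_tool)
print(good["status"], bad["status"])
```

Each payload can be wrapped as `{"toolResult": payload}` in the next user message, the same way the main loop does.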

## A Simpler Approach for Faster Implementation

Sometimes you don't need all the detailed logging and tracking in our main implementation. I love this simplified version for quick prototyping: it keeps the essential pattern and drops the tracking and display code. Note that, unlike the main version, its loop has no iteration cap, so consider adding one before using it beyond a prototype. It's perfect for when you already understand the concept and just need a clean implementation of the core mechanics of interleaved thinking:

In [None]:
def simplified_interleaved_thinking(user_query, tools):
    """A simplified implementation focusing on the core interleaved thinking mechanics"""

    conversation = [{"role": "user", "content": [{"text": user_query}]}]
    tool_config = {"tools": tools}

    while True:
        # Make API call with interleaved thinking enabled
        response = bedrock_rt.converse(
            modelId=MODEL_ID,
            messages=conversation,
            toolConfig=tool_config,
            additionalModelRequestFields={
                "anthropic_beta": ["interleaved-thinking-2025-05-14"],
                "max_tokens": 4000,
                "reasoning_config": {
                    "type": "enabled", 
                    "budget_tokens": 3000
                }
            }
        )

        # Process response
        assistant_message = response["output"]["message"]
        conversation.append(assistant_message)

        # Check for tool calls
        has_tool_calls = False
        tool_results = []

        for content_item in assistant_message.get("content", []):
            if "toolUse" in content_item:
                has_tool_calls = True
                tool_use = content_item["toolUse"]
                result = execute_tool(tool_use["name"], tool_use["input"])

                tool_results.append({
                    "toolUseId": tool_use["toolUseId"],
                    "content": [{"text": result}],
                    "status": "success"
                })
        
        # If no tool calls, we're done
        if not has_tool_calls:
            return assistant_message
            
        # Add tool results and continue loop
        tool_results_message = {
            "role": "user",
            "content": [{"toolResult": result} for result in tool_results]
        }
        
        conversation.append(tool_results_message)

## Running the Simplified Implementation

Let's run our simplified version with a variation on the same task, this time asking Claude to plan first and to use a more compact answer format:

In [None]:
# Set up a sample query for the simplified implementation
sample_query = """
You are tasked with determining which product from our database would generate the highest profit margin if we sold 200 units.
You will need to explain your reasoning step by step. To assist you in this task, you have access to two tools:

1. A database_query tool to look up products
2. A calculator to ensure accuracy in your calculations

IMPORTANT: Use parallel tool calls to be efficient providing your response

Before you begin create a step-by-step plan to complete the task.

Present your final answer in the following format:

<answer>
Step-by-step reasoning: [Provide your detailed analysis here]

Highest profit margin product: [Product name]
Profit margin for 200 units: [Amount in dollars]

Explanation: [Brief explanation of why this product generates the highest profit margin]
</answer>

Your output should consist of only the database queries, calculator operations, and the final answer within the specified tags.
Do not include any additional text or explanations outside of these elements.
"""

In [None]:
# Configure the tools (same as in our main example)
tools = [calculator_tool, database_tool]

# Execute the simplified implementation
print("Running simplified interleaved thinking implementation...\n")
result = simplified_interleaved_thinking(sample_query, tools)

# Display the final result
print("\n" + "=" * 80)
print("FINAL RESPONSE FROM SIMPLIFIED IMPLEMENTATION:".center(80))
print("=" * 80)

# Extract and print just the text content from the response
for content_item in result.get("content", []):
    if "text" in content_item:
        print(content_item["text"])

## Where Do We Go From Here?

Interleaved thinking represents a fundamental shift in how LLMs can tackle complex problems. I remember when we first started working with tool-using LLMs - they were impressive but limited to simple, one-shot interactions. With interleaved thinking, we've taken a significant step toward more human-like reasoning patterns.

As you integrate this capability into your own applications, consider these next steps:

1. **Experiment with Different Problems**: Try applying interleaved thinking to different types of multi-step reasoning challenges
2. **Optimize Token Usage**: Fine-tune your token budgets based on the complexity of your use cases
3. **Design Better Tool Ecosystems**: Build complementary tools that work well together in reasoning chains
4. **Learn from Claude's Thinking**: Analyze the thinking blocks to understand how Claude approaches problems
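As a starting point for tuning token budgets, you could total the usage reported across the iterations of a loop. The sketch below assumes each Converse response carries a `usage` dictionary with `inputTokens`, `outputTokens`, and `totalTokens`; `add_usage` is a hypothetical helper, and the numbers are made up.

```python
def add_usage(totals, response):
    """Add one response's token usage to a running total and return it."""
    usage = response.get("usage", {})
    for key in ("inputTokens", "outputTokens", "totalTokens"):
        totals[key] = totals.get(key, 0) + usage.get(key, 0)
    return totals


# Made-up responses standing in for two loop iterations
fake_responses = [
    {"usage": {"inputTokens": 900, "outputTokens": 350, "totalTokens": 1250}},
    {"usage": {"inputTokens": 1400, "outputTokens": 600, "totalTokens": 2000}},
]

running_total = {}
for fake_response in fake_responses:
    add_usage(running_total, fake_response)
print(running_total)
```

Comparing these totals against your thinking budget across a few representative queries gives a rough sense of whether the budget is too tight or too generous.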

The future of AI assistants lies in these more sophisticated reasoning capabilities. By mastering interleaved thinking now, you're positioning yourself at the forefront of what these systems can do.

I'm excited to see what you build with this! Remember, the most powerful AI solutions often come from combining these advanced capabilities with deep domain expertise - your knowledge of your specific problem space is what will transform this from a cool demo into a game-changing application.

Happy building! 🚀