# Fine-Grained Tool Calling

## Key Concepts
- Tool streaming adds an `InputJsonEvent` type with `partial_json` (the latest chunk) and `snapshot` (the cumulative JSON so far)
- Default: the API buffers chunks and validates each complete top-level key-value pair before sending it
- Buffering causes delays followed by bursts, because chunks are held until a valid pair is ready
- Fine-grained mode disables API-side JSON validation for faster streaming
- Fine-grained trade-off: immediate chunks vs. potentially invalid JSON
- Without fine-grained: validation adds delay but guarantees a valid JSON structure
- With fine-grained: traditional streaming speed, but the client must handle JSON errors
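
To make the relationship between individual chunks and the cumulative JSON concrete, here is a minimal sketch that does not call the API. The chunk strings and the helper name `first_parseable_index` are hypothetical, made up for illustration. The loop concatenates chunks the way a client might and reports the first chunk after which the accumulated text parses as complete JSON.

```python
import json

# Hypothetical partial_json chunks, invented for illustration only.
chunks = [
    '{"abstract": "A sho',
    'rt paper.", "meta": {"word_',
    'count": 4250, "review": "Solid."}}',
]


def first_parseable_index(pieces):
    """Return the index of the first chunk after which the accumulated
    text is valid JSON, or None if it never becomes valid."""
    buffer = ""
    for i, piece in enumerate(pieces):
        buffer += piece  # cumulative text so far
        try:
            json.loads(buffer)
        except json.JSONDecodeError:
            continue  # still incomplete; keep accumulating
        return i
    return None


print(first_parseable_index(chunks))
```

Every intermediate buffer here is incomplete JSON, which is the situation a client in fine-grained mode has to tolerate between chunks.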

## Important Code Patterns
- Handle streaming event: `if chunk.type == "input_json": process chunk.partial_json or chunk.snapshot`
- Enable fine-grained: `run_conversation(messages, tools=[schema], fine_grained=True)`
- Error handling: `try: json.loads(chunk.snapshot) except json.JSONDecodeError: handle error`
- Access partial data: `chunk.partial_json` for incremental piece
- Access cumulative: `chunk.snapshot` for complete JSON so far
- Invalid JSON example: `"word_count": undefined` instead of a number
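
The error-handling pattern above can be sketched as a small helper. This is one possible shape, not part of the SDK: `parse_tool_input` is a hypothetical name, and it assumes the caller passes the accumulated tool input as a string. It shows that Python's `json.loads` rejects a bare `undefined` value.

```python
import json


def parse_tool_input(raw):
    """Try to parse accumulated tool input.

    Returns (value, None) on success or (None, error_message) on failure.
    """
    try:
        return json.loads(raw), None
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON at position {exc.pos}: {exc.msg}"


good_value, good_error = parse_tool_input('{"word_count": 4250}')
print(good_value, good_error)

# `undefined` is not a JSON literal, so parsing fails instead of crashing.
bad_value, bad_error = parse_tool_input('{"word_count": undefined}')
print(bad_value, bad_error)
```

Returning the error as data rather than raising lets the streaming loop decide whether to retry, report the problem to the model, or drop the tool call.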

## Best Practices
- Default mode is sufficient for most applications (validation ensures correctness)
- Enable fine-grained when: need real-time UI updates, want early partial processing, buffering delays hurt UX
- Always implement robust JSON error handling with fine-grained mode
- Fine-grained requires being comfortable handling invalid or incomplete JSON
- API validation catches errors in default mode (may wrap problematic values as strings)
- Use fine-grained when responsiveness matters more than guaranteed validity
- Consider user experience impact of buffering delays vs. error handling complexity
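
As an illustration of early partial processing, the sketch below pulls the `abstract` field out of an incomplete buffer as soon as its closing quote has arrived, so a UI could show it before the rest of the tool input finishes. The helper `completed_abstract` and the regex are assumptions made for this example. It is a best-effort approach: it would also match the text `"abstract":` if that appeared inside another string value.

```python
import json
import re

# Matches a complete JSON string value following the "abstract" key.
ABSTRACT_RE = re.compile(r'"abstract"\s*:\s*("(?:[^"\\]|\\.)*")')


def completed_abstract(buffer):
    """Return the abstract once its closing quote has arrived, else None."""
    match = ABSTRACT_RE.search(buffer)
    if match is None:
        return None
    try:
        return json.loads(match.group(1))  # decode JSON string escapes
    except json.JSONDecodeError:
        return None


print(completed_abstract('{"abstract": "A sho'))
print(completed_abstract('{"abstract": "A short paper.", "meta": {"wo'))
```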

In [7]:
# Load env variables and create client
from dotenv import load_dotenv
from anthropic import Anthropic


load_dotenv()

client = Anthropic()
model = "claude-sonnet-4-5"

In [8]:
# Helper functions


def add_user_message(messages, message):
    if isinstance(message, list):
        user_message = {
            "role": "user",
            "content": message,
        }
    else:
        user_message = {
            "role": "user",
            "content": [{"type": "text", "text": message}],
        }
    messages.append(user_message)


def add_assistant_message(messages, message):
    if isinstance(message, list):
        assistant_message = {
            "role": "assistant",
            "content": message,
        }
    elif hasattr(message, "content"):
        content_list = []
        for block in message.content:
            if block.type == "text":
                content_list.append({"type": "text", "text": block.text})
            elif block.type == "tool_use":
                content_list.append(
                    {
                        "type": "tool_use",
                        "id": block.id,
                        "name": block.name,
                        "input": block.input,
                    }
                )
        assistant_message = {
            "role": "assistant",
            "content": content_list,
        }
    else:
        # String messages need to be wrapped in a list with text block
        assistant_message = {
            "role": "assistant",
            "content": [{"type": "text", "text": message}],
        }
    messages.append(assistant_message)


def chat_stream(
    messages,
    system=None,
    temperature=1.0,
    stop_sequences=None,
    tools=None,
    tool_choice=None,
    betas=None,
):
    # None defaults avoid sharing one mutable list across calls
    params = {
        "model": model,
        "max_tokens": 1000,
        "messages": messages,
        "temperature": temperature,
        "stop_sequences": stop_sequences or [],
    }

    if tool_choice:
        params["tool_choice"] = tool_choice

    if tools:
        params["tools"] = tools

    if system:
        params["system"] = system

    if betas:
        params["betas"] = betas

    return client.beta.messages.stream(**params)


def text_from_message(message):
    return "\n".join([block.text for block in message.content if block.type == "text"])

In [9]:
# Tool definition
from anthropic.types import ToolParam

save_article_schema = ToolParam(
    {
        "name": "save_article",
        "description": "Saves a scholarly journal article",
        "input_schema": {
            "type": "object",
            "properties": {
                "abstract": {
                    "type": "string",
                    "description": "Abstract of the article. One short sentence max",
                },
                "meta": {
                    "type": "object",
                    "properties": {
                        "word_count": {
                            "type": "integer",
                            "description": "Word count",
                        },
                        "review": {
                            "type": "string",
                            "description": "Eight sentence review of the paper",
                        },
                    },
                    "required": ["word_count", "review"],
                },
            },
            "required": ["abstract", "meta"],
        },
    }
)
save_short_article_schema = ToolParam(
    {
        "name": "save_article",
        "description": "Saves a scholarly journal article",
        "input_schema": {
            "type": "object",
            "properties": {
                "abstract": {
                    "type": "string",
                    "description": "Abstract of the article. One short sentence max",
                },
                "meta": {
                    "type": "object",
                    "properties": {
                        "word_count": {
                            "type": "integer",
                            "description": "Word count",
                        },
                        "review": {
                            "type": "string",
                            "description": "Review of paper. One short sentence max",
                        },
                    },
                    "required": ["word_count", "review"],
                },
            },
            "required": ["abstract", "meta"],
        },
    }
)


def save_article(**kwargs):
    return "Article saved!"


In [10]:
# Tool Running
import json


def run_tool(tool_name, tool_input):
    if tool_name == "save_article":
        return save_article(**tool_input)


def run_tools(message):
    tool_requests = [block for block in message.content if block.type == "tool_use"]
    tool_result_blocks = []

    for tool_request in tool_requests:
        try:
            tool_output = run_tool(tool_request.name, tool_request.input)
            tool_result_block = {
                "type": "tool_result",
                "tool_use_id": tool_request.id,
                "content": json.dumps(tool_output),
                "is_error": False,
            }
        except Exception as e:
            tool_result_block = {
                "type": "tool_result",
                "tool_use_id": tool_request.id,
                "content": f"Error: {e}",
                "is_error": True,
            }

        tool_result_blocks.append(tool_result_block)

    return tool_result_blocks

In [11]:
# Run conversation
def run_conversation(messages, tools=None, tool_choice=None, fine_grained=False):
    while True:
        with chat_stream(
            messages,
            tools=tools,
            betas=["fine-grained-tool-streaming-2025-05-14"] if fine_grained else [],
            tool_choice=tool_choice,
        ) as stream:
            for chunk in stream:
                if chunk.type == "text":
                    print(chunk.text, end="")

                if chunk.type == "content_block_start":
                    if chunk.content_block.type == "tool_use":
                        print(f'\n>>> Tool Call: "{chunk.content_block.name}"')

                if chunk.type == "input_json" and chunk.partial_json:
                    print(chunk.partial_json, end="")

                if chunk.type == "content_block_stop":
                    print("\n")

            response = stream.get_final_message()

        add_assistant_message(messages, response)

        if response.stop_reason != "tool_use":
            break

        tool_results = run_tools(response)
        add_user_message(messages, tool_results)

        if tool_choice:
            break

    return messages
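
The loop above only prints each `partial_json` chunk. A variant that keeps the text and validates it when the block ends might look like the sketch below. `ToolInputCollector` is a hypothetical helper, not part of the SDK, and the events are `SimpleNamespace` stand-ins that only mimic the `type` and `partial_json` attributes used in `run_conversation`, so the example runs without an API call.

```python
import json
from types import SimpleNamespace


class ToolInputCollector:
    """Accumulate partial_json text per tool call and parse it at block end."""

    def __init__(self):
        self.buffer = ""
        self.results = []  # list of (parsed_value_or_None, raw_text)

    def handle(self, event):
        if event.type == "input_json" and event.partial_json:
            self.buffer += event.partial_json
        elif event.type == "content_block_stop" and self.buffer:
            try:
                parsed = json.loads(self.buffer)
            except json.JSONDecodeError:
                parsed = None  # keep the raw text so the caller can report it
            self.results.append((parsed, self.buffer))
            self.buffer = ""


# Stand-in events, invented for illustration only.
fake_events = [
    SimpleNamespace(type="input_json", partial_json='{"abstract": "Short."'),
    SimpleNamespace(type="input_json", partial_json=', "meta": {"word_count": 10}}'),
    SimpleNamespace(type="content_block_stop"),
]

collector = ToolInputCollector()
for event in fake_events:
    collector.handle(event)

print(collector.results[0][0])
```

Because a `content_block_stop` with an empty buffer is ignored, text blocks pass through the collector without producing a result.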

In [14]:
messages = []

add_user_message(
    messages,
    "Create and save a fake computer science article",
)

run_conversation(
    messages,
    tools=[save_article_schema],
    fine_grained=True
)

I'll create and save a fake computer science article for you.


>>> Tool Call: "save_article"
{"abstract": "This paper introduces a novel quantum-inspired algorithm for distributed graph partitioning that achieves logarithmic time complexity on sparse networks.", "meta": {
  "word_count": 4250,
  "review": "This paper presents an innovative approach to the graph partitioning problem by leveraging quantum-inspired heuristics in distributed computing environments. The authors demonstrate that their algorithm, QuPart, achieves O(log n) time complexity on sparse graphs with bounded degree, which represents a significant improvement over classical methods. The experimental evaluation on real-world social network datasets shows up to 40% reduction in edge cuts compared to METIS and other state-of-the-art partitioners. The theoretical analysis is rigorous and the proofs are well-constructed, though some assumptions about network topology may limit practical applicability. The distributed impl

[{'role': 'user',
  'content': [{'type': 'text',
    'text': 'Create and save a fake computer science article'}]},
 {'role': 'assistant',
  'content': [{'type': 'text',
    'text': "I'll create and save a fake computer science article for you."},
   {'type': 'tool_use',
    'id': 'toolu_01VQTGshtZkru9fKj2LiFC4d',
    'name': 'save_article',
    'input': {'abstract': 'This paper introduces a novel quantum-inspired algorithm for distributed graph partitioning that achieves logarithmic time complexity on sparse networks.',
     'meta': {'word_count': 4250,
      'review': 'This paper presents an innovative approach to the graph partitioning problem by leveraging quantum-inspired heuristics in distributed computing environments. The authors demonstrate that their algorithm, QuPart, achieves O(log n) time complexity on sparse graphs with bounded degree, which represents a significant improvement over classical methods. The experimental evaluation on real-world social network datasets shows 