# Response Streaming

## Key Concepts
- Streaming sends response chunks as they're generated (like ChatGPT's typing effect)
- Improves perceived responsiveness for long responses
- Uses Server-Sent Events (SSE) to stream data
- Each chunk contains partial content that builds the full response
- The stream must be closed reliably, using a context manager or try/finally
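
To make the SSE bullet concrete, here is a minimal sketch of how text deltas could be pulled out of an SSE body by hand. The sample payload is hand-written for illustration (modeled on the `content_block_delta` events printed later in this notebook), and `iter_text_deltas` is a hypothetical helper, not part of the SDK. In practice the SDK does this parsing for you.

```python
import json

# Hand-written sample of an SSE body (an assumption for illustration);
# real streams carry more event types and more fields per event.
SAMPLE_SSE = (
    'event: content_block_delta\n'
    'data: {"type": "content_block_delta", "index": 0, '
    '"delta": {"type": "text_delta", "text": "Fake"}}\n'
    '\n'
    'event: content_block_delta\n'
    'data: {"type": "content_block_delta", "index": 0, '
    '"delta": {"type": "text_delta", "text": "DB"}}\n'
    '\n'
    'event: message_stop\n'
    'data: {"type": "message_stop"}\n'
    '\n'
)


def iter_text_deltas(sse_body):
    """Yield the text of each text_delta in an SSE body (hypothetical helper)."""
    for line in sse_body.splitlines():
        if not line.startswith("data:"):
            continue  # skip "event:" lines and blank separators
        payload = json.loads(line[len("data:"):].strip())
        delta = payload.get("delta", {})
        if payload.get("type") == "content_block_delta" and delta.get("type") == "text_delta":
            yield delta["text"]


# Each chunk is partial; joining them rebuilds the full response text.
full_text = "".join(iter_text_deltas(SAMPLE_SSE))
print(full_text)
```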

## Important Code Patterns
- `client.messages.stream()` - opens a streaming response with helpers such as `text_stream`; `messages.create(stream=True)` returns the raw events instead
- `with client.messages.stream(...) as stream:` - context manager handles cleanup
- `for text in stream.text_stream:` - iterate through text chunks as they arrive
- `stream.get_final_message()` - get complete message object after streaming
- Handle exceptions to ensure stream closes properly
- Print chunks without newlines: `print(chunk, end="", flush=True)`
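
The printing pattern above can be sketched as a small helper. `print_stream` is a hypothetical name, and a plain iterator stands in for `stream.text_stream` so the sketch runs without an API call. The commented lines show how it might be wired to the real client, assuming `client`, `model`, and `messages` are defined as in the cells below.

```python
import sys


def print_stream(text_chunks, out=sys.stdout):
    """Write each chunk as it arrives and return the assembled text (hypothetical helper)."""
    parts = []
    for chunk in text_chunks:
        out.write(chunk)  # no newline, so chunks join into continuous text
        out.flush()       # push the chunk to the display immediately
        parts.append(chunk)
    out.write("\n")       # end the line once the stream is exhausted
    return "".join(parts)


# Stand-in for stream.text_stream (an assumption for illustration).
fake_text_stream = iter(["Fake", "DB is a ", "toy database."])
full_text = print_stream(fake_text_stream)

# With the real SDK this might look like:
# with client.messages.stream(model=model, max_tokens=1000, messages=messages) as stream:
#     full_text = print_stream(stream.text_stream)
#     final_message = stream.get_final_message()
```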

## Best Practices
- Always use context manager or try/finally to ensure stream closes
- Flush output after each chunk for real-time display
- Streaming adds little overhead and makes long responses feel faster, because text appears as soon as it is generated
- Non-streaming is simpler for batch processing or when the full response is needed before anything can be done with it
- Consider streaming for user-facing applications and non-streaming for backend processing
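
A minimal sketch of the try/finally advice, assuming a stream object that is iterable and has a `close()` method. `FakeStream` and `consume` are hypothetical names used only for illustration. The stand-in simulates a network failure partway through, to show that the stream is closed either way.

```python
class FakeStream:
    """Stand-in for a streaming response; records whether close() was called."""

    def __init__(self, chunks):
        self._chunks = chunks
        self.closed = False

    def __iter__(self):
        for chunk in self._chunks:
            if chunk is None:
                # A None chunk simulates the connection dropping mid-stream.
                raise ConnectionError("simulated network drop")
            yield chunk

    def close(self):
        self.closed = True


def consume(stream):
    """Collect chunks and always close the stream, even if iteration fails."""
    received = []
    try:
        for chunk in stream:
            received.append(chunk)
    except ConnectionError:
        pass  # keep whatever arrived before the failure
    finally:
        stream.close()  # runs on success and on error
    return "".join(received)


ok = FakeStream(["Hello", ", ", "world"])
broken = FakeStream(["Partial ", None, "never seen"])
ok_text = consume(ok)
broken_text = consume(broken)
print(repr(ok_text), ok.closed)
print(repr(broken_text), broken.closed)
```

A `with` block gives the same guarantee with less code, which is why the context-manager form is usually the more convenient choice.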

In [15]:
# Install dependencies
%pip install anthropic python-dotenv

# Imports
from dotenv import load_dotenv
import os
from anthropic import Anthropic

# Load environment variables
load_dotenv()

# Create client
client = Anthropic(api_key=os.getenv('ANTHROPIC_API_KEY'))
model = "claude-sonnet-4-0"




In [16]:
def add_user_message(messages, text):
    # Append a user turn to the conversation history
    user_message = {"role": "user", "content": text}
    messages.append(user_message)


def add_assistant_message(messages, text):
    # Append an assistant turn to the conversation history
    assistant_message = {"role": "assistant", "content": text}
    messages.append(assistant_message)


def chat(messages, system=None):
    # Non-streaming request: returns only after the full response is generated
    params = {
        "model": model,
        "max_tokens": 1000,
        "messages": messages,
    }
    if system:
        params["system"] = system
    message = client.messages.create(**params)
    return message.content[0].text


In [17]:
messages = []
add_user_message(messages, "Write a 1 sentence description of a fake database")

stream = client.messages.create(
    model=model,
    max_tokens=1000,
    messages=messages,
    stream=True
)

for event in stream:
    print(event)

RawMessageStartEvent(message=Message(id='msg_01VHF3cGPF5petdAybRFUi1Z', content=[], model='claude-sonnet-4-20250514', role='assistant', stop_reason=None, stop_sequence=None, type='message', usage=Usage(cache_creation=CacheCreation(ephemeral_1h_input_tokens=0, ephemeral_5m_input_tokens=0), cache_creation_input_tokens=0, cache_read_input_tokens=0, input_tokens=18, output_tokens=1, server_tool_use=None, service_tier='standard')), type='message_start')
RawContentBlockStartEvent(content_block=TextBlock(citations=None, text='', type='text'), index=0, type='content_block_start')
RawContentBlockDeltaEvent(delta=TextDelta(text='F', type='text_delta'), index=0, type='content_block_delta')
RawContentBlockDeltaEvent(delta=TextDelta(text='akeDB is a lightweight', type='text_delta'), index=0, type='content_block_delta')
RawContentBlockDeltaEvent(delta=TextDelta(text=' in', type='text_delta'), index=0, type='content_block_delta')
RawContentBlockDeltaEvent(delta=TextDelta(text='-memory database simula
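
Most of the raw events above are bookkeeping; only the `content_block_delta` events with a `text_delta` carry text. A minimal sketch of filtering them follows. `text_from_raw_events` is a hypothetical helper, and `SimpleNamespace` objects stand in for the SDK event classes so the sketch runs without an API call. Their fields mirror only the attributes visible in the printed output.

```python
from types import SimpleNamespace


def text_from_raw_events(events):
    """Yield the text carried by each text delta, skipping all other events."""
    for event in events:
        if event.type == "content_block_delta" and event.delta.type == "text_delta":
            yield event.delta.text


def fake_delta(text):
    # Stand-in for a RawContentBlockDeltaEvent (an assumption for illustration).
    return SimpleNamespace(
        type="content_block_delta",
        index=0,
        delta=SimpleNamespace(type="text_delta", text=text),
    )


fake_events = [
    SimpleNamespace(type="message_start"),
    SimpleNamespace(type="content_block_start", index=0),
    fake_delta("F"),
    fake_delta("akeDB is a lightweight"),
    fake_delta(" in"),
    SimpleNamespace(type="message_stop"),
]

text = "".join(text_from_raw_events(fake_events))
print(text)
```

The `text_stream` helper used in the next cell performs this filtering for you, which is the main reason to prefer `client.messages.stream()` when only the text is needed.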

In [None]:
with client.messages.stream(
    model=model,
    max_tokens=1000,
    messages=messages
) as stream:
    for text in stream.text_stream:
        # print(text, end="", flush=True)
        pass

    # Read the assembled message before leaving the context manager
    final_message = stream.get_final_message()

final_message

Message(id='msg_017AKaJyRi28eVwjnN1vHtfr', content=[TextBlock(citations=None, text='The "GlobalMind Database" is a fictional cloud-based repository that claims to store and cross-reference the complete digital footprints, behavioral patterns, and predictive models of every internet user worldwide for advanced social analytics.', type='text')], model='claude-sonnet-4-20250514', role='assistant', stop_reason='end_turn', stop_sequence=None, type='message', usage=Usage(cache_creation=CacheCreation(ephemeral_1h_input_tokens=0, ephemeral_5m_input_tokens=0), cache_creation_input_tokens=0, cache_read_input_tokens=0, input_tokens=18, output_tokens=49, server_tool_use=None, service_tier='standard'))