# Controlling Model Output

## Key Concepts
- Two powerful techniques: prefilled assistant messages and stop sequences
- Prefilled messages provide the beginning of Claude's response to steer direction
- Stop sequences end generation as soon as Claude produces one of the specified strings
- Claude continues from exactly where the prefilled text ends (it doesn't repeat it)
- The matched stop sequence itself is not included in the returned text
- Both techniques give fine-grained control over output format and content
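
To make the prefill behaviour concrete, here is a minimal sketch of what the messages list looks like and how the full sentence is reassembled afterwards. The helper names `build_prefilled_messages` and `full_reply` are hypothetical, and the continuation string is a stand-in for a real API response. The `rstrip()` reflects an assumption that the API rejects a final assistant message ending in whitespace.

```python
def build_prefilled_messages(question, prefill):
    """Return a messages list whose last entry is a partial assistant turn."""
    return [
        {"role": "user", "content": question},
        # Assumption: trailing whitespace in the final assistant message is
        # rejected by the API, so strip it defensively.
        {"role": "assistant", "content": prefill.rstrip()},
    ]


def full_reply(prefill, continuation):
    """Rejoin the prefill with the text generated after it."""
    return prefill.rstrip() + continuation


prefill = "Coffee is better because"
messages = build_prefilled_messages("Is tea or coffee better at breakfast?", prefill)

# Hypothetical continuation; a real response would come from the API.
continuation = " it has more caffeine."
print(full_reply(prefill, continuation))
```

Because the returned text contains only the continuation, code that displays or stores the answer usually needs to prepend the prefill itself.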

## Important Code Patterns
- `add_assistant_message(messages, "Coffee is better because")` - prefill response start
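- `chat(messages, stop_sequences=["5"])` - stop as soon as `5` is generated; the text before it is returned, including a trailing `, `
- `stop_sequences=[", 5"]` - include the preceding separator in the stop sequence so the output has no trailing punctuation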
- Modify chat function to accept `stop_sequences=[]` parameter
- Pass stop_sequences to `client.messages.create()` in API call
- The returned text starts right after the prefill, so prepend the prefill yourself when you need the complete sentence
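
The difference between `["5"]` and `[", 5"]` can be seen without calling the API. The sketch below is a local simulation only: `truncate_at_stop` is a hypothetical helper, and `full_output` assumes the model would have written a comma-separated count, which a real response may not do.

```python
def truncate_at_stop(text, stop_sequences):
    """Cut text at the earliest stop sequence, excluding the sequence itself."""
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            continue  # ignore empty strings, which would match at index 0
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


# Assumed shape of the untruncated model output.
full_output = "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"

print(repr(truncate_at_stop(full_output, ["5"])))    # '1, 2, 3, 4, '
print(repr(truncate_at_stop(full_output, [", 5"])))  # '1, 2, 3, 4'
```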

## Best Practices
- Use prefilling for consistent formatting and structure
- Use prefilling when you need Claude to take a particular stance (not neutral)
- Use stop sequences to cap responses at natural breakpoints
- Combine both techniques for structured output and templates
- Fine-tune stop sequences to control exactly where stopping occurs
- Prefilling works by appending an assistant message as the last entry in the messages list
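
One common way to combine the two techniques for structured output is to prefill an opening code fence tagged `json` and use the closing fence as the stop sequence, so the returned text is just the JSON body. The following is a sketch under stated assumptions: `build_json_request` and `parse_json_reply` are hypothetical helpers, the prompt is illustrative, and `simulated_reply` stands in for what the model might return.

```python
import json

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable


def build_json_request(prompt, model="claude-sonnet-4-0", max_tokens=1000):
    """Build keyword arguments for client.messages.create()."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [
            {"role": "user", "content": prompt},
            # Prefill: the reply starts inside a JSON code fence.
            {"role": "assistant", "content": FENCE + "json"},
        ],
        # Stop when the model tries to close the fence.
        "stop_sequences": [FENCE],
    }


def parse_json_reply(text):
    """Parse the text generated between the prefilled fence and the stop sequence."""
    return json.loads(text.strip())


params = build_json_request("Generate a short JSON object describing a cup of coffee.")
# With a configured client: message = client.messages.create(**params)

# Hypothetical model output, for illustration only.
simulated_reply = '\n{"drink": "coffee", "size": "small"}\n'
print(parse_json_reply(simulated_reply))
```

Building the request as a plain dictionary keeps the pattern easy to inspect and test before any API call is made.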

In [None]:
# Install dependencies
%pip install anthropic python-dotenv

# Imports
from dotenv import load_dotenv
import os
from anthropic import Anthropic

# Load environment variables
load_dotenv()

# Create client
client = Anthropic(api_key=os.getenv('ANTHROPIC_API_KEY'))
model = "claude-sonnet-4-0"


In [22]:
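def add_user_message(messages, text):
    # Append a user turn to the conversation
    user_message = {"role": "user", "content": text}
    messages.append(user_message)


def add_assistant_message(messages, text):
    # Append an assistant turn; when it is the last message it acts as a prefill
    assistant_message = {"role": "assistant", "content": text}
    messages.append(assistant_message)


def chat(messages, system=None, temperature=1.0, stop_sequences=[]):
    # The default list is never mutated, so sharing it between calls is safe
    params = {
        "model": model,
        "max_tokens": 1000,
        "messages": messages,
        "temperature": temperature,
        "stop_sequences": stop_sequences,
    }
    # Only send a system prompt when one was provided
    if system:
        params["system"] = system
    message = client.messages.create(**params)
    # Return the text of the first content block
    return message.content[0].text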


In [23]:
messages = []
add_user_message(
    messages,
    "Is tea or coffee better at breakfast?",
)
# Prefill: the model continues from this partial assistant turn
add_assistant_message(
    messages,
    "Coffee is better because",
)

answer = chat(messages)
answer



" coffee at breakfast helps wake you up - the caffeine is more concentrated. However, tea has polyphenol antioxidants, and both have health benefits. If you're looking to wake up, coffee is superior. If you want something gentler, tea is fine too."

In [27]:
messages = []
add_user_message(
    messages,
    "Count from 1 to 10",
)

answer = chat(messages, stop_sequences=[", 5"])
answer

'1, 2, 3, 4'
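
To confirm that a response ended because of a stop sequence rather than for another reason, the Messages API response carries `stop_reason` and `stop_sequence` fields. The sketch below uses a hypothetical helper, `describe_stop`, and a stand-in object in place of a real response. Note that `chat` above returns only the text, so it would need to return the whole message for this check to apply.

```python
from types import SimpleNamespace


def describe_stop(message):
    """Summarise why generation ended, based on the response's stop fields."""
    if message.stop_reason == "stop_sequence":
        return f"stopped at stop sequence {message.stop_sequence!r}"
    if message.stop_reason == "max_tokens":
        return "cut off by the max_tokens limit"
    return f"ended with stop_reason={message.stop_reason!r}"


# Stand-in for a real response object, for illustration only.
fake_message = SimpleNamespace(stop_reason="stop_sequence", stop_sequence=", 5")
print(describe_stop(fake_message))
```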