<a href="https://colab.research.google.com/github/anthropics/anthropic-cookbook/blob/main/misc/sampling_past_max_tokens.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Sampling Responses from Claude Beyond the Max Tokens Limit

This notebook shows how to get responses from Claude that are longer than the `max_tokens` limit allows, by prefilling the assistant turn with the text of the previous, truncated response.

In [None]:
%%capture
!pip install anthropic

First, we'll prompt Claude by asking it to write something longer than 4096 tokens.

In [None]:
import anthropic

client = anthropic.Anthropic(
    api_key="YOUR API KEY HERE",
)
message = client.messages.create(
    model="claude-3-sonnet-20240229",
    max_tokens=4096,
    messages=[
        {
            "role": "user",
            "content": """
Please write five stories. Each should be at least 1000 words. Number the words to make sure you don't lose track. Make each story about a different animal.
Put them in <story_1>, <story_2>, ... tags
""",
        },
    ]
)



In [None]:
print(message.stop_reason)

max_tokens


The stop reason above is `max_tokens`, which means Claude stopped sampling because it hit the token limit, not because it had finished. As the output below shows, Claude's message cuts off in the middle of the fifth story.

In [None]:
print(message.content[0].text)

<story_1>
1. Once upon a time, in the vast expanse of the African savanna, there lived a magnificent lion named Mufasa. 2. His golden mane rippled in the warm breeze, and his piercing amber eyes commanded respect from all who laid eyes upon him. 3. Mufasa was the king of the Pride Lands, a realm teeming with diverse wildlife and lush grasslands.

4. As the sun rose each morning, Mufasa would take his daily patrol, surveying his territory with a vigilant gaze. 5. He knew every inch of the Pride Lands, from the towering acacia trees to the winding rivers that sustained life. 6. His powerful roar echoed across the plains, a reminder to all that he was the undisputed ruler of this domain.

7. One day, while leading his pride on a hunt, Mufasa noticed a group of hyenas lurking in the shadows. 8. These scavengers were known for their cunning and ruthlessness, and they posed a constant threat to the delicate balance of the Pride Lands. 9. With a flick of his tail, Mufasa signaled his lionesse

The solution is to send the partially completed response back to Claude as an assistant message (a prefill). Claude then continues sampling from where the text was cut off and completes the story. As the output below shows, Claude picks up seamlessly in mid-sentence.

In [None]:
message2 = client.messages.create(
    model="claude-3-sonnet-20240229",
    max_tokens=4096,
    messages=[
        {
            "role": "user",
            "content": """
Please write five stories. Each should be at least 1000 words. Number the words to make sure you don't lose track. Make each story about a different animal.
Put them in <story_1>, <story_2>, ... tags
"""},
        {
            "role": "assistant",
            "content": message.content[0].text  # The text of Claude's partially completed message.
        }
    ]
)
print(message2.content[0].text)

 to disperse in a whirlwind of vibrant colors.

11. Zico, enthralled by the spectacle, followed the macaws deeper into the heart of the rainforest, weaving through the intricate maze of vines and branches. 12. His family's calls for him to return faded into the distance, lost in the symphony of the jungle.

13. For hours, Zico trailed the macaws, his nimble form navigating the treacherous terrain with ease. 14. The macaws, sensing a potential threat, led him on a winding path, their flight seemingly effortless.

15. As dusk fell, Zico found himself utterly lost, surrounded by unfamiliar sights and sounds. 16. The calls of his family had long faded, replaced by the eerie hoots of owls and the rustling of unseen creatures lurking in the shadows.

17. Exhausted and disoriented, Zico alighted on a sturdy branch, his eyes wide with fear. 18. The night passed slowly, every rustle of leaves sending shivers down his spine.

19. As the first rays of dawn filtered through the canopy, Zico realiz
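
The second response above was itself cut off, so the same step can be repeated. The sketch below wraps it in a loop that keeps prefilling until `stop_reason` is no longer `max_tokens`. The function name `sample_until_done` and its `create` and `max_rounds` parameters are hypothetical and not part of the SDK. `create` is expected to be `client.messages.create` or anything with the same keyword shape. Stripping trailing whitespace before re-sending is a precaution, on the assumption that the API rejects a prefill ending in whitespace.

```python
def sample_until_done(create, prompt, max_rounds=5, **kwargs):
    """Re-send the accumulated text as an assistant prefill until Claude finishes.

    `create` is a hypothetical parameter: pass `client.messages.create`.
    Extra keyword arguments (model, max_tokens, ...) are forwarded unchanged.
    """
    text = ""
    stop_reason = None
    for _ in range(max_rounds):
        messages = [{"role": "user", "content": prompt}]
        if text:
            # Prefill: Claude continues from the end of this assistant message.
            messages.append({"role": "assistant", "content": text})
        response = create(messages=messages, **kwargs)
        text += response.content[0].text
        stop_reason = response.stop_reason
        if stop_reason != "max_tokens":
            break
        # Assumption: a prefill must not end in whitespace, so strip it
        # before the next round and let Claude regenerate it.
        text = text.rstrip()
    return text, stop_reason
```

With the client defined earlier, a call might look like `text, reason = sample_until_done(client.messages.create, prompt, model="claude-3-sonnet-20240229", max_tokens=4096)`, where `prompt` holds the five-stories request. The `max_rounds` cap is there so that a response that never finishes cannot loop, and bill, indefinitely.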

Note the cost of this approach. The input tokens in your prompt are billed twice, once per request, and the output tokens of Claude's first 4096-token response are billed once more as input tokens when they are sent back as the prefill. Output tokens are not double-charged: each generated token is billed as output only once.
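
To make the billing concrete, here is a rough sketch of the token arithmetic for the two requests in this notebook. The function `billed_tokens` and the example counts are hypothetical. It also assumes the prefilled text counts as the same number of input tokens as it did output tokens, which may be only approximately true.

```python
def billed_tokens(prompt_tokens, first_output_tokens, second_output_tokens):
    """Estimate billed tokens for one truncated request plus one continuation."""
    # Request 1 reads only the prompt.
    first_call_input = prompt_tokens
    # Request 2 reads the prompt again plus the first response as a prefill.
    second_call_input = prompt_tokens + first_output_tokens
    return {
        "input": first_call_input + second_call_input,
        # Each generated token is billed as output exactly once.
        "output": first_output_tokens + second_output_tokens,
    }


# Made-up counts: a 50-token prompt, a full 4096-token first response,
# and a 1500-token continuation.
print(billed_tokens(50, 4096, 1500))  # {'input': 4196, 'output': 5596}
```

The actual counts for each request are reported on the response object as `message.usage.input_tokens` and `message.usage.output_tokens`, so the estimate can be checked against real usage.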