On bedrock can't set tokens for claude-sonnet-3-5 > 4096 even though the model supports it #2772

Closed
mwrshah opened this issue Aug 22, 2024 · 3 comments
Labels
ai/provider, invalid (This doesn't seem right)

Comments


mwrshah commented Aug 22, 2024

Description

When making a call with streamText, I get an error back:

ValidationException: The maximum tokens you requested exceeds the model limit of 4096. Try again with a maximum tokens value that is lower than 4096.

Using boto3 I am able to make calls with maxTokens > 4096.

Code example

    const result = await streamText({
      model: bedrock('anthropic.claude-3-5-sonnet-20240620-v1:0'),
      maxTokens: 7168,
      headers: {
        'anthropic-beta': 'max-tokens-3-5-sonnet-2024-07-15'
      },
      abortSignal: req.signal,
      system: 'You are a software developer',
      messages: logConvertToCoreMessages(messages),
      onFinish(finish_reason) {
        data.close()
        console.log(`finish reason: ${finish_reason.finishReason} : ${JSON.stringify(finish_reason.usage, null, 2)}`)
      }
    });
    return result.toDataStreamResponse();
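
For comparison with the boto3 claim above, here is a minimal sketch of what a call above 4096 output tokens might look like through boto3's invoke_model (illustrative only, not the reporter's actual code; the request body follows the Bedrock Anthropic Messages format, and region/credentials are assumed to come from the environment):

import json
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

body = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 7168,  # above the 4096 default cap
    "messages": [{"role": "user", "content": "Hello, world"}],
}
# Whether the max-tokens-3-5-sonnet beta flag needs to be (or can be) passed here
# is unclear, so it is omitted in this sketch.

response = client.invoke_model(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
    body=json.dumps(body),
)
result = json.loads(response["body"].read())
print(result["usage"])  # token counts actually reported by the model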

Additional context

No response

@lgrammel
Collaborator

Is it really available? The model version in Bedrock seems to be older. Are you able to generate > 4096 tokens with Boto3, or is it just not validating that parameter?
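
One way to tell those two cases apart is to look at the response rather than the request: in the Anthropic Messages response format that Bedrock returns, usage.output_tokens shows how many tokens were actually generated and stop_reason shows whether the limit was hit. A minimal sketch, assuming result is a parsed invoke_model response body as in the sketch above:

def generated_over_4096(result: dict) -> bool:
    # result is assumed to be json.loads(response["body"].read()) from invoke_model
    print("output_tokens:", result["usage"]["output_tokens"])
    print("stop_reason:", result["stop_reason"])  # "max_tokens" means the cap was hit
    return result["usage"]["output_tokens"] > 4096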


cfernhout commented Sep 19, 2024

Hi @lgrammel, I have the same issue. I believe it's an AWS issue, because when using boto3 (Python) I can't set max_tokens above 4096, but I can set max_tokens to 8192 using the Anthropic SDK.

I stumbled on this issue while looking into my own; I'm not a Vercel AI SDK user 😉

import boto3
from anthropic import AnthropicBedrock
from *** import AWS_ACCESS_KEY_ID, AWS_REGION, AWS_SECRET_ACCESS_KEY

client = AnthropicBedrock(
    aws_access_key=AWS_ACCESS_KEY_ID,
    aws_secret_key=AWS_SECRET_ACCESS_KEY,
    aws_region=AWS_REGION,
)

message = client.messages.create(
    model="eu.anthropic.claude-3-5-sonnet-20240620-v1:0",
    max_tokens=8192,
    messages=[{"role": "user", "content": "Hello, world"}]
)
# This works perfectly fine


client = boto3.client(
    'bedrock-runtime',
    region_name=AWS_REGION,
    aws_access_key_id=AWS_ACCESS_KEY_ID,
    aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
)

response = client.converse(
    modelId=f"arn:aws:bedrock:{AWS_REGION}:***:inference-profile/eu.anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": [{"text": "Hello, world"}]}],
    inferenceConfig={
        'maxTokens': 8192,
    },
)
# ValidationException: An error occurred (ValidationException) when calling the Converse operation: The maximum tokens you requested exceeds the model limit of 4096. Try again with a maximum tokens value that is lower than 4096.

The only caveat is that AnthropicBedrock's create doesn't validate max_tokens, so I can run the following without an error:

message = client.messages.create(
    model="eu.anthropic.claude-3-5-sonnet-20240620-v1:0",
    max_tokens=2_000_000,
    messages=[{"role": "user", "content": "Hello, world"}]
)
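
Since nothing is validated client-side, the response is the only place to see what the model actually honored. A minimal sketch, reusing the AnthropicBedrock client constructed above:

message = client.messages.create(
    model="eu.anthropic.claude-3-5-sonnet-20240620-v1:0",
    max_tokens=8192,
    messages=[{"role": "user", "content": "Write a very long story."}],
)
# usage.output_tokens reports how many tokens were actually generated;
# stop_reason == "max_tokens" means the response was cut off at the limit.
print(message.usage.output_tokens, message.stop_reason)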

@lgrammel added the invalid (This doesn't seem right) label on Sep 19, 2024

cfernhout commented Sep 19, 2024

The Bedrock team is aware of this issue: boto/boto3#4279 (comment)
