
Bug? claude3 responses seem to be cut after a few words/tokens. #1543

Closed
alexisdal opened this issue Mar 8, 2024 · 20 comments

Comments

@alexisdal

I wanted to evaluate claude3 opus, their latest and biggest model, which just ranked #3 in the chatbot arena ranking, the closest to gpt4-turbo to date.

So I just added some credits on the console, then created a new Claude API key that I pasted into chatbotui to try claude3 opus. I used the default settings, but the first answer was immediately cut in the middle of a sentence. That reminded me of the default max_length on the OpenAI playground, ridiculously limited to 256 tokens. But I didn't find this param in the settings at all. I just increased the context length to 200k tokens, just in case (but likely unrelated), and started a new chat on claude 3 opus. Screenshot below.

Yet the answers seem to be cut off / chopped. I do not know if

  • I'm not using chatbotui right (I'm still on the free plan on chatbotui.com, maybe that's related?)
  • I'm supposed to configure chatbotui settings somehow and I missed it
  • there's something wrong with the settings on my anthropic account (not enough money spent, too few credits)

=> in other words, I do not know what I'm doing wrong, but clearly this is not the expected behavior of opus 3.

(screenshot)

@alexisdal
Author

for comparison: how claude3 opus behaves in the anthropic console/playground. I used the very same prompts.
notice the default settings on the left

temperature is on a scale of 0 to 1 (and defaults to zero), whereas openai's temperature ranges from 0 to 2 and defaults to 1.

max_tokens_to_sample = max length of one claude response = defaults to 1000

it's visible that the claude 3 answer is appropriate, correct and not cut off there.

(screenshot)

@ScrogBot

ScrogBot commented Mar 8, 2024

I’ve noticed that with Claude as well. I respond with “continue” and it finishes.

@rmkr

rmkr commented Mar 8, 2024

Continue doesn't work for me. My API key does work on TypingMind. I do have ChatbotUI Plus, so I don't think it's related to your account level. There is also a Twitter thread where @mckaywrigley talks about this and relates it to the type of account (Evaluator Accounts).

(screenshot)

@alexisdal
Author

I'm fine throwing a few $$ at extra credit (instead of the free $5 that were awarded on my account and that were used in the screenshots above), but I need someone to confirm whether or not it's actually useful, so that I don't throw money out the window.
=> indeed, I do not understand why the playground/console (which works over the API) would give a different response compared to chatbotui, which also leverages the API access... 🤔

@rmkr

rmkr commented Mar 9, 2024

I have ~25 dollars worth of credit, plus the 5, so I don't think it's that. I am on Build Tier 1, which is one above the free tier. Either they are limiting the tokens coming out, or something is wrong with the query. This is just speculation.

@alexisdal
Author

instead, I just tried to execute the same system/user prompt using a bare-minimum python script (copy/pasted from the anthropic console)

For me, the anthropic answer is not cut off, so it smells like something in chatbotui

(.venv) C:\Users\alex\Desktop\test_claude3>nano .env

(.venv) C:\Users\alex\Desktop\test_claude3>cat run.py
from dotenv import load_dotenv
load_dotenv()

import anthropic

client = anthropic.Anthropic(
    # defaults to os.environ.get("ANTHROPIC_API_KEY")
    # api_key="my_api_key",
)
message = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1000,
    temperature=0,
    system="You are a friendly, helpful AI assistant.",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Hi Opus, Is that true that you're really super charged compared to other anthropic AI models? what are the other anthropic models and how those models compare? explain me"
                }
            ]
        }
    ]
)
print(message.content)

(.venv) C:\Users\alex\Desktop\test_claude3>python run.py
[ContentBlock(text='I appreciate your interest, but I\'m actually not sure how I compare to other Anthropic AI models in terms of capabilities. I know I was created by Anthropic, but I don\'t have detailed information about their other models or my exact capabilities relative to them. \n\nI aim to be helpful and to engage in friendly conversation, but I think it\'s best if I avoid making strong claims about being "super charged" or superior to other AIs, since I\'m quite uncertain about that. I\'m an AI with significant capabilities in many areas, but also important limitations.\n\nRather than comparing myself to other AIs, I think it\'s better to focus on how I can be helpful to you in our conversation. Please let me know if there are any topics you\'d like to discuss or ways I can assist you!', type='text')]

(.venv) C:\Users\alex\Desktop\test_claude3>


(.venv) C:\Users\alex\Desktop\test_claude3>pip list
Package            Version
------------------ --------
annotated-types    0.6.0
anthropic          0.19.1
anyio              4.3.0
certifi            2024.2.2
charset-normalizer 3.3.2
colorama           0.4.6
distro             1.9.0
filelock           3.13.1
fsspec             2024.2.0
h11                0.14.0
httpcore           1.0.4
httpx              0.27.0
huggingface-hub    0.21.4
idna               3.6
packaging          23.2
pip                24.0
pydantic           2.6.3
pydantic_core      2.16.3
python-dotenv      1.0.1
PyYAML             6.0.1
requests           2.31.0
sniffio            1.3.1
tokenizers         0.15.2
tqdm               4.66.2
typing_extensions  4.10.0
urllib3            2.2.1

(.venv) C:\Users\alex\Desktop\test_claude3>

for the record, I just took a fresh venv with python 3.12 and did a pip install anthropic python-dotenv (to store my API key in a separate .env file) => all other libs are dependencies

@alexisdal
Author

this is what I suspect is happening:
chatbotui cuts off claude3 responses
upon clicking the send button, the entire context is sent back to claude, which then thinks it cut its own response
claude then tries to apologize for cutting its own answers, which it did not do, but that's what it looks like given the dict sent in the API request
and then the new response is eventually cut off again by chatbotui
=> and the loop basically continues

making the claude-3-opus-20240229 model (and maybe other claude / anthropic models) totally useless in chatbotui?

just a hypothesis, but I can't think of another one that would better explain the behaviors and facts collected until now.
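The suspected loop is easy to sketch. Here is a purely illustrative Python simulation; the helper function and message shapes are made up for this sketch and are not ChatbotUI internals:

```python
# Purely illustrative sketch of the suspected bug, not ChatbotUI internals:
# the client drops the tail of a streamed reply, stores the partial text as
# assistant history, and resends it, so the model "sees" its own cut-off answer.

def truncate_stream(full_reply, received_chunks):
    """Simulate the client losing the stream after `received_chunks`
    whitespace-separated chunks (e.g. a mid-stream timeout)."""
    return " ".join(full_reply.split()[:received_chunks])

history = [{"role": "user", "content": "Explain the Claude 3 model family."}]
full_reply = ("Claude 3 comes in three sizes: Haiku, Sonnet and Opus, "
              "which differ in cost and capability.")

# The stream dies after a few chunks, and the partial text is saved anyway.
history.append({"role": "assistant", "content": truncate_stream(full_reply, 7)})

# On the next send, the model receives its own truncated answer as context,
# and naturally "apologizes" for cutting itself off.
print(history[-1]["content"])  # Claude 3 comes in three sizes: Haiku,
```

If that is what happens, every subsequent turn carries the mangled assistant message forward, which would explain the repeated apologies.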

@rmkr

rmkr commented Mar 9, 2024

They could also be limiting responses to the IP of the ChatbotUI server. This could be tested by hosting ChatbotUI locally and seeing if the response is still cut.

@rmkr

rmkr commented Mar 9, 2024

I confirmed that if I host locally, I do not have the issue with Claude, so it may be Claude limiting requests from the server.

(screenshot, 2024-03-09)

@tysonvolte

Still debugging this, but the debug console says "pipeline broken" when this occurs for me. I'll try to look into this tomorrow.

@tysonvolte

Solved it: it's due to Vercel's maxDuration timeout (15 seconds if you check the logs). However, setting the API endpoint to use the Edge runtime fixes the issue.

// export const runtime = "edge"

Might've been an accidental oversight, but uncommenting this line in the anthropic endpoint fixes the issue.

More information on why it happens:

https://vercel.com/docs/functions/configuring-functions/duration

@lgruen

lgruen commented Mar 17, 2024

@tysonvolte Thanks for the suggestion! I tried that and have redeployed to Vercel, but it doesn't seem to help in my case.

@tysonvolte

> @tysonvolte Thanks for the suggestion! I tried that and have redeployed to Vercel, but it doesn't seem to help in my case.

Can you check your Vercel logs and paste what error you're getting?

@spammenotinoz
Contributor

spammenotinoz commented Mar 17, 2024

Your mileage may vary, but I found the same issue occurred via the Workbench for Sonnet and Haiku. Opus worked fine. After putting $5 on the account, they all would complete the queries correctly in the Workbench.
Still have the problem in ChatBot-UI.

Thanks @tysonvolte, however removing the // seemed to break my deployment, i.e. only a single word was returned.
(screenshot)

(screenshot)

Based on the 10-second timeout, it's possibly running as a serverless function rather than an edge function, which is where @tysonvolte was heading, but it didn't work for me.

(screenshot)

@spammenotinoz
Contributor

> this is what I suspect is happening chatbotui strips off claude3 responses upon clicking on the send button, then the entire context is sent back to claude, who then thinks it cut its own responses and then claude tries to apologize for cutting its own answers, which it did not do, but it's what it looks like given the dict sent in the API request, and then the new response is eventually cut off again by chatbotui => and the loop basically continues
>
> making claude-3-opus-20240229 model (and maybe other claude / anthropic models) totally useless in chatbotui ?
>
> just a hypothesis, but I can't think of another one that would better explain the behaviors and facts collected until now.

That's how all API models work; that's why token counts escalate quickly during a conversation, they are compounding. If you pasted, say, 4k tokens in and it sends 4k back, on your next prompt you send 8k back plus your new prompt... and this goes on. I've capped it to only include history for the last 10 messages, but a better way would be to use a model to summarise/rationalise the previous messages/responses.
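The compounding effect is simple arithmetic; here is a quick illustrative sketch (the function and the numbers are mine, not ChatbotUI code):

```python
# Quick sketch of how the per-request payload grows when a chat client
# resends the full history on every turn (illustrative, not ChatbotUI code).

def tokens_sent_per_turn(prompt_tokens, reply_tokens, turns):
    """Token count sent on each turn when all prior prompts and
    replies are resent along with the new prompt."""
    sent, history = [], 0
    for _ in range(turns):
        sent.append(history + prompt_tokens)     # full history + the new prompt
        history += prompt_tokens + reply_tokens  # this turn joins the history
    return sent

# With 4k-token prompts and 4k-token replies the payload grows linearly:
print(tokens_sent_per_turn(4000, 4000, 3))  # [4000, 12000, 20000]
```

Capping history at the last N messages bounds this growth at the cost of the model forgetting older turns, which is why summarising old messages is the nicer option.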

@spammenotinoz
Contributor

spammenotinoz commented Mar 18, 2024

@tysonvolte @lgruen @rmkr @alexisdal @mckaywrigley

Try this fix #1571

  • BugFix: Bump Anthropic SDK to 0.18.0, resolve parsing issue.
  • Improve error handling.
  • Revise Anthropic to run on the edge, resolve timeout issue.

Please note that I cannot code, but this seemed to consistently fix the issue on my end.

@lgruen

lgruen commented Mar 19, 2024

@spammenotinoz That worked perfectly! Thank you very much.

> @tysonvolte Thanks for the suggestion! I tried that and have redeployed to Vercel, but it doesn't seem to help in my case.
>
> Can you check your Vercel logs and paste what error you're getting?

I had the same issue as @spammenotinoz described in #1543 (comment). For reference, here are the Vercel logs (before @spammenotinoz's fix in #1571):

Mar 19 19:07:25.37
xxx
POST
/api/chat/anthropic
From chunk: [ 'event: content_block_delta' ]
Mar 19 19:07:25.37
xxx
POST
/api/chat/anthropic
Could not parse message into JSON:
Mar 19 19:07:25.37
xxx
POST
/api/chat/anthropic
SyntaxError: Unexpected end of JSON input at (node_modules/@anthropic-ai/sdk/streaming.mjs:58:39) at (node_modules/ai/dist/index.mjs:417:19) at (node_modules/ai/dist/index.mjs:298:30)
Mar 19 19:07:25.37
xxx
POST
/api/chat/anthropic
SyntaxError: Unexpected end of JSON input at (node_modules/@anthropic-ai/sdk/streaming.mjs:58:39) at (node_modules/ai/dist/index.mjs:417:19) at (node_modules/ai/dist/index.mjs:298:30)
Mar 19 19:07:25.37
xxx
POST
/api/chat/anthropic
SyntaxError: Unexpected end of JSON input at (node_modules/@anthropic-ai/sdk/streaming.mjs:58:39) at (node_modules/ai/dist/index.mjs:417:19) at (node_modules/ai/dist/index.mjs:298:30)
Mar 19 19:07:25.37
xxx
POST
/api/chat/anthropic
SyntaxError: Unexpected end of JSON input 
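For what it's worth, that "Unexpected end of JSON input" pattern is typical of parsing a server-sent-events stream chunk by chunk without buffering: a single `data:` payload can arrive split across two network chunks. An illustrative Python sketch (not the Anthropic SDK's actual parser):

```python
import json

# Illustrative sketch (not the Anthropic SDK's actual parser) of why naive
# chunk-by-chunk parsing of an SSE stream hits "Unexpected end of JSON input":
# one `data:` payload may arrive split across two network chunks, so only
# complete, newline-terminated lines should be handed to the JSON parser.

def parse_sse(chunks):
    """Buffer incoming chunks and JSON-parse only complete lines."""
    buffer, events = "", []
    for chunk in chunks:
        buffer += chunk
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            if line.startswith("data: "):
                events.append(json.loads(line[len("data: "):]))
    return events

# The second event arrives split across two chunks; buffering lets it
# parse cleanly instead of failing on the partial JSON fragment.
chunks = [
    'data: {"type": "content_block_delta", "delta": {"text": "Hel"}}\n',
    'data: {"type": "content_block_del',
    'ta", "delta": {"text": "lo"}}\n',
]
print("".join(e["delta"]["text"] for e in parse_sse(chunks)))  # Hello
```

That would line up with the SDK bump in #1571 fixing things, if the newer SDK handles partial stream chunks more robustly.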

@tysonvolte

tysonvolte commented Mar 19, 2024

Yeah, turns out it wasn't just that line; I'd already bumped the Anthropic version for a separate issue, which is why it seemed like that line fixed the problem. Great work y'all!

@mlovic

mlovic commented Mar 20, 2024

I had the same issue, also fixed by #1571. Thank you @spammenotinoz! Hope it gets merged soon.

@mckaywrigley
Owner

Merged and fixed. @spammenotinoz you are a hero.
