Bug? Claude 3 responses seem to be cut off after a few words/tokens. #1543
Comments
For comparison: how Claude 3 Opus behaves in the Anthropic console/playground. I used the very same prompts. Temperature is on a scale of 0 to 1 (and defaults to zero), whereas OpenAI temperature ranges from 0 to 2 and defaults to 1. Max tokens to sample = max length of one Claude response = defaults to 1000. It's visible that the Claude 3 answer is appropriate, correct, and not cut off there.
I’ve noticed that with Claude as well. I respond with “continue” and it finishes.
Continue doesn't work for me. My API key does work on TypingMind. I do have ChatbotUI Plus, so I don't think it's related to your account level. There is also a Twitter thread where @mckaywrigley talks about this and relates it to the type of account (Evaluator Accounts).
I'm fine throwing a few $$ at extra credit (instead of the free $5 that was awarded on my account and used in the screenshots above), but I need someone to confirm whether or not it's actually useful so that I don't throw money out the window.
I have ~25 dollars worth of credit, plus the 5, so I don't think it's that. I am on Build Tier 1, which is one above the free tier. Either they are limiting the tokens coming out, or something is wrong with the query. This is just speculation.
Instead, I just tried to execute the same system/user prompt using a bare-minimum Python script (copy/paste from the Anthropic console). For me, the Anthropic answer is not cut off, so it smells like something in chatbotui.
For the record, I just took a fresh venv with Python 3.12 and did a pip install anthropic python-dotenv (to store my API key in a separate .env file) => all other libs are dependencies.
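For reference, a bare-minimum script along the lines described above might look like this (the prompt and the `build_request` helper are illustrative additions, not the commenter's exact script; `client.messages.create()` is the Anthropic SDK's Messages API):

```python
import os

MODEL = "claude-3-opus-20240229"  # model under discussion in this thread


def build_request(prompt: str, max_tokens: int = 1000, temperature: float = 0.0) -> dict:
    """Assemble the kwargs for the Anthropic client's messages.create() call."""
    return {
        "model": MODEL,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "messages": [{"role": "user", "content": prompt}],
    }


def main() -> None:
    # Needs: pip install anthropic python-dotenv, and ANTHROPIC_API_KEY in a .env file
    import anthropic
    from dotenv import load_dotenv

    load_dotenv()  # reads ANTHROPIC_API_KEY from .env into the environment
    client = anthropic.Anthropic()  # picks the key up from the environment
    message = client.messages.create(**build_request("Say hello in one sentence."))
    print(message.content[0].text)


if __name__ == "__main__" and os.environ.get("ANTHROPIC_API_KEY"):
    main()
```

Running this directly against the API is what lets you rule the model out and point the finger at the UI layer instead.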
This is what I suspect is happening, making the claude-3-opus-20240229 model (and maybe other Claude/Anthropic models) totally useless in chatbotui? Just a hypothesis, but I can't think of another one that would better explain the behaviors and facts collected until now.
They could also be limiting responses to the IP of the ChatBotUI server. This could be proved by hosting ChatBot locally and seeing if the response is still cut. |
Still debugging this but debug console says pipeline broken when this occurs for me. I'll try to look into this tomorrow. |
Solved it: it's due to Vercel's maxDuration timeout (15 seconds if you check the logs). However, setting the API endpoint to use the Edge runtime fixes the issue: // export const runtime = "edge" Might've been an accidental oversight, but uncommenting this line in the anthropic endpoint fixes the issue. More information on why it happens: https://vercel.com/docs/functions/configuring-functions/duration
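For context, this is roughly what uncommenting that line looks like in a Next.js App Router endpoint (the file path and handler body are illustrative; only the `runtime` export is the actual fix from the comment):

```typescript
// app/api/chat/anthropic/route.ts  (path illustrative)

// Opt this route into Vercel's Edge runtime, which is not bound by the
// serverless function's default maxDuration timeout.
export const runtime = "edge";

export async function POST(req: Request): Promise<Response> {
  // ...call the Anthropic API and stream the completion back to the client...
  return new Response("ok");
}
```

The idea is that long streaming responses from Opus exceed the serverless function's time budget, while Edge functions can keep a stream open much longer.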
@tysonvolte Thanks for the suggestion! I tried that and have redeployed to Vercel, but it doesn't seem to help in my case. |
Can you check your Vercel logs and paste what error you're getting? |
Your mileage may vary, but I found the same issue occurred via the Workbench for Sonnet and Haiku; Opus worked fine. I put $5 on the account, and then they would all complete the queries correctly in the Workbench. Thanks @tysonvolte, however removing the // seemed to break my deployment, i.e. only a single word was returned. Based on the 10-second timeout, it's possibly running as a serverless function rather than an edge function, which is where @tysonvolte was heading, but it didn't work for me.
That's how all API models work; that's why token counts escalate quickly during a conversation: they compound. If you paste, say, 4k tokens in and it sends 4k back, then on your next prompt you send 8k plus your new prompt, and this goes on. I've put a cap on it to only include the history of the last 10 messages, but a better way would be to use a model to summarise/rationalise the previous messages/responses.
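The capping approach described above can be sketched as a small helper (the function name and message shape are illustrative, not ChatbotUI's actual code):

```python
def cap_history(messages: list[dict], max_messages: int = 10) -> list[dict]:
    """Keep any system messages plus only the last `max_messages` chat turns,
    so the compounding token count stays bounded as the conversation grows."""
    system = [m for m in messages if m["role"] == "system"]
    chat = [m for m in messages if m["role"] != "system"]
    return system + chat[-max_messages:]
```

A summarisation-based approach would instead replace the dropped turns with a single model-generated recap message, preserving more context per token.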
@tysonvolte @lgruen @rmkr @alexisdal @mckaywrigley Try this fix: #1571 BugFix: Bump Anthropic SDK to 0.18.0, resolve parsing issue. Please note that I cannot code, but this seemed to consistently fix the issue on my end.
@spammenotinoz That worked perfectly! Thank you very much.
I had the same issue as @spammenotinoz described in #1543 (comment). For reference, here are the Vercel logs (before @spammenotinoz's fix in #1571):
Yeah, turns out it wasn't just that line; I'd already bumped Anthropic's version for a separate issue, which is why it seemed like that fixed the problem. Great work, y'all!
I had the same issue, also fixed by #1571. Thank you @spammenotinoz! Hope it gets merged soon. |
Merged and fixed. @spammenotinoz you are a hero. |
I wanted to evaluate Claude 3 Opus, their latest and biggest model, which just ranked #3 in the Chatbot Arena ranking, the closest to GPT-4 Turbo to date.
So I just added some credits in the console, then created a new Claude API key that I pasted into chatbotui to try Claude 3 Opus. I used the default settings, but the first answer was immediately cut off in the middle of a sentence. That reminded me of the default max_length on the OpenAI playground, ridiculously limited to 256 tokens. But I didn't find this param in the settings at all. I just increased the context length to 200k tokens, just in case (though likely unrelated), and started a new chat with Claude 3 Opus. Screenshot below.
Yet the answers seem to be cut off / chopped.
=> In other words, I do not know what I'm doing wrong, but this is clearly not the expected behavior of Opus 3.