Token Limiting Associated with Reasoning LLMs #1322
this-is-sebs started this conversation in General · 1 comment · 4 replies
-
Hello,
I have been using Obsidian Copilot for a while now. With the advent of reasoning models, I am noticing that Copilot sometimes returns no output because of the max token limit.
For example: I ask o3-mini-high a question that elicits ~2,000 reasoning tokens, but the output limit set in Copilot is 1,000 tokens. Copilot then returns nothing, because the reasoning tokens count against the limit but are never returned as visible output. This often happens even when I increase the token limit through Copilot; a complex question might use 12k–19k reasoning tokens.
I was wondering if there is a specific way, via an API variable, to limit the reasoning tokens before the model synthesizes them into output tokens. Additionally, might it be possible to raise the LLM max output tokens beyond the current maximum? This is just a problem I have stumbled into when using reasoning models with your platform. I know that some other reasoning models (DeepSeek...) may output even more verbose reasoning, beyond 19k tokens. I wasn't sure if it is possible to separate the two, reasoning and output tokens. The costs associated with reasoning tokens are also worth considering.
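For reference, OpenAI's API already exposes two relevant knobs for its reasoning models: `max_completion_tokens`, a shared budget covering both reasoning and visible output tokens, and `reasoning_effort`, which trades reasoning depth for fewer hidden tokens. Below is a minimal sketch using the official `openai` Node client; the model name, budget, and prompt are placeholder values, not something Copilot does today:

```typescript
import OpenAI from "openai";

// Minimal sketch for OpenAI reasoning models (e.g. o3-mini): cap the shared
// budget for reasoning + visible output, and dial reasoning verbosity down.
async function main() {
  const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

  const response = await client.chat.completions.create({
    model: "o3-mini",
    reasoning_effort: "low",       // "low" | "medium" | "high" (o-series only)
    max_completion_tokens: 4000,   // counts reasoning tokens AND output tokens
    messages: [{ role: "user", content: "Summarize this note..." }],
  });

  // usage.completion_tokens_details.reasoning_tokens reports how much of the
  // budget was consumed by hidden reasoning rather than the visible answer.
  console.log(response.choices[0].message.content);
  console.log(response.usage?.completion_tokens_details?.reasoning_tokens);
}

main();
```

Note that if `max_completion_tokens` is smaller than the reasoning the model wants to do, you get exactly the failure described above: the budget is exhausted before any visible output is produced.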
-
Thanks for bringing this up. I think I'll first bump the max tokens limit in the settings, and then look into whether it's possible to treat reasoning tokens separately. Since reasoning model APIs have various different formats at the moment, it's unclear whether that is possible.
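To illustrate the format differences mentioned in the reply: Anthropic's extended-thinking API takes an explicit `budget_tokens` count nested under a `thinking` field, a different shape from OpenAI's `reasoning_effort` enum, which is part of why a uniform per-provider treatment is tricky. A sketch assuming the `@anthropic-ai/sdk` client; the model name and budgets are placeholders:

```typescript
import Anthropic from "@anthropic-ai/sdk";

// Sketch of Anthropic's extended-thinking shape: the reasoning budget is an
// explicit token count, and max_tokens must exceed that budget so some room
// is left for the visible answer.
async function main() {
  const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

  const msg = await client.messages.create({
    model: "claude-3-7-sonnet-latest",
    max_tokens: 16000,                                  // thinking + visible output
    thinking: { type: "enabled", budget_tokens: 8000 }, // cap on reasoning tokens
    messages: [{ role: "user", content: "Summarize this note..." }],
  });

  // Thinking blocks and text blocks come back as separate content types, so a
  // client could count (or discard) reasoning tokens independently here.
  for (const block of msg.content) {
    if (block.type === "text") console.log(block.text);
  }
}

main();
```

Because the two providers expose the budget so differently (an effort enum versus a raw token count), any per-model reasoning-token setting in Copilot would need provider-specific handling rather than a single shared field.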