Skip to content

fix: Ignore the Chain-of-Thought in AI response #952

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 8, 2025

Conversation

gadfly3173
Copy link
Contributor

@gadfly3173 gadfly3173 commented Feb 5, 2025

I reviewed the code of the official OpenAI SDK, and it seems that the official SDK does not handle the direct return of the Chain-of-Thought <think>{thinking}</think>\n\n{content}. After switching to the official SDK, you may need to perform additional processing on the returned results.

@gadfly3173 gadfly3173 changed the title fix: Ignore the thinking chain in AI response fix: Ignore the Chain-of-Thought in AI response Feb 5, 2025
@gadfly3173 gadfly3173 marked this pull request as draft February 6, 2025 09:31
@gadfly3173 gadfly3173 marked this pull request as ready for review February 6, 2025 09:57
@gadfly3173
Copy link
Contributor Author

gadfly3173 commented Feb 6, 2025

Moved CoT removing method into GenerateCommitMessage to avoid remove think tags in prompts.

@gadfly3173
Copy link
Contributor Author

gadfly3173 commented Feb 7, 2025

Improve regex pattern to specifically match and capture content within <think>, <thought>, <thinking>, or <thought_chain> tags.

Can we merge this PR first? It removes some API service providers that do not support the openai reasoning_content field and adds CoT in the API response content.

Refer:

https://github.com/Mintplex-Labs/anything-llm/blob/4b71dd100593b3de96d6e077b816d1a8dcffa120/frontend/src/components/WorkspaceChat/ChatContainer/ChatHistory/ThoughtContainer/index.jsx#L16-L21

https://github.com/CherryHQ/cherry-studio/blob/762c3d4950983f11f5e5552fdbfab95e99d3f929/src/renderer/src/utils/formats.ts#L84-L100

@love-linger
Copy link
Collaborator

The openai-dotnet team will release a new version within a week. Please wait for the feat-openai branch being merged first.

@gadfly3173 gadfly3173 marked this pull request as draft February 7, 2025 06:47
- Improve chat response processing to handle thinking patterns using regular expressions.
- Migrate server value by removing trailing '/chat/completions' path.
@gadfly3173 gadfly3173 marked this pull request as ready for review February 8, 2025 03:51
@gadfly3173
Copy link
Contributor Author

Can we merge this now? I want to use deepseek-r1-distill-llama-70b from groq, which is free and fast. 😄

@love-linger
Copy link
Collaborator

I'll check this PR tomorrow

@love-linger
Copy link
Collaborator

I'm using deepseek api (DeepSeek official service) with deepseek-reasoner model to test it, but I can not see the <think>...</think output...

@gadfly3173
Copy link
Contributor Author

I'm using deepseek api (DeepSeek official service) with deepseek-reasoner model to test it, but I can not see the <think>...</think output...

The official API does not have this problem, but it may be encountered when deploying through open-source models, such as the siliconflow API that was just launched (it has now been modified) and groq (it seems they are not planning to change it). Discussions about this support can be found in the communities of various clients such as OpenWebUI, cherry-studio, etc.

@love-linger love-linger merged commit ef2e0a8 into sourcegit-scm:develop Feb 8, 2025
13 checks passed
@love-linger love-linger self-assigned this Feb 8, 2025
@love-linger love-linger added the enhancement New feature or request label Feb 8, 2025
@gadfly3173
Copy link
Contributor Author

I'm using deepseek api (DeepSeek official service) with deepseek-reasoner model to test it, but I can not see the <think>...</think output...我正在使用 deepseek api(DeepSeek 官方服务)与 deepseek-reasoner 模型进行测试,但看不到 <think>...</think 输出...

You can try groq, which is free 1000 reqs/day

@gadfly3173 gadfly3173 deleted the fix/openai-think branch February 8, 2025 09:32
@love-linger
Copy link
Collaborator

  • In the next release's CHANGELOG, I will add the BREAKING CHANGE about the Server field of OpenAI configuration. I do not want to add some hard-code to migrate the old one.
    image

  • How do you think that we just display the entire output of AI service, but remove the CoT while trying to apply it to the commit message?

@gadfly3173
Copy link
Contributor Author

  • In the next release's CHANGELOG, I will add the BREAKING CHANGE about the Server field of OpenAI configuration. I do not want to add some hard-code to migrate the old one.

I think remove /chat/completions is safe, also could avoid users input this by mistake. (It's highly unlikely that a real API provider would have an endpoint like "/chat/completions/chat/completions")

  • How do you think that we just display the entire output of AI service, but remove the CoT while trying to apply it to the commit message?

For most AI clients, it should be considered to separate the thinking chain from the display of answers. However, in the scenario where the AI returns results as submitted information here, the thinking chain is of no use. Moreover, the current processing results temporarily store the thinking chain in the thinkingBuffer, and it is also very easy to retrieve it later if someone really needs it.

@love-linger
Copy link
Collaborator

All right. I have no questions. Good job!

@gadfly3173
Copy link
Contributor Author

Comments as records:

DeepSeek updated the chat template https://huggingface.co/deepseek-ai/DeepSeek-R1/commit/8a58a132790c9935686eb97f042afa8013451c9f , which makes the final output not include the starting <think> tag (which is really strange...) . For the current implementation, handling this situation can become cumbersome. vLLM has made additional processing and may need to observe how other API providers and chat clients will react.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants