Pulse · deepseek-ai/DeepSeek-R1 · GitHub

March 4, 2025 – March 11, 2025

Overview

0 Active pull requests

29 Active issues
- 0 Merged pull requests
- 0 Open pull requests
- 5 Closed issues
- 24 New issues

There hasn’t been any commit activity on deepseek-ai/DeepSeek-R1 in the last week.

5 Issues closed by 3 people

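Suggestion: also release a directly quantized version of Deepseek-r1
#188 closed Mar 11, 2025
Could you provide a few data examples from pure RL training?
#529 closed Mar 7, 2025
Does the model have memory, fuzzy memory, and forgetting mechanisms?
#543 closed Mar 6, 2025
Some ideas on further optimizing the model
#541 closed Mar 6, 2025
Some ideas on further optimizing the model
#542 closed Mar 6, 2025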

24 Issues opened by 22 people

Improving AI learning efficiency: breaking out of the "independent events" training framework
#562 opened Mar 12, 2025
[Question] Issue with evaluation results and prompts on math500 dataset for qwen-R1distill-1.5B model
#561 opened Mar 11, 2025
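Question about model logs
#560 opened Mar 11, 2025
GRPO GPU memory
#559 opened Mar 11, 2025
How to restrict the content of the thinking output? It currently echoes the given prompt
#558 opened Mar 11, 2025
End marker <｜end▁of▁sentence｜>
#557 opened Mar 11, 2025
Create a WeChat group for efficient communication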
#556 opened Mar 11, 2025
think output question
#555 opened Mar 10, 2025
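Quantizing the DeepSeek-R1-Distill-Llama-8B model with auto-gptq fails with the error: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat2 in method wrapper_CUDA_bmm)
#554 opened Mar 9, 2025
What is the training data cutoff date for the 2025-01-20 version of R1?
#553 opened Mar 8, 2025
Please revert the formula rendering; formulas are now invisible in the dark theme
#552 opened Mar 8, 2025
Improving AI's data utilization: cultivating the ability to abstract from experience and transfer it to new applications
#551 opened Mar 7, 2025
How to skip think when calling through the openai interface
#550 opened Mar 7, 2025
Locally deployed deepseek32b model gives inaccurate links when answering from a local knowledge base
#549 opened Mar 7, 2025
Is there an official evaluation procedure that can reproduce the evaluation results?
#548 opened Mar 7, 2025
Cannot find the `deepseek-r1-main.py` file
#547 opened Mar 7, 2025
Server with 4 T4 GPUs always reports insufficient GPU memory when launching with vllm
#546 opened Mar 7, 2025
An attempt to reduce R1's frequent switching of approaches in its chain of thought: completeness rewards and penalties
#545 opened Mar 6, 2025
Is it possible to control the content output during thinking, e.g., keep the thinking text under 30 characters?
#544 opened Mar 6, 2025
How can R1 search the web?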
#540 opened Mar 6, 2025
Hide Reasoning feedback after response generated
#539 opened Mar 6, 2025
Are the mmlu and c-eval scores in the R1 paper subject averages or weighted averages?
#538 opened Mar 6, 2025
Potential Vulnerability: Context Injection via RAG Bypasses Content Filtering
#537 opened Mar 5, 2025
me too
#536 opened Mar 5, 2025

63 Unresolved conversations

Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.

Create briefing on DeepSeek R1
#530 commented on Mar 9, 2025 • 2 new comments
Request for GRPO Training Code Release
#187 commented on Mar 11, 2025 • 0 new comments
Incorrect Sensitivity Filtering for Chinese Medical Queries & Inconsistent Model Version Responses
#192 commented on Mar 11, 2025 • 0 new comments
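Bug: when testing illegal behavior with the "great-grandmother" exploit, the R1 model's reply to this case has sensitive-content problems
#196 commented on Mar 11, 2025 • 0 new comments
Is the IF-Eval result in Table 4 incorrect?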
#297 commented on Mar 10, 2025 • 0 new comments
DeepSeek-R1 model on HuggingFace Hub
#363 commented on Mar 10, 2025 • 0 new comments
[Problem] deepseek-r1's thinking process is too long, causing the output to stop
#379 commented on Mar 10, 2025 • 0 new comments
Rate Limit Reached When Accessing the Website
#138 commented on Mar 10, 2025 • 0 new comments
Feature Suggestion: Download Chats
#198 commented on Mar 10, 2025 • 0 new comments
Adding extra/unexpected spaces when generating code (at least for Java code)
#201 commented on Mar 10, 2025 • 0 new comments
Lack of Architecture Details in README
#206 commented on Mar 10, 2025 • 0 new comments
Hugging Face Support
#207 commented on Mar 10, 2025 • 0 new comments
vLLM Configuration
#208 commented on Mar 10, 2025 • 0 new comments
Benchmarking Script Access
#209 commented on Mar 10, 2025 • 0 new comments
Infinite loop on some queries
#211 commented on Mar 10, 2025 • 0 new comments
Suggestions for DeepSeek-R1
#501 commented on Mar 9, 2025 • 0 new comments
ds's built-in algorithm for converting between the lunar and Gregorian calendars is incorrect
#453 commented on Mar 5, 2025 • 0 new comments
Temperature
#185 commented on Mar 11, 2025 • 0 new comments
Bug using DeepSeek
#184 commented on Mar 11, 2025 • 0 new comments
On the speed of CPU inference
#183 commented on Mar 11, 2025 • 0 new comments
Locally deployed DeepSeek-R1-Distill-Qwen-32B outputs only </think>, with no <think>
#352 commented on Mar 11, 2025 • 0 new comments
deepseek-reasoner does not support successive user or assistant messages
#21 commented on Mar 11, 2025 • 0 new comments
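Is a system prompt needed when using the officially distilled Qwen models?
#417 commented on Mar 11, 2025 • 0 new comments
Could you provide a complete test sample for search_answer_zh_template?
#465 commented on Mar 11, 2025 • 0 new comments
How to turn off the thinking process of a locally deployed r1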
#512 commented on Mar 11, 2025 • 0 new comments
Voice Chat & AI Speech Feature
#156 commented on Mar 12, 2025 • 0 new comments
Feature Request: Voice Chat Functionality for the AI Chat Model
#155 commented on Mar 12, 2025 • 0 new comments
Optimized for Vulkan (AMD GPU)
#144 commented on Mar 12, 2025 • 0 new comments
section to update your password is not present
#121 commented on Mar 12, 2025 • 0 new comments
API-Platform is down
#120 commented on Mar 12, 2025 • 0 new comments
Made French version of README and added link to it in original README.md
#181 commented on Mar 11, 2025 • 0 new comments
docs: Add Chinese translation
#210 commented on Mar 5, 2025 • 0 new comments
JSONDecodeError: Expecting value: line 5 column 1 (char 4)
#91 commented on Mar 5, 2025 • 0 new comments
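Could you also open-source the prompts for file handling and web search?
#407 commented on Mar 6, 2025 • 0 new comments
Locally deployed DeepSeek-R1-671B outputs only </think>, with no <think>
#526 commented on Mar 6, 2025 • 0 new comments
Is there a method for fine-tuning and distilling deepseek?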
#477 commented on Mar 6, 2025 • 0 new comments
Feature Request; Chunked Transfer of large files
#263 commented on Mar 7, 2025 • 0 new comments
Image upload failed
#259 commented on Mar 7, 2025 • 0 new comments
Blue Color font on chat box [UI friendly]
#258 commented on Mar 7, 2025 • 0 new comments
A proposal for adding an official fine-tuning script
#254 commented on Mar 7, 2025 • 0 new comments
Deepseek web version (without deepthinking) repeatedly outputs the same few sentences within a single response.
#253 commented on Mar 7, 2025 • 0 new comments
Consultations regarding DeepSeekR1
#249 commented on Mar 7, 2025 • 0 new comments
Server is busy
#248 commented on Mar 7, 2025 • 0 new comments
Is it possible to have "Project" or "Folder" for better chat organization?
#535 commented on Mar 7, 2025 • 0 new comments
Questions about the DeepSeek-R1 distillation process
#113 commented on Mar 7, 2025 • 0 new comments
Is GRPO actually on-policy or off-policy?
#527 commented on Mar 7, 2025 • 0 new comments
Feature Request: Implement Session Memory/Persistent Context
#67 commented on Mar 7, 2025 • 0 new comments
A small question about DeepSeek-R1's reinforcement learning training process
#412 commented on Mar 7, 2025 • 0 new comments
Infinite loop
#242 commented on Mar 8, 2025 • 0 new comments
How to run r1 on a 4090 card?
#240 commented on Mar 8, 2025 • 0 new comments
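The model does not apply bold formatting correctly in its answer text
#239 commented on Mar 8, 2025 • 0 new comments
The implementation of the MLA-architecture KV Cache compressed storage strategy in the deepseek-r1 source code on huggingface seems inconsistent with what the paper describes. Why is that? The code does not appear to implement this major optimization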
#238 commented on Mar 8, 2025 • 0 new comments
Exploring Expert Activation, Fine-Tuning, and Stability Challenges in AI Reasoning
#235 commented on Mar 8, 2025 • 0 new comments
How to drive with GPU/NPU in windows ARM Platform?
#236 commented on Mar 8, 2025 • 0 new comments
Is it possible for people to donate GPU power for training upcoming models?
#232 commented on Mar 8, 2025 • 0 new comments
Everyone, can anyone tell me where the open-source code is?
#528 commented on Mar 8, 2025 • 0 new comments
Failure to observe spacing between words in Persian language responses
#230 commented on Mar 9, 2025 • 0 new comments
Request to Enable GitHub Discussions for Enhanced Community Collaboration
#225 commented on Mar 9, 2025 • 0 new comments
DeepSeek R1 Produces Inconsistent Formatting (Fonts, Spacing, and Style Issues)
#224 commented on Mar 9, 2025 • 0 new comments
Does a single prompt activate different groups of experts during reasoning inference?
#222 commented on Mar 9, 2025 • 0 new comments
Some factual errors in the model's knowledge
#217 commented on Mar 9, 2025 • 0 new comments
Text Formatting and Rendering Issue:
#216 commented on Mar 9, 2025 • 0 new comments
When the chat window is left unattended for a long time, multiple CloudFlare dialogs appear
#212 commented on Mar 9, 2025 • 0 new comments