Insights: deepseek-ai/DeepSeek-R1
Overview
- 0 Active pull requests
- 0 Merged pull requests
- 0 Open pull requests
- 5 Closed issues
- 24 New issues
There hasn’t been any commit activity on deepseek-ai/DeepSeek-R1 in the last week.
5 Issues closed by 3 people
-
Suggestion: release a directly quantized version of Deepseek-r1
#188 closed
Mar 11, 2025 -
Could you provide a few data examples from pure RL training?
#529 closed
Mar 7, 2025 -
Does the model have memory, fuzzy-memory, and forgetting mechanisms?
#543 closed
Mar 6, 2025 -
Some ideas on further optimizing the model
#541 closed
Mar 6, 2025 -
Some ideas on further optimizing the model
#542 closed
Mar 6, 2025
24 Issues opened by 22 people
-
Improving AI learning efficiency: breaking the "independent events" training framework
#562 opened
Mar 12, 2025 -
Question about model logs
#560 opened
Mar 11, 2025 -
GRPO GPU memory
#559 opened
Mar 11, 2025 -
How to restrict the content of the thinking output? It currently echoes the given prompt
#558 opened
Mar 11, 2025 -
End marker <|end▁of▁sentence|>
#557 opened
Mar 11, 2025 -
Create a WeChat group for efficient communication
#556 opened
Mar 11, 2025 -
think output question
#555 opened
Mar 10, 2025 -
What is the training data cutoff date for the 2025-01-20 version of R1?
#553 opened
Mar 8, 2025 -
Please revert the formula rendering; formulas are now invisible in the dark interface
#552 opened
Mar 8, 2025 -
Improving AI's data utilization: cultivating the ability to abstract from experience and transfer it to new applications
#551 opened
Mar 7, 2025 -
How to skip think when calling through the openai interface
#550 opened
Mar 7, 2025 -
Locally deployed deepseek32b model gives inaccurate links when answering from a local knowledge base
#549 opened
Mar 7, 2025 -
Is there an official evaluation procedure for reproducing the evaluation results?
#548 opened
Mar 7, 2025 -
Cannot find the `deepseek-r1-main.py` file
#547 opened
Mar 7, 2025 -
On a 4-GPU T4 server, launching with vllm always fails with an insufficient GPU memory error
#546 opened
Mar 7, 2025 -
An attempt to improve R1's frequent switching of approach within its chain of thought: completeness rewards and penalties
#545 opened
Mar 6, 2025 -
Is it possible to control the content output during thinking, e.g. keep the thinking text under 30 characters?
#544 opened
Mar 6, 2025 -
How to use web search with R1
#540 opened
Mar 6, 2025 -
Hide Reasoning feedback after response generated
#539 opened
Mar 6, 2025 -
Are the mmlu and c-eval scores in the R1 paper subject averages or weighted averages?
#538 opened
Mar 6, 2025 -
Potential Vulnerability: Context Injection via RAG Bypasses Content Filtering
#537 opened
Mar 5, 2025 -
me too
#536 opened
Mar 5, 2025
63 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Create briefing on DeepSeek R1
#530 commented on
Mar 9, 2025 • 2 new comments -
Request for GRPO Training Code Release
#187 commented on
Mar 11, 2025 • 0 new comments -
Incorrect Sensitivity Filtering for Chinese Medical Queries & Inconsistent Model Version Responses
#192 commented on
Mar 11, 2025 • 0 new comments -
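Bug: when testing illegal behavior with the "great-grandmother" exploit, the R1 model's replies to such cases have sensitive-content problems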
#196 commented on
Mar 11, 2025 • 0 new comments -
Are the IF-Eval results in Table4 incorrect?
#297 commented on
Mar 10, 2025 • 0 new comments -
DeepSeek-R1 model on HuggingFace Hub
#363 commented on
Mar 10, 2025 • 0 new comments -
[Problem] deepseek-r1's thinking process is too long, causing output to stop
#379 commented on
Mar 10, 2025 • 0 new comments -
Rate Limit Reached When Accessing the Website
#138 commented on
Mar 10, 2025 • 0 new comments -
Feature Suggestion: Download Chats
#198 commented on
Mar 10, 2025 • 0 new comments -
Adding extra/unexpected spaces when generating code (at least for Java code)
#201 commented on
Mar 10, 2025 • 0 new comments -
Lack of Architecture Details in README
#206 commented on
Mar 10, 2025 • 0 new comments -
Hugging Face Support
#207 commented on
Mar 10, 2025 • 0 new comments -
vLLM Configuration
#208 commented on
Mar 10, 2025 • 0 new comments -
Benchmarking Script Access
#209 commented on
Mar 10, 2025 • 0 new comments -
Infinite loop on some queries
#211 commented on
Mar 10, 2025 • 0 new comments -
Suggestions for DeepSeek-R1
#501 commented on
Mar 9, 2025 • 0 new comments -
The conversion algorithm between the lunar and Gregorian calendars built into ds is incorrect
#453 commented on
Mar 5, 2025 • 0 new comments -
Temperature
#185 commented on
Mar 11, 2025 • 0 new comments -
Bug using DeepSeek
#184 commented on
Mar 11, 2025 • 0 new comments -
On the speed of CPU inference
#183 commented on
Mar 11, 2025 • 0 new comments -
Locally deployed DeepSeek-R1-Distill-Qwen-32B outputs only </think>, with no <think>
#352 commented on
Mar 11, 2025 • 0 new comments -
deepseek-reasoner does not support successive user or assistant messages
#21 commented on
Mar 11, 2025 • 0 new comments -
Is a system prompt needed when using the officially distilled Qwen models?
#417 commented on
Mar 11, 2025 • 0 new comments -
Could you provide a complete test sample for search_answer_zh_template?
#465 commented on
Mar 11, 2025 • 0 new comments -
How to turn off the thinking process of a locally deployed r1
#512 commented on
Mar 11, 2025 • 0 new comments -
Voice Chat & AI Speech Feature
#156 commented on
Mar 12, 2025 • 0 new comments -
Feature Request: Voice Chat Functionality for the AI Chat Model
#155 commented on
Mar 12, 2025 • 0 new comments -
Optimized for Vulcan (AMD GPU)
#144 commented on
Mar 12, 2025 • 0 new comments -
section to update your password is not present
#121 commented on
Mar 12, 2025 • 0 new comments -
API-Platform is down
#120 commented on
Mar 12, 2025 • 0 new comments -
Made French version of README and added link to it in original README.md
#181 commented on
Mar 11, 2025 • 0 new comments -
docs: Add Chinese translation
#210 commented on
Mar 5, 2025 • 0 new comments -
JSONDecodeError: Expecting value: line 5 column 1 (char 4)
#91 commented on
Mar 5, 2025 • 0 new comments -
Could you also open-source the prompts for file processing and web search?
#407 commented on
Mar 6, 2025 • 0 new comments -
Locally deployed DeepSeek-R1-671B outputs only </think>, with no <think>
#526 commented on
Mar 6, 2025 • 0 new comments -
Is there a method for fine-tuning and distilling deepseek?
#477 commented on
Mar 6, 2025 • 0 new comments -
Feature Request; Chunked Transfer of large files
#263 commented on
Mar 7, 2025 • 0 new comments -
Image upload failed
#259 commented on
Mar 7, 2025 • 0 new comments -
Blue Color font on chat box [UI friendly]
#258 commented on
Mar 7, 2025 • 0 new comments -
A proposal for adding an official fine-tuning script
#254 commented on
Mar 7, 2025 • 0 new comments -
Deepseek web version (without deepthinking) repeatedly outputs the same few sentences within a single response.
#253 commented on
Mar 7, 2025 • 0 new comments -
Consultations regarding DeepSeekR1
#249 commented on
Mar 7, 2025 • 0 new comments -
Server is busy
#248 commented on
Mar 7, 2025 • 0 new comments -
Is it possible to have "Project" or "Folder" for better chat organization?
#535 commented on
Mar 7, 2025 • 0 new comments -
Questions about the DeepSeek-R1 distillation process
#113 commented on
Mar 7, 2025 • 0 new comments -
Is GRPO actually on-policy or off-policy?
#527 commented on
Mar 7, 2025 • 0 new comments -
Feature Request: Implement Session Memory/Persistent Context
#67 commented on
Mar 7, 2025 • 0 new comments -
A small question about the DeepSeek-R1 reinforcement learning training process
#412 commented on
Mar 7, 2025 • 0 new comments -
Infinite loop
#242 commented on
Mar 8, 2025 • 0 new comments -
How to run r1 on a 4090 card?
#240 commented on
Mar 8, 2025 • 0 new comments -
The model does not apply bold formatting correctly in its answer text
#239 commented on
Mar 8, 2025 • 0 new comments -
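The implementation of the MLA architecture's KV Cache compressed-storage strategy in the deepseek-r1 source code on huggingface seems inconsistent with what the paper describes. Why? The code does not appear to implement this major optimization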
#238 commented on
Mar 8, 2025 • 0 new comments -
Exploring Expert Activation, Fine-Tuning, and Stability Challenges in AI Reasoning
#235 commented on
Mar 8, 2025 • 0 new comments -
How to drive with GPU/NPU in windows ARM Platform?
#236 commented on
Mar 8, 2025 • 0 new comments -
Is it possible to donate GPU power by people for training of upcoming models ?
#232 commented on
Mar 8, 2025 • 0 new comments -
Everyone, can someone tell me where the open-source code is?
#528 commented on
Mar 8, 2025 • 0 new comments -
Failure to observe spacing between words in Persian language responses
#230 commented on
Mar 9, 2025 • 0 new comments -
Request to Enable GitHub Discussions for Enhanced Community Collaboration
#225 commented on
Mar 9, 2025 • 0 new comments -
DeepSeek R1 Produces Inconsistent Formatting (Fonts, Spacing, and Style Issues)
#224 commented on
Mar 9, 2025 • 0 new comments -
Does single prompt activate different group of expert during reasoning inference?
#222 commented on
Mar 9, 2025 • 0 new comments -
Some factual errors in the model's knowledge
#217 commented on
Mar 9, 2025 • 0 new comments -
Text Formatting and Rendering Issue:
#216 commented on
Mar 9, 2025 • 0 new comments -
When the chat window is left unattended for a long time, multiple CloudFlare dialogs appear
#212 commented on
Mar 9, 2025 • 0 new comments