# DeepSeek Distilled Qianfan LLMs

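This document introduces the Qianfan series of models distilled from DeepSeek-R1, covering the training method, benchmark results, and a usage example.

## 1. Training Method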

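Training of the Qianfan series models proceeds in two main stages:

1. **Continued pre-training**: A high-quality pre-training corpus of more than 3T tokens was built from Baidu's internal search data, encyclopedia content, educational resources, articles, and novels, together with external open-source datasets such as Fineweb, RedPajama, Star-Code, WuDao, and MathPile. All of this data passed through strict pipeline filtering and in-depth processing.

2. **Efficient distillation**: To build a high-quality distillation dataset, hundreds of millions of SFT samples were strictly deduplicated, yielding tens of millions of question-answer pairs (including code and math data). Reasoning models such as DeepSeek-R1 served as teachers to generate high-quality outputs, and data quality was enforced through rule-based filtering, code-execution verification, model-based scoring, and difficulty grading. The resulting several million high-quality distillation samples were used for supervised fine-tuning of the base models, producing the 8B and 70B distilled models.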
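The deduplication and rule-filtering steps described above can be illustrated with a minimal sketch. Everything in it is hypothetical: the `Sample` record, the `dedupe` and `rule_filter` helpers, and the specific rules (a minimum length and the presence of a `\boxed{...}` answer) are illustrative assumptions, not the pipeline actually used for the Qianfan models. Code-execution verification, model-based scoring, and difficulty grading are omitted.

```python
import hashlib
import re
from dataclasses import dataclass


@dataclass
class Sample:
    """One question with a teacher-generated response (hypothetical schema)."""
    question: str
    teacher_output: str


def _normalize(text: str) -> str:
    # Lowercase and collapse whitespace so trivially different copies hash the same.
    return re.sub(r"\s+", " ", text.strip().lower())


def dedupe(samples):
    """Exact-match deduplication on the normalized question text."""
    seen = set()
    unique = []
    for s in samples:
        key = hashlib.sha256(_normalize(s.question).encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(s)
    return unique


def rule_filter(samples, min_chars=20):
    """Keep samples whose teacher output is long enough and has a boxed answer.

    Both rules are illustrative assumptions, not the rules from the source.
    """
    kept = []
    for s in samples:
        if len(s.teacher_output) >= min_chars and "\\boxed{" in s.teacher_output:
            kept.append(s)
    return kept


raw = [
    Sample("What is 2 + 2?", "2 + 2 equals 4, so the answer is \\boxed{4}."),
    Sample("what is  2 + 2?", "The sum is \\boxed{4} because 2 + 2 = 4."),
    Sample("What is 3 * 3?", "9"),
]
clean = rule_filter(dedupe(raw))
print(len(raw), "->", len(clean))  # prints: 3 -> 1
```

In this toy run the second sample is dropped as a duplicate of the first, and the third is dropped because its output is too short and has no boxed answer. A production pipeline would likely use near-duplicate detection rather than exact hashing.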

## 2. Model Performance on Public Benchmarks

### DeepSeek-R1-Distill-Qianfan-8B Performance

| Model | MMLU(5-shot) | MMLU-Pro(5-shot) | WinoGrande(5-shot) | ARC-C(5-shot) | CMMLU(5-shot) | C-Eval(5-shot) | GSM8K(4-shot) | MATH(4-shot) | CMATH | HumanEval(0-shot) | MBPP(3-shot) | AIME2024 | GPQA | MATH500 |
|:------|---------------:|-------------------:|---------------------:|---------------:|----------------:|-----------------:|----------------:|---------------:|--------:|------------------:|---------------:|-----------:|-------:|----------:|
| DeepSeek-R1-Distill-Qianfan-8B | 77.2 | 49.6 | 74.03 | 54.9 | 69.68 | 67.76 | 87.64 | 78.78 | 88.17 | 60.37 | 49.6 | 30 | 42 | 86.54 |
| DeepSeek-R1-Distill-Llama-8B | 65.05 | 36.35 | 63.38 | 38.98 | 49.54 | 49.93 | 74.98 | 56.18 | 84.5 | 68.9 | 60 | 50 | 45.5 | 86 |
| DeepSeek-R1-Distill-Qwen-7B | 62.39 | 41.21 | 54.14 | 37.63 | 53.54 | 57.52 | 77.33 | 60.56 | 79.5 | 70.73 | 60.8 | 60 | 47.98 | 89.4 |

### DeepSeek-R1-Distill-Qianfan-70B Performance

| Model | MMLU(5-shot) | MMLU-Pro(5-shot) | WinoGrande(5-shot) | ARC-C(5-shot) | CMMLU(5-shot) | C-Eval(5-shot) | GSM8K(4-shot) | MATH(4-shot) | CMATH | HumanEval(0-shot) | MBPP(3-shot) | AIME2024 | GPQA | MATH500 |
|:------|---------------:|-------------------:|---------------------:|---------------:|----------------:|-----------------:|----------------:|---------------:|--------:|------------------:|---------------:|-----------:|-------:|----------:|
| DeepSeek-R1-Distill-Qianfan-70B | 88 | 69.6 | 85.08 | 64.41 | 85.9 | 84.87 | 88.02 | 92.97 | 94.9 | 85.98 | 78 | 64.17 | 58.46 | 93.5 |
| DeepSeek-R1-Distill-Llama-70B | 87.06 | 71.7 | 81.61 | 56.95 | 72.54 | 74.31 | 80.74 | 82.26 | 89.83 | 88.41 | 81.4 | 65 | 62.1 | 94.4 |

## 3. Usage Example

The following example calls a Qianfan model through its OpenAI-compatible interface, using the `openai` Python SDK:

In [None]:
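import os
from openai import OpenAI

# Read the API key from the environment and fail early if it is missing
qianfan_api_key = os.environ.get("QIANFAN_TOKEN")
if not qianfan_api_key:
    raise RuntimeError("Set the QIANFAN_TOKEN environment variable before running this cell.")

# Initialize the client against the Qianfan endpoint
client = OpenAI(api_key=qianfan_api_key, base_url="https://qianfan.baidubce.com/v2")

# Send the request
response = client.chat.completions.create(
    model="qianfan-8b",  # choose qianfan-8b or qianfan-70b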
    messages=[ {"role": "user", "content": "Every morning, Aya does a $9$ kilometer walk, and then finishes at the coffee shop. One day, she walks at $s$ kilometers per hour, and the walk takes $4$ hours, including $t$ minutes at the coffee shop. Another morning, she walks at $s+2$ kilometers per hour, and the walk takes $2$ hours and $24$ minutes, including $t$ minutes at the coffee shop. This morning, if she walks at $s+\\frac12$ kilometers per hour, how many minutes will the walk take, including the $t$ minutes at the coffee shop?\n\nPlease reason step by step, and put your final answer within \\boxed{}.\n"}],
    stream=False,
    temperature=0.6,
    top_p=0.95,
    max_tokens=16000
)

In [None]:
# Print the reasoning content
print("Reasoning content:")
print(response.choices[0].message.reasoning_content)

In [None]:
# Print the final answer
print("Final answer:")
print(response.choices[0].message.content)
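
Because the prompt asks the model to put its final answer within `\boxed{}`, the answer can be pulled out of the response text programmatically. The sketch below is one possible way to do this; `extract_boxed` is a hypothetical helper, not part of the Qianfan or OpenAI SDKs, and it assumes the last `\boxed{...}` in the text holds the final answer.

```python
def extract_boxed(text):
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    i = start + len(marker)
    depth = 1  # we are inside the opening brace of \boxed{
    chars = []
    while i < len(text):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Matching close brace found: everything collected is the answer.
                return "".join(chars)
        chars.append(ch)
        i += 1
    return None  # braces never balanced, so no well-formed answer


sample = "The result is \\boxed{\\frac{1}{2}}."
print(extract_boxed(sample))  # prints: \frac{1}{2}
```

With the response from the cells above, this could be called as `extract_boxed(response.choices[0].message.content)`. Tracking brace depth, rather than using a simple regular expression, keeps nested LaTeX such as `\frac{1}{2}` intact.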