# 1. Single-Turn Conversation: Implementing Basic Q&A

In [1]:
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model_name = "/root/autodl-tmp/Qwen3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
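# torch_dtype="auto" keeps the checkpoint's native precision and
# device_map="auto" places the weights on the available GPU(s)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

# Prepare the input (the prompt means "Please give a brief introduction to large models")
prompt = "请简单介绍一下大模型"
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Generate the answer
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32768)
# outputs[0] holds the prompt tokens followed by the generated tokens,
# so the decoded string starts with the "user" and "assistant" turns
response = tokenizer.decode(outputs[0], skip_special_tokens=True)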

# Print the answer
print("Response:", response)

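Response: user
请简单介绍一下大模型
assistant
<think>
Hmm, the user wants a brief introduction to large models. First, I need to gauge how much the user already knows about them. They may have only heard the term without being clear on what it actually means. So I should start from the basics and explain what a large model is, its characteristics, its application scenarios, and its strengths and weaknesses.

First, a large model usually refers to a machine learning model with a very large number of parameters, such as GPT-3 or BERT. A large parameter count means the model can capture more complex patterns and handle a wider range of tasks. However, the user may not know how parameter count relates to model capability, so a brief explanation is needed.

Next, the user may want to know where large models are applied, such as natural language processing, image recognition, and speech processing. Some common examples would help here, such as chatbots, translation tools, and content generation, to make this easier to understand.

Then, the pros and cons are also important. The advantages include strong generalization and multi-task capability, while the drawbacks include high demand for compute resources, high training cost, and possible bias and ethical issues. The user may care about these aspects, especially if they plan to use or develop large models.

In addition, the user may not be familiar with the technical details, so I should use plain language and avoid too much jargon. For example, when explaining parameter count, I can compare it to "the number of neurons in a brain", which is more vivid.

I should also consider the user's possible deeper needs. They may want to understand the practical value of large models, or whether such models suit their own use. So I can mention real examples such as ChatGPT and Qwen to make the introduction more convincing.

Finally, I should make sure the answer is clearly structured and organized into points so the user can take it in at a glance. I should also keep the tone conversational and avoid overly academic wording so the content is easier to digest.
</think>

A large model is a machine learning model with a huge number of parameters (typically more than a billion, or even on the order of trillions). Trained on massive amounts of data, such models have strong pattern-recognition and generalization abilities and can handle complex tasks. Below is a brief overview of their core characteristics and application scenarios:

---

### **1. Core Characteristics**
- **Huge parameter count**: The model has an enormous number of parameters (GPT-3, for example, has 175 billion), which lets it capture more complex patterns in language and data.
- **Multi-task capability**: It can perform diverse tasks such as text generation, translation, question answering, programming, and image generation.
- **Context understanding**: Given long text input, it understands the context and produces coherent, logical output.
- **Continual learning**: Some models support fine-tuning to adapt to specific domains or scenarios.

---

### **2. Application Scenarios**
- **Natural language processing (NLP)**: e.g., chatbots (such as ChatGPT), intelligent customer service, content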
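
The printed response above mixes the echoed prompt, the `<think>` block, and the final answer in one string. A minimal sketch of separating the reasoning from the answer is shown below. It assumes the generated ids have already been sliced to exclude the prompt, and that `</think>` is encoded as the single token id 151668, the value used in the Qwen3 model card (it can be checked with `tokenizer.convert_tokens_to_ids("</think>")`). The helper name `split_at_think_end` is hypothetical.

```python
THINK_END_ID = 151668  # assumed id of "</think>" in the Qwen3 tokenizer; verify before use


def split_at_think_end(output_ids, think_end_id=THINK_END_ID):
    """Split generated token ids into (thinking_ids, answer_ids).

    The split point is just after the last occurrence of think_end_id.
    If the marker is absent, everything is treated as the answer.
    """
    try:
        # Search from the end so the last "</think>" wins
        index = len(output_ids) - output_ids[::-1].index(think_end_id)
    except ValueError:
        index = 0
    return output_ids[:index], output_ids[index:]


# Toy ids standing in for real model output
thinking_ids, answer_ids = split_at_think_end([11, 12, THINK_END_ID, 21, 22])
print(thinking_ids)
print(answer_ids)
```

With the real model, `output_ids` would be `outputs[0][len(inputs.input_ids[0]):].tolist()`, and each part can then be passed to `tokenizer.decode(..., skip_special_tokens=True)`.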

# 2. Multi-Turn Conversation: Building a Coherent Dialogue Experience

## (1) Basic Multi-Turn Conversation

In [1]:
from transformers import AutoModelForCausalLM, AutoTokenizer

class QwenChatbot:
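    def __init__(self, model_name="/root/autodl-tmp/Qwen3-8B"):
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        # torch_dtype="auto" keeps the checkpoint's native precision and
        # device_map="auto" places the weights on the available GPU(s)
        self.model = AutoModelForCausalLM.from_pretrained(
            model_name, torch_dtype="auto", device_map="auto"
        )
        self.history = []  # user/assistant messages from earlier turns

    def generate_response(self, user_input):
        # Prepend the stored history so the model sees the whole conversation
        messages = self.history + [{"role": "user", "content": user_input}]

        text = self.tokenizer.apply_chat_template(
            messages,
            tokenize=False,
            add_generation_prompt=True
        )

        inputs = self.tokenizer(text, return_tensors="pt").to(self.model.device)
        output_ids = self.model.generate(**inputs, max_new_tokens=32768)[0]
        # Keep only the newly generated tokens by dropping the prompt prefix
        response_ids = output_ids[len(inputs.input_ids[0]):].tolist()
        response = self.tokenizer.decode(response_ids, skip_special_tokens=True)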

        # Update history
        self.history.append({"role": "user", "content": user_input})
        self.history.append({"role": "assistant", "content": response})

        return response

# Example Usage
if __name__ == "__main__":
    chatbot = QwenChatbot()

    # First input
    user_input_1 = "How many r's in strawberries?"
    print(f"User: {user_input_1}")
    response_1 = chatbot.generate_response(user_input_1)
    print(f"Bot: {response_1}")
    print("----------------------")

    # Second input
    user_input_2 = "Then, how many r's in blueberries?"
    print(f"User: {user_input_2}")
    response_2 = chatbot.generate_response(user_input_2)
    print(f"Bot: {response_2}")
    print("----------------------")

    # Third input
    user_input_3 = "Really?"
    print(f"User: {user_input_3}")
    response_3 = chatbot.generate_response(user_input_3)
    print(f"Bot: {response_3}")

User: How many r's in strawberries?
Bot: <think>
Okay, let's see. The user is asking how many times the letter 'r' appears in the word "strawberries". Alright, first I need to make sure I have the correct spelling of the word. Let me write it out: S-T-R-A-W-B-E-R-R-I-E-S. Wait, is that right? Let me check again. Strawberry... S-T-R-A-W-B-E-R-R-I-E-S. Yeah, that seems correct. Now, I need to count the number of 'r's.

Let me go through each letter one by one. Starting with the first letter: S. Not an 'r'. Next is T. Still not. Then R. There's an 'r' here. That's the first one. Then A, W, B, E, R. Oh, another 'r' here. So that's the second one. Then R again. Wait, after the first 'r', the next letters are A, W, B, E, R, R, I, E, S. So after the first 'r' (position 3), the next 'r' is at position 8 and then position 9? Wait, let me count again to be precise.

Breaking down the word letter by letter:

1. S
2. T
3. R
4. A
5. W
6. B
7. E
8. R
9. R
10. I
11. E
12. S

So positions 3, 8, and 9 
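
`generate_response` stores the full decoded reply, including the `<think>` block, in `self.history`. The Qwen3 model card recommends that history carry only the final answer, and states that the bundled chat template already drops earlier thinking content when it renders the prompt. For setups that do not rely on that template, a minimal sketch of stripping the block explicitly might look like this; `strip_thinking` and `append_turn` are hypothetical helper names, not part of `transformers`.

```python
import re

# Matches one <think>...</think> block plus any whitespace that follows it
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_thinking(response):
    """Return the reply with any complete <think>...</think> block removed."""
    return THINK_BLOCK.sub("", response).strip()


def append_turn(history, user_input, response):
    """Record one user/assistant exchange, keeping only the final answer."""
    history.append({"role": "user", "content": user_input})
    history.append({"role": "assistant", "content": strip_thinking(response)})
    return history


history = append_turn(
    [],
    "How many r's in strawberries?",
    "<think>\nS-T-R-A-W-B-E-R-R-I-E-S\n</think>\n\nThere are 3.",
)
print(history[-1]["content"])
```

A reply that is cut off before `</think>`, as in the truncated output above, is left unchanged by this sketch, so a real implementation would need to decide how to handle that case.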

## (2) Dynamically Controlling Thinking Mode

In [2]:
# Example Usage with Dynamic Thinking Control
if __name__ == "__main__":
    chatbot = QwenChatbot()

    # First input (thinking mode enabled by default)
    user_input_1 = "How many r's in strawberries?"
    print(f"User: {user_input_1}")
    response_1 = chatbot.generate_response(user_input_1)
    print(f"Bot: {response_1}")
    print("----------------------")

    # Second input with /no_think
    user_input_2 = "Then, how many r's in blueberries? /no_think"
    print(f"User: {user_input_2}")
    response_2 = chatbot.generate_response(user_input_2)
    print(f"Bot: {response_2}")
    print("----------------------")

    # Third input with /think
    user_input_3 = "Really? /think"
    print(f"User: {user_input_3}")
    response_3 = chatbot.generate_response(user_input_3)
    print(f"Bot: {response_3}")

User: How many r's in strawberries?
Bot: <think>
Okay, let's see. The user is asking how many times the letter 'r' appears in the word "strawberries". Hmm, first I need to make sure I have the correct spelling of the word. Let me write it out: S-T-R-A-W-B-E-R-R-I-E-S. Wait, is that right? Let me double-check. Strawberry is spelled S-T-R-A-W-B-E-R-R-I-E-S. Yes, that's correct. Now, I need to count the number of 'r's in that spelling.

Let me go through each letter one by one. Starting with the first letter: S – not an 'r'. Next is T – no. Then R – that's one. Then A – no. W – no. B – no. E – no. R – that's two. Then another R – three. Then I – no. E – no. S – no. So, in total, there are three 'r's. Wait, let me check again to make sure I didn't miss any. S-T-R-A-W-B-E-R-R-I-E-S. So positions 3, 8, and 9 are 'r's. Yes, that's three. I think that's correct. Maybe I should write it out with the letters and underline the 'r's to confirm. S-T-R-A-W-B-E-R-R-I-E-S. Underlining the third letter
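
The `/think` and `/no_think` tags in the cell above are soft switches placed in the user message; according to the Qwen3 model card, the model follows the most recent instruction in a multi-turn conversation. The bookkeeping can be sketched in plain Python as below. `thinking_mode_per_turn` is a hypothetical helper that only mirrors that rule, and it assumes a turn without a tag inherits the previous state; the model itself interprets the tags, so this code is not needed for inference.

```python
def thinking_mode_per_turn(user_inputs, default=True):
    """Return, for each user turn, whether thinking is expected to be on.

    A turn containing "/no_think" turns thinking off, one containing "/think"
    turns it on, and a turn with neither tag inherits the previous state
    (an assumption of this sketch). If both tags appear in one turn, the one
    that occurs later in the text wins.
    """
    modes = []
    current = default
    for text in user_inputs:
        on_pos = text.rfind("/think")      # does not match inside "/no_think"
        off_pos = text.rfind("/no_think")
        if on_pos != -1 or off_pos != -1:
            current = on_pos > off_pos
        modes.append(current)
    return modes


turns = [
    "How many r's in strawberries?",
    "Then, how many r's in blueberries? /no_think",
    "Really? /think",
]
print(thinking_mode_per_turn(turns))
```

The model card also describes a hard switch, `enable_thinking=False` passed to `tokenizer.apply_chat_template`, under which the soft tags are ignored and no thinking content is generated.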