# C6: Prompt Engineering

## Text Generation Models

### Choosing a Model

Start with a small open-source model, for example `microsoft/Phi-3-mini-4k-instruct`.

In [2]:
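import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from util import init_torch_device  # local helper, not shown; used here as a torch.device


model_name = "microsoft/Phi-3-mini-4k-instruct"
device = init_torch_device()

# Load the model weights onto the selected device.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map=device.type,
    torch_dtype='auto',
    trust_remote_code=False,
)
tokenizer = AutoTokenizer.from_pretrained(
    model_name,
)

# Wrap model and tokenizer in a text-generation pipeline.
pipe = pipeline(
    'text-generation',
    model=model,
    tokenizer=tokenizer,
    return_full_text=False,  # return only the completion, not the prompt
    max_new_tokens=500,      # upper bound on generated tokens
    do_sample=False,         # greedy decoding: always pick the most likely token
)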


Device set to use mps
The following generation flags are not valid and may be ignored: ['temperature']. Set `TRANSFORMERS_VERBOSITY=info` for more details.


In [4]:
messages = [
    {
        "role": "user",
        "content": "Explain the concept of prompt engineering in a way that is easy to understand."
    }
]

output = pipe(messages)
print(output)



[{'generated_text': " Prompt engineering is like being a chef who knows exactly how to ask a recipe to make the perfect dish. Just as a chef chooses the right ingredients and follows the steps carefully, prompt engineering involves crafting the right questions or instructions to get the best possible answers from a computer program. It's about knowing exactly what you want to achieve and how to communicate that to the computer in a way that it understands and responds effectively."}]


In [5]:
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False)
print(prompt)

<|user|>
Explain the concept of prompt engineering in a way that is easy to understand.<|end|>
<|endoftext|>


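### Controlling Model Output

When the `pipeline` is created, `do_sample` controls whether tokens are sampled. To steer the output with `temperature` and `top_p`, you must set `do_sample=True`; otherwise these flags may be ignored, as the warning above indicates.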

#### temperature

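Determines the randomness, or creativity, of the generated text. Conceptually, `temperature=0` produces the same response every time, because the most likely token is always chosen; in `transformers`, the equivalent is greedy decoding with `do_sample=False`.

* A high `temperature` (e.g. 0.8) produces more diverse output
* A low `temperature` (e.g. 0.2) produces more deterministic output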
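
As a minimal sketch of what `temperature` does, the snippet below divides a set of made-up logits by the temperature before applying softmax. The function name `softmax_with_temperature` and the logit values are illustrative assumptions, not part of `transformers`.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw logits into probabilities after temperature scaling."""
    if temperature <= 0:
        raise ValueError("temperature must be strictly positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three candidate tokens.
logits = [2.0, 1.0, 0.5]
low = softmax_with_temperature(logits, 0.2)
high = softmax_with_temperature(logits, 0.8)
print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

The lower temperature concentrates almost all probability on the top token, while the higher one leaves more probability for the alternatives.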

#### top_p

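Top-p sampling, also known as nucleus sampling.

Only the most probable tokens are considered: tokens are added in descending order of probability until their cumulative probability reaches the limit.

For example, with `top_p=0.1` the model starts from the most probable token and keeps adding tokens until the cumulative probability reaches 0.1.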
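
The cumulative cut-off can be sketched in plain Python. The helper `top_p_filter` and the toy distribution are assumptions made for illustration; real libraries apply the same idea to the full vocabulary.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of top tokens whose cumulative probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = {}
    cumulative = 0.0
    for token, p in ranked:
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize so the kept probabilities sum to 1 before sampling.
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}


# Hypothetical next-token distribution.
probs = {"cat": 0.5, "dog": 0.3, "bird": 0.15, "fish": 0.05}
narrow = top_p_filter(probs, 0.1)
wide = top_p_filter(probs, 0.9)
print(narrow)
print(wide)
```

With `top_p=0.1` only the single most probable token survives; with `top_p=0.9` the candidate set grows to three tokens.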

#### top_k

Only the `k` most likely tokens are considered.
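
A minimal sketch of the same idea for `top_k`, again using a hypothetical helper (`top_k_filter`) and a made-up distribution:

```python
def top_k_filter(probs, k):
    """Keep only the k most probable tokens and renormalize their probabilities."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]
    total = sum(p for _, p in ranked)
    return {token: p / total for token, p in ranked}


# Hypothetical next-token distribution.
probs = {"cat": 0.5, "dog": 0.3, "bird": 0.15, "fish": 0.05}
top_two = top_k_filter(probs, 2)
print(top_two)
```

Unlike `top_p`, the size of the candidate set is fixed at `k` regardless of how the probability mass is distributed.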

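#### Use Cases

| Use case | temperature | top_p | Description |
|---|---|---|---|
|Brainstorming|High|High|Highly random output drawn from a large set of candidate tokens. Results are highly diverse, often creative and unexpected|
|Email generation|Low|Low|Highly stable output from a small token set. Results are predictable, focused, and conservative|
|Creative writing|High|Low|Highly random output from a small token set. Results are creative yet remain coherent|
|Translation|Low|High|Highly stable output from a large token set. The wider token range gives translations more linguistic variety|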

In [7]:
output = pipe(
    messages,
    do_sample=True,
    temperature=1,
)
print(output[0]['generated_text'])

 Prompt engineering is like being a chef who knows exactly what ingredients to mix to create the perfect meal. In this case, the chef is carefully crafting a question or a set of instructions (prompts) for a computer system like Chatbot or AI. The goal is to guide the AI to digest the information, process it, and then provide a helpful answer or carry out a task just like Chatbot did in our previous example. The art of prompt engineering lies in the ability to word these prompts effectively, making sure they align well with the context, tone, and expected response the AI is going to provide. Imagine it like setting the table before a fancy dinner party – you have to ensure everything is just right before your guests (in this case, the AI) arrive. Through the skills of constructing the right prompts, we can get the AI to serve exactly what we need, like Chatbot did with the trivia game.


In [8]:
output = pipe(
    messages,
    do_sample=True,
    top_p=1,
)
print(output[0]['generated_text'])

 Prompt engineering is like giving clear directions to a robot. Imagine you're trying to get your robot friend to bake a cake. You'd tell it exactly what ingredients to use, how to mix them, and how long to bake it. Prompt engineering works the same way but with computers. When you ask a computer to do something, like write a story or translate a sentence, you give it a 'prompt.' That's like your directions for the computer. The more clear and detailed your prompt is, the better the computer can understand and do what you want. It's all about how you phrase your question or command, just like how you'd tell a robot how to bake the perfect cake.


## Prompt Engineering

### Basic Ingredients

### Instruction-Based Prompting

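Common tasks:
* Supervised classification
* Search
* Summarization
* Code generation
* Named entity recognition

Some techniques:
* Specificity: describe precisely what you want to achieve
* Hallucination: ask the model to answer only if it knows the answer, and to say it does not know otherwise
* Order: place the instruction at the beginning or the end of the prompt. In long prompts the middle is often forgotten; LLMs tend to focus on the beginning (primacy effect) and the end (recency effect)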
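
The three techniques above can be combined in one prompt. The sketch below assumes a hypothetical helper, `build_instruction_prompt`; the guard wording and the choice to repeat the instruction at the end are illustrative, not prescribed.

```python
def build_instruction_prompt(instruction, text):
    """Build a prompt with a specific instruction at both edges and a hallucination guard."""
    guard = "If the text does not contain the answer, reply exactly: I don't know."
    return "\n\n".join([
        instruction,               # instruction first (primacy effect)
        f"Text:\n{text}",          # long material goes in the middle
        f"{instruction} {guard}",  # instruction repeated last (recency effect)
    ])


instruction = "List every person named in the text, one name per line."
text = "Ada Lovelace corresponded with Charles Babbage about the Analytical Engine."
prompt = build_instruction_prompt(instruction, text)
print(prompt)
```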

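### Advanced Prompt Engineering

Advanced components:
* Persona: for example, "You are a physicist"
* Instruction: be specific and leave little room for interpretation
* Context: additional information such as a description or background; it answers "why is this instruction being given?"
* Format: the format of the output text
* Audience: who the generated text is for, which sets the level of the output
* Tone: the tone the generated text should use
* Data: the main data related to the task itself

Iterate: adjust which components the prompt includes and the order in which they appear.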

Note: prompts and prompting strategies perform differently across models.
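
One way to make iteration cheap is to keep each component as a separate string and assemble the prompt from a list of names. This is a sketch only: the component texts and the helper `assemble_prompt` are made up for illustration.

```python
# Hypothetical component texts; replace them with your own.
components = {
    "persona": "You are an expert in large language models.",
    "instruction": "Summarize the key findings of the text provided.",
    "context": "The summary helps busy researchers decide whether to read the full text.",
    "format": "Write a bullet-point list followed by a one-paragraph conclusion.",
    "audience": "The readers are researchers who are new to the field.",
    "tone": "Use a professional and clear tone.",
    "data": "Text to summarize: <your text here>",
}


def assemble_prompt(components, order):
    """Join the selected components, in the given order, into one prompt string."""
    return "\n".join(components[name] for name in order)


# Full prompt with every component.
full_prompt = assemble_prompt(
    components,
    ["persona", "instruction", "context", "format", "audience", "tone", "data"],
)
# Iterating: drop components or change their order and compare the results.
minimal_prompt = assemble_prompt(components, ["instruction", "data"])

messages = [{"role": "user", "content": full_prompt}]
print(full_prompt)
```

The resulting `messages` list has the same shape as the ones passed to `pipe` elsewhere in this notebook.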

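#### In-Context Learning

Show the model concrete examples of the task you want it to perform.

Number of examples:
* Zero-shot prompting
* One-shot prompting
* Few-shot prompting

Use `role` to distinguish the user (`user`) from the model (`assistant`).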
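
A minimal sketch of how such examples map onto chat messages. The helper `build_few_shot_messages` and the sentiment examples are hypothetical; each example becomes a `user` turn followed by the `assistant` turn the model should imitate.

```python
def build_few_shot_messages(examples, query):
    """Turn (user_text, assistant_text) pairs plus a final query into chat messages."""
    messages = []
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The real question always comes last, as a user turn.
    messages.append({"role": "user", "content": query})
    return messages


# Hypothetical demonstrations for a sentiment task.
examples = [
    ("Classify the sentiment: 'I loved this movie.'", "positive"),
    ("Classify the sentiment: 'The plot was dull.'", "negative"),
]
query = "Classify the sentiment: 'The acting was superb.'"

zero_shot_messages = build_few_shot_messages([], query)
one_shot_messages = build_few_shot_messages(examples[:1], query)
few_shot_messages = build_few_shot_messages(examples, query)
print(few_shot_messages)
```

Any of these lists could be passed to `pipe` in place of `messages`.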

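#### Chain Prompting

Use the output of one prompt as the input to the next.

Break the problem into several parts and complete them one at a time.

Use cases:
* Response validation: ask the LLM to double-check previously generated output
* Parallel prompts: create multiple prompts in parallel, then merge the results
* Writing stories: break the problem into components so the LLM can write a book or story piece by piece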



In [9]:
# Chain prompting, step 1: generate a product name and slogan
product_prompt = [
    {
        'role': 'user',
        'content': 'Create a name and slogan for a chatbot that leverages LLM'
    }
]

output = pipe(product_prompt)
product_description = output[0]['generated_text']
print(product_description)


 Name: ChatSage
Slogan: "Unleashing the power of AI for seamless communication."


In [10]:
sales_prompt = [
    {
        'role': 'user',
        'content': f'Generate a very short sales pitch for the following product: {product_description}'
    }
]

outputs = pipe(sales_prompt)
sales_pitch = outputs[0]['generated_text']
print(sales_pitch)



 Introducing ChatSage, the ultimate AI-powered communication solution that revolutionizes the way you connect with others. With our cutting-edge technology, we unleash the power of AI to provide seamless and efficient communication, making it easier than ever to stay connected. Say goodbye to miscommunication and hello to a world of effortless conversations with ChatSage. Experience the future of communication today!


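### Reasoning with Generative Models

#### Chain-of-Thought: Think Before Answering

Show **reasoning examples** in the prompt to guide the model to reason in its answer.

Alternatively, in the zero-shot setting, **prompt the reasoning** directly, for example: `Let's think step-by-step`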

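#### Self-Consistency: Sampling Outputs

Run the same prompt multiple times, with sampling enabled so that the responses can differ.

Take the majority result as the final answer.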
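
The voting step can be sketched without a model. The responses below are hypothetical stand-ins for several sampled generations of the same prompt (for example, repeated `pipe(..., do_sample=True)` calls), and `extract_answer` assumes the final answer is the last number in each response.

```python
import re
from collections import Counter


def extract_answer(text):
    """Return the last number mentioned in a response, or None if there is none."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", text)
    return numbers[-1] if numbers else None


def majority_answer(responses):
    """Pick the answer that appears most often across the sampled responses."""
    answers = [a for a in map(extract_answer, responses) if a is not None]
    if not answers:
        return None
    return Counter(answers).most_common(1)[0][0]


# Hypothetical sampled responses; one of them reasons incorrectly.
responses = [
    "23 - 20 = 3, then 3 + 6 = 9. The answer is 9.",
    "They have 9 apples.",
    "23 + 6 = 29. The answer is 29.",
]
final_answer = majority_answer(responses)
print(final_answer)
```

Because each extra sample costs a full generation, the number of samples is a trade-off between reliability and latency.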

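#### Tree-of-Thought: Exploring Intermediate Steps

Tree-of-Thought (ToT)

When a problem requires multiple reasoning steps, break it into parts.

At each step the model is prompted to explore different solutions to the current problem, then votes for the best one before moving on to the next step.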


In [11]:
cot_prompt = [
    {
        'role': 'user',
        'content': 'Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?'
    },
    {
        'role': 'assistant',
        'content': 'Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 balls. 5 + 6 = 11. The answer is 11.'
    },
    {
        'role': 'user',
        'content': 'The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have?'
    }
]
outputs = pipe(cot_prompt)
print(outputs[0]['generated_text'])


 The cafeteria started with 23 apples. They used 20 apples to make lunch, so they had 23 - 20 = 3 apples left. After buying 6 more apples, they now have 3 + 6 = 9 apples. The answer is 9.


In [12]:
zeroshot_cot_prompt = [
    {
        'role': 'user',
        'content': 'The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have? Let\'s think step by step.'
    }
]
outputs = pipe(zeroshot_cot_prompt)
print(outputs[0]['generated_text'])



 Step 1: Start with the initial number of apples in the cafeteria, which is 23.

Step 2: Subtract the number of apples used to make lunch, which is 20.
23 - 20 = 3 apples remaining.

Step 3: Add the number of apples bought, which is 6.
3 + 6 = 9 apples.

So, the cafeteria now has 9 apples.


In [13]:
# Tree-of-Thought
zeroshot_tot_prompt = [{
    'role': 'user',
    'content': '''Imagine three different experts are answering this question. All experts will write down 1 step of their thinking, then share it with the group.
Then all experts will go on to the next step, etc. If any expert realizes they're wrong at any point then they leave.
The question is 'The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have?'
Make sure to discuss the results.
    '''
}]
outputs = pipe(zeroshot_tot_prompt)
print(outputs[0]['generated_text'])



 Expert 1:
Step 1: Start with the initial number of apples, which is 23.

Expert 2:
Step 1: Subtract the number of apples used for lunch (20) from the initial number (23), resulting in 3 apples remaining.
Step 2: Add the number of apples bought (6) to the remaining apples (3), resulting in a total of 9 apples.

Expert 3:
Step 1: Begin with the initial number of apples (23).
Step 2: Subtract the number of apples used for lunch (20) from the initial number (23), resulting in 3 apples remaining.
Step 3: Add the number of apples bought (6) to the remaining apples (3), resulting in a total of 9 apples.

Discussion:
All three experts arrived at the same answer, which is 9 apples. This indicates that their calculations and reasoning were correct. The cafeteria started with 23 apples, used 20 for lunch, and then bought 6 more, resulting in a total of 9 apples.


## Output Verification

Requirements:
  * Structured output
  * Valid output
  * Ethics
  * Accuracy

Methods:
  * Examples
  * Grammar
  * Fine-tuning

### Examples

Provide few-shot examples of the expected output.

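### Grammar: Constrained Sampling

Use the generative model to verify its own output.

Feed the output back in as a new prompt and validate it against predefined rules.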
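
A minimal sketch of rule-based validation, under stated assumptions. Here the rule check is done in plain Python rather than by the model, and a failed check is turned into a follow-up prompt. The rule set `REQUIRED_FIELDS` and both helper names are hypothetical.

```python
import json

# Hypothetical rule set: the output must be a JSON object with these string fields.
REQUIRED_FIELDS = {"name": str, "slogan": str}


def validate_output(raw_text, required_fields=REQUIRED_FIELDS):
    """Check raw model output against the rules; return (is_valid, errors)."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        return False, [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return False, ["top-level value is not a JSON object"]
    errors = []
    for field, expected_type in required_fields.items():
        if field not in data:
            errors.append(f"missing field: {field}")
        elif not isinstance(data[field], expected_type):
            errors.append(f"field {field} is not of type {expected_type.__name__}")
    return not errors, errors


def build_retry_prompt(raw_text, errors):
    """Turn a failed output into a follow-up prompt asking the model to fix it."""
    problems = "; ".join(errors)
    return (
        "The following output violates these rules: "
        f"{problems}.\nReturn corrected JSON only.\n\n{raw_text}"
    )


raw = '{"name": "ChatSage"}'
ok, errors = validate_output(raw)
if not ok:
    print(build_retry_prompt(raw, errors))
```

The retry prompt can be sent back to the model as the next link in a prompt chain until the output passes validation or a retry limit is reached.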

### Fine-Tuning

Fine-tune the model's output layer so that it produces the expected output.