<a href="https://colab.research.google.com/github/lizzieouo7-rgb/LLM-course-2025/blob/main/prompting.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Prompting a Large Language Model


This example notebook loads an open-source model from Hugging Face and prompts it in zero-shot, few-shot, and chain-of-thought styles.

In [None]:
# %pip install -r requirements.txt
!pip install transformers
!pip install torch
!pip install accelerate



In [None]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

We use [Phi-3.5-mini-instruct](https://huggingface.co/microsoft/Phi-3.5-mini-instruct) in this notebook because of its small size (the checkpoint download is about 7.6 GB).

In [None]:
torch.random.manual_seed(0)

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3.5-mini-instruct",
    device_map="cuda",  # change to "cpu" if running locally without a GPU
    dtype="auto",  # `torch_dtype` is deprecated in recent transformers releases
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3.5-mini-instruct")


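An example few-shot prompt in chat format. The earlier user/assistant exchange serves as the example, and the model answers the final user message: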

In [None]:
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Can you provide ways to eat combinations of bananas and dragonfruits?"},
    {"role": "assistant", "content": "Sure! Here are some ways to eat bananas and dragonfruits together: 1. Banana and dragonfruit smoothie: Blend bananas and dragonfruits together with some milk and honey. 2. Banana and dragonfruit salad: Mix sliced bananas and dragonfruits together with some lemon juice and honey."},
    {"role": "user", "content": "What about solving a 2x + 3 = 7 equation?"},
]
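
The `messages` list is not sent to the model as-is. The pipeline first renders it into a single prompt string using the tokenizer's chat template (in transformers, `tokenizer.apply_chat_template`). The sketch below imitates that rendering in plain Python, assuming the `<|role|>` ... `<|end|>` layout shown on the Phi-3.5-mini-instruct model card. `render_phi3_prompt` is a hypothetical helper for illustration only, and the real template may differ in whitespace or special-token details.

```python
def render_phi3_prompt(messages, add_generation_prompt=True):
    """Approximate the Phi-3.5 chat layout: <|role|>, newline, content, <|end|>, newline."""
    parts = []
    for message in messages:
        parts.append(f"<|{message['role']}|>\n{message['content']}<|end|>\n")
    if add_generation_prompt:
        # An open assistant turn tells the model that it should answer next.
        parts.append("<|assistant|>\n")
    return "".join(parts)


# Illustrative messages (not the exact ones used in this notebook).
demo_messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Solve 2x + 3 = 7."},
]
prompt_text = render_phi3_prompt(demo_messages)
print(prompt_text)
```

For real use, prefer the tokenizer's own template so that the prompt matches what the model saw during training.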

In [None]:
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

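generation_args = {
    "max_new_tokens": 500,
    "return_full_text": False,  # return only the newly generated text, not the prompt
    "do_sample": False,  # greedy decoding; temperature is ignored in this mode
    "use_cache": False,  # disables the KV cache, which makes generation slower
}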

Device set to use cuda
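
With `do_sample=False` the model decodes greedily: at each step it picks the highest-scoring token, so `temperature` plays no role. With `do_sample=True`, the scores are divided by the temperature before sampling, so lower values make the output more predictable. The sketch below shows the idea on a toy three-token vocabulary with made-up logits. `softmax_with_temperature` and `pick_token` are illustrative helpers, not transformers functions.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities; a lower temperature sharpens the distribution."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def pick_token(logits, do_sample, temperature=1.0, rng=None):
    """Greedy argmax when do_sample is False, otherwise sample from the softmax."""
    if not do_sample:
        return max(range(len(logits)), key=lambda index: logits[index])
    rng = rng or random.Random(0)
    probabilities = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probabilities, k=1)[0]


toy_logits = [2.0, 1.0, 0.1]  # made-up scores for a three-token vocabulary
greedy_choice = pick_token(toy_logits, do_sample=False)
sharp = softmax_with_temperature(toy_logits, 0.5)
flat = softmax_with_temperature(toy_logits, 2.0)
print(greedy_choice, [round(p, 3) for p in sharp], [round(p, 3) for p in flat])
```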


In [None]:
output = pipe(messages, **generation_args)
print(output[0]['generated_text'])



 To solve the equation 2x + 3 = 7, follow these steps:

Step 1: Isolate the term with the variable (2x) by subtracting 3 from both sides of the equation.
2x + 3 - 3 = 7 - 3
2x = 4

Step 2: Solve for x by dividing both sides of the equation by the coefficient of x, which is 2.
2x / 2 = 4 / 2
x = 2

So, the solution to the equation 2x + 3 = 7 is x = 2.
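
Model output is worth checking whenever the task has a verifiable answer. As a minimal sketch, the reported solution can be substituted back into the equation. `solve_linear` is a hypothetical helper written for this check.

```python
from fractions import Fraction


def solve_linear(a, b, c):
    """Solve a*x + b = c for x, using exact rational arithmetic."""
    if a == 0:
        raise ValueError("not a linear equation in x when a == 0")
    return Fraction(c - b, a)


x = solve_linear(2, 3, 7)
model_answer = 2  # the value the model reported in the output above
print(x == model_answer, 2 * model_answer + 3 == 7)
```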


In [None]:
messages = [
    {
        "role": "system",
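        # English: "You are a Chinese romance novelist in 2025. Your tone must be humorous. Please write in Chinese."
        "content": "现在你是一个中国2025年的言情小说作家。你的语气必须是幽默的。请用中文写作。"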
    },
    {
        "role": "user",
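        # English: "Write a 100-character novel synopsis with dramatic tension; it will go in the novel's
        # blurb to attract readers. The genre is '追妻火葬场'." This is a web-fiction trope in which the
        # male lead regrets mistreating the heroine and humbly tries to win her back.
        "content": "请你写一篇100字的有戏剧张力的小说梗概,这个梗概将放在你的小说简介栏用于吸引读者。这部小说的类型是“追妻火葬场”。"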
    }
]


generation_args = {
    "max_new_tokens": 500,
    "return_full_text": False,
    "use_cache": False
}


output = pipe(messages, **generation_args)

# Print the model's reply
print(output[0]['generated_text'])

 在幽暗的火葬场下，灵魂搭乘阴魂的火车，踏上追求真爱的悲喜交织之旅。主人公张飞，一个幽默矛盾的捕食者，竟意外与遗体上的美丽女孩玛丽落网。两人的爱悲剧，让读者在痛苦与喜悦的交替中，不禁拍案叫绝。阅读此书，你会像火焰一样被燎原的爱情情节烧得眼花缭乱，而这一切都被幽默的调色板覆盖。“追妻火葬场”——此书将是你沉浸在幽默与情感交织的宏伟梦幻之旅。


In [None]:
FewShotMessages = [
    {
        "role": "system",
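        # English: "You are a Chinese romance novelist in 2025. Your tone must be humorous, but focus on
        # the web-fiction trope '追妻火葬场' (the male lead, full of regret, humbly wins the heroine back).
        # Please write in Chinese."
        "content": "现在你是一个中国2025年的言情小说作家。你的语气必须是幽默的，但要专注于**追妻火葬场**（即男主追悔莫及、低位追回女主的剧情）这一网络文学梗概。请用中文写作。"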
    },
    # --- Few-shot Example Start ---
    {
        "role": "user",
        # English: "Write a 100-character novel synopsis with dramatic tension; the genre is '追妻火葬场'."
        "content": "请你写一篇100字的有戏剧张力的小说梗概,类型是“追妻火葬场”。"
    },
    {
        "role": "assistant",
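        # English: Shen Yao went to great lengths to win over Lu Zhao, bringing him breakfast,
        # lunch, and dinner every day. In year one of high school he cycled, so she cycled; in
        # year two he walked, so she followed; in year three he took the bus, so she gave him her
        # seat. She handed in a blank midterm so that, under the school's pairing of strong and
        # weak students for tutoring, she could commute with him. But Lu Zhao, to avoid her, had
        # handed in a blank paper too. Tagline: delinquent girl vs. aloof, ice-cold top student.
        "content": """沈遥当年为了追路照下了好一番功夫。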
送早餐、送午餐、送晚餐，一日三餐都给他备着，伺候得像个大爷。
高一，他骑车上学，她也骑；
高二，他走路，她就在后面跟着；
高三，他改坐公交，她起身给他让座。
某次期中考试，她特意交了白卷，根据优差生补课原则，她终于可以名正言顺和他一起上下学了。
只是，她没想到，
路照为了躲她，也交了白卷。
流里流气女混混VS高冷冰块好学生"""
    },
    # --- Few-shot Example End ---
    # --- New Prompt for the Model to Generate ---
    {
        "role": "user",
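        # English: "OK, please write another 100-character synopsis with dramatic tension, with the same
        # theme and style as above; the genre is '追妻火葬场'."
        "content": "好的，请你再写一篇100字的有戏剧张力的小说梗概,主题和风格与上文相同，类型是“追妻火葬场”。"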
    }
]

generation_args_fewshot = {
    "max_new_tokens": 500,
    "return_full_text": False,
    "use_cache": False
}

# Run the pipeline
output_fewshot = pipe(FewShotMessages, **generation_args_fewshot)
print("--- Few-shot Output ---")
print(output_fewshot[0]['generated_text'])

--- Few-shot Output ---
 沈遥的爱，就像火葬场般冷僻，
他倾心送餐，却寥寥数语。
高一时，他骑着骏马探视，
她像火葬场的晨曦，徐徐照耀。
高二，他沿着街灯行走，
她如烈火的映照，随风舞动。
高三，公交车穿梭，他坐得冷冷清清，
她却在火葬场里摇曳着炽热抱抱。
考试笔试，她答错了一个，决定停课，
路照依旧沉默，不响应她的愤怒。
直到冬日的黑夜，沈遥决定做出一个决断，
走入火葬场，面对黑暗，
说出自己的心，声音透过阴影，
“我在这里，我要燃烧起来，我要把你的爱烧得明亮。”
今天，火葬场烧得红热起来，
她的眼睛烧着笑声，
沈遥逐梦，追妻的火葬场，
终于在炽热的爱中燃烧。
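
The few-shot prompt above follows a fixed pattern: one system message, then each example as a user/assistant pair, then the new request as the final user message. A small helper makes that pattern reusable when adding more examples. `build_few_shot_messages` and the placeholder strings below are assumptions for illustration, not part of the original notebook.

```python
def build_few_shot_messages(system_prompt, examples, query):
    """Build a chat-format few-shot prompt.

    examples is a list of (user_text, assistant_text) pairs that demonstrate
    the desired output; query is the new request the model should answer.
    """
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": query})
    return messages


# Placeholder content; substitute real example synopses in practice.
few_shot = build_few_shot_messages(
    system_prompt="You are a humorous romance novelist.",
    examples=[
        ("Write a 100-character synopsis.", "<example synopsis 1>"),
        ("Write another one.", "<example synopsis 2>"),
    ],
    query="Write a new synopsis in the same style.",
)
print([message["role"] for message in few_shot])
```

The returned list can be passed to `pipe(...)` in the same way as `FewShotMessages`.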


In [None]:
CoTMessages = [
    {
        "role": "system",
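        # English: "You are a Chinese romance novelist in 2025. Your tone must be humorous. Please write
        # in Chinese. Your task is to write synopses for '追妻火葬场' novels. Before giving the final
        # synopsis, first show your 'thinking process'."
        "content": "现在你是一个中国2025年的言情小说作家。你的语气必须是幽默的。请用中文写作。你的任务是创作“追妻火葬场”类型的小说梗概。在输出最终梗概前，请先展示你的“思考过程”。"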
    },
    # --- CoT Example Start ---
    {
        "role": "user",
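        # English: "Write a humorous '追妻火葬场' novel synopsis of about 100 characters with dramatic tension."
        "content": "请你创作一篇100字左右，有戏剧张力且风格幽默的“追妻火葬场”小说梗概。"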
    },
    {
        "role": "assistant",
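        # English: "Thinking process: 1. Set up the male lead (scumbag phase): what is he like, and
        # how does he hurt the heroine? 2. Set up the heroine (awakening phase): how does she leave?
        # 3. Set up the 'crematorium' plot (turning point): how does he come to regret it, and what
        # does he do to win her back? 4. Integrate and polish: merge the three points into a fluent,
        # dramatic synopsis of about 100 characters with a touch of humor."
        "content": """思考过程：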
1. 设定男主角（渣男阶段）：他是一个什么样的角色？他如何伤害女主角？
2. 设定女主角（觉醒阶段）：她如何离开？
3. 设定“火葬场”情节（转折点）：男主角如何后悔？他做了什么来追回女主角？
4. 整合并润色：将以上三点整合成一个流畅、有戏剧张力的百字梗概，并加入幽默感。
"""
    },
    # --- CoT Example End ---
    # --- New Prompt for the Model to Generate ---
    {
        "role": "user",
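        # English: "Very good! Following the format above, write a brand-new '追妻火葬场' novel synopsis
        # of about 100 characters."
        "content": "非常好！请你遵循上面的格式，创作一篇全新的、100字左右的“追妻火葬场”小说梗概。"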
    }
]

# Use the generation arguments we already fixed
generation_args_cot = {
    "max_new_tokens": 1024,  # more tokens to leave room for the thinking process
    "return_full_text": False,
    "temperature": 0.7,  # a slightly higher temperature for more creativity
    "do_sample": True,  # must be True for temperature to take effect
    "use_cache": False
}

# Run the pipeline
output_cot = pipe(CoTMessages, **generation_args_cot)
print("\n--- Chain-of-Thought Output ---")
print(output_cot[0]['generated_text'])


--- Chain-of-Thought Output ---
 在一个充满了笑料的小镇，住着一个叫伟杰的渣男，他的魅力却被遗忘如同废墟中的灰烬。女主角苏菁，一个温柔且智慧的玻璃女孩，决定走出家门，追求自由的味道。伟杰愤怒地盯上她，认为她的脸如同他的灰烬，逐渐意识到自己的愧疚之情。

一天，伟杰闯入苏菁的生活，触碰她的火葬场——她烧毁过去的尘埃轮廓。这时，他不仅是寻找灰烬，更是寻找爱情的火种。在笑声和火光交织的风景中，他开始挣扎，承认自己的错误。

梗概：渣男伟杰，意识到自己的火葬场——女主角苏菁的灰烬，开始了一场戏剧性的求赎之旅，揭开的是爱与悔意的火花，在幽默与感恩交织的小镇，他们的情感火种终于被点燃。
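
Because the chain-of-thought prompt asks for reasoning before the answer, the raw output mixes both. In the run above, the final synopsis happens to start with the marker `梗概：` ("Synopsis:"). Assuming the model keeps emitting that marker, which is not guaranteed, the two parts can be separated as sketched below. `split_reasoning_and_answer` is a hypothetical helper, and `sample_output` is a made-up string rather than real model output.

```python
def split_reasoning_and_answer(text, marker="梗概："):
    """Split generated text at the last occurrence of marker.

    Returns (reasoning, answer). If the marker is missing, the whole text is
    treated as the answer and the reasoning is empty.
    """
    head, found, tail = text.rpartition(marker)
    if not found:
        return "", text.strip()
    return head.strip(), tail.strip()


# Made-up sample in the same shape as the output above.
sample_output = "思考过程：先设定人物，再写转折。\n\n梗概：渣男追悔莫及，开始追妻。"
reasoning, answer = split_reasoning_and_answer(sample_output)
print(answer)
```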
