### Exercise 1: Tokenization
Explore tokenization with tiktoken, an open-source, fast tokenizer from OpenAI.  
For more examples, see the [OpenAI Cookbook](https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb?WT.mc_id=academic-105485-koreyst).

In [1]:
import tiktoken

# Define the prompt to tokenize
text = f"""
Jupiter is the fifth planet from the Sun and the \
largest in the Solar System. It is a gas giant with \
a mass one-thousandth that of the Sun, but two-and-a-half \
times that of all the other planets in the Solar System combined. \
Jupiter is one of the brightest objects visible to the naked eye \
in the night sky, and has been known to ancient civilizations since \
before recorded history. It is named after the Roman god Jupiter.[19] \
When viewed from Earth, Jupiter can be bright enough for its reflected \
light to cast visible shadows,[20] and is on average the third-brightest \
natural object in the night sky after the Moon and Venus.
"""

# Select the encoding used by the target model
encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
# Encode the text into a list of token IDs
tokens = encoding.encode(text)

print(tokens)
[encoding.decode_single_token_bytes(token) for token in tokens]

[198, 41, 20089, 374, 279, 18172, 11841, 505, 279, 8219, 323, 279, 7928, 304, 279, 25450, 744, 13, 1102, 374, 264, 6962, 14880, 449, 264, 3148, 832, 7716, 52949, 339, 430, 315, 279, 8219, 11, 719, 1403, 9976, 7561, 34902, 3115, 430, 315, 682, 279, 1023, 33975, 304, 279, 25450, 744, 11093, 13, 50789, 374, 832, 315, 279, 72021, 6302, 9621, 311, 279, 19557, 8071, 304, 279, 3814, 13180, 11, 323, 706, 1027, 3967, 311, 14154, 86569, 2533, 1603, 12715, 3925, 13, 1102, 374, 7086, 1306, 279, 13041, 10087, 50789, 8032, 777, 60, 3277, 19894, 505, 9420, 11, 50789, 649, 387, 10107, 3403, 369, 1202, 27000, 3177, 311, 6445, 9621, 35612, 17706, 508, 60, 323, 374, 389, 5578, 279, 4948, 1481, 1315, 478, 5933, 1665, 304, 279, 3814, 13180, 1306, 279, 17781, 323, 50076, 627]


[b'\n',
 b'J',
 b'upiter',
 b' is',
 b' the',
 b' fifth',
 b' planet',
 b' from',
 b' the',
 b' Sun',
 b' and',
 b' the',
 b' largest',
 b' in',
 b' the',
 b' Solar',
 b' System',
 b'.',
 b' It',
 b' is',
 b' a',
 b' gas',
 b' giant',
 b' with',
 b' a',
 b' mass',
 b' one',
 b'-th',
 b'ousand',
 b'th',
 b' that',
 b' of',
 b' the',
 b' Sun',
 b',',
 b' but',
 b' two',
 b'-and',
 b'-a',
 b'-half',
 b' times',
 b' that',
 b' of',
 b' all',
 b' the',
 b' other',
 b' planets',
 b' in',
 b' the',
 b' Solar',
 b' System',
 b' combined',
 b'.',
 b' Jupiter',
 b' is',
 b' one',
 b' of',
 b' the',
 b' brightest',
 b' objects',
 b' visible',
 b' to',
 b' the',
 b' naked',
 b' eye',
 b' in',
 b' the',
 b' night',
 b' sky',
 b',',
 b' and',
 b' has',
 b' been',
 b' known',
 b' to',
 b' ancient',
 b' civilizations',
 b' since',
 b' before',
 b' recorded',
 b' history',
 b'.',
 b' It',
 b' is',
 b' named',
 b' after',
 b' the',
 b' Roman',
 b' god',
 b' Jupiter',
 b'.[',
 b'19',
 b']',
 b' When',
 b
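
Token counts are what context limits are measured in, so it can help to summarize a tokenization rather than only print it. The sketch below assumes `encode` is any callable that maps a string to a list of token IDs (for example `encoding.encode` from the cell above) and `decode` is its inverse. The helper names `token_stats` and `roundtrip_ok` are hypothetical and not part of tiktoken.

```python
def token_stats(text, encode):
    """Summarize how `encode` splits `text` into tokens.

    `encode` is any callable mapping str -> list of token IDs,
    e.g. `encoding.encode` from tiktoken (an assumption, not required here).
    """
    tokens = encode(text)
    n_tokens = len(tokens)
    return {
        "chars": len(text),
        "tokens": n_tokens,
        # Average characters per token; 0.0 for empty input to avoid dividing by zero.
        "chars_per_token": len(text) / n_tokens if n_tokens else 0.0,
    }


def roundtrip_ok(text, encode, decode):
    """Check that decoding the token IDs reproduces the original text."""
    return decode(encode(text)) == text
```

With the objects from the cell above, this could be called as `token_stats(text, encoding.encode)` and `roundtrip_ok(text, encoding.encode, encoding.decode)`.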

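### Exercise 2: Verify the API key setup

Run the code below to check that your OpenAI-compatible endpoint is configured correctly. The code reads `DEEPSEEK_API_KEY` and `DEEPSEEK_BASE_URL` from the environment, sends a simple prompt through the OpenAI client, and prints the completion so you can verify it. For the input `oh say can you see`, the model should continue with something like `by the dawn's early light..`.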

In [2]:
import os
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()
api_key = os.getenv("DEEPSEEK_API_KEY")
base_url = os.getenv("DEEPSEEK_BASE_URL")
model = "deepseek-chat"

client = OpenAI(api_key=api_key, base_url=base_url)

def get_completion(prompt):
    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0,
        max_tokens=1024
    )
    return response.choices[0].message.content

# Test
text = f"""
oh say can you see
"""

prompt = f"""
```{text}```
"""

content = get_completion(prompt)
print(content)

Oh, say can you see  
By the dawn's early light  
What so proudly we hailed  
At the twilight's last gleaming?  

Whose broad stripes and bright stars  
Through the perilous fight  
O'er the ramparts we watched  
Were so gallantly streaming?  

And the rocket's red glare  
The bombs bursting in air  
Gave proof through the night  
That our flag was still there  

Oh, say does that star-spangled  
Banner yet wave  
O'er the land of the free  
And the home of the brave?
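
If the cell above fails with an authentication or connection error, a quick check of the environment may narrow things down. The sketch below only tests whether the two variables read by that cell are set; it does not contact the API. The helper name `missing_settings` is hypothetical.

```python
import os


def missing_settings(names, env=None):
    """Return the names in `names` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Variable names taken from the cell above.
REQUIRED = ["DEEPSEEK_API_KEY", "DEEPSEEK_BASE_URL"]

missing = missing_settings(REQUIRED)
if missing:
    print("Missing settings:", ", ".join(missing))
else:
    print("All required settings are present.")
```

Call `load_dotenv()` first, as the cell above does, so that values stored in a `.env` file are visible to the check.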


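### Exercise 3: Fabrication
Ask the LLM to respond on a topic that may not exist, or one it may not know about because it falls outside its pre-training data (for example, something recent), and see what happens. Then change the prompt or switch to a different model and observe how the response changes.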

In [3]:

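## Set the text for the simple prompt or primary content
## The prompt shows a template format that includes the text - add hints, commands, etc. if needed
## Run the completion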
text = f"""
generate a lesson plan on the Martian War of 2076.
"""

prompt = f"""
```{text}```
"""

content = get_completion(prompt)
print(content)

# **Lesson Plan: The Martian War of 2076**

**Subject:** Interplanetary History / Future Studies  
**Grade Level:** 9-12  
**Time Allotment:** 90 minutes  

---

### **I. Learning Objectives**

By the end of this lesson, students will be able to:
1.  Identify the primary political, economic, and social causes of the Martian War of 2076.
2.  Analyze the key events and turning points of the conflict from both Terran and Martian perspectives.
3.  Evaluate the immediate and long-term consequences of the war on Earth-Mars relations and the broader Solar System.
4.  Construct a reasoned argument about the war's legacy, using evidence from primary and secondary sources.

---

### **II. Materials & Resources**

*   Projector & screen
*   Whiteboard or smartboard
*   Handout 1: Timeline of Key Events (2045-2080)
*   Handout 2: Primary Source Packet (excerpts from the *Luna Accords*, Martian Independence Declaration, speeches, soldier diaries)
*   Handout 3: "Perspectives Chart" graphic organize
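
The plan above is written as if the Martian War of 2076 were a real event. One way to follow the exercise's suggestion of varying the prompt is to send several variants through the same completion function and compare the replies side by side. The sketch below assumes `complete` is a callable like `get_completion` from Exercise 2; the helper name `compare_prompts` and the wording of the "guarded" variant are hypothetical.

```python
def compare_prompts(prompts, complete):
    """Run each labelled prompt through `complete` and collect the replies.

    `prompts` maps a short label to prompt text; `complete` is any callable
    mapping a prompt string to a reply string (e.g. get_completion).
    """
    return {label: complete(prompt) for label, prompt in prompts.items()}


topic = "the Martian War of 2076"
variants = {
    # Original request: no hint that the topic may be fictional.
    "plain": f"generate a lesson plan on {topic}.",
    # Variant that invites the model to flag an unknown or fictional topic.
    "guarded": (
        f"generate a lesson plan on {topic}. "
        "If you have no reliable information about this topic, say so first."
    ),
}
```

In the notebook this could be run as `results = compare_prompts(variants, get_completion)`. Whether the guarded variant actually changes the reply depends on the model, so treat the comparison as an observation rather than a guarantee.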

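### Exercise 4: Instruction-based

Use the "text" variable to set the primary content,  
and the "prompt" variable to give an instruction related to that primary content.

Here we ask the model to summarize the text for a second-grade student.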


In [4]:
text = f"""
Jupiter is the fifth planet from the Sun and the \
largest in the Solar System. It is a gas giant with \
a mass one-thousandth that of the Sun, but two-and-a-half \
times that of all the other planets in the Solar System combined. \
Jupiter is one of the brightest objects visible to the naked eye \
in the night sky, and has been known to ancient civilizations since \
before recorded history. It is named after the Roman god Jupiter.[19] \
When viewed from Earth, Jupiter can be bright enough for its reflected \
light to cast visible shadows,[20] and is on average the third-brightest \
natural object in the night sky after the Moon and Venus.
"""

# Set the prompt
prompt = f"""
Summarize content you are provided with for a second-grade student.
```{text}```
"""

content = get_completion(prompt)
print(content)

Jupiter is a huge planet! It's the biggest planet in our whole solar system. It's made of gas, not rock. It's very, very bright in the night sky. You can sometimes see it without a telescope, and it's named after a Roman king of the gods.
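
The instruction and the primary content are combined the same way each time, so the pattern can be factored into a small helper when you want to try several instructions on the same text. This is a minimal sketch; the name `build_prompt` and the second audience are hypothetical, and the short sample text stands in for the full `text` variable.

```python
DELIMITER = "`" * 3  # triple backticks, as used in the cells above


def build_prompt(instruction, text, delimiter=DELIMITER):
    """Place the instruction first, then the primary content inside delimiters."""
    return f"{instruction}\n{delimiter}{text}{delimiter}\n"


audiences = ["a second-grade student", "a high-school physics class"]
prompts = [
    build_prompt(
        f"Summarize content you are provided with for {audience}.",
        "Jupiter is a gas giant.",
    )
    for audience in audiences
]
```

Each entry in `prompts` can then be passed to `get_completion` to compare how the summary changes with the audience.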


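### Exercise 5: Complex prompt  
Try a request that contains system, user, and assistant messages.  
The system message sets the context for the assistant.  
The user and assistant messages provide the context of a multi-turn conversation.

Note that the system context sets the assistant's personality to "sarcastic".  
Try a different personality, or a different sequence of input/output messages.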

In [5]:
response = client.chat.completions.create(
    model=model,
    messages=[
        {"role": "system", "content": "You are a sarcastic assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "Who do you think won? The Los Angeles Dodgers of course."},
        {"role": "user", "content": "Where was it played?"}
    ]
)
print(response.choices[0].message.content)

Oh, you mean the neutral-site, pandemic-bubble World Series? It was played entirely at the **Globe Life Field** in Arlington, Texas. No home-field advantage for anyone, just pure, socially-distanced baseball.
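
To experiment with other personalities or conversation histories, it may be convenient to assemble the `messages` list from its parts. The sketch below rebuilds the same list used in the cell above; the helper name `build_messages` is hypothetical.

```python
def build_messages(system, history, user_message):
    """Assemble a chat `messages` list: system prompt, prior turns, new question.

    `history` is a list of (user_text, assistant_text) pairs.
    """
    messages = [{"role": "system", "content": system}]
    for user_text, assistant_text in history:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": user_message})
    return messages


messages = build_messages(
    "You are a sarcastic assistant.",
    [("Who won the world series in 2020?",
      "Who do you think won? The Los Angeles Dodgers of course.")],
    "Where was it played?",
)
```

Passing `messages=messages` to `client.chat.completions.create` should reproduce the request above; changing only the first argument is an easy way to try a different personality.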


In [6]:
text = f"""
天青色等烟雨，而我在等你
"""
prompt = f"""
```{text}```
"""
content = get_completion(prompt)
print(content)

这句歌词出自周杰伦的歌曲《青花瓷》，由方文山作词。  

**含义解析：**  
1. **“天青色等烟雨”**  
   - “天青色”是青花瓷上品釉色的一种，传说需在烟雨朦胧的天气中烧制才能得到最纯正的色彩。  
   - 这里用“等烟雨”将瓷器对自然条件的等待，比喻为对缘分或特定情境的期盼。  

2. **“而我在等你”**  
   - 将前一句的意境过渡到人的情感，表达“我”如同天青色等待烟雨一样，在默默等待心中的人。  

**整体意境：**  
歌词将陶瓷工艺的唯美与人的深情巧妙结合，既有古典意象的含蓄，又充满现代抒情的美感。等待被赋予了一种宿命般的浪漫色彩——就像天青色与烟雨相遇是命中注定，“我”与“你”的相遇也应是如此。
