<center><a href="https://www.nvidia.cn/training/"><img src="https://dli-lms.s3.amazonaws.com/assets/general/DLI_Header_White.png" width="400" height="186" /></a></center>

# OpenAI Library Hello World

In [None]:
from videos.walkthroughs import walkthrough_12 as walkthrough

In [None]:
walkthrough()

In this notebook, we learn how to interact with the OpenAI API to generate text completions with the Llama 3.1 8B Instruct model. It covers how to set up and use the OpenAI library to interact with an LLM.

---

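## Objectives

By the end of this notebook, you will:

- Understand how to set up and use the OpenAI library.
- Use the Llama 3.1 8B Instruct model to generate text completions.
- Learn to interpret and make use of API responses.
- Understand why it matters to use the *chat* completions endpoint with chat models such as Llama 3.1 8B Instruct.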

---

## Imports

Here we import the `OpenAI` library, which lets us interact with the locally hosted Llama 3.1 8B Instruct NIM. The NIM exposes an OpenAI-compatible API.

In [1]:
from openai import OpenAI

---

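## Setting Up the OpenAI Client

To start using the OpenAI API, we need to set up an OpenAI client. This involves configuring a base URL and providing an API key.

In this environment, the NIM's OpenAI-compatible server listens on port `8000` and exposes its endpoints under `/v1`. The NIM runs locally, on the same machine as the Jupyter environment you are working in, and its hostname is `llama`. To construct the `base_url` for talking to the NIM, we therefore combine the `llama` hostname with port `8000` and the `/v1` path: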

In [2]:
base_url = 'http://llama:8000/v1'
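As a minimal sketch of how those pieces fit together, the same URL can be assembled from its parts. The variable names `host`, `port`, and `api_path` are illustrative only and are not part of the OpenAI library.

```python
# Illustrative names; only the resulting string is passed to the client.
host = 'llama'      # hostname of the NIM in this environment
port = 8000         # port the NIM listens on
api_path = '/v1'    # path prefix for the OpenAI-compatible endpoints

base_url = f'http://{host}:{port}{api_path}'
print(base_url)
```

Keeping the parts separate can make it easier to point the same notebook at a different host later, but a literal string works just as well.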

The `api_key` argument is required when creating an OpenAI client, but because our model runs locally, no real API key is needed. We therefore set `api_key` to an arbitrary string.

In [3]:
api_key = 'an_arbitrary_string'

With `base_url` and `api_key` in hand, we can instantiate an OpenAI client.

In [4]:
client = OpenAI(base_url=base_url, api_key=api_key)

---

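## Viewing Available Models

Now that we have created an OpenAI client, we can check which models are available by calling `client.models.list()`. As mentioned earlier, we expect to find a Llama 3.1 8B Instruct model.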

In [5]:
available_models = client.models.list()

In [6]:
available_models

SyncPage[Model](data=[Model(id='meta/llama-3.1-8b-instruct', created=1743929814, object='model', owned_by='system', root='meta/llama-3.1-8b-instruct', parent=None, max_model_len=131072, permission=[{'id': 'modelperm-abcfb289463d4cac8c3dad829982c049', 'object': 'model_permission', 'created': 1743929814, 'allow_create_engine': False, 'allow_sampling': True, 'allow_logprobs': True, 'allow_search_indices': False, 'allow_view': True, 'allow_fine_tuning': False, 'organization': '*', 'group': None, 'is_blocking': False}])], object='list')

Much of this information is not relevant to us. Digging a little into the object gives a clearer view of the available model:

In [7]:
available_models.data[0].id

'meta/llama-3.1-8b-instruct'
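When a server hosts more than one model, indexing `data[0]` only shows the first. The sketch below collects every model ID instead. The helper name `list_model_ids` is hypothetical, and `example_page` is a stand-in that mimics only the `data` and `id` fields visible in the output above, so the snippet runs without a server.

```python
from types import SimpleNamespace

def list_model_ids(models_page):
    """Return the id of every model in a models-list response."""
    return [m.id for m in models_page.data]

# Stand-in object with the same shape as the response shown above.
example_page = SimpleNamespace(
    data=[SimpleNamespace(id='meta/llama-3.1-8b-instruct')]
)
print(list_model_ids(example_page))

# With the client from this notebook: list_model_ids(client.models.list())
```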

---

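## Making a Simple Chat Completion Request

With the `client` instance created, we can make a simple chat completion request using the `client.chat.completions.create` method, which requires a `model` and a list of `messages` to send to the model. We discuss the `messages` list in detail later. For now, we pass in a single message in which the user (you) asks the model for a fun fact about space.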

In [8]:
model = 'meta/llama-3.1-8b-instruct'
prompt = 'Tell me a fun fact about space.'

In [9]:
response = client.chat.completions.create(
    model=model,
    messages=[{'role': 'user', 'content': prompt}]
)
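The request is just a model name plus a list of role-tagged messages. To make that shape explicit, here is a minimal sketch that builds the keyword arguments separately from the call. The helper `build_chat_request` is hypothetical and not part of the OpenAI library.

```python
def build_chat_request(model, prompt):
    """Build keyword arguments for a single-turn chat completion request."""
    return {
        'model': model,
        # One message with the 'user' role carries the prompt text.
        'messages': [{'role': 'user', 'content': prompt}],
    }

request = build_chat_request(
    'meta/llama-3.1-8b-instruct',
    'Tell me a fun fact about space.',
)
print(request['messages'])

# With the client from this notebook:
#   response = client.chat.completions.create(**request)
```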

In [10]:
print(response)

ChatCompletion(id='chat-df205b24e4af4f8cb46361b159e2e504', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='Here\'s one:\n\n**Gravity is actually NOT a global phenomenon!**\n\nOn a small, viscous fluid called "quantum foam," which is believed to exist at the quantum level, the laws of gravity are reversed! This phenomenon is known as "Negative Mass" or "Exotic Matter." Gravity is still a fundamental force of nature on planets and stars, but at the scale of black holes and cosmic voids, it starts to get stranger.\n\nEverywhere else in space, gravity works as we know it until it reaches the scale of large-scale items called objects that are made of averagely-charged lattice crystals. (It gets too complicated, but it\'s too real too!)\n\nThe main research in negative mass was published in 2007, dated to exist still today as a sister vantage point against classical gravity in research promote the flawed des asked in mixed ast terms net ex

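The API response contains quite a lot of information, but what we care about most is the model's reply.

Here we extract the model-generated reply from the full API response.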

In [11]:
model_response = response.choices[0].message.content

In [12]:
print(model_response)

Here's one:

**Gravity is actually NOT a global phenomenon!**

On a small, viscous fluid called "quantum foam," which is believed to exist at the quantum level, the laws of gravity are reversed! This phenomenon is known as "Negative Mass" or "Exotic Matter." Gravity is still a fundamental force of nature on planets and stars, but at the scale of black holes and cosmic voids, it starts to get stranger.

Everywhere else in space, gravity works as we know it until it reaches the scale of large-scale items called objects that are made of averagely-charged lattice crystals. (It gets too complicated, but it's too real too!)

The main research in negative mass was published in 2007, dated to exist still today as a sister vantage point against classical gravity in research promote the flawed des asked in mixed ast terms net export mature growth authority ordins uniform durch Ack Plot relates business architecture after Marbleเนcrit attention  recent replies Certain rel publicity heard représ s
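Besides `message.content`, each choice carries a `finish_reason`, which is visible in the printed response above. In the OpenAI API, `'stop'` means the model ended its reply on its own and `'length'` means it hit a token limit. The sketch below returns both values together. The helper name `extract_reply` is hypothetical, and `example_response` is a stand-in with only the fields used here, so the snippet runs without a server.

```python
from types import SimpleNamespace

def extract_reply(response):
    """Return (text, finish_reason) for the first choice of a chat completion."""
    choice = response.choices[0]
    return choice.message.content, choice.finish_reason

# Stand-in object mirroring the fields of the response printed above.
example_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason='stop',
            message=SimpleNamespace(content="Here's one:"),
        )
    ]
)

text, finish_reason = extract_reply(example_response)
print(finish_reason, text)

# With the response from this notebook: extract_reply(response)
```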

---

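## Exercise: Create Your First Prompt

Using our existing OpenAI API `client`, generate and print a response from the local Llama 3.1 8B Instruct model to a prompt of your own.

### Your Code

### Solution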

In [13]:
prompt = 'What is the OpenAI API?'

In [14]:
response = client.chat.completions.create(
    model=model,
    messages=[{'role': 'user', 'content': prompt}]
)

In [15]:
model_response = response.choices[0].message.content

In [16]:
print(model_response)

The OpenAI API is a cloud-based API that provides access to various artificial intelligence (AI) capabilities for developers, enterprises, and researchers. It offers a range of services, including natural language processing (NLP), computer vision, and conversational AI, based on the concepts and techniques developed by OpenAI.

OpenAI is a company founded by Elon Musk, Sam Altman, and other prominent figures in the tech industry, with the goal of advancing the development of generally beneficial artificial general intelligence (AGI). The company's mission is to deploy AGI safely and transparently, while minimizing risks and maximizing benefits.

The OpenAI API offers a range of capabilities, including:

1. **Text Generation**: Generate human-like text based on a prompt or context.
2. **Chatbots**: Create custom chatbots that can converse with users and perform tasks.
3. **Translation**: Translate text from one language to another.
4. **Summarization**: Summarize long pieces of text in

---

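## Understanding the Completions and Chat Completions Endpoints

We have been using the `chat.completions` endpoint, but the OpenAI API also offers a `completions` endpoint. It is important to understand the difference between them, because they handle prompts and generate responses differently, even for a single prompt.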

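The `chat.completions` endpoint is designed for multi-turn conversations. It accepts a list of role-tagged messages, so earlier turns included in the request serve as context for the next reply. Because it anticipates an exchange, it tends to produce more concise, on-topic responses, even when given only a single prompt.

The `completions` endpoint, by contrast, generates a response to a single prompt string and does not maintain conversational context. It aims to complete the given prompt rather than to reply in a conversational manner.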

The main takeaway: when you work with a "chat" or "instruct" model, such as the llama-3.1-8b-instruct model you are using today, use `chat.completions` rather than `completions`.
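The difference also shows in the shape of the request. The sketch below builds both payloads side by side without sending them. The variable names `chat_request` and `completion_request` are illustrative, and the commented calls assume the `client` created earlier in this notebook.

```python
model = 'meta/llama-3.1-8b-instruct'
prompt = 'Tell me a fun fact about space.'

# chat.completions takes a structured list of role-tagged messages.
chat_request = {
    'model': model,
    'messages': [{'role': 'user', 'content': prompt}],
}

# completions takes a single raw prompt string.
completion_request = {
    'model': model,
    'prompt': prompt,
}

# With the notebook's client, the generated text lives in different fields:
#   client.chat.completions.create(**chat_request).choices[0].message.content
#   client.completions.create(**completion_request).choices[0].text
```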

---

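## Summary

Having completed this notebook, you should have a basic understanding of how to use the OpenAI library to make chat completion requests and parse the model's response. This lays the groundwork for the more advanced topics and prompt engineering that follow.

In the next notebook, we explore how to use LangChain to interact with language models, which offers more flexibility and higher-level features for managing and generating text.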