<a href="https://colab.research.google.com/github/johnathan2012/Programming-iOS-Book-Examples/blob/master/GPT4Dev_chaa.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

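# ChatGPT 開發實戰 (ChatGPT Development in Practice)

These are the example files for the new edition of the book 《ChatGPT 開發實戰》 from Flag Technology (旗標科技), adapted for the Azure OpenAI API.

When creating the resource, choose the Sweden Central region, which offers the [widest selection of models](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models).

## Calling the API from Python

OpenAI provides an official openai package, which is simpler than calling the API directly with the requests module.

### Using the official openai package

#### Installing and using the openai **package**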

In [None]:
!pip install gradio rich tiktoken openai

### Setting up secrets in Colab

In [None]:
from google.colab import userdata
import json
import base64

### Using the Azure OpenAI API



In [None]:
from openai import AzureOpenAI

### Creating the Azure OpenAI API client

For the parameters below, refer to the code view in the Playground; the endpoint here is what the Playground calls api_base.

- api_version: see [here](https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#rest-api-versioning)
- endpoint: see [here](https://learn.microsoft.com/en-us/azure/cognitive-services/openai/how-to/create-resource?pivots=web-portal#create-a-resource)

In [None]:
client = AzureOpenAI(
    api_version='2023-07-01-preview',
    # api_version='2023-12-01-preview',
    azure_endpoint='https://f4762-api.openai.azure.com/',
    api_key=userdata.get('AZURE_OPENAI_KEY')
)

### Testing the Chat Completions API

In [None]:
reply = client.chat.completions.create(
    model='gpt41106', # deployment name
    messages=[
        {
            "role": "user",
            "content": "你好",
        },
    ],
)

Inspect the returned object

In [None]:
print(reply)

In [None]:
from rich import print as pprint
pprint(reply)

In [None]:
print(reply.choices[0].message.content)

#### Calling the API directly through the module

In [None]:
# Azure OpenAI may not work through the module-level interface;
# setting api_type to 'azure' is an untested assumption based on the openai v1 package.
import openai
openai.api_type = 'azure'
openai.api_key = userdata.get('AZURE_OPENAI_KEY')
openai.api_version = '2023-07-01-preview'
openai.azure_endpoint = 'https://swedencentralflag.openai.azure.com/'

reply = openai.chat.completions.create(
    model='gpt351106', # deployment name
    # model = "gpt-4",
    messages = [
        {"role":"user", "content": "你好"}
    ]
)

print(reply.choices[0].message.content)

In [None]:
pprint(reply)

#### Converting to a Python dict

In [None]:
pprint(reply.model_dump())

#### Passing multiple messages

In [None]:
reply = client.chat.completions.create(
    model = "gpt351106",
    messages = [
        {"role":"system", "content":"你是條住在深海、只會台灣中文的魚"},
        {"role":"user", "content": "你住的地方很亮嗎？"}
    ]
)

In [None]:
print(reply.choices[0].message.content)

## Understanding tokens

### Token visualization tool

OpenAI's official [tokenizer tool](https://platform.openai.com/tokenizer).

### Counting exact tokens with the tiktoken package

In [None]:
# !pip install tiktoken
import tiktoken

In [None]:
encoder = tiktoken.encoding_for_model('gpt-3.5-turbo')
print(encoder.name)
encoder = tiktoken.encoding_for_model('gpt-4')
print(encoder.name)

In [None]:
tokens = encoder.encode("你好")
print(tokens)

In [None]:
print(encoder.decode(tokens))

### The ChatML markup language

In [None]:
print(encoder.encode("user"))
print(encoder.encode("assistant"))
print(encoder.encode("system"))
print(encoder.encode("\n"))

Count the total number of tokens in the messages

In [None]:
def tokens_in_messages(messages):
    totals = 0
    for message in messages:
        for k in message:
            if k == "content":
                totals += 4 # <|im_start|>user\n{content}<|im_end|>
                totals += len(encoder.encode(message[k]))
    totals += 3 # <|im_start|>assistant\n
    return totals

In [None]:
print(tokens_in_messages([
        {"role":"user", "content": "你好"}
    ]))
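
The arithmetic in `tokens_in_messages` can be checked by hand. The sketch below repeats the same formula on pre-encoded token lists so that it runs without `tiktoken`. The helper name `count_chat_tokens` is hypothetical, and the two token ids for "你好" are assumed to be the ones that appear in the logit_bias cells of this notebook.

```python
# Overhead values mirror tokens_in_messages above.
PER_MESSAGE_OVERHEAD = 4   # <|im_start|>{role}\n ... <|im_end|>
REPLY_PRIMING = 3          # <|im_start|>assistant\n

def count_chat_tokens(encoded_contents):
    """encoded_contents holds one list of token ids per message."""
    total = sum(PER_MESSAGE_OVERHEAD + len(tokens) for tokens in encoded_contents)
    return total + REPLY_PRIMING

hello_tokens = [57668, 53901]   # assumed cl100k_base ids for "你好"
single_message_total = count_chat_tokens([hello_tokens])
print(single_message_total)     # 4 + 2 + 3 = 9
```

Comparing this figure with `reply.usage.prompt_tokens` from a real call is a quick way to confirm that the overhead constants still match the deployed model.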

## A closer look at the parameters

### Controlling the number of messages and tokens generated

#### Number of messages to generate - n

In [None]:
reply = client.chat.completions.create(
  model="gpt351106",
  messages=[{"role": "user", "content": "你好"}],
  n=2
)

pprint(reply)

for choice in reply.choices:
    print(choice.index,
          choice.message.content)

#### Setting stop sequences - stop

In [None]:
reply = client.chat.completions.create(
  model="gpt351106",
  messages=[{"role": "user", "content": "你好"}],
  stop=['好'] # up to 4 sequences
)

print(reply.choices[0].message.content)
print(reply.choices[0].finish_reason)

#### Capping the number of reply tokens - max_tokens

In [None]:
reply = client.chat.completions.create(
    model = "gpt351106",
    messages = [
        {"role":"user", "content": "您好!"}
    ],
    max_tokens = 5
)

print(reply.choices[0].message.content)
print(reply.choices[0].finish_reason)
print(reply.usage.completion_tokens)

In [None]:
encoder.encode("您好！有什")

Exceeding the model's token limit

In [None]:
reply = client.chat.completions.create(
    # The 0613 model is used here to keep token costs down
    model = "gpt350613",
    messages = [
        {"role":"user", "content": "你好"}
    ],
    max_tokens = 4090
)

### Controlling variation in replies

#### Making replies more varied - temperature

In [None]:
reply = client.chat.completions.create(
  model="gpt350613",
  messages=[{"role": "user", "content": "嗨！"}],
  temperature=0,
  n=2
)

for choice in reply.choices:
    print(choice.index, choice.message.content)

In [None]:
reply = client.chat.completions.create(
  model="gpt350613",
  messages=[{"role": "user", "content": "嗨！"}],
  temperature=2,
  n=2,
  max_tokens=400
)

for choice in reply.choices:
    print(choice.index, choice.message.content)
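
The two cells above contrast `temperature=0` with `temperature=2`. How the service samples internally is not shown in this notebook; the sketch below is only the textbook softmax-with-temperature formula, applied to made-up scores, to illustrate why a low temperature gives near-identical replies and a high one gives erratic ones. The function name and the logits are hypothetical.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; temperature 0 is treated as greedy."""
    if temperature == 0:
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                       # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                     # made-up scores for three candidate tokens
for t in (0, 0.5, 1, 2):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

As the temperature rises, the probabilities move closer together, so less likely tokens are picked more often.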

#### Controlling vocabulary diversity - top_p

In [None]:
reply = client.chat.completions.create(
  model="gpt350613",
  messages=[{"role": "user", "content": "嗨！"}],
  top_p=0,
  n=2
)

for choice in reply.choices:
    print(choice.index, choice.message.content)

#### Controlling repetition - presence_penalty and frequency_penalty

In [None]:
reply = client.chat.completions.create(
  model="gpt350613",
  messages=[{
    "role": "user",
    "content": "台北是什麼樣的城市？"
  }],
  temperature=1,
  presence_penalty=2,
  max_tokens=400
)

print(reply.choices[0].message.content)

In [None]:
reply = client.chat.completions.create(
  model="gpt350613",
  messages=[{
    "role": "user",
    "content": "台北是什麼樣的城市？"
  }],
  temperature=1,
  presence_penalty=-2,
  max_tokens=400
)

print(reply.choices[0].message.content)

In [None]:
reply = client.chat.completions.create(
  model="gpt350613",
  messages=[{
      "role": "user",
      "content": "台北是什麼樣的城市？"}],
  temperature=1, # a fixed temperature makes the runs easier to compare
  frequency_penalty=2,
  max_tokens=400
)

print(reply.choices[0].message.content)

In [None]:
reply = client.chat.completions.create(
  model="gpt350613",
  messages=[{
      "role": "user",
      "content": "台北是什麼樣的城市？"}],
  temperature=1, # a fixed temperature makes the runs easier to compare
  frequency_penalty=-2,
  max_tokens=400
)

print(reply.choices[0].message.content)

#### Adjusting the score of specific tokens - logit_bias


In [None]:
encoder.encode('你好')

In [None]:
reply = client.chat.completions.create(
  model="gpt350613",
  messages=[{"role": "user", "content": "你好"}],
  temperature=1,
  logit_bias={
      53901: -100,
      57668: -100
  },
)

print(reply.choices[0].message.content)
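
Writing token ids by hand is error-prone. A small helper can build the `logit_bias` mapping from a token list; the sketch below uses a hypothetical function name and the ids from the cell above, and in the notebook the list would come from `encoder.encode('你好')`.

```python
def make_logit_bias(token_ids, bias):
    """Map every token id to the same bias; the API accepts values from -100 to 100."""
    if not -100 <= bias <= 100:
        raise ValueError("bias must be between -100 and 100")
    return {token_id: bias for token_id in token_ids}

banned = make_logit_bias([57668, 53901], -100)   # ids taken from the cell above
print(banned)
```

The resulting dict can be passed as `logit_bias=banned` in the call to `client.chat.completions.create`.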

In [None]:
encoder.encode('哈')

In [None]:
reply = client.chat.completions.create(
  model="gpt350613",
  messages=[{"role": "user", "content": "你好"}],
  temperature=1,
  logit_bias={
      99771: 100
  },
  max_tokens=400
)

print(reply.choices[0].message.content)

### Recognizing images

The gpt-4-vision-preview model is available to paying users only.

#### Recognizing a public image on the web

In [None]:
response = client.chat.completions.create(
    model="gpt4vision",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "圖片裡有什麼？"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://flagtech.github.io/F3762/images/cat1.jpg",
                        'detail': 'high'
                    },
                },
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0].message.content)

#### Recognizing a local image file

In [None]:
!curl "https://flagtech.github.io/F3762/images/cat2.jpg" -o cat3.jpg

In [None]:
# Function to encode the image
def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')
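
The next cell hardcodes `image/jpeg` in the data URL. For files of other types, one option is to guess the MIME type from the file name. This is a minimal sketch using only the standard library; `to_data_url` is a hypothetical helper, and the three sample bytes merely stand in for a real file.

```python
import base64
import mimetypes

def to_data_url(image_bytes, filename):
    """Build a data: URL, guessing the MIME type from the file name."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"cannot determine an image type for {filename!r}")
    payload = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{mime};base64,{payload}"

sample = to_data_url(b"\xff\xd8\xff", "cat3.jpg")  # three bytes standing in for a real file
print(sample)
```

The returned string can be used as the `url` value of an `image_url` content item.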

In [None]:
base64_image = encode_image('cat3.jpg')

response = client.chat.completions.create(
    model="gpt4vision",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "用中文告訴我圖片裡有什麼？"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{base64_image}",
                        'detail': 'high'
                    },
                },
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0].message.content)

### Streaming output

#### A generator that returns results incrementally - stream


In [None]:
replies = client.chat.completions.create(
  model="gpt41106",
  messages=[{
    "role": "user",
    "content": "你好"
  }],
  stream=True,
)

for reply in replies:
    pprint(reply)

In [None]:
replies = client.chat.completions.create(
  model="gpt351106",
  messages=[{
    "role": "user",
    "content": "台北是什麼樣的城市？"
  }],
  stream=True
)

for reply in replies:
    if reply.choices:
        print(reply.choices[0].delta.content or '', end='')
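
Besides printing fragments as they arrive, it is often useful to keep the full text. The sketch below joins the deltas of a streamed reply and skips chunks whose `choices` list is empty. The `fake_chunk` objects are hypothetical stand-ins shaped like the SDK's chunk objects so that the example runs without calling the API.

```python
from types import SimpleNamespace

def collect_stream(chunks):
    """Join the text deltas of a streamed reply, skipping chunks without choices."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue
        parts.append(chunk.choices[0].delta.content or "")
    return "".join(parts)

def fake_chunk(text=None, empty=False):
    """Hypothetical stand-in shaped like a streamed chunk object."""
    if empty:
        return SimpleNamespace(choices=[])
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

demo = [fake_chunk(empty=True), fake_chunk("Hel"), fake_chunk("lo"), fake_chunk(None)]
full_text = collect_stream(demo)
print(full_text)
```

With a real stream, `collect_stream(replies)` would replace the `for` loop in the cell above.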

### Controlling the reply format

#### Forcing JSON output

In [None]:
reply = client.chat.completions.create(
    model = "gpt351106",
    # model = "gpt-4",
    messages = [
        {"role":"user", "content": "台灣最高的山高度是多少"}
    ]
)

print(reply.choices[0].message.content)

In [None]:
reply = client.chat.completions.create(
    # response_format requires an 1106 model
    model = "gpt351106",
    messages = [
        {"role":"user", "content": "台灣最高的山高度是多少"},
        {"role":"system", "content": "請用 json 格式回覆"}
    ],
    response_format={'type': 'json_object'} # or 'text'
)

print(reply.choices[0].message.content)

In [None]:
reply = client.chat.completions.create(
    model = "gpt351106",
    messages = [
        {"role":"system", "content": "請用 json 格式回覆"},
        {
            "role":"user",
            "content": "台灣最高的山高度是多少, 請以如下格式回覆："
                       '{"name":"山的名稱", "height":高度}'
        },
    ],
    response_format={'type': 'json_object'} # or 'text'
)

print(reply.choices[0].message.content)

#### Making the output reproducible

In [None]:
replies = client.chat.completions.create(
  model="gpt350613",
  messages=[{
    "role": "user",
    "content": "你好"
  }],
  temperature=1.6,
  # The same seed with the same parameters keeps the output the same
#   seed=1
)

print(
    replies.system_fingerprint,
    replies.choices[0].message.content)

In [None]:
replies = client.chat.completions.create(
  model="gpt351106",
  messages=[{
    "role": "user",
    "content": "你好"
  }],
  temperature=1.6,
  seed=1
)

print(
    replies.system_fingerprint,
    replies.choices[0].message.content)

## Getting the raw underlying HTTP response

In [None]:
# Get the raw HTTP response
reply = client.chat.completions.with_raw_response.create(
    model = "gpt351106",
    # model = "gpt-4",
    messages = [
        {"role":"user", "content": "你好"}
    ]
)

In [None]:
import json
print(reply.status_code)
print(reply) # an object of type APIResponse
print('------')
print(reply.text) # JSON-formatted text
print('------')
reply_dic = json.loads(reply.text) # convert to a Python dict
pprint(reply_dic)

In [None]:
print(reply_dic['choices'][0]['message']['content'])

## Error handling and usage limits

In [None]:
'''
Exception
  +--OpenAIError
       +--APIError ◆ message: str
            |      ◆ request: httpx.Request
            +--APIResponseValidationError ◆ response: httpx.Response
            |                             ◆ status_code: int
            +--APIStatusError ◆ response: httpx.Response
            |    |            ◆ status_code: int
            |    +--BadRequestError (malformed request or invalid parameters)
            |    +--AuthenticationError (problem with the API key)
            |    +--PermissionDeniedError
            |    +--NotFoundError
            |    +--ConflictError
            |    +--UnprocessableEntityError
            |    +--RateLimitError (rate limit exceeded)
            |    +--InternalServerError
            +--APIConnectionError (cannot connect)
                 +--APITimeoutError (timed out)
'''

### Handling errors with exceptions

In [None]:
import openai
try:
    reply = client.chat.completions.create(
        model = "gpt350613", # the 0613 model has a smaller limit, so less is wasted
        messages = [
            {"role":"user", "content": "你好"}
        ],
        max_tokens = 4096
    )
    print(reply.choices[0].message.content)

except openai.APIError as err:
    print(err.message)
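
Some errors in the hierarchy above, such as RateLimitError, are worth retrying after a pause. The sketch below is a generic retry helper with exponential backoff. `call_with_retry` and `FakeRateLimit` are hypothetical names; the fake exception stands in for `openai.RateLimitError` so the example runs offline, and the `sleep` parameter exists only so the demo does not actually wait.

```python
import time

def call_with_retry(func, retry_on, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on a retry_on exception wait base_delay * 2**n and try again."""
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise                      # out of attempts: let the caller handle it
            sleep(base_delay * 2 ** attempt)

class FakeRateLimit(Exception):
    """Stand-in for openai.RateLimitError so the sketch runs offline."""

calls = []
def flaky():
    calls.append(1)
    if len(calls) < 3:                     # fail twice, then succeed
        raise FakeRateLimit()
    return "ok"

delays = []
result = call_with_retry(flaky, FakeRateLimit, sleep=delays.append)
print(result, delays)
```

In the notebook, the intended use would be `call_with_retry(lambda: client.chat.completions.create(...), openai.RateLimitError)`. The `openai` client also accepts a `max_retries` setting, which may already be enough for simple cases.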

## A simple text-mode chat program

Design a simple chat program

In [None]:
def get_reply(messages):
    try:
        response = client.chat.completions.create(
            model = "gpt351106",
            messages = messages
        )
        reply = response.choices[0].message.content
    except openai.APIError as err:
        reply = f"發生錯誤\n{err.message}"
    return reply

In [None]:
while True:
    msg = input("你說：")
    if not msg.strip(): break
    messages = [{"role":"user", "content":msg}]
    reply = get_reply(messages)
    print(f"ㄟ唉：{reply}\n")

### Adding chat history to keep the conversation in context

Add the history to the prompt

In [None]:
hist = []       # conversation history
backtrace = 2   # number of exchanges to keep

def chat(sys_msg, user_msg):
    global hist
    hist.append({"role":"user", "content":user_msg})
    reply = get_reply(hist
                      + [{"role":"system", "content":sys_msg}])
    hist.append({"role":"assistant", "content":reply})
    hist = hist[-2 * backtrace:] # keep only the newest exchanges
    return reply
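
The slice `hist[-2 * backtrace:]` keeps the last `backtrace` user/assistant pairs. The sketch below isolates that step in a hypothetical helper, `trim_history`, and runs it on made-up turns. One detail worth noting: with `backtrace` set to 0 the bare slice `hist[-0:]` returns the whole list, so the helper handles that case explicitly.

```python
def trim_history(history, backtrace):
    """Keep the newest `backtrace` user/assistant pairs (2 messages per pair)."""
    return history[-2 * backtrace:] if backtrace > 0 else []

demo = []
for turn in range(1, 4):  # three made-up turns
    demo.append({"role": "user", "content": f"question {turn}"})
    demo.append({"role": "assistant", "content": f"answer {turn}"})
    demo = trim_history(demo, backtrace=2)

print([m["content"] for m in demo])
```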

In [None]:
sys_msg = input("你希望ㄟ唉扮演：")
if not sys_msg.strip(): sys_msg = '繁體中文小助理'
print()
while True:
    msg = input("你說：")
    if not msg.strip(): break
    reply = chat(sys_msg, msg)
    print(f"{sys_msg}:{reply}\n")
hist = [] # 清除對話記錄以免影響後續測試

### Streaming version of the chat program

A streaming version of get_reply():

In [None]:
def get_reply_s(messages):
    try:
        response = client.chat.completions.create(
            model = "gpt351106",
            messages = messages,
            stream = True
        )
        for chunk in response:
            if chunk.choices: # skip the first chunk, which carries only content-filter data
                yield chunk.choices[0].delta.content or ''
    except openai.APIError as err:
        yield f"發生錯誤\n{err.message}" # yield the error text so the caller sees it

In [None]:
for reply in get_reply_s([{
    "role":"user",
    "content":"請介紹台北市"
}]):
    print(reply, end='')
print('')

In [None]:
hist = []       # conversation history
backtrace = 2   # number of exchanges to keep

def chat_s(sys_msg, user_msg):
    global hist
    hist.append({"role":"user", "content":user_msg})
    reply_full = ""
    for reply in get_reply_s(         # use the streaming function
        hist + [{"role":"system", "content":sys_msg}]):
        reply_full += reply           # accumulate the text received so far
        yield reply                   # pass along the fragment just received
    hist.append({"role":"assistant", "content":reply_full})
    hist = hist[-2 * backtrace:]      # keep only the newest exchanges

In [None]:
sys_msg = input("你希望ㄟ唉扮演：")
if not sys_msg.strip(): sys_msg = '小助理'
print()
while True:
    msg = input("你說：")
    if not msg.strip(): break
    print(f"{sys_msg}：", end = "")
    for reply in chat_s(sys_msg, msg):
        print(reply, end = "")
    print('\n')
    # pprint(hist)
hist = []

## Beyond the limits of time and place: integrating search

### Supplementing the AI's knowledge with web search

### Using Google search

In [None]:
!pip install googlesearch-python
from googlesearch import search

In [None]:
for item in search("2023 金曲獎歌后"):
    print(item)

Using advanced search options

In [None]:
for item in search(
    "2023 金曲獎歌后", advanced=True, num_results=3):
    print(item.title)
    print(item.description)
    print(item.url)
    print()

### Integrating search results to keep the AI up to date

A chat program with web search

In [None]:
hist = []       # conversation history
backtrace = 2   # number of exchanges to keep

def chat_w(sys_msg, user_msg):
    global hist
    web_res = []
    if user_msg[:3].lower() == '/w ': # /w means search the web
        user_msg = user_msg[3:]       # strip the command, keep the actual message
        content = "以下為已發生的事實：\n"
        for res in search(user_msg, advanced=True,
                          num_results=5, lang='zh-TW'):
            content += f"標題：{res.title}\n" \
                       f"摘要：{res.description}\n\n"
        content += "請依照上述事實回答以下問題：\n"
        web_res = [{"role": "user", "content": content}]
    web_res.append({"role": "user", "content": user_msg})
    reply_full = ""
    for reply in get_reply_s(         # use the streaming function
        hist                          # history first
        + web_res                     # then the search results and the current message
        + [{"role": "system", "content": sys_msg}]):
        reply_full += reply           # accumulate the text received so far
        yield reply                   # pass along the fragment just received
    hist.append({"role": "user", "content": user_msg})
    hist.append({"role":"assistant", "content":reply_full})
    hist = hist[-2 * backtrace:]      # keep only the newest exchanges

In [None]:
sys_msg = input("你希望ㄟ唉扮演：")
if not sys_msg.strip(): sys_msg = '使用繁體中文的小助理'
print()
while True:
    msg = input("你說：")
    if not msg.strip(): break
    print(f"{sys_msg}：", end = "")
    for reply in chat_w(sys_msg, msg):
        print(reply, end = "")
    print('\n')
hist = []

### Using a custom module

In [None]:
!git clone https://github.com/codemee/customsearchapi.git customsearchapi

In [None]:
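# By default the module reads your API key and search engine ID from the
# environment variables GOOGLE_API_KEY and GOOGLE_CSE_ID at import time.
# If they are not set, you can assign the module's variables directly: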
import customsearchapi
customsearchapi.GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')
customsearchapi.GOOGLE_CSE_ID = userdata.get('GOOGLE_CSE_ID')

In [None]:
from customsearchapi import search

for item in search("2023 NBA 冠軍",
                   advanced=True,
                   num_results=3,
                   lang='zh-TW'):
    print(item.url)
    print(item.title)
    print(item.description)
    print()

## AI helping AI: chaining steps automatically

### Lessons from ChatGPT plugins

In [None]:
def get_reply_g(messages, stream=False, json_format=False):
    try:
        json_msg = ([{'role': 'system', 'content': '請用 JSON 回覆'}]
                    if json_format else [])
        response = client.chat.completions.create(
            model = "gpt351106",
            messages = messages + json_msg,
            stream = stream,
            response_format = {
                'type': "json_object" if json_format else 'text'
            }
        )
        if stream: # in streaming mode, yield fragments from a generator
            for res in response:
                if res.choices:
                    yield res.choices[0].delta.content or ''
        else:      # in non-streaming mode the full reply text is available directly
            yield response.choices[0].message.content
    except openai.APIError as err:
        reply = f"發生錯誤\n{err.message}"
        print(reply)
        yield reply

In [None]:
# Test non-streaming mode
for reply in get_reply_g([{'role':'user', 'content':'你好'}]):
    print(reply)

In [None]:
# Test streaming mode
for msg in get_reply_g([{'role':'user', 'content':'你好'}], stream=True):
    print(msg, end='')

In [None]:
# Test JSON output
for reply in get_reply_g(
    [{'role':'user', 'content':'你好'}],
    json_format=True
):
    print(reply)

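### Letting the AI decide what extra work is needed

#### Writing a helper that decides whether a search is needed

In [None]:
# Template for asking whether a search is needed to answer the question.
# It asks the AI to reply in JSON with Y/N and a suggested search keyword.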
template_google = '''
如果我想知道以下這件事, 請確認是否需要網路搜尋才做得到？

```
{}
```

如果需要, 請以下列 JSON 格式回答我, 除了 JSON 格式資料外,
不要加上額外資訊, 就算你知道答案, 也不要回覆：

```
{{
    "search":"Y",
    "keyword":"你建議的搜尋關鍵字"
}}
```
如果不需要, 請以下列 JSON 格式回答我：

```
{{
    "search":"N",
    "keyword":""
}}
```
'''

In [None]:
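# Use the current history and the template to ask whether a search is needed.
# If so, also obtain the search keyword the AI recommends.
def check_google(hist, msg, verbose=False):
    reply = get_reply_g(
        hist + [{  # include the history so the AI can suggest the right keyword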
            "role": "user",
            "content": template_google.format(msg)
        }], json_format=True)
    for ans in reply:pass
    if verbose: print(ans)
    return ans
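
`check_google` returns the model's reply as a JSON string, and later code passes it straight to `json.loads`. Even with `response_format` set to JSON, the keys may be missing or have unexpected values. The sketch below is one defensive way to read the reply; `parse_search_decision` is a hypothetical helper, and falling back to "no search" is an assumption rather than the book's approach.

```python
import json

def parse_search_decision(raw):
    """Return (need_search, keyword); fall back to no search on malformed replies."""
    try:
        data = json.loads(raw)
    except (TypeError, json.JSONDecodeError):
        return False, ""
    if not isinstance(data, dict):
        return False, ""
    need_search = str(data.get("search", "N")).upper() == "Y"
    keyword = str(data.get("keyword", "")) if need_search else ""
    return need_search, keyword

print(parse_search_decision('{"search":"Y","keyword":"2023 NBA champion"}'))
print(parse_search_decision("not json"))
```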

In [None]:
# Test a case that needs a search
ans = check_google(
    [], '2023 NBA 冠軍是哪一隊？', True
)
# Test a case that may not need a search
ans = check_google(
    [], '新冠疫情是哪一年開始的？', True
)
# Test a case with no prior context
ans = check_google(
    [], '那台灣呢？', True
)
# Test a case with prior context
ans = check_google(
    [{'role':'assistant', 'content': '印度空污好嚴重'}],
    '那台灣呢？', True
)

In [None]:
def google_res(user_msg, num_results=5, verbose=False):
    content = "以下為已發生的事實：\n"                # stress that the data is trustworthy
    for res in search(user_msg, advanced=True,    # append each search result in turn
                      num_results=num_results,
                      lang='zh-TW'):
        content += f"標題：{res.title}\n" \
                    f"摘要：{res.description}\n\n"
    # content += "請依照上述事實回答以下問題：\n"        # give an explicit instruction
    if verbose:
        print('------------')
        print(content)
        print('------------')
    return content

In [None]:
res = google_res('2023 NBA 冠軍隊', 2, verbose=True)

### A chat program that decides by itself whether to search the web

In [None]:
import json
hist = []       # conversation history
backtrace = 2   # number of exchanges to keep

def chat_g(sys_msg, user_msg, stream=False, verbose=False):
    global hist
    messages = [{'role':'user', 'content':user_msg}]
    ans = json.loads(check_google(hist, user_msg,
                                  verbose=verbose))
    if ans['search'] == 'Y':
        print(f'嘗試透過網路搜尋：{ans["keyword"]}....')
        res = google_res(ans['keyword'], verbose=verbose)
        messages = [{'role':'user', 'content': res + user_msg}]

    replies = get_reply_g(            # use the search-enabled function
        hist        # history first
        + messages  # then the search results and the current message
        + [{"role": "system", "content": sys_msg}],
        stream)
    reply_full = ''
    for reply in replies:
        reply_full += reply
        yield reply

    hist.append({"role":"user", "content":user_msg})
    hist.append({"role":"assistant", "content":reply_full})
    hist = hist[-2 * backtrace:] # keep only the newest exchanges

In [None]:
sys_msg = input("你希望ㄟ唉扮演：")
if not sys_msg.strip(): sys_msg = '使用繁體中文的小助理'
print()

while True:
    msg = input("你說：")
    if not msg.strip(): break
    print(f"{sys_msg}：", end = "")
    # for...in works with both strings and generators
    for reply in chat_g(sys_msg, msg, stream=False):
        print(reply, end = "")
    print('\n')
hist = []

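## The function calling mechanism for building plugin systems

**Function calling**

With function calling, we supply the specifications of the available functions and the AI decides whether one of them needs to be called.

### Telling the language model which external tool functions are available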

In [None]:
response = client.chat.completions.create(
    model = "gpt41106",
    messages = [{"role":"user", "content":"2023 金曲歌后？"}],
    tools = [{ # list of available functions
        "type":"function",
        "function": {
            "name": "google_res",                     # function name
            "description": "取得 Google 搜尋結果",      # function description
            "parameters": {
                "type": "object",
                "properties": {
                    "user_msg": {                     # parameter name
                        "type": "string",             # data type
                        "description": "要搜尋的關鍵字", # parameter description
                    }
                },
                "required": ["user_msg"],             # required parameters
            },
        }
    }],
    tool_choice = "auto")       # let the AI decide whether to call a function

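If the API decides that a function you described should be called, the tool_calls item in the reply gives the function name and the argument values.

### Getting the language model's suggestion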

In [None]:
pprint(response)

In [None]:
tool_call = response.choices[0].message.tool_calls[0]
func_name = tool_call.function.name
import json
args = json.loads(tool_call.function.arguments)
arg_val = args.popitem()[1]
print(f'{func_name}("{arg_val}")')

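### Running the function and returning the result

You must call the function yourself and return its result in a message with the tool role.

Note that the request must also include the message the model originally sent, the one containing tool_calls. For compatibility, that message carries a function_call field set to None alongside tool_calls. In a request, however, that field plays the role of tool_choice, and the API rejects it as an invalid parameter if it is sent back. The current workaround is the custom function make_tool_back_msg, which builds a message with the unneeded function_call field filtered out.

In [None]:
# Filter the function_call field out of a message
def make_tool_back_msg(tool_msg):
    msg_json = tool_msg.model_dump()
    msg_json.pop('function_call', None) # tolerate a missing field
    return msg_json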

In [None]:
response = client.chat.completions.create(
    model='gpt41106',
    messages=[
        {"role":"user", "content":"2023 金曲歌后？"},
        # send back the function calling result the AI gave us
        make_tool_back_msg(response.choices[0].message),
        {   # return the result with the tool role; name holds the function name
            "tool_call_id": tool_call.id, # id of the function call
            "role": "tool", # reply with the tool role
            "name": func_name, # name of the called function
            "content": eval(f'{func_name}("{arg_val}")') # function return value
        }
    ]
)

In [None]:
print(response.choices[0].message.content)
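
The cell above builds a Python expression as a string and runs it with `eval`. That breaks when the argument contains a double quote, and it executes text that came from the model. A dictionary lookup avoids both problems; the `call_tools` function later in this notebook takes a similar route. In the sketch below, `fake_search`, `AVAILABLE_FUNCS`, and `run_tool` are hypothetical names, and `fake_search` stands in for `google_res` so that the example runs offline.

```python
import json

def fake_search(user_msg):
    """Hypothetical stand-in for google_res so the sketch runs offline."""
    return f"results for {user_msg}"

AVAILABLE_FUNCS = {"google_res": fake_search}   # name -> callable

def run_tool(name, arguments_json):
    """Look the function up by name and pass the decoded arguments as keywords."""
    if name not in AVAILABLE_FUNCS:
        raise KeyError(f"unknown tool: {name}")
    kwargs = json.loads(arguments_json)
    return AVAILABLE_FUNCS[name](**kwargs)

# The argument contains double quotes, which the eval-based version cannot handle.
print(run_tool("google_res", '{"user_msg": "He said \\"hi\\""}'))
```

With the real objects, the call would be `run_tool(tool_call.function.name, tool_call.function.arguments)`.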


Models released after 2023/11/06 can request several function calls in a single turn:

In [None]:
response = client.chat.completions.create(
    model = "gpt41106",
    messages = [{"role":"user", "content":"2023 金馬獎影后和金曲獎歌王各是誰？"}],
    tools = [{ # list of available functions
        "type":"function",
        "function": {
            "name": "google_res",                     # function name
            "description": "取得 Google 搜尋結果",      # function description
            "parameters": {
                "type": "object",
                "properties": {
                    "user_msg": {                     # parameter name
                        "type": "string",             # data type
                        "description": "要搜尋的關鍵字", # parameter description
                    }
                },
                "required": ["user_msg"],             # required parameters
            },
        }
    }],
    tool_choice = "auto")       # let the AI decide whether to call a function

In [None]:
pprint(response)

In [None]:
for tool_call in response.choices[0].message.tool_calls:
    func = tool_call.function
    func_name = func.name
    args_val = json.loads(func.arguments).popitem()[1]
    print(f'{func.name}("{args_val}")')

In [None]:
def make_func_messages(tool_calls):
    messages = []
    for tool_call in tool_calls:
        func = tool_call.function
        func_name = func.name
        args_val = json.loads(func.arguments).popitem()[1]
        print(f'{func.name}("{args_val}")')
        messages.append({
            "tool_call_id": tool_call.id, # id of the function call
            "role": "tool", # reply with the tool role
            "name": func.name, # name of the called function
            "content": eval(f'{func_name}("{args_val}")') # function return value
        })
    return messages

func_messages = make_func_messages(response.choices[0].message.tool_calls)
pprint(func_messages)

In [None]:
response = client.chat.completions.create(
    model='gpt41106',
    messages=[
        {"role":"user", "content":"2023 金馬獎影后和金曲獎歌王各是誰？"},
        # send back the function calling result the AI gave us
        make_tool_back_msg(response.choices[0].message),
    ] + func_messages
)

In [None]:
print(response.choices[0].message.content)

### Using function calling with streaming

In [None]:
response = client.chat.completions.create(
    # model = "gpt41106",
    model = "gpt351106",
    messages = [{"role":"user", "content":"宮崎駿和是枝裕和的最新作品各是哪一部？"}],
    tools = [{
        "type": "function",                           # tool type
        "function": {
            "name": "google_res",                     # function name
            "description": "取得 Google 搜尋結果",      # function description
            "parameters": {
                "type": "object",
                "properties": {
                    "user_msg": {                     # parameter name
                        "type": "string",
                        "description": "要搜尋的關鍵字", # parameter description
                    }
                },
                "required": ["user_msg"],
            },
        }
    }],
    tool_choice = "auto", # let the AI decide whether to use a tool
    stream=True
)

The result is again an iterable object.

Note that with the 1106 models the first chunk has no function name; the function calling data starts from the second chunk.

In [None]:
for chunk in response:
    pprint(chunk)

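## Building an API plugin system

### Building a reference table of external tool functions

Build a plugin mechanism on top of function calling.<br>
Build a structured table of functions.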

In [None]:
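tools_table = [             # table of available tools
    {                       # each element describes one tool
        "chain": True,      # whether the tool's result is sent back to the API
        "func": google_res, # the function behind the tool
        "spec": {           # tool specification required by function calling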
            "type": "function",
            "function": {
                "name": "google_res",
                "description": "取得 Google 搜尋結果",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "user_msg": {
                            "type": "string",
                            "description": "要搜尋的關鍵字",
                        }
                    },
                    "required": ["user_msg"],
                },
            }
        }
    }
]

### Building helper functions for function calling
Automatically call the matching function based on the response:

In [None]:
def call_tools(tool_calls, tools_table):
    res = ''
    msg = []
    for tool_call in tool_calls:
        func = tool_call.function
        func_name = func.name
        args = json.loads(func.arguments)
        for f in tools_table:  # find the entry for this function
            if func_name == f['spec']['function']['name']:
                print(f"嘗試叫用：{func_name}(**{args})")
                val = f['func'](**args)
                if f['chain']: # the result should be sent back to the model
                    msg.append({
                        'tool_call_id': tool_call.id,
                        'role': 'tool',
                        'name': func_name,
                        'content': val
                    })
                else: res += str(val)
                break
    return msg, res

In [None]:
def get_tool_calls(messages, stream=False, tools_table=None,
                  **kwargs):
    model = 'gpt351106' # default model
    if 'model' in kwargs: model = kwargs['model']

    tools = {}
    if tools_table: # add the tool table
        tools = {'tools':[tool['spec'] for tool in tools_table]}

    response = client.chat.completions.create(
        model = model,
        messages = messages,
        stream = stream,
        **tools
    )

    if not stream: # non-streaming mode
        msg = response.choices[0].message
        if msg.content is None: # a function calling reply
            return msg.tool_calls, None # extract the call information
        return None, response # an ordinary reply

    tool_calls = [] # list of functions to call
    prev = None
    for chunk in response:
        if not chunk.choices: continue # skip the first chunk of an Azure stream
        delta = chunk.choices[0].delta
        if delta.content is not None: # an ordinary reply (not function calling)
            return None, response # return the stream as-is
        if delta.tool_calls:      # not a head/tail chunk
            curr = delta.tool_calls[0]
            if curr.function.name:       # a new call begins
                prev = curr              # remember the call being assembled
                tool_calls.append(curr)  # add it to the list
            else: # append to the argument string
                prev.function.arguments += curr.function.arguments
    return tool_calls, None
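
The streaming loop in `get_tool_calls` reads only `tool_calls[0]` from each chunk and appends argument text to the most recent call. Each streamed fragment also carries an `index` field, so an alternative is to group fragments by that index, which would still work if fragments of different calls arrived interleaved (whether that happens in practice is an assumption here). The sketch below uses hypothetical stand-in objects, built by `frag`, in place of real stream deltas.

```python
from types import SimpleNamespace as NS

def merge_tool_call_deltas(deltas):
    """Combine streamed fragments into complete calls, keyed by their index field."""
    calls = {}
    for d in deltas:
        call = calls.setdefault(d.index, {"id": None, "name": "", "arguments": ""})
        if d.id:
            call["id"] = d.id
        if d.function.name:
            call["name"] = d.function.name
        call["arguments"] += d.function.arguments or ""
    return [calls[i] for i in sorted(calls)]

def frag(index, id=None, name=None, arguments=None):
    """Hypothetical fragment shaped like a streamed tool-call delta."""
    return NS(index=index, id=id, function=NS(name=name, arguments=arguments))

stream = [
    frag(0, id="call_a", name="google_res", arguments=""),
    frag(0, arguments='{"user_msg":'),
    frag(1, id="call_b", name="google_res", arguments=""),
    frag(0, arguments=' "A"}'),
    frag(1, arguments='{"user_msg": "B"}'),
]
merged = merge_tool_call_deltas(stream)
print(merged)
```

Each merged entry holds the id, function name, and full argument string needed to run the tool and build the follow-up tool message.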

In [None]:
pprint(get_tool_calls(
    messages = [{'role':'user', 'content':'2023 金曲歌王是哪位？'}]
))

In [None]:
tool_calls, response = get_tool_calls(
    messages = [{'role':'user', 'content':'2023 金曲歌王是哪位？'}],
    stream=True
)
for chunk in response:
    print(chunk.choices[0].delta.content or '', end='')

In [None]:
tool_calls, response = get_tool_calls(
    messages = [{'role':'user', 'content':'2023 金曲歌王是哪位？'}],
    tools_table=tools_table
)
pprint(tool_calls)

In [None]:
tool_calls, response = get_tool_calls(
    messages = [{'role':'user', 'content':'2023 金曲歌王是哪位？'}],
    stream=True,
    tools_table=tools_table
)
pprint(tool_calls)

In [None]:
tool_calls, response = get_tool_calls(
    messages = [{'role':'user', 'content':'宮崎駿和是枝裕和的最新作品各是哪一部？'}],
    stream=True,
    tools_table=tools_table,
    # model='gpt41106'
)
pprint(tool_calls)

### Building get_reply_f(), a function calling version of get_reply

In [None]:
def get_reply_f(messages, stream=False, tools_table=None, **kwargs):
    try:
        tool_calls, response = get_tool_calls(messages,
                                            stream, tools_table, **kwargs)
        if tool_calls:
            tool_messages, res = call_tools(tool_calls, tools_table)
            tool_calls_messages = []
            for tool_call in tool_calls:
                tool_calls_messages.append(tool_call.model_dump())
            if tool_messages:  # the function results go back to the AI before it replies
                messages += [ # the original function calling content must be sent back too
                    {
                        "role": "assistant", "content": None,
                        "tool_calls": tool_calls_messages
                    }]
                messages += tool_messages
                # pprint(messages)
                yield from get_reply_f(messages, stream,
                                       tools_table, **kwargs)
            else:      # chain is False: use the function result as the generated content
                yield res
        elif stream:   # no function call needed, streaming mode
            for chunk in response:
                if chunk.choices: # skip the first chunk of an Azure stream
                    yield chunk.choices[0].delta.content or ''
        else:          # no function call needed, non-streaming mode
            yield response.choices[0].message.content
    except openai.APIError as err:
        reply = f"發生錯誤\n{err.message}"
        print(reply)
        yield reply

In [None]:
# Test function calling without streaming
for chunk in get_reply_f(
    [{"role":"user", "content":"2023 金曲歌后是誰？"}],
    tools_table=tools_table):
    print(chunk)

In [None]:
# Test function calling, streaming
for chunk in get_reply_f(
    [{"role":"user", "content":"Who won Best Female Singer at the 2023 Golden Melody Awards?"}],
    stream=True,
    tools_table=tools_table):
    print(chunk, end='')

In [None]:
# Test without function calling, non-streaming
for chunk in get_reply_f(
    [{"role":"user", "content":"Who won Best Female Singer at the 2023 Golden Melody Awards?"}]):
    print(chunk)

In [None]:
# Test without function calling, streaming
for chunk in get_reply_f(
    [{"role":"user", "content":"Who won Best Female Singer at the 2023 Golden Melody Awards?"}],
    stream=True):
    print(chunk, end='')

In [None]:
# Test function calling, streaming, with a question that needs two lookups
for chunk in get_reply_f(
    [{"role":"user", "content":"What are the latest films by Hayao Miyazaki and Hirokazu Kore-eda?"}],
    stream=True,
    tools_table=tools_table,
    # model='gpt41106'
):
    print(chunk, end='')

### Building the function-calling version: chat_f()

In [None]:
hist = []       # conversation history
backtrace = 2   # number of user/assistant exchanges to keep

def chat_f(sys_msg, user_msg, stream=False, **kwargs):
    global hist

    replies = get_reply_f(    # use the function-calling version
        hist                  # history goes first
        + [{"role": "user", "content": user_msg}]
        + [{"role": "system", "content": sys_msg}],  # system message goes last
        stream, tools_table, **kwargs)
    reply_full = ''
    for reply in replies:
        reply_full += reply
        yield reply

    hist += [{"role":"user", "content":user_msg},
             {"role":"assistant", "content":reply_full}]
    hist = hist[-2 * backtrace:] # keep only the newest exchanges
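
The last line of `chat_f()` keeps only the newest `backtrace` exchanges, where each exchange is two entries (user, then assistant). A minimal sketch of that trimming on its own, with no API call; `trim_history` and `hist_demo` are hypothetical names used only for illustration:

```python
def trim_history(hist, backtrace):
    """Keep only the newest `backtrace` user/assistant pairs."""
    if backtrace <= 0:
        # hist[-0:] is the same as hist[0:], i.e. the whole list,
        # so zero has to be handled explicitly
        return []
    return hist[-2 * backtrace:]

hist_demo = []
for i in range(1, 5):                    # four question/answer rounds
    hist_demo += [{"role": "user", "content": f"question {i}"},
                  {"role": "assistant", "content": f"answer {i}"}]
    hist_demo = trim_history(hist_demo, backtrace=2)

print([m["content"] for m in hist_demo])
# → ['question 3', 'answer 3', 'question 4', 'answer 4']
```

One detail worth knowing: with `backtrace = 0`, the slice `hist[-2 * backtrace:]` keeps the entire history instead of dropping it, which is why the sketch guards that case.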

In [None]:
sys_msg = input("Role for the AI to play: ")
if not sys_msg.strip():
    sys_msg = 'An assistant that replies in Traditional Chinese'
print()
while True:
    msg = input("You: ")
    if not msg.strip():
        break
    print(f"{sys_msg}: ", end = "")
    for reply in chat_f(sys_msg, msg, stream=True):
        print(reply, end = "")
    print('\n')
hist = []  # clear the history once the session ends


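## Using the Image API with DALL·E

### Image API usage

The DALL·E 3 model (`dall-e-3`) supports three image sizes:

1024x1024, 1792x1024, or 1024x1792

Note: on Azure, `api_version` must be 2023-12-01-preview for the Image API to work. If the client was created with 2023-07-01-preview, re-create it with the newer version before running the cells below.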
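
Since only the three sizes listed above are accepted, it can be convenient to check a size locally before sending a request. A minimal sketch; `DALLE3_SIZES` and `check_size` are hypothetical helpers, not part of the openai package:

```python
DALLE3_SIZES = ("1024x1024", "1792x1024", "1024x1792")  # the sizes listed above

def check_size(size):
    """Return `size` unchanged if it is supported, otherwise raise ValueError."""
    if size not in DALLE3_SIZES:
        raise ValueError(
            f"unsupported size {size!r}; choose one of {DALLE3_SIZES}")
    return size

print(check_size("1792x1024"))
```

The return value could be passed directly as the `size` argument of `client.images.generate()`.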

In [None]:
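res = client.images.generate(   # text-to-image
    model='Dalle3',             # deployment name
    prompt='A train running along the seaside at sunset', # description of the image
    n=1,                        # number of images
    quality='hd',               # 'standard' or 'hd'
    size='1024x1024',           # image size, default 1024x1024
    style='vivid',              # style, default 'vivid'
)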
pprint(res)

In [None]:
from IPython.display import Image, display, Markdown

In [None]:
display(Image(url=res.data[0].url, width=200))

In [None]:
print(res.data[0].revised_prompt)

In [None]:
display(Markdown(f"![]({res.data[0].url})"))

### A function that turns text into an image URL

In [None]:
def txt_to_img_url(prompt):
    response = client.images.generate(
        model='Dalle3',
        prompt=prompt,
        n=1,
        size='1024x1024',
        style='vivid',
        quality='hd'
    )
    return response.data[0].url

In [None]:
display(Image(url=txt_to_img_url('A boy riding a bicycle leisurely along the rice fields'), width=200))

In [None]:
tools_table.append({         # each element describes one function
    "chain": False,  # the generated image does not need to go back to the API
    "func": txt_to_img_url,
    "spec": {        # function spec required by function calling
        'type': 'function',
        'function': {
            "name": "txt_to_img_url",
            "description": "Generates an image from text and returns the image URL",
            "parameters": {
                "type": "object",
                "properties": {
                    "prompt": {
                        "type": "string",
                        "description": "Text describing the image to generate",
                    }
                },
                "required": ["prompt"],
            },
        }
    }
})
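
An entry like this pairs a Python function with the spec the model sees, and the `chain` flag says whether the result should go back to the model. The sketch below shows one way such an entry could be looked up and called. It is a hypothetical stand-in, not the notebook's `call_tools()`: `dispatch`, `fake_img_url`, and `demo_table` are made-up names, and no API is called.

```python
import json

def fake_img_url(prompt):
    # Stand-in for txt_to_img_url(): returns a made-up URL, no API call
    return f"https://example.com/{len(prompt)}.png"

demo_table = [{
    "chain": False,
    "func": fake_img_url,
    "spec": {"type": "function",
             "function": {"name": "fake_img_url"}},
}]

def dispatch(name, arguments_json, table):
    """Find the entry whose spec name matches and call its function."""
    for entry in table:
        if entry["spec"]["function"]["name"] == name:
            kwargs = json.loads(arguments_json)  # arguments arrive as a JSON string
            return entry["func"](**kwargs), entry["chain"]
    raise KeyError(f"no tool named {name!r}")

result, chain = dispatch("fake_img_url", '{"prompt": "sunset"}', demo_table)
print(result, chain)
# → https://example.com/6.png False
```

With `chain` set to `False`, the caller can use `result` directly as the reply, which matches the `yield res` branch of `get_reply_f()`.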

In [None]:
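# With chain set to False, the reply is the image URL itself
for chunk in chat_f('Assistant', 'Please draw a dolphin leaping out of the sea at sunset', False):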
    if chunk.startswith('https'):
        display(Image(url=chunk, width=300))

## Building a web app quickly with gradio

### Installing and using gradio

In [None]:
# !pip install gradio   # already installed at the top of the notebook

Build a basic web interface

In [None]:
import gradio as gr

In [None]:
hist = []
web_chat = gr.Interface(
    fn = chat_f,
    inputs = ['text', 'text'],
    outputs = ['text'],
)

In [None]:
web_chat.queue()
web_chat.launch(share=True)

In [None]:
web_chat.close()

### Displaying output as a stream

In [None]:
hist = []
web_chat = gr.Interface(
    fn = chat_f,
    inputs = ['text', 'text', 'checkbox'],  # the checkbox sets chat_f's stream argument
    outputs = ['text']
)

In [None]:
web_chat.queue()
web_chat.launch()

In [None]:
web_chat.close()

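Gradio replaces the output with each value the function yields, so the interface above shows only the most recent chunk. A wrapper function that accumulates the chunks fixes this: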

In [None]:
def wrapper_chat(sys_msg, user_msg, stream):
    reply = ''
    for chunk in chat_f(sys_msg, user_msg, stream):
        reply += chunk
        yield reply
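
The effect of the wrapper can be seen without launching gradio. In this minimal sketch, `fake_chunks` is a hypothetical stand-in for `chat_f(..., stream=True)` and `accumulate` plays the role of `wrapper_chat()`:

```python
def fake_chunks():
    # Stand-in for chat_f(..., stream=True): yields a reply piece by piece
    yield from ["Hel", "lo", ", world"]

def accumulate(chunks):
    """Yield the running concatenation of all chunks seen so far."""
    text = ''
    for chunk in chunks:
        text += chunk
        yield text

raw = list(fake_chunks())                # what the unwrapped generator yields
shown = list(accumulate(fake_chunks()))  # what the wrapper yields
print(raw)    # → ['Hel', 'lo', ', world']
print(shown)  # → ['Hel', 'Hello', 'Hello, world']
```

Each value in `shown` is a complete prefix of the reply, so replacing the output box with the newest value makes the text appear to grow.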

In [None]:
hist = []
web_chat = gr.Interface(
    fn = wrapper_chat,
    inputs = ['text', 'text', 'checkbox'],
    outputs = ['text']
)

In [None]:
web_chat.queue()
web_chat.launch()

In [None]:
web_chat.close()

### Customizing the user interface

In [None]:
messages = []

def wrapper_chat_bot(sys_msg, user_msg, stream):
    messages.append([user_msg, ''])
    for chunk in chat_f(sys_msg, user_msg, stream):
        messages[-1][1] += chunk
        yield messages
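
`wrapper_chat_bot()` keeps the conversation as a list of `[user, reply]` pairs and grows the reply of the newest pair as chunks arrive. A minimal sketch of that bookkeeping without gradio or the API; `pairs`, `fake_chat`, and `chat_bot_demo` are hypothetical names used only for illustration:

```python
import copy

pairs = []  # stand-in for the notebook's global `messages`

def fake_chat(user_msg):
    # Stand-in for chat_f(): echoes the input in two chunks
    yield "You said: "
    yield user_msg

def chat_bot_demo(user_msg):
    pairs.append([user_msg, ''])        # start a new turn with an empty reply
    for chunk in fake_chat(user_msg):
        pairs[-1][1] += chunk           # grow the reply of the newest turn
        yield copy.deepcopy(pairs)      # snapshot, so each step can be inspected later

first = list(chat_bot_demo("hi"))
second = list(chat_bot_demo("bye"))
print(first[0])    # → [['hi', 'You said: ']]
print(second[-1])  # → [['hi', 'You said: hi'], ['bye', 'You said: bye']]
```

The sketch yields deep copies only so the intermediate states can be collected and compared. The notebook version yields the same list object each time, which is fine because gradio renders it at the moment of each yield.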

In [None]:
web_chat = gr.Interface(
    fn=wrapper_chat_bot,
    inputs=[
        gr.Textbox(label='System role', value='An assistant that replies in Traditional Chinese'),
        gr.Textbox(label='User message'),
        gr.Checkbox(label='Stream', value=False)],
    outputs=[gr.Chatbot(label='AI reply')]
)

In [None]:
hist = []
web_chat.queue()
web_chat.launch()

In [None]:
web_chat.close()

In [None]:
def txt_to_img_md(prompt):
    return f'![{prompt}]({txt_to_img_url(prompt)})'

In [None]:
tools_table.pop()  # remove the txt_to_img_url entry appended earlier

In [None]:
tools_table.append({      # each element describes one function
    "chain": False,  # the generated image does not need to go back to the API
    "func": txt_to_img_md,
    "spec": {        # function spec required by function calling
        'type': 'function',
        'function': {
            "name": "txt_to_img_md",
            "description": "Generates an image from text and returns a markdown image element",
            "parameters": {
                "type": "object",
                "properties": {
                    "prompt": {
                        "type": "string",
                        "description": "Text describing the image to generate",
                    }
                },
                "required": ["prompt"],
            },
        }
    }
})

In [None]:
hist = []
messages = []
web_chat = gr.Interface(
    fn=wrapper_chat_bot,
    inputs=[
        gr.Textbox(label='System role', value='An assistant that replies in Traditional Chinese'),
        gr.Textbox(label='User message'),
        gr.Checkbox(label='Stream', value=False)],
    outputs=[gr.Chatbot(label='AI reply')]
)

web_chat.queue()
web_chat.launch()

In [None]:
web_chat.close()