This notebook demonstrates JSON mode in the OpenAI API.

In [1]:
from google.colab import userdata
openai_api_key = userdata.get('openai_api_key')

In [2]:
import requests
import json
from pprint import pp

In [4]:
# Adds an optional format_type parameter
def get_completion(messages, model="gpt-3.5-turbo-1106", temperature=0, max_tokens=300, format_type=None):
  payload = { "model": model, "temperature": temperature, "messages": messages, "max_tokens": max_tokens }
  # Only send response_format when a type ("text" or "json_object") is given
  if format_type:
    payload["response_format"] = { "type": format_type }

  headers = { "Authorization": f'Bearer {openai_api_key}', "Content-Type": "application/json" }
  response = requests.post('https://api.openai.com/v1/chat/completions', headers = headers, data = json.dumps(payload) )
  obj = json.loads(response.text)
  if response.status_code == 200:
    # Return only the generated text of the first choice
    return obj["choices"][0]["message"]["content"]
  else:
    # On failure, return the API's error object instead of raising
    return obj["error"]

## JSON mode

A feature introduced on 2023-11-07 (available in the 1106 model versions and later): https://openai.com/blog/new-models-and-developer-products-announced-at-devday


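Note that JSON mode only guarantees output that can be parsed as JSON; it does not guarantee that the output matches the schema you want.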

1. The prompt must still explicitly say that you want JSON.
2. Only gpt-3.5-turbo-1106, gpt-4-1106-preview, and later models support JSON mode.
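
Point 1 can be checked locally before a request is sent. The following is a minimal sketch and not part of the original notebook: `mentions_json` and `choose_format_type` are hypothetical helpers that scan the message contents for the word "json" and only pick `json_object` when the prompt actually asks for JSON.

```python
def mentions_json(messages):
    """Return True if any message content contains the word 'json' (case-insensitive)."""
    return any("json" in str(m.get("content", "")).lower() for m in messages)


def choose_format_type(messages):
    """Pick "json_object" only when the prompt asks for JSON; otherwise use "text".

    Hypothetical helper for illustration; it is not part of the OpenAI API.
    """
    return "json_object" if mentions_json(messages) else "text"


messages = [{"role": "user", "content": "請隨機產生三個 user 資料，請用 JSON 格式回傳"}]
print(choose_format_type(messages))
```

The returned value could then be passed as `format_type` to `get_completion`.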

In [5]:
# Prompt: "Randomly generate three user records and return them in JSON format"
messages = [{ "role": "user", "content": "請隨機產生三個 user 資料，請用 JSON 格式回傳"} ]
x = get_completion(messages, format_type="text" )
print(x)

{
  "users": [
    {
      "id": 1,
      "name": "John",
      "age": 25,
      "gender": "male"
    },
    {
      "id": 2,
      "name": "Emily",
      "age": 30,
      "gender": "female"
    },
    {
      "id": 3,
      "name": "Michael",
      "age": 22,
      "gender": "male"
    }
  ]
}


In [6]:
# Same prompt as in the previous cell, this time with JSON mode enabled
messages = [{ "role": "user", "content": "請隨機產生三個 user 資料，請用 JSON 格式回傳"} ]
x = get_completion(messages, format_type ="json_object" )
print(x)

{
  "users": [
    {
      "id": 1,
      "name": "John",
      "age": 25,
      "gender": "male"
    },
    {
      "id": 2,
      "name": "Emily",
      "age": 30,
      "gender": "female"
    },
    {
      "id": 3,
      "name": "Michael",
      "age": 28,
      "gender": "male"
    }
  ]
}
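
Because `get_completion` returns a string on success and the API's error object (a dict) on failure, code that consumes the reply has to handle both cases. A minimal sketch, assuming the reply shape shown above; `parse_json_reply` is a hypothetical helper, not part of the notebook:

```python
import json


def parse_json_reply(reply):
    """Turn a get_completion() result into a Python object.

    Raises RuntimeError if the helper returned an error dict,
    and ValueError (json.JSONDecodeError) if the text is not valid JSON.
    """
    if isinstance(reply, dict):
        # get_completion returns obj["error"] when the HTTP status is not 200
        raise RuntimeError(f"API error: {reply.get('message', reply)}")
    return json.loads(reply)


# Sample reply in the same shape as the output above (trimmed to one user)
sample = '{"users": [{"id": 1, "name": "John", "age": 25, "gender": "male"}]}'
data = parse_json_reply(sample)
for user in data["users"]:
    print(user["name"], user["age"])
```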


### Note: when gpt-4-turbo-preview is asked for JSON but uses the text format, the output often starts with ```json, which makes it awkward to feed into code

In [None]:
# Same prompt again, sent to gpt-4-0125-preview with the plain text format
messages = [{ "role": "user", "content": "請隨機產生三個 user 資料，請用 JSON 格式回傳"} ]
x = get_completion( messages, format_type="text", model="gpt-4-0125-preview" )

print(x)

```json
[
  {
    "id": 1,
    "name": "John Doe",
    "email": "john.doe@example.com",
    "age": 28,
    "gender": "male"
  },
  {
    "id": 2,
    "name": "Jane Smith",
    "email": "jane.smith@example.com",
    "age": 32,
    "gender": "female"
  },
  {
    "id": 3,
    "name": "Alex Johnson",
    "email": "alex.johnson@example.com",
    "age": 24,
    "gender": "non-binary"
  }
]
```
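
If a reply does arrive wrapped in a Markdown fence like the one above, it can be unwrapped before parsing. A minimal sketch using only the standard library; `strip_code_fence` is a hypothetical helper and assumes the whole reply is a single fenced block:

```python
import json
import re

# Matches a reply that is one fenced block, with an optional language tag such as "json"
FENCE_RE = re.compile(r"^\s*`{3}[\w-]*[ \t]*\n(.*?)\n?[ \t]*`{3}\s*$", re.DOTALL)


def strip_code_fence(text):
    """Return the body of a fenced reply, or the stripped text if it is not fenced."""
    match = FENCE_RE.match(text)
    return match.group(1) if match else text.strip()


fence = "`" * 3  # built this way to avoid a literal fence inside this example
reply = f'{fence}json\n[\n  {{"id": 1, "name": "John Doe"}}\n]\n{fence}'
users = json.loads(strip_code_fence(reply))
print(users[0]["name"])
```

Requesting `json_object` is still the simpler option where the model supports it; unwrapping is a fallback for text-format replies.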


In [None]:
# Same prompt, sent to gpt-4-0125-preview with JSON mode enabled
messages = [{ "role": "user", "content": "請隨機產生三個 user 資料，請用 JSON 格式回傳"} ]
x = get_completion(messages, format_type ="json_object", model = "gpt-4-0125-preview" )
print(x)


{
  "users": [
    {
      "id": 1,
      "name": "John Doe",
      "email": "johndoe@example.com",
      "age": 28,
      "gender": "male"
    },
    {
      "id": 2,
      "name": "Jane Smith",
      "email": "janesmith@example.com",
      "age": 32,
      "gender": "female"
    },
    {
      "id": 3,
      "name": "Alex Johnson",
      "email": "alexjohnson@example.com",
      "age": 24,
      "gender": "non-binary"
    }
  ]
}


## To specify a schema, you can write the prompt like this:

Give few-shot examples; you can also annotate each field with its type.

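For tasks where explicit instructions are hard to spell out, few-shot examples are a good way to let the model infer what you want, for example a writing style or a particular output structure (some schema).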

In [None]:
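# Prompt: "Randomly generate three user records and return them in JSON format, using the following format:"
# Hints inside the template: name = a common Taiwanese name; age = age; bio = written in Traditional Chinese (Taiwan);
# avatar_url = profile picture, a real, reachable image; isSubscriber = whether the user is subscribed
x = get_completion( [{ "role": "user", "content": """請隨機產生三個 user 資料，請用 JSON 格式回傳，用以下格式: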
[
  "user1": {
    "name": "string", // 請用台灣常見姓名
    "age": "integer", // 年紀
    "bio": "text", // 請用台灣繁體中文
    "avatar_url": "url", // 個人圖像，請用真實可以連結的圖片
    "isSubscriber": "boolean", // 是否訂閱
}]"""} ], format_type ="json_object" )
print(x)


{
  "user1": {
    "name": "陳小明",
    "age": 28,
    "bio": "我是一個熱愛旅行的台灣人，喜歡探索世界各地的美食和文化。",
    "avatar_url": "https://example.com/avatar1.jpg",
    "isSubscriber": true
  },
  "user2": {
    "name": "林美玲",
    "age": 35,
    "bio": "我是一位瑜珈老師，喜歡幫助人們找到身心靈的平衡。",
    "avatar_url": "https://example.com/avatar2.jpg",
    "isSubscriber": false
  },
  "user3": {
    "name": "王大鵬",
    "age": 42,
    "bio": "我是一位資深工程師，熱愛挑戰和學習新技術。",
    "avatar_url": "https://example.com/avatar3.jpg",
    "isSubscriber": true
  }
}
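
Since JSON mode guarantees parseable JSON but not the requested schema, it is worth checking the field types after parsing. A minimal sketch, assuming the field names and types from the prompt above; `EXPECTED_TYPES` and `validate_users` are hypothetical names, not part of the notebook, and the sample reply is trimmed to one user with a placeholder bio:

```python
import json

# Field types taken from the prompt template above
EXPECTED_TYPES = {
    "name": str,
    "age": int,
    "bio": str,
    "avatar_url": str,
    "isSubscriber": bool,
}


def validate_users(data):
    """Return a list of human-readable problems; an empty list means the data fits."""
    if not isinstance(data, dict):
        return ["top level is not a JSON object"]
    problems = []
    for key, user in data.items():
        if not isinstance(user, dict):
            problems.append(f"{key}: not an object")
            continue
        for field, expected in EXPECTED_TYPES.items():
            value = user.get(field)
            # bool is a subclass of int in Python, so reject True/False for "age"
            wrong_bool = expected is int and isinstance(value, bool)
            if not isinstance(value, expected) or wrong_bool:
                problems.append(f"{key}.{field}: expected {expected.__name__}")
    return problems


reply = (
    '{"user1": {"name": "陳小明", "age": 28, "bio": "sample bio", '
    '"avatar_url": "https://example.com/avatar1.jpg", "isSubscriber": true}}'
)
print(validate_users(json.loads(reply)))
```

A check like this only covers types; whether `avatar_url` points to a reachable image, as the prompt requests, would need a separate test.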
