# Switching from the OpenAI API to the Zhipu AI API at the Lowest Cost
This tutorial shows how to migrate from the OpenAI API to the Zhipu AI API quickly and with minimal changes. The migration takes the two steps below. Before you begin, make sure you have registered on the Zhipu AI Open Platform and obtained a Zhipu AI API key by clicking in the top right corner of the official website.

## 1. Change API Endpoint or Install ZhipuAI SDK

The examples in this tutorial cover common invocation methods, such as regular chat and function calling. To run them, either point the OpenAI SDK at the Zhipu AI endpoint or install the ZhipuAI SDK.


In [8]:
from openai import OpenAI

client = OpenAI(api_key="your ZhipuAI api key", base_url="https://open.bigmodel.cn/api/paas/v4")

# If you use the ZhipuAI SDK instead, use the following code:
# from zhipuai import ZhipuAI
# client = ZhipuAI(api_key="your ZhipuAI api key")

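As a rough illustration of what changing the endpoint amounts to, the sketch below builds (without sending) the HTTP request that an OpenAI-compatible client would issue to the Zhipu AI endpoint. It assumes the key is sent as a Bearer token, as the OpenAI SDK does, and reads it from an environment variable when none is passed in. The variable name `ZHIPUAI_API_KEY` and the helper `build_chat_request` are choices made for this example, not part of either SDK.

```python
import json
import os
import urllib.request

BASE_URL = "https://open.bigmodel.cn/api/paas/v4"  # endpoint used in this tutorial


def build_chat_request(messages, model="glm-4", api_key=None):
    """Build (but do not send) an OpenAI-style chat completion request."""
    # ZHIPUAI_API_KEY is an assumed variable name chosen for this sketch.
    key = api_key or os.environ.get("ZHIPUAI_API_KEY")
    if not key:
        raise ValueError("Set ZHIPUAI_API_KEY or pass api_key explicitly")
    payload = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=payload,
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request(
    [{"role": "user", "content": "Hello"}], api_key="your ZhipuAI api key"
)
print(request.full_url)
```

Keeping the key in an environment variable rather than in the source file avoids committing it by accident; the same value can then be passed to `OpenAI(api_key=..., base_url=...)`.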
## 2. Select the Appropriate Model

The GLM-4 series consists of five models: GLM-4-0520, GLM-4-Air, GLM-4-Airx, GLM-4-Flash, and the visual understanding (multimodal) model GLM-4V. For prices, please refer to the [official website](https://open.bigmodel.cn/pricing). The five models differ as follows:
- GLM-4-0520: Our most advanced and intelligent model to date, with an 18.6% improvement in instruction following and a 128k context window. Released on 2024-06-05.
- GLM-4V: Supports image understanding tasks such as visual question answering, image captioning, visual grounding, and complex object detection, with a 128k context window.
- GLM-4-Airx: The high-performance version of GLM-4-Air, with the same quality but 2.6 times faster inference (about 80 tokens/s) and an 8k context window.
- GLM-4-Air: The most cost-effective version, with overall performance close to GLM-4, a 128k context window, fast inference, and an affordable price.
- GLM-4-Flash: Suited to simple tasks; the fastest and most affordable model, with a 128k context window.

In practice, pass the name of the model you chose as the `model` argument of each request.

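One way to keep the model choice in a single place is a small lookup table, sketched below. The identifier `glm-4-0520` appears in this tutorial's code; the other lowercase identifiers are assumed from the display names above and should be checked against the official documentation. The category labels and the helper `chat_kwargs` are hypothetical names for this example.

```python
# Model identifiers: "glm-4-0520" appears in this tutorial's code; the other
# lowercase names are assumed from the display names and should be verified.
MODELS = {
    "best_quality": "glm-4-0520",
    "vision": "glm-4v",
    "fast": "glm-4-airx",
    "balanced": "glm-4-air",
    "cheapest": "glm-4-flash",
}


def chat_kwargs(messages, need="balanced", **overrides):
    """Return keyword arguments for client.chat.completions.create()."""
    if need not in MODELS:
        raise ValueError(f"unknown need {need!r}; choose from {sorted(MODELS)}")
    kwargs = {"model": MODELS[need], "messages": messages}
    kwargs.update(overrides)  # e.g. max_tokens, temperature, stream
    return kwargs


kwargs = chat_kwargs([{"role": "user", "content": "Hi"}], need="cheapest", max_tokens=64)
print(kwargs["model"])  # glm-4-flash
```

With a configured client, the result can be passed on as `client.chat.completions.create(**kwargs)`.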

## 3. Actual Demonstration Code

After completing the two steps above, the model migration is finished. The code below is a minimal implementation that covers both tool invocation and regular chat. Run it to use the GLM-4 series models for your existing tasks.


In [9]:
from openai import OpenAI

client = OpenAI(api_key="your ZhipuAI api key", base_url="https://open.bigmodel.cn/api/paas/v4")


def function_chat(use_stream=False):
    messages = [
        {
            "role": "user", "content": "What's the Celsius temperature in San Francisco?"
        },
        # Assistant turn that contains a previously issued tool call
        {
            "role": "assistant",
            "content": None,
            "function_call": None,
            "tool_calls": [
                {
                    "id": "call_1717912616815",
                    "function": {
                        "name": "get_current_weather",
                        "arguments": "{\"location\": \"San Francisco, CA\", \"format\": \"celsius\"}"
                    },
                    "type": "function"
                }
            ]
        },

        # Add the tool's observation result here if needed
        # {
        #     "tool_call_id": "call_1717912616815",
        #     "role": "tool",
        #     "name": "get_current_weather",
        #     "content": "23°C",
        # }
    ]
    tools = [
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA",
                        },
                        "format": {
                            "type": "string",
                            "enum": ["celsius", "fahrenheit"],
                            "description": "The temperature unit to use. Infer this from the users location.",
                        },
                    },
                    "required": ["location", "format"],
                },
            }
        },
    ]

    response = client.chat.completions.create(
        model="glm-4",
        messages=messages,
        tools=tools,
        stream=use_stream,
        max_tokens=256,
        temperature=0.9,
        presence_penalty=1.2,
        top_p=0.1,
        tool_choice="auto"
    )
    # The SDK raises an exception on API errors, so no status check is needed here.
    if use_stream:
        for chunk in response:
            print(chunk)
    else:
        print(response)


def simple_chat(use_stream=False):
    messages = [
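        {
            "role": "system",
            # English: "Always start your output with the three characters '喵喵喵' (meow meow meow)."
            "content": "请在你输出的时候都带上“喵喵喵”三个字，放在开头。",
        },
        {
            "role": "user",
            # English: "Who are you?"
            "content": "你是谁"
        }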
    ]
    response = client.chat.completions.create(
        model="glm-4-0520",
        messages=messages,
        stream=use_stream,
        max_tokens=256,
        temperature=0.4,
        presence_penalty=1.2,
        top_p=0.8,
    )
    # The SDK raises an exception on API errors, so no status check is needed here.
    if use_stream:
        for chunk in response:
            print(chunk)
    else:
        print(response)


if __name__ == "__main__":
    simple_chat(use_stream=False)
    function_chat(use_stream=False)

ChatCompletion(id='202408161236009105f99c7c5e4f89', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='喵喵喵，我是一个人工智能助手，全名为智谱清言，基于智谱 AI 公司于 2023 联合训练的语言模型 GLM-4 开发而成，可以针对用户的问题和要求提供适当的答复和支持。', role='assistant', function_call=None, tool_calls=None))], created=1723782962, model='glm-4-0520', object=None, service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=51, prompt_tokens=26, total_tokens=77), request_id='202408161236009105f99c7c5e4f89')
ChatCompletion(id='202408161236021d6c46adfb29452a', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='', role='assistant', function_call=None, tool_calls=None))], created=1723782963, model='glm-4', object=None, service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=1, prompt_tokens=198, total_tokens=199), request_id='202408161236021d6c46adfb29452a')
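In the function-calling example, the conversation ends on an assistant tool call and the observation message is commented out, and the second response above comes back with empty content. A minimal sketch of the missing step follows: decode the tool call's arguments, run a local function, and build the `tool` message to append before calling the model again. `get_current_weather` here is a hypothetical stand-in that returns fixed values, and `run_tool_call` and `LOCAL_TOOLS` are names chosen for this example.

```python
import json


def get_current_weather(location, format):
    """Hypothetical stand-in for a real weather lookup (fixed values)."""
    return "23°C" if format == "celsius" else "73°F"


# Maps tool names announced in `tools` to local Python callables.
LOCAL_TOOLS = {"get_current_weather": get_current_weather}


def run_tool_call(tool_call):
    """Execute one tool call and return the matching `tool` message."""
    name = tool_call["function"]["name"]
    # The model sends arguments as a JSON string, so decode them first.
    arguments = json.loads(tool_call["function"]["arguments"])
    result = LOCAL_TOOLS[name](**arguments)
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "name": name,
        "content": str(result),
    }


tool_call = {
    "id": "call_1717912616815",
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "arguments": "{\"location\": \"San Francisco, CA\", \"format\": \"celsius\"}",
    },
}
tool_message = run_tool_call(tool_call)
print(tool_message["content"])  # 23°C
```

The dictionary form matches the entries in `messages`. When the tool call comes from an SDK response object instead, the same fields are read as attributes, for example `tool_call.function.name`.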


## 4. Using with Open Source Frameworks

The ZhipuAI API follows the OpenAI API standard, so the GLM-4 series models can handle tasks built on the OpenAI API. You can also refer to the [sample code](./glm_langchain.ipynb) we provide.

If you are using a framework other than LangChain, you only need to set the corresponding parameter to `base_url="https://open.bigmodel.cn/api/paas/v4"` and adjust the model name accordingly.


## 5. Prompt Migration

The GLM-4 series models still lag somewhat behind the GPT-4 flagship model. Even so, the GLM-4 models' instruction-following ability is sufficient for a large number of conventional tasks, so in most cases you can migrate from GPT to GLM without modifying your prompts.

For a more cautious approach, the following suggestions can help the model perform better:
- Add Few-Shot examples to the prompts; more examples can help the model understand the intent.
- Emphasize the core content multiple times in the prompts to ensure better understanding by the model.
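The two suggestions above can be sketched as a small helper that inserts few-shot examples as prior user/assistant turns and restates the core instruction next to the final question. The helper name `build_few_shot_messages`, the `Reminder:` wording, and the sentiment task are illustrative assumptions, not a prescribed format.

```python
def build_few_shot_messages(system_prompt, examples, user_input, repeat_core=True):
    """Assemble a chat message list with few-shot examples."""
    messages = [{"role": "system", "content": system_prompt}]
    for question, answer in examples:
        # Each example becomes one user turn and one assistant turn.
        messages.append({"role": "user", "content": question})
        messages.append({"role": "assistant", "content": answer})
    final = user_input
    if repeat_core:
        # Restate the core instruction next to the actual question.
        final = f"{user_input}\n\nReminder: {system_prompt}"
    messages.append({"role": "user", "content": final})
    return messages


messages = build_few_shot_messages(
    system_prompt="Classify the sentiment as positive or negative. Answer with one word.",
    examples=[
        ("I love this phone.", "positive"),
        ("The battery died in an hour.", "negative"),
    ],
    user_input="The screen is gorgeous.",
)
print(len(messages))  # 6
```

The resulting list can be passed as the `messages` argument of `client.chat.completions.create`.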
