# Requesting the ZhipuAI GLM API with the Python SDK

This cookbook shows developers how to use the ZhipuAI Python SDK to send requests to the GLM-4 model. It demonstrates three request methods:
1. Synchronous requests
2. Streaming requests
3. Asynchronous requests

The three methods differ slightly in how they are called, and each is covered in turn below.

First, we need to configure the API key, which can be obtained from the [ZhipuAI API Platform](https://open.bigmodel.cn/). We can configure it as follows:

In [1]:
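import os
from zhipuai import ZhipuAI

# Replace the placeholder with your own key; never publish a real key.
os.environ["ZHIPUAI_API_KEY"] = "your-api-key"
# With no arguments, the client picks up the key from ZHIPUAI_API_KEY.
client = ZhipuAI()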
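
A key written into a notebook cell is easy to leak. As a minimal sketch, the key could instead be set outside the notebook and only read here. The helper name `load_api_key` is hypothetical and not part of the SDK; only the `ZHIPUAI_API_KEY` variable name comes from the cell above.

```python
import os


def load_api_key(env=None, var_name="ZHIPUAI_API_KEY"):
    """Return the API key stored under `var_name`, or raise if it is missing.

    `load_api_key` is a hypothetical helper, not part of the zhipuai SDK.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set; export it before running the notebook")
    return key


# Example with a stand-in mapping instead of the real environment.
demo_key = load_api_key({"ZHIPUAI_API_KEY": "your-api-key"})
print(demo_key)
```

With the variable exported in the shell, the client can be created with ZhipuAI() as above, or the key can be passed explicitly with ZhipuAI(api_key=load_api_key()).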


## 1. Request the model synchronously
A synchronous request is the most basic request method: the call returns the model's result directly. We only need to fill in the parameters, using a request format similar to OpenAI's, to get the result.

In [2]:
response = client.chat.completions.create(
    model="glm-4",
    messages=[
        {
            "role": "user",
            "content": "tell me a joke"
        }
    ],
    top_p=0.7,
    temperature=0.9,
    stream=False,
    max_tokens=2000,
)

In [3]:
response

Completion(model='glm-4', created=1707204316, choices=[CompletionChoice(index=0, finish_reason='stop', message=CompletionMessage(content="Sure! Here's a light-hearted joke for you:\n\nWhy don't scientists trust atoms?\n\nBecause they make up everything!", role='assistant', tool_calls=None))], request_id='8365805056632100978', id='8365805056632100978', usage=CompletionUsage(prompt_tokens=9, completion_tokens=26, total_tokens=35))
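
The printed object above nests the reply text under choices[0].message.content and the token counts under usage. A minimal sketch of pulling those fields out follows. The function name `summarize_completion` is hypothetical, and the stand-in object is built with SimpleNamespace only to mirror the shape of the printed Completion so the example runs without calling the API.

```python
from types import SimpleNamespace


def summarize_completion(response):
    """Return the first choice's text, finish reason, and total token count."""
    choice = response.choices[0]
    return {
        "text": choice.message.content,
        "finish_reason": choice.finish_reason,
        "total_tokens": response.usage.total_tokens,
    }


# Stand-in object (an assumption for illustration) with the same attribute
# layout as the Completion printed above.
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason="stop",
            message=SimpleNamespace(
                role="assistant",
                content="Because they make up everything!",
            ),
        )
    ],
    usage=SimpleNamespace(prompt_tokens=9, completion_tokens=26, total_tokens=35),
)

summary = summarize_completion(fake_response)
print(summary["text"])
```

In the notebook, the same function could be called on the real response object from the cell above.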

## 2. Request the model with streaming
We have now obtained the model's response. To use a streaming request instead, make the same call and simply set stream=True.

In [5]:
import time
def print_with_typewriter_effect(text, delay=0.05):
    for char in text:
        print(char, end='', flush=True)
        time.sleep(delay)


response = client.chat.completions.create(
    model="glm-4",
    messages=[
        {
            "role": "user",
            "content": "tell me a joke"
        }
    ],
    top_p=0.7,
    temperature=0.9,
    stream=True,
    max_tokens=2000,
)

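if response:
    for chunk in response:
        # A chunk may carry no text; fall back to an empty string so the loop below never iterates over None.
        content = chunk.choices[0].delta.content or ""
        print_with_typewriter_effect(content)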


Sure, here's a light-hearted joke for you:
Why don't scientists trust atoms?
Because they make up everything!
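
The streaming loop above prints each fragment as it arrives. If the full reply is also needed afterwards, the fragments can be accumulated. The sketch below assumes each chunk exposes its text at choices[0].delta.content, as in the cell above; `collect_stream` and `fake_chunk` are hypothetical names, and the fake chunks stand in for a real stream so the example runs offline.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Join the text fragments of a streamed response into one string."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue  # nothing to read in this chunk
        piece = chunk.choices[0].delta.content
        if piece:  # skip None or empty fragments
            parts.append(piece)
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in chunk with the same attribute path as a real one."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


fake_stream = [
    fake_chunk("Why don't "),
    fake_chunk(None),
    fake_chunk("scientists trust atoms?"),
]
full_text = collect_stream(fake_stream)
print(full_text)
```

Note that a stream can be iterated only once, so printing and collecting would need to happen in the same pass over a real response.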

## 3. Request the model asynchronously

We can also make requests asynchronously. Once the request is submitted, a response object is returned that contains the task ID, which is used later to retrieve the result.
Asynchronous requests allow a program to continue performing other tasks while waiting for the server's response. The program does not need to block on every request, which makes more efficient use of time and resources, especially when handling a large number of concurrent requests.

In [6]:
response = client.chat.asyncCompletions.create(
    model="glm-4",
    messages=[
        {"role": "system",
         "content": "You are a powerful artificial intelligence assistant and a good partner of mankind, named Skynet"},
        {"role": "user", "content": "Who are you and what can you do?"},
        {"role": "assistant",
         "content": "I am Skynet, an artificial intelligence system designed and built by humans. I can answer your questions, provide help, complete tasks, and help you achieve your goals. Dream. I am omnipotent and omniscient. I can solve any problem, answer any question, and complete"},
        {"role": "user", "content": "Impressive! Can you tell me who won the 2022 World Cup?"},
    ],
)
response

AsyncTaskStatus(id='476916963227608958365955689725233296', request_id='8365955689725233295', model='glm-4', task_status='PROCESSING')

Next, the code enters a loop that repeatedly checks the task status. While the task is unfinished (its status is neither SUCCESS nor FAILED), the loop continues until the task completes or the maximum number of attempts (set to 20 here) is reached. Each check calls client.chat.asyncCompletions.retrieve_completion_result with the task ID. If the task completes successfully (status SUCCESS), the chat content is extracted and printed.

In [7]:
import time

task_id = response.id
task_status = ''
get_cnt = 0
content = ''
# Poll until the task finishes or 20 checks have been made.
while task_status not in ('SUCCESS', 'FAILED') and get_cnt < 20:
    result_response = client.chat.asyncCompletions.retrieve_completion_result(id=task_id)
    task_status = result_response.task_status
    get_cnt += 1
    if task_status == 'SUCCESS':
        content = result_response.choices[0].message.content
    elif task_status != 'FAILED':
        time.sleep(2)  # still processing; wait before the next check
print("id: " + task_id + ", status: " + task_status)
print(content)

id: 476916963227608958365955689725233296, status: SUCCESS
The 2022 FIFA World Cup was won by Argentina. The final match took place on December 18, 2022, at the Lusail Stadium in Qatar, where Argentina defeated France 3-3 (4-2 on penalties) to secure their third World Cup title.
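
The polling loop above can be factored into a reusable helper. The following is a minimal sketch, not part of the SDK: `poll_until_done` is a hypothetical name, and it takes the status-fetching call as a function so it can be exercised here with a fake task. Unlike the loop above, which exits silently after 20 checks, this sketch raises TimeoutError when the attempts run out.

```python
import time
from types import SimpleNamespace


def poll_until_done(fetch, max_attempts=20, interval=2.0, sleep=time.sleep):
    """Call `fetch()` until its task_status is SUCCESS or FAILED.

    Returns the last result, or raises TimeoutError after `max_attempts` checks.
    """
    for attempt in range(max_attempts):
        result = fetch()
        if result.task_status in ("SUCCESS", "FAILED"):
            return result
        if attempt < max_attempts - 1:
            sleep(interval)  # wait only if another check will follow
    raise TimeoutError(f"task still unfinished after {max_attempts} checks")


# Fake task (an assumption for illustration) that succeeds on the third check.
statuses = iter(["PROCESSING", "PROCESSING", "SUCCESS"])


def fake_fetch():
    return SimpleNamespace(task_status=next(statuses))


final = poll_until_done(fake_fetch, sleep=lambda seconds: None)
print(final.task_status)
```

With the real client, the fetch argument could be lambda: client.chat.asyncCompletions.retrieve_completion_result(id=task_id), the same call used in the loop above.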
