
chatglm-api

🐥 API deployment of chatglm-6b, chatglm2-6b, and chatglm3-6b, with calling from Python scripts.

Requirements

python==3.9.19
torch==2.3.0
transformers==4.40.2
accelerate==0.30.0
fastapi
argparse
loguru
python-dotenv
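Assuming pip is your package manager, the pinned versions above can be installed in one command (argparse is omitted because it ships with the Python standard library; uvicorn is an assumption on my part, added because FastAPI needs an ASGI server to run):

    pip install torch==2.3.0 transformers==4.40.2 accelerate==0.30.0 fastapi uvicorn loguru python-dotenv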

Setup

Before running the scripts, set up the environment file: in envs/api.env, set LOCAL_MODELS to the local root path under which your models are stored. If you want to pull the models from the Hugging Face Hub instead, keep LOCAL_MODELS="".
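For example, envs/api.env might look like the following (the path is a placeholder, and the folder layout under LOCAL_MODELS is an assumption; substitute your own):

    # envs/api.env
    LOCAL_MODELS="/data/models"    # e.g. weights stored under /data/models/THUDM/chatglm-6b
    # LOCAL_MODELS=""              # empty: download models from the Hugging Face Hub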

Serve Models

Serve one model

CUDA_VISIBLE_DEVICES=1 python api.py --model_name THUDM/chatglm-6b --port 8000

This loads the model THUDM/chatglm-6b on GPU 1 (via CUDA_VISIBLE_DEVICES=1) and serves it on port 8000.
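api.py itself is not reproduced in this README. The sketch below shows one plausible shape for it, assuming a FastAPI app that wraps the ChatGLM-family chat() method and is served by uvicorn; the endpoint path, payload keys, and loading details are assumptions, not the repo's exact code:

    # Hypothetical sketch of api.py; endpoint path and payload keys are assumptions.
    import argparse

    import uvicorn
    from fastapi import FastAPI
    from pydantic import BaseModel
    from transformers import AutoModel, AutoTokenizer

    app = FastAPI()

    class ChatRequest(BaseModel):
        query: str
        history: list = []  # [query, response] pairs for chatglm(2)-6b, role/content dicts for chatglm3-6b

    @app.post("/")
    def chat(req: ChatRequest):
        # ChatGLM-family models expose a chat() method returning (response, updated_history)
        response, history = model.chat(tokenizer, req.query, history=req.history)
        return {"response": response, "history": history}

    if __name__ == "__main__":
        parser = argparse.ArgumentParser()
        parser.add_argument("--model_name", required=True)
        parser.add_argument("--port", type=int, default=8000)
        args = parser.parse_args()

        tokenizer = AutoTokenizer.from_pretrained(args.model_name, trust_remote_code=True)
        # .cuda() targets the single GPU exposed via CUDA_VISIBLE_DEVICES
        model = AutoModel.from_pretrained(args.model_name, trust_remote_code=True).half().cuda()
        model.eval()
        uvicorn.run(app, host="0.0.0.0", port=args.port)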

Serve multiple models

CUDA_VISIBLE_DEVICES=1 python api.py --model_name THUDM/chatglm-6b --port 8000

CUDA_VISIBLE_DEVICES=2 python api.py --model_name THUDM/chatglm2-6b --port 8001

CUDA_VISIBLE_DEVICES=3 python api.py --model_name THUDM/chatglm3-6b --port 8002

The script api.sh works for all of chatglm-6b, chatglm2-6b, and chatglm3-6b; one plausible shape for it is sketched below.
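This sketch assumes api.sh simply backgrounds one server per GPU/port pair; the pairing mirrors the commands above, so adjust it to your hardware:

    #!/bin/bash
    # api.sh (hypothetical sketch): one model per GPU, one port per model
    CUDA_VISIBLE_DEVICES=1 python api.py --model_name THUDM/chatglm-6b  --port 8000 &
    CUDA_VISIBLE_DEVICES=2 python api.py --model_name THUDM/chatglm2-6b --port 8001 &
    CUDA_VISIBLE_DEVICES=3 python api.py --model_name THUDM/chatglm3-6b --port 8002 &
    wait    # keep the script alive while the servers run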

Calling

inputs = {
    'query': "123 + 123 + 123 + 123 = ?",
    # 'history': [['123 * 4 = ?', '123 * 4 = 492']],  # history form for chatglm-6b / chatglm2-6b
    # 'history': [{'role': 'user', 'content': '123 * 4 = ?'}, {'role': 'assistant', 'metadata': '', 'content': '123 * 4 = 492'}],  # history form for chatglm3-6b
    'port': 8000,   # or 8001, 8002, depending on which model you are targeting
}
response, history = gen_response(**inputs)  # returns the model's reply and the updated history
print(response, history)

🚨 Attention: the three models expect different history forms when called. chatglm-6b and chatglm2-6b take a list of [query, response] pairs, while chatglm3-6b takes a list of role/content dicts, as in the commented examples above.
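gen_response itself lives in the repo's calling script and is not shown here. A minimal client sketch, assuming the server accepts a JSON body with query and history and answers with response and history (matching the api.py sketch above):

    import requests

    def gen_response(query, history=None, port=8000):
        # POST the query to the locally served model and unpack its JSON reply
        payload = {'query': query, 'history': history or []}
        resp = requests.post(f'http://127.0.0.1:{port}', json=payload, timeout=300)
        resp.raise_for_status()
        data = resp.json()
        return data['response'], data['history']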
