
feat: support minimax ai model #1033

Merged (4 commits into alibaba:main, Jun 9, 2024)
Conversation

@hanxiantao (Contributor) commented Jun 7, 2024

Ⅰ. Describe what this PR did

1) Add support for the MiniMax AI model.

2) Fix the streaming response format when Wenxin Yiyan (ERNIE Bot) is used via the OpenAI protocol (a space was missing after `data:`).
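As context for fix 2), the server-sent-events format used by OpenAI-compatible streaming APIs puts a single space between the `data:` field name and the JSON payload, and terminates each event with a blank line. A minimal sketch of the corrected formatting (`formatSSEEvent` is a hypothetical helper for illustration, not the plugin's actual code):

```go
package main

import "fmt"

// formatSSEEvent renders one server-sent event. The fix in this PR
// boils down to emitting "data: " (with a trailing space) rather than
// "data:" before the JSON payload, followed by a blank line to end
// the event.
func formatSSEEvent(payload string) string {
	return "data: " + payload + "\n\n"
}

func main() {
	fmt.Print(formatSSEEvent(`{"object":"chat.completion.chunk"}`))
}
```

Clients that parse strictly on the `data: ` prefix fail to recognize events emitted as `data:{...}`, which is what the fix addresses.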

Ⅱ. Does this pull request fix one issue?

fixes #953

Ⅲ. Why don't you add test cases (unit test/integration test)?

Ⅳ. Describe how to verify it

docker-compose.yaml

version: '3.7'
services:
  envoy:
    image: higress-registry.cn-hangzhou.cr.aliyuncs.com/higress/gateway:1.4.0-rc.1
    entrypoint: /usr/local/bin/envoy
    # Note: wasm logging is set to debug level here; in production deployments the default is info level
    command: -c /etc/envoy/envoy.yaml --component-log-level wasm:debug
    depends_on:
      - httpbin
    networks:
      - wasmtest
    ports:
      - "10000:10000"
    volumes:
      - ./envoy.yaml:/etc/envoy/envoy.yaml
      - ./plugin.wasm:/etc/envoy/plugin.wasm
  httpbin:
    image: kennethreitz/httpbin:latest
    networks:
      - wasmtest
    ports:
      - "12345:80"
networks:
  wasmtest: {}

Using the OpenAI protocol

envoy.yaml

admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                scheme_header_transformation:
                  scheme_to_overwrite: https
                stat_prefix: ingress_http
                # Output envoy logs to stdout
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                # Modify as required
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: [ "*" ]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: minimax
                            timeout: 300s
                http_filters:
                  - name: wasmtest
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: wasmtest
                          vm_config:
                            runtime: envoy.wasm.runtime.v8
                            code:
                              local:
                                filename: /etc/envoy/plugin.wasm
                          configuration:
                            "@type": "type.googleapis.com/google.protobuf.StringValue"
                            value: |
                              {
                                  "provider": {
                                    "type": "minimax",
                                    "apiTokens": [
                                      "YOUR_MINIMAX_API_TOKEN"
                                    ],
                                    "modelMapping": {
                                      "gpt-3": "abab6.5g-chat",
                                      "gpt-4": "abab6.5-chat",
                                      "*": "abab6.5g-chat"
                                    },
                                    "protocol": "openai",
                                    "minimaxGroupId": "YOUR_MINIMAX_GROUP_ID"
                                  }
                              }
                  - name: envoy.filters.http.router
  clusters:
    - name: httpbin
      connect_timeout: 30s
      type: LOGICAL_DNS
      # Comment out the following line to test on v6 networks
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: httpbin
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: httpbin
                      port_value: 80
    - name: minimax
      connect_timeout: 30s
      type: LOGICAL_DNS
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: minimax
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: api.minimax.chat
                      port_value: 443
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
          "sni": "api.minimax.chat"
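The `modelMapping` block in the configuration above rewrites the model name carried in OpenAI-protocol requests. Conceptually it is an exact-match lookup with a `"*"` wildcard fallback; a minimal sketch under that assumption (`mapModel` is a hypothetical name, not the plugin's actual code):

```go
package main

import "fmt"

// mapModel resolves a requested model name against a modelMapping
// table: exact match first, then the "*" wildcard entry, otherwise
// the requested name is passed through unchanged.
func mapModel(mapping map[string]string, requested string) string {
	if m, ok := mapping[requested]; ok {
		return m
	}
	if m, ok := mapping["*"]; ok {
		return m
	}
	return requested
}

func main() {
	mapping := map[string]string{
		"gpt-3": "abab6.5g-chat",
		"gpt-4": "abab6.5-chat",
		"*":     "abab6.5g-chat",
	}
	fmt.Println(mapModel(mapping, "gpt-4"))  // exact match
	fmt.Println(mapModel(mapping, "other")) // wildcard fallback
}
```

With the mapping above, a request for `gpt-4` is forwarded upstream as `abab6.5-chat`, while any unmapped model falls back to `abab6.5g-chat`.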

Non-streaming request

Example 1: Calling the ChatCompletion V2 API

curl -X POST 'http://localhost:10000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-d '{
    "model": "gpt-3",
    "messages": [
        {
            "role": "user",
            "content": "你好,你是谁?"
        }
    ],
    "stream": false
}'

Response:

{
    "id": "02b459a17dba50e97f7a315e98566796",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "content": "我是一个使用AI技术进行语言交互的软件。我是 MiniMax 的产品,MiniMax是中国的一家科技公司。MiniMax一直致力于大模型的研究,而我则是 MiniMax 研发的最新产品。我能够回答你的各种问题,也可以根据你的要求帮助你完成一些简单的任务。总之,我希望我能帮助你解决你遇到的问题。那么,你有什么需要帮助的吗?",
                "role": "assistant"
            }
        }
    ],
    "created": 1717905059,
    "model": "abab6.5g-chat",
    "object": "chat.completion",
    "usage": {
        "total_tokens": 154
    },
    "input_sensitive": false,
    "output_sensitive": false,
    "input_sensitive_type": 0,
    "output_sensitive_type": 0,
    "base_resp": {
        "status_code": 0,
        "status_msg": ""
    }
}

(Screenshot: non-streaming request via the OpenAI protocol)

Example 2: Calling the ChatCompletion Pro API

curl -X POST 'http://localhost:10000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-d '{
    "model": "gpt-4",
    "messages": [
        {
            "role": "user",
            "content": "你好,你是谁?"
        }
    ],
    "stream": false
}'

Response:

{
    "id": "02b45a232f54be51749272e3f3807f52",
    "choices": [
        {
            "index": 0,
            "message": {
                "name": "MM智能助理",
                "role": "assistant",
                "content": "你好!我是MM智能助理,一款由MiniMax公司自主研发的大型语言模型。我可以帮助你解答问题、提供信息、进行对话和执行多种语言处理任务。如果你有任何问题或需要帮助,请随时告诉我!"
            },
            "finish_reason": "stop"
        }
    ],
    "created": 1717905189,
    "model": "abab6.5-chat",
    "object": "chat.completion",
    "usage": {
        "total_tokens": 116
    }
}

(Screenshot: non-streaming request via the OpenAI protocol, 2)

Streaming request

Example 1: Calling the ChatCompletion V2 API

curl -X POST 'http://localhost:10000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-d '{
    "model": "gpt-3",
    "messages": [
        {
            "role": "user",
            "content": "你好,你是谁?"
        }
    ],
    "stream": true
}'

Response:

data: {"id":"02b45aa28acd74677c8f2deba51a286d","choices":[{"index":0,"delta":{"content":"你好","role":"assistant"}}],"created":1717905314,"model":"abab6.5g-chat","object":"chat.completion.chunk","output_sensitive":false,"input_sensitive_type":0,"output_sensitive_type":0}

data: {"id":"02b45aa28acd74677c8f2deba51a286d","choices":[{"finish_reason":"stop","index":0,"delta":{"content":",我是MM智能助理。我是一个由MiniMax自研的大型语言模型。我拥有超过1,300亿个参数,可以回答各种问题。我可以帮助你解决各种问题。请问有什么需要帮助的吗?","role":"assistant"}}],"created":1717905315,"model":"abab6.5g-chat","object":"chat.completion.chunk","output_sensitive":false,"input_sensitive_type":0,"output_sensitive_type":0}

data: {"id":"02b45aa28acd74677c8f2deba51a286d","choices":[{"finish_reason":"stop","index":0,"message":{"content":"你好,我是MM智能助理。我是一个由MiniMax自研的大型语言模型。我拥有超过1,300亿个参数,可以回答各种问题。我可以帮助你解决各种问题。请问有什么需要帮助的吗?","role":"assistant"}}],"created":1717905315,"model":"abab6.5g-chat","object":"chat.completion","usage":{"total_tokens":120},"input_sensitive":false,"output_sensitive":false,"input_sensitive_type":0,"output_sensitive_type":0,"base_resp":{"status_code":0,"status_msg":""}}


(Screenshot: streaming request via the OpenAI protocol)

Example 2: Calling the ChatCompletion Pro API

curl -X POST 'http://localhost:10000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-d '{
    "model": "gpt-4",
    "messages": [
        {
            "role": "user",
            "content": "你好,你是谁?"
        }
    ],
    "stream": true
}'

Response:

data: {"choices":[{"index":0,"message":{"name":"MM智能助理","role":"assistant","content":"你好"}}],"created":1717905388,"model":"abab6.5-chat","object":"chat.completion","usage":{}}

data: {"choices":[{"index":0,"message":{"name":"MM智能助理","role":"assistant","content":"!我是MM智能助理,一款由MiniMax公司自主研发的大型语言模型。我可以帮助你解答问题、提供信息、进行对话和执行"}}],"created":1717905389,"model":"abab6.5-chat","object":"chat.completion","usage":{}}

data: {"choices":[{"index":0,"message":{"name":"MM智能助理","role":"assistant","content":"多种语言处理任务。如果你有任何问题或需要帮助,请随时告诉我!"}}],"created":1717905390,"model":"abab6.5-chat","object":"chat.completion","usage":{}}

data: {"id":"02b45aebe6101dd3cc5028a5b1dae0f3","choices":[{"index":0,"message":{"name":"MM智能助理","role":"assistant","content":"你好!我是MM智能助理,一款由MiniMax公司自主研发的大型语言模型。我可以帮助你解答问题、提供信息、进行对话和执行多种语言处理任务。如果你有任何问题或需要帮助,请随时告诉我!"},"finish_reason":"stop"}],"created":1717905390,"model":"abab6.5-chat","object":"chat.completion","usage":{"total_tokens":116}}


(Screenshot: streaming request via the OpenAI protocol, 2)

Using the MiniMax protocol

envoy.yaml

admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                scheme_header_transformation:
                  scheme_to_overwrite: https
                stat_prefix: ingress_http
                # Output envoy logs to stdout
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                # Modify as required
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: [ "*" ]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: minimax
                            timeout: 300s
                http_filters:
                  - name: wasmtest
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: wasmtest
                          vm_config:
                            runtime: envoy.wasm.runtime.v8
                            code:
                              local:
                                filename: /etc/envoy/plugin.wasm
                          configuration:
                            "@type": "type.googleapis.com/google.protobuf.StringValue"
                            value: |
                              {
                                  "provider": {
                                    "type": "minimax",
                                    "apiTokens": [
                                      "YOUR_MINIMAX_API_TOKEN"
                                    ],
                                    "protocol": "original"
                                  }
                              }
                  - name: envoy.filters.http.router
  clusters:
    - name: httpbin
      connect_timeout: 30s
      type: LOGICAL_DNS
      # Comment out the following line to test on v6 networks
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: httpbin
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: httpbin
                      port_value: 80
    - name: minimax
      connect_timeout: 30s
      type: LOGICAL_DNS
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: minimax
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: api.minimax.chat
                      port_value: 443
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
          "sni": "api.minimax.chat"

Non-streaming request

Example 1: Calling the ChatCompletion V2 API

curl -X POST 'http://localhost:10000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-d '{
    "model": "abab6.5g-chat",
    "messages": [
        {
            "role": "user",
            "content": "你好,你是谁?"
        }
    ],
    "stream": false
}'

Response:

{
    "id": "02b45d3ae75a857392ea089f4d376827",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "content": "MM智能助理:我叫MM智能助理,我是由MiniMax公司研发的智能助理,可以为用户提供多种智能服务。我可以通过自然语言理解(NLU)和自然语言生成(NLG)来理解用户的问题并提供相应的解决方案,还可以根据用户的要求进行个性化的定制。",
                "role": "assistant"
            }
        }
    ],
    "created": 1717905980,
    "model": "abab6.5g-chat",
    "object": "chat.completion",
    "usage": {
        "total_tokens": 132
    },
    "input_sensitive": false,
    "output_sensitive": false,
    "input_sensitive_type": 0,
    "output_sensitive_type": 0,
    "base_resp": {
        "status_code": 0,
        "status_msg": ""
    }
}

(Screenshot: non-streaming request via the MiniMax protocol, 2)

Example 2: Calling the ChatCompletion Pro API

curl -X POST 'http://localhost:10000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-d '{
    "bot_setting": [
        {
            "bot_name": "MM智能助理",
            "content": "MM智能助理是一款由MiniMax自研的,没有调用其他产品的接口的大型语言模型。MiniMax是一家中国科技公司,一直致力于进行大模型相关的研究。"
        }
    ],
    "messages": [
        {
            "sender_type": "USER",
            "sender_name": "小明",
            "text": "你好,你是谁?"
        }
    ],
    "reply_constraints": {
        "sender_type": "BOT",
        "sender_name": "MM智能助理"
    },
    "model": "abab6.5s-chat",
    "tokens_to_generate": 2048,
    "temperature": 0.01,
    "top_p": 0.95,
    "stream": false
}'

Response:

{
    "created": 1717905759,
    "model": "abab6.5s-chat",
    "reply": "你好!我是MM智能助理,一款由MiniMax公司自研的大型语言模型。我可以帮助回答问题、提供信息、进行对话和执行多种语言处理任务。如果你有任何问题或需要帮助,请随时告诉我!",
    "choices": [
        {
            "finish_reason": "stop",
            "messages": [
                {
                    "sender_type": "BOT",
                    "sender_name": "MM智能助理",
                    "text": "你好!我是MM智能助理,一款由MiniMax公司自研的大型语言模型。我可以帮助回答问题、提供信息、进行对话和执行多种语言处理任务。如果你有任何问题或需要帮助,请随时告诉我!"
                }
            ]
        }
    ],
    "usage": {
        "total_tokens": 116
    },
    "input_sensitive": false,
    "output_sensitive": false,
    "id": "02b45c5d4d3be74655c8d5335188e568",
    "base_resp": {
        "status_code": 0,
        "status_msg": ""
    }
}

(Screenshot: non-streaming request via the MiniMax protocol)

Streaming request

Example 1: Calling the ChatCompletion V2 API

curl -X POST 'http://localhost:10000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-d '{
    "model": "abab6.5g-chat",
    "messages": [
        {
            "role": "user",
            "content": "你好,你是谁?"
        }
    ],
    "stream": true
}'

Response:

data: {"id":"02b45d9806cb2d7760a906a8590f4a4f","choices":[{"index":0,"delta":{"content":"你好","role":"assistant"}}],"created":1717906073,"model":"abab6.5g-chat","object":"chat.completion.chunk","output_sensitive":false,"input_sensitive_type":0,"output_sensitive_type":0}

data: {"id":"02b45d9806cb2d7760a906a8590f4a4f","choices":[{"index":0,"delta":{"content":",我的名字是MM智能助理,是一款由中国科技公司MiniMax研发的大模型产品。我能够处理自然语言信息,回答各种问题,同时我也可以和您聊天。您有任何问题都可以","role":"assistant"}}],"created":1717906074,"model":"abab6.5g-chat","object":"chat.completion.chunk","output_sensitive":false,"input_sensitive_type":0,"output_sensitive_type":0}

data: {"id":"02b45d9806cb2d7760a906a8590f4a4f","choices":[{"finish_reason":"stop","index":0,"delta":{"content":"向我咨询,我会尽我的能力帮助您。","role":"assistant"}}],"created":1717906074,"model":"abab6.5g-chat","object":"chat.completion.chunk","output_sensitive":false,"input_sensitive_type":0,"output_sensitive_type":0}

data: {"id":"02b45d9806cb2d7760a906a8590f4a4f","choices":[{"finish_reason":"stop","index":0,"message":{"content":"你好,我的名字是MM智能助理,是一款由中国科技公司MiniMax研发的大模型产品。我能够处理自然语言信息,回答各种问题,同时我也可以和您聊天。您有任何问题都可以向我咨询,我会尽我的能力帮助您。","role":"assistant"}}],"created":1717906074,"model":"abab6.5g-chat","object":"chat.completion","usage":{"total_tokens":123},"input_sensitive":false,"output_sensitive":false,"input_sensitive_type":0,"output_sensitive_type":0,"base_resp":{"status_code":0,"status_msg":""}}


(Screenshot: streaming request via the MiniMax protocol, 2)

Example 2: Calling the ChatCompletion Pro API

curl -X POST 'http://localhost:10000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-d '{
    "bot_setting": [
        {
            "bot_name": "MM智能助理",
            "content": "MM智能助理是一款由MiniMax自研的,没有调用其他产品的接口的大型语言模型。MiniMax是一家中国科技公司,一直致力于进行大模型相关的研究。"
        }
    ],
    "messages": [
        {
            "sender_type": "USER",
            "sender_name": "小明",
            "text": "你好,你是谁?"
        }
    ],
    "reply_constraints": {
        "sender_type": "BOT",
        "sender_name": "MM智能助理"
    },
    "model": "abab6.5s-chat",
    "tokens_to_generate": 2048,
    "temperature": 0.01,
    "top_p": 0.95,
    "stream": true
}'

Response:

data: {"created":1717905811,"model":"abab6.5s-chat","reply":"","choices":[{"messages":[{"sender_type":"BOT","sender_name":"MM智能助理","text":"你好"}]}],"output_sensitive":false,"request_id":"YOUR_MINIMAX_GROUP_ID_1717905810674735"}

data: {"created":1717905812,"model":"abab6.5s-chat","reply":"","choices":[{"messages":[{"sender_type":"BOT","sender_name":"MM智能助理","text":"!我是MM智能助理,一款由MiniMax公司自研的大型语言模型。我可以帮助回答问题、提供信息、进行对话和执行多种语言处理任务。如果你有任何问题或需要帮助,请随时告诉我!"}]}],"output_sensitive":false,"request_id":"YOUR_MINIMAX_GROUP_ID_1717905810674735"}

data: {"created":1717905812,"model":"abab6.5s-chat","reply":"你好!我是MM智能助理,一款由MiniMax公司自研的大型语言模型。我可以帮助回答问题、提供信息、进行对话和执行多种语言处理任务。如果你有任何问题或需要帮助,请随时告诉我!","choices":[{"finish_reason":"stop","messages":[{"sender_type":"BOT","sender_name":"MM智能助理","text":"你好!我是MM智能助理,一款由MiniMax公司自研的大型语言模型。我可以帮助回答问题、提供信息、进行对话和执行多种语言处理任务。如果你有任何问题或需要帮助,请随时告诉我!"}]}],"usage":{"total_tokens":116},"input_sensitive":false,"output_sensitive":false,"id":"02b45c92e213b1cfa4b7e0e1ab4cc3b5","base_resp":{"status_code":0,"status_msg":""}}


(Screenshot: streaming request via the MiniMax protocol)

Fixing the Wenxin Yiyan (ERNIE Bot) streaming response format under the OpenAI protocol

envoy.yaml

admin:
  address:
    socket_address:
      protocol: TCP
      address: 0.0.0.0
      port_value: 9901
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address:
          protocol: TCP
          address: 0.0.0.0
          port_value: 10000
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                scheme_header_transformation:
                  scheme_to_overwrite: https
                stat_prefix: ingress_http
                # Output envoy logs to stdout
                access_log:
                  - name: envoy.access_loggers.stdout
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
                # Modify as required
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: [ "*" ]
                      routes:
                        - match:
                            prefix: "/"
                          route:
                            cluster: baidu
                            timeout: 300s
                http_filters:
                  - name: wasmtest
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: wasmtest
                          vm_config:
                            runtime: envoy.wasm.runtime.v8
                            code:
                              local:
                                filename: /etc/envoy/plugin.wasm
                          configuration:
                            "@type": "type.googleapis.com/google.protobuf.StringValue"
                            value: |
                              {
                                  "provider": {
                                    "type": "baidu",
                                    "apiTokens": [
                                      "YOUR_BAIDU_API_TOKEN"
                                    ],
                                    "modelMapping": {
                                      "gpt-3": "ERNIE-4.0-8K",
                                      "*": "ERNIE-4.0-8K"
                                    },
                                    "protocol": "openai"
                                  }
                              }
                  - name: envoy.filters.http.router
  clusters:
    - name: httpbin
      connect_timeout: 30s
      type: LOGICAL_DNS
      # Comment out the following line to test on v6 networks
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: httpbin
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: httpbin
                      port_value: 80
    - name: baidu
      connect_timeout: 30s
      type: LOGICAL_DNS
      dns_lookup_family: V4_ONLY
      lb_policy: ROUND_ROBIN
      load_assignment:
        cluster_name: baidu
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: aip.baidubce.com
                      port_value: 443
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
          "sni": "aip.baidubce.com"

Streaming request

curl -X POST 'http://localhost:10000/v1/chat/completions' \
-H 'Content-Type: application/json' \
-d '{
    "model": "gpt-3",
    "messages": [
        {
            "role": "user",
            "content": "你好,你是谁?"
        }
    ],
    "stream": true
}'

Response:

data: {"id":"as-e8yq69nwwe","choices":[{"index":0,"message":{"role":"assistant","content":"你好,"}}],"created":1717765832,"model":"ERNIE-4.0-8K","object":"chat.completion","usage":{"prompt_tokens":4,"total_tokens":4}}

data: {"id":"as-e8yq69nwwe","choices":[{"index":0,"message":{"role":"assistant","content":"我是文心一言,可以协助你完成范围广泛的任务并提供有关各种主题的信息,比如回答问题,提供定义和解释及建议。"}}],"created":1717765834,"model":"ERNIE-4.0-8K","object":"chat.completion","usage":{"prompt_tokens":4,"total_tokens":4}}

data: {"id":"as-e8yq69nwwe","choices":[{"index":0,"message":{"role":"assistant","content":"如果你有任何问题,请随时向我提问。"}}],"created":1717765835,"model":"ERNIE-4.0-8K","object":"chat.completion","usage":{"prompt_tokens":4,"total_tokens":4}}

(Screenshot: fixed Wenxin Yiyan streaming response format under the OpenAI protocol)

Ⅴ. Special notes for reviews

@CH3CHO (Collaborator) left a comment:

I see that MiniMax has three Chat Completion APIs, and the current implementation uses V2. Is there a reason for this choice?

(Screenshot)

}

// When using the OpenAI protocol, map the model name
if m.config.protocol == protocolOpenAI {
@CH3CHO (Collaborator) commented on this line:

Is this check right? If the user's configuration does not use the OpenAI contract, then the request body above cannot have been deserialized according to the OpenAI contract in the first place. Also, this `protocol` refers to the contract, not the model.

@hanxiantao (Contributor, author) replied:

> Is this check right? If the user's configuration does not use the OpenAI contract, then the request body above cannot have been deserialized according to the OpenAI contract in the first place. Also, this `protocol` refers to the contract, not the model.

The intent here is that model mapping is applied only when the OpenAI protocol is used. With the provider's native protocol, the model passed in the request is used directly, with no mapping.

@CH3CHO (Collaborator) replied:

However, the modelMapping field's definition does not state that it only takes effect under the OpenAI protocol, so this logic may confuse users. A user who configures the Original protocol generally would not configure modelMapping anyway.

@hanxiantao (Contributor, author) replied:

> However, the modelMapping field's definition does not state that it only takes effect under the OpenAI protocol, so this logic may confuse users. A user who configures the Original protocol generally would not configure modelMapping anyway.

OK, I will adjust this.

@hanxiantao (Contributor, author) commented:

> I see that MiniMax has three Chat Completion APIs, and the current implementation uses V2. Is there a reason for this choice?

Only ChatCompletion v2 supports all models; ChatCompletion Pro only serves the abab6.5, abab6.5s, and abab5.5s models (and is the recommended API for them), and ChatCompletion only serves the abab5.5 and abab5.5s models (likewise recommended for them). I am considering adding a MiniMax-specific field so the user can choose which API to call; that seems more appropriate.

@CH3CHO (Collaborator) commented Jun 8, 2024:

> Only ChatCompletion v2 supports all models; ChatCompletion Pro only serves the abab6.5, abab6.5s, and abab5.5s models (and is the recommended API for them), and ChatCompletion only serves the abab5.5 and abab5.5s models (likewise recommended for them). I am considering adding a MiniMax-specific field so the user can choose which API to call; that seems more appropriate.

Or how about selecting the API automatically based on the model?

@hanxiantao (Contributor, author) commented:

> Or how about selecting the API automatically based on the model?

That works too: abab6.5, abab6.5s, and abab5.5s would prefer ChatCompletion Pro, abab5.5 would prefer ChatCompletion, and all other models would use ChatCompletion v2. I will adjust the implementation along these lines.

@hanxiantao (Contributor, author) commented:

> abab6.5, abab6.5s, and abab5.5s would prefer ChatCompletion Pro, abab5.5 would prefer ChatCompletion, and all other models would use ChatCompletion v2. I will adjust the implementation along these lines.

ChatCompletion Pro also supports abab5.5. The implemented logic is now: the abab6.5, abab6.5s, abab5.5s, and abab5.5 models prefer ChatCompletion Pro, and other models (abab6.5t, abab6.5g) use ChatCompletion v2.
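The selection logic described above can be sketched as a simple switch on the model name. The endpoint paths and the `-chat` suffixes below follow the model identifiers used elsewhere in this PR; treat them as assumptions and check the merged plugin code and the MiniMax API documentation for the exact values:

```go
package main

import "fmt"

// chooseEndpoint sketches the routing described in the discussion:
// the abab6.5 / abab6.5s / abab5.5s / abab5.5 models are sent to
// ChatCompletion Pro, everything else (e.g. abab6.5t, abab6.5g) to
// ChatCompletion v2. Paths and model suffixes are illustrative.
func chooseEndpoint(model string) string {
	switch model {
	case "abab6.5-chat", "abab6.5s-chat", "abab5.5s-chat", "abab5.5-chat":
		return "/v1/text/chatcompletion_pro"
	default:
		return "/v1/text/chatcompletion_v2"
	}
}

func main() {
	fmt.Println(chooseEndpoint("abab6.5-chat"))  // routed to Pro
	fmt.Println(chooseEndpoint("abab6.5g-chat")) // routed to v2
}
```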

@hanxiantao requested a review from @CH3CHO on June 9, 2024 04:26

@CH3CHO (Collaborator) left a comment:

LGTM

@CH3CHO merged commit d53c713 into alibaba:main on Jun 9, 2024. 11 checks passed.
@hanxiantao deleted the minimax-ai-proxy branch on June 9, 2024 13:53
Successfully merging this pull request may close these issues.

AI proxy Wasm plugin: integrate with MINIMAX