The /v1/embeddings endpoint only supports string input, not arrays #1291

Open
TimchaStudio opened this issue Apr 8, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@TimchaStudio

TimchaStudio commented Apr 8, 2024

let chunks: FileItemChunk[] = []

switch (fileExtension) {
  case "csv":
    chunks = await processCSV(blob)
    break
  case "json":
    chunks = await processJSON(blob)
    break
  case "md":
    chunks = await processMarkdown(blob)
    break
  case "pdf":
    chunks = await processPdf(blob)
    break
  case "txt":
    chunks = await processTxt(blob)
    break
  default:
    return new NextResponse("Unsupported file type", {
      status: 400
    })
}

let embeddings: any = []

const openai = new OpenAI({
  apiKey: profile.openai_api_key || "",
  organization: profile.openai_organization_id,
  baseURL: "https://one-api.com/v1"
})

const response = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: chunks.map(chunk => chunk.content)
})

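Here `input: chunks.map(chunk => chunk.content)` passes the array of strings produced by processing the file.
With baseURL set to the official endpoint https://api.openai.com/v1, the file uploads successfully and a response is returned.
With baseURL set to the one-api endpoint https://one-api.com/v1, the file upload returns the following error: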

Failed to upload. JSON object requested, multiple (or no) rows returned

It seems the endpoint only supports a string, not an array.

https://platform.openai.com/docs/api-reference/embeddings/create

@TimchaStudio TimchaStudio added the bug label Apr 8, 2024
@TimchaStudio TimchaStudio changed the title from "The v1/embeddings endpoint only supports a JSON object, not file chunks as input content" to "The /v1/embeddings endpoint only supports a JSON object, not file chunks as input content" Apr 8, 2024
@samiliver

I ran into a similar problem: requests to Zhipu AI's embedding service fail with an error.

`[INFO] 2024/04/09 - 12:06:33 | 2024040912063389521557768025579 | user 1 has enough quota 499999999798759, trusted and no need to pre-consume
[ERR] 2024/04/09 - 12:06:33 | 2024040912063389521557768025579 | panic detected: interface conversion: interface {} is []interface {}, not string
[ERR] 2024/04/09 - 12:06:33 | 2024040912063389521557768025579 | stacktrace from panic: goroutine 100 [running]:
runtime/debug.Stack()
/usr/local/go/src/runtime/debug/stack.go:24 +0x5e
github.com/songquanpeng/one-api/router.SetRelayRouter.RelayPanicRecover.func2.1()
/build/middleware/recover.go:18 +0xc5
panic({0xf6aca0?, 0xc0003f33e0?})
/usr/local/go/src/runtime/panic.go:770 +0x132
github.com/songquanpeng/one-api/relay/adaptor/zhipu.ConvertEmbeddingRequest(...)
/build/relay/adaptor/zhipu/adaptor.go:135
github.com/songquanpeng/one-api/relay/adaptor/zhipu.(*Adaptor).ConvertRequest(0x34d9e18?, 0xc0005102a0?, 0xc000718c60?, 0x7?)
/build/relay/adaptor/zhipu/adaptor.go:65 +0x35d
github.com/songquanpeng/one-api/relay/controller.RelayTextHelper(0xc0007c6000)
/build/relay/controller/text.go:71 +0x5e4
github.com/songquanpeng/one-api/controller.relayHelper(0xc0003e6be5?, 0x108fa98?)
/build/controller/relay.go:36 +0x52
github.com/songquanpeng/one-api/controller.Relay(0xc0007c6000)
/build/controller/relay.go:49 +0x145
github.com/gin-gonic/gin.(*Context).Next(0xc0007c6000)
/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/context.go:174 +0x2b
github.com/songquanpeng/one-api/router.SetRelayRouter.Distribute.func4(0xc0007c6000)
/build/middleware/distributor.go:56 +0x194
github.com/gin-gonic/gin.(*Context).Next(0xc0007c6000)
/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/context.go:174 +0x2b
github.com/songquanpeng/one-api/router.SetRelayRouter.TokenAuth.func3(0xc0007c6000)
/build/middleware/auth.go:142 +0x492
github.com/gin-gonic/gin.(*Context).Next(0xc0007c6000)
/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/context.go:174 +0x2b
github.com/songquanpeng/one-api/router.SetRelayRouter.RelayPanicRecover.func2(0xc0007c6098?)
/build/middleware/recover.go:31 +0x45
github.com/gin-gonic/gin.(*Context).Next(0xc0007c6000)
/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/context.go:174 +0x2b
main.main.Sessions.func3(0xc0007c6000)
/go/pkg/mod/github.com/gin-contrib/sessions@v0.0.5/sessions.go:54 +0x169
github.com/gin-gonic/gin.(*Context).Next(...)
/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/context.go:174
github.com/gin-gonic/gin.LoggerWithConfig.func1(0xc0007c6000)
/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/logger.go:240 +0xdd
github.com/gin-gonic/gin.(*Context).Next(0xc0007c6000)
/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/context.go:174 +0x2b
main.main.RequestId.func2(0xc0007c6000)
/build/middleware/request-id.go:17 +0x106
github.com/gin-gonic/gin.(*Context).Next(...)
/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/context.go:174
github.com/gin-gonic/gin.CustomRecoveryWithWriter.func1(0xc0007c6000)
/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/recovery.go:102 +0x7a
github.com/gin-gonic/gin.(*Context).Next(...)
/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/context.go:174
github.com/gin-gonic/gin.(*Engine).handleHTTPRequest(0xc0001541a0, 0xc0007c6000)
/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/gin.go:620 +0x66e
github.com/gin-gonic/gin.(*Engine).ServeHTTP(0xc0001541a0, {0x34d7d98, 0xc0001c8540}, 0xc0001467e0)
/go/pkg/mod/github.com/gin-gonic/gin@v1.9.1/gin.go:576 +0x1b2
net/http.serverHandler.ServeHTTP({0x34d4e00?}, {0x34d7d98?, 0xc0001c8540?}, 0x6?)
/usr/local/go/src/net/http/server.go:3137 +0x8e
net/http.(*conn).serve(0xc000412b40, {0x34d9e18, 0xc00041cb40})
/usr/local/go/src/net/http/server.go:2039 +0x5e8
created by net/http.(*Server).Serve in goroutine 1
/usr/local/go/src/net/http/server.go:3285 +0x4b4

[ERR] 2024/04/09 - 12:06:33 | 2024040912063389521557768025579 | request: POST /v1/embeddings
[ERR] 2024/04/09 - 12:06:33 | 2024040912063389521557768025579 | request body: {
"model": "embedding-2",
"input": [
"## 一、zookeeper概要、背景及作用\n\n---\n### zookeeper产生背景:\n项目从单体到分布式转变之后,将会产生多个节点之间协同的问题。如:\n1. 每天的定时任务由谁哪个节点来执行?\n2. RPC调用时的服务发现?\n3. 如何保证并发请求的幂等\n4. ....\n\n这些问题可以统一归纳为多节点协调问题,如果靠节点自身进行协调这是非常不可靠的,性能上也不可取。必须由一个独立的服务做协调工作,它必须可靠,而且保证性能。\n\n### zookeeper概 要:\nZooKeeper是用于分布式应用程序的协调服务。它公开了一组简单的API,分布式应用程序可以基于这些API用于同步,节点状态、配置等信息、服务注册等信息。其由JAVA编写,支持JAVA 和C两种语言的客户端。\n图片\n\n### znode 节点\nzookeeper 中数据基本单元叫节点,节点之下可包含子节点,最后以树级方式程现。每个节点拥有唯一的路径path。客户端基于PATH上传节点数据,zookeeper 收到后会实时通知对该路径进行监听的客户端。"
]
}
[GIN] 2024/04/09 - 12:06:33 | 2024040912063389521557768025579 | 500 | 4.678072ms | 192.168.32.6 | POST /v1/embeddings`
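The panic message in the log (`interface conversion: interface {} is []interface {}, not string`) is what Go raises when a single-value type assertion such as `x.(string)` meets a decoded JSON array. The code at `zhipu/adaptor.go:135` is not shown in this thread, so the following is only a sketch of the general pattern: `embeddingRequest` and `inputAsString` are hypothetical names, not one-api identifiers. It shows how the comma-ok form turns the panic into an ordinary error.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// embeddingRequest is a hypothetical, trimmed-down request type.
// It is NOT the real one-api struct.
type embeddingRequest struct {
	Model string `json:"model"`
	Input any    `json:"input"` // decodes to string or []any
}

// inputAsString returns the input only when it is a single string.
// The comma-ok assertion reports a mismatch as an error; the
// single-value form req.Input.(string) would panic on an array.
func inputAsString(body []byte) (string, error) {
	var req embeddingRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return "", err
	}
	s, ok := req.Input.(string)
	if !ok {
		return "", fmt.Errorf("input must be a string, got %T", req.Input)
	}
	return s, nil
}
```

Calling `inputAsString` with a body like the one logged above, where `input` is an array, would return an error rather than crash the handler with a 500.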

@songquanpeng
Owner

This will be handled.

@yorke669

Has this been resolved? I seem to be hitting the same problem: labring/FastGPT#1198 (comment)

@chawaa

chawaa commented Apr 19, 2024

@yorke669 I've run into the same problem too.

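@TimchaStudio TimchaStudio changed the title from "The /v1/embeddings endpoint only supports a JSON object, not file chunks as input content" to "The /v1/embeddings endpoint only supports a JSON object, not string arrays" Apr 19, 2024
@TimchaStudio TimchaStudio changed the title from "The /v1/embeddings endpoint only supports a JSON object, not string arrays" to "The /v1/embeddings endpoint only supports string input, not arrays" Apr 19, 2024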
@RexWzh

RexWzh commented Apr 23, 2024

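@songquanpeng This is probably not a OneAPI problem. It depends on the channel: for example, ChatAnywhere works fine for me, but Zhipu only accepts a single string, not a list of strings:

Ref Zhipu docs: https://open.bigmodel.cn/dev/api#text_embedding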
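If a channel really accepts only one string per request, one way a relay could still serve array input is to send each item separately and reassemble the results in order. This is a sketch under that assumption, not how one-api handles it: `embedFunc` and `embedAll` are hypothetical names, and `embedOne` stands in for whatever performs a single upstream call.

```go
package main

import "fmt"

// embedFunc is a hypothetical stand-in for one upstream call that
// accepts a single string and returns its embedding vector.
type embedFunc func(text string) ([]float64, error)

// embedAll sends each input on its own and returns the vectors in
// the same order as the inputs. It stops at the first failure.
func embedAll(inputs []string, embedOne embedFunc) ([][]float64, error) {
	vectors := make([][]float64, 0, len(inputs))
	for i, text := range inputs {
		vec, err := embedOne(text)
		if err != nil {
			return nil, fmt.Errorf("input %d: %w", i, err)
		}
		vectors = append(vectors, vec)
	}
	return vectors, nil
}
```

The trade-off is one upstream request per chunk, so a large file would cost many round trips and would count against any per-request rate limit the channel applies.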


@wuyang630

wuyang630 commented Apr 24, 2024

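> @songquanpeng This is probably not a OneAPI problem. It depends on the channel: for example, ChatAnywhere works fine for me, but Zhipu only accepts a single string, not a list of strings:
>
> Ref Zhipu docs: https://open.bigmodel.cn/dev/api#text_embedding

The logged request is as follows: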

[DEBUG] 2024/04/24 - 14:28:53 | 2024042414285385214797631405077 | request body: {"input": [[2028, 374, 264, 1296, 4733]], "model": "text-embedding-3-large", "encoding_format": "base64"}

The parsing code:

func (r GeneralOpenAIRequest) ParseInput() []string {
	if r.Input == nil {
		return nil
	}
	var input []string
	switch r.Input.(type) {
	case string:
		input = []string{r.Input.(string)}
	case []any:
		input = make([]string, 0, len(r.Input.([]any)))
		for _, item := range r.Input.([]any) {
			if str, ok := item.(string); ok {
				input = append(input, str)
			}
		}
	}
	return input
}
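Note what this function does with the request logged above: `input` there is an array of token-ID arrays (`[[2028, 374, ...]]`), so every item fails the `item.(string)` check and is skipped, leaving an empty slice with no error. A sketch of a stricter variant that reports such items instead of dropping them follows. `request`, `decode`, and `parseInputStrict` are hypothetical names used for illustration, not part of one-api.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// request is a hypothetical stand-in for GeneralOpenAIRequest that
// keeps only the field used here.
type request struct {
	Input any `json:"input"`
}

// decode parses a raw JSON body into the hypothetical request type.
func decode(body []byte) (request, error) {
	var r request
	err := json.Unmarshal(body, &r)
	return r, err
}

// parseInputStrict follows the same cases as ParseInput but returns
// an error for non-string items instead of silently skipping them.
func parseInputStrict(r request) ([]string, error) {
	switch v := r.Input.(type) {
	case nil:
		return nil, nil
	case string:
		return []string{v}, nil
	case []any:
		out := make([]string, 0, len(v))
		for i, item := range v {
			s, ok := item.(string)
			if !ok {
				return nil, fmt.Errorf("input[%d]: unsupported type %T", i, item)
			}
			out = append(out, s)
		}
		return out, nil
	default:
		return nil, fmt.Errorf("input: unsupported type %T", v)
	}
}
```

With this variant, the token-array request would surface as an explicit error that the caller can log or return, rather than as an empty input list.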
