[update] tiktoken base upgrade #191

ohotto · 2024-05-12T17:21:19Z

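I have connected three relay (proxy) APIs.

I set the backend prices to 10 times the corresponding RMB prices.

In actual use, however, the token count chatnio computes is far larger than the actual value shown in the relay API's backend, roughly 1.5 to 3.2 times as large.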

Take the case above as an example: the model is gpt-3.5-turbo-0125 and the relay API's price is input: 0.0005/ktokens | output: 0.0015/ktokens

16/1000*0.0005+328/1000*0.0015=0.0005

The price configured in the chatnio backend is input: 0.005/ktokens | output: 0.015/ktokens, but the frontend reports 0.014145 points consumed

0.014145/10 = 0.0014145 >> 0.0005

0.0014145 / 0.0005 = 2.829x

In other words, the token consumption chatnio computed is 2.829 times the actual consumption.
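The arithmetic above can be reproduced with a short script. This is only a sketch of the reporter's calculation: the function and variable names are illustrative, and the token counts (16 prompt, 328 completion) and the 10x price factor are taken from the figures quoted in this report.

```python
def request_cost(prompt_tokens, completion_tokens, input_price, output_price):
    """Cost of one request; both prices are per 1,000 tokens."""
    return (prompt_tokens / 1000 * input_price
            + completion_tokens / 1000 * output_price)


# Figures quoted in the report (relay API side).
relay_cost = request_cost(16, 328, input_price=0.0005, output_price=0.0015)

# chatnio side: prices were configured at 10x the relay prices,
# so the reported points are divided by 10 before comparing.
PRICE_FACTOR = 10
reported_points = 0.014145
chatnio_equivalent = reported_points / PRICE_FACTOR

# How many times larger chatnio's charge is than the relay's.
ratio = chatnio_equivalent / relay_cost

print(f"relay cost:          {relay_cost:.7f}")
print(f"chatnio equivalent:  {chatnio_equivalent:.7f}")
print(f"overcount ratio:     {ratio:.3f}")
```

Because the prices on both sides differ only by the fixed factor of 10, any ratio other than 1 has to come from the token counts themselves.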

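After repeated testing, the problem shows up across the gpt3.5 and gpt4 model families, and the multiplier is different on every request: the lowest observed was 1.5 times the actual consumption, the highest about 3.2 times, and the rest cluster between 2.5 and 2.9 times. The most recent measurements were:
2.57, 2.71, 2.92, 2.51, 2.89, 2.77, 2.98, 2.72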
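A quick summary of the eight most recent multipliers, as a sketch using only the values listed above (variable names are illustrative):

```python
from statistics import mean

# The eight most recent multipliers listed in the report.
ratios = [2.57, 2.71, 2.92, 2.51, 2.89, 2.77, 2.98, 2.72]

lowest = min(ratios)
highest = max(ratios)
average = mean(ratios)
spread = highest - lowest

print(f"min={lowest}, max={highest}, mean={average:.3f}, spread={spread:.2f}")
```

Since the multiplier varies from request to request, dividing the configured prices by a single constant would not bring the charges in line with the relay's figures.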

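The deployment runs on ubuntu-amd64 with 1panel, is set up with docker-compose, and sits behind an OpenResty (Nginx) reverse proxy. I have tried both the stable and latest images, and the problem reproduces on both.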

AnnaStreeter · 2024-05-13T05:59:24Z

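The open-source edition's tokenizer currently uses Tiktoken Legacy, which does not line up with the current OpenAI GPT-3 billing. The commercial edition is correct. Porting commercial-edition changes down to the open-source edition is outside my remit, so there is no date yet for a fix in the open-source edition.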

zmh-program · 2024-05-24T15:28:09Z

Not a bug; the tiktoken version simply hasn't been updated.
The token counter is just a bit off; updating the encoding will fix it.
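For comparison, here is a minimal sketch of how chat tokens are commonly counted for gpt-3.5-turbo and gpt-4 models. It is not chatnio's code. The framing constants follow the counting recipe in the OpenAI cookbook, and the helper names (`count_chat_tokens`, `tiktoken_encoder`) are hypothetical. The encoder is passed in as a function so the counting logic can be checked without the `tiktoken` package installed.

```python
from typing import Callable, Dict, List

# Framing overhead per the OpenAI cookbook recipe for
# gpt-3.5-turbo-0613 and later, and gpt-4 chat models.
TOKENS_PER_MESSAGE = 3   # wrapper tokens around each message
TOKENS_PER_NAME = 1      # extra token when a "name" field is present
REPLY_PRIMING = 3        # every reply is primed with an assistant header


def count_chat_tokens(messages: List[Dict[str, str]],
                      encode: Callable[[str], list]) -> int:
    """Estimate prompt tokens for a list of chat messages."""
    total = 0
    for message in messages:
        total += TOKENS_PER_MESSAGE
        for key, value in message.items():
            total += len(encode(value))
            if key == "name":
                total += TOKENS_PER_NAME
    return total + REPLY_PRIMING


def tiktoken_encoder(model: str = "gpt-3.5-turbo-0125") -> Callable[[str], list]:
    """Return an encode function from tiktoken (imported lazily).

    Falls back to cl100k_base when the installed tiktoken
    does not recognise the model name.
    """
    import tiktoken
    try:
        return tiktoken.encoding_for_model(model).encode
    except KeyError:
        return tiktoken.get_encoding("cl100k_base").encode
```

With `tiktoken` installed, `count_chat_tokens(messages, tiktoken_encoder())` gives an estimate that can be compared with the `usage.prompt_tokens` value returned by the upstream API.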

Co-Authored-By: Minghan Zhang <112773885+zmh-program@users.noreply.github.com>

…ifferent device types (#204); optimize tiktoken performance (#191) and function calling fields

zmh-program changed the title ~~[bug] token calculation error~~ [update] tiktoken base upgrade May 24, 2024

zmh-program added the version up Pro version features of the devolution of the open source version label May 29, 2024

zmh-program assigned XiaomaiTX May 29, 2024

Sh1n3zZ added a commit that referenced this issue Jun 21, 2024

feat: update and optimize tokenizer performance (#191)

a470b22

Co-Authored-By: Minghan Zhang <112773885+zmh-program@users.noreply.github.com>

Sh1n3zZ assigned Sh1n3zZ and unassigned XiaomaiTX Jun 21, 2024

Sh1n3zZ added a commit that referenced this issue Jun 21, 2024

feat: update and optimize tokenizer performance (#191)

4c3843b

Co-Authored-By: Minghan Zhang <112773885+zmh-program@users.noreply.github.com>

Sh1n3zZ added a commit that referenced this issue Jun 21, 2024

feat: update and optimize tokenizer performance (#191)

c81b599

Co-Authored-By: Minghan Zhang <112773885+zmh-program@users.noreply.github.com>

zmh-program added a commit that referenced this issue Jun 22, 2024

feat: merge pr #211 from @Sh1n3zZ: define sending defaults based on d…

a51dc7f

…ifferent device types (#204); optimize tiktoken performance (#191) and function calling fields

Sh1n3zZ closed this as completed Jun 22, 2024
