Simulate token counting during input #90

Closed
orangelckc opened this issue Mar 23, 2023 · 0 comments

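import GPT3Tokenizer from 'gpt3-tokenizer';

// Create the tokenizer once and reuse it for every estimate.
const tokenizer = new GPT3Tokenizer({ type: 'gpt3' });

// Estimate how many tokens a string uses by counting its BPE token ids.
export function estimateTokens(str: string): number {
    const encoded: { bpe: number[]; text: string[] } = tokenizer.encode(str);
    return encoded.bpe.length;
}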

Reference demo

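The count must cover both the input text and the role description.
If memory mode is enabled, the tokens consumed by the memory must be added as well.
The total number of uploaded tokens must be fewer than 4096. Leaving a margin of about 200 is recommended, because the estimate may differ from the actual count.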
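
The requirements above could be sketched roughly as follows. This is only an illustration, not the project's actual implementation: the names `PromptParts`, `totalPromptTokens`, and `fitsBudget` are hypothetical, and the counter is passed in as a parameter so that any estimator (for example the `estimateTokens` function shown earlier) can be plugged in. Only the 4096 limit and the 200-token margin come from the issue itself.

```typescript
// Hypothetical sketch: every name here is an assumption made for illustration.
// Only the 4096 limit and the 200-token margin are taken from the issue text.
type TokenCounter = (text: string) => number;

interface PromptParts {
  input: string;           // text the user is currently typing
  roleDescription: string; // role description sent along with the input
  memory?: string[];       // earlier messages, present only in memory mode
}

const MAX_TOKENS = 4096;   // upper limit on uploaded tokens
const SAFETY_MARGIN = 200; // head-room for estimation error

// Sum the estimated tokens of the input, the role description,
// and (when present) every remembered message.
function totalPromptTokens(parts: PromptParts, count: TokenCounter): number {
  const memory = parts.memory || [];
  const memoryTokens = memory.reduce((sum, msg) => sum + count(msg), 0);
  return count(parts.input) + count(parts.roleDescription) + memoryTokens;
}

// True when the estimated total stays within the limit minus the margin.
function fitsBudget(parts: PromptParts, count: TokenCounter): boolean {
  return totalPromptTokens(parts, count) <= MAX_TOKENS - SAFETY_MARGIN;
}
```

In use, a call such as `fitsBudget({ input, roleDescription, memory }, estimateTokens)` could run on each input change to warn the user before the request is sent.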

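The token-counting method should be extracted into its own function, because it will also be used later to count the tokens of the returned answer.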


orangelckc self-assigned this Mar 23, 2023
orangelckc added a commit that referenced this issue Mar 24, 2023
orangelckc added a commit that referenced this issue Mar 24, 2023
ayangweb pushed a commit that referenced this issue Mar 25, 2023
ayangweb reopened this Mar 25, 2023
Projects
Status: Done