
make OpenAI tokenizer more precise #346

Merged: 2 commits into main from make_openai_tokenizer_more_precise on Dec 12, 2023

Conversation

langchain4j (Owner) commented:
This PR is a rework of OpenAiTokenizer.
It adds OpenAiTokenizerIT with an extensive set of tests to ensure that OpenAiTokenizer calculates token usage very close to OpenAI's.
In most cases the calculation matches OpenAI 1:1; in some corner cases the difference is within 5%.
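For context, a minimal sketch of how the tokenizer might be used to estimate token usage before sending a request. The constructor argument and the estimateTokenCountInText method reflect the langchain4j Tokenizer API as it existed around this release; they are stated here as assumptions, not something this PR itself confirms.

```java
import dev.langchain4j.model.openai.OpenAiTokenizer;

public class TokenEstimationExample {

    public static void main(String[] args) {
        // Token counts are model-specific, so the tokenizer is
        // constructed for a concrete model name (assumed API).
        OpenAiTokenizer tokenizer = new OpenAiTokenizer("gpt-3.5-turbo");

        String text = "Hello, how can I help you today?";

        // Estimate how many tokens OpenAI would count for this text;
        // per this PR, the estimate should match OpenAI's reported
        // usage exactly in most cases, and stay within 5% otherwise.
        int tokens = tokenizer.estimateTokenCountInText(text);
        System.out.println("Estimated tokens: " + tokens);
    }
}
```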

langchain4j merged commit 8e4254f into main on Dec 12, 2023
3 checks passed
langchain4j deleted the make_openai_tokenizer_more_precise branch on December 12, 2023 at 15:45