Support gpt-5.1 model in Tiktoken tokenizer #7556
Pull Request Overview
This PR adds support for the gpt-5.1 model in the Tiktoken tokenizer implementation, aligning with an open feature request in the official Tiktoken library.
Key changes:
- Added `gpt-5.1` model mapping to the O200kBase encoding in the tokenizer configuration
- Extended test coverage to include the new `gpt-5.1` model across multiple test scenarios
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/Microsoft.ML.Tokenizers/Model/TiktokenTokenizer.cs | Added gpt-5.1 model entries to prefix and exact match lookup tables for O200kBase encoding |
| test/Microsoft.ML.Tokenizers.Tests/TiktokenTests.cs | Added GPT5_1 tokenizer instance and included it in encoding tests and test data parameters |
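
The "prefix and exact match lookup tables" mentioned above can be pictured with a minimal sketch. This is an illustration in Python, not the C# code from `TiktokenTokenizer.cs`: the table contents, the `gpt-5.1-` prefix string, and the function name `encoding_for_model_name` are assumptions made for the example. The idea is to try an exact model name first, then fall back to a prefix match so that dated or suffixed variants resolve to the same encoding.

```python
from typing import Optional

# Hypothetical tables for illustration only; the real entries live in
# TiktokenTokenizer.cs and may differ in content and ordering.
EXACT_MODEL_TO_ENCODING = {
    "gpt-4": "cl100k_base",
    "gpt-4o": "o200k_base",
    "gpt-5.1": "o200k_base",  # the kind of exact-match entry this PR describes
}

# Checked in order; the "gpt-5.1-" prefix string is an assumption.
PREFIX_TO_ENCODING = [
    ("gpt-5.1-", "o200k_base"),
    ("gpt-4o-", "o200k_base"),
    ("gpt-4-", "cl100k_base"),
]


def encoding_for_model_name(model_name: str) -> Optional[str]:
    """Return the encoding name for a model, or None if it is unknown."""
    # 1. An exact match on the full model name wins.
    exact = EXACT_MODEL_TO_ENCODING.get(model_name)
    if exact is not None:
        return exact
    # 2. Otherwise fall back to the first matching prefix.
    for prefix, encoding in PREFIX_TO_ENCODING:
        if model_name.startswith(prefix):
            return encoding
    return None


if __name__ == "__main__":
    for name in ("gpt-5.1", "gpt-5.1-preview", "gpt-4o-mini", "not-a-model"):
        print(name, "->", encoding_for_model_name(name))
```

Without an entry in either table, a lookup for `gpt-5.1` in this sketch would return `None`, which is the gap the PR closes for the real tokenizer.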
Codecov Report
✅ All modified and coverable lines are covered by tests.

Coverage Diff
| | main | #7556 | +/- |
|---|---|---|---|
| Coverage | 69.02% | 69.02% | -0.01% |
| Files | 1482 | 1482 | |
| Lines | 274093 | 274096 | +3 |
| Branches | 28266 | 28266 | |
| Hits | 189183 | 189184 | +1 |
| Misses | 77527 | 77528 | +1 |
| Partials | 7383 | 7384 | +1 |
There is an open issue requesting the same support in the official Tiktoken library: openai/tiktoken#464.