Fix tiktoken compatibility with non-OpenAI models #155

Merged
kevin-mindverse merged 1 commit into mindverse:master from CXL-edu:fix-tiktoken-compatibility
Apr 7, 2025

@CXL-edu CXL-edu commented Apr 6, 2025

Problem
When a custom model is configured, tiktoken may fail to resolve a tokenizer for non-OpenAI model names. Tokenizers for models such as Qwen2.5 and Claude are not available in the tiktoken library, so the lookup raises the following error:

KeyError: 'Could not automatically map qwen2.5:1.5b to a tokeniser. Please use `tiktoken.get_encoding` to explicitly get the tokeniser you expect.'

Solution
This PR adds compatibility handling by using the cl100k_base tokenizer as a default fallback for models not supported by tiktoken. This approach ensures that operations requiring tokenization can proceed without errors for non-OpenAI models like Qwen2.5 and Claude.
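
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual diff: the helper names `get_encoding_with_fallback` and `count_tokens` are hypothetical, and it assumes the failure surfaces as a `KeyError` from `tiktoken.encoding_for_model`, as in the error message quoted in the Problem section.

```python
def get_encoding_with_fallback(model_name: str, fallback_encoding: str = "cl100k_base"):
    """Return a tiktoken encoding for model_name, or a fallback encoding.

    Hypothetical helper: the name and signature are assumptions made for
    illustration, not code taken from the PR.
    """
    # Imported inside the function so this module can be loaded even where
    # tiktoken is not installed.
    import tiktoken

    try:
        # Succeeds for model names that tiktoken can map to an encoding.
        return tiktoken.encoding_for_model(model_name)
    except KeyError:
        # tiktoken could not map the name (for example "qwen2.5:1.5b"),
        # so fall back to an explicitly named encoding.
        return tiktoken.get_encoding(fallback_encoding)


def count_tokens(text: str, model_name: str) -> int:
    """Count tokens in text with the model's encoding, or with the fallback."""
    encoding = get_encoding_with_fallback(model_name)
    return len(encoding.encode(text))
```

With this in place, a call such as `count_tokens("hello world", "qwen2.5:1.5b")` would return a count instead of raising. Because cl100k_base is not the vocabulary Qwen2.5 or Claude actually use, counts obtained through the fallback are best treated as estimates.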

@yingapple yingapple left a comment

Thanks for your contribution.

@kevin-mindverse kevin-mindverse self-requested a review April 7, 2025 02:08
@kevin-mindverse

Great Job!

@kevin-mindverse kevin-mindverse merged commit f36116b into mindverse:master Apr 7, 2025
1 check passed
Heterohabilis pushed a commit to Heterohabilis/Second-Me that referenced this pull request May 29, 2025
EOMZON pushed a commit to EOMZON/Second-Me that referenced this pull request Feb 1, 2026