
Replace placeholder tokens as specified in added_tokens_decoder #44468

Merged
itazap merged 1 commit into main from placeholder-tokens-fix on Mar 5, 2026

@itazap (Collaborator) commented on Mar 5, 2026

Replace placeholder tokens as specified in added_tokens_decoder.
If added_tokens_decoder assigns content to specific token IDs, the entries at those IDs need to be overwritten in the SentencePiece (spm) model.
Example: [UNUSED_TOKEN_146] -> <|im_start|>
See internlm2: https://huggingface.co/internlm/internlm2_5-7b-chat/blob/main/tokenizer_config.json
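
As a rough illustration of the overwrite described above (a minimal sketch, not the code from this PR): the helper `apply_added_tokens`, the toy vocabulary, and the token ID `3` are all hypothetical. The sketch assumes the vocabulary is a plain list of piece strings and that `added_tokens_decoder` has the shape used in `tokenizer_config.json`, with string IDs mapping to a dict that has a `content` field.

```python
def apply_added_tokens(pieces, added_tokens_decoder):
    """Return a copy of `pieces` with placeholder entries overwritten.

    Hypothetical helper: `pieces` is a list of vocabulary strings indexed by
    token ID; `added_tokens_decoder` maps token IDs (as strings) to dicts
    carrying the replacement text under the "content" key.
    """
    patched = list(pieces)  # leave the caller's list untouched
    for token_id, spec in added_tokens_decoder.items():
        idx = int(token_id)  # JSON object keys are strings
        if 0 <= idx < len(patched):
            patched[idx] = spec["content"]
    return patched


# Toy vocabulary; the ID 3 is made up for illustration.
pieces = ["<unk>", "<s>", "</s>", "[UNUSED_TOKEN_146]"]
added_tokens_decoder = {"3": {"content": "<|im_start|>"}}

patched = apply_added_tokens(pieces, added_tokens_decoder)
print(patched[3])
```

IDs that fall outside the vocabulary are skipped here; whether they should instead be appended is a design choice this sketch leaves open.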

@hmellor (Member) left a comment

This appears to fix this model in vLLM 🎉

@itazap merged commit e498b5b into main on Mar 5, 2026
27 checks passed
@itazap deleted the placeholder-tokens-fix branch on March 5, 2026 16:29