[Bug]: ernie-3.0-nano-zh tokenizer is missing a token #6429

Closed
1 task done
LiShaoyu5 opened this issue Jul 18, 2023 · 1 comment
Labels: bug (Something isn't working), triage

LiShaoyu5 commented Jul 18, 2023

Software environment

- torch
- transformers

Duplicate issues

  • I have searched the existing issues

Error description

The OrderedVocab you are attempting to save contains a hole for index 12084, your vocabulary could be corrupted !

Steps to reproduce & code

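I loaded the nghuyong/ernie-3.0-nano-zh model with transformers. The message above appears while training and saving with Trainer.
I checked vocab.txt and tokenizer.json, and there is indeed no token with index 12084. Is this expected?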
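
For anyone who wants to repeat the check described above, here is a minimal sketch that lists the ids no token maps to. It is not taken from the issue: the function names are hypothetical, it assumes vocab.txt and tokenizer.json have been downloaded locally, and it assumes tokenizer.json stores the WordPiece vocabulary as a token-to-id dict under `model.vocab`, with extra entries under `added_tokens`.

```python
import json
from pathlib import Path


def find_holes(token_to_id):
    """Return every id in 0..max(id) that no token maps to."""
    used = set(token_to_id.values())
    if not used:
        return []
    return [i for i in range(max(used) + 1) if i not in used]


def load_ids_from_tokenizer_json(path):
    """Read token -> id from tokenizer.json (assumes a `model.vocab` dict)."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    token_to_id = dict(data["model"]["vocab"])
    # Added/special tokens are listed separately with their own ids.
    for added in data.get("added_tokens", []):
        token_to_id[added["content"]] = added["id"]
    return token_to_id


def load_ids_from_vocab_txt(path):
    """Read token -> id from vocab.txt, where the line index is the id.

    If the same token appears on two lines, only the later id is kept,
    so the earlier id ends up unused.
    """
    token_to_id = {}
    with open(path, encoding="utf-8") as f:
        for index, line in enumerate(f):
            token_to_id[line.rstrip("\r\n")] = index
    return token_to_id
```

Calling `find_holes(load_ids_from_tokenizer_json("tokenizer.json"))` or `find_holes(load_ids_from_vocab_txt("vocab.txt"))` on local copies of the files would show whether 12084 is the only unused id. One possible way for a line-indexed vocab.txt to end up with a hole is a token string that occurs twice, which is why the loader notes that case; whether that applies to this model is not confirmed here.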

LiShaoyu5 added the bug label Jul 18, 2023
FLYLKING commented Mar 4, 2024

Same question here.

paddle-bot closed this as completed Mar 18, 2025