🐛 Bug
While inspecting the bert-base-multilingual-uncased vocabulary, I get the warning "Saving vocabulary to ./vocab.txt: vocabulary indices are not consecutive. Please check that the vocabulary is not corrupted".
I ran a similar command on two different machines and got the same warning on both.
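from pytorch_transformers import BertTokenizer
# Load the pretrained multilingual uncased tokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-multilingual-uncased', do_lower_case=True)
# Write vocab.txt to the current directory (this call emits the warning)
tokenizer.save_vocabulary('./')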
I ran it on:
- OS:
- Python version: python3.5
- PyTorch version: pytorch1.0.1.post2
- PyTorch Transformers version (or branch): 1.0
- Using GPU? Yes
- Distributed or parallel setup? No
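The warning text suggests that at least one index is skipped in the token-to-index mapping. Below is a minimal sketch for listing the skipped indices, assuming `tokenizer.vocab` is a dict-like mapping from token to integer index. `find_index_gaps` is a hypothetical helper name, not part of pytorch_transformers, and a toy mapping stands in for the real vocabulary here.

```python
def find_index_gaps(vocab):
    """Return the sorted indices missing from a token -> index mapping.

    Hypothetical helper for illustration; it only inspects the mapping
    and does not depend on any tokenizer library.
    """
    used = set(vocab.values())
    if not used:
        return []
    # Every index from 0 up to the largest one should be present.
    return [i for i in range(max(used) + 1) if i not in used]


# Toy mapping standing in for tokenizer.vocab (assumption: same shape).
toy_vocab = {"[PAD]": 0, "[UNK]": 1, "hello": 3, "world": 4}
print(find_index_gaps(toy_vocab))  # → [2]
```

On the real tokenizer this would presumably be called as `find_index_gaps(tokenizer.vocab)`; a non-empty result would show which positions trigger the warning.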