> Both LlamaTokenizer and AutoTokenizer return the same tokenizer, as they do for Llama2 models.

```python
In [17]: t = LlamaTokenizer.from_pretrained('meta-llama/Llama-2-7b-chat-hf')

In [18]: type(t)
Out[18]: transformers.models.llama.tokenization_llama.LlamaTokenizer

In [19]: t = AutoTokenizer.from_pretrained('meta-llama/Llama-2-7b-chat-hf')

In [20]: type(t)
Out[20]: transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast
```
I don't understand why you expect both to return the same type? One is the slow tokenizer, which relies on the sentencepiece backend, while the other is the fast one, which relies on the tokenizers backend 😉
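For reference, a minimal sketch of asking for either backend explicitly through AutoTokenizer with the use_fast flag; the checkpoint is just the Llama-2 one from the session above, and access to the gated repo is assumed:

```python
from transformers import AutoTokenizer

# Same checkpoint as the session above; access to the gated repo is assumed.
checkpoint = "meta-llama/Llama-2-7b-chat-hf"

# use_fast=False requests the slow, sentencepiece-backed LlamaTokenizer.
slow = AutoTokenizer.from_pretrained(checkpoint, use_fast=False)

# use_fast=True (the default) requests the tokenizers-backed LlamaTokenizerFast.
fast = AutoTokenizer.from_pretrained(checkpoint, use_fast=True)

print(type(slow))  # ...tokenization_llama.LlamaTokenizer
print(type(fast))  # ...tokenization_llama_fast.LlamaTokenizerFast
```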
System Info

transformers version: 4.40.1

Who can help?

@ArthurZucker

Information

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)

Reproduction
Initializing the tokenizer for a Llama3 model with LlamaTokenizer raises an error, while AutoTokenizer gives us transformers.tokenization_utils_fast.PreTrainedTokenizerFast.
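A sketch of the reproduction as I read it; the issue does not name the exact Llama3 checkpoint, so the model id below is an assumption:

```python
from transformers import AutoTokenizer, LlamaTokenizer

# Hypothetical checkpoint name; the issue does not say which Llama3 repo was used.
model_id = "meta-llama/Meta-Llama-3-8B"

# AutoTokenizer resolves to the generic fast tokenizer for Llama3.
t = AutoTokenizer.from_pretrained(model_id)
print(type(t))  # transformers.tokenization_utils_fast.PreTrainedTokenizerFast

# The slow LlamaTokenizer expects a sentencepiece vocab file, which Llama3
# repos do not ship in that format, so this is the call reported to fail.
t = LlamaTokenizer.from_pretrained(model_id)
```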
Expected behavior
Both LlamaTokenizer and AutoTokenizer return the same tokenizer, as they do for Llama2 models.
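If the expectation is about behavior rather than the concrete class, a small check along these lines (my own sketch, not from the issue) shows that the slow and fast Llama2 tokenizers are interchangeable for ordinary text even though their types differ:

```python
from transformers import AutoTokenizer, LlamaTokenizer

checkpoint = "meta-llama/Llama-2-7b-chat-hf"
slow = LlamaTokenizer.from_pretrained(checkpoint)   # sentencepiece backend
fast = AutoTokenizer.from_pretrained(checkpoint)    # tokenizers backend

text = "Both backends should agree on ordinary text."
print(slow.encode(text))
print(fast.encode(text))  # expected to match the slow output for typical inputs
```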