Tokenizer class YiTokenizer does not exist or is not currently imported. #23

Closed
zl-liu opened this issue Nov 6, 2023 · 8 comments

@zl-liu

zl-liu commented Nov 6, 2023

Thank you for your contributions to the community.

I tried loading Yi for inference, but I got the following error:

tokenizer = self.AUTO_TOKENIZER_CLASS.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/tokenization_auto.py", line 748, in from_pretrained
    raise ValueError(
ValueError: Tokenizer class YiTokenizer does not exist or is not currently imported.

I am using transformers 4.34.0 and I set trust_remote_code=True.

I am aware that since this is a "custom" model, files like "configuration_yi.py", "tokenization_yi.py", and "modeling_yi.py" will be executed.

In addition, I am aware that AutoTokenizer does NOT have YiTokenizer pre-registered ([the source code of AutoTokenizer](https://github.com/huggingface/transformers/blob/main/src/transformers/models/auto/tokenization_auto.py)).

Can you please provide your valuable insights? Thank you very much!
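For reference, the wrapper call in the traceback boils down to roughly the following (a simplified sketch of my setup; the checkpoint path is a placeholder):

from transformers import AutoTokenizer

# Placeholder for the local Yi checkpoint directory
model_path = "/path/to/Yi-34B"

# trust_remote_code=True is set, but the ValueError above is still raised
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)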

@loofahcus
Contributor

Unfortunately I can't reproduce this problem. Maybe you can give our Docker image a try (it will be released soon: #3).

@zl-liu
Author

zl-liu commented Nov 6, 2023

I did a deep dive into the code. I think this problem might be caused by a loading error in https://github.com/huggingface/transformers/blob/eef7ea98c31a333bacdc7ae7a2372bde772be8e4/src/transformers/models/auto/tokenization_auto.py. I downloaded the Yi-34B tar file, decompressed it, and used that directory as the pretrained path.

In tokenizer_config.json:

"auto_map": {
--
  | "AutoTokenizer": ["tokenization_yi.YiTokenizer", null]
  | },

Perhaps it failed to find the YiTokenizer class?
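One quick way to narrow this down (a diagnostic sketch, assuming the decompressed directory layout described above) is to check that tokenization_yi.py actually sits next to the checkpoint and can be imported directly:

import importlib.util
import os

model_dir = "/path/to/Yi-34B"  # placeholder for the decompressed checkpoint directory

# The auto_map entry points at tokenization_yi.py inside the checkpoint directory
tokenizer_file = os.path.join(model_dir, "tokenization_yi.py")
print("tokenization_yi.py present:", os.path.exists(tokenizer_file))

# Try importing YiTokenizer directly from that file
spec = importlib.util.spec_from_file_location("tokenization_yi", tokenizer_file)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
print(module.YiTokenizer)

If the direct import works, the problem is more likely in how AutoTokenizer resolves the auto_map entry than in the file itself.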

@mallorbc

mallorbc commented Nov 6, 2023

Run with "trust_remote_code" being set to True

@zl-liu
Author

zl-liu commented Nov 6, 2023

@mallorbc Thanks for your reply! I already set it to True (I even tried hard-coding "True" for "trust_remote_code").

@romulowspp

Hello! I'm receiving the same error. I also tried setting trust_remote_code to True and got the same error.

@MarceloCorreiaData

I had the same problem as you guys, so I reasoned that what reedcli meant in https://huggingface.co/01-ai/Yi-6B/discussions/6 is that we are expected to pass "trust_remote_code" to both instantiations.
This code worked like a charm for me, please try it out:

from transformers import AutoTokenizer, AutoModelForCausalLM
model_path = '/path/to/your/model'
model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True, dtype="bfloat16", use_accelerate=True)
P.S.: Add/replace the "device" snippet with your own; I use mine here.

@DmitryVN

DmitryVN commented Nov 15, 2023

Is there any way to fix this in text-generation-webui-main for ExLlama_HF? What and where should I edit and add?

@oobabooga

Is there any way to fix this in text-generation-webui-main for ExLlama_HF? What and where should I edit and add?

You need to launch the web UI with the --trust-remote-code flag (it is disabled by default as a security measure).
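For example (a sketch, assuming the usual server.py entry point):

python server.py --trust-remote-code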

@Yimi81 added the doc-not-needed label on Mar 8, 2024