deps: Add protobuf to support ALLaM models #328
Conversation
Signed-off-by: Will Johnson <mwjohnson728@gmail.com>
Note: the error occurs when loading the tokenizer for the ALLaM model without protobuf:

```
ERROR:sft_trainer.py:Traceback (most recent call last):
  File "/home/tuning/.local/lib/python3.11/site-packages/tuning/sft_trainer.py", line 577, in main
    trainer = train(
              ^^^^^^
  File "/home/tuning/.local/lib/python3.11/site-packages/tuning/sft_trainer.py", line 195, in train
    tokenizer = AutoTokenizer.from_pretrained(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tuning/.local/lib/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 916, in from_pretrained
    return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tuning/.local/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 2271, in from_pretrained
    return cls._from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/tuning/.local/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 2505, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tuning/.local/lib/python3.11/site-packages/transformers/models/llama/tokenization_llama_fast.py", line 157, in __init__
    super().__init__(
  File "/home/tuning/.local/lib/python3.11/site-packages/transformers/tokenization_utils_fast.py", line 118, in __init__
    fast_tokenizer = convert_slow_tokenizer(slow_tokenizer)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tuning/.local/lib/python3.11/site-packages/transformers/convert_slow_tokenizer.py", line 1597, in convert_slow_tokenizer
    return converter_class(transformer_tokenizer).converted()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/tuning/.local/lib/python3.11/site-packages/transformers/convert_slow_tokenizer.py", line 538, in __init__
    requires_backends(self, "protobuf")
  File "/home/tuning/.local/lib/python3.11/site-packages/transformers/utils/import_utils.py", line 1531, in requires_backends
    raise ImportError("".join(failed))
ImportError:
LlamaConverter requires the protobuf library but it was not found in your environment. Checkout the instructions on the
installation page of its repo: https://github.com/protocolbuffers/protobuf/tree/master/python#installation and follow the ones
that match your environment. Please note that you may need to restart your runtime after installation.
```
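A quick way to confirm the failing precondition is to check whether the protobuf package is importable before calling `AutoTokenizer.from_pretrained`. A minimal sketch using only the standard library (`has_protobuf` is a hypothetical helper, not part of fms-hf-tuning or transformers):

```python
import importlib.util


def has_protobuf() -> bool:
    """Return True if the protobuf Python package is importable."""
    # find_spec returns None when the package is not installed,
    # which is exactly the condition that triggers the ImportError
    # inside transformers' LlamaConverter.
    return importlib.util.find_spec("google.protobuf") is not None


print("protobuf available:", has_protobuf())
```

If this prints `False`, converting a slow sentencepiece-based tokenizer (as the ALLaM/Llama tokenizer path does) will fail with the traceback above.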
Signed-off-by: Will Johnson <mwjohnson728@gmail.com>
The change looks good to me. I am testing the image build and tuning a llama3 model, then it is good to merge. Verified that the image build and the llama3 tuning both ran successfully.
Squashing the commits here. To add this change to the main branch, we would cherry-pick the single squashed commit.
Description of the change
Add protobuf v5.28.0 to fms-hf-tuning for compatibility with models, such as ALLaM, whose tokenizer conversion requires protobuf
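For reference, in a pip-based setup the pin would be a single requirements-style line (the actual dependency file and extras layout in fms-hf-tuning may differ):

```
protobuf==5.28.0
```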
Related issue number
How to verify the PR
Was the PR tested