push_to_hub from local model #1873

Closed
mmg10 opened this issue Jul 25, 2024 · 4 comments
Labels
🐛 bug Something isn't working

Comments

mmg10 commented Jul 25, 2024

System Info

  • transformers version: 4.43.2
  • Platform: Linux-6.10.0-1-cachyos-x86_64-with-glibc2.40
  • Python version: 3.12.4
  • Huggingface_hub version: 0.24.2
  • Safetensors version: 0.4.3
  • Accelerate version: 0.33.0
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.3.1+cpu (False)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?:

Who can help?

@muellerzr @SunMarc

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

In https://github.com/huggingface/trl/blob/main/examples/scripts/sft.py, I pass the following arguments.

    --model_name_or_path /home/ubuntu/work/Meta-Llama-3.1-8B \
    --dataset_name="HuggingFaceH4/no_robots" \
    --output_dir /home/ubuntu/work/Meta-Llama-3.1-8B-SFT \
    --report_to="wandb" \
    --push_to_hub true \
    --push_to_hub_model_id "llama3.1-8b-sft" \

I understand that I am using the SFTTrainer, but since it calls the super().push_to_hub method, I created the issue in this library.

The error is:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.11/site-packages/huggingface_hub/hf_api.py", line 3761, in create_commit
    hf_raise_for_status(response)
  File "/opt/conda/lib/python3.11/site-packages/huggingface_hub/utils/_errors.py", line 358, in hf_raise_for_status
    raise BadRequestError(message, response=response) from e
huggingface_hub.utils._errors.BadRequestError:  (Request ID: Root=1-66a1dfab-4f8aaadd66a3afc164a238e4;a70493b5-d4ba-4022-978f-9947dcf083c0)

Bad request:
"base_model" with value "/home/ubuntu/work/Meta-Llama-3.1-8B" is not valid. Use a model id from https://hf.co/models.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ubuntu/work/sft.py", line 151, in <module>
    trainer.save_model(training_args.output_dir)
  File "/opt/conda/lib/python3.11/site-packages/transformers/trainer.py", line 3458, in save_model
    self.push_to_hub(commit_message="Model save")
  File "/opt/conda/lib/python3.11/site-packages/trl/trainer/sft_trainer.py", line 475, in push_to_hub
    return super().push_to_hub(commit_message=commit_message, blocking=blocking, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/transformers/trainer.py", line 4349, in push_to_hub
    return upload_folder(
           ^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/huggingface_hub/hf_api.py", line 1398, in _inner
    return fn(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/huggingface_hub/hf_api.py", line 4857, in upload_folder
    commit_info = self.create_commit(
                  ^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/huggingface_hub/hf_api.py", line 1398, in _inner
    return fn(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/huggingface_hub/hf_api.py", line 3765, in create_commit
    raise ValueError(f"Invalid metadata in README.md.\n{message}") from e
ValueError: Invalid metadata in README.md.
- "base_model" with value "/home/ubuntu/work/Meta-Llama-3.1-8B" is not valid. Use a model id from https://hf.co/models.
wandb: | 0.037 MB of 0.037 MB uploaded

Expected behavior

The wandb logging and training complete successfully, but the push fails. This is because I am loading the model from the local directory /home/ubuntu/work/Meta-Llama-3.1-8B rather than from an HF Hub repo. However, the push should still work!
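As context for the error above, the Hub rejects base_model metadata that is not a model id. A minimal sketch of the distinction (the regex is my rough approximation, not the Hub's exact validation rule):

```python
import re

# Approximation (an assumption, not the Hub's exact rule) of a valid model id:
# "namespace/name" with exactly one separator and no leading slash,
# e.g. "meta-llama/Meta-Llama-3.1-8B".
HUB_ID_RE = re.compile(r"^[\w.-]+/[\w.-]+$")

def looks_like_hub_id(value: str) -> bool:
    """Return True if value resembles a Hub model id rather than a local path."""
    return bool(HUB_ID_RE.match(value))

# A local checkpoint path fails this check, which matches the Hub's complaint.
looks_like_hub_id("/home/ubuntu/work/Meta-Llama-3.1-8B")  # False
looks_like_hub_id("meta-llama/Meta-Llama-3.1-8B")         # True
```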

@mmg10 mmg10 added the 🐛 bug Something isn't working label Jul 25, 2024
@LysandreJik (Member)

I don't think there is support for push_to_hub_model_id in that sft.py script.

Moving your issue to TRL, cc @kashif maybe :)

@LysandreJik LysandreJik transferred this issue from huggingface/transformers Jul 25, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

qgallouedec (Member) commented Sep 9, 2024

Use hub_model_id instead; see the TrainingArguments documentation.

(max_steps 100 for demo)

python examples/scripts/sft.py \
    --model_name_or_path facebook/opt-350m \
    --dataset_name timdettmers/openassistant-guanaco \
    --output_dir opt-350m-sft \
    --dataset_text_field text \
    --push_to_hub \
    --max_steps 100 \
    --hub_model_id my_hub_model_id

https://huggingface.co/qgallouedec/my_hub_model_id

valayDave commented Oct 3, 2024

I am still facing the same error even when I pass hub_model_id to the TrainingArguments. I am not using the sft.py script; rather, I am using the SFTTrainer directly.

My model is loaded directly from the local machine, and then I run some fine-tuning on it. After trainer.train() finishes, I call trainer.push_to_hub() and the code crashes with the exact same error. What is a workaround here?

Versions of Libraries:

        "transformers": "4.44.2",
        "peft": "0.12.0",
        "trl": "0.10.1",
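One possible workaround, sketched under the assumption (consistent with the traceback) that the auto-generated model card copies the local checkpoint path into the base_model metadata field: repair that field before pushing the card. The hub id passed in is hypothetical and must be supplied by you; the commented huggingface_hub.metadata_update call is one way to apply the fix to an already-created repo.

```python
def fix_base_model(metadata: dict, hub_id: str) -> dict:
    """Return a copy of model-card metadata with a path-like base_model
    replaced by a proper Hub model id.

    hub_id: the Hub id of the checkpoint that was fine-tuned locally,
    e.g. "meta-llama/Meta-Llama-3.1-8B" (an assumption for this sketch).
    """
    fixed = dict(metadata)
    base = fixed.get("base_model", "")
    # Heuristic: a filesystem path starts with "/", "." or "~", or contains
    # more than one "/" separator; a Hub id is exactly "namespace/name".
    if base and (base.startswith(("/", ".", "~")) or base.count("/") != 1):
        fixed["base_model"] = hub_id
    return fixed

# Example: the metadata the failing push would have produced.
meta = {"base_model": "/home/ubuntu/work/Meta-Llama-3.1-8B"}
fixed = fix_base_model(meta, "meta-llama/Meta-Llama-3.1-8B")

# The fixed metadata could then be applied to the repo, e.g.:
# from huggingface_hub import metadata_update
# metadata_update("your-username/llama3.1-8b-sft", fixed, overwrite=True)
```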
