
Don't try to load training_args.bin #373

Merged 1 commit into vllm-project:main on Jul 8, 2023

Conversation

@lpfhs (Contributor) commented on Jul 5, 2023

While trying to load a fine-tuned model, I was getting this exception:

```
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/venv/lib/python3.10/site-packages/vllm/entrypoints/api_server.py", line 82, in <module>
    engine = AsyncLLMEngine.from_engine_args(engine_args)
  File "/opt/venv/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 212, in from_engine_args
    engine = cls(engine_args.worker_use_ray,
  File "/opt/venv/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 49, in __init__
    self.engine = engine_class(*args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/vllm/engine/llm_engine.py", line 97, in __init__
    worker = worker_cls(
  File "/opt/venv/lib/python3.10/site-packages/vllm/worker/worker.py", line 45, in __init__
    self.model = get_model(model_config)
  File "/opt/venv/lib/python3.10/site-packages/vllm/model_executor/model_loader.py", line 49, in get_model
    model.load_weights(
  File "/opt/venv/lib/python3.10/site-packages/vllm/model_executor/models/llama.py", line 248, in load_weights
    for name, loaded_weight in hf_model_weights_iterator(
  File "/opt/venv/lib/python3.10/site-packages/vllm/model_executor/weight_utils.py", line 74, in hf_model_weights_iterator
    state = torch.load(bin_file, map_location="cpu")
  File "/opt/venv/lib/python3.10/site-packages/torch/serialization.py", line 809, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/opt/venv/lib/python3.10/site-packages/torch/serialization.py", line 1172, in _load
    result = unpickler.load()
  File "/opt/venv/lib/python3.10/site-packages/torch/serialization.py", line 1165, in find_class
    return super().find_class(mod_name, name)
AttributeError: Can't get attribute 'TrainingArguments' on <module 'vllm.entrypoints.api_server' from '/opt/venv/lib/python3.10/site-packages/vllm/entrypoints/api_server.py'>
```
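The change matches the PR title: skip training_args.bin when iterating over the checkpoint's .bin files instead of trying to unpickle it as weights. A minimal sketch of that kind of filter, assuming the loader simply globs *.bin files (the helper below is illustrative, not the exact vLLM code):

```python
import glob
import os

import torch


def hf_model_weights_iterator(model_dir: str):
    """Yield (name, tensor) pairs from the weight checkpoints in model_dir.

    Sketch only: the real vLLM loader has more logic (downloads, caching,
    sharded files); the point here is the training_args.bin filter.
    """
    for bin_file in sorted(glob.glob(os.path.join(model_dir, "*.bin"))):
        # training_args.bin is a pickled TrainingArguments object written by
        # HuggingFace's Trainer, not a weight checkpoint -- skip it.
        if os.path.basename(bin_file) == "training_args.bin":
            continue
        state = torch.load(bin_file, map_location="cpu")
        for name, tensor in state.items():
            yield name, tensor
        del state
```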

@zhuohan123 (Member)

Hi @lpfhs! Thanks for your contribution! Could you provide the name of a model that triggers this error so we can test it out?

@lpfhs (Contributor, author) commented on Jul 6, 2023

@zhuohan123 The model is not public, so I can't share it. It's a fine-tuned Vicuna 7B model that was trained using the training code in FastChat. You can see that the TrainingArguments class is present in the FastChat training script: https://github.com/lm-sys/FastChat/blob/0a827abe0cc60a3733b4406a070beb1ac8d0e5e1/fastchat/train/train.py#L50

@zhuohan123 (Member)

Just want to make sure this is a common pattern rather than a quirk of one particular model. Is this introduced only by FastChat's training script, or will any fine-tuned HuggingFace model have this training_args.bin?

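For context, HuggingFace's Trainer writes a pickled TrainingArguments object as training_args.bin alongside the weight shards when it saves a checkpoint, so a checkpoint directory produced through the standard Trainer will typically look like the hypothetical listing below (the path and shard names are illustrative):

```python
# Hypothetical listing of a Trainer-saved checkpoint directory, showing why a
# naive "*.bin" glob picks up training_args.bin along with the weight shards.
import glob
import os

model_dir = "/path/to/finetuned-vicuna-7b"  # illustrative path
for path in sorted(glob.glob(os.path.join(model_dir, "*.bin"))):
    print(os.path.basename(path))

# Typical output for a 7B checkpoint saved by Trainer (names vary):
#   pytorch_model-00001-of-00002.bin
#   pytorch_model-00002-of-00002.bin
#   training_args.bin   <- pickled TrainingArguments, not weights
```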
@lpfhs (Contributor, author) commented on Jul 6, 2023

@zhuohan123 I think the issue is that FastChat has a custom TrainingArguments class, so when loading the pickle file we need the definition of that custom class. From some web searching, it seems the transformers package can load training_args.bin when it is not a custom class (i.e., it is transformers.TrainingArguments).
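A minimal reproduction of that mechanism (illustrative, not FastChat's actual class): pickle stores a class by reference (module name plus attribute name), so unpickling raises the same kind of AttributeError whenever the referenced class cannot be found in the loading process:

```python
import pickle


class TrainingArguments:  # stands in for FastChat's subclass; defined only here
    def __init__(self, lr: float):
        self.lr = lr


# Pickling records the class as a reference ("__main__.TrainingArguments"),
# not its definition, so the bytes alone cannot reconstruct the object.
payload = pickle.dumps(TrainingArguments(lr=2e-5))

# Simulate loading in a process where that class is not importable, which is
# what happens when vLLM torch.load()s training_args.bin.
del TrainingArguments
try:
    pickle.loads(payload)
except AttributeError as exc:
    print(exc)  # Can't get attribute 'TrainingArguments' on <module '__main__' ...>
```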

@bryanhpchiang

running into the same issue


@zhuohan123 (Member) left a comment


Seems like this is a common issue. Thanks for your contribution!

@zhuohan123 merged commit 75beba2 into vllm-project:main on Jul 8, 2023
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
sjchoi1 pushed a commit to casys-kaist-internal/vllm that referenced this pull request May 7, 2024
Xaenalt pushed a commit to opendatahub-io/vllm that referenced this pull request Oct 14, 2024 ("Updating docker links & version references")