[Bug]: vLLM ModelConfig doesn't pass hf_overrides (which may contain the Hugging Face auth token, when it isn't in the ENV) to get_hf_image_processor_config #14854
Comments
I don't think using …
@DarkLight1337 It works fine if you just have one HF token, but if you have multiple, you need to reset the environment variable every time. Wouldn't it make more sense to use the built-in auth-token functionality in transformers? It exists precisely to support access with multiple tokens. I have an issue right now with my Unsloth build, since vLLM cuts off the auth-token pathway from Unsloth to transformers.
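For context, a minimal sketch of the single-token workaround being described here, using the standard Hugging Face Hub environment variable (the model names and tokens are placeholders):

```python
import os

from vllm import LLM

# Workaround today: swap the process-wide Hub token before each load.
# This is global state, so two models gated behind different tokens
# cannot be loaded without mutating the environment in between.
os.environ["HF_TOKEN"] = "hf_xxx_token_for_org_a"  # placeholder token
llm_a = LLM(model="org-a/private-model")           # placeholder gated repo

os.environ["HF_TOKEN"] = "hf_yyy_token_for_org_b"  # placeholder token
llm_b = LLM(model="org-b/private-model")           # placeholder gated repo
```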
What API is required for vLLM to support this? I'm not familiar with how this works in Unsloth.
Unsloth uses vLLM for fast inference: it instantiates the LLM class (vllm/entrypoints/llm.py) with the vLLM-defined set of arguments. This class doesn't take a token, so we have to pass it through hf_overrides (which the class does support). hf_overrides is passed on to create_engine_config (vllm/engine/arg_utils.py), which passes it on to ModelConfig. In that class, hf_overrides is parsed correctly, but it is only applied when building the text config via get_hf_text_config(), which makes no use of any auth token. hf_overrides is not passed to the subsequent get_hf_image_processor_config() call. That function internally calls transformers' get_image_processor_config(), which can take a token as a parameter. If the token were forwarded through get_hf_image_processor_config(), we would get multiple-token support. I have tried my best to explain what happens here; let me know if you need anything else, or if it's too confusing.
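To make the chain above concrete, here is a heavily condensed sketch of the relevant pieces. The function bodies are illustrative, not the actual vLLM source, and the transformers import path is taken from recent versions and may differ:

```python
from typing import Any, Optional

# transformers' helper that fetches preprocessor_config.json; it
# accepts token=... for authenticated Hub access.
from transformers.models.auto.image_processing_auto import (
    get_image_processor_config,
)


def get_hf_image_processor_config(
    model: str,
    revision: Optional[str] = None,
    **kwargs: Any,
) -> dict[str, Any]:
    # Illustrative stand-in for vLLM's wrapper: whatever kwargs reach
    # this point are forwarded, but no token is ever put in them.
    return get_image_processor_config(model, revision=revision, **kwargs)


class ModelConfig:  # heavily reduced sketch, not vllm/config.py verbatim
    def __init__(self, model: str, hf_overrides: dict,
                 revision: Optional[str] = None) -> None:
        self.model = model
        self.hf_overrides = hf_overrides
        # hf_overrides (possibly carrying {"token": "hf_..."}) is
        # applied when building the text config (elided here)...
        # ...but is NOT forwarded below, so the request for
        # preprocessor_config.json goes out unauthenticated:
        self.hf_image_processor_config = get_hf_image_processor_config(
            model, revision=revision
        )
```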
If Unsloth supports directly passing through the LLM arguments, we can introduce a new LLM engine argument to pass the tokens so they aren't logged, as I noted previously.
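For illustration, the proposed interface might look like the following; the hf_token argument is hypothetical and does not exist in vLLM today:

```python
from vllm import LLM

# Hypothetical, not an existing vLLM argument: a dedicated credential
# parameter would keep the token out of hf_overrides, which is echoed
# into the engine logs as part of the model configuration.
llm = LLM(
    model="org-a/private-model",  # placeholder gated repo
    hf_token="hf_xxx",            # proposed argument (hypothetical)
)
```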
I can set up a PR on Unsloth to do that; this would be very helpful. Let me know once this feature is supported, or if I can help in any other way. Thank you.
I'm busy with other PRs, so let's see if anyone is available to take this up on the vLLM side.
Your current environment
The output of `python collect_env.py`
🐛 Describe the bug
vLLM's ModelConfig in config.py doesn't pass hf_overrides to get_hf_image_processor_config.

This hf_overrides dict might contain the Hugging Face auth token. From Unsloth, the only way to pass the auth token to vLLM (other than via the environment) is through hf_overrides. When pulling custom models from Hugging Face through vLLM, we hit an authentication error:

[error traceback omitted]

This limits the use of multiple HF tokens with vLLM. As the transformers source shows, get_image_processor_config accepts a token parameter; we just need to pass it along when calling get_hf_image_processor_config from vllm/config.py.
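A possible fix, sketched under the assumption that the token is supplied via hf_overrides under the "token" key; the real signatures in vllm/config.py and vllm/transformers_utils/config.py may differ, and the helper below is hypothetical:

```python
from typing import Any, Optional

# Real vLLM helper (module path from the vLLM source tree); the sketch
# assumes it forwards extra kwargs to transformers' loader.
from vllm.transformers_utils.config import get_hf_image_processor_config


def build_image_processor_config(
    model: str,
    hf_overrides: Optional[dict[str, Any]],
    revision: Optional[str] = None,
) -> dict[str, Any]:
    # Hypothetical helper, not actual vLLM code: pull the credential
    # out of hf_overrides if one was supplied there ("token" is the
    # kwarg name transformers expects)...
    token = (hf_overrides or {}).get("token")
    # ...and forward it so the fetch of preprocessor_config.json is
    # authenticated instead of anonymous.
    return get_hf_image_processor_config(
        model, revision=revision, token=token
    )
```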