
[Usage]: Run local models using vLLM #5011

Closed
bibutikoley opened this issue May 23, 2024 · 1 comment
Labels
usage How to use vllm

Comments

@bibutikoley

Your current environment

I tried running vLLM with TheBloke/Mistral-7B-Instruct-v0.1-GGUF using the command below:

python -m vllm.entrypoints.openai.api_server --model /aimodels/mistral-7b-instruct-v0.1.Q4_K_S.gguf --host 0.0.0.0 --port 5555 --tokenizer=hf-internal-testing/llama-tokenizer --trust-remote-code

But I got the following error:
OSError: It looks like the config file at '/aimodels/mistral-7b-instruct-v0.1.Q4_K_S.gguf' is not a valid JSON file.

How would you like to use vllm

I would like to run TheBloke/Mistral-7B-Instruct-v0.1-GGUF with vLLM.

@bibutikoley bibutikoley added the usage How to use vllm label May 23, 2024
@mgoin
Collaborator

mgoin commented May 23, 2024

Hi @bibutikoley, vLLM doesn't support models in the GGUF format. Please use original precision models or one of the many quantizations we do support, such as GPTQ, AWQ, FP8, etc. Thanks!
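For reference, a supported quantized checkpoint can be served with essentially the same command you tried. A minimal sketch, assuming you pull an AWQ export of the same model (the repo name and port here are placeholders; substitute whichever supported checkpoint you actually download):

python -m vllm.entrypoints.openai.api_server --model TheBloke/Mistral-7B-Instruct-v0.1-AWQ --quantization awq --host 0.0.0.0 --port 5555

The --quantization flag tells the loader which scheme to expect; an original-precision model such as mistralai/Mistral-7B-Instruct-v0.1 needs no extra flag.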

@mgoin mgoin closed this as completed May 23, 2024