Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mistral AWQ not generating any output #2833

Closed
annthsu opened this issue Feb 11, 2024 · 2 comments
Closed

Mistral AWQ not generating any output #2833

annthsu opened this issue Feb 11, 2024 · 2 comments

Comments

@annthsu
Copy link

annthsu commented Feb 11, 2024

I have tried all the methods mentioned in #2728, but none of them have worked at all,responses are always empty
I use 2*A6000 for GPU
Here is my Docker run command
Can anyone help me for this issue?Thank you !

docker run --rm --runtime nvidia --gpus all --env NCCL_P2P_DISABLE=1 --env CUDA_VISIBLE_DEVICES=0,1 --shm-size 10g -p 8002:8002 \
    -e MODEL=/app/model \
    -e API=se \
    -v /home/merge_model/:/app/model \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --ipc=host \
    vllm/vllm-openai:latest \
    --port 8002 \
    --model /app/model/mistral/testtest \
    --served-model-name "openchat" \
    --max-num-seqs 2000 \
    --max-num-batched-tokens 327680 \
    --tensor-parallel-size 2 \
    --disable-custom-all-reduce \
    --enforce-eager \
    --quantization 'awq' \
    --gpu-memory-utilization 1 \
    --max-model-len 21440
@annthsu annthsu changed the title Mixtral AWQ not generating any output Mistral AWQ not generating any output Feb 11, 2024
@whitebill-eng
Copy link

Hi! Try to use another chat template ex.: --chat-template ./examples/template_chatml.jinja

@hmellor
Copy link
Collaborator

hmellor commented Aug 28, 2024

Closing as stale

@hmellor hmellor closed this as not planned Won't fix, can't repro, duplicate, stale Aug 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants