[Feature]: AWQ DeepSeek support on MI300X #19727

I'm testing [`RedHatAI/DeepSeek-R1-0528-quantized.w4a16`](https://huggingface.co/RedHatAI/DeepSeek-R1-0528-quantized.w4a16) on 4xMI300X with this command:

```sh
vllm serve RedHatAI/DeepSeek-R1-0528-quantized.w4a16 \
  --host 0.0.0.0 \
  --port 3000 \
  --max-model-len 8192 \
  --max-seq-len-to-capture 8192 \
  --enable-chunked-prefill \
  --enable-prefix-caching \
  --trust-remote-code \
  --disable-log-requests \
  --tensor-parallel-size 4 \
  --gpu-memory-utilization 0.95 \
  --served-model-name deepseek-chat
```

And I get:

```
'_OpNamespace' '_C' object has no attribute 'gptq_marlin_repack'
```

I've tried `VLLM_USE_TRITON_AWQ=1` (it seems to be enabled automatically on ROCm devices anyway), but that didn't help: as far as I can tell, `awq_triton.py` has no `gptq_marlin_repack`, so the Triton path doesn't provide the missing op. See https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/layers/quantization/awq_triton.py
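
For anyone trying to reproduce this, here is a minimal sketch for checking whether a given custom op is registered in the build you are running. The helper names (`missing_ops`, `check_vllm_ops`) are hypothetical, and the sketch assumes that importing `vllm._C` is what registers the compiled kernels under `torch.ops._C`; the only op name taken from this issue is `gptq_marlin_repack`.

```python
from types import SimpleNamespace


def missing_ops(namespace, op_names):
    """Return the names in op_names that are not attributes of namespace."""
    return [name for name in op_names if not hasattr(namespace, name)]


def check_vllm_ops(op_names=("gptq_marlin_repack",)):
    """Probe torch.ops._C in the current environment.

    Returns the list of missing op names, or None if torch or the vLLM
    extension cannot be imported here.
    """
    try:
        import torch
        # Assumption: importing the compiled extension registers its ops.
        import vllm._C  # noqa: F401
    except (ImportError, OSError):
        return None
    return missing_ops(torch.ops._C, op_names)


if __name__ == "__main__":
    # Stand-in namespace (hypothetical op name) so the helper can be
    # exercised on a machine without vLLM installed.
    fake_ops = SimpleNamespace(some_registered_op=None)
    print(missing_ops(fake_ops, ["some_registered_op", "gptq_marlin_repack"]))
    print(check_vllm_ops())
```

On a build where the op is absent, `check_vllm_ops()` should report `gptq_marlin_repack` as missing, which matches the `AttributeError` above.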

#### Related:

* https://github.com/vllm-project/vllm/issues/11249
