Description
I have been trying to get vLLM running on Intel B60 cards with an Arrow Lake CPU. The procedure I followed is as follows; it is taken from the page linked below. I have 2 B60 cards in the machine, running Ubuntu 24.04.
```
Linux ARLS-ASRK-02 6.14.0-33-generic #33~24.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Sep 19 17:02:30 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
```
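Both B60s are enumerated on the host; a quick sanity check (plain lspci, nothing vLLM-specific):

```bash
# List display controllers on the host; both B60 cards should show up here.
lspci -nn | grep -i -E 'vga|display'
```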
https://docs.vllm.ai/en/stable/getting_started/installation/gpu.html#build-wheel-from-source
```bash
git clone https://github.com/vllm-project/vllm.git
cd vllm
docker build -f docker/Dockerfile.xpu -t vllm-xpu-env --shm-size=4g .
```
```bash
docker run -it \
    --rm \
    --network=host \
    --device /dev/dri \
    -v /dev/dri/by-path:/dev/dri/by-path \
    vllm-xpu-env
```
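Inside the container I sanity-check that the runtime actually sees both GPUs. This is a sketch assuming the image's PyTorch build exposes the torch.xpu backend:

```bash
# Inside the vllm-xpu-env container: report whether XPU is available
# and how many devices PyTorch can see (expecting 2 for two B60 cards).
python -c "import torch; print(torch.xpu.is_available(), torch.xpu.device_count())"
```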
```bash
python -m vllm.entrypoints.openai.api_server \
    --model=facebook/opt-13b \
    --dtype=bfloat16 \
    --max_model_len=1024 \
    --distributed-executor-backend=mp \
    --pipeline-parallel-size=2 \
    -tp=8
```
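As far as I understand, vLLM requires tensor_parallel_size x pipeline_parallel_size to match the number of available devices, so --pipeline-parallel-size=2 together with -tp=8 would ask for 16 GPUs while I only have 2. If that is right, the invocation for two cards should look more like this (a sketch, keeping my other flags unchanged):

```bash
# Two B60 cards: TP x PP must equal the device count (here 2 x 1 = 2).
python -m vllm.entrypoints.openai.api_server \
    --model=facebook/opt-13b \
    --dtype=bfloat16 \
    --max_model_len=1024 \
    --distributed-executor-backend=mp \
    --tensor-parallel-size=2
```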
Should I be using llm-scaler for 2 Intel B60 cards with an Arrow Lake CPU?