
[CI/Build] fix pip cache with vllm_nccl & refactor dockerfile to build wheels #3859

Merged: 13 commits merged into main on Apr 5, 2024

Conversation

youkaichao (Member):

The vllm_nccl package must be installed from its source distribution.
pip is smart enough to store the locally built wheel in its cache, and other CI jobs
would then install that cached wheel directly, which is not what we want,
so we need to remove it from the cache manually.
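A minimal sketch of this kind of workaround (the package name comes from the description above; the exact pip flags and cache handling used in this PR are assumptions, not taken from its diff):

# Force installation from the source distribution instead of a published wheel.
pip install vllm_nccl --no-binary vllm_nccl
# pip still caches the wheel it builds from the sdist, so a later CI job would
# silently reuse it; purge that cache entry by hand (requires pip >= 20.1).
pip cache remove "vllm_nccl*"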

youkaichao requested a review from simon-mo, April 4, 2024 21:23
simon-mo (Collaborator) left a comment:


Actually this won't work, because we don't want to ship the devel image for production. We should still use the runtime base image. The right fix should be to change

FROM nvidia/cuda:12.1.0-runtime-ubuntu22.04 AS vllm-base

to

FROM nvidia/cuda:12.1.0-base-ubuntu22.04 AS vllm-base

youkaichao (Member, Author):

@simon-mo please take a look and see if the Dockerfile modification is good. The tests seem to be OK with the modification.

youkaichao (Member, Author):

@simon-mo so we want to build the wheel using the dev image, and use that wheel for the rest of the images, right?

simon-mo (Collaborator) commented Apr 4, 2024:

Yes. Also, manually test the openai image locally to ensure it has all the necessary dependencies.

We want to avoid using the devel image for testing and production in general

youkaichao (Member, Author):

Ideal case:

nvidia/cuda:12.1.0-devel-ubuntu22.04 --> dev (install requirements) --> build (for building vllm wheels) and flash-attn-builder (for building flash-attn wheels)

nvidia/cuda:12.1.0-base-ubuntu22.04 --> vllm-base (install vllm from the wheel) --> test (for testing, only needs to copy the tests folder) and vllm-openai (for the openai server, only needs to install accelerate, hf_transfer, modelscope and start the server)
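Roughly, as a multi-stage Dockerfile sketch (a sketch only: stage contents, paths, and the python/pip setup are illustrative and not copied from this PR's Dockerfile; the flash-attn-builder stage is omitted for brevity):

# --- build side, based on the devel image ---
FROM nvidia/cuda:12.1.0-devel-ubuntu22.04 AS dev
RUN apt-get update && apt-get install -y python3-pip
COPY requirements.txt /workspace/
RUN pip install -r /workspace/requirements.txt

FROM dev AS build
COPY . /workspace/vllm
RUN cd /workspace/vllm && python3 setup.py bdist_wheel

# --- runtime side, based on the slim base image ---
FROM nvidia/cuda:12.1.0-base-ubuntu22.04 AS vllm-base
RUN apt-get update && apt-get install -y python3-pip
# install vllm from the wheel built above instead of building again
COPY --from=build /workspace/vllm/dist/*.whl /tmp/
RUN pip install /tmp/*.whl

FROM vllm-base AS test
# only needs the tests folder on top of the installed wheel
COPY tests/ /workspace/tests/

FROM vllm-base AS vllm-openai
RUN pip install accelerate hf_transfer modelscope
ENTRYPOINT ["python3", "-m", "vllm.entrypoints.openai.api_server"]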

youkaichao changed the title from "[CI/Build] fix pip cache with vllm_nccl" to "[CI/Build] fix pip cache with vllm_nccl & refactor dockerfile to build wheels" on Apr 4, 2024
youkaichao requested a review from simon-mo, April 5, 2024 00:05
simon-mo (Collaborator) left a comment:


Please confirm locally that the openai server still works as expected.

youkaichao (Member, Author):

> Please confirm locally that the openai server still works as expected.

I don't get it. I think we have tests for the api server 👀

simon-mo (Collaborator) commented Apr 5, 2024:

Similar to the release process, can you build the final container locally:

DOCKER_BUILDKIT=1 docker build . --target vllm-openai --tag vllm/vllm-openai --build-arg max_jobs=1

and confirm the server works:

docker run --runtime nvidia --gpus all -p 8000:8000 vllm/vllm-openai

Our CI uses the test container but not the openai server container.
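For reference, a quick smoke test once the container is up (an assumption on my part, not from this thread: it relies on the default port mapping above and the OpenAI-compatible /v1/models endpoint):

# list the models the running server exposes; a JSON response means the server is up
curl http://localhost:8000/v1/models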

youkaichao (Member, Author):

Okay, confirmed it works.

youkaichao merged commit d03d64f into main on Apr 5, 2024
35 checks passed
youkaichao deleted the fix_ci_nccl branch, April 5, 2024 04:53
z103cb pushed a commit to z103cb/opendatahub_vllm that referenced this pull request Apr 22, 2024
[CI/Build] fix pip cache with vllm_nccl & refactor dockerfile to build wheels (vllm-project#3859)