build(vllm-tensorizer): Compile vllm-flash-attn from source #70

Merged: 1 commit into ss/vllm-tensorizer on Jun 16, 2024

Conversation

@Eta0 (Collaborator) commented on Jun 15, 2024

Compile vllm-flash-attn from Source

vLLM replaced its use of the regular `flash-attn` library with its own `vllm-flash-attn` fork in vllm-project/vllm#4686, which, as of right now, is fairly easy to compile. This change compiles it from source for compatibility with the `ml-containers/torch` base images.

This is necessary to enable updating our `vllm-tensorizer` images to include the newest versions of vLLM.
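For illustration, the from-source build amounts to something like the sketch below. This is not the exact Dockerfile in this PR: `VLLM_FLASH_ATTN_VERSION` is assumed to be supplied as a build argument, and the way `/tmp/constraints.txt` is generated here is an assumption.

```dockerfile
# Sketch only: build vllm-flash-attn from source against the torch that
# already ships in the ml-containers/torch base image.
# Assumptions: VLLM_FLASH_ATTN_VERSION is passed with --build-arg, and the
# constraints file is generated from the preinstalled torch (the real
# Dockerfile may produce it differently).
ARG VLLM_FLASH_ATTN_VERSION
RUN python3 -c 'import torch; print(f"torch=={torch.__version__}")' > /tmp/constraints.txt && \
    git clone --filter=blob:none --depth 1 --no-single-branch --no-checkout \
        https://github.com/vllm-project/flash-attention.git && \
    cd flash-attention && \
    git checkout "v${VLLM_FLASH_ATTN_VERSION}" && \
    pip3 install -v --no-cache-dir --no-build-isolation --no-deps \
        -c /tmp/constraints.txt \
        ./
```

Installing with `--no-build-isolation` builds the extension against the torch already present in the image rather than inside an isolated build environment.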

@Eta0 added the bug and enhancement labels on Jun 15, 2024
@Eta0 requested a review from @sangstar on Jun 15, 2024, 01:12
@sangstar (Contributor) left a comment

LGTM. Comments/questions on a thing or two, but I don't expect any changes are warranted by them.

RUN git clone --filter=blob:none --depth 1 --no-single-branch --no-checkout \
https://github.com/vllm-project/flash-attention.git && \
cd flash-attention && \
git checkout "v${VLLM_FLASH_ATTN_VERSION}" && \
@sangstar commented on Jun 16, 2024

I imagine it's safe to assume the branch names won't ever conflict here and lead to ambiguity, so it's unnecessary to explicitly do `git checkout "tags/v${VLLM_FLASH_ATTN_VERSION}"`?
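For reference only (not part of this change): if a branch and a tag ever did share a name, the ref can be spelled out to remove the ambiguity, as in this sketch.

```shell
# Hypothetical disambiguation: name the tag ref explicitly so that a branch
# that happened to be called "v${VLLM_FLASH_ATTN_VERSION}" could not shadow it.
git checkout "refs/tags/v${VLLM_FLASH_ATTN_VERSION}"
# The shorter "tags/..." form mentioned above resolves the same way.
git checkout "tags/v${VLLM_FLASH_ATTN_VERSION}"
```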

-v --no-cache-dir --no-build-isolation --no-deps \
-c /tmp/constraints.txt \
./ && \
pip3 uninstall -y vllm-flash-attn
@sangstar commented on Jun 16, 2024

Why is the `pip3 uninstall` here necessary?

@sangstar merged commit 467a303 into ss/vllm-tensorizer on Jun 16, 2024
@sangstar deleted the es/vllm-tensorizer branch on Jun 16, 2024, 13:47