How to enable vllm #536

Closed
lucasjinreal opened this issue Jul 4, 2023 · 4 comments

@lucasjinreal

Feature request

How to enable vllm

Motivation

How to enable vllm

Your contribution

How to enable vllm

@OlivierDehaene (Member)

Use 0.9 and a supported model
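For anyone finding this later, here is a minimal sketch of what "use 0.9 and a supported model" looks like in practice, assuming a local server started from the official Docker image. The image tag and model ID below are illustrative placeholders; pick an actual 0.9.x release tag and a model supported by the server.

```python
# Minimal sketch: launch text-generation-inference 0.9 and query it.
# The docker command mirrors the project README; tag and model ID are
# placeholders, not a specific recommendation:
#
#   docker run --gpus all --shm-size 1g -p 8080:80 -v $PWD/data:/data \
#       ghcr.io/huggingface/text-generation-inference:0.9 \
#       --model-id tiiuae/falcon-7b-instruct
import requests

resp = requests.post(
    "http://localhost:8080/generate",
    json={"inputs": "Hello, world", "parameters": {"max_new_tokens": 32}},
)
resp.raise_for_status()
print(resp.json()["generated_text"])
```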

@lucasjinreal (Author) commented Jul 4, 2023

@OlivierDehaene Hi, I don't know whether you are staff at Hugging Face or not,
but PLEASE indicate more details, not just "use 0.9".

  1. Is it enabled by default?
  2. If not, how do I enable it?
  3. Is there any documentation on how to use it?

I never knew a Hugging Face staff member could be so impatient with a GitHub community member.

@Narsil (Collaborator) commented Jul 4, 2023

> I never knew a Hugging Face staff member could be so impatient with a GitHub community member.

Please read your initial "Feature request" and tell me you made the effort to actually express a feature request intelligibly. Is this a bug?

There are many ways you could have phrased that, starting with a request to improve the docs because you couldn't find what you wanted. What is the actual question you had, where did you look for it, and what did you find instead?

Also, fill in the template properly instead of repeating the same thing over and over.

Our effort in replying scales with your effort.

  1. Yes.
  2. It's always on; there's no opt-out. It's just better overall than what was there previously.

There's no doc for it because, well, it's not necessary: if you use flash models, you just get it.

It's also NOT vllm. It's a custom variant of PagedAttention, which is what makes vllm faster. We do reuse a slightly modified version of their low-level kernel.
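To make that concrete, here is a minimal sketch of the memory-management idea behind PagedAttention: the KV cache is split into fixed-size blocks, and each sequence keeps a table of block indices instead of one contiguous buffer. The class and numbers below are illustrative only, not TGI's or vLLM's actual code.

```python
# Schematic sketch of the PagedAttention memory-management idea: the KV
# cache is carved into fixed-size blocks, and each sequence holds a
# "block table" of indices rather than one contiguous slab.
# Illustrative only; not the kernel TGI or vLLM actually ships.

BLOCK_SIZE = 16  # tokens per cache block (size chosen for illustration)

class PagedKVCache:
    def __init__(self, num_blocks: int):
        self.free_blocks = list(range(num_blocks))    # shared block pool
        self.block_tables: dict[int, list[int]] = {}  # seq_id -> block indices
        self.lengths: dict[int, int] = {}             # seq_id -> tokens stored

    def append_token(self, seq_id: int) -> tuple[int, int]:
        """Reserve a cache slot for one new token; returns (block, offset)."""
        table = self.block_tables.setdefault(seq_id, [])
        length = self.lengths.get(seq_id, 0)
        if length % BLOCK_SIZE == 0:               # current block full (or none yet)
            table.append(self.free_blocks.pop())   # grab a block from the pool
        self.lengths[seq_id] = length + 1
        return table[-1], length % BLOCK_SIZE

    def release(self, seq_id: int) -> None:
        """Return a finished sequence's blocks to the shared pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)

cache = PagedKVCache(num_blocks=4)
for _ in range(20):                 # 20 tokens span two 16-token blocks
    block, offset = cache.append_token(seq_id=0)
print(cache.block_tables[0])        # e.g. [3, 2]: blocks need not be contiguous
cache.release(seq_id=0)
```

Because blocks are allocated on demand and returned to a shared pool, memory isn't reserved up front for a sequence's maximum length, which is where the throughput win comes from.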

@lucasjinreal (Author)

@Narsil Hi, where is the custom PagedAttention kernel included?
