
chore: update vllm to use gptq quantized model #378

Merged 3 commits into main on Apr 10, 2024

Conversation

YrrepNoj (Member):

This PR changes the default model that the vLLM container uses.

It captures the work completed in defenseunicorns/leapfrogai-backend-vllm PR #21 that had not yet been migrated to the monorepo.
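
For context, below is a minimal sketch of how a backend loads a GPTQ-quantized model through vLLM's offline Python API. The model ID and sampling settings are illustrative assumptions, not the actual defaults changed by this PR:

```python
# Minimal sketch: serving a GPTQ-quantized model with vLLM's offline API.
# The model ID below is a hypothetical placeholder, not the default set by this PR.
from vllm import LLM, SamplingParams

llm = LLM(
    model="TheBloke/example-7B-GPTQ",  # hypothetical GPTQ repo on Hugging Face
    quantization="gptq",               # tell vLLM the weights are GPTQ-quantized
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["What does GPTQ quantization trade off?"], params)
print(outputs[0].outputs[0].text)
```

GPTQ stores weights at reduced precision (typically 4-bit), so a quantized default lets the container run the same model family with a substantially smaller GPU memory footprint.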

YrrepNoj requested a review from a team as a code owner on April 10, 2024 at 19:07.
netlify bot commented on Apr 10, 2024:

Deploy Preview for leapfrogai-docs canceled.

* Latest commit: 3e85b92
* Latest deploy log: https://app.netlify.com/sites/leapfrogai-docs/deploys/6616ed310bf7eb000828ed81

gerred merged commit dc1029d into main on Apr 10, 2024 (7 checks passed) and deleted the 357-update-vllm-to-gptq-quantized-model branch later that day at 22:19.
andrewrisse pushed commits referencing this pull request on Apr 17 and Apr 18, 2024, each carrying the same messages:

* chore: update vllm to use gptq quantized model

* bug: fix catch-all wildcard for e2e workflow

Signed-off-by: Andrew Risse <andrewrisse@gmail.com>