
chore: update vllm to use gptq quantized model #378

Merged 3 commits into main on Apr 10, 2024

Conversation

YrrepNoj (Member):

This PR changes the default model that the vLLM container uses.

It captures the work completed in defenseunicorns/leapfrogai-backend-vllm PR #21 that had not yet been migrated to the monorepo.
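
For context, below is a minimal sketch of how a backend loads a GPTQ-quantized model through vLLM's offline Python API. The model ID and sampling settings are illustrative assumptions, not the actual defaults changed by this PR:

```python
# Minimal sketch: serving a GPTQ-quantized model with vLLM's offline API.
# The model ID below is a hypothetical placeholder, not the default set by this PR.
from vllm import LLM, SamplingParams

llm = LLM(
    model="TheBloke/example-7B-GPTQ",  # hypothetical GPTQ repo on Hugging Face
    quantization="gptq",               # tell vLLM the weights are GPTQ-quantized
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["What does GPTQ quantization trade off?"], params)
print(outputs[0].outputs[0].text)
```

GPTQ stores weights at reduced precision (typically 4-bit), so a quantized default lets the container run the same model family with a substantially smaller GPU memory footprint.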

YrrepNoj requested a review from a team as a code owner on April 10, 2024 at 19:07.
netlify bot commented on Apr 10, 2024:

Deploy Preview for leapfrogai-docs canceled.

* Latest commit: 3e85b92
* Latest deploy log: https://app.netlify.com/sites/leapfrogai-docs/deploys/6616ed310bf7eb000828ed81

gerred merged commit dc1029d into main on Apr 10, 2024 (7 checks passed) and deleted the 357-update-vllm-to-gptq-quantized-model branch later that day at 22:19.
andrewrisse pushed commits referencing this pull request on Apr 17 and Apr 18, 2024, each carrying the same messages:

* chore: update vllm to use gptq quantized model

* bug: fix catch-all wildcard for e2e workflow

Signed-off-by: Andrew Risse <andrewrisse@gmail.com>