feat: support running multiple engines at the same time #891

vansangpfiev · 2024-07-18T04:11:07Z

Describe Your Changes

Currently, only one engine can run at a time. Because cortex.trt-llm and cortex.onnx do not support embeddings yet, we need to run cortex.llamacpp for embeddings alongside cortex.trt-llm or cortex.onnx for chat completions.
This PR also includes a commit to refactor the codebase.
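
For readers who want a picture of the idea, here is a minimal sketch of keeping several engines loaded at once and routing each request type to an engine that supports it. It is not the code from this PR: `Engine`, `EngineRegistry`, the capability strings `"chat"` and `"embeddings"`, and the model names are all hypothetical and chosen only for illustration. Only the engine names come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Engine:
    """Hypothetical stand-in for a loaded engine (not the real cortex API)."""
    name: str
    capabilities: frozenset
    models: list = field(default_factory=list)


class EngineRegistry:
    """Keeps several engines loaded at once, keyed by engine name."""

    def __init__(self):
        self._engines = {}

    def load(self, name, capabilities):
        # Loading another engine adds to the registry instead of replacing
        # whichever engine was loaded before.
        if name not in self._engines:
            self._engines[name] = Engine(name, frozenset(capabilities))
        return self._engines[name]

    def unload(self, name):
        # Unloading an engine that is not loaded is a no-op.
        self._engines.pop(name, None)

    def engine_for(self, capability):
        # Simplified routing: the first loaded engine (in load order) that
        # supports the capability wins.
        for engine in self._engines.values():
            if capability in engine.capabilities:
                return engine
        raise LookupError(f"no loaded engine supports {capability!r}")

    def list_models(self):
        # Collect models from every loaded engine, not just one.
        models = []
        for engine in self._engines.values():
            models.extend(engine.models)
        return models


registry = EngineRegistry()
# Assumed capability sets, for illustration only.
chat_engine = registry.load("cortex.trt-llm", {"chat"})
embed_engine = registry.load("cortex.llamacpp", {"chat", "embeddings"})
chat_engine.models.append("hypothetical-chat-model")
embed_engine.models.append("hypothetical-embedding-model")

print(registry.engine_for("chat").name)
print(registry.engine_for("embeddings").name)
print(registry.list_models())
```

In this sketch, chat requests go to cortex.trt-llm because it was loaded first, while embedding requests fall through to cortex.llamacpp. A real implementation would more likely route by the model named in each request than by load order.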

louis-jan

LGTM

vansangpfiev added 3 commits July 18, 2024 10:35

feat: support running multiple engines at the same time

eab798a

refactor: rename functions

6de90a9

fix: append models to list

26a4768

vansangpfiev requested a review from louis-jan July 18, 2024 06:54

vansangpfiev marked this pull request as ready for review July 18, 2024 06:54

louis-jan approved these changes Jul 18, 2024

vansangpfiev merged commit fd7c40d into dev Jul 18, 2024

vansangpfiev deleted the feat/multi-engines branch July 18, 2024 06:59
