epic: Refactor Inference Engines #2451

Closed
15 of 28 tasks
louis-jan opened this issue Mar 21, 2024 · 4 comments
Labels: type: epic (A major feature or initiative)

Comments

louis-jan commented Mar 21, 2024

Motivation

Currently, integrating a new inference provider involves significant code duplication: each provider re-implements the SSE helper functions, the engine settings file I/O, and the event handling for loading models, unloading models, and running inference.
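As an illustration of the kind of SSE helper that could live in one shared place, here is a minimal sketch. The function name `parseSseEvents` and its return shape are hypothetical and not taken from the codebase; it only handles `data:` fields and assumes events are separated by a blank line.

```typescript
// Hypothetical shared helper: splits a buffered SSE text stream into complete
// event payloads and returns the unfinished tail so the caller can prepend it
// to the next chunk.
function parseSseEvents(buffer: string): { events: string[]; rest: string } {
  // Normalize CRLF so events are always separated by "\n\n".
  const normalized = buffer.replace(/\r\n/g, "\n")
  const blocks = normalized.split("\n\n")
  // The last block has not been terminated by a blank line yet.
  const rest = blocks.pop() ?? ""
  const events: string[] = []
  for (const block of blocks) {
    const data = block
      .split("\n")
      .filter((line) => line.startsWith("data:"))
      // Drop the "data:" prefix and one optional leading space.
      .map((line) => line.slice(5).replace(/^ /, ""))
      .join("\n")
    if (data.length > 0) events.push(data)
  }
  return { events, rest }
}
```

Each engine would then feed network chunks through the same function instead of carrying its own copy of the parsing logic.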

Specs

  • Introduce a structured inheritance model: each engine inherits from a base inference engine class and dynamically registers its engine-specific settings, which eliminates the redundant code.
  • Encourage each inference engine to manage its own models, rather than having the application keep all models under a single directory (/models). This makes models and engines easier to maintain and migrate.
  • Replace hardcoded API key inputs with settings registered at the extension level, improving flexibility and security.
  • Remove the numerous if-else statements and engine name comparisons in the frontend, which make it hard to scale. The refactor aims to make new engines easier to support.
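
The points above could be sketched roughly as follows. This is only an illustration under stated assumptions: every name here (`BaseInferenceEngine`, `SettingDefinition`, `EngineRegistry`, `ExampleRemoteEngine`, the `api-key` setting, and the per-engine directory layout) is hypothetical and not the project's actual API.

```typescript
// Hypothetical shape of a setting that an engine registers for itself.
interface SettingDefinition {
  key: string
  title: string
  defaultValue: string
}

// Hypothetical base class holding the logic every engine would otherwise duplicate.
abstract class BaseInferenceEngine {
  abstract readonly provider: string

  private definitions: SettingDefinition[] = []
  private values = new Map<string, string>()

  // Engines declare their own settings instead of the app hardcoding inputs.
  protected registerSettings(definitions: SettingDefinition[]): void {
    for (const def of definitions) {
      this.definitions.push(def)
      if (!this.values.has(def.key)) this.values.set(def.key, def.defaultValue)
    }
  }

  getSetting(key: string): string | undefined {
    return this.values.get(key)
  }

  updateSetting(key: string, value: string): void {
    if (!this.definitions.some((d) => d.key === key)) {
      throw new Error(`Unknown setting: ${key}`)
    }
    this.values.set(key, value)
  }

  // Assumed layout: each engine owns a directory instead of a shared /models.
  modelsDir(): string {
    return `engines/${this.provider}/models`
  }

  abstract loadModel(modelId: string): Promise<void>
  abstract unloadModel(modelId: string): Promise<void>
}

// Lookup by provider name replaces if-else chains on engine names.
class EngineRegistry {
  private engines = new Map<string, BaseInferenceEngine>()

  register(engine: BaseInferenceEngine): void {
    this.engines.set(engine.provider, engine)
  }

  get(provider: string): BaseInferenceEngine | undefined {
    return this.engines.get(provider)
  }
}

// A made-up remote engine showing how little a subclass needs to add.
class ExampleRemoteEngine extends BaseInferenceEngine {
  readonly provider = "example-remote"
  private loaded = new Set<string>()

  constructor() {
    super()
    this.registerSettings([{ key: "api-key", title: "API Key", defaultValue: "" }])
  }

  async loadModel(modelId: string): Promise<void> {
    this.loaded.add(modelId)
  }

  async unloadModel(modelId: string): Promise<void> {
    this.loaded.delete(modelId)
  }
}
```

With a registry like this, the frontend would ask for `registry.get(name)` and call the shared interface, so supporting a new engine means adding one subclass rather than another branch.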

Tasklist

Out of scope

  • Customizing the UI at the extension level

louis-jan added the type: epic label Mar 21, 2024
louis-jan self-assigned this Mar 21, 2024

louis-jan commented

cc @namchuai @metaspartan

hiro-v commented Mar 22, 2024

louis-jan commented

Since we scoped down the epic and created corresponding sub-tasks for the out-of-scope items, let's close this. cc @Van-QA

dmatora commented Apr 14, 2024

No hope for Gemini support anytime soon?
I'm using Gemini 1.5 Pro via Vertex AI and getting results way above what GPT-4 can do. I would really love to have a proper chat interface.
