
idea: Production Level Queue System #580

Open
dan-homebrew opened this issue Nov 28, 2023 · 4 comments
Labels
category: app shell (Installer, updaters, distributions) · P2: enhancement (low impact on functionality)

Comments

@dan-homebrew
Contributor

dan-homebrew commented Nov 28, 2023

Objective

  • Do we need a queue system that scales to thousands of requests?

Motivation

Null Pointer Errors?

  • Currently, inference requests are handled FIFO
  • We are adopting the OpenAI API format, which means we will receive requests across Chat, Audio, Vision, etc.
  • Given that users are on laptops with limited RAM and VRAM, we will likely have to switch models between requests (see the sketch below)
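
As a rough illustration, serialized FIFO handling with model switching might look like the following sketch. All names here (`loadModel`, `unloadModel`, `runInference`) are hypothetical placeholders standing in for the engine bindings, not actual Jan or Cortex APIs:

```typescript
// Hypothetical sketch: serialize inference requests FIFO, loading a
// different model only when the next request needs one.
type InferenceRequest = {
  model: string; // chat, audio, or vision model id
  prompt: string;
  resolve: (output: string) => void;
};

// Placeholder stubs for the real engine bindings.
async function loadModel(model: string): Promise<void> {/* engine call */}
async function unloadModel(model: string): Promise<void> {/* engine call */}
async function runInference(model: string, prompt: string): Promise<string> {
  return `response from ${model}`; // placeholder
}

const queue: InferenceRequest[] = [];
let activeModel: string | null = null;
let draining = false;

async function drain(): Promise<void> {
  if (draining) return; // a single drain loop enforces FIFO order
  draining = true;
  while (queue.length > 0) {
    const req = queue.shift()!;
    if (activeModel !== req.model) {
      // Limited RAM/VRAM: unload the current model before loading the next.
      if (activeModel !== null) await unloadModel(activeModel);
      await loadModel(req.model);
      activeModel = req.model;
    }
    req.resolve(await runInference(req.model, req.prompt));
  }
  draining = false;
}

function enqueue(model: string, prompt: string): Promise<string> {
  return new Promise((resolve) => {
    queue.push({ model, prompt, resolve });
    void drain();
  });
}
```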

Preparing for Cloud Native

  • Our long-term future is likely as an enterprise OpenAI alternative, which will be multi-user and need a queue system
  • Should we bake in this abstraction now, starting with a local file-based queue that is later swapped out for a more sophisticated one? (See the sketch below.)
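
One way that abstraction could look (a sketch only; `JobQueue` and `FileJobQueue` are hypothetical names, not existing code): callers depend on an interface, and the file-backed implementation can later be replaced by Redis, SQS, or similar without touching call sites.

```typescript
import { promises as fs } from "fs";

// Hypothetical abstraction: callers depend only on JobQueue, so the
// file-based implementation can later be swapped for a hosted queue.
interface JobQueue<T> {
  enqueue(job: T): Promise<void>;
  dequeue(): Promise<T | undefined>;
}

// Naive single-process, file-backed queue. Fine for a local desktop app;
// a multi-user deployment would need locking and durability guarantees.
class FileJobQueue<T> implements JobQueue<T> {
  constructor(private path: string) {}

  private async read(): Promise<T[]> {
    try {
      return JSON.parse(await fs.readFile(this.path, "utf8"));
    } catch {
      return []; // missing or empty file means an empty queue
    }
  }

  async enqueue(job: T): Promise<void> {
    const jobs = await this.read();
    jobs.push(job);
    await fs.writeFile(this.path, JSON.stringify(jobs));
  }

  async dequeue(): Promise<T | undefined> {
    const jobs = await this.read();
    const job = jobs.shift();
    await fs.writeFile(this.path, JSON.stringify(jobs));
    return job;
  }
}
```

The naive file version deliberately omits locking, visibility timeouts, and retries; the point of the interface is that a production swap would not change the callers.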
@dan-homebrew dan-homebrew added the type: epic (A major feature or initiative) label Nov 28, 2023
@dan-homebrew dan-homebrew changed the title epic: Queue System for Inference? epic: Queue System for Inference Dec 12, 2023
@dan-homebrew dan-homebrew changed the title epic: Queue System for Inference feat: Queue System for Inference Dec 12, 2023
@dan-homebrew dan-homebrew added this to the Jan Server milestone Dec 12, 2023
@dan-homebrew dan-homebrew changed the title feat: Queue System for Inference feat: Queue System for Inference? Dec 12, 2023
@dan-homebrew dan-homebrew removed the type: epic (A major feature or initiative) label Dec 12, 2023
@hiro-v hiro-v removed their assignment Mar 14, 2024
@Van-QA
Contributor

Van-QA commented Apr 16, 2024

Quoted from users in janhq/jan#2704:

Problem
When a generation is ongoing, entering a new prompt causes the generation to be interrupted. It would be nice if subsequent prompts to the same model were queued instead of resulting in interruptions.

Success Criteria
It would be nice if we could queue up a couple of prompts and then get back to the responses after a while.
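
A minimal sketch of that behavior (hypothetical names throughout; `generate` stands in for whatever the engine exposes): each new prompt for a model is chained onto the previous generation's promise, so it waits rather than interrupting.

```typescript
// Hypothetical sketch: per-model promise chains so a new prompt waits for
// the ongoing generation instead of interrupting it.
const tails = new Map<string, Promise<unknown>>();

function submitPrompt(
  model: string,
  prompt: string,
  generate: (model: string, prompt: string) => Promise<string>,
): Promise<string> {
  const prev = tails.get(model) ?? Promise.resolve();
  const next = prev.then(() => generate(model, prompt));
  // Swallow errors on the tail so one failed generation does not block the queue.
  tails.set(model, next.catch(() => undefined));
  return next;
}
```

Submitting several prompts back-to-back would then return promises that resolve in order, matching the success criteria above.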

@louis-jan
Contributor

Scoped for refactoring the Cortex Backend.

@Van-QA Van-QA transferred this issue from janhq/jan May 16, 2024
@0xSage 0xSage mentioned this issue Jul 1, 2024
@louis-jan
Contributor

Should be on cortex-cpp

@0xSage
Contributor

0xSage commented Sep 6, 2024

@vansangpfiev don't we already have a basic queue in place? If so, we can close this issue 🙏

Update: Nvm, modifying this issue to track a prod-level queue system long term. Out of scope for now.

@0xSage 0xSage changed the title feat: Queue System for Inference? epic: Queue System for Inference? Sep 6, 2024
@0xSage 0xSage changed the title epic: Queue System for Inference? epic: Production Level Queue System Sep 6, 2024
@0xSage 0xSage added the P2: enhancement (low impact on functionality) and category: app shell (Installer, updaters, distributions) labels and removed the type: question label Sep 6, 2024
@dan-homebrew dan-homebrew changed the title epic: Production Level Queue System idea: Production Level Queue System Sep 8, 2024
Projects
Status: Icebox
Development

No branches or pull requests

6 participants