workload-prediction

Here is 1 public repository matching this topic...

Sakura66 / sagesched

SageSched: Intelligent LLM Request Scheduler with Workload Prediction — QoS-aware dual-queue scheduling for black-box LLM APIs (OpenAI/Azure/Doubao/Gemini)

api-gateway scheduler load-balancer openai qos faiss fastapi workload-prediction llm llm-inference llm-proxy gittins-index

Updated May 18, 2026
Python

