The Intelligent Inference Scheduler for Large-scale Inference Services

English | 中文

About

AIGW is an intelligent inference scheduler for large-scale inference services. It provides intelligent routing, overload protection, and multi-tenant QoS through a global routing layer that is aware of backend load, KV cache state, and LoRA adapters. The goal is higher throughput, lower latency, and more efficient use of resources.

Status

Early stage, under rapid development.

Architecture

Highlights

A flexible, powerful, and easy-to-maintain Envoy Golang extension
Near real-time load metric collection
A balanced multi-factor composite decision-making algorithm
A highly available architecture that supports horizontal scaling
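
To make the idea of a multi-factor composite decision concrete, here is a minimal sketch in Go. It is not AIGW's actual algorithm: the `Endpoint` and `Weights` types, the `Score` and `Pick` functions, and the weight values are all hypothetical, chosen only to show how load, KV cache affinity, and LoRA residency could be combined into one routing score.

```go
package main

import "fmt"

// Endpoint is a hypothetical snapshot of one inference backend.
// The field names are illustrative and do not come from AIGW.
type Endpoint struct {
	Name          string
	QueueDepth    int     // requests currently queued or running (load signal)
	MaxQueue      int     // capacity used to normalize the load signal
	PrefixHitRate float64 // estimated share of the prompt already in KV cache, 0..1
	LoRALoaded    bool    // whether the requested LoRA adapter is already resident
}

// Weights controls how much each factor contributes to the score.
// These are assumptions for the sketch, not tuned values.
type Weights struct {
	Load, Cache, LoRA float64
}

// Score returns a composite score for an endpoint; higher is better.
func Score(e Endpoint, w Weights) float64 {
	// Treat an endpoint with unknown capacity as fully loaded.
	load := 1.0
	if e.MaxQueue > 0 {
		load = float64(e.QueueDepth) / float64(e.MaxQueue)
		if load > 1 {
			load = 1
		}
	}
	s := w.Load*(1-load) + w.Cache*e.PrefixHitRate
	if e.LoRALoaded {
		s += w.LoRA
	}
	return s
}

// Pick returns the index of the highest-scoring endpoint, or -1 if eps is empty.
func Pick(eps []Endpoint, w Weights) int {
	best, bestScore := -1, 0.0
	for i, e := range eps {
		if s := Score(e, w); best == -1 || s > bestScore {
			best, bestScore = i, s
		}
	}
	return best
}

func main() {
	eps := []Endpoint{
		{Name: "a", QueueDepth: 2, MaxQueue: 10, PrefixHitRate: 0.0, LoRALoaded: false},
		{Name: "b", QueueDepth: 6, MaxQueue: 10, PrefixHitRate: 0.9, LoRALoaded: true},
	}
	w := Weights{Load: 0.5, Cache: 0.3, LoRA: 0.2}
	if i := Pick(eps, w); i >= 0 {
		fmt.Println("selected endpoint:", eps[i].Name)
	}
}
```

With these example weights, the busier endpoint `b` still wins because its cache and adapter affinity outweigh its extra load. Shifting weight toward `Load` would reverse that choice, which is the trade-off a balanced composite score has to manage.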

Developer Guide

Community

AIGW is built on Envoy and Istio. We sincerely thank both projects.

Roadmap

Precise cache-awareness
SLO-aware algorithm based on latency prediction
Prefill/decode (PD) disaggregation scheduling
Data-parallel (DP) level scheduling

License

This project is licensed under the Apache 2.0 License.

