Production Stack v0.1.0
Container image
ghcr.io/kaito-project/gpu-node-mocker:0.1.0
Helm charts
Add the chart repository (once):
helm repo add production-stack https://kaito-project.github.io/production-stack/charts/kaito-project
helm repo update production-stack
The following charts are published from this release (versions taken
from each chart's Chart.yaml at this tag):
- production-stack/gpu-node-mocker
- production-stack/modeldeployment
- production-stack/modelharness
See README.md for installation steps.
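As a rough sketch of what the install flow might look like once the repository has been added, the script below lists the versions the repository serves and then installs one of the charts. The release name, the namespace, and the `DRY_RUN` wrapper are illustrative assumptions, not values defined by this release; README.md remains the authoritative source for the actual steps and any required chart values.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical values: the release name and namespace are placeholders
# chosen for illustration, not names defined by this release.
RELEASE_NAME="${RELEASE_NAME:-gpu-node-mocker}"
NAMESPACE="${NAMESPACE:-production-stack}"
CHART="production-stack/gpu-node-mocker"

# Illustrative helper: with DRY_RUN=1 (the default here) each command is
# printed instead of executed, so the script can be reviewed safely first.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# List the charts and chart versions the repository currently serves.
run helm search repo production-stack --versions

# Install one chart. Without --version, Helm selects the latest stable
# version; pass --version explicitly to pin a specific one.
run helm install "$RELEASE_NAME" "$CHART" \
  --namespace "$NAMESPACE" --create-namespace
```

Running the script with `DRY_RUN=0` executes the commands against the current kubeconfig context. The same pattern should apply to `production-stack/modeldeployment` and `production-stack/modelharness` by changing `CHART`, assuming those charts need no mandatory values.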
What's Changed
- feat: add common ci and makefile snippets by @rambohe-ch in #2
- Create GPU mocker by @techworldhello in #1
- feat: add production stack arch contents by @rambohe-ch in #3
- Add GPU node mocker framework by @techworldhello in #4
- Fix rbac so that gpu node mocker can run by @techworldhello in #6
- feat: use llm-d-inference-sim as llm mocker by @rambohe-ch in #5
- feat: support verify boilerplate by @rambohe-ch in #8
- feat: add e2e tests framework by @rambohe-ch in #7
- fix: the error to extract model name from original pod by @rambohe-ch in #14
- E2E test setup for AKS cluster by @techworldhello in #16
- Add remaining repo setup by @techworldhello in #17
- Add local e2e setup by @techworldhello in #19
- feat: extract inferenceset part from e2e workflow to e2e cases by @rambohe-ch in #21
- feat: collect all versions of components by @rambohe-ch in #22
- Add gpu mocker tests by @techworldhello in #20
- feat: add keda and keda-kaito-scaler installation by @rambohe-ch in #23
- Add model routing e2e tests by @techworldhello in #26
- chore: bump keda-kaito-scaler to version v0.4.0 by @rambohe-ch in #28
- Add prefix cache aware routing tests by @techworldhello in #27
- fix: upgrade readme file by @rambohe-ch in #30
- feat: use CONTAINER_TOOL variable to support docker and podman by @tnsimon in #31
- fix: create shadow pods in model namespace instead of kaito-shadow by @tnsimon in #33
- fix: simplify e2e-up with username-based naming by @tnsimon in #34
- feat: add a modeldeployment helm chart by @rambohe-ch in #36
- feat: improve e2e tests by using modeldeployment helm chart by @rambohe-ch in #37
- feat: improve e2e tests by @rambohe-ch in #39
- feat: add e2e api key test by @tnsimon in #38
- feat: extract base setup workflow and improve install component in parallel by @rambohe-ch in #41
- feat: upgrade production-stack arch and readme by @rambohe-ch in #43
- feat: add benchmark workflow by @rambohe-ch in #42
- fix the names for gateway host and steps by @rambohe-ch in #44
- fix: update llm-d-inference-sim in arch by @rambohe-ch in #45
- feat: add network policy e2e by @tnsimon in #25
- feat: add modelharness helm chart by @rambohe-ch in #46
- feat: add scaling test cases by @rambohe-ch in #40
- feat: upgrade arch with modelharness and modeldeployment by @rambohe-ch in #47
- chore: bump llm-gateway-auth to 0.0.7-alpha by @tnsimon in #49
- feat: add release workflow for production stack by @rambohe-ch in #52
New Contributors
- @techworldhello made their first contribution in #1
- @tnsimon made their first contribution in #31
Full Changelog: https://github.com/kaito-project/production-stack/commits/v0.1.0