Releases: kaito-project/production-stack
v0.2.1
Production Stack v0.2.1
Container image
ghcr.io/kaito-project/gpu-node-mocker:0.2.1
Helm charts
Add the chart repository (once):
helm repo add production-stack https://kaito-project.github.io/production-stack/charts/kaito-project
helm repo update production-stack
The following charts are published from this release (versions taken
from each chart's Chart.yaml at this tag):
- production-stack/gpu-node-mocker
- production-stack/modeldeployment
- production-stack/modelharness
See README.md for installation steps.
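Once the repository is added, a chart can be pinned to this release's version at install time. The following is a minimal sketch rather than the project's documented procedure: it only prints the Helm command so it can be reviewed before running. The helper name `print_install_cmd`, the choice to reuse the chart name as the Helm release name, and the default namespace `production-stack-demo` are assumptions for illustration; any values a chart requires are covered in README.md, not here.

```shell
#!/bin/sh
# Sketch: print (not run) a Helm command that installs one chart pinned to a version.
# Assumptions (not from the release notes): the Helm release name reuses the
# chart name, and "production-stack-demo" is a placeholder namespace.

print_install_cmd() {
  chart="$1"                               # e.g. gpu-node-mocker
  version="$2"                             # e.g. 0.2.1
  namespace="${3:-production-stack-demo}"  # hypothetical default namespace
  echo "helm upgrade --install $chart production-stack/$chart --version $version --namespace $namespace --create-namespace"
}

print_install_cmd gpu-node-mocker 0.2.1
```

Printing the command instead of executing it keeps the sketch safe to run without a cluster; pipe the output to `sh` only after checking it against README.md.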
What's Changed
- feat(charts): add gateway error-mapping, ownership labels, and namespace management by @rambohe-ch in #99
- feat(charts): consolidate whole-path outage local_reply with component-first attribution by @rambohe-ch in #100
- release: bump charts to 0.2.1 by @rambohe-ch in #101
Full Changelog: v0.2.0...v0.2.1
v0.2.0
Production Stack v0.2.0
Container image
ghcr.io/kaito-project/gpu-node-mocker:0.2.0
Helm charts
Add the chart repository (once):
helm repo add production-stack https://kaito-project.github.io/production-stack/charts/kaito-project
helm repo update production-stack
The following charts are published from this release (versions taken
from each chart's Chart.yaml at this tag):
- production-stack/gpu-node-mocker
- production-stack/modeldeployment
- production-stack/modelharness
See README.md for installation steps.
What's Changed
- fix release step by @rambohe-ch in #54
- test(e2e): probe EPP pod for netpol deny assertions by @tnsimon in #53
- e2e: add provider switch (upstream/azure), bump keda-kaito-scaler to v0.5.1, instrument timings by @rambohe-ch in #50
- fix nightly e2e workflow by @rambohe-ch in #48
- chore: Surface error when docker or podman is not installed by @techworldhello in #56
- chore: add workflow to push helm charts to mcr by @t0rr3sp3dr0 in #63
- Replace model-not-found Service with Envoy direct_response EnvoyFilter by @rambohe-ch in #60
- feat: separate image preparation from cluster setup for accurate E2E timing by @rambohe-ch in #61
- feat: Garbage collector for GPU mocker by @techworldhello in #58
- feat: upgrade readme according to the latest helm chart by @rambohe-ch in #64
- Refactor E2E install around the productionstack umbrella chart by @rambohe-ch in #67
- Generate E2E coverage report by @techworldhello in #73
- Improve networkpolicy harness by @tnsimon in #57
- Pin BBR and keda-kaito-scaler versions in productionstack chart by @rambohe-ch in #69
- Make nightly env setup consistent to e2e by @techworldhello in #75
- test(e2e/netpol): dump canary pod sidecar state on enforcement precheck failure by @tnsimon in #77
- feat(productionstack): add llm-gateway-apikey as OCI Helm dependency by @tnsimon in #82
- Proposal: add End-to-End Error Handling Across Cluster, Modelharness, and Modeldeployment Levels by @rambohe-ch in #76
- fix label filters for report data by @techworldhello in #81
- fix(modelharness): scope NetworkPolicy selector to production-stack pods (#83) by @tnsimon in #84
- docs: recommend Cilium dataplane; update NetworkPolicy → CiliumNetworkPolicy references by @tnsimon in #85
- chore: bump llm-gateway-apikey to 0.0.10-alpha by @tnsimon in #93
- add highly-available requirements for llm-gateway-auth and bbr by @rambohe-ch in #86
- feat(bbr): harden body-based-router for HA and add e2e coverage (#89) by @rambohe-ch in #95
- chore: bump llm-gateway-apikey to 0.0.11-alpha; per-namespace ext_authz EnvoyFilter by @tnsimon in #97
- feat(charts/productionstack): fail-closed cluster filters + unified outage local_reply by @rambohe-ch in #94
- release: bump charts to 0.2.0 and publish productionstack by @rambohe-ch in #98
New Contributors
- @t0rr3sp3dr0 made their first contribution in #63
Full Changelog: v0.1.0...v0.2.0
v0.1.0
Production Stack v0.1.0
Container image
ghcr.io/kaito-project/gpu-node-mocker:0.1.0
Helm charts
Add the chart repository (once):
helm repo add production-stack https://kaito-project.github.io/production-stack/charts/kaito-project
helm repo update production-stack
The following charts are published from this release (versions taken
from each chart's Chart.yaml at this tag):
- production-stack/gpu-node-mocker
- production-stack/modeldeployment
- production-stack/modelharness
See README.md for installation steps.
What's Changed
- feat: add common ci and makefile snippets by @rambohe-ch in #2
- Create GPU mocker by @techworldhello in #1
- feat: add production stack arch contents by @rambohe-ch in #3
- Add GPU node mocker framework by @techworldhello in #4
- Fix rbac so that gpu node mocker can run by @techworldhello in #6
- feat: use llm-d-inference-sim as llm mocker by @rambohe-ch in #5
- feat: support verify boilerplate by @rambohe-ch in #8
- feat: add e2e tests framework by @rambohe-ch in #7
- fix: the error to extract model name from original pod by @rambohe-ch in #14
- E2E test setup for AKS cluster by @techworldhello in #16
- Add remaining repo setup by @techworldhello in #17
- Add local e2e setup by @techworldhello in #19
- feat: extract inferenceset part from e2e workflow to e2e cases by @rambohe-ch in #21
- feat: collect all versions of components by @rambohe-ch in #22
- Add gpu mocker tests by @techworldhello in #20
- feat: add keda and keda-kaito-scaler installation by @rambohe-ch in #23
- Add model routing e2e tests by @techworldhello in #26
- chore: bump keda-kaito-scaler to version v0.4.0 by @rambohe-ch in #28
- Add prefix cache aware routing tests by @techworldhello in #27
- fix: upgrade readme file by @rambohe-ch in #30
- feat: use CONTAINER_TOOL variable to support docker and podman by @tnsimon in #31
- fix: create shadow pods in model namespace instead of kaito-shadow by @tnsimon in #33
- fix: simplify e2e-up with username-based naming by @tnsimon in #34
- feat: add a modeldeployment helm chart by @rambohe-ch in #36
- feat: improve e2e tests by using modeldeployment helm chart by @rambohe-ch in #37
- feat: improve e2e tests by @rambohe-ch in #39
- feat: add e2e api key test by @tnsimon in #38
- feat: extract base setup workflow and improve install component in parallel by @rambohe-ch in #41
- feat: upgrade production-stack arch and readme by @rambohe-ch in #43
- feat: add benchmark workflow by @rambohe-ch in #42
- fix the names for gateway host and steps by @rambohe-ch in #44
- fix: update llm-d-inference-sim in arch by @rambohe-ch in #45
- feat: add network policy e2e by @tnsimon in #25
- feat: add modelharness helm chart by @rambohe-ch in #46
- feat: add scaling test cases by @rambohe-ch in #40
- feat: upgrade arch with modelharness and modeldeployment by @rambohe-ch in #47
- chore: bump llm-gateway-auth to 0.0.7-alpha by @tnsimon in #49
- feat: add release workflow for production stack by @rambohe-ch in #52
New Contributors
- @techworldhello made their first contribution in #1
- @tnsimon made their first contribution in #31
Full Changelog: https://github.com/kaito-project/production-stack/commits/v0.1.0