v0.12.0
Changelog
New Features
- cbaba36: feat(bundler): add --dynamic flag for install-time values (#515) (#527) (@lockwobr)
- 1e550c7: feat(bundler): enable --attest and --data for argocd-helm (#573) (#627) (@lockwobr)
- 142c0d2: feat(ci): add aggregate merge-gate workflow (#651) (@mchmarny)
- 1caf260: feat(ci): add daily Slack issue status report (@mchmarny)
- ad682ef: feat(ci): add daily image vulnerability scan workflow (@mchmarny)
- 42cfd26: feat(ci): auto-assign issues based on area labels (#513) (@mchmarny)
- 9b09c94: feat(cli): add dynamic shell completion for flag values (#339) (#512) (@lockwobr)
- 1b25135: feat(cli): auto-hydrate RecipeMetadata overlays in validate and bundle (#595) (@njhensley)
- f2aeaf2: feat(evidence): add NIM support to evidence collection and restructure conformance docs (#479) (@yuanchen8911)
- 6137c0b: feat(evidence): split ai_service_metrics and fix imagePullPolicy for local images (#463) (@yuanchen8911)
- 4e158cf: feat(performance): add GB200 EKS support for NCCL all-reduce bandwidth check (#640) (@njhensley)
- 7f91140: feat(recipe): add NFD as standalone shared component (#518) (@ArangoGutierrez)
- f47e95f: feat(recipe): add mixin composition for OS and platform fragments (#501) (@yuanchen8911)
- 94fb041: feat(recipe): merge external validator catalog with embedded when provided through DataProvider (#588) (@njhensley)
- a66de21: feat(recipes): add NIM Operator recipe for CNCF AI Conformance (#478) (@yuanchen8911)
- 16d670d: feat(release): add pre-release support (#639) (@mchmarny)
- 306cb9b: feat(validator): add --node-selector and --toleration flags for validation workload scheduling (#444) (@atif1996)
- 1db88e4: feat(validator): add AICR_VALIDATOR_IMAGE_TAG env-var override (#666) (@yuanchen8911)
- 3a86364: feat(validator): add inference performance validation (#641) (@yuanchen8911)
- 6f7b4c1: feat: Add AKS UAT chainsaw tests for training and inference CUJs (#476) (@Jont828)
- 306b785: feat: GB200 EKS NET/NVLS NCCL validation and driver bump (#668) (@njhensley)
- ba20188: feat: add HardwareDetector interface and measurement keys for NFD integration (#482) (@ArangoGutierrez)
- 83c18bc: feat: add component contributor test harness (#508) (@ArangoGutierrez)
- 340452b: feat: add support for Akamai (#517) (@lalitadithya)
- 5a33265: feat: auto-install shell completions via install (#504) (@lockwobr)
- 81cf701: feat: implement NFD-based GPU hardware detection (#494) (@ArangoGutierrez)
- 7283e9c: feat: two-phase GPU collection with hardware detection support (#495) (@ArangoGutierrez)
- 02002ca: feat: wire NFDHardwareDetector into production snapshot pipeline (#502) (@ArangoGutierrez)
Bug Fixes
- 9d57dfb: fix(build): use FullCommit in goreleaser to match CI image tags (#658) (@mchmarny)
- 228e518: fix(bundler): add pre-flight finalizer check to undeploy.sh (#406) (#561) (@lockwobr)
- 64b8759: fix(bundler): allow Helm-style array indexing in --set paths (#643) (@yuanchen8911)
- c12c783: fix(bundler): fix undeploy template pre/post-flight checks (#602) (@yuanchen8911)
- 6f4ec0e: fix(bundler): harden filepath.Join with SafeJoin for path-traversal protection (#578) (@lockwobr)
- 966d775: fix(bundler): resolve ArgoCD RepoURL placeholder in child applications (#520) (@mchmarny)
- ca9d96c: fix(bundler): scope cleanup to bundle components and remove stale skyhook taints (#477) (@yuanchen8911)
- 50825cc: fix(bundler): skip helm commands for manifest-only components in README (@mchmarny)
- 8eac760: fix(ci): add --platform to aiperf-bench E2E docker build (#674) (@xdu31)
- 57e6b8d: fix(ci): add -mod=vendor to snapshot agent build (#534) (@yuanchen8911)
- 0a78409: fix(ci): add MDX safety check for non-self-closing img tags (#620) (@pdmack)
- 9e481d7: fix(ci): add diagnostic logging and multi-assignee support to issue triage (@mchmarny)
- db9f3ab: fix(ci): add failure diagnostics and fix Grafana resource starvation in Kind (#563) (@yuanchen8911)
- e2586f0: fix(ci): auto-label new issues by area and assign owners (#535) (@yuanchen8911)
- 7ad4f96: fix(ci): correct artifact action SHA pins in vuln scan workflow (@mchmarny)
- f5f7387: fix(ci): deduplicate conformance coverage in GPU CI (#577) (@yuanchen8911)
- 4a08c63: fix(ci): enable manual trigger for fern-docs-ci workflow (@mchmarny)
- d30e235: fix(ci): expand GPU test triggers to cover collector, snapshotter, validator, and add run-gpu-tests label (#514) (@xdu31)
- f489db3: fix(ci): fix fern instances URL basepath and surface publish URL in step summary (#568) (@pdmack)
- 42877f6: fix(ci): fix fern preview metadata and add continuous staging publish (#546) (@pdmack)
- d821306: fix(ci): improve GPU test reliability and deploy timeout handling (#539) (@yuanchen8911)
- 4988346: fix(ci): install gke-gcloud-auth-plugin before cluster connect (@mchmarny)
- 7b0dbb1: fix(ci): make issue report counts clickable Slack links (@mchmarny)
- 39e3114: fix(ci): match artifact download pattern to upload names (@mchmarny)
- 1fb1695: fix(ci): move GPU concurrency to test jobs (#581) (@yuanchen8911)
- 945a57d: fix(ci): pin e2e goreleaser and exclude local build artifacts (#580) (@yuanchen8911)
- c334bfc: fix(ci): query GPU snapshot by subtype name instead of index (#509) (@yuanchen8911)
- c294788: fix(ci): remove invalid --base-image flag from ko build (@mchmarny)
- abad89a: fix(ci): replace middle-dot separators with commas in issue report (@mchmarny)
- d988b02: fix(ci): replace push path filters with runtime path gate in GPU workflows (#558) (@yuanchen8911)
- 352b006: fix(ci): safe manifest publishing (#586) (@njhensley)
- 40bb85d: fix(ci): set GKE cluster name at correct config path (@mchmarny)
- d61dbfa: fix(ci): set deployment.destroy as boolean, not string (@mchmarny)
- 9e44eb7: fix(ci): shorten GKE deployment ID to fit SA name limit (@mchmarny)
- 5785d81: fix(ci): skip capacity pre-check for shared GCP reservations (@mchmarny)
- a7c8bf6: fix(ci): surface fern generate errors in preview (#650) (@mchmarny)
- 0eaa16f: fix(ci): use --bare flag for ko build in vuln scan workflow (@mchmarny)
- 67f03f4: fix(ci): use KEY_CONTENT env var for GKE provisioner credentials (@mchmarny)
- 56f20f6: fix(ci): use anchored regex for lychee exclude-path patterns (#547) (@pdmack)
- d117fbd: fix(ci): use config-based destroy for GKE provisioner (@mchmarny)
- 984b244: fix(ci): use correct field name 'subtype' in GPU snapshot validation (#511) (@yuanchen8911)
- bb072b1: fix(ci): use explicit empty mapping for workflow_dispatch (@mchmarny)
- b2f20fa: fix(ci): use search API for first-time contributor detection (#524) (@yuanchen8911)
- 8015fa8: fix(cli): --no-cluster must not deploy the snapshot-capture agent (#604) (@yuanchen8911)
- 173dba5: fix(recipe): handle null override in mergeValues to delete keys (#458) (@Jont828)
- a228540: fix(recipe): scope mixin fallback to affected candidates (#521) (@yuanchen8911)
- 0b98847: fix(recipes): disable Dynamo ssh-keygen on Kind (#670) (@yuanchen8911)
- 08b2cb2: fix(recipes): fix NIM operator validation and demo script issues (#483) (@yuanchen8911)
- 7466275: fix(scan): add pillow and python CVEs to grype ignore list (@mchmarny)
- 97be223: fix(scan): revert aiperf-bench base image to python:3-slim (@mchmarny)
- 5788bca: fix(scan): revert aiperf-bench base image to python:3.12-slim (@mchmarny)
- 7c547ad: fix(scan): suppress all critical/high CVEs for python:3.12-slim (@mchmarny)
- 7c6a111: fix(scan): update aiperf-bench base image and suppress unfixed CVEs (@mchmarny)
- e7eeca4: fix(scan): update grype ignore list for python:3.12-slim CVEs (@mchmarny)
- 4ee0bfd: fix(scan): use correct 'config' param for anchore/scan-action (@mchmarny)
- bbb4e44: fix(scan): use nvidia distroless python dev image for aiperf-bench (@mchmarny)
- 7c91652: fix(tests): strengthen UAT recipe and validation assertions (#537) (@ayuskauskas)
- e9521fd: fix(tools): bump-rc now targets minor version, remove beta support (@mchmarny)
- 826640f: fix(tools): changelog bash 3.2 compat, skip RC tags, add changelog-file target (@mchmarny)
- 997589a: fix(validator): resolve dev-build images to SHA-tagged images (#655) (@mchmarny)
- 49d287f: fix(validator): restore local image pull policy and optimize GPU CI (#505) (@yuanchen8911)
- c225ce2: fix(validator): use DRA GA API for dra-support check (#523) (@yuanchen8911)
- 0e12dc2: fix: align issue management with OSS policy (#480) (@lockwobr)
- 110b8e3: fix: bump lodash-es to >=4.18.0 to address prototype pollution (CVE-2025-13465) (@mchmarny)
- a0bd741: fix: keep priority labels off PRs (#667) (@yuanchen8911)
- cb00ae1: fix: remove h100-cluster.sh test script accidentally committed (#457) (@Jont828)
Other Tasks
- fa73b5b: Add GB200 overlays for OKE (#497) (@OguzPastirmaci)
- 9a1b068: Correct spelling of 'ArgoCD' to 'Argo CD' in README (#569) (@terrytangyuan)
- 11e322f: Feat/skyhook to nodewright (#642) (@ayuskauskas)
- 5ffd094: Fix invalid relative links in welcome workflow comments (#488) (@kannon92)
- a0a1dc9: Remove image from s3c.md (@mchmarny)
- deffe9d: Update to correct spelling of Argo CD (#570) (@terrytangyuan)
- 6bd821d: add kueue components as an option (#490) (@kannon92)
- 52da1cc: chore(ci): remove auto-assignment of issues during triage (#653) (@mchmarny)
- 6f588f5: chore(gitignore): add /CLAUDE.local.md for Claude Code overlay (#606) (@yuanchen8911)
- 62cf8ad: chore(infra): disable CodeRabbit docstring check (#612) (@yuanchen8911)
- 7db4275: chore(recipes): drop NCCL perf checks from gb200-eks-inference (#678) (@njhensley)
- e4de881: chore: bump dompurify from 3.3.3 to 3.4.0 in /site in the npm_and_yarn group across 1 directory (#589) (@dependabot[bot])
- 7fc96ab: chore: change uat capacity (@mchmarny)
- 72aeb07: chore: configure CodeRabbit for all branches (#598) (@mchmarny)
- 4914594: chore: deps: bump Go and npm dependencies (@mchmarny)
- e64eb7a: chore: deps: bump actions/deploy-pages from 4.0.5 to 5.0.0 (#466) (@dependabot[bot])
- 11ba1f4: chore: deps: bump actions/github-script from 8.0.0 to 9.0.0 (#530) (@dependabot[bot])
- 157bee2: chore: deps: bump actions/setup-go from 6.3.0 to 6.4.0 (#471) (@dependabot[bot])
- 55801fb: chore: deps: bump actions/setup-node from 6.3.0 to 6.4.0 (#614) (@dependabot[bot])
- a78f0d8: chore: deps: bump actions/upload-artifact from 7.0.0 to 7.0.1 (#549) (@dependabot[bot])
- 807e065: chore: deps: bump actions/upload-pages-artifact from 4.0.0 to 5.0.0 (#552) (@dependabot[bot])
- 24e8b56: chore: deps: bump aws-actions/configure-aws-credentials from 6.0.0 to 6.1.0 (#500) (@dependabot[bot])
- 59ceed5: chore: deps: bump dependabot/fetch-metadata from 2.5.0 to 3.0.0 (#467) (@dependabot[bot])
- 2cbcf7b: chore: deps: bump dependabot/fetch-metadata from 3.0.0 to 3.1.0 (#615) (@dependabot[bot])
- 10d3b64: chore: deps: bump docker/build-push-action from 7.0.0 to 7.1.0 (#551) (@dependabot[bot])
- d14c597: chore: deps: bump dorny/paths-filter from 3.0.2 to 4.0.1 (#583) (@dependabot[bot])
- 0772b38: chore: deps: bump github.com/sigstore/protobuf-specs from 0.5.0 to 0.5.1 (#499) (@dependabot[bot])
- 30c9d60: chore: deps: bump github.com/sigstore/sigstore from 1.10.4 to 1.10.5 (#470) (@dependabot[bot])
- 31cc57e: chore: deps: bump github.com/sigstore/timestamp-authority/v2 from 2.0.5 to 2.0.6 (#562) (@dependabot[bot])
- f41b976: chore: deps: bump github.com/urfave/cli/v3 from 3.7.0 to 3.8.0 (#465) (@dependabot[bot])
- ad053e2: chore: deps: bump github/codeql-action from 4.33.0 to 4.35.1 (#472) (@dependabot[bot])
- cc1877f: chore: deps: bump github/codeql-action from 4.35.1 to 4.35.2 (#592) (@dependabot[bot])
- 91d7eb6: chore: deps: bump golang.org/x/text from 0.35.0 to 0.36.0 in the golang-x group across 1 directory (#529) (@dependabot[bot])
- c91bd22: chore: deps: bump google-github-actions/auth from 2.1.10 to 3.0.0 (#648) (@dependabot[bot])
- 894d3f5: chore: deps: bump google-github-actions/setup-gcloud from 2.1.4 to 3.0.1 (#647) (@dependabot[bot])
- 8443cd2: chore: deps: bump goreleaser/goreleaser-action from 7.0.0 to 7.1.0 (#616) (@dependabot[bot])
- b50149a: chore: deps: bump lycheeverse/lychee-action from 2.3.0 to 2.8.0 (#550) (@dependabot[bot])
- 77e7ad4: chore: deps: bump sigstore/cosign-installer from 4.1.0 to 4.1.1 (#468) (@dependabot[bot])
- c537177: chore: deps: bump the kubernetes group with 3 updates (#591) (@dependabot[bot])
- 556ff39: chore: deps: upgrade vendor dependencies and fix gosec G703 lint (@mchmarny)
- 03b82bd: chore: enable CodeRabbit config (#594) (@lalitadithya)
- cf558e4: chore: remove capacity check (@mchmarny)
- e5cc5b8: chore: standardize SPDX file headers (#635) (@pdmack)
- 550017b: chore: update actuator image (@mchmarny)
- 6e16d87: chore: update cuda images (@mchmarny)
- fc20b35: chore: update dependencies (@mchmarny)
- 404b367: chore: update golang a patch release to 1.26.2 (#576) (@lockwobr)
- 1caf06b: chore: update uat config (@mchmarny)
- b63aa0a: ci(uat): add nightly GKE UAT workflow for GCP (#644) (@mchmarny)
- 8ebc0f8: ci: add njhensley to copy-pr-bot.yaml (#585) (@njhensley)
- db6b38a: ci: remove duplicate aicr build in aicr-build action (#633) (@njhensley)
- 355c0fe: ci: replace Trivy with Anchore scan-action for vulnerability scanning (#462) (@xdu31)
- 31eb1b1: ci: switch GPU smoke test runner from T4 to L40G (#637) (@dims)
- 5e6a3d5: docs(agents): add errorlint rule, lint gate, and local overlay (#590) (@yuanchen8911)
- cfcbc60: docs(conformance): add ai_service_metrics evidence for CNCF submission (#460) (@yuanchen8911)
- 17d42f9: enhance(validator): add targeted post-deployment GPU readiness checks (#611) (@yuanchen8911)
- f1212ad: feat(bundler/api): add --dynamic query parameter to bundle handler (#603) (@lockwobr)
- 00a2dbc: fix(docs/api): fix, clean up docs and add doc update audit rule (#625) (@yuanchen8911)
- 089ac95: fix(recipe bug): treat empty string as "any" in Criteria.Specificity() (#492) (@yuanchen8911)
- f032582: fix(bundler): v0.12 rc1 smoke-test follow-ups (#677) (@lockwobr)
- 2d2c3cf: refactor(ci): reorder and rename GPU workflow steps (#548) (@yuanchen8911)
- 1cbe5d9: refactor(ci): unify GPU Chainsaw layout and validation flow (#587) (@yuanchen8911)
- 066a7ee: refactor(ci): unify GPU training and inference workflows (#579) (@yuanchen8911)
- b82af6f: refactor(deployer): unify deployer interface across helm, argocd, argocd-helm (#571) (#575) (@lockwobr)
- abeea2b: refactor(recipe): add maximal leaf candidate selection to overlay resolver (#496) (@yuanchen8911)
- a370746: refactor(recipes): lift validation blocks from ubuntu leaves to intent overlays (#493) (@yuanchen8911)
- 9e356cc: refactor: replace magic literals with named constants (#593) (@mchmarny)
- ceffa8c: script(bundler): clarify deploy.sh completion != readiness (#609) (@yuanchen8911)
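Entry 089ac95 changes `Criteria.Specificity()` so that an empty string counts as "any". The real `Criteria` type is not shown in these notes, so the sketch below assumes a hypothetical `criteria` struct with four illustrative fields and a hypothetical `specificity` method; it only illustrates the idea that unset and explicitly wildcarded fields should score the same.

```go
package main

import "fmt"

// criteria is a hypothetical stand-in; the project's real Criteria type
// and its field names are assumptions made for this sketch.
type criteria struct {
	Service     string
	Accelerator string
	Intent      string
	OS          string
}

// anyValue is the assumed explicit wildcard value.
const anyValue = "any"

// isWildcard treats both the empty string and "any" as matching anything.
func isWildcard(v string) bool {
	return v == "" || v == anyValue
}

// specificity counts the fields that pin down a concrete value.
// Wildcard fields (empty or "any") do not add to the score.
func (c criteria) specificity() int {
	n := 0
	for _, v := range []string{c.Service, c.Accelerator, c.Intent, c.OS} {
		if !isWildcard(v) {
			n++
		}
	}
	return n
}

func main() {
	explicit := criteria{Service: "eks", Accelerator: "gb200", Intent: "any"}
	unset := criteria{Service: "eks", Accelerator: "gb200", Intent: ""}
	// Both print the same score because "" and "any" are equivalent here.
	fmt.Println(explicit.specificity(), unset.specificity())
}
```

With this rule, two overlays that differ only in whether a field is omitted or written as "any" rank equally when candidates are ordered by specificity.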