Problem
aicr validate --phase performance against the GKE H100 COS Dynamo
inference recipe is a no-op:
    [cli] phase requested but no checks defined in recipe; phase will be empty: phase=performance
    [cli] running validation phase: phase=performance catalog=4 selected=0
    [cli] phase completed: phase=performance status=skipped
The validator catalog ships 4 performance-phase validators, but the
recipe's overlay doesn't subscribe to any of them, so the phase
reports status=skipped. End-to-end perf signal is silently missing
for the GKE flow.
Root cause
recipes/overlays/h100-gke-cos-inference-dynamo.yaml has no
validation.performance block. The EKS counterpart
recipes/overlays/h100-eks-ubuntu-inference-dynamo.yaml has one
(added in #641, "feat(validator): add inference performance
validation"). PR #641 wired the new validator into the EKS overlay
but didn't extend it to GKE — a sibling-overlay omission, not an
intentional gap (git history shows no performance block ever
existed on the GKE overlay).
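The difference between the two overlays can be confirmed with a quick check. The sketch below is not part of aicr: `has_performance_block` is a hypothetical helper, and it assumes the block appears as a `performance:` key on its own line (at any indentation) and that the script runs from the repository root.

```shell
# Hypothetical helper: succeed when the overlay file declares a
# "performance:" key on its own line (any indentation, optional comment).
# A missing file is treated the same as a missing block.
has_performance_block() {
  grep -Eq '^[[:space:]]*performance:[[:space:]]*(#.*)?$' "$1" 2>/dev/null
}

# Overlay paths as named in this report; run from the repository root.
for overlay in \
  recipes/overlays/h100-eks-ubuntu-inference-dynamo.yaml \
  recipes/overlays/h100-gke-cos-inference-dynamo.yaml
do
  if has_performance_block "$overlay"; then
    echo "$overlay: performance block present"
  else
    echo "$overlay: performance block missing (or file not found)"
  fi
done
```

This is a text match, not a YAML parse, so it would also match a `performance:` key nested somewhere other than under `validation`.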
Reproduction
    aicr recipe --service gke --accelerator h100 --os cos \
        --intent inference --platform dynamo --output gke-rec.yaml
    aicr validate --recipe gke-rec.yaml --phase performance
Expected: at least one performance-phase check runs against the
deployed inference workload.
Actual: validators=0 passed=0 failed=0 status=skipped.
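Because the skipped phase is reported only in the log, a wrapper script can turn it into an explicit failure. The sketch below is a hypothetical helper, not an aicr feature: `phase_was_skipped` scans log text on stdin for the `phase completed` line quoted in this report, and the sample input is copied from that output.

```shell
# Hypothetical helper: succeed (exit 0) when the log on stdin reports
# that the named phase finished with status=skipped.
phase_was_skipped() {
  grep -Eq "phase completed: phase=$1 status=skipped"
}

# Sample log lines copied from the output quoted in this report.
sample_log='[cli] running validation phase: phase=performance catalog=4 selected=0
[cli] phase completed: phase=performance status=skipped'

if printf '%s\n' "$sample_log" | phase_was_skipped performance; then
  echo "performance phase was skipped"
fi
```

In practice the input would come from the real command, for example `aicr validate --recipe gke-rec.yaml --phase performance 2>&1 | phase_was_skipped performance`. Redirecting stderr is an assumption here, since the report does not say which stream the log lines are written to.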
Suggested fix
Add a performance block to
recipes/overlays/h100-gke-cos-inference-dynamo.yaml mirroring the
EKS overlay's stanza:
    performance:
      checks:
        - inference-perf
      constraints:
        - name: inference-throughput
          value: ">= 5000"  # tok/s; placeholder, tune for H100+COS
        - name: inference-ttft-p99
          value: "<= 200"   # ms p99; placeholder, tune for H100+COS
The EKS thresholds are noted in-source as "placeholder thresholds —
pending empirical tuning". Adopting them on GKE keeps the two flows
aligned and unblocks --phase performance. Threshold tuning for
COS / H100-mega-80gb (vs EKS Ubuntu / H100-80gb) is a follow-up.
Context
Surfaced while validating a freshly deployed vllm-agg.yaml workload
on GKE H100 COS: --phase deployment (4/4 pass) and
--phase conformance (11/11 pass) both ran; only --phase performance
was silently empty.
Related: #641 (where this should have been extended).