Description
Currently, CNCF AI Conformance evidence collection requires two separate steps:
- aicr validate --phase conformance --evidence-dir ./evidence (structural validation)
- ./docs/conformance/cncf/collect-evidence.sh all (behavioral tests)
The behavioral tests (DRA GPU allocation, gang scheduling co-scheduling, HPA autoscaling with gpu-burn, device isolation) deploy actual workloads and verify end-to-end functionality. These can't be covered by structural validation alone.
Current Status
The goal is to integrate the behavioral tests into the validate command. I'm keeping them separate for now because the evidence tests (deploying GPU workloads, running GPU-intensive workloads, waiting for HPA scaling and metric propagation) are not yet robust enough and could break or block aicr validate.
Once the tests are cleaned up and streamlined, we'll integrate them into aicr validate --phase conformance so a single command collects all evidence.
Current Workflow
Documented in docs/conformance/cncf/README.md:
# Step 1: Structural validation evidence
aicr validate -r recipe.yaml --phase conformance --evidence-dir ./evidence
# Step 2: Behavioral test evidence
./docs/conformance/cncf/collect-evidence.sh all
Proposal
Integrate behavioral tests into aicr validate --phase conformance so a single command collects all evidence:
aicr validate -r recipe.yaml \
--phase conformance \
--evidence-dir ./evidence \
--behavioral-tests # new flag to run workload-based tests
Considerations
- Behavioral tests are long-running (HPA test takes ~5 minutes for gpu-burn + metric propagation)
- They deploy and clean up workloads (test namespaces, pods, HPAs)
- They require GPU resources on the cluster
- Could be implemented as Go test binaries (like readiness/deployment checks) or by invoking the shell script
- A --behavioral-tests flag keeps them opt-in so aicr validate remains fast by default
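
Until a --behavioral-tests flag exists, the opt-in behavior could be prototyped with a thin wrapper around the two commands from the current workflow. The following is only a sketch. The function name collect_conformance_evidence, the AICR_BIN, COLLECT_SCRIPT and BEHAVIORAL_TIMEOUT variables, and the 30-minute limit are assumptions made for illustration and are not part of aicr. It also assumes GNU coreutils timeout is available. Only the two underlying commands are taken from this issue.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: structural validation always runs, behavioral tests are opt-in.
set -euo pipefail

# Commands are overridable (assumption) so the wrapper can be dry-run without a cluster.
AICR_BIN="${AICR_BIN:-aicr}"
COLLECT_SCRIPT="${COLLECT_SCRIPT:-./docs/conformance/cncf/collect-evidence.sh}"
# Assumed upper bound so a stuck workload cannot block the run forever.
BEHAVIORAL_TIMEOUT="${BEHAVIORAL_TIMEOUT:-30m}"

collect_conformance_evidence() {
  local recipe="$1" evidence_dir="$2" behavioral="${3:-false}"

  # Step 1: structural validation evidence (fast, always runs).
  "$AICR_BIN" validate -r "$recipe" --phase conformance --evidence-dir "$evidence_dir"

  # Step 2: behavioral test evidence (long-running, needs GPUs), only when requested.
  if [ "$behavioral" = "true" ]; then
    timeout "$BEHAVIORAL_TIMEOUT" "$COLLECT_SCRIPT" all
  fi
}

# Example (not executed here):
#   collect_conformance_evidence recipe.yaml ./evidence true
```

When the script is executed directly rather than sourced, set -e makes a failed structural validation stop it before any GPU workloads are deployed. The timeout keeps a hung HPA or gpu-burn test from blocking the caller indefinitely. Where collect-evidence.sh writes its output is not controlled by this sketch.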