feat: integrate behavioral conformance tests into aicr validate #192

@yuanchen8911

Description

Currently, CNCF AI Conformance evidence collection requires two separate steps:

  1. aicr validate -r recipe.yaml --phase conformance --evidence-dir ./evidence (structural validation)
  2. ./docs/conformance/cncf/collect-evidence.sh all (behavioral tests)

The behavioral tests (DRA GPU allocation, gang scheduling, HPA autoscaling with gpu-burn, device isolation) deploy real workloads and verify end-to-end functionality, which structural validation alone cannot cover.

Current Status

The goal is to integrate the behavioral tests into aicr validate. I'm keeping them separate for now because they deploy GPU workloads, run GPU-intensive jobs, and wait for HPA scaling and metric propagation, and they are not yet robust enough: a flaky or slow test could break or block aicr validate.

Once the tests are cleaned up and streamlined, we'll integrate them into aicr validate --phase conformance so a single command collects all evidence.

Current Workflow

Documented in docs/conformance/cncf/README.md:

# Step 1: Structural validation evidence
aicr validate -r recipe.yaml --phase conformance --evidence-dir ./evidence

# Step 2: Behavioral test evidence
./docs/conformance/cncf/collect-evidence.sh all
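
Until the integration lands, the two steps could be chained by a small wrapper so that the behavioral step only runs after structural validation succeeds. This is a minimal sketch, not something that exists in the repo: the function name `collect_all_evidence` and the variables `RECIPE`, `EVIDENCE_DIR`, and `COLLECT_SCRIPT` are hypothetical, and only the two commands themselves come from the README.

```shell
#!/bin/sh
# Hypothetical interim wrapper around the two documented steps.
# RECIPE, EVIDENCE_DIR, and COLLECT_SCRIPT are illustrative names, not aicr settings.
RECIPE="${RECIPE:-recipe.yaml}"
EVIDENCE_DIR="${EVIDENCE_DIR:-./evidence}"
COLLECT_SCRIPT="${COLLECT_SCRIPT:-./docs/conformance/cncf/collect-evidence.sh}"

collect_all_evidence() {
  # Step 1: structural validation evidence. Stop here if it fails.
  aicr validate -r "$RECIPE" --phase conformance --evidence-dir "$EVIDENCE_DIR" || return 1
  # Step 2: behavioral test evidence. Its exit status becomes the function's status.
  "$COLLECT_SCRIPT" all
}
```

Sourcing this file and calling `collect_all_evidence` runs both steps in order. The wrapper does nothing to make the behavioral tests more robust; it only saves typing the second command.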

Proposal

Integrate behavioral tests into aicr validate --phase conformance so a single command collects all evidence:

aicr validate -r recipe.yaml \
  --phase conformance \
  --evidence-dir ./evidence \
  --behavioral-tests   # new flag to run workload-based tests

Considerations

  • Behavioral tests are long-running (HPA test takes ~5 minutes for gpu-burn + metric propagation)
  • They deploy and clean up workloads (test namespaces, pods, HPAs)
  • They require GPU resources on the cluster
  • Could be implemented as Go test binaries (like readiness/deployment checks) or by invoking the shell script
  • A --behavioral-tests flag keeps them opt-in so aicr validate remains fast by default
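
To illustrate the opt-in and long-running points above, here is a rough sketch of how the behavioral step could be gated and time-bounded while it is still a shell script. Everything named here is an assumption for illustration: `BEHAVIORAL_TESTS`, `BEHAVIORAL_TIMEOUT`, `COLLECT_SCRIPT`, and `run_behavioral_tests` are hypothetical, the 900-second default is arbitrary, and `timeout` is the GNU coreutils utility (not available by default on macOS).

```shell
#!/bin/sh
# Hypothetical opt-in gate for the behavioral step; none of these names exist in aicr.
BEHAVIORAL_TESTS="${BEHAVIORAL_TESTS:-0}"          # off by default so validation stays fast
BEHAVIORAL_TIMEOUT="${BEHAVIORAL_TIMEOUT:-900}"    # seconds; an arbitrary illustrative bound
COLLECT_SCRIPT="${COLLECT_SCRIPT:-./docs/conformance/cncf/collect-evidence.sh}"

run_behavioral_tests() {
  if [ "$BEHAVIORAL_TESTS" != "1" ]; then
    echo "behavioral tests skipped (set BEHAVIORAL_TESTS=1 to enable)"
    return 0
  fi
  # GNU timeout exits with status 124 if the command is still running at the limit,
  # so a stuck HPA or metric-propagation wait cannot block the caller indefinitely.
  timeout "$BEHAVIORAL_TIMEOUT" "$COLLECT_SCRIPT" all
}
```

A Go implementation behind --behavioral-tests would presumably want the same two properties: skipped unless requested, and bounded by a deadline. Note that `timeout` only stops the script; any test namespaces, pods, or HPAs left behind would still need separate cleanup.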
