Proposal
Would MLCube be open to an optional run audit manifest for MLCube executions?
MLCube already focuses on portability and reproducibility. A small sidecar manifest could make benchmark runs easier to review, compare, cite, and publish safely without changing the core MLCube task interface.
Suggested manifest shape
{
  "schema_version": "mlcube.run_audit.v1",
  "mlcube_task": "run",
  "runner": "docker",
  "image": "...",
  "mlcube_config_hash": "...",
  "benchmark": "...",
  "dataset_refs": [
    {
      "source_id": "...",
      "kind": "dataset",
      "provenance": "...",
      "redaction_status": "safe_for_public_log"
    }
  ],
  "result_paths": ["..."],
  "provenance": {
    "repo": "...",
    "commit": "...",
    "created_at": "..."
  },
  "claim_status": "diagnostic",
  "redaction_status": "safe_for_public_log"
}
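To make the shape concrete, here is a minimal sketch of how a sidecar like this could be produced after a run. Everything here is illustrative: write_run_audit, the results/ layout, and the run_audit.json filename are assumptions for the sketch, not existing MLCube APIs.

import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

def config_sha256(config_path: Path) -> str:
    """Hash the resolved MLCube config so the manifest pins the exact inputs."""
    return hashlib.sha256(config_path.read_bytes()).hexdigest()

def write_run_audit(workspace: Path, config_path: Path, image: str) -> Path:
    """Write a run_audit.json sidecar next to the run outputs.

    Field values are illustrative; only schema_version is fixed by the proposal.
    """
    manifest = {
        "schema_version": "mlcube.run_audit.v1",
        "mlcube_task": "run",
        "runner": "docker",
        "image": image,
        "mlcube_config_hash": config_sha256(config_path),
        "result_paths": [str(p) for p in sorted(workspace.glob("results/*"))],
        "provenance": {"created_at": datetime.now(timezone.utc).isoformat()},
        "claim_status": "diagnostic",
        "redaction_status": "safe_for_public_log",
    }
    out = workspace / "run_audit.json"
    out.write_text(json.dumps(manifest, indent=2))
    return out

Because the manifest is a plain JSON file written next to the results, no runner changes are required to consume it; any downstream tool can read it with a standard JSON parser.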
Why this may help
- makes it clearer which MLCube/config/image produced a benchmark result
- preserves provenance such as runner, image, config hash, repo commit, dataset refs, and result paths
- separates diagnostic/internal runs from results intended for public reports or model cards
- gives downstream benchmark users a standard place for audit-safe metadata
- avoids storing raw secrets, private paths, tokens, or full sensitive arguments in public logs (see the redaction sketch after this list)
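On the last point, a producer could gate the redaction_status value on a simple deny-list check before writing the manifest. The is_safe_for_public_log helper and the patterns below are illustrative assumptions only; a real deployment would tune them to its own threat model.

import re

# Hypothetical deny-list of value shapes that should never reach a public log.
SECRET_PATTERNS = [
    re.compile(r"(?i)(token|secret|password|api[_-]?key)\s*[=:]"),
    re.compile(r"^(/home/|/Users/)"),  # user-specific absolute paths
]

def is_safe_for_public_log(value: str) -> bool:
    """Return False if a manifest string looks like a secret or a private path.

    Manifests containing such values would keep redaction_status="redacted"
    instead of "safe_for_public_log".
    """
    return not any(p.search(value) for p in SECRET_PATTERNS)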
Scope I would keep small
If maintainers think this is useful, I can prepare a follow-up PR that:
- documents an optional manifest schema
- adds a minimal example manifest under docs/examples or docs/getting-started
- keeps the manifest opt-in and backward compatible
- does not change existing runner behavior by default
- does not add external service dependencies
This is motivated by work in the AANA project around audit-safe AI evaluation artifacts, but the contribution would be generic to MLCube and would not require AANA as a dependency.