Evidence gate: evidence-leadership-0009-002
Current benchmark status: not ready.
Top blockers from .agent-factory/benchmark-gate-report.json:
- local_model_quality:actor_policy:real_local_model_visible_fact_grounding_probe_failed
- local_model_quality:target_hardware:target_hardware_not_m4_profile
Acceptance posture:
- Run the structured-output, hidden-truth, actor-policy, and target-hardware (M4 Pro or M4 Max) local model benchmarks.
- Keep hidden-truth leakage and actor-policy probes psychometrically explicit.
- Do not enable local dialogue in station runtime until the model quality gate clears.
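The gating rule above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual gate logic: it assumes each blocker identifier has the form `<gate>:<area>:<reason>` (inferred from the two blockers listed, not from a documented schema), and the names `Blocker`, `parse_blocker`, and `local_dialogue_allowed` are hypothetical. It does not read .agent-factory/benchmark-gate-report.json, whose structure is not shown here; the blocker strings are copied in by hand.

```python
from typing import NamedTuple


class Blocker(NamedTuple):
    """One blocker, split into its three colon-separated parts (assumed layout)."""
    gate: str
    area: str
    reason: str


def parse_blocker(raw: str) -> Blocker:
    # Assumption: identifiers look like "<gate>:<area>:<reason>".
    # maxsplit=2 keeps any further colons inside the reason field.
    gate, area, reason = raw.strip().split(":", 2)
    return Blocker(gate, area, reason)


def local_dialogue_allowed(blockers: list[Blocker],
                           gate: str = "local_model_quality") -> bool:
    # Local dialogue stays disabled while any blocker belongs to the gate.
    return not any(b.gate == gate for b in blockers)


# Blocker strings copied from the status note above.
RAW_BLOCKERS = [
    "local_model_quality:actor_policy:real_local_model_visible_fact_grounding_probe_failed",
    "local_model_quality:target_hardware:target_hardware_not_m4_profile",
]

blockers = [parse_blocker(raw) for raw in RAW_BLOCKERS]
allowed = local_dialogue_allowed(blockers)

for b in blockers:
    print(f"{b.gate} / {b.area}: {b.reason}")
print("local dialogue allowed:", allowed)
```

With the two blockers currently listed, `allowed` is `False`, which matches the "not ready" status; it would only become `True` once no `local_model_quality` blockers remain.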