Measurement trust + trace-to-training backend — eval_trust audit toolkit and rollout→SFT/DPO/RL data factory for WasmAgent compliance training
benchmark paper audit rl compliance reproducibility training-data sft dpo model-merging llm-evaluation mcnemar-test wasmagent
Updated Jul 3, 2026 - Python