Releases: ostinatocc/MGBench
MGBench v0.1.1
MGBench v0.1.1 is archived on Zenodo with DOI: https://doi.org/10.5281/zenodo.20793097
MGBench v0.1.1 freezes the first public memory-governance benchmark release.
Highlights:
- 608 frozen deterministic scenarios across 8 governance suites.
- Deterministic scoring with no LLM judge dependency.
- Reference reports for Aionis, raw memory, no memory, Mem0, Supermemory, Graphiti, and Tencent Agent Memory where reports are available.
- Public adapter and benchmark contracts for external memory-system evaluations.
Current public reports:
- reports/mgbench-v0.1-current.md
- reports/mgbench-v0.1-current.json
Interpretation boundary:
- MGBench measures memory governance, not patch success.
- Filtered competitor modes are labeled external_host when lifecycle/filter knowledge is supplied by the caller.
- Raw per-suite metrics are the primary evidence; the aggregate MGBench score is a compact ranking for the current manifest.
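Because the raw per-suite metrics are the primary evidence, it can help to inspect the JSON report directly instead of relying on the aggregate score. The sketch below is one possible way to do that. It assumes nothing about the report schema, which is not described here: it walks whatever JSON it is given and lists every numeric leaf with its path. The helper names `numeric_leaves` and `load_report_metrics` are hypothetical and are not part of MGBench.

```python
import json
from typing import Any, Iterator


def numeric_leaves(node: Any, path: str = "") -> Iterator[tuple[str, float]]:
    """Yield (path, value) for every numeric leaf in parsed JSON."""
    if isinstance(node, bool):
        return  # bool is a subclass of int; skip true/false flags
    if isinstance(node, (int, float)):
        yield path, float(node)
    elif isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else str(key)
            yield from numeric_leaves(value, child)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from numeric_leaves(value, f"{path}[{index}]")


def load_report_metrics(report_path: str) -> dict[str, float]:
    """Read a JSON report and return its numeric leaves keyed by path."""
    with open(report_path, encoding="utf-8") as handle:
        return dict(numeric_leaves(json.load(handle)))


if __name__ == "__main__":
    # Path taken from the release notes; run from a checkout of the repository.
    metrics = load_report_metrics("reports/mgbench-v0.1-current.json")
    for metric_path, value in sorted(metrics.items()):
        print(f"{metric_path}: {value}")
```

Listing every numeric field keeps the per-suite numbers visible next to any aggregate, so a ranking can be checked against the values it summarizes.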
MGBench v0.1.0
MGBench v0.1.0 freezes the first public memory-governance benchmark release.
Highlights:
- 608 frozen deterministic scenarios across 8 governance suites.
- Deterministic scoring with no LLM judge dependency.
- Reference reports for Aionis, raw memory, no memory, Mem0, Supermemory, Graphiti, and Tencent Agent Memory where reports are available.
- Public adapter and benchmark contracts for external memory-system evaluations.
Current public reports:
- reports/mgbench-v0.1-current.md
- reports/mgbench-v0.1-current.json
Interpretation boundary:
- MGBench measures memory governance, not patch success.
- Filtered competitor modes are labeled external_host when lifecycle/filter knowledge is supplied by the caller.
- Raw per-suite metrics are the primary evidence; the aggregate MGBench score is a compact ranking for the current manifest.