Make cross-modal benchmark scoring explicitly memory-level by brianmeyer · Pull Request #24 · brianmeyer/recallforge

brianmeyer · 2026-03-27T20:52:33Z

Summary

split benchmark evaluation into explicit memory-level and raw asset-level scoring
keep the existing top-line benchmark metrics memory-centric for compatibility
add asset-level metrics to stage payloads and per-query results for clearer analysis
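
As a rough illustration of the split described above, the sketch below scores a single query both ways: once over the raw ranked assets, and once after collapsing those assets to their parent memories. It is not the code from this PR. The `Hit` type, its field names, `score_query`, and the metric keys `recall_at_k` and `asset_recall_at_k` are all hypothetical and chosen only for this example. Recall@k is used as a stand-in metric, and the top-line key is assumed to stay memory-level.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One retrieved asset; both field names are hypothetical."""
    asset_id: str
    memory_id: str


def dedupe_in_order(ids):
    """Keep the first occurrence of each id, preserving rank order."""
    seen = set()
    ordered = []
    for item in ids:
        if item not in seen:
            seen.add(item)
            ordered.append(item)
    return ordered


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of relevant ids that appear among the top-k ranked ids."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(ranked_ids[:k]) & relevant) / len(relevant)


def score_query(hits, relevant_assets, relevant_memories, k):
    """Score one query at memory level (top-line) and at raw asset level."""
    asset_ranking = dedupe_in_order(h.asset_id for h in hits)
    # Several assets can belong to one memory, so collapse before scoring.
    memory_ranking = dedupe_in_order(h.memory_id for h in hits)
    return {
        # Assumed top-line key, kept memory-centric.
        "recall_at_k": recall_at_k(memory_ranking, relevant_memories, k),
        # Assumed additional key for asset-level analysis.
        "asset_recall_at_k": recall_at_k(asset_ranking, relevant_assets, k),
    }


hits = [Hit("a1", "m1"), Hit("a2", "m1"), Hit("a3", "m2")]
scores = score_query(hits, {"a1", "a3"}, {"m1", "m2"}, k=2)
print(scores)
```

In this toy case the two levels disagree: both relevant memories are reached within the top two memory ranks, while only one of the two relevant assets is in the top two asset ranks. That kind of gap is what separate asset-level metrics make visible.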

Linked work

REC-172
advances REC-160

Verification

pytest -q tests/test_cross_modal_benchmark_defs.py -q
pytest -q tests/test_search_pipeline.py tests/test_config_tools.py tests/test_json_compliance.py -q
pytest -x -m 'not live' --tb=short

chatgpt-codex-connector · 2026-03-27T20:52:40Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

Add explicit memory-level benchmark scoring

0be196a

brianmeyer merged commit 88873e5 into master Mar 27, 2026
4 checks passed

brianmeyer deleted the codex/rec-172-memory-benchmark-scoring branch March 27, 2026 20:55
