Make cross-modal benchmark scoring explicitly memory-level #24

Merged
brianmeyer merged 1 commit into master from codex/rec-172-memory-benchmark-scoring on Mar 27, 2026
@brianmeyer
Owner

Summary

  • split benchmark evaluation into explicit memory-level and raw asset-level scoring
  • keep the existing top-line benchmark metrics memory-centric for compatibility
  • add asset-level metrics to stage payloads and per-query results for clearer analysis
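The split described above can be illustrated with a minimal sketch. The diff is not shown in this conversation, so every name here (`Hit`, `recall_at_k`, `score_query`, the `asset_level` key) is a hypothetical stand-in, and recall@k is used only as an example metric. The sketch assumes each retrieved asset belongs to exactly one memory, that memory-level scoring collapses a memory's assets into a single ranked entry, and that the top-level keys stay memory-centric while raw asset metrics are nested alongside them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One retrieved asset and the memory it belongs to (hypothetical shape)."""
    asset_id: str
    memory_id: str


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of relevant ids that appear in the first k ranked ids."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(ranked_ids[:k]) & relevant) / len(relevant)


def dedupe_preserving_order(ids):
    """Collapse repeats, keeping the first (best-ranked) occurrence of each id."""
    return list(dict.fromkeys(ids))


def score_query(hits, relevant_memories, relevant_assets, k=5):
    """Score one query: memory-level metric on top, asset-level metric nested."""
    asset_ranking = [h.asset_id for h in hits]
    # Several assets of one memory count as a single memory-level result.
    memory_ranking = dedupe_preserving_order(h.memory_id for h in hits)
    return {
        # Top-line metric stays memory-centric (assumed compatibility contract).
        "recall_at_k": recall_at_k(memory_ranking, relevant_memories, k),
        # Raw asset-level metric reported separately for analysis.
        "asset_level": {
            "recall_at_k": recall_at_k(asset_ranking, relevant_assets, k),
        },
    }
```

Under these assumptions, a query whose top results are two assets from the same memory can score differently at each level, which is the kind of distinction the separate asset-level metrics are meant to surface.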

Linked work

  • REC-172
  • advances REC-160

Verification

  • pytest -q tests/test_cross_modal_benchmark_defs.py -q
  • pytest -q tests/test_search_pipeline.py tests/test_config_tools.py tests/test_json_compliance.py -q
  • pytest -x -m 'not live' --tb=short

@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

@brianmeyer merged commit 88873e5 into master on Mar 27, 2026
4 checks passed
@brianmeyer deleted the codex/rec-172-memory-benchmark-scoring branch on March 27, 2026 at 20:55