[codex] Add benchmark leakage audit assistant by Davifek · Pull Request #207 · SCIBASE-AI/SCIBASE.AI

Davifek · 2026-05-18T18:41:40Z

Summary

Adds a distinct benchmark-leakage-audit-assistant/ slice for the AI-Powered Research Assistant Suite bounty. This module focuses on ML/research evaluation hygiene before publication or release review.

It detects:

train/test overlap by record ID or normalized content fingerprint
benchmark contamination in the training corpus
final holdout or test-set use during model selection
weak split provenance such as missing seed or manifest hash
missing reproducibility packet artifacts
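The first check in the list above, train/test overlap by record ID or normalized content fingerprint, can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the `fingerprint` and `findOverlap` names, the record shape (`id`, `text`), and the normalization rules (NFKC, lowercasing, whitespace collapsing) are all assumptions made for the example.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical normalization: Unicode-normalize, lowercase, collapse whitespace,
// then hash so that trivially reformatted copies share one fingerprint.
function fingerprint(text) {
  const normalized = text.normalize('NFKC').toLowerCase().replace(/\s+/g, ' ').trim();
  return createHash('sha256').update(normalized, 'utf8').digest('hex');
}

// Report every test record that also appears in the training set,
// either under the same record ID or with the same content fingerprint.
function findOverlap(train, test) {
  const trainIds = new Set(train.map((r) => r.id));
  const trainPrints = new Map(train.map((r) => [fingerprint(r.text), r.id]));
  const findings = [];
  for (const record of test) {
    if (trainIds.has(record.id)) {
      findings.push({ testId: record.id, trainId: record.id, reason: 'record-id' });
      continue;
    }
    const match = trainPrints.get(fingerprint(record.text));
    if (match !== undefined) {
      findings.push({ testId: record.id, trainId: match, reason: 'content-fingerprint' });
    }
  }
  return findings;
}

// Synthetic sample data (assumed for illustration only).
const train = [
  { id: 'a1', text: 'The quick  brown fox' },
  { id: 'a2', text: 'Jumps over the lazy dog' },
];
const test = [
  { id: 'a2', text: 'Entirely different text' },
  { id: 't1', text: 'the QUICK brown fox ' },
  { id: 't2', text: 'No overlap here' },
];

console.log(findOverlap(train, test));
```

With this sample data, `a2` is flagged by record ID and `t1` is flagged by fingerprint, since its text differs from `a1` only in case and whitespace. Exact hashing of normalized text catches only near-verbatim copies; paraphrased contamination would need a fuzzier comparison.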

Demo

Included short demo artifact: benchmark-leakage-audit-assistant/demo.gif

Run locally:

node benchmark-leakage-audit-assistant/demo.js

Verification

Passed locally:

node benchmark-leakage-audit-assistant/test.js
node benchmark-leakage-audit-assistant/demo.js
git diff --cached --check

The tests cover a contaminated project that is blocked, a clean project that passes without false positives, and reviewer-ready evidence/remediation fields for every finding.

Notes

Dependency-free Node.js standard library implementation
Synthetic sample data only
No credentials, external services, or network access required
Includes a requirement map for AI-Powered Research Assistant Suite #16

feat: add benchmark leakage audit assistant

e52c6b1

algora-pbc Bot added the 🙋 Bounty claim label May 18, 2026

algora-pbc Bot mentioned this pull request May 18, 2026

AI-Powered Research Assistant Suite #16

Open

Davifek wants to merge 1 commit into SCIBASE-AI:main from Davifek:codex/benchmark-leakage-audit-16
