Presence-claim audit workflow + README/lead-paragraph scoring fixes by saagpatel · Pull Request #25 · saagpatel/GithubRepoAuditor

saagpatel · 2026-05-30T08:36:50Z

What this is

An external dynamic-workflow audit that independently re-checks the portfolio-truth snapshot's six presence claims against on-disk ground truth — and uses it to find and verify two scoring fixes in the auditor itself. Read-only; never mutates repos, the snapshot, or git.

The audit (read-only)

src/run_instructions_audit.py — deterministic pre-step: stratified pilot selection, evidence prep, live tool_today recompute, git-drift detection, and the bucket logic (TDD'd).
scripts/presence-claims-audit.workflow.js — the Workflow: fans out one Haiku verifier per repo (judging all 6 claims in one read, blind to the tool's answer) → deterministic per-(repo, claim) tally → one Sonnet synthesis.
scripts/run-instructions-audit.workflow.js — original single-claim version, superseded by presence-claims (kept as the simpler example).
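
The deterministic per-(repo, claim) tally described above could look roughly like the sketch below. This is not the code in this PR: the function name, the repo names, and the dictionary layout are all assumptions made for illustration. Only the claim names `stack` and `project_summary` come from the PR text.

```python
from collections import Counter

def tally_agreement(verifier, tool):
    """Count, per claim, how often the verifier verdict matches the tool's answer.

    Both arguments map (repo, claim) -> bool. This layout is hypothetical.
    """
    agree = Counter()
    total = Counter()
    for (repo, claim), verdict in verifier.items():
        total[claim] += 1
        if tool.get((repo, claim)) == verdict:
            agree[claim] += 1
    return {claim: (agree[claim], total[claim]) for claim in total}

# Illustrative data only; these repos and verdicts are made up.
verifier = {
    ("repo-a", "project_summary"): True,
    ("repo-a", "stack"): True,
    ("repo-b", "project_summary"): True,
    ("repo-b", "stack"): False,
}
tool = {
    ("repo-a", "project_summary"): True,
    ("repo-a", "stack"): True,
    ("repo-b", "project_summary"): False,
    ("repo-b", "stack"): False,
}
result = tally_agreement(verifier, tool)
```

Because the tally is a pure function of the two inputs, the verifier verdicts can be held constant while only the tool side is recomputed.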

The fixes (`analyze_project_context`), both verified by the audit

README fallback — presence claims now consider the top-level README.md, not only the primary context file. Wires the previously-dormant readme_text parameter; primary-file identity unchanged (surgical).
Lead-paragraph fallback — a project summary is detected as the prose under the # Title, not only under an ## Overview section.
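
A minimal sketch of the lead-paragraph idea, assuming the lead is whatever text sits between the `# Title` line and the first `##` heading. The helper names `lead_paragraph` and `is_nontrivial` are hypothetical stand-ins, not the PR's `_lead_paragraph_text` and `_is_nontrivial_text`, whose bodies are not shown here. The four-word threshold follows the figure quoted in the review comment on this PR.

```python
def lead_paragraph(text: str) -> str:
    """Return the text between the H1 title and the first deeper heading."""
    collected = []
    seen_title = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("##"):
            break  # the first section heading ends the lead
        if stripped.startswith("# ") and not seen_title:
            seen_title = True  # skip the title line itself
            continue
        if seen_title:
            collected.append(stripped)
    # A file with no H1 title yields an empty lead in this sketch.
    return " ".join(part for part in collected if part)

def is_nontrivial(text: str, min_words: int = 4) -> bool:
    """Assumed threshold: at least min_words whitespace-separated words."""
    return len(text.split()) >= min_words

with_lead = "# Tool\n\nAudits repositories for context quality.\n\n## Install\n\npip install tool\n"
no_lead = "# Tool\n\n## Install\n\npip install tool\n"

has_summary = is_nontrivial(lead_paragraph(with_lead))
has_no_summary = is_nontrivial(lead_paragraph(no_lead))
```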

Verification (deterministic — verifier verdicts held constant, only `tool_today` recomputed)

| step | overall agreement | project_summary |
| --- | --- | --- |
| baseline | 76/96 = 79% | 12/16 |
| + README fallback | 82/96 = 85% | 12/16 |
| + lead-paragraph | 86/96 = 90% | 16/16 (100%) |

stack and project_summary reach 100%. Adds the first direct unit coverage for analyze_project_context (it had none).
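
The reported percentages can be re-derived with a few lines of arithmetic. The denominator of 96 is assumed here to be 16 pilot repos times 6 presence claims, which matches the figures in this PR but is an inference rather than something the description states outright.

```python
REPOS = 16   # pilot size mentioned in the commit message
CLAIMS = 6   # six presence claims per repo
TOTAL = REPOS * CLAIMS  # assumed to be the 96 in the table

# Agreed (repo, claim) pairs at each step, copied from the table.
steps = {"baseline": 76, "+ README fallback": 82, "+ lead-paragraph": 86}
percentages = {name: round(100 * agreed / TOTAL) for name, agreed in steps.items()}

# project_summary alone: 12 of 16 repos before the lead-paragraph fix.
summary_before = round(100 * 12 / REPOS)
```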

Test plan

pytest -q — 2091 passed, 2 skipped, 0 regressions
ruff check clean
Every audit disagreement hand-validated against the files on disk

Notes / out of scope

Canonical portfolio-truth-latest.json is not regenerated — these fixes shift context_quality portfolio-wide; that actualization (plus merge-gate/tier review) is a separate step.
Residual ~10% is mostly non-heuristic: a malformed-AGENTS.md-generator fence, a boilerplate-vs-real judgment, an auditor-audits-itself branch confound, and deferred bespoke-heading cases.
Full design + pilot results in docs/plans/2026-05-29-run-instructions-external-audit.md.

…ixes

Adds an external dynamic-workflow audit that independently re-checks the portfolio-truth snapshot's six presence claims against on-disk ground truth, and uses it to find and verify two scoring fixes in the auditor itself.

Audit (read-only):
- src/run_instructions_audit.py: deterministic pre-step (stratified pilot selection, evidence prep, live tool_today recompute, git drift) + bucket logic
- scripts/presence-claims-audit.workflow.js: Workflow that fans out one Haiku verifier per repo (judging all 6 claims), deterministic tally, Sonnet synthesis
- scripts/run-instructions-audit.workflow.js: original single-claim version, superseded by presence-claims (kept as the simpler example)

Auditor fixes in analyze_project_context, both verified by the audit:
- README fallback: presence claims now consider the top-level README, not only the primary context file (wires the previously-dormant readme_text param)
- lead-paragraph fallback: a project summary is detected as the prose under the H1 title, not only under an "## Overview" section

Verified deterministically on a 16-repo pilot: overall agreement 79% -> 90%, project_summary 75% -> 100%, stack 75% -> 100%. Adds direct unit coverage for analyze_project_context (previously untested). Full suite: 2091 passed.

Canonical portfolio-truth-latest.json intentionally NOT regenerated (the fixes shift context_quality portfolio-wide; that actualization is a separate step).

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7c101c6058

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-30T08:40:11Z

+        if readme_path.is_file():
+            readme_text = _read_small_text(readme_path)


Keep README fallback within context-file limits

When a top-level README.md exists but was excluded by _collect_context_files because it exceeds MAX_CONTEXT_BYTES, this fallback still reads the entire file and lets it drive all presence claims and context_quality. In the normal discovery path, that means oversized READMEs that the collector intentionally filtered out can now be scored as valid context, and the read is no longer bounded. Gate this fallback on README.md being in context_file_names, or apply the same size limit before reading.
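
One way to apply a size limit before reading is sketched below. The helper name `read_readme_if_small` is hypothetical, and the value given to `MAX_CONTEXT_BYTES` is a placeholder assumption: the real constant's value is not shown in this PR, and the real `_read_small_text` may already behave differently.

```python
import tempfile
from pathlib import Path

MAX_CONTEXT_BYTES = 64_000  # placeholder value, assumed for this sketch

def read_readme_if_small(readme_path: Path, limit: int = MAX_CONTEXT_BYTES):
    """Return the README text, or None if it is missing or over the limit."""
    if not readme_path.is_file():
        return None
    # Check the on-disk size first so an oversized file is never read.
    if readme_path.stat().st_size > limit:
        return None
    return readme_path.read_text(encoding="utf-8", errors="replace")

# Small demonstration with temporary files and a 1 KiB limit.
with tempfile.TemporaryDirectory() as tmp:
    small = Path(tmp) / "README.md"
    small.write_text("# Tool\n\nA short README.\n", encoding="utf-8")
    small_result = read_readme_if_small(small, limit=1024)

    big = Path(tmp) / "BIG.md"
    big.write_text("x" * 2048, encoding="utf-8")
    big_result = read_readme_if_small(big, limit=1024)

    missing_result = read_readme_if_small(Path(tmp) / "absent.md", limit=1024)
```

Checking `stat().st_size` keeps the read bounded; the alternative named in the comment, gating on membership in context_file_names, would reuse the collector's own decision instead.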


chatgpt-codex-connector · 2026-05-30T08:40:11Z

+def _has_lead_summary(text: str) -> bool:
+    return _is_nontrivial_text(_lead_paragraph_text(text))


Don't treat list-only leads as summaries

For READMEs whose area before the first ## is only a table of contents or navigation list, _lead_paragraph_text preserves the link/list text and _is_nontrivial_text will mark project_summary_present=True once it has four words. That over-claims the project summary and can promote context_quality even though there is no prose saying what the project is. Strip list-only/TOC leads, or require at least one non-list prose sentence before accepting the fallback.
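
A rough sketch of the "require non-list prose" option, assuming list items are lines that start with `-`, `*`, `+`, or a number followed by `.` or `)`. The names `prose_lines` and `has_prose_summary` are hypothetical, and the four-word threshold is taken from the comment above rather than from the code.

```python
import re

# Matches markdown bullet or numbered list items (an assumed definition).
LIST_ITEM = re.compile(r"^\s*(?:[-*+]|\d+[.)])\s+")

def prose_lines(lead: str) -> list[str]:
    """Keep only non-empty lines that are not list items."""
    return [
        line.strip()
        for line in lead.splitlines()
        if line.strip() and not LIST_ITEM.match(line)
    ]

def has_prose_summary(lead: str, min_words: int = 4) -> bool:
    """Accept the lead only if its non-list text has enough words."""
    words = " ".join(prose_lines(lead)).split()
    return len(words) >= min_words

toc_only = "- [Install](#install)\n- [Usage](#usage)\n- [License and credits](#license)\n"
mixed = "Audits repositories for context quality.\n\n- [Install](#install)\n"

toc_accepted = has_prose_summary(toc_only)
mixed_accepted = has_prose_summary(mixed)
```

With this filter a table-of-contents lead contributes no words, so it can no longer satisfy the threshold on link text alone.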


chatgpt-codex-connector Bot reviewed May 30, 2026


saagpatel merged commit d41e46a into main May 30, 2026
3 checks passed

saagpatel deleted the feat/portfolio-claim-audit branch May 30, 2026 08:55

saagpatel merged 1 commit into main from feat/portfolio-claim-audit
