#337: 源码 English-only 契约系统性收敛#367
Conversation
…T_REFACTOR_COMMENT_POLICY default→none,删历史中文注释(r3 structural consensus)
🤖 Quality review: commentTL;DR
详细说明
这属于 quality checklist 里的 unused parameter / readability 问题,所以我给 📎 完整 codex 原始输出(存档备查)---
pr: 367
role: quality
verdict: comment
---
## Verdict
Comment: the diff is focused and readable overall, but the new scanner exposes an unused `repo_root` parameter that makes the helper look more configurable than it is.
## Evidence
- `skills/codex-refactor-loop/scripts/test_source_language_policy.py:193`: `scan_python_source_language(repo_root: Path = REPO_ROOT)` immediately deletes `repo_root` on line 194 and still scans the module-level `PYTHON_SOURCE_ROOTS`; per the quality checklist, unused parameters should be surfaced because this makes the helper contract misleading.
## What would change your verdict (only if comment or reject)
Remove the `repo_root` parameter and update the single test caller to call `scan_python_source_language()`, or actually thread the supplied root through `source_files`/`relative` so the argument changes what gets scanned.
⟦AI:AUTO-LOOP⟧
REVIEW_DONE:367:quality:comment⟦AI:AUTO-LOOP⟧ |
🤖 tests review: source-language guard 覆盖不足TL;DR
详细说明我从测试质量角度看到了两个阻塞点。 第一, 第二, 我跑了聚焦测试, 54 个测试通过。但这个问题是覆盖质量问题, 不是当前红绿状态问题。 📎 完整 codex 原始输出(存档备查)pr: 367
|
🤖 Architect review: approveTL;DR
详细说明我按三点看了实际 diff: 没有看到新的 lifecycle authority、 验证我单独从 PR branch 临时 archive 跑了三组相关测试: 📎 完整 codex 原始输出(存档备查)---
pr: 367
role: architect
verdict: approve
---
## Verdict
Approve: PR #367 improves architecture compliance by making the source-language/refactor-history policy explicit, deletion-first, and mechanically guarded without adding runtime authority or host SSOT drift.
## Evidence
- `skills/codex-refactor-loop/SKILL.md:114` maps the changed host surface to the CLAUDE.md fact-source and verifiability clauses: "事实源唯一:同一约束禁止在多处平行声明。版本号 -> `.version-bump.json`;host 运行时事实 -> `host.env`;skill 行为 -> 该 skill 的 SKILL.md 与 `scripts/test_*.py`。" The row now defines `$HOST_REFACTOR_COMMENT_POLICY` as default `none`, documents invalid values as fail-closed, and names `test_host_env_surface_matrix.py`, `test_refactor_comment_policy_prompt_contract.py`, and `test_source_language_policy.py` as test owners.
- `skills/codex-refactor-loop/SKILL.md:726` and `skills/codex-refactor-loop/SKILL.md:3029` align the propagated hard rule with the source-language clause at `skills/codex-refactor-loop/SKILL.md:3034`: "Source files are English-only; external user-facing artifacts are 中文 by default." The PR removes old source refactor-history comments and keeps rationale in external artifacts when policy is `none`.
- `skills/codex-refactor-loop/host.env.example:105` keeps the host.env default on the documented injection surface rather than `.refactor-loop/host.env`, satisfying the host boundary clause: "host 运行时事实由 `host.env` 注入" and "skill **无 host 项目改动权**:不修改 host 的 `.git` 配置 / CI 配置 / policy 文档;只在 `host.env` 暴露的 surface 上工作。"
- `skills/codex-refactor-loop/prompts/reviewer-architect.md:20`, `skills/codex-refactor-loop/prompts/implement.md:24`, `skills/codex-refactor-loop/prompts/verify.md:21`, and related prompt hunks converge on the same policy semantics instead of adding a second pathway. This matches CLAUDE.md: "单一主干,插件扩展:每个 skill 一条权威主链路;新能力以子模块/脚本挂载,禁止平行第二系统。"
- `skills/codex-refactor-loop/scripts/test_source_language_policy.py:17` through `skills/codex-refactor-loop/scripts/test_source_language_policy.py:60` adds a narrow test-only scanner plus explicit allowlist for existing Chinese user-facing output literals; `skills/codex-refactor-loop/scripts/test_source_language_policy.py:210` asserts the scanned Python source is clean and `skills/codex-refactor-loop/scripts/test_source_language_policy.py:227` locks allowlist entries to current literals. That satisfies CLAUDE.md: "变更必须可验证:行为约束必须落到机械验证手段(behavior test / source-regression test / 段落 lint)。"
- Diff scope stays within the issue #337 declared plan from the implement log: SKILL.md policy, `host.env.example`, five prompts, source comment deletion, and `test_source_language_policy.py`. No new `*WriteActor`/`*ReadActor` split, no new external repo reference, no schema/protocol surface, no `.refactor-loop/host.env` production SSOT move, and no compat shim/parallel pathway were introduced.
- Verification run from a temporary archive of `origin/refactor/issue337-source-english-only`: `python3 -m unittest discover -s skills/codex-refactor-loop/scripts -p 'test_source_language_policy.py'` passed 3 tests; `python3 -m unittest skills/codex-refactor-loop/scripts/test_host_env_surface_matrix.py skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py` passed 13 tests.
⟦AI:AUTO-LOOP⟧
REVIEW_DONE:367:architect:approve⟦AI:AUTO-LOOP⟧ |
…ent_findings/string_findings/scan_python_source_language),applied-1
…337-source-language-guard # Conflicts: # skills/codex-refactor-loop/scripts/codex_refactor_loop/cli.py # skills/codex-refactor-loop/scripts/codex_refactor_loop/monitors/concurrency.py # skills/codex-refactor-loop/scripts/codex_refactor_loop/phase9/router.py # skills/codex-refactor-loop/scripts/codex_refactor_loop/release/publisher.py # skills/codex-refactor-loop/scripts/codex_refactor_loop/release/required_checks.py # skills/codex-refactor-loop/scripts/codex_refactor_loop/wakeup_plan.py # skills/codex-refactor-loop/scripts/codex_refactor_loop/workflow_spec.py # skills/codex-refactor-loop/scripts/test_skill_reference_anchors.py
📊 当前状态 — review-gate r1(❌ 不需要人介入)
下一步自动会做:3 reviewer 各审 PR diff(对 base auto-refact-dev = 纯 #337 source-lang 改动)→ reject=0&approve≥1 则 ready+merge;reject≥1 则派 fix。 🤖 controller status banner ⟦AI:AUTO-LOOP⟧ |
🤖 Tests reviewer:reject,缺一条 source-regression 覆盖TL;DR
详细说明这次改动把默认 因此一个类似 📎 完整 codex 原始输出(存档备查)---
pr: 367
role: tests
verdict: reject
---
## Verdict
Reject: the new source-language regression guard misses one of the forbidden refactor-history forms introduced by the prompt contract.
## Evidence
- `skills/codex-refactor-loop/prompts/implement.md:35` and `skills/codex-refactor-loop/prompts/verify.md:25` now forbid `iterN/cluster` refactor-history source comments under missing/empty/default/`none`.
- `skills/codex-refactor-loop/scripts/test_source_language_policy.py:21` defines `REF_HISTORY_TOKENS = ("Refactor (", "Old pattern", "New principle")`, so the new scanner does not detect the separately named `iterN/cluster` form unless it also contains one of the other tokens.
- `skills/codex-refactor-loop/scripts/test_source_language_policy.py:221` exercises Han text plus `Refactor (` / `Old pattern`, but it has no negative sample such as `# iter3/cluster-016 rationale` or `# iterN/cluster guard`; that business outcome can regress while the tests stay green.
- No `[Skip]`/manual test escape, sleep-based pacing, or assertion loosening was found in the changed tests.
## What would change your verdict (only if comment or reject)
Add scanner-level negative coverage for the `iterN/cluster` refactor-history form, and update the detection logic so that form is rejected even when it does not also contain `Refactor (`, `Old pattern`, or `New principle`.
⟦AI:AUTO-LOOP⟧
REVIEW_DONE:367:tests:reject⟦AI:AUTO-LOOP⟧ |
🤖 architect review approveTL;DR
详细说明我按 架构上没有看到新的 lifecycle authority、外部仓库引用、 📎 完整 codex 原始输出(存档备查)pr: 367
|
🤖 Quality review r1:comment,当前只剩一个小型死代码清理TL;DR
详细说明这条评论 supersede 我前面那条 stale quality comment: 当前 branch 已经修掉了
📎 完整 codex 原始输出(存档备查)---
pr: 367
role: quality
verdict: comment
---
## Verdict
Comment: the diff is focused and readable overall, with one small dead-code cleanup left in the new test guard.
## Evidence
- `skills/codex-refactor-loop/scripts/test_source_language_policy.py:17`: `PYTHON_SOURCE_ROOTS` is introduced as a module-level tuple, but the current branch never reads it; `source_files()` now builds its own root-aware tuple locally. This is trivial, but it is newly introduced dead code and also leaves `SKILL_ROOT` only alive to feed the unused constant.
## What would change your verdict (only if comment or reject)
Delete the unused module-level `PYTHON_SOURCE_ROOTS` constant and inline/remove the now-unneeded module-level `SKILL_ROOT` if no other reader remains.
⟦AI:AUTO-LOOP⟧
REVIEW_DONE:367:quality:comment⟦AI:AUTO-LOOP⟧ |
📊 当前状态 — review-gate r1 完成(❌ 不需要人介入)
下一步自动会做:fix codex 读 3 reviewer artifact 修 tests reject → controller commit+push → 重派 3 reviewer r2。 🤖 controller status banner ⟦AI:AUTO-LOOP⟧ |
🤖 PR 367 round 1 fix 已完成TL;DR
详细说明tests reviewer 指出 同时删除了同文件中新引入但未使用的 📎 完整 codex 原始输出(存档备查)# Fix report for PR 367 round 1
## Applied
- (A) `skills/codex-refactor-loop/scripts/test_source_language_policy.py:18`: added `REF_HISTORY_ITER_CLUSTER_RE` so standalone `iterN/cluster` and numbered `iter3/cluster-016` refactor-history comment forms are detected (addresses reviewer:tests evidence #1).
- (A) `skills/codex-refactor-loop/scripts/test_source_language_policy.py:65`: updated `has_refactor_history()` to reject both the existing `Refactor (`/`Old pattern`/`New principle` tokens and the standalone `iterN/cluster` form (addresses reviewer:tests evidence #2).
- (A) `skills/codex-refactor-loop/scripts/test_source_language_policy.py:226`: added a scanner-level negative fixture for `# iter3/cluster-016 rationale`, with assertions at `:243` and `:249` proving comment-level and full-scan rejection (addresses reviewer:tests evidence #3).
- (A) `skills/codex-refactor-loop/scripts/test_source_language_policy.py:15`: removed unused module-level `SKILL_ROOT` and `PYTHON_SOURCE_ROOTS` while editing the same file; this addresses reviewer:quality's advisory dead-code comment, not a blocking reject.
## Rejected as false positive
- None.
## Blocked (cannot fix this round)
- None.
## Build status
- build: pass (`bash -lc "true"`)
- tests: pass (`python3 -m unittest skills/codex-refactor-loop/scripts/test_source_language_policy.py`: 3 tests)
- tests: pass (`python3 -m unittest discover -s skills/codex-refactor-loop/scripts -p 'test_*.py'`: 965 tests, skipped=1)
## Recommendation for next round
- Expect merge: the only reject demand was fixed in-scope, the advisory quality cleanup was included, and no false-positive/conflict/human-decision blockers remain.
refactor self-doc: not applicable (HOST_REFACTOR_COMMENT_POLICY=none)
⟦AI:AUTO-LOOP⟧
FIX_DONE:367:round-1:applied-1:rejected-0:blocked-0⟦AI:AUTO-LOOP⟧ |
…向覆盖 + 检测逻辑(applied-1,addr tests reject)
📊 当前状态 — fix r1 完成 → review-gate r2(❌ 不需要人介入)
🤖 controller status banner ⟦AI:AUTO-LOOP⟧ |
🤖 tests reviewer 通过TL;DR
详细说明我按三点检查:新增/变更测试是否断言真实行为、是否引入 sleep/skip/manual 逃逸、是否把断言变松。没有发现阻塞项。 关键覆盖在 📎 完整 codex 原始输出(存档备查)---
pr: 367
role: tests
verdict: approve
---
## Verdict
Test coverage and test quality are adequate for the source English-only policy cleanup.
## Evidence
- `skills/codex-refactor-loop/scripts/test_source_language_policy.py:213` asserts the full production Python source scan returns no unallowlisted findings, so this is not a bump-only test.
- `skills/codex-refactor-loop/scripts/test_source_language_policy.py:218` creates a temporary prohibited sample and asserts detection of Han comments, refactor-history comments, Han docstrings, and selected error strings.
- `skills/codex-refactor-loop/scripts/test_source_language_policy.py:252` verifies every allowlist entry still matches a current literal, preventing stale exemptions from silently accumulating.
- `skills/codex-refactor-loop/scripts/test_refactor_comment_policy_prompt_contract.py:20` and `:48` cover both default `none` policy behavior and explicit `self-doc-comment` opt-in behavior across implement, verify, architect, quality, and review-fix prompts.
- `skills/codex-refactor-loop/scripts/test_host_env_surface_matrix.py:221` pins the host env matrix/default/export/test-owner contract for `HOST_REFACTOR_COMMENT_POLICY`.
- `skills/codex-refactor-loop/scripts/test_github_body_renderer.py:135` keeps the renderer authority contract covered after docstring cleanup by asserting read-artifact-only registration and forbidden lifecycle/write tokens.
- Diff scan found no added `sleep`/delay pacing, no added skip/manual-test escape, and no clear assertion loosening that would hide behavior.⟦AI:AUTO-LOOP⟧ |
🤖 Quality review r2: commentTL;DR
详细说明我按三点 diff 打开了 PR 分支上的所有 touched files,并重点看了新增的 唯一质量问题在 📎 完整 codex 原始输出(存档备查)---
pr: 367
role: quality
verdict: comment
---
## Verdict
Comment: the diff is readable and focused overall, but the new source-language guard has one magic filename exclusion that should be made explicit.
## Evidence
- `skills/codex-refactor-loop/scripts/test_source_language_policy.py:79`: `source_files()` silently excludes every file named `degradation.py`, while the rest of the scanner uses exact `AllowlistEntry(relative_path, owner, reason)` entries. That broad filename exception is harder to review and can hide future files with the same basename; the intent would be clearer as an exact documented exclusion or exact allowlist entries.
## What would change your verdict (only if comment or reject)
Replace the basename-only `path.name != "degradation.py"` filter with an exact repo-relative exclusion that carries a reason, or model the remaining accepted literals through the existing `AllowlistEntry` path/owner/reason structure.
⟦AI:AUTO-LOOP⟧
REVIEW_DONE:367:quality:comment⟦AI:AUTO-LOOP⟧ |
🤖 Architect review: 架构角度通过TL;DR
详细说明我按 新增的 📎 完整 codex 原始输出(存档备查)pr: 367
|
📊 当前状态 — review-gate r2 共识 MERGE_WITH_COMMENTS,合并中(❌ 不需要人介入)
🤖 controller status banner ⟦AI:AUTO-LOOP⟧ |
* #344: 发版目标 issue 存在时默认进入版本倒计时(crnd:milestone:release-target label + release-gate/wakeup-plan 倒计时投影,r4 minimal consensus) (#366) * #345: skill 身份重定位 — audit 降为 fallback issue-producer (#365) * #345: skill 身份重定位 — audit 降为 fallback issue-producer,主路径=issue resolution(r6 structural consensus,no-CLAUDE-change skill-local main-path) * #365 fix r2: SKILL.md 详细 Producer Contract 改为 audit-fallback 一致,消除同源 competing facts(applied-1) * #104: host-portability fold setup owner surface (#363) * #104: host-portability — fold setup owner surface 进 codex-refactor-loop(HostWorkflowSpec 七面投影/non-admin merge/host-env release checks) * #363 fix r2: 补 host-required release checks fail-closed denial-path behavior tests(applied-2) * #363 fix r4: 补 missing/empty HOST_GITHUB_RELEASE_REQUIRED_CHECKS default-projection behavior test(applied-1) * #337: 源码 English-only 契约系统性收敛 (#367) * #337: 源码 English-only 契约系统性收敛 — test-only source-language guard + HOST_REFACTOR_COMMENT_POLICY default→none,删历史中文注释(r3 structural consensus) * #367 fix r2: source-language guard negative test 改走真实 scanner 入口(comment_findings/string_findings/scan_python_source_language),applied-1 * #367 fix r1: source-language guard 加 iterN/cluster refactor-history 负向覆盖 + 检测逻辑(applied-1,addr tests reject) * #364: Path-A divergent-source contract-only locator(SKILL solver source contract + 3 solver prompt + source-regression;无新 module/parser/header) (#368) * fix(#369 review r1): source-lang guard honor HOST_REFACTOR_COMMENT_POLICY + 删 workflow_spec ValidatedWorkflowSpec.as_dict dead alias(quality reject,applied-2)
摘要
#337 cluster-002 源码 English-only 契约系统性收敛。按 design-consensus r3 structural 共识(test-only guard + default-none refactor-history policy)。
test_source_language_policy.pytest-only guard(由 required contract-tests 承载);HOST_REFACTOR_COMMENT_POLICYmissing/empty/default 收敛none;删源码历史中文 Refactor 注释,保留 current-code invariant 英文短注释。范围
60 files(+391/-804,deletion-heavy):新 guard test + SKILL.md hard-rule-3/host.env-matrix/language-details default→none + host.env.example + 5 prompt + 删 bump_version.py/codex_refactor_loop/**/*.py 历史中文注释。
Closes #337
🤖 Auto-loop / codex-refactor-loop
⟦AI:AUTO-LOOP⟧