docs: clarify support-level and reason consistency in audit-comet-expression skill#4447
Merged
Merged
Conversation
…ression skill Expand the audit skill so it produces consistent, accurate findings about `getSupportLevel`, `getIncompatibleReasons`, `getUnsupportedReasons`, and `convert`. Step 3 gains a dedicated subsection covering: - A decision rule for `Unsupported` vs `Incompatible` (wrong answer vs fallback/error). - The runtime-vs-docs split: support-level `notes` flow into EXPLAIN via the dispatcher, while `get*Reasons()` is read only by `GenerateDocs`. - Six consistency rules covering every branch, dead-reason detection, shared `private val` for reason strings, `Incompatible(Some(reason))` preference, and gating decisions in `getSupportLevel` rather than `convert`. - Wording guidelines for reason strings (user-observable effect first, sentence case, backtick configs/types, tracking issue links). - Real antipattern callouts naming specific serdes (`CometHour`, `CometMinute`, `CometSecond`, `CometInitCap`). Step 5 gains a structured 9-item support-level consistency audit checklist that produces findings against the four methods. Step 7 splits into two asks: one for missing tests, one for consistency fixes, so the user can opt in to either independently.
…omatically Tightens Step 6 / Step 7 of the audit-comet-expression skill so the workflow produces concrete output without pausing for user approval on mechanical steps: - Step 6 reorganises the priorities into three named buckets: correctness divergences, missing coverage, and consistency issues. - Step 7 requires correctness findings to be captured as SQL file tests in the same PR as the audit. Walks through the "search for existing issue, file if missing" workflow and the trivial-fix-vs-`query ignore(<url>)` decision rule, with a concrete example. - Step 7 also applies every Step 5 consistency finding automatically. These are mechanical edits (extract `private val`, switch `Incompatible(None)` to `Some(reason)`, add missing `get*Reasons`, move a check from `convert` into `getSupportLevel`, hoist a shared reason into a private companion) and do not need user approval. - Only "missing test coverage" still pauses for user input, because adding tests for cases that already work is incremental polish. Surfaced while applying the skill to a multi-expression audit in apache#4448; both behaviours felt obviously right when the audit ran on a larger surface and several recurring patterns made the asking step feel like friction rather than safety.
comphead
reviewed
May 27, 2026
| etc.). Each one becomes a captured test in Step 7. | ||
|
|
||
| ### Medium priority | ||
| ### Medium priority: missing test coverage |
Contributor
There was a problem hiding this comment.
this prob should be a high priority? as missing tests can lead to correctness divergence
Member
Author
There was a problem hiding this comment.
Good point — split the bucket in 5e9ccc3. High priority now covers correctness divergences and untested cases on a known-fragile axis (overflow, NULL handling, timezones, ANSI, leap years, version-specific semantics). Those get captured as tests automatically in Step 7 (regression test if it passes today, query ignore(<url>) if it fails). Medium priority is reserved for low-risk gaps on well-exercised code paths.
Address PR review feedback (apache#4447): missing tests can mask correctness divergences. Split the Step 6 'missing test coverage' bucket — gaps on known-fragile axes (overflow, NULL handling, timezones, ANSI mode, leap years, version-specific semantics) are now High priority and captured as tests automatically; only low-risk gaps on well-exercised paths stay Medium priority and require user approval. Step 7's 'Correctness divergences: capture as tests' is renamed to 'High-priority findings: capture as tests' and now covers both confirmed divergences and high-risk untested cases (run the case manually first, then commit either a regression test or a query ignore(<url>) test).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
Closes #.
Rationale for this change
The
audit-comet-expressionskill produced inconsistent guidance on when to useUnsupportedvsIncompatible, and howgetSupportLevel,getIncompatibleReasons,getUnsupportedReasons, andconvertneed to stay aligned. As a result, audits flagged some real issues but missed others, and the recommended fixes were vague.While exercising the updated skill on a multi-expression audit (#4448), two further gaps in the workflow became obvious: correctness divergences need to be captured as regression tests in the same PR, and mechanical consistency fixes should be applied automatically rather than gating on user approval. Both changes are folded in here so the skill is consistent end-to-end before the wider audit lands.
What changes are included in this PR?
Updates to
.claude/skills/audit-comet-expression/SKILL.md:Step 3 gains a dedicated "Audit
getSupportLevel,getIncompatibleReasons,getUnsupportedReasons, andconvert" subsection covering:UnsupportedvsIncompatible(wrong answer vs fallback/error).notesflow into EXPLAIN via the dispatcher inQueryPlanSerde.exprToProtoInternal;get*Reasons()is read only byGenerateDocs.private val, preferSome(reason)overNone, gate decisions ingetSupportLevelrather thanconvert).Step 5 gains a structured 9-item support-level consistency audit checklist that produces findings against the four methods.
Step 6 reorganises the priorities into three named buckets: correctness divergences, missing coverage, and consistency issues.
Step 7 is restructured so the audit produces concrete output without pausing for user approval on mechanical steps:
query ignore(<url>)decision rule.private val, switchIncompatible(None)toSome(reason), add missingget*Reasons, move a check fromconvertintogetSupportLevel, hoist a shared reason into a private companion) are all mechanical and do not need user approval.How are these changes tested?
This change is documentation for a Claude skill and has no code or test impact. The Step 6/7 changes were validated by applying the skill to a 38-function audit (#4448), which exercised every branch (correctness findings captured as
query ignore(...)tests, consistency fixes applied as mechanical edits across 7 serdes, doc audit sub-bullets added).