feat(dsql): enhance query plan explainability with type coercion detection, rewrites, and workflow extraction #162

Open
Morlej wants to merge 7 commits into awslabs:main from Morlej:feat/dsql-query-plan-explainability

@Morlej Morlej commented May 8, 2026

Summary

  • Extract Workflow 8 from SKILL.md into references/query-plan/workflow.md (SKILL.md: 334 → 281 LOC)
  • Add type coercion index bypass detection — pg_amop-based detection in plan-interpretation.md, indexed column type queries in catalog-queries.md
  • Add query rewrite references — 11 generic patterns split into individual files under query-rewrites/, plus 2 DSQL-specific rewrites (reltuples estimate, split large joins)
  • Add structured trigger criteria, context disambiguation, and routing to the workflow reference
  • Wire rewrites into workflow — loaded at Phase 0, applied at Phase 2

Validation

  • validate-size.py: 281 lines (good, under 300 limit)
  • validate-references.py: 0 broken links, 0 new orphans

Eval Results

Manual qualitative comparison (n=1, Claude Opus 4.6). Full results in tools/evals/databases-on-aws/dsql/query_plan_rewrite_eval_results.md:

| Eval | Scenario | With Skill | Baseline | Key Delta |
|---|---|---|---|---|
| 200 | IN-subquery Full Scan | PASS | PARTIAL | Skill recommends specific rewrite patterns from reference |
| 201 | Type coercion index bypass | PASS | PASS | Both identify it; skill adds DSQL-specific pg_amop detail |
| 202 | 12-table join ordering | PASS | PARTIAL | Skill offers full diagnostic workflow with GUC experiments |
| 203 | COUNT(*) timeout | PASS | FAIL | Skill recommends pg_class reltuples with staleness warning |
| 204 | Multiple OR to IN | PASS | PARTIAL | Skill identifies pattern from reference |
| 205 | GROUP BY after JOIN | PASS | PARTIAL | Skill recommends subquery aggregation |
| 206–210 | LEFT JOIN, computation push, NOT IN+NULL, UNION ALL, negative | Added in review round | | Coverage for remaining patterns + negative case |

Follow-ups

  • MCP mirror PR: awslabs/mcp src/aurora-dsql-mcp-server/skills/dsql-skill/ needs to be synced with these changes (workflow.md, query-rewrites/ split, updated catalog-queries.md, plan-interpretation.md). Will open companion PR after this merges.
  • Python SQL converter: Per review feedback, deterministic rewrites should migrate to a Python script in a future PR (reference files then document the converter's rules).

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.

🤖 Generated with Claude Code

@Morlej Morlej requested review from a team as code owners May 8, 2026 23:33
@Morlej Morlej force-pushed the feat/dsql-query-plan-explainability branch from 8e33741 to 8261713 Compare May 8, 2026 23:36
Contributor

@amaksimo amaksimo left a comment

I have a few general comments:

  1. We should use positive language throughout (an LLM can confuse DO with DO NOT when we trim context)
  2. We should use RFC language more frequently throughout
  3. We should break up the references in the query-plan folder, as some of the files are very long

Comment thread plugins/databases-on-aws/skills/dsql/references/query-plan/workflow.md Outdated
Comment thread plugins/databases-on-aws/skills/dsql/references/query-plan/workflow.md Outdated
@anwesham-lab anwesham-lab requested a review from amaksimo May 12, 2026 21:48
Morlej and others added 5 commits May 14, 2026 11:20
…ction and rewrite references

- Add structured trigger phrases and routing criteria for query plan diagnosis
- Add type coercion index bypass detection (implicit cast compatibility matrix)
- Extend catalog queries with indexed column type retrieval
- Add generic SQL rewrite reference (11 patterns: OR-to-IN, subquery unnesting, etc.)
- Add DSQL-specific rewrite reference (reltuples estimate, split large joins for DP threshold)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Extract Workflow 8 (query plan explainability) from SKILL.md into
  references/query-plan/workflow.md to stay under the 300 LOC limit
- Wire query-rewrites-generic.md and query-rewrites-dsql-specific.md
  into the workflow (Phase 0 load list + Phase 2 evidence gathering)
- Add behavioral evals (query_plan_rewrite_evals.json) covering type
  coercion detection, subquery unnesting, OR-to-IN, GROUP BY pushdown,
  large join splitting, and reltuples estimation
- Add eval results (query_plan_rewrite_eval_results.md) with
  with-skill vs baseline comparison

Validation:
- validate-size.py: 275 lines (good)
- validate-references.py: 0 broken links, 0 new orphans

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…, RFC keywords

Review feedback from amaksimo:

- Split query-rewrites-generic.md into 11 individual files under
  query-rewrites/ subdirectory to reduce context consumption
- Split query-rewrites-dsql-specific.md into individual files
- Convert monolithic files to index tables pointing to sub-files
- Fix DATEADD() SQL Server syntax → PostgreSQL NOW() - INTERVAL
- Flip negative language ("Do not apply") to positive ("Skip when")
- Add RFC keywords (MUST, SHOULD, MAY) throughout
- Remove psql fallback from workflow.md (enforce MCP usage)
- Update plan-interpretation.md recommendation template with RFC language
- Make Phase 0 loading explicit: MUST for core refs, SHOULD for rewrites

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@anwesham-lab anwesham-lab force-pushed the feat/dsql-query-plan-explainability branch from 07b6baa to 6f97294 Compare May 14, 2026 18:20

**Fallback:** If `awsknowledge` is unavailable, use the defaults above and flag that limits should be verified against [DSQL documentation](https://docs.aws.amazon.com/aurora-dsql/latest/userguide/).
**Fallback:** If `awsknowledge` is unavailable, use the defaults above and note to the user
that limits should be verified against [DSQL documentation](https://docs.aws.amazon.com/aurora-dsql/latest/userguide/).
Member

why did we end up adding line breaks?


**When:** MUST load all four at Workflow 8 Phase 0 — [query-plan/plan-interpretation.md](references/query-plan/plan-interpretation.md), [query-plan/catalog-queries.md](references/query-plan/catalog-queries.md), [query-plan/guc-experiments.md](references/query-plan/guc-experiments.md), [query-plan/report-format.md](references/query-plan/report-format.md)
**Contains:** DSQL node types + Node Duration math + estimation-error bands, pg_class/pg_stats/pg_indexes SQL + correlated-predicate verification, GUC experiment procedures + 30-second skip protocol, required report structure + element checklist + support request template
**When:** MUST load [query-plan/workflow.md](references/query-plan/workflow.md) at Workflow 8 entry — it gates the remaining files
Member

why remove the bullet structure?


**SHOULD apply when:** The WHERE clause rejects NULLs from the right-hand side of a LEFT JOIN (e.g., `IS NOT NULL`, equality comparisons, or any predicate that cannot be true for NULL).

**Skip when:** NULLs from the right-hand side are intentionally preserved in the result.
Member

Is this an always? Contextualize when/how often? Should this be a SHOULD skip when?


When a query uses LEFT JOIN but the WHERE clause rejects NULLs on the joined table, rewrite as INNER JOIN. This enables a simpler, more efficient join plan.

**SHOULD apply when:** The WHERE clause rejects NULLs from the right-hand side of a LEFT JOIN (e.g., `IS NOT NULL`, equality comparisons, or any predicate that cannot be true for NULL).
Member

Is this a SHOULD or a MUST? SHOULD gives the model license to bypass the rule when it is told something like "rush".
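
To make the precondition under discussion concrete, here is a minimal sketch using Python's built-in `sqlite3` as a stand-in engine (an assumption; it is not DSQL, and the `orders`/`customers` tables are hypothetical). It shows that LEFT JOIN and INNER JOIN agree when the WHERE clause rejects NULLs from the right-hand side, and diverge when NULL-extended rows are intentionally kept.

```python
import sqlite3

# Hypothetical schema: some orders have no matching customer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'EU'), (2, 'US');
    INSERT INTO orders VALUES (10, 1), (11, 2), (12, NULL), (13, 99);
""")

def run(join_kind, where):
    """Run the same query with a chosen join type and WHERE predicate."""
    sql = (
        "SELECT o.id FROM orders o "
        f"{join_kind} JOIN customers c ON c.id = o.customer_id "
        f"WHERE {where} ORDER BY o.id"
    )
    return conn.execute(sql).fetchall()

# Predicate rejects NULLs from the right side: both joins agree.
rejecting = "c.region = 'EU'"
left_rejecting = run("LEFT", rejecting)
inner_rejecting = run("INNER", rejecting)

# Predicate keeps NULL-extended rows: the rewrite would drop them.
keeping = "c.region = 'EU' OR c.id IS NULL"
left_keeping = run("LEFT", keeping)
inner_keeping = run("INNER", keeping)

print(left_rejecting, inner_rejecting)
print(left_keeping, inner_keeping)
```

In the second pair the INNER JOIN silently loses orders 12 and 13, which is the case the "Skip when" line is guarding against.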

@@ -0,0 +1,48 @@
# Rewrite: Propagate Filter to JOIN Columns
Member

Since so many of these are deterministic, I question the meta structure: should we instead use something like a Python converter script that parses and rewrites the SQL, with the reference files just invoking it?
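
As a rough illustration of what such a converter could look like for one deterministic pattern (OR-to-IN), here is a minimal sketch. Everything in it is hypothetical: the function name `or_to_in`, the regex, and the restriction to simple `column = literal` terms are assumptions for illustration, not part of this PR. A real converter would presumably use a proper SQL parser rather than regular expressions.

```python
import re

# One "column = literal" term; literals are quoted strings or plain numbers.
_TERM = re.compile(r"\s*([A-Za-z_][\w.]*)\s*=\s*('(?:[^']|'')*'|\d+(?:\.\d+)?)\s*")

def or_to_in(predicate: str) -> str:
    """Rewrite `c = a OR c = b ...` as `c IN (a, b, ...)`.

    Returns the predicate unchanged whenever the pattern does not apply,
    including the negative case of OR across different columns.
    """
    terms = re.split(r"\s+OR\s+", predicate, flags=re.IGNORECASE)
    if len(terms) < 2:
        return predicate
    matches = [_TERM.fullmatch(term) for term in terms]
    if not all(matches):
        return predicate  # something other than simple equality terms
    columns = {m.group(1).lower() for m in matches}
    if len(columns) != 1:
        return predicate  # different columns: decline the rewrite
    values = ", ".join(m.group(2) for m in matches)
    return f"{matches[0].group(1)} IN ({values})"

print(or_to_in("status = 'a' OR status = 'b' OR status = 'c'"))
print(or_to_in("status = 'a' OR region = 'EU'"))
```

The sketch declines by returning its input unchanged, so a caller can detect "no rewrite applied" with a simple equality check.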

@anwesham-lab
Member

PR #162 — Review Summary

feat(dsql): enhance query plan explainability with type coercion detection, rewrites, and workflow extraction

Reviewed at head SHA 07b6baaca029e3336ddfb03438eee26429734a72. Sound direction; eval gains are real (especially eval 203 reltuples). Holding on five correctness bugs in the new SQL example pairs (rows 1–5) — agents will copy these into production rewrites that change result sets — plus a dangling cross-reference cluster (rows 6–8) where workflow.md, plan-interpretation.md, and catalog-queries.md reference an "implicit cast compatibility matrix" and a "Phase 5" that no longer exist after the rewrite. Structural, eval, and process items follow.
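
The integer-division finding (row 1) can be reproduced with a short sketch. It uses Python's built-in `sqlite3` as a stand-in engine, on the assumption that it truncates integer division the same way for these values; the `employees` table and its contents are hypothetical.

```python
import sqlite3

# Hypothetical table with emp_no 1..10000 stored as integers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (emp_no INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?)", [(n,) for n in range(1, 10001)]
)

def rows(sql):
    return conn.execute(sql).fetchall()

# Flagged pair: emp_no * 100 / 5 is emp_no * 20, which never equals 10001,
# while 10001 * 5 / 100 truncates to 500.
original = rows("SELECT emp_no FROM employees WHERE emp_no * 100 / 5 = 10001")
rewritten = rows("SELECT emp_no FROM employees WHERE emp_no = 10001 * 5 / 100")

# Invertible pair suggested in the review: addition has an exact inverse.
add_original = rows("SELECT emp_no FROM employees WHERE emp_no + 100 = 10001")
add_rewritten = rows("SELECT emp_no FROM employees WHERE emp_no = 10001 - 100")

print(original, rewritten)
print(add_original, add_rewritten)
```

The first pair returns different result sets, while the additive pair returns the same single row.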

| # | Confidence | Area | Finding | Suggestion | Reviewed SHA |
|---|---|---|---|---|---|
| 1 | 95 | `query-rewrites/push-computation-to-constant.md` L9-17 (correctness) | First example is not equivalent under integer division. Original `WHERE emp_no * 100 / 5 = 10001` has no integer solution; rewrite `WHERE emp_no = 10001 * 5 / 100` matches `emp_no = 500`. The file's own "Skip when … integer-division rounding" caveat is violated by the leading example. | Replace with a genuinely invertible example (e.g. `emp_no + 100 = 10001` → `emp_no = 9901`), or use a numeric/float column. | 07b6baa |
| 2 | 90 | `query-rewrites/not-in-to-not-exists.md` L1-26 (correctness) | "Sidesteps NULL semantics issues" understates: when the subquery contains NULL, NOT IN returns empty and NOT EXISTS returns the rows; the rewrite changes results, not just performance. | State explicitly: "NOT EXISTS does not preserve NOT IN's NULL-propagation; output differs when the subquery may contain NULLs. Confirm intent with the user before applying." | 07b6baa |
| 3 | 85 | `query-rewrites/subquery-unnesting-uncorrelated.md` L9-23 (correctness) | `SELECT DISTINCT R.*` collapses pre-existing duplicates in R that the original semi-join (`IN (SELECT …)`) preserved; it fixes one duplicate problem by introducing another. | Either (a) recommend the EXISTS form (true semi-join) or (b) state the assumption "Apply only when S.b is unique (PK/UNIQUE); otherwise DISTINCT changes results." | 07b6baa |
| 4 | 85 | `query-rewrites/subquery-unnesting-correlated.md` L20-26 (correctness) | Same DISTINCT-on-semi-join issue as #3 for the EXISTS→JOIN rewrite. | Same fix: prefer EXISTS, or document the uniqueness precondition. | 07b6baa |
| 5 | 85 | `query-rewrites/subquery-unnesting-scalar.md` L33-52 (correctness) | `s_count` example: scalar `COUNT(*)` returns 0 for outer rows with no match; LEFT JOIN+GROUP BY rewrite returns NULL. Downstream `WHERE s_count = 0`, `SUM(s_count)`, etc. break silently. The first MAX example is fine (MAX returns NULL on empty). | Wrap with `COALESCE(Agg.s_count, 0) AS s_count`; add a one-line note that COUNT/SUM need COALESCE while MAX/MIN do not. | 07b6baa |
| 6 | 95 | `workflow.md` L29-31 (correctness) | Trigger row references "Phase 5 re-entry for an existing report" but the workflow defines only Phase 0–4 (TOC L7–14). Routing L56 correctly says "append Addendum". | Replace with "Reassessment re-entry: re-runs Phase 1–2 and appends an Addendum per Phase 4." | 07b6baa |
| 7 | 90 | `plan-interpretation.md` L194-242, `workflow.md` L100, `catalog-queries.md` L122-124 (correctness) | Three files reference an "implicit cast compatibility matrix below/above/in plan-interpretation.md" that does not exist. The section was rewritten to recommend a live pg_amop query instead. Eval 201's expectation "Mentions implicit cast compatibility matrix" reinforces the phantom artifact. | Replace all four references with "the pg_amop query in catalog-queries.md (B-Tree Cross-Type Operator Support)." Update eval 201 expectation accordingly. | 07b6baa |
| 8 | 80 | `plan-interpretation.md` L201-214 (correctness) | Bullet at L202 says "if an implicit cast exists, the planner can still use the index", which contradicts L211–214, which correctly notes B-Tree needs a registered cross-type operator (pg_amop), not just a pg_cast. The two paragraphs disagree. | Drop or rephrase L202: "If a cross-type B-Tree operator is registered (see pg_amop), the index can be used; otherwise the planner applies a per-row cast that defeats index ordering." | 07b6baa |
| 9 | 75 | `plan-interpretation.md` L214 (correctness/durability) | "Cross-type index support is limited to the integer family" stated as fact, no citation, no "verify before asserting" hedge. Will rot the moment DSQL adds a cross-type operator family. | Prefix with "At time of writing…" and route the agent through the pg_amop query before asserting this to a user. | 07b6baa |
| 10 | 70 | `catalog-queries.md` L136 (correctness) | `amopmethod = 10003` is a DSQL-internal magic number (PG mainline B-Tree is 403). No provenance comment; will silently break if the OID changes. | Add inline comment explaining provenance and a `SELECT oid FROM pg_am WHERE amname = 'btree'` recommendation as a hedge. | 07b6baa |
| 11 | 90 | `catalog-queries.md` TOC L5-13 (structure) | TOC omits 3 of the 9 sections, all PR additions: "Column Types for Predicate Columns" (L107), "B-Tree Cross-Type Operator Support" (L125), "Indexed Column Types" (L157). | Add three TOC entries between current items 5 and 6. | 07b6baa |
| 12 | 90 | `plan-interpretation.md` TOC L3-14 (structure) | TOC omits "Type Coercion and Index Bypass" (L186), the headline new section of this PR. | Insert TOC entry, renumber subsequent items. | 07b6baa |
| 13 | 80 | `SKILL.md` L112-115 (structure) | The PR rewrites this block but uses a single combined When:/Contains: while sibling "(modular):" sections give each sub-file its own `####` heading + per-file When/Contains. Loading conditions for plan-interpretation.md, catalog-queries.md, guc-experiments.md, report-format.md, and the rewrite indexes are no longer declared in the entry file. | Either give each query-plan reference its own `####` entry, or explicitly delegate routing to workflow.md and state that as the rule. | 07b6baa |
| 14 | 80 | `tools/evals/databases-on-aws/README.md` (multi-target sync) | New query_plan_rewrite_evals.json is not added to the README's directory tree or per-tier eval section. Sibling evals (evals.json, query_explainability_evals.json) all have entries. The cluster-fixtures table also misses the new schemas (12-table join, 50M-row table). | Add the new eval and a fixtures row to the README. | 07b6baa |
| 15 | 75 | `tools/evals/databases-on-aws/dsql/scripts/` (process) | New eval JSON has no paired runner script under scripts/. Sibling evals all have one. PR ships only manual query_plan_rewrite_eval_results.md. | Either add run_query_plan_rewrite_evals.py (LLM-judge fits) or document explicitly that this suite is manual-only. | 07b6baa |
| 16 | 70 | `query_plan_rewrite_evals.json` (tests) | Coverage gap: 5 of 11 generic rewrites have no direct eval: left-join-to-inner, propagate-filter, push-computation-to-constant, not-in-to-not-exists, flatten-union-all. NOT IN→NOT EXISTS especially worth covering (correctness, not just perf). No negative cases (where the agent should decline the rewrite). | Add evals 206–212 covering missing patterns + at least one "OR across different columns → does NOT recommend OR-to-IN" negative case. | 07b6baa |
| 17 | 65 | `query_plan_rewrite_eval_results.md` (tests) | Sample size = 1 per cell, no model/version/temperature recorded, no variance analysis. PASS/FAIL is a single human transcript read. | Record model + version + n=3 with majority vote; add a Runs column; or downgrade the table to "qualitative comparison." | 07b6baa |
| 18 | 75 | PR description (pr-body) | PR body / commit 82617135 claim "275 lines (good)" but SKILL.md is 279 lines at head. Still under cap; cosmetic but it's a stated correctness claim. | Re-run validate-size.py on 07b6baa and update the PR body / commit message. | 07b6baa |
| 19 | 65 | Multi-target sync (awslabs/mcp) (process) | awslabs/mcp@main `src/aurora-dsql-mcp-server/skills/dsql-skill/references/query-plan/` does NOT contain workflow.md, query-rewrites/, or the new index files. PR description does not mention an MCP-mirror PR or follow-up. Per the dsql-skill-author placement rules, the default DSQL skill must propagate to the MCP standalone skill + Kiro Power. | Open a companion PR against awslabs/mcp mirroring the new files (and translate workflow.md for Kiro Power), or document explicitly that the mirror is out of scope and link the follow-up issue. | 07b6baa |
| 20 | 70 | `SKILL.md` L172-173, L266-267 (silent-failure) | This PR softens the awsknowledge fallback rule from "flag that" to "note to the user that", which is advisory phrasing, not a MUST. Agent can silently use stale defaults for decisions that turn on the exact value. | Promote to MUST: "MUST tell the user the lookup failed, MUST name the limit and value, MUST refuse the fallback when the recommendation depends on the exact value." | 07b6baa |
| 21 | 70 | `query-rewrites/reltuples-estimate.md` + eval 203 (silent-failure) | reltuples reflects last ANALYZE/autovacuum and may be drastically stale on a fresh or write-heavy table. Doc says "estimate, not exact" but does not require warning the user about staleness; eval 203 lacks the staleness expectation, so the failure mode is unobservable. | Add MUST: "Warn the user that reltuples reflects the last ANALYZE; recommend cross-checking last_analyze when the count drives a decision." Add eval expectation. | 07b6baa |
| 22 | 70 | `catalog-queries.md` L107-180 (PR-added sections) (security) | The 3 new sections this PR adds (Column Types for Predicate Columns, B-Tree Cross-Type Operator Support, Indexed Column Types) introduce fresh `'{schema}'` / `'{table}'` placeholder substitution patterns. SKILL.md Workflow 4 mandates `safe_query.build()` for query construction; these new examples teach lexical concatenation, an injection sink despite readonly_query. | Add a one-line MUST scoped to the new sections: "Substitute these placeholders via safe_query.build() with ident() — see input-validation.md." | 07b6baa |
| 23 | 60 | `query-rewrites/*.md` (all 13) (style) | Every new rewrite file pairs `**SHOULD apply when:**` with `**Skip when:**`. The two are logical complements; per authoring-style.md §Voice reserve prohibition for irreversible harm. | Drop `Skip when:` and tighten `SHOULD apply when:`, or rephrase as a single `**Applies when:**` criterion. | 07b6baa |
| 24 | 80 | `workflow.md` TOC L7-15 (structure) | TOC anchors encode the em dash with double hyphens (e.g., `#phase-0--load-reference-material`). GitHub collapses to a single `-`, so all five Phase TOC links are broken in the rendered file. | Regenerate as `#phase-0-load-reference-material` through `#phase-4-produce-the-report-invite-reassessment` (and `#phase-3-experiment-conditional`). | 07b6baa |
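
The NULL-semantics point in finding 2 is easy to check by hand. The following is a minimal sketch using Python's built-in `sqlite3` as a stand-in engine (an assumption; DSQL itself is not exercised here), with hypothetical tables `r` and `s`.

```python
import sqlite3

# Hypothetical tables: the subquery side contains a NULL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE r (a INTEGER);
    CREATE TABLE s (b INTEGER);
    INSERT INTO r VALUES (1), (2), (3);
    INSERT INTO s VALUES (1), (NULL);
""")

# NOT IN: for a = 2 and a = 3 the comparison against NULL is unknown,
# so the predicate is never true and no row qualifies.
not_in = conn.execute(
    "SELECT a FROM r WHERE a NOT IN (SELECT b FROM s) ORDER BY a"
).fetchall()

# NOT EXISTS: a NULL in s simply never matches, so 2 and 3 are returned.
not_exists = conn.execute(
    "SELECT a FROM r WHERE NOT EXISTS "
    "(SELECT 1 FROM s WHERE s.b = r.a) ORDER BY a"
).fetchall()

print(not_in)
print(not_exists)
```

The two queries return different result sets on the same data, which is why the suggestion asks for user confirmation rather than treating the rewrite as a pure performance change.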

Reviewer scope. This review covered the diff at the head SHA (21 files, +1131 / −36) — the new query-plan workflow extraction, type-coercion detection, 11-pattern rewrite library, and eval pair. Prior amaksimo review threads from the predecessor PR #161 (file split, RFC keywords, positive language, DATEADD→NOW()-INTERVAL, psql fallback removal) are addressed at this head; thank you.


🤖 This review was drafted with Claude Code using the dsql-skill-author Workflow 2 (reviewer) procedure and the 17+ sub-agent roster from code-review.md. Findings have been validated through the five-gate filter (re-read at head SHA, applicability, suggestion correctness, customer-value, confidence ≥ 60).

Was this review useful? React with 👍 if the findings were helpful, 👎 if they missed the mark or introduced false positives. Reply with specifics so the review process can improve. Findings you disagree with are valid to push back on — confidence scores are not verdicts.

Morlej and others added 2 commits May 15, 2026 18:04
…vals

Correctness fixes (review items 1-5):
- awslabs#1: push-computation-to-constant — use NUMERIC column 'amount' to
  avoid integer division non-equivalence
- awslabs#2: not-in-to-not-exists — add NULL semantics warning (NOT EXISTS
  does not preserve NOT IN's NULL-propagation; MUST confirm with user)
- awslabs#3/awslabs#4: subquery-unnesting — prefer EXISTS form (true semi-join);
  document uniqueness precondition for JOIN+DISTINCT alternative
- awslabs#5: subquery-unnesting-scalar — add COALESCE(s_count, 0) for
  COUNT/SUM (LEFT JOIN returns NULL, scalar returns 0)
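
The COUNT versus LEFT JOIN difference behind item 5 can be sketched as follows, using Python's built-in `sqlite3` as a stand-in engine (an assumption) and hypothetical tables `r` and `s`. It is an illustration of the behaviour described above, not the content of the reference file.

```python
import sqlite3

# Hypothetical data: r.id = 2 has no matching rows in s.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE r (id INTEGER PRIMARY KEY);
    CREATE TABLE s (r_id INTEGER);
    INSERT INTO r VALUES (1), (2);
    INSERT INTO s VALUES (1), (1);
""")

def rows(sql):
    return conn.execute(sql).fetchall()

# Original form: a scalar COUNT(*) yields 0 when nothing matches.
scalar = rows("""
    SELECT r.id, (SELECT COUNT(*) FROM s WHERE s.r_id = r.id) AS s_count
    FROM r ORDER BY r.id
""")

# Join template shared by the two rewrites below.
JOIN_SQL = """
    SELECT r.id, {expr} AS s_count
    FROM r
    LEFT JOIN (SELECT r_id, COUNT(*) AS s_count FROM s GROUP BY r_id) agg
        ON agg.r_id = r.id
    ORDER BY r.id
"""

# Without COALESCE the unmatched row carries NULL instead of 0.
join_plain = rows(JOIN_SQL.format(expr="agg.s_count"))

# With COALESCE the rewrite matches the scalar subquery again.
join_coalesced = rows(JOIN_SQL.format(expr="COALESCE(agg.s_count, 0)"))

print(scalar, join_plain, join_coalesced)
```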

Dangling reference fixes (review items 6-8):
- awslabs#6: workflow.md trigger table — "Phase 5" → reassessment re-entry
- awslabs#7: Replace all "implicit cast compatibility matrix" references
  with "pg_amop query in catalog-queries.md"
- awslabs#8: plan-interpretation.md L202 — fix cast-vs-operator contradiction

Structural fixes (review items 9-14, 24):
- awslabs#9: Hedge "integer family" claim with "at time of writing" + verify
- awslabs#10: amopmethod=10003 — add provenance comment and verification SQL
- awslabs#11: catalog-queries.md TOC — add 3 missing sections
- awslabs#12: plan-interpretation.md TOC — add Type Coercion section
- awslabs#13: SKILL.md — explicitly delegate routing to workflow.md
- awslabs#24: workflow.md — remove em dashes from headings for clean anchors

Other fixes (review items 21-23):
- awslabs#21: reltuples-estimate — add staleness warning (MUST warn user)
- awslabs#22: catalog-queries — add safe_query.build() note for placeholders
- awslabs#23: "Skip when" → "SHOULD skip when" in all rewrite files

Eval improvements (review items 14, 16):
- awslabs#14: README — add query_plan_rewrite_evals to directory tree and
  eval section
- awslabs#16: Add evals 206-210 covering LEFT JOIN, computation push, NOT IN
  with NULL warning, nested UNION ALL, and negative case (OR across
  different columns)
- awslabs#7 (eval): Update eval 201 expectation — pg_amop instead of matrix

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- awslabs#17: Downgrade eval results to qualitative comparison, record model
  and version, note n=1 and recommend n>=3 for production confidence
- awslabs#18: SKILL.md is 281 lines (will update PR body)
- awslabs#20: Strengthen awsknowledge fallback to MUST — refuse fallback when
  recommendation depends on exact limit value
- awslabs#21: Already addressed in prior commit (reltuples staleness)
- awslabs#15: Document manual-only status and future Python converter direction
  (per anwesham-lab's suggestion for deterministic rewrites)
- awslabs#19: MCP mirror PR noted as follow-up in PR body

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>