test: capture datetime audit findings as ignored SQL tests; update audit skill#4453
Closed
andygrove wants to merge 4 commits into
Closed
test: capture datetime audit findings as ignored SQL tests; update audit skill#4453andygrove wants to merge 4 commits into
andygrove wants to merge 4 commits into
Conversation
…ression skill Expand the audit skill so it produces consistent, accurate findings about `getSupportLevel`, `getIncompatibleReasons`, `getUnsupportedReasons`, and `convert`. Step 3 gains a dedicated subsection covering: - A decision rule for `Unsupported` vs `Incompatible` (wrong answer vs fallback/error). - The runtime-vs-docs split: support-level `notes` flow into EXPLAIN via the dispatcher, while `get*Reasons()` is read only by `GenerateDocs`. - Six consistency rules covering every branch, dead-reason detection, shared `private val` for reason strings, `Incompatible(Some(reason))` preference, and gating decisions in `getSupportLevel` rather than `convert`. - Wording guidelines for reason strings (user-observable effect first, sentence case, backtick configs/types, tracking issue links). - Real antipattern callouts naming specific serdes (`CometHour`, `CometMinute`, `CometSecond`, `CometInitCap`). Step 5 gains a structured 9-item support-level consistency audit checklist that produces findings against the four methods. Step 7 splits into two asks: one for missing tests, one for consistency fixes, so the user can opt in to either independently.
Add per-version audit sub-bullets to every implemented date/time expression in spark_expressions_support.md using the audit-comet-expression skill. Covers 38 SQL function names across the 33 backing Comet serde objects (some serdes back multiple SQL names, e.g. `day`/`dayofmonth`, `date_add`/`dateadd`, `date_diff`/`datediff`). For each function, the sub-bullets record: - Whether the Spark class is identical across 3.4.3, 3.5.8, 4.0.1 - Spark 4.0 changes (universally the `NullIntolerant` trait / `nullIntolerant: Boolean` refactor, plus `StringTypeWithCollation` widening on string inputs and some error-helper renames) - Known divergences between Comet and Spark, with tracking-issue links The audit was driven by 8 parallel agents, each handling a related group of expressions (codegen-dispatched, date field extractors, Hour/Minute/Second, scalar function wrappers, timezone/unix, truncation, format, Iceberg transforms). Out of scope: `current_timezone`, `date_part`, `datepart`, `extract`, `localtimestamp` route through Spark optimizer rewrites or evaluate to constants and do not have dedicated Comet serdes; `days` and `hours` are V2 partition transforms with no SQL function name and so do not appear in this section. No code changes in this PR; consistency findings flagged during the audit will be addressed in follow-up PRs.
… audit skill Skill update (audit-comet-expression): - Step 6 reorganises priorities: correctness divergences, missing coverage, and consistency issues are now three distinct buckets. - Step 7 now requires correctness findings to land as captured tests in the same PR as the audit. Walks through the search-or-file issue workflow and the trivial-fix-vs-ignore decision, with a `query ignore(<url>)` example. Captured tests (linked to follow-up issues filed against this PR): - next_day.sql gains a divergence test for whitespace trimming (Comet trims `' MO '`; Spark does not). ignore(apache#4450). - next_day_ansi.sql is new and asserts that `next_day` throws under `spark.sql.ansi.enabled=true` for malformed dayOfWeek. Comet currently returns NULL. ignore(apache#4449). - make_date_ansi.sql is new and asserts that `make_date` throws under `spark.sql.ansi.enabled=true` for invalid `(year, month, day)`. Comet currently returns NULL. ignore(apache#4451). The fourth audit finding (`make_date` year 0 / negative years) was verified against Spark's own implementation and turned out to be a non-divergence; the issue was closed and no test added. None of the three remaining bugs are trivial to fix here: both `SparkNextDay` and `SparkMakeDate` live upstream in the `datafusion-spark` crate, so the fixes need to flow through that project. The captured tests will switch from `ignore(...)` to their intended assertion mode when the upstream changes land.
Member
Author
|
Folded into #4448 per author request — the skill update and ignored tests belong in the same PR as the audit they came from. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
Closes #.
Rationale for this change
While reviewing #4448 it became clear that the audit skill should produce captured tests for any correctness divergence it identifies, not just prose findings. Tests in
query ignore(...)mode are the most useful long-lived artefact: they reproduce the bug for whoever picks up the fix, and they auto-activate when theignore(...)is removed.This PR updates the
audit-comet-expressionskill to require that workflow, and applies it retroactively to three correctness findings from the #4448 audit.Stacked on #4448. The diff currently shows that PR's docs commits too; they will drop out when #4448 merges and this branch is rebased.
What changes are included in this PR?
Skill update (
audit-comet-expression):ignoredecision rule with a concretequery ignore(<url>)example.Captured tests for #4448 findings:
next_day.sql(appended)' MO 'to matchMonday; Spark does not trim and returns NULLnext_day_ansi.sql(new)next_dayreturns NULL where Spark throws underspark.sql.ansi.enabled=truemake_date_ansi.sql(new)make_datereturns NULL where Spark throws underspark.sql.ansi.enabled=trueAll three are in
query ignore(<issue>)mode. When the upstream fix lands, removingignore(...)will activate the assertions.Audit finding withdrawn:
The
make_date"year 0 / negative years" finding was checked against Spark's ownMakeDate.nullSafeEval, which usesLocalDate.of(year, month, day)and accepts any year in the full JavaLocalDaterange. Spark and chrono agree on this case, and the existingmake_date.sqltests at lines 133/136 already exercise it. Issue #4452 was closed as not-a-bug.Why not fix instead of ignore?
Both
SparkNextDayandSparkMakeDatelive in the upstreamdatafusion-sparkcrate (pinned at 53.1.0 inCargo.lock). Fixing the ANSI throw and the trim behaviour requires changes there, which is outside the scope of this PR.How are these changes tested?
The new SQL test files parse against
SqlFileTestParser(theIgnorePattern,ConfigPattern, andMinSparkVersionPatternregexes all match the directives used). Thequery ignore(...)queries are skipped at runtime per the documentedignoremode contract insql-file-tests.md, so they do not affect suite outcomes today.