Unify single-node and bulk deployment validation paths #2037
Merged
shangyian merged 5 commits into DataJunction:main on Apr 21, 2026
Summary
Single-node validate and bulk deployment historically ran two separate validators that gave different verdicts for the same SQL. This branch adds `validate_node_data_v2` alongside the legacy validator, built on the same `validate_node_query` primitive that deployment uses, and wires it into `POST /nodes/{name}/validate/`.

Shared primitives are called by both `extract_node_graph` (deployment) and `validate_node_data_v2` (single-node), so candidate extraction and parent/missing classification produce identical results from both directions. The deployment path now also emits `MissingParent` records for unresolved `ast.Table` refs (matching single-node behavior) and filters non-metric resolutions out of derived-metric parent lists.

Validation changes on v2 over legacy:
- `SELECT col FROM a JOIN b ON a.col = b.col` where `col` lives on both)
- `EXPLODE(array<struct<a, b>>) AS (c1, c2)`: struct fields unpack positionally, both in `LATERAL VIEW` and projection forms
- `CROSS JOIN UNNEST(t.arr) AS u(x)`: `UNNEST` sees the left-side table in scope; struct unpacking and `POSEXPLODE` handled
- `SELECT … LATERAL VIEW EXPLODE(sequence(1, N)) AS x`
- `LATERAL VIEW EXPLODE(…) AS c` in one select no longer collide on the default alias
- `VALUES (NULL, 'a'), ('x', 'b') AS t(c1, c2)`: column type comes from the first typed row, not any given null
- `arr[1]` on a `ListType` column resolves to the element type
- `from_json(string_col, 'MAP<STRING, STRING>')` resolves its return type so downstream references work
- `col.field` struct access on a deep chain resolves correctly
- `Column X not found` cause surfaces alongside the downstream `Unable to infer type` noise

Test Plan
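As an illustration of the kind of behavior a regression check could pin down, here is a minimal sketch of the `VALUES` typing rule from the Summary: a column takes its type from the first row that carries a typed (non-NULL) value. The helper `infer_values_column_types` is hypothetical and is not part of DataJunction; Python type names stand in for SQL types.

```python
from typing import Optional, Sequence


def infer_values_column_types(
    rows: Sequence[Sequence[object]],
) -> list[Optional[str]]:
    """Return, per column, the type name of the first non-NULL value.

    Hypothetical helper for illustration only. A column whose values are
    all NULL (None) stays untyped and is reported as None.
    """
    if not rows:
        return []
    width = len(rows[0])
    types: list[Optional[str]] = [None] * width
    for row in rows:
        if len(row) != width:
            raise ValueError("all VALUES rows must have the same number of columns")
        for i, value in enumerate(row):
            # Skip NULLs: only the first typed value decides the column type.
            if types[i] is None and value is not None:
                types[i] = type(value).__name__
    return types


# Mirrors VALUES (NULL, 'a'), ('x', 'b') AS t(c1, c2)
rows = [(None, "a"), ("x", "b")]
column_types = infer_values_column_types(rows)
```

In this sketch `c1` is typed from the second row because the first row holds a NULL in that position, which is the case the legacy behavior reportedly got wrong.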
Two fixture bugs surfaced and were fixed.
- `make check` passes
- `make test` shows 100% unit test coverage

Deployment Plan