Refactor dimension link validation into NodeSpecBulkValidator #1951

Merged
shangyian merged 10 commits into DataJunction:main from shangyian:refactor-dimensionlink-staticmeth
Apr 4, 2026

@shangyian (Collaborator) commented Apr 3, 2026

Summary

Previously, dimension link validation was scattered across the deployment orchestrator as several instance methods (e.g., validate_dimension_links). These methods were interleaved with deployment execution, which made them hard to test in isolation and impossible to call without a running orchestrator.

What changes:

  • DimensionLink.parse_join_sql and build_foreign_key_mapping become static methods. They previously required an instantiated DimensionLink ORM object (needing node_revision and dimension relationships loaded). Making them static lets validation call them with just the raw string values from a deployment spec, without touching the DB.
  • NodeSpecBulkValidator gains _validate_dimension_link_specs, which runs join clause SQL parsing and column existence checks for both join and reference link specs as a post-validation pass. It prefetches dimension node columns in one bulk query (_prefetch_dimension_link_nodes) to avoid per-link DB round trips, and gracefully skips dim-side column checks for dimensions being deployed in the same batch.
  • The four orchestrator validation methods are deleted. Dimension link validation now happens entirely inside bulk_validate_node_data (the pre-deployment validation step), not during execution.
  • update_ast_column_types is promoted from a closure inside validate_node_data to a module-level function in internal/validation.py, so NodeSpecBulkValidator can call it after parsing to keep column type inference consistent.
  • CompileContext gets a column_overrides field so that validation can substitute a proposed column list for a node (instead of loading it from the DB). This is a prerequisite for dry-run impact preview in later PRs. The field is also threaded through to bulk_validate_node_data.
  • Subscripts raises DJParseException instead of returning None when a struct field key isn't found, preventing silent type inference failures downstream.
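
To illustrate the first two bullets, here is a minimal sketch of the general pattern: a static method that builds a foreign key mapping from raw strings, plus a validation pass that does one bulk column lookup and skips dimension-side checks for dimensions in the same batch. This is not the code from this PR. `LinkSpec`, `LinkParser`, `validate_links`, and the `fetch_columns` callable are hypothetical names, and the regex-based equality parsing is a simplified stand-in for real SQL parsing.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class LinkSpec:
    """Hypothetical stand-in for one dimension link in a deployment spec."""
    node: str       # node that owns the link, e.g. "default.orders"
    dimension: str  # dimension node being linked to
    join_on: str    # raw join clause, e.g. "a.x = b.y AND a.z = b.w"


def _strip_owner(operand: str, owner: str) -> str | None:
    """Return the column part of 'owner.column', or None if the prefix differs."""
    prefix = owner + "."
    return operand[len(prefix):] if operand.startswith(prefix) else None


class LinkParser:
    """Hypothetical helper (assumption), not DataJunction's DimensionLink."""

    _EQUALITY = re.compile(r"^\s*([\w.]+)\s*=\s*([\w.]+)\s*$")

    @staticmethod
    def foreign_key_mapping(node: str, dimension: str, join_on: str) -> dict[str, str]:
        """Map node columns to dimension columns using only raw strings (no ORM object)."""
        mapping: dict[str, str] = {}
        for clause in re.split(r"\s+AND\s+", join_on, flags=re.IGNORECASE):
            match = LinkParser._EQUALITY.match(clause)
            if not match:
                raise ValueError(f"unsupported join clause: {clause!r}")
            left, right = match.groups()
            node_col, dim_col = _strip_owner(left, node), _strip_owner(right, dimension)
            if node_col is None or dim_col is None:
                # Accept the operands in the opposite order as well.
                node_col, dim_col = _strip_owner(right, node), _strip_owner(left, dimension)
            if node_col is None or dim_col is None:
                raise ValueError(f"clause does not relate {node} to {dimension}: {clause!r}")
            mapping[node_col] = dim_col
        return mapping


def validate_links(
    specs: list[LinkSpec],
    node_columns: dict[str, set[str]],
    fetch_columns: Callable[[set[str]], dict[str, set[str]]],
    batch: set[str],
) -> list[str]:
    """Collect link errors; fetch_columns is called at most once for all links."""
    # Dimensions deployed in the same batch do not exist yet, so do not look them up.
    wanted = {spec.dimension for spec in specs} - batch
    existing = fetch_columns(wanted) if wanted else {}
    errors: list[str] = []
    for spec in specs:
        try:
            mapping = LinkParser.foreign_key_mapping(spec.node, spec.dimension, spec.join_on)
        except ValueError as exc:
            errors.append(f"{spec.node}: {exc}")
            continue
        for node_col, dim_col in mapping.items():
            if node_col not in node_columns.get(spec.node, set()):
                errors.append(f"{spec.node}: column {node_col!r} not found")
            if spec.dimension in batch:
                continue  # skip dimension-side checks for same-batch dimensions
            if spec.dimension not in existing:
                errors.append(f"{spec.node}: dimension {spec.dimension!r} not found")
            elif dim_col not in existing[spec.dimension]:
                errors.append(f"{spec.dimension}: column {dim_col!r} not found")
    return errors
```

Because `foreign_key_mapping` takes plain strings, a unit test can call it directly with values from a spec, and `validate_links` can be exercised with a fake `fetch_columns` instead of a database session.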

Test Plan

  • PR has an associated issue: #
  • make check passes
  • make test shows 100% unit test coverage

Deployment Plan

netlify bot commented Apr 3, 2026

Deploy Preview for thriving-cassata-78ae72 canceled.

Name Link
🔨 Latest commit f5998d2
🔍 Latest deploy log https://app.netlify.com/projects/thriving-cassata-78ae72/deploys/69d05d579e787a0008aa461e

@shangyian changed the title from "Refactor join sql parsing logic in dimension links" to "Add bulk dimension link validation" Apr 3, 2026
@shangyian changed the title from "Add bulk dimension link validation" to "Refactor dimension link validation into NodeSpecBulkValidator" Apr 4, 2026
@shangyian marked this pull request as ready for review April 4, 2026 01:20
@shangyian merged commit b5a111a into DataJunction:main Apr 4, 2026
17 checks passed
@shangyian deleted the refactor-dimensionlink-staticmeth branch April 4, 2026 01:20