Feat: finalize clickhouse engine adapter#3125

Merged
treysp merged 23 commits into main from trey/improve-ch-adapter
Sep 18, 2024

@treysp treysp commented Sep 12, 2024

This PR finalizes the ClickHouse engine adapter:

  • Adds SCD model kind support
  • Adds ability to pass arbitrary settings to client connection
  • Adds docs

Context and implementation details

Joins and NULLs

  • By default, ClickHouse fills the unmatched cells of an outer join with a datatype-specific default (e.g., 0 for integer columns).
  • The SCD and table diff queries that SQLMesh builds require those cells to be filled with NULLs instead.
  • We do that by injecting SETTINGS join_use_nulls = 1 into the query.
  • SCD detail:
    • The original user query is embedded in a CTE.
    • Query settings are dynamically scoped, so our setting on the outer query will apply to the user query CTE.
    • If join_use_nulls = 0 on the CH server, we inject join_use_nulls = 0 into the CTE to preserve the behavior the user expects.
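
A minimal sketch of the scoping idea above, using plain string building rather than the adapter's actual code. The function name `wrap_with_join_use_nulls`, the CTE alias `source`, and the trivial outer `SELECT *` are all hypothetical stand-ins; the real SCD query is much larger, and this sketch does not handle a user query that already carries its own SETTINGS clause.

```python
def wrap_with_join_use_nulls(user_sql: str, server_join_use_nulls: int) -> str:
    """Embed the user query in a CTE and force NULL-filling on the outer query.

    Hypothetical helper for illustration only; not the SQLMesh implementation.
    """
    inner = user_sql.strip().rstrip(";")

    if server_join_use_nulls == 0:
        # The outer setting would otherwise also apply inside the CTE, so pin
        # the server's value on the user query to keep its results unchanged.
        inner = f"{inner} SETTINGS join_use_nulls = 0"

    # Placeholder outer query; it stands in for the generated SCD logic.
    return (
        f"WITH source AS ({inner}) "
        "SELECT * FROM source "
        "SETTINGS join_use_nulls = 1"
    )


if __name__ == "__main__":
    print(wrap_with_join_use_nulls("SELECT a.id, b.val FROM a LEFT JOIN b USING (id)", 0))
```

When the server already runs with join_use_nulls = 1, the inner query is left untouched because the outer setting matches what the user would get anyway.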

Connection settings

  • Following dbt-clickhouse (maintained by ClickHouse), we pass these settings to the connection:
    • Always
      • mutations_sync = "2"
      • insert_distributed_sync = "1"
    • When running in cluster or cloud modes
      • database_replicated_enforce_synchronous_settings = "1"
      • insert_quorum = "auto"
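
The settings listed above could be assembled roughly as follows. This is a sketch, not the adapter's code: the function name `build_connection_settings` and its parameters are hypothetical, and letting user-supplied settings override the defaults is an assumption that the PR description does not confirm.

```python
from typing import Dict, Optional

# Settings sent on every connection (values taken from the list above).
ALWAYS_SETTINGS: Dict[str, str] = {
    "mutations_sync": "2",
    "insert_distributed_sync": "1",
}

# Extra settings sent only in cluster or cloud modes.
CLUSTER_OR_CLOUD_SETTINGS: Dict[str, str] = {
    "database_replicated_enforce_synchronous_settings": "1",
    "insert_quorum": "auto",
}


def build_connection_settings(
    cluster_or_cloud: bool,
    user_settings: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Merge default and user-provided settings into one dict (hypothetical helper)."""
    settings = dict(ALWAYS_SETTINGS)
    if cluster_or_cloud:
        settings.update(CLUSTER_OR_CLOUD_SETTINGS)
    # Assumption: arbitrary user settings are applied last and win on conflict.
    settings.update(user_settings or {})
    return settings


if __name__ == "__main__":
    print(build_connection_settings(cluster_or_cloud=True))
```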

Storage format

  • Each CH table must have a "table engine" (MergeTree by default)
  • Users specify a table engine in the MODEL DDL storage_format key
  • The specification may be a function call, so we now generate the value's SQL if the value is an Expression other than a Literal or Identifier.
    • Literal and Identifier are excluded because their generated SQL may include quotes, and we want to defer normalization until later.
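
The branching described above can be illustrated with a self-contained sketch. The three small classes are hypothetical stand-ins for the corresponding expression types, not the real ones, and `storage_format_text` is an invented name; the point is only that function-call values are rendered to SQL while literals and identifiers pass through as raw, unquoted text.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Literal:
    value: str

    def sql(self) -> str:
        # Generated SQL for a string literal carries quotes.
        return f"'{self.value}'"


@dataclass
class Identifier:
    name: str

    def sql(self) -> str:
        # Generated SQL for an identifier may carry quotes too.
        return f'"{self.name}"'


@dataclass
class FunctionCall:
    name: str
    args: List[str] = field(default_factory=list)

    def sql(self) -> str:
        return f"{self.name}({', '.join(self.args)})"


def storage_format_text(value: Union[Literal, Identifier, FunctionCall]) -> str:
    """Return the table engine text for a storage_format value (hypothetical helper)."""
    if isinstance(value, Literal):
        return value.value  # raw text, no quotes; normalization happens later
    if isinstance(value, Identifier):
        return value.name  # raw text, no quotes; normalization happens later
    return value.sql()  # e.g. a function call with arguments


if __name__ == "__main__":
    print(storage_format_text(Identifier("MergeTree")))
    print(storage_format_text(FunctionCall("ReplacingMergeTree", ["updated_at"])))
```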

@treysp treysp requested a review from a team September 12, 2024 21:46

erindru commented Sep 13, 2024

Awesome work! I bet you learned way more than you wanted to about the Clickhouse internals

@treysp treysp force-pushed the trey/improve-ch-adapter branch from 03be872 to 85da392 Compare September 13, 2024 14:33

@georgesittas georgesittas left a comment

Great work @treysp!

@treysp treysp force-pushed the trey/improve-ch-adapter branch from 85da392 to 5e28b8a Compare September 16, 2024 14:52
@treysp treysp force-pushed the trey/improve-ch-adapter branch 4 times, most recently from 626240e to 8f1636d Compare September 17, 2024 19:47
@treysp treysp force-pushed the trey/improve-ch-adapter branch 2 times, most recently from 6d9243f to dfa9b97 Compare September 18, 2024 14:53
@treysp treysp force-pushed the trey/improve-ch-adapter branch from dfa9b97 to 3ee23cd Compare September 18, 2024 21:42
@treysp treysp merged commit fbf941b into main Sep 18, 2024
@treysp treysp deleted the trey/improve-ch-adapter branch September 18, 2024 22:15
@treysp treysp changed the title Feat!: finalize clickhouse engine adapter Feat: finalize clickhouse engine adapter Sep 18, 2024