
Integrate catalog schema validation into planner (WIP) #15711

Closed
wants to merge 7 commits

Conversation

zachjsh
Contributor

@zachjsh zachjsh commented Jan 17, 2024

Description

This PR contains a portion of the changes from the inactive draft PR #13686 by @paul-rogers, which integrates the catalog with the Calcite planner. With these changes, datasource table schemas defined in the catalog are validated against during SQL-based ingestion, so that ingesting into a table with a declared schema produces segments that conform to that schema. If partitioning and clustering are not specified at ingestion time, the defaults defined for the table in the catalog, if any, are used.

TODO: add more tests.
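To illustrate the validation described above, here is a minimal, hypothetical sketch of checking ingest columns against a declared catalog schema. The names (`CatalogValidationSketch`, `validateIngest`) are invented for illustration and are not Druid's actual API; the real implementation operates on Calcite row types inside the planner.

```java
import java.util.Map;

// Hypothetical sketch: reject an ingest whose column types conflict with the
// table schema declared in the catalog. Columns not declared in the catalog
// are allowed through unchanged in this sketch.
class CatalogValidationSketch
{
  // declaredSchema / ingestColumns: column name -> SQL type name.
  static void validateIngest(Map<String, String> declaredSchema,
                             Map<String, String> ingestColumns)
  {
    for (Map.Entry<String, String> col : ingestColumns.entrySet()) {
      String declaredType = declaredSchema.get(col.getKey());
      if (declaredType != null && !declaredType.equals(col.getValue())) {
        throw new IllegalArgumentException(
            "Column [" + col.getKey() + "] has type [" + col.getValue()
            + "] but the catalog declares [" + declaredType + "]");
      }
    }
  }
}
```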

Release note


Key changed/added classes in this PR
  • MyFoo
  • OurBar
  • TheirBaz

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@zachjsh zachjsh requested a review from jon-wei January 17, 2024 21:47
@github-actions github-actions bot added Area - Batch Ingestion Area - Querying Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 labels Jan 17, 2024
* Instead return what would have been sent to the execution engine.
* The result is a Jackson-serializable query plan.
*/
default Object explain(DruidQuery druidQuery)

Check notice (Code scanning / CodeQL): Useless parameter. The parameter 'druidQuery' is never used.
@@ -94,6 +94,7 @@
QueryMaker buildQueryMakerForInsert(
String targetDataSource,
RelRoot relRoot,
PlannerContext plannerContext
PlannerContext plannerContext,
RelDataType targetType

Check notice (Code scanning / CodeQL): Useless parameter. The parameter 'targetType' is never used.
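The diff above threads a target row type into the insert planner. A minimal sketch of why that is useful: with the declared target type available, the planner can detect source columns that the target table does not declare. The names here (`TargetTypeSketch`, `missingInTarget`) are hypothetical, not Druid's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: given the target table's declared columns and the
// columns produced by the source query, report source columns absent from
// the target. A real planner would instead compare Calcite RelDataTypes.
class TargetTypeSketch
{
  static List<String> missingInTarget(Map<String, String> targetType,
                                      List<String> sourceColumns)
  {
    List<String> missing = new ArrayList<>();
    for (String c : sourceColumns) {
      if (!targetType.containsKey(c)) {
        missing.add(c);
      }
    }
    return missing;
  }
}
```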
}
{
(
<HOUR>
{
granularity = Granularities.HOUR;
unparseString = "HOUR";
result = SqlLiteral.createCharString(DruidSqlParserUtils.HOUR_GRAIN, getPos());
Member:

I wonder if it would be possible to use SqlLiteral.createSymbol here instead; that could remove the need for the string based matching as well...

Contributor (Author):

using SqlLiteral.createSymbol as you suggested
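The reviewer's suggestion is that Calcite's `SqlLiteral.createSymbol` wraps an enum constant, which avoids matching on strings like `"HOUR"`. A self-contained sketch of the idea, using a plain Java enum (the `Grain` enum and method names here are illustrative, not Druid's or Calcite's types):

```java
// Sketch: symbol (enum) based matching vs. string based matching.
// With an enum, the compiler catches misspelled or unhandled values;
// with strings, a typo only fails at runtime.
class GranularitySymbolSketch
{
  enum Grain { HOUR, DAY, ALL }

  // String-based lookup: throws IllegalArgumentException on a typo.
  static Grain fromString(String s)
  {
    return Grain.valueOf(s);
  }

  // Symbol-based unparse: every Grain maps to its SQL spelling.
  static String unparse(Grain g)
  {
    switch (g) {
      case HOUR: return "HOUR";
      case DAY:  return "DAY";
      default:   return "ALL TIME";
    }
  }
}
```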

// Add the necessary indirection. The type factory used here
// is the Druid one, since the per-query one is not yet available
// here, nor are built-in functions associated with per-query types.
this.operatorTable = new ChainedSqlOperatorTable(
Member:

I wonder if this is new functionality - could it be in a separate PR?

Contributor (Author):

Thanks! Removed this.

@@ -58,7 +58,7 @@ public void testUnparseReplaceAll() throws ParseException
+ "OVERWRITE ALL\n"
+ "SELECT *\n"
+ " FROM \"foo\"\n"
+ "PARTITIONED BY ALL TIME "
+ "PARTITIONED BY 'ALL TIME' "
Member:

Is the result of the unparse valid?

Contributor (Author):

Changed back.

github-actions bot commented Apr 3, 2024

This pull request has been marked as stale due to 60 days of inactivity.
It will be closed in 4 weeks if no further activity occurs. If you think
that's incorrect or this pull request should instead be reviewed, please simply
write any comment. Even if closed, you can still revive the PR at any time or
discuss it on the dev@druid.apache.org list.
Thank you for your contributions.

@github-actions github-actions bot added the stale label Apr 3, 2024
github-actions bot commented May 1, 2024

This pull request/issue has been closed due to lack of activity. If you think that
is incorrect, or the pull request requires review, you can revive the PR at any time.

@github-actions github-actions bot closed this May 1, 2024
Labels
Area - Batch Ingestion Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 Area - Querying stale

2 participants