[FLINK-13195] Add create table support for SqlClient #9068

danny0405 · 2019-07-10T11:50:58Z

What is the purpose of the change

Support create table for SqlClient

Brief change log

Add create table command support for SqlClient
Always cache the current catalog in the original session, so that the tables of the current catalog can still be reused after a variable is set

Verifying this change

See tests in LocalExecutorITCase

Does this pull request potentially affect one of the following parts:

Dependencies (does it add or upgrade a dependency): no
The public API, i.e., is any changed class annotated with @Public(Evolving): no
The serializers: no
The runtime per-record code paths (performance sensitive): no
Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no
The S3 file system connector: no

Documentation

Does this pull request introduce a new feature? yes
If yes, how is the feature documented? not documented

flinkbot · 2019-07-10T11:52:39Z

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit 829e8ae (Tue Aug 06 15:40:36 UTC 2019)

Warnings:

No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

Review Progress

❓ 1. The [description] looks good.
❓ 2. There is [consensus] that the contribution should go into Flink.
❓ 3. Needs [attention] from.
❓ 4. The change fits into the overall [architecture].
❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.

Details

The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands

The @flinkbot bot supports the following commands:

@flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
@flinkbot approve all to approve all aspects
@flinkbot approve-until architecture to approve everything until architecture
@flinkbot attention @username1 [@username2 ..] to require somebody's attention
@flinkbot disapprove architecture to remove an approval you gave earlier

twalthr

-1 for this PR. I would like to propose a different architecture.

How about the following flow:

CLI sends SQL (any SQL not just DDL) to Executor
Executor needs to distinguish between immediately executable statements (such as storing tables in a catalog) and statements that just enrich the session context.
The updated session context could be sent back to the client.
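The flow above could be sketched roughly as follows. This is only an illustration under assumptions: `StatelessExecutorSketch`, `Response`, the `SET key=value` syntax and the prefix-based statement detection are all hypothetical and are not part of the SQL Client code in this PR. The session context is modelled as a plain string map that travels with every call.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch; none of these types are actual Flink SQL Client classes. */
final class StatelessExecutorSketch {

    /** Result of one call: a message plus the (possibly updated) session context. */
    static final class Response {
        final String message;
        final Map<String, String> context;

        Response(String message, Map<String, String> context) {
            this.message = message;
            this.context = Collections.unmodifiableMap(context);
        }
    }

    /** Stand-in for a persistent catalog that lives on the executor side. */
    private final Map<String, String> catalog = new LinkedHashMap<>();

    /** Receives the client's context and any SQL, returns the updated context. */
    Response execute(Map<String, String> context, String sql) {
        String trimmed = sql.trim();
        String upper = trimmed.toUpperCase(Locale.ROOT);
        // Copy so that the caller's map is never mutated.
        Map<String, String> updated = new LinkedHashMap<>(context);

        if (upper.startsWith("SET ")) {
            // Only enriches the session context (assumed "SET key=value" form).
            String[] kv = trimmed.substring(4).split("=", 2);
            if (kv.length != 2) {
                throw new IllegalArgumentException("Expected SET key=value: " + sql);
            }
            updated.put(kv[0].trim(), kv[1].trim());
            return new Response("context updated", updated);
        }
        if (upper.startsWith("CREATE TABLE ")) {
            // Immediately executable: stored in the catalog, context unchanged.
            String rest = trimmed.substring("CREATE TABLE ".length());
            String name = rest.split("[\\s(]", 2)[0];
            catalog.put(name, trimmed);
            return new Response("table " + name + " created", updated);
        }
        // Everything else would be submitted as a query in a real executor.
        return new Response("submitted: " + trimmed, updated);
    }

    boolean hasTable(String name) {
        return catalog.containsKey(name);
    }
}
```

In this sketch the client keeps the returned `context` and sends it along with the next statement, so the executor holds no per-session state of its own.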

This is just one idea. In any case we need to get the separation between CLI (client) and executor (gateway/server) right. This PR mixes responsibilities.

We can also discuss that privately first.

twalthr · 2019-07-10T13:10:35Z

...ble/flink-sql-client/src/main/java/org/apache/flink/table/client/gateway/SessionContext.java


 	private final Map<String, ViewEntry> views;

+	private volatile Catalog currentCatalog;


A catalog should not be part of a session context. You need to imagine the session context like a JSON map that is serialized between CLI and Executor. Context + SQL could be sent to a stateless server.

twalthr · 2019-07-10T13:38:13Z

flink-table/flink-sql-client/src/test/resources/test-sql-client-ddl-table.yaml

+
+#==============================================================================
+# TEST ENVIRONMENT FILE
+# General purpose default environment file.


If you copy code, make sure to also update it accordingly. This is not a "general purpose" file anymore but a file for a specific test. Also remove all unrelated content to see what you are actually testing.

flinkbot · 2019-07-10T20:26:32Z

CI report for commit 0033ba7: FAILURE Build

xuefuz · 2019-07-11T00:38:24Z

I appreciate Timo's vision of a client/server model, even though we only have local execution at the moment. However, I'm a little curious about the goal of a stateless gateway with state being passed back and forth. Some state fits this model easily, while other state does not, such as temporary tables that the user creates and then references in subsequent queries. This is just one thing that is currently maintained in a table env. Without a live table env instance maintained for a remote client, it's hard to maintain that state via a session context.

To me, a stateless gateway seems to make a lot of sense for a largely stateless client, like one submitting a streaming job. The scenario for batch can be quite different.

Personally, I am in favor of having some DDL support now and leaving re-architecting for the next release.

danny0405 · 2019-07-11T02:47:48Z

@twalthr How about we cache a catalogName -> DDLs mapping in the SessionContext, just like we cache the ViewEntry? We could then recover the tables from these DDLs every time we switch to a new session.
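A minimal sketch of what such a cache might look like, purely for illustration. `DdlCachingSessionContext` and its methods are hypothetical names, not the real `SessionContext` API, and the "executor" that replays the DDLs is just a callback here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

/** Hypothetical sketch of the proposed cache; not the real SessionContext. */
final class DdlCachingSessionContext {

    /** catalogName -> DDL strings, kept in the order they were issued. */
    private final Map<String, List<String>> ddlsByCatalog = new LinkedHashMap<>();

    /** Remembers a DDL string under the catalog it was issued against. */
    void addDdl(String catalogName, String ddl) {
        ddlsByCatalog.computeIfAbsent(catalogName, k -> new ArrayList<>()).add(ddl);
    }

    /** Returns a read-only view of the DDLs cached for one catalog. */
    List<String> getDdls(String catalogName) {
        return Collections.unmodifiableList(
                ddlsByCatalog.getOrDefault(catalogName, Collections.emptyList()));
    }

    /** Replays every cached DDL, e.g. when a new session is created. */
    void replay(BiConsumer<String, String> ddlExecutor) {
        ddlsByCatalog.forEach(
                (catalog, ddls) -> ddls.forEach(ddl -> ddlExecutor.accept(catalog, ddl)));
    }
}
```

Because only strings are stored, a context like this stays serializable, at the cost of re-executing each DDL on every replay.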

flinkbot · 2019-07-11T03:30:42Z

CI report for commit 829e8ae: FAILURE Build

twalthr · 2019-07-11T08:18:17Z

Tables registered in catalogs should have been persisted. So we don't need to memorize them. It is true that SessionContext has some cache for views, but if you look into the implementation, a view is just a string that comes directly from the CLI. It is not evaluated before submitting an entire query.

I agree that we should not re-register tables in a catalog. A CREATE TABLE statement should be executed immediately and be persistent in the catalog. Have we ever thought about properly distinguishing between temporary tables (for the session) and persistent tables (across sessions)? This would make the implementation in the SQL Client much easier:

CREATE TEMPORARY TABLE (...) WITH (..) // buffered in the CLI session and re-registered when a query is executed

CREATE TABLE (...) WITH (...) // executed immediately and applied to the catalog

This is one idea of fixing it properly.
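To make the temporary/persistent distinction concrete, here is a rough sketch under stated assumptions. `TableDdlRouter` is a hypothetical name, the regular expressions are a simplification rather than a real SQL parser, and the "catalog" is only a list standing in for a persistent store.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical sketch; a real implementation would use the SQL parser. */
final class TableDdlRouter {

    private static final Pattern TEMPORARY =
            Pattern.compile("^\\s*CREATE\\s+TEMPORARY\\s+TABLE\\b", Pattern.CASE_INSENSITIVE);
    private static final Pattern PERSISTENT =
            Pattern.compile("^\\s*CREATE\\s+TABLE\\b", Pattern.CASE_INSENSITIVE);

    /** Temporary DDLs buffered in the CLI session. */
    private final List<String> bufferedTemporaryDdls = new ArrayList<>();
    /** Stand-in for DDLs that were executed immediately against the catalog. */
    private final List<String> catalogDdls = new ArrayList<>();

    /** Returns true if the statement was a CREATE [TEMPORARY] TABLE and was handled. */
    boolean handle(String sql) {
        if (TEMPORARY.matcher(sql).find()) {
            bufferedTemporaryDdls.add(sql); // only remembered, not executed yet
            return true;
        }
        if (PERSISTENT.matcher(sql).find()) {
            catalogDdls.add(sql); // "executed immediately" in this sketch
            return true;
        }
        return false;
    }

    /** Statements to submit for a query: buffered temporary DDLs first, then the query. */
    List<String> planQuery(String query) {
        List<String> plan = new ArrayList<>(bufferedTemporaryDdls);
        plan.add(query);
        return plan;
    }

    List<String> persisted() {
        return Collections.unmodifiableList(catalogDdls);
    }
}
```

With this split, only temporary tables need to be re-registered per query; persistent tables are assumed to live in the catalog and are never replayed.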

xuefuz · 2019-07-11T18:19:30Z

@twalthr Your example of temporary table usage might be simple enough to make client-side caching workable. However, that's usually not the case. For instance, a user might run
CREATE TEMPORARY TABLE (...) WITH (..) AS SELECT ... FROM X JOIN Y ON .... WHERE ...
It's not efficient to cache this on the client side and execute it every time the temp table is referenced.

The essence of the problem is that a SQL user session has complicated state that cannot easily be managed by a client. Thus, the idea of a stateless gateway, while solving some use cases (like submitting a streaming job), is difficult to apply to batch cases where session state is large and complicated.

flinkbot · 2019-08-06T15:50:08Z

CI report:

829e8ae : FAILURE Build

rmetzger added the review=description? label Jul 10, 2019

[FLINK-13195] Add create table support for SqlClient

b66c9b4

danny0405 force-pushed the sql-cli branch from ce6375a to b66c9b4 Compare July 10, 2019 11:53

danny0405 added 2 commits July 10, 2019 19:56

fix comments

0033ba7

fix complie error

829e8ae

rmetzger added the component=TableSQL/Client label Jul 10, 2019

twalthr requested changes Jul 10, 2019

JingsongLi mentioned this pull request Oct 25, 2019

[FLINK-13195][sql-client] Add create table support for SqlClient #9981

Closed

danny0405 closed this Dec 6, 2019


5 participants

