[FLINK-13195] Add create table support for SqlClient #9068
|
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
Automated Checks
Last check on commit 829e8ae (Tue Aug 06 15:40:36 UTC 2019)
Warnings:
Mention the bot in a comment to re-run the automated checks.
Review Progress
Please see the Pull Request Review Guide for a full explanation of the review process.
Details
The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.
Bot commands
The @flinkbot bot supports the following commands:
|
twalthr
left a comment
-1 for this PR. I would like to propose a different architecture.
How about the following flow:
- CLI sends SQL (any SQL not just DDL) to Executor
- Executor needs to distinguish between immediately executable statements (such as storing tables in a catalog) and statements that just enrich the session context.
- The updated session context could be sent back to the client.
This is just one idea. In any case we need to get the separation between CLI (client) and executor (gateway/server) right. This PR mixes responsibilities.
We can also discuss this privately first.
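To make the proposed flow concrete, here is a minimal sketch in plain Java. Everything in it is hypothetical: `GatewaySketch`, `SessionContext`, `Response`, and `execute` are illustrative names, not Flink APIs, and treating `SET key=value` as the only context-enriching statement is an assumption made for brevity. The point is only that the executor holds no state between calls: it receives context plus SQL and returns the (possibly updated) context with the result.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of the CLI/Executor split; none of these types exist in Flink.
final class GatewaySketch {

    /** Immutable session context: a flat map the CLI keeps and resends with every request. */
    static final class SessionContext {
        final Map<String, String> entries;

        SessionContext(Map<String, String> entries) {
            this.entries = Collections.unmodifiableMap(new LinkedHashMap<>(entries));
        }

        /** Returns a copy with one more entry; the receiver is left untouched. */
        SessionContext with(String key, String value) {
            Map<String, String> copy = new LinkedHashMap<>(entries);
            copy.put(key, value);
            return new SessionContext(copy);
        }
    }

    /** What the executor sends back: the context to use next, plus a result message. */
    static final class Response {
        final SessionContext context;
        final String result;

        Response(SessionContext context, String result) {
            this.context = context;
            this.result = result;
        }
    }

    /** Stateless entry point: everything the executor needs arrives with the request. */
    static Response execute(SessionContext context, String sql) {
        String statement = sql.trim();
        // Assumption: SET is the only statement that merely enriches the session context.
        if (statement.toUpperCase(Locale.ROOT).startsWith("SET ")) {
            String[] keyValue = statement.substring(4).split("=", 2);
            if (keyValue.length != 2) {
                throw new IllegalArgumentException("Expected SET key=value, got: " + sql);
            }
            return new Response(context.with(keyValue[0].trim(), keyValue[1].trim()), "context updated");
        }
        // Everything else is treated as immediately executable (e.g. storing a table
        // in a catalog). A real executor would run it; here we only echo it.
        return new Response(context, "executed: " + statement);
    }

    public static void main(String[] args) {
        SessionContext empty = new SessionContext(new LinkedHashMap<>());
        Response afterSet = execute(empty, "SET execution.type=batch");
        Response afterDdl = execute(afterSet.context, "CREATE TABLE t (a INT)");
        System.out.println(afterDdl.context.entries + " / " + afterDdl.result);
    }
}
```

The CLI would hold on to `Response.context` and pass it along with the next statement, which is what keeps the server side free of per-session state.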
private final Map<String, ViewEntry> views;
private volatile Catalog currentCatalog;
A catalog should not be part of a session context. You need to imagine the session context as a JSON map that is serialized between CLI and Executor. Context + SQL could be sent to a stateless server.
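As a rough illustration of the "JSON map" idea, the sketch below serializes a flat string-to-string context by hand, using only the JDK. `SessionContextJson` and the key names in the usage example are hypothetical, and a real implementation would more likely use a JSON library. A live object such as a `Catalog` instance has no place in a map like this, whereas a view definition (name plus query text) does.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Flink: renders a flat session context as a JSON object.
final class SessionContextJson {

    /** Serializes non-null keys and values; keys are sorted so the output is deterministic. */
    static String toJson(Map<String, String> context) {
        StringBuilder json = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> entry : new TreeMap<>(context).entrySet()) {
            if (!first) {
                json.append(',');
            }
            first = false;
            json.append(quote(entry.getKey())).append(':').append(quote(entry.getValue()));
        }
        return json.append('}').toString();
    }

    /** Wraps a string in double quotes, escaping quotes, backslashes, and control characters. */
    private static String quote(String value) {
        StringBuilder quoted = new StringBuilder("\"");
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '"' || c == '\\') {
                quoted.append('\\').append(c);
            } else if (c < 0x20) {
                quoted.append(String.format("\\u%04x", (int) c));
            } else {
                quoted.append(c);
            }
        }
        return quoted.append('"').toString();
    }

    public static void main(String[] args) {
        Map<String, String> context = new LinkedHashMap<>();
        context.put("views.my_view", "SELECT * FROM t"); // illustrative key name
        context.put("execution.type", "batch");          // illustrative key name
        System.out.println(toJson(context));
    }
}
```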
#==============================================================================
# TEST ENVIRONMENT FILE
# General purpose default environment file.
If you copy code, make sure to also update it accordingly. This is not a "general purpose" file anymore but a file for a specific test. Also remove all unrelated content to see what you are actually testing.
|
I appreciate Timo's vision of a client/server model, even though we only have local execution at the moment. However, I'm a little curious about the goal of a stateless gateway with state being passed back and forth. Some state fits this model easily, while other state does not, such as temporary tables that the user creates and then references in subsequent queries. This is just one of the things currently maintained in a table env. Without a live table env instance maintained for a remote client, it's hard to maintain that state via a session context. To me, a stateless gateway makes a lot of sense for a largely stateless client, like one submitting a streaming job. The scenario for batch can be quite different. Personally, I am in favor of having some DDL support now and leaving the re-architecting for the next release.
|
@twalthr How about we cache a |
|
Tables registered in catalogs should have been persisted. So we don't need to memorize them. It is true that I agree that we should not re-register tables in a catalog. A This is one idea of fixing it properly. |
|
@twalthr Your example of temporary table usage might be simple enough to make client-side caching workable. However, that's usually not the case. For instance, user might do The essence of the problem is that a SQL user session has a complicated state that cannot be easily managed by a client. Thus, the idea of a stateless gateway, while solving some use cases (like submitting a streaming job), is difficult to apply to batch cases, where the session state is large and complicated.
What is the purpose of the change
Support create table for SqlClient
Brief change log
Verifying this change
See tests in LocalExecutorITCase
Does this pull request potentially affect one of the following parts:
@Public(Evolving): no
Documentation