[FLINK-31956][table] Extend the CompiledPlan to read from/write to Fl… #22539
Conversation
LadyForest left a comment
Hi @xishuaidelin, thanks for your contribution.
I have briefly reviewed the code; my comments may be inaccurate, but I think the interface should remain the same. What do you think?
flink-table/flink-table-api-java/src/main/java/org/apache/flink/table/api/CompiledPlan.java
...table/flink-table-api-java/src/main/java/org/apache/flink/table/delegation/InternalPlan.java
@flinkbot run azure
Hi @xishuaidelin, thanks for your update.
I think we may reuse org.apache.flink.table.resource.ResourceManager to take over the URI check and the resource download/cleanup. You can take the implementation of the CREATE FUNCTION ... USING JAR statement as a reference.
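For illustration, here is a minimal sketch of the URI check that such delegation could centralize. The class and method names below are assumptions for the sketch, not the actual ResourceManager API:

```java
import java.net.URI;

public class PlanPathCheck {
    // Hypothetical helper: decide whether a compiled-plan path needs a remote
    // download before it can be read locally. A real implementation would
    // delegate this to ResourceManager, as suggested above.
    static boolean needsDownload(String path) {
        String scheme = URI.create(path).getScheme();
        // No scheme or "file" means a local path; any other scheme is remote.
        return scheme != null && !scheme.equals("file");
    }

    public static void main(String[] args) {
        System.out.println(needsDownload("hdfs://namenode:9000/plans/plan.json"));
        System.out.println(needsDownload("file:///tmp/plan.json"));
        System.out.println(needsDownload("/tmp/plan.json"));
    }
}
```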
...k-table-api-java/src/main/java/org/apache/flink/table/api/internal/TableEnvironmentImpl.java
@flinkbot run azure
LadyForest left a comment
...ble-planner/src/main/java/org/apache/flink/table/planner/plan/ExecNodeGraphInternalPlan.java
                filePath, TableConfigOptions.PLAN_FORCE_RECOMPILE.key()));
    }
}
revert this back
if (ifNotExists) {
    return loadPlan(PlanReference.fromFile(filePath));
if (fs.isDistributedFS()) {
    URL localUrl = resourceManager.downloadResource(filePath);
It's a little unnatural to do so. Can we wrap up this logic?
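One way to wrap it up, sketched with stand-ins. `resolveToLocal` and the copy step are hypothetical; the real code would go through ResourceManager and Flink's FileSystem:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class PlanFileResolver {
    // Sketch: hide the distributed-FS branch inside one method so the caller
    // always receives a readable local path.
    static Path resolveToLocal(Path planPath, Path cacheDir, boolean isDistributedFs)
            throws IOException {
        if (!isDistributedFs) {
            return planPath;
        }
        // Stand-in for resourceManager.downloadResource(filePath).
        Path local = cacheDir.resolve(planPath.getFileName());
        Files.copy(planPath, local, StandardCopyOption.REPLACE_EXISTING);
        return local;
    }

    public static void main(String[] args) throws IOException {
        Path plan = Files.createTempFile("plan", ".json");
        Path cache = Files.createTempDirectory("cache");
        System.out.println(resolveToLocal(plan, cache, false).equals(plan));
        System.out.println(resolveToLocal(plan, cache, true).startsWith(cache));
    }
}
```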
By the way, you can evaluate the param. Nit: you can use MinIO to mock S3.
compiledPlan.writeToFile(file, false);
compiledPlan.writeToFile(localPath, false);
resourceManager.updateFilePath(filePath);
Correct me if I'm wrong, but I think we should not, and probably cannot, write files to a remote FileSystem.
Although users may have configured the related settings in flink-conf.yaml, such as accessKey and accessSecret, having permission to read files under a bucket does not imply having the same write permission.
If users find it necessary, they can upload the files manually. Implementing this through the framework may have many limitations.
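A defensive check in that spirit could look like the sketch below. The scheme-based test is an assumption; real code would consult Flink's FileSystem abstraction:

```java
import java.net.URI;

public class RemoteWriteGuard {
    // Sketch: reject plan writes that target a remote filesystem, since read
    // permission on a bucket does not imply write permission.
    static void checkWritable(String path) {
        String scheme = URI.create(path).getScheme();
        if (scheme != null && !scheme.equals("file")) {
            throw new UnsupportedOperationException(
                    "Cannot write the compiled plan to '" + path
                            + "'; please upload the file manually.");
        }
    }

    public static void main(String[] args) {
        checkWritable("/tmp/plan.json"); // local path: accepted silently
        try {
            checkWritable("s3://bucket/plan.json");
        } catch (UnsupportedOperationException e) {
            System.out.println("rejected remote write");
        }
    }
}
```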
}
/**
 * register the filePath of flink filesystem. If it is remote filesystem and the file exists
Nit: JavaDoc comments should start with a capitalized sentence. You can take https://www.oracle.com/technical-resources/articles/java/javadoc-tool.html as a reference.
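For example, the comment quoted in the diff above could be rewritten along these lines (the wording is only a suggestion):

```java
public class JavadocStyle {
    /**
     * Registers the given file path with the Flink filesystem. If the path
     * points to a remote filesystem and the file already exists, the file is
     * downloaded first.
     *
     * <p>Note the capitalized, complete first sentence.
     */
    static void registerFilePath(String filePath) {
        System.out.println("registered " + filePath);
    }

    public static void main(String[] args) {
        registerFilePath("hdfs://ns/plan.json");
    }
}
```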
...able/flink-table-api-java/src/main/java/org/apache/flink/table/resource/ResourceManager.java
LadyForest left a comment
Thank you for your update. However, the code quality is not yet at a level where it can be merged. As time is tight, I will take over your work and continue to make improvements. Can we review the changes together at a later time?
sql = String.format(
    "COMPILE and EXECUTE plan '%s' FOR INSERT INTO MySink SELECT * FROM MyTable",
    path)
tableEnv.executeSql(sql)
What's the purpose of this test? It does not assert anything.
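For instance, after executing the statement the test could collect the sink rows and compare them to the expected data. A generic sketch of that assert step (the row strings and the helper are illustrative, not Flink's test API):

```java
import java.util.List;

public class AssertSinkRows {
    // Sketch: a test that executes a plan should also verify its effect.
    static void assertRows(List<String> actual, List<String> expected) {
        if (!actual.equals(expected)) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
    }

    public static void main(String[] args) {
        // Stand-in for rows collected from the sink after executeSql.
        List<String> collected = List.of("+I[1, a]", "+I[2, b]");
        assertRows(collected, List.of("+I[1, a]", "+I[2, b]"));
        System.out.println("assertion passed");
    }
}
```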
val execEnv = StreamExecutionEnvironment.getExecutionEnvironment
execEnv.setParallelism(1)
val settings = EnvironmentSettings.newInstance().inStreamingMode().build()
val tableEnv = StreamTableEnvironment.create(execEnv, settings)
What's the purpose of re-creating a tableEnv?
        "Child operation of CompileAndExecuteOperation must be either a "
                + "ModifyOperation or a StatementSetOperation.");
this.filePath = filePath;
this.filePath = new Path(filePath);
Perhaps we don't need to change this.
public class ExecutePlanOperation implements Operation {

    private final String filePath;
    private final Path filePath;
ditto
operation instanceof StatementSetOperation || operation instanceof ModifyOperation,
"child operation of CompileOperation must be either a ModifyOperation or a StatementSetOperation");
this.filePath = filePath;
this.filePath = new Path(filePath);
ditto
public void runSQL(String sqlPath, Map<String, String> varsMap) throws Exception {
    try (ClusterController clusterController = flink.startCluster(1)) {
        List<String> sqlLines = initializeSqlLines(sqlPath, varsMap);
        executeSqlStatements(clusterController, sqlLines);
Why duplicate the method and assert nothing?
import java.util.Map;

/** End-to-End tests for compile and execute remote file. */
public class CompileAndExecuteRemoteFileITCase extends SqlITCaseBase {
This test shares the same HDFS configuration with UsingRemoteJarITCase. Maybe they deserve a common base class.
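A possible shape for such a base class, sketched with stand-ins (the class and field names are hypothetical, and the MiniDFSCluster setup is reduced to a placeholder):

```java
public class HdfsBaseSketch {
    // Sketch: shared MiniDFSCluster setup extracted into an abstract base
    // class that both ITCases could inherit.
    abstract static class HdfsITCaseBase {
        protected String hdfsUri;

        protected void startHdfs() {
            // Stand-in for building a MiniDFSCluster and reading its URI.
            hdfsUri = "hdfs://localhost:9000";
        }
    }

    static class CompileAndExecuteRemoteFileITCase extends HdfsITCaseBase {}

    static class UsingRemoteJarITCase extends HdfsITCaseBase {}

    public static void main(String[] args) {
        CompileAndExecuteRemoteFileITCase t = new CompileAndExecuteRemoteFileITCase();
        t.startHdfs();
        System.out.println(t.hdfsUri);
    }
}
```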
MiniDFSCluster.Builder builder = new MiniDFSCluster.Builder(hdConf);
hdfsCluster = builder.build();

hdPath = new org.apache.hadoop.fs.Path("/test.json");
If the purpose of declaring this variable is to delete this path in the after method, then it can be turned into a local variable in after.
hdfs = hdPath.getFileSystem(hdConf);

} catch (Throwable e) {
    e.printStackTrace();
I don't think it's a good practice to do so...
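Instead of swallowing the error, the setup could rethrow so the test fails visibly. A minimal sketch (`riskyInit` stands in for the cluster setup above):

```java
public class SetupFailFast {
    // Stand-in for the MiniDFSCluster setup that may throw.
    static void riskyInit() throws Exception {}

    static void setUp() {
        try {
            riskyInit();
        } catch (Exception e) {
            // Rethrow instead of e.printStackTrace(): the test fails loudly
            // rather than continuing with a half-initialized cluster.
            throw new IllegalStateException("HDFS cluster setup failed", e);
        }
    }

    public static void main(String[] args) {
        setUp();
        System.out.println("setup ok");
    }
}
```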
@Test
public void testCompilePlanRemoteFile() throws Exception {
    runSQL("compile_plan_use_remote_file_e2e.sql", generateReplaceVars());
The test does not assert anything...
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
SET execution.runtime-mode = $MODE;
COMPILE PLAN statement is not supported under batch mode.
What is the purpose of the change
Brief change log
Verifying this change
This change can be verified by testSqlExecutePlanPath and testSqlcompilePlanPath in SqlOtherToOperationConverterTest, and by testCompileAndExecutePlanWithFlinkFilesystem in TableEnvironmentTest.
Does this pull request potentially affect one of the following parts:
@Public(Evolving): (yes / no)
Documentation