-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-33211][table] support flink table lineage #24618
base: master
Are you sure you want to change the base?
Conversation
@flinkbot run azure |
34f3667
to
3a01cfd
Compare
@flinkbot run azure |
3a01cfd
to
4820faf
Compare
4820faf
to
16bb67b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@HuangZhenQiu Thanks for the contribution! I left some comments.
Also I found a lot of one-line changes removing blank line in file header. Could you split them to another hotfix commit, or directly revert them as they are not quite necessary?
@@ -34,6 +35,7 @@ | |||
public abstract class PhysicalTransformation<T> extends Transformation<T> { | |||
|
|||
private boolean supportsConcurrentExecutionAttempts = true; | |||
private LineageVertex lineageVertex; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like only source and sink transformations have lineage vertex. What about we only add it to source / sink transformations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense. Added an TransformationWithLineage class for this purpose.
|
||
public TableLineageDatasetImpl(ContextResolvedTable contextResolvedTable) { | ||
this.name = contextResolvedTable.getIdentifier().asSummaryString(); | ||
this.namespace = inferNamespace(contextResolvedTable.getTable()).orElse(""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if the implementation here matches the definition on the interface. From Javadoc of LineageDataset:
flink/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/lineage/LineageDataset.java
Line 32 in 0b2e988
/* Unique name for this dataset's storage, for example, url for jdbc connector and location for lakehouse connector. */ |
Let's take JDBC connector as an example. My assumption is that the namespace should describe the URL of the database, or at least some identifier that can tell difference between difference DB instances. Here the implementation only writes jdbc
as the namespace.
...in/java/org/apache/flink/table/planner/plan/nodes/exec/common/CommonExecTableSourceScan.java
Show resolved
Hide resolved
@@ -90,6 +90,7 @@ protected Transformation<RowData> createConversionTransformationIfNeeded( | |||
final RowType outputType = (RowType) getOutputType(); | |||
final Transformation<RowData> transformation; | |||
final int[] fieldIndexes = computeIndexMapping(true); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is added by mistake I assume
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
@@ -123,6 +124,7 @@ public class StreamGraph implements Pipeline { | |||
private CheckpointStorage checkpointStorage; | |||
private Set<Tuple2<StreamNode, StreamNode>> iterationSourceSinkPairs; | |||
private InternalTimeServiceManager.Provider timerServiceProvider; | |||
private LineageGraph lineageGraph; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only usage of this field is for tests. Is it possible not to introduce it in StreamGraph
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we discussed offline, we will keep it here.
import java.util.Map; | ||
|
||
/** Default implementation for DatasetSchemaFacet. */ | ||
public class TableDataSetSchemaFacet implements DatasetSchemaFacet { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DataSet
-> Dataset
import java.util.Map; | ||
|
||
/** Default implementation for DatasetSchemaFacet. */ | ||
public class TableDataSetSchemaFacet implements DatasetSchemaFacet { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually we don't need facets for table schema and table config. They are already included in TableLineageDataset#table
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. It is just for exposing the data in a structured way. After the implementation, I feel we probably don't need to expose CatalogContext and CatalogBaseTable to users.
e5f8a17
to
5f29a04
Compare
@PatrickRen |
5f29a04
to
375fe2d
Compare
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/lineage/LineageGraph.java
Show resolved
Hide resolved
...ava/src/main/java/org/apache/flink/streaming/api/transformations/OneInputTransformation.java
Show resolved
Hide resolved
...ble/flink-table-planner/src/main/java/org/apache/flink/table/planner/lineage/ModifyType.java
Outdated
Show resolved
Hide resolved
...ble-planner/src/main/java/org/apache/flink/table/planner/lineage/TableColumnLineageEdge.java
Show resolved
Hide resolved
...-table/flink-table-planner/src/test/java/org/apache/flink/connector/source/ValuesSource.java
Show resolved
Hide resolved
375fe2d
to
c8aa9ee
Compare
@davidradl |
47ec379
to
81da89d
Compare
@PatrickRen |
81da89d
to
4fc6696
Compare
What is the purpose of the change
Brief change log
Verifying this change
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: (yes)Documentation