-
Notifications
You must be signed in to change notification settings - Fork 13.8k
[FLINK-25387] Introduce ExecNodeMetadata #18479
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community Automated ChecksLast check on commit d2b8689 (Mon Jan 24 14:37:40 UTC 2022) Warnings:
Mention the bot in a comment to re-run the automated checks. Review Progress
Please see the Pull Request Review Guide for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commandsThe @flinkbot bot supports the following commands:
|
90cd6d8
to
c52f5b9
Compare
FYI, the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a first scan of the PR. I think overall it looks good.
I have one specific comment about the context
field. Why have you opted for mixing together in the serialization the instance identifier and the type identifier (composed by name and version)? For me "type" identity and "instance" identity are very different concepts, and they definitely deserve different fields.
...lanner/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/serde/JsonSerdeUtil.java
Show resolved
Hide resolved
...le-planner/src/main/java/org/apache/flink/table/planner/plan/utils/ExecNodeMetadataUtil.java
Outdated
Show resolved
Hide resolved
...ner/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/ExecNodeTypeIdResolver.java
Show resolved
Hide resolved
...e-planner/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/ExecNodeMetadata.java
Show resolved
Hide resolved
...le-planner/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/ExecNodeContext.java
Outdated
Show resolved
Hide resolved
Because, with the upgrade story, we can have an
|
I may have badly explained myself, but i'm not questioning the name + version tuple. That's ok and I get why we need it. What I'm questioning is that in the same field you add |
Thx for explaining! I see your point. The decision was more towards JSON and code simplification, so that this int id is part of the one liner @twalthr WDYT? |
59073c3
to
2f4e2b8
Compare
I kind of agree with @slinkydeveloper. The FLIP declares the full string for naming the operators However, I don't have a strong opinion here. If it is too complicated implementation wise to make |
Having the On one hand, I think that accessing the I also don't really have a strong opinion, just a slight preference to have everything in the POJO, but please let me know. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @matriv. I added some feedback. My biggest question is why StreamExecMultipleInput
has no annotation?
...table-planner/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/ExecNodeBase.java
Outdated
Show resolved
Hide resolved
...table-planner/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/ExecNodeBase.java
Outdated
Show resolved
Hide resolved
...le-planner/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/ExecNodeContext.java
Outdated
Show resolved
Hide resolved
...le-planner/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/ExecNodeContext.java
Outdated
Show resolved
Hide resolved
...va/org/apache/flink/table/planner/plan/nodes/exec/stream/StreamExecGroupWindowAggregate.java
Outdated
Show resolved
Hide resolved
metadata.add(annotation); | ||
} | ||
|
||
ExecNodeMetadatas annotations = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't we skip the check above? isn't the multi annotation including the single one or vice versa?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the "wrapper" MultipleExecNodeMetadata
we can allow multiple individual ExecNodeMetadata
annotations or a single MultipleExecNodeMetadata
, where the value
is an array of ExecNodeMetadata
, but apparently java allows to have 2 annotations as well, one MultipleExecNodeMetadata
and one ExecNodeMetadata
, so we're checking this as not allowed.
...le-planner/src/main/java/org/apache/flink/table/planner/plan/utils/ExecNodeMetadataUtil.java
Outdated
Show resolved
Hide resolved
...le-planner/src/main/java/org/apache/flink/table/planner/plan/utils/ExecNodeMetadataUtil.java
Outdated
Show resolved
Hide resolved
flink-table/flink-table-planner/src/test/resources/jsonplan/testGetJsonPlan.out
Outdated
Show resolved
Hide resolved
As I said, having the id in Context is fine and also no additional constructor argument is good. I'm just wondering if we can somehow have the id in the JSON plan directly. Can't we add it as a field in the base but fill the field with information from the Context in the base constructor? |
I personally think we should have instance and type id splitted, even if this requires a bit more code on the POJO side. |
e0137cf
to
1c866a4
Compare
...e-planner/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/ExecNodeMetadata.java
Outdated
Show resolved
Hide resolved
@Target(ElementType.TYPE) | ||
@Retention(RetentionPolicy.RUNTIME) | ||
@Repeatable(value = MultipleExecNodeMetadata.class) | ||
@PublicEvolving |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Experimental?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably yes, thx, @twalthr right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather say @Internal
. It should only be used by us.
@Documented | ||
@Target(ElementType.TYPE) | ||
@Retention(RetentionPolicy.RUNTIME) | ||
@PublicEvolving |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Experimental?
158a4df
to
037e444
Compare
@twalthr @slinkydeveloper I think the PR is in a state for a final review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update @matriv. I left my last set of comments. Should be good in the next iteration if @slinkydeveloper agrees?
* into the JSON plan. | ||
*/ | ||
@JsonProperty(value = FIELD_NAME_TYPE, access = JsonProperty.Access.READ_ONLY, index = 1) | ||
public ExecNodeContext getContextFromAnnotation() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can be final
and protected
public final class ExecNodeContext { | ||
|
||
/** This is used to assign a unique ID to every ExecNode. */ | ||
private static Integer idCounter = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AtomicInteger
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also thought of it, but I didn't change it and left it as a plain int, do you think it can be incremented in a concurrent context?
idCounter = 0; | ||
} | ||
|
||
private Integer id; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mark @Nullable
and final
, quickly explain why nullable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not marking here as @Nullable
but int the relevant ctor and added the javadoc there.
|
||
@JsonCreator | ||
public ExecNodeContext(String value) { | ||
String[] split = value.split("_"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Explain the format in the class JavaDoc.
|
||
/** | ||
* Annotation to be used for {@link ExecNode}s to keep necessary metadata when | ||
* serialising/deserializing them in a plan. It's used for internal bookkeeping across Flink |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
serializing
@Target(ElementType.TYPE) | ||
@Retention(RetentionPolicy.RUNTIME) | ||
@Repeatable(value = MultipleExecNodeMetadata.class) | ||
@Experimental |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Internal
only used for our internal bookkeeping. we don't need to provide API stability here.
if (LOOKUP_MAP.containsKey(key)) { | ||
throw new IllegalStateException( | ||
String.format( | ||
"Found duplicate ExecNode: %s. This is a bug, please contact developers.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bug. Please file an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot to remove this, I think it's ok to completely remove the phrase This is a bug....
* Returns the {@link ExecNodeMetadata} annotation of the class with the highest (most recent) | ||
* {@link ExecNodeMetadata#version()}. | ||
*/ | ||
@SuppressWarnings("rawtypes") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are quite a few suppressions. As mentioned in a comment before, I think we can get rid of a couple of them by correct declaration.
@twalthr addressed your comments, Thank you! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update @matriv. LGTM, please ping me when it is ready to be merged. Currently, it seems to be failing (might need another rebase?).
…NodeMetadata - Introduce a new annotation `ExecNodeMetadata` on `ExecNode`s which is used to improve the serialiasation/de-serialiasation to/from JSON plan of `ExecNode`s and facilitate the upgrade of the pipeline, since every ExecNode has now also a version attached. - List all the JSON plan eligible `ExecNode`s in `ExecNodeMetadataUtil` and use a static list to register them in Jackson. - Annotate all those eligible `ExecNode`s with the new annotation and provide a name constructed by the class name using `-` separators. All versions are set now to 1. - Use an `ExecNodeContext` POJO which uses the uniqueId, name and version to serialize/de-serialise them in a JSON plan in the form of `<id>_<exec-node-name>_<version>`. - Fix issues with `@JsonIgnoreProperties` and `@JsonIgnore`, and opt for the class based annotation instead of the per field annotation of `@JsonIgnore`. - Update the test plans with the new JSON scheme derived by the changes.
…NodeMetadata - Introduce a new annotation `ExecNodeMetadata` on `ExecNode`s which is used to improve the serialization/deserialization to/from JSON plan of `ExecNode`s and facilitate the upgrade of the pipeline, since every ExecNode has now also a version attached. - List all the JSON plan eligible `ExecNode`s in `ExecNodeMetadataUtil` and use a static list to register them in Jackson. - Annotate all those eligible `ExecNode`s with the new annotation and provide a name constructed by the class name using `-` separators. All versions are set now to 1. - Use an `ExecNodeContext` POJO which uses the uniqueId, name and version to serialize/deserialize them in a JSON plan in the form of `<id>_<exec-node-name>_<version>`. - Fix issues with `@JsonIgnoreProperties` and `@JsonIgnore`, and opt for the class based annotation instead of the per field annotation of `@JsonIgnore`. - Update the test plans with the new JSON scheme derived by the changes. This closes apache#18479.
…NodeMetadata - Introduce a new annotation `ExecNodeMetadata` on `ExecNode`s which is used to improve the serialization/deserialization to/from JSON plan of `ExecNode`s and facilitate the upgrade of the pipeline, since every ExecNode has now also a version attached. - List all the JSON plan eligible `ExecNode`s in `ExecNodeMetadataUtil` and use a static list to register them in Jackson. - Annotate all those eligible `ExecNode`s with the new annotation and provide a name constructed by the class name using `-` separators. All versions are set now to 1. - Use an `ExecNodeContext` POJO which uses the uniqueId, name and version to serialize/deserialize them in a JSON plan in the form of `<id>_<exec-node-name>_<version>`. - Fix issues with `@JsonIgnoreProperties` and `@JsonIgnore`, and opt for the class based annotation instead of the per field annotation of `@JsonIgnore`. - Update the test plans with the new JSON scheme derived by the changes. This closes apache#18479.
Follows: #d9d72ef142a2343f37b8b10ca6e04dc7f6ca086e of: apache#18479
Follows: #d9d72ef142a2343f37b8b10ca6e04dc7f6ca086e of: apache#18479
Follows: #d9d72ef142a2343f37b8b10ca6e04dc7f6ca086e of: #18479
What is the purpose of the change
Add versioning to the
ExecNode
s to enable upgrading a plan to a next Flink version.Brief change log
Introduce a new annotation
ExecNodeMetadata
onExecNode
s which is used toimprove the serialiasation/de-serialiasation to/from JSON plan of
ExecNode
s andfacilitate the upgrade of the pipeline, since every ExecNode has now also a version
attached.
List all the JSON plan eligible
ExecNode
s inExecNodeMetadataUtil
and use astatic list to register them in Jackson.
Annotate all those eligible
ExecNode
s with the new annotation and provide aname constructed by the class name using
-
separators. All versions are set nowto 1.
Use an
ExecNodeContext
POJO which uses the uniqueId, name and version toserialize/de-serialise them in a JSON plan in the form of
<id>_<exec-node-name>_<version>
.Fix issues with
@JsonIgnoreProperties
and@JsonIgnore
, and opt for the classbased annotation instead of the per field annotation of
@JsonIgnore
.Update the test plans with the new JSON scheme derived by the changes.
Verifying this change
Adapted the test JSON plans with the new scheme.
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: yesDocumentation