Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-11256] Replace the reference of StreamNode object with ID in S… #7403

Closed
wants to merge 3 commits into from

Conversation

sunhaibotb
Copy link
Contributor

What is the purpose of the change

This pull request replaces the reference of StreamNode object with ID in StreamEdge to reduce the sizes of JobGraph and TDD. Those referenced objects including sourceVertex and targetVertex will be written transitively on serializing StreamEdge, but they are not needed in JM and Task. For a large size topology, this will causes JobGraph and TDD to become much larger than that actually need, and more likely to occur rpc timeout when transmitted.

Brief change log

  • Replaces the reference of StreamNode object with ID in StreamEdge
  • Migrates methods getSourceVertex() and getTargetVertex() in StreamEdge to StreamGraph

Verifying this change

This change is already covered by existing tests, such as StreamGraphGeneratorTest, StreamingJobGraphGeneratorTest, etc.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? ( no)
  • If yes, how is the feature documented? (not applicable)

…treamEdge to reduce the sizes of JobGraph and TDD
Copy link
Member

@sunjincheng121 sunjincheng121 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. @sunhaibotb
I only left two suggestions. And please let me know your opinion :)
Bests,
Jincheng

@sunjincheng121
Copy link
Member

sunjincheng121 commented Jan 17, 2019

Hi @sunhaibotb Thanks for the update!
The PR looks good to me.
+1 to be merged.

@sunhaibotb
Copy link
Contributor Author

Thank @sunjincheng121 for the review.

sunjincheng121 pushed a commit to sunjincheng121/flink that referenced this pull request Jan 18, 2019
…bGraph

This closes apache#7403

Add operatorName information to the method toString() of StreamEdge

add comments for sourceOperatorName and targetOperatorName
@sunjincheng121
Copy link
Member

Merging...

@asfgit asfgit closed this in a7eb845 Jan 19, 2019
@sunhaibotb sunhaibotb deleted the FLINK-11256 branch January 19, 2019 16:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants