[FLINK-21326][runtime] Optimize building topology when initializing ExecutionGraph #14868

Merged
merged 4 commits into from Mar 5, 2021

Contributor

@Thesharing Thesharing commented Feb 4, 2021

What is the purpose of the change

This PR optimizes how the topology is built when initializing ExecutionGraph.
The main idea is to put all the vertices that consume the same result partitions into one group (ConsumerVertexGroup), and to put all the result partitions that have the same consumer vertices into one group (ConsumedPartitionGroup).
The complexity of building the topology in ExecutionGraph decreases from O(N^2) to O(N).

For more details please check FLINK-21326.
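
A minimal sketch of the grouping idea for an all-to-all connection, assuming plain integer indices in place of Flink's ID types. The class `GroupedEdgesSketch` and all of its methods are hypothetical and are not the `EdgeManager` from this PR; the sketch only illustrates that each side can hold a reference to one shared group instead of one edge object per (partition, consumer) pair.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified edge storage; not Flink's actual EdgeManager. */
final class GroupedEdgesSketch {
    // consumer vertex index -> shared group of partitions it consumes
    private final Map<Integer, List<Integer>> consumedPartitionsByVertex = new HashMap<>();
    // partition index -> shared group of vertices that consume it
    private final Map<Integer, List<Integer>> consumersByPartition = new HashMap<>();

    /** Connects every partition to every consumer in O(P + C) steps. */
    static GroupedEdgesSketch connectAllToAll(int numPartitions, int numConsumers) {
        GroupedEdgesSketch edges = new GroupedEdgesSketch();
        List<Integer> partitionGroup = indices(numPartitions);
        List<Integer> consumerGroup = indices(numConsumers);
        // Every consumer points at the same partition group object.
        for (int vertex = 0; vertex < numConsumers; vertex++) {
            edges.consumedPartitionsByVertex.put(vertex, partitionGroup);
        }
        // Every partition points at the same consumer group object.
        for (int partition = 0; partition < numPartitions; partition++) {
            edges.consumersByPartition.put(partition, consumerGroup);
        }
        return edges;
    }

    List<Integer> getConsumedPartitions(int vertex) {
        return consumedPartitionsByVertex.get(vertex);
    }

    List<Integer> getConsumers(int partition) {
        return consumersByPartition.get(partition);
    }

    private static List<Integer> indices(int count) {
        List<Integer> result = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            result.add(i);
        }
        return Collections.unmodifiableList(result);
    }
}
```

With P partitions and C consumers, a per-edge representation would allocate P × C objects, while this sketch performs P + C map insertions and allocates two lists.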

Brief change log

  • Introduced EdgeManager, ConsumerVertexGroup and ConsumedPartitionGroup to store the topology in ExecutionGraph
  • Introduced optimizations on the procedure of building topology when initializing ExecutionGraph
  • Removed ExecutionEdge and fixed related tests

Verifying this change

Since these optimizations do not change the original logic of building the topology in ExecutionGraph, we believe that this change is already covered by existing tests, such as ExecutionGraphConstructionTest, ExecutionGraphRescalingTest, and PointwisePatternTest.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@flinkbot
Collaborator

flinkbot commented Feb 4, 2021

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit 85af2bb (Thu Feb 04 09:35:47 UTC 2021)

Warnings:

  • No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.


The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@flinkbot
Collaborator

flinkbot commented Feb 4, 2021

CI report:

Bot commands
The @flinkbot bot supports the following commands:
  • @flinkbot run travis re-run the last Travis build
  • @flinkbot run azure re-run the last Azure build

@Thesharing Thesharing changed the title from "[FLINK-21110][runtime] Optimize building topology when initializing ExecutionGraph" to "[FLINK-21326][runtime] Optimize building topology when initializing ExecutionGraph" Feb 9, 2021
Contributor

@zhuzhurk zhuzhurk left a comment

Thanks for opening this PR @Thesharing
The change generally looks good to me. I have a few minor comments.

Contributor Author

@Thesharing Thesharing left a comment

Thanks for reviewing and providing these great suggestions. I've resolved them in the fix-up commits.

@Thesharing Thesharing force-pushed the flink-21110 branch 3 times, most recently from e19dc30 to caf86ff Compare February 25, 2021 12:15
Contributor

@zhuzhurk zhuzhurk left a comment

Thanks for addressing all the comments @Thesharing
The change looks good to me.
@tillrohrmann do you want to take another look?

@tillrohrmann
Contributor

I'll try to give it a pass by Monday. If I don't manage to do it, then go ahead with merging it.

Contributor

@tillrohrmann tillrohrmann left a comment

Thanks for creating this PR @Thesharing. The changes go in a good direction. I had a couple of comments. Please take a look.

Comment on lines 65 to 66
// sanity check
checkState(consumedPartitions.size() == inputNumber);
Contributor

Does this mean that we have to add the consumed partitions in increasing order? If this is the contract, then we might wanna add a JavaDoc explaining this more explicitly.

Contributor

Alternatively we could change the API so that one needs to add all Collection<ConsumedPartitionGroup> when adding an ExecutionVertexID.

Contributor Author

Yes, this ordering requirement is redundant; there was no restriction on the order before. I prefer to remove inputNumber from the parameters, since EdgeManagerBuildUtil currently adds each ConsumedPartitionGroup one by one, per JobEdge.

/** Utilities for building {@link EdgeManager}. */
public class EdgeManagerBuildUtil {

public static void connectVertexToResult(
Contributor

Do we have some tests for this method?

Contributor Author

As with EdgeManager, since we didn't change the original logic, we think it's covered by existing tests. The all-to-all edges are tested by ExecutionGraphConstructionTest. The pointwise edges are tested by PointwisePatternTest.

Contributor

Cool, could you add a comment to the two test classes saying that they effectively test EdgeManagerBuildUtil now?

Contributor Author

Resolved.

Comment on lines 38 to 60
public List<IntermediateResultPartitionID> getResultPartitions() {
    return Collections.unmodifiableList(resultPartitions);
}
Contributor

We could hide the implementation detail by letting ConsumedPartitionGroup implement the methods we need it to have to directly work with it. For example if it implements size() and Iterable, then it should already go a far way. Maybe we also need get(int index). The same applies to the ConsumerVertexGroup.

Contributor Author

Totally agreed. This will simplify calls to ConsumedPartitionGroup. After discussing with @zhuzhurk, we decided to have the following methods:

  • iterator()
  • size()
  • getFirst() (to replace get(0))
  • isEmpty()
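
A minimal sketch of a group type exposing only the four methods listed above, assuming a generic element type so that the example stays self-contained. The class name `PartitionGroupSketch` is hypothetical; the actual `ConsumedPartitionGroup` in the PR holds `IntermediateResultPartitionID` elements and may differ in detail.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

/** Hypothetical group wrapper that hides its backing list from callers. */
final class PartitionGroupSketch<T> implements Iterable<T> {
    private final List<T> elements;

    PartitionGroupSketch(List<T> elements) {
        // Defensive copy so that later changes to the caller's list do not leak in.
        this.elements = Collections.unmodifiableList(new ArrayList<>(elements));
    }

    @Override
    public Iterator<T> iterator() {
        return elements.iterator();
    }

    public int size() {
        return elements.size();
    }

    public boolean isEmpty() {
        return elements.isEmpty();
    }

    /** Returns the first element; throws NoSuchElementException if the group is empty. */
    public T getFirst() {
        return iterator().next();
    }
}
```

Because the class implements `Iterable`, callers can write `for (T id : group)` without ever seeing the list, which is the encapsulation the review comment asks for.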

Contributor Author

@Thesharing Thesharing left a comment

Thank you so much for these patient and enlightening suggestions, @tillrohrmann. I've made several changes according to them. Would you mind re-reviewing it when you have free time?

* based on the {@link DistributionPattern}. The connection information is stored in the {@link
* EdgeManager}.
*/
public static void connectVertexToResult(
Contributor

Seems it can be package private.

Contributor Author

Resolved.

Comment on lines -383 to -392
if (parallelism % numSources == 0) {
    // same number of targets per source
    int factor = parallelism / numSources;
    sourcePartition = subTaskIndex / factor;
} else {
    // different number of targets per source
    float factor = ((float) parallelism) / numSources;
    sourcePartition = (int) (subTaskIndex / factor);
}
Contributor

I think the previous code which handles case XXX % YYY == 0 is an unnecessary complication and we can simplify it a bit. The result should be the same and PointwiseTest#testPointwiseConnectionSequence is added to ensure this.
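
One way to sanity-check the claim that the divisible special case is redundant is to compare the two-branch logic quoted above against a version that always takes the float path. The sketch below does this by brute force over small values. The class and method names are hypothetical, and this is not the simplified code that was merged, only an illustration of the equivalence being claimed.

```java
/** Hypothetical helper comparing the removed two-branch logic with a single-branch variant. */
final class PointwiseSourceSketch {

    /** The removed logic as quoted in the review: integer path when divisible, float path otherwise. */
    static int withSpecialCase(int subTaskIndex, int parallelism, int numSources) {
        if (parallelism % numSources == 0) {
            int factor = parallelism / numSources;
            return subTaskIndex / factor;
        } else {
            float factor = ((float) parallelism) / numSources;
            return (int) (subTaskIndex / factor);
        }
    }

    /** Single-branch variant: always uses the float path. */
    static int floatOnly(int subTaskIndex, int parallelism, int numSources) {
        float factor = ((float) parallelism) / numSources;
        return (int) (subTaskIndex / factor);
    }

    /** Returns true if both variants agree for every parallelism up to maxParallelism. */
    static boolean agreeUpTo(int maxParallelism) {
        for (int parallelism = 1; parallelism <= maxParallelism; parallelism++) {
            for (int numSources = 1; numSources <= parallelism; numSources++) {
                for (int subTaskIndex = 0; subTaskIndex < parallelism; subTaskIndex++) {
                    if (withSpecialCase(subTaskIndex, parallelism, numSources)
                            != floatOnly(subTaskIndex, parallelism, numSources)) {
                        return false;
                    }
                }
            }
        }
        return true;
    }
}
```

A brute-force check like this only covers the range it iterates over; for very large parallelism, float rounding would need a separate argument.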

Contributor

@tillrohrmann tillrohrmann left a comment

Thanks for updating this PR @Thesharing. I had a few more comments. Please take a look.

this.partitionId = new IntermediateResultPartitionID(totalResult.getId(), partitionNumber);

producer.getExecutionGraph().registerResultPartition(partitionId, this);
Contributor

I am not a huge fan of coupling components through this kind of construct. Couldn't we register the IntermediateResultPartition where it is created (e.g. in the ExecutionVertex)?

Contributor Author

Yes, the coupling seems too tight. I think it's better to register all the ExecutionVertices and IntermediateResultPartitions after all of them are created.

Now we register them in ExecutionGraph#registerExecutionVerticesAndResultPartitions, and this method is called in ExecutionGraph#attachJobGraph, right after creating all the ExecutionJobVertices.

Comment on lines 96 to 102
private EdgeManager getEdgeManager() {
    return producer.getExecutionGraph().getEdgeManager();
}
Contributor

I think this shows that we are overly coupling the IntermediateResultPartition with the ExecutionGraph. Couldn't we give an EdgeManager to the IntermediateResultPartition when we create it? This makes the dependency explicit.

Contributor Author

Resolved. I'm wondering whether we should also pass EdgeManager to the constructor of ExecutionVertex. I'm not sure about it.

Contributor

Theoretically yes. But the ExecutionVertex is already coupled quite tightly to the ExecutionGraph. Hence, it might not make a big difference to not pass it in.
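
The suggestion in this thread, passing the edge storage in at construction time instead of reaching through `producer.getExecutionGraph()`, can be sketched as follows. `EdgeStoreSketch` and `PartitionSketch` are hypothetical stand-ins using string IDs; they are not Flink's `EdgeManager` or `IntermediateResultPartition`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical stand-in for an edge store keyed by partition ID. */
final class EdgeStoreSketch {
    private final Map<String, List<String>> consumersByPartition = new HashMap<>();

    void connect(String partitionId, String consumerId) {
        consumersByPartition.computeIfAbsent(partitionId, key -> new ArrayList<>()).add(consumerId);
    }

    List<String> getConsumers(String partitionId) {
        List<String> consumers = consumersByPartition.get(partitionId);
        if (consumers == null) {
            return Collections.emptyList();
        }
        return Collections.unmodifiableList(consumers);
    }
}

/** Hypothetical partition that receives its edge store explicitly. */
final class PartitionSketch {
    private final String partitionId;
    private final EdgeStoreSketch edgeStore;

    PartitionSketch(String partitionId, EdgeStoreSketch edgeStore) {
        this.partitionId = Objects.requireNonNull(partitionId);
        // The dependency is visible in the constructor signature rather than
        // being looked up through an enclosing graph object.
        this.edgeStore = Objects.requireNonNull(edgeStore);
    }

    List<String> getConsumers() {
        return edgeStore.getConsumers(partitionId);
    }
}
```

A side effect of this shape is that the partition can be exercised in a test with only an edge store, without constructing a whole graph.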

Contributor Author

@Thesharing Thesharing left a comment

Thanks for providing these great suggestions, @tillrohrmann. They really help me improve this pull request. Would you mind re-reviewing it when you have free time? Thank you in advance.

@Thesharing Thesharing force-pushed the flink-21110 branch 2 times, most recently from 782c2e6 to 3ed3f44 Compare March 4, 2021 15:39
@Thesharing Thesharing force-pushed the flink-21110 branch 3 times, most recently from 9b89841 to 1a9fa79 Compare March 5, 2021 08:21
@Thesharing
Contributor Author

I've rebased on latest master branch, due to changes introduced in FLINK-21347.

@tillrohrmann
Contributor

I'll try to give it another pass today.

Contributor

@tillrohrmann tillrohrmann left a comment

Thanks for updating this PR @Thesharing. It looks really nice now. Well done! I had a few very minor comments. I will address them myself while merging this PR.

Comment on lines +1370 to +1378
private void registerExecutionVerticesAndResultPartitions(
        List<ExecutionJobVertex> executionJobVertices) {
    for (ExecutionJobVertex executionJobVertex : executionJobVertices) {
        for (ExecutionVertex executionVertex : executionJobVertex.getTaskVertices()) {
            executionVerticesById.put(executionVertex.getID(), executionVertex);
            resultPartitionsById.putAll(executionVertex.getProducedPartitions());
        }
    }
}
Contributor

Nice, this is a very good solution :-)

Comment on lines +55 to +60
// sanity check
checkState(
        consumers.isEmpty(), "Currently there has to be exactly one consumer in real jobs");
Contributor

This checkState and the one above seem to be testing the same thing. I would keep only one. Ideally one with an explanation message.


@Override
public EdgeManager getEdgeManager() {
    return null;
Contributor

Let's implement this method with UnsupportedOperationException


@Override
public ExecutionVertex getExecutionVertexOrThrow(ExecutionVertexID id) {
    return null;
Contributor

Let's implement this method with UnsupportedOperationException

@Override
public IntermediateResultPartition getResultPartitionOrThrow(
        IntermediateResultPartitionID id) {
    return null;
Contributor

Let's implement this method with UnsupportedOperationException
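
A minimal sketch of what the three comments above ask for: stub methods that fail loudly instead of returning `null`. The interface `GraphAccessorSketch` and its stub are hypothetical and use `Object` and `String` in place of Flink's types; only the method names are taken from the snippets quoted above.

```java
/** Hypothetical slice of a graph accessor interface, reduced to two of the methods under review. */
interface GraphAccessorSketch {
    Object getEdgeManager();

    Object getExecutionVertexOrThrow(String id);
}

/** Hypothetical test stub: unsupported methods throw rather than return null. */
final class StubGraphAccessorSketch implements GraphAccessorSketch {
    @Override
    public Object getEdgeManager() {
        throw new UnsupportedOperationException("getEdgeManager is not supported by this stub");
    }

    @Override
    public Object getExecutionVertexOrThrow(String id) {
        throw new UnsupportedOperationException(
                "getExecutionVertexOrThrow is not supported by this stub");
    }
}
```

Throwing at the call site makes an accidental use of the stub show up immediately, whereas a returned `null` tends to surface later as a `NullPointerException` far from the cause.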

@Thesharing
Contributor Author

Thank you, @tillrohrmann and @zhuzhurk! I believe I've learnt a lot from your suggestions. I'll start to prepare the next pull request. Thank you for these enlightening reviews!

autophagy pushed a commit to autophagy/flink that referenced this pull request Mar 16, 2021
5 participants