[FLINK-12667][runtime] Add JobID to TaskExecutorGateway#releasePartitions #8630

zentol · 2019-06-05T12:46:07Z

What is the purpose of the change

Adds JobID as an argument to TaskExecutorGateway#releasePartitions to simplify bookkeeping on the taskmanager side.

Additionally adds tests for calls to said method originating in Execution.
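
As a rough illustration of the kind of bookkeeping this signature change could simplify, here is a minimal sketch that assumes the task manager keeps a per-job map of partitions. The PR itself does not show that code: the class name, the String stand-ins for JobID and ResultPartitionID, and the startTracking/isTrackingJob helpers are all hypothetical and are not part of Flink's API.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for per-job partition bookkeeping; not Flink's TaskExecutor code. */
public class PartitionBookkeepingSketch {

    // Partitions grouped by job. Strings stand in for JobID and ResultPartitionID (assumption).
    private final Map<String, Set<String>> partitionsByJob = new HashMap<>();

    /** Records that a partition belongs to the given job. */
    public void startTracking(String jobId, String partitionId) {
        partitionsByJob.computeIfAbsent(jobId, ignored -> new HashSet<>()).add(partitionId);
    }

    /** Same argument shape as releasePartitions(JobID, Collection<ResultPartitionID>). */
    public void releasePartitions(String jobId, Collection<String> partitionIds) {
        Set<String> tracked = partitionsByJob.get(jobId);
        if (tracked == null) {
            return; // nothing is tracked for this job
        }
        tracked.removeAll(partitionIds);
        if (tracked.isEmpty()) {
            // With the job id at hand, the per-job entry can be dropped directly,
            // without searching every job for the released partition ids.
            partitionsByJob.remove(jobId);
        }
    }

    /** Returns true while at least one partition of the job is still tracked. */
    public boolean isTrackingJob(String jobId) {
        return partitionsByJob.containsKey(jobId);
    }

    public static void main(String[] args) {
        PartitionBookkeepingSketch bookkeeping = new PartitionBookkeepingSketch();
        bookkeeping.startTracking("job-a", "p1");
        bookkeeping.startTracking("job-a", "p2");
        bookkeeping.releasePartitions("job-a", Arrays.asList("p1", "p2"));
        System.out.println("still tracking job-a: " + bookkeeping.isTrackingJob("job-a"));
    }
}
```

Without the job id, a release call in this sketch would have to scan every job's set to find the partitions, which is the kind of extra work the added argument avoids.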

flinkbot · 2019-06-05T12:48:53Z

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Review Progress

❓ 1. The [description] looks good.
❓ 2. There is [consensus] that the contribution should go into Flink.
❓ 3. Needs [attention] from.
❓ 4. The change fits into the overall [architecture].
❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.

The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands

The @flinkbot bot supports the following commands:

@flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
@flinkbot approve all to approve all aspects
@flinkbot approve-until architecture to approve everything until architecture
@flinkbot attention @username1 [@username2 ..] to require somebody's attention
@flinkbot disapprove architecture to remove an approval you gave earlier

azagrebin

Thanks for the PR @zentol, I left some comments

flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java

flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/Execution.java

azagrebin · 2019-06-05T16:57:25Z

flink-runtime/src/main/java/org/apache/flink/runtime/taskexecutor/TaskExecutor.java

@@ -641,7 +641,7 @@ private void stopTaskExecutorServices() throws Exception {
 	}

 	@Override
-	public void releasePartitions(Collection<ResultPartitionID> partitionIds) {
+	public void releasePartitions(JobID jobId, Collection<ResultPartitionID> partitionIds) {


a bit weird that we will leave jobId here unused, is the future change that needs it going to be so big?

zentol · Jun 5, 2019

it's not gonna be big, but this part is simply already done.

zhijiangW · 2019-06-06T08:57:34Z

flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java

+		final SimpleAckingTaskManagerGateway taskManagerGateway = new SimpleAckingTaskManagerGateway();
+		taskManagerGateway.setReleasePartitionsConsumer(releasedPartitions::setFields);
+
+		final SimpleSlot slot = new SimpleSlot(


Maybe we could provide a utility for creating SimpleSlot in this class, because it would be reused by many tests.

zentol · Jun 12, 2019

such a utility already exists in #createProgrammedSlotProvider, will update the test to use that instead

zhijiangW · 2019-06-06T09:28:28Z

flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java

+			Collections.emptySet(),
+			TestingUtils.infiniteTime());
+
+		execution.deploy();


Maybe we could further provide a utility for creating Execution like below:

private CompletableFuture<Execution> createExecution(
        TaskManagerGateway taskManagerGateway,
        JobVertex... vertices) throws Exception {

    SimpleSlot slot = new SimpleSlot(
        new SingleSlotTestingSlotOwner(),
        new LocalTaskManagerLocation(),
        0,
        taskManagerGateway);

    ProgrammedSlotProvider slotProvider = new ProgrammedSlotProvider(1);
    slotProvider.addSlot(vertices[0].getID(), 0, CompletableFuture.completedFuture(slot));

    ExecutionGraph executionGraph = ExecutionGraphTestUtils.createSimpleTestGraph(
        new JobID(),
        slotProvider,
        new NoRestartStrategy(),
        vertices);

    executionGraph.start(TestingComponentMainThreadExecutorServiceAdapter.forMainThread());

    ExecutionJobVertex executionJobVertex = executionGraph.getJobVertex(vertices[0].getID());
    ExecutionVertex executionVertex = executionJobVertex.getTaskVertices()[0];
    Execution execution = executionVertex.getCurrentExecutionAttempt();

    CompletableFuture<Execution> allocationFuture = execution.allocateAndAssignSlotForExecution(
        slotProvider,
        false,
        LocationPreferenceConstraint.ALL,
        Collections.emptySet(),
        TestingUtils.infiniteTime());

    return allocationFuture;
}

Then we could get Execution from future, and further get ExecutionVertex and ExecutionGraph from Execution. Then this helper could be reused for many existing tests.

zentol · Jun 12, 2019

I agree that we could de-duplicate some code here. I'm concerned that we're baking in a few assumptions (vertex parallelism should be 1, first vertex in the array is special), and there are a few subtle differences between tests where I don't know whether they are significant or not. (For example, #testTaskRestoreStateIsNulledAfterDeployment doesn't allocate a slot beforehand.) I don't have the time right now to really look into these things, so I'd move any larger refactoring of this class into a follow-up.

zhijiangW · 2019-06-06T09:36:52Z

flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java

+		execution.markFinished();
+		postFinishedExecutionAction.accept(execution);
+
+		assertEquals(executionGraph.getJobID(), releasedPartitions.f0);


maybe use assertThat instead of assertEquals

zhijiangW · 2019-06-06T09:40:16Z

flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java

+			IntermediateResultPartition intermediateResultPartition = executionVertex
+				.getProducedPartitions()
+				.get(partitionId.getPartitionId());
+			assertNotNull(intermediateResultPartition);


this assert might not be necessary

zentol · Jun 12, 2019

this check is done to ensure that the ids of all released partitions are actually valid. without it the test would pass even if completely random partitions were passed to the task executor.

I'll add a comment to clarify this; I had to think for a bit myself as to what we're checking here.
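
The reasoning behind that check can be sketched outside of Flink as follows. This is a minimal illustration, assuming the produced partitions are held in a map keyed by partition id; the class name, the unknownIds helper, and the String stand-ins for the Flink id types are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the validity check discussed above; not the actual ExecutionTest code. */
public class ReleasedPartitionCheckSketch {

    /** Returns the released ids that do not map to any produced partition. */
    static Set<String> unknownIds(
            Map<String, String> producedPartitions,
            Collection<String> releasedIds) {
        Set<String> unknown = new LinkedHashSet<>();
        for (String id : releasedIds) {
            // Plays the role of the assertNotNull(...) lookup in the test:
            // a null lookup means the released id was never produced by the vertex.
            if (producedPartitions.get(id) == null) {
                unknown.add(id);
            }
        }
        return unknown;
    }

    public static void main(String[] args) {
        Map<String, String> produced = new HashMap<>();
        produced.put("partition-1", "result of vertex A");

        // An id-only comparison of sizes or types would accept both calls;
        // the lookup is what tells them apart.
        System.out.println(unknownIds(produced, Arrays.asList("partition-1")));
        System.out.println(unknownIds(produced, Arrays.asList("random-id")));
    }
}
```

A test that skipped the lookup would only confirm that some ids were passed along, not that they belong to partitions the execution actually produced.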

zhijiangW

Thanks for opening this PR @zentol .

It looks good to me. I think the added JobID might be used in the future; I only have some deduplication concerns in the tests.

gaoyunhaii · 2019-06-06T12:18:38Z

flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java

+	 * Tests that the partitions are released in case of an execution cancellation after the execution is already finished.
+	 */
+	@Test
+	public void testPartitionReleaseOnCancelAfterFinished() throws Exception {


I think "...OnCancelingAfterFinished" would be better grammatically, but I won't argue for that. :)

gaoyunhaii · 2019-06-06T12:20:04Z

flink-runtime/src/test/java/org/apache/flink/runtime/executiongraph/ExecutionTest.java

+	 * Tests that the partitions are released in case of an execution suspension after the execution is already finished.
+	 */
+	@Test
+	public void testPartitionReleaseOnSuspendAfterFinished() throws Exception {


Similarly, I think "...OnSuspendingAfterFinished" would be better, but I won't argue for that either.


rmetzger added the review=description? and component=Runtime/Coordination labels Jun 5, 2019

azagrebin reviewed Jun 5, 2019


zhijiangW reviewed Jun 6, 2019


gaoyunhaii reviewed Jun 6, 2019


zentol mentioned this pull request Jun 11, 2019

[FLINK-12612][coordination] Track stored partition on the TaskExecutor #8687

Merged

[hotfix][docs] Minor clarification

389c4c7

zentol force-pushed the 12667 branch 2 times, most recently from ab7cf17 to 3c10fed Compare June 13, 2019 13:55

[FLINK-12667][runtime] Add JobID to TaskExecutorGateway#releasePartitions

0a3cdd4

zentol force-pushed the 12667 branch from 3c10fed to 0a3cdd4 Compare June 13, 2019 13:56

zentol merged commit b86ba3b into apache:master Jun 13, 2019

zentol deleted the 12667 branch June 19, 2019 15:04
