New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issues with final stage info #12179

Merged
merged 6 commits into from Jan 9, 2019

Conversation

Projects
None yet
6 participants
@dain
Copy link
Contributor

dain commented Jan 4, 2019

Two things that I think need to be fixed:

  1. we should "capture" the final stage info once it is created, so it can not be changed.
  2. we should not allow tasks to be added after the stage is in a done state.

I think the first one is pretty straight forward, but I think the second one should be fixed first since it would be strange to have tasks in a stage that are not in the stage info. The second one is more tricky to fix, because we can't do the easy thing and simply throw an exception because a stage could finish early due to a limit. Instead, we need to adapt the callers to expect an Optional<RemoteTask>. I'm working on these now.

Fixes #12095

@dain dain changed the title WIP: Fix issues with final stage info Fix issues with final stage info Jan 7, 2019

@nezihyigitbasi
Copy link
Contributor

nezihyigitbasi left a comment

Remove unused StageStateMachine get info with child stages

Show resolved Hide resolved ...-main/src/main/java/com/facebook/presto/execution/StageStateMachine.java
{
checkState(stageState.get().isDone());
finalStatusReady.set(true);
StageInfo stageInfo = getStageInfo(() -> finalTaskInfos);
checkArgument(stageInfo.isCompleteInfo(), "finalTaskInfos are not final");

This comment has been minimized.

@nezihyigitbasi

nezihyigitbasi Jan 7, 2019

Contributor

I think Verify.verify(...) is a better fit here.

verify(stageInfo.isCompleteInfo(), "finalTaskInfos are not final");

Also the error message is a bit confusing. The check requires all tasks and the stage to be done, maybe we should just say so (e.g., "All tasks and the stage must be done" or sth like that).

This comment has been minimized.

@dain

dain Jan 8, 2019

Contributor

I went with checkArguments because finalTaskInfos is an argument to the function, but I did update the error message

Show resolved Hide resolved ...-main/src/main/java/com/facebook/presto/execution/StageStateMachine.java Outdated

public BasicStageStats toBasicStageStats(StageState stageState)
{
boolean isScheduled = (stageState == RUNNING) || stageState.isDone();

This comment has been minimized.

@nezihyigitbasi

nezihyigitbasi Jan 7, 2019

Contributor

According to this condition, isScheduled will be false for a stage in SCHEDULED state. Would that be OK?

This comment has been minimized.

@dain

dain Jan 7, 2019

Contributor

Correct. This code is copied from from StageStateMachine.getBasicStageStats(...).... I'm not really sure what this is used for, but it needs to be consistent.

@arhimondr
Copy link
Member

arhimondr left a comment

Prevent new tasks from being created in a done stage

LGTM

@arhimondr
Copy link
Member

arhimondr left a comment

Remove unused StageStateMachine get info with child stages

LGTM % @nezihyigitbasi's comment

@arhimondr
Copy link
Member

arhimondr left a comment

Require stage to be in a done state before all tasks final is set

LGTM

@nezihyigitbasi
Copy link
Contributor

nezihyigitbasi left a comment

LGTM % minor comments.

@arhimondr

This comment has been minimized.

Copy link
Member

arhimondr commented Jan 8, 2019

@wenleix from what i understand this race may result in a query not being cleaned up on the coordinator potentially generating memory leaks. @dain ?

@arhimondr
Copy link
Member

arhimondr left a comment

Ensure final stage info is never changed

LGTM

@arhimondr

This comment has been minimized.

Copy link
Member

arhimondr commented Jan 8, 2019

@dain thanks for fixing that

@@ -326,6 +334,11 @@ public BasicStageStats getBasicStageStats(Supplier<Iterable<TaskInfo>> taskInfos

public StageInfo getStageInfo(Supplier<Iterable<TaskInfo>> taskInfosSupplier)

This comment has been minimized.

@wenleix

wenleix Jan 8, 2019

Contributor

Curious: Why it takes Supplier as the input?

This comment has been minimized.

@dain

dain Jan 8, 2019

Contributor

That is a very good question. The "supplierness" of this and getBasicStageStats doesn't seem to be used at all. I'll address that in another PR.

@wenleix

wenleix approved these changes Jan 8, 2019

Copy link
Contributor

wenleix left a comment

LGTM

@dain dain force-pushed the dain:final-stage-info branch from 0ef2c18 to 8412a1f Jan 8, 2019

@dain dain closed this Jan 9, 2019

@dain dain deleted the dain:final-stage-info branch Jan 9, 2019

@dain dain merged commit 8412a1f into prestodb:master Jan 9, 2019

1 of 2 checks passed

continuous-integration/travis-ci/push The Travis CI build is in progress
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment