Skip to content

[FLINK-6833] [task] Fail StreamTask only due to async exception if it is running#4058

Closed
tillrohrmann wants to merge 2 commits intoapache:masterfrom
tillrohrmann:fixStreamTaskRaceCondition
Closed

[FLINK-6833] [task] Fail StreamTask only due to async exception if it is running#4058
tillrohrmann wants to merge 2 commits intoapache:masterfrom
tillrohrmann:fixStreamTaskRaceCondition

Conversation

@tillrohrmann
Copy link
Contributor

In order to resolve a race condition between a properly terminated StreamTask which
cleans up its resources (stopping asynchronous operations, etc.) and a cancelled
asynchronous operation (e.g. asynchronous checkpointing operation), we check whether
the StreamTask is still running before failing it externally.

… is running

In order to resolve a race condition between a properly terminated StreamTask which
cleans up its resources (stopping asynchronous operations, etc.) and a cancelled
asynchronous operation (e.g. asynchronous checkpointing operation), we check whether
the StreamTask is still running before failing it externally.
@zentol
Copy link
Contributor

zentol commented Jun 2, 2017

The test will not pass checkstyle. You can run mvn checkstyle:check on the command-line to quickly verify checkstyle compliance.

@Override
public void handleAsyncException(String message, Throwable exception) {
getEnvironment().failExternally(exception);
if (isRunning) {
Copy link
Contributor

@StefanRRichter StefanRRichter Jun 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt that this is a complete fix, it will only make the problem less likely to occur. This method can be called asynchronously to Threads that manipulate isRunning, which means that the stream task can leave running status after the condition was checked as true, but before failExternally(...) went through.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If isRunning == true when entering the if branch, then depending on what happens before failExternally, we can assume that the handleAsyncException either happened atomically before isRunning was set to false or not. But what we don't want to happen is that if isRunning == false, that we can still fail the task. Thus, I think it solves a valid problem.

@tillrohrmann
Copy link
Contributor Author

I would like to merge this PR if I could clarify all concerns and there are no further objections.

@StefanRRichter
Copy link
Contributor

LGTM +1

@tillrohrmann
Copy link
Contributor Author

Thanks for the review @zentol and @StefanRRichter. Merging this PR.

@asfgit asfgit closed this in ab2fc02 Jun 13, 2017
asfgit pushed a commit that referenced this pull request Jun 13, 2017
… is running

In order to resolve a race condition between a properly terminated StreamTask which
cleans up its resources (stopping asynchronous operations, etc.) and a cancelled
asynchronous operation (e.g. asynchronous checkpointing operation), we check whether
the StreamTask is still running before failing it externally.

This closes #4058.
@tillrohrmann tillrohrmann deleted the fixStreamTaskRaceCondition branch July 6, 2017 15:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants