[FLINK-3390] [runtime, tests] Restore savepoint on ExecutionGraph restart#1720
Closed
uce wants to merge 1 commit intoapache:masterfrom
Closed
[FLINK-3390] [runtime, tests] Restore savepoint on ExecutionGraph restart#1720uce wants to merge 1 commit intoapache:masterfrom
uce wants to merge 1 commit intoapache:masterfrom
Conversation
…h restart Temporary work around to restore initial state on failure during recovery as required by a user. Will be superseded by FLINK-3397 with better handling of checkpoint and savepoint restoring. A failure during recovery resulted in restarting a job without its savepoint state. This temporary work around makes sure that if the savepoint coordinator ever restored a savepoint and there was no checkpoint after the savepoint, the savepoint state will be restored again.
Contributor
|
Looks good to me, pretty good test. Is this crucial for the next 1.0 RC? |
Contributor
Author
|
I would say yes, because a user ran into this issue and asked for a fix. |
Contributor
|
Changes look good to me. Good work @uce :-) Will merge it to the master and the release branch. |
asfgit
pushed a commit
that referenced
this pull request
Feb 26, 2016
…h restart Temporary work around to restore initial state on failure during recovery as required by a user. Will be superseded by FLINK-3397 with better handling of checkpoint and savepoint restoring. A failure during recovery resulted in restarting a job without its savepoint state. This temporary work around makes sure that if the savepoint coordinator ever restored a savepoint and there was no checkpoint after the savepoint, the savepoint state will be restored again. This closes #1720.
subhankarb
pushed a commit
to subhankarb/flink
that referenced
this pull request
Mar 17, 2016
…h restart Temporary work around to restore initial state on failure during recovery as required by a user. Will be superseded by FLINK-3397 with better handling of checkpoint and savepoint restoring. A failure during recovery resulted in restarting a job without its savepoint state. This temporary work around makes sure that if the savepoint coordinator ever restored a savepoint and there was no checkpoint after the savepoint, the savepoint state will be restored again. This closes apache#1720.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Temporary work around to restore initial state on failure during recovery as
required by a user. Will be superseded by FLINK-3397 with better handling of
checkpoint and savepoint restoring.
A failure during recovery resulted in restarting a job without its savepoint
state. This temporary work around makes sure that if the savepoint coordinator
ever restored a savepoint and there was no checkpoint after the savepoint,
the savepoint state will be restored again.