Skip to content

Conversation

@tillrohrmann
Copy link
Contributor

This fixes the problem that when checkpointing is enabled and the number of execution retries is set to 0 that the automatic restarting should be deactivated. This is consistent with the semantics before the restart strategies where introduced.

Be aware, though, that whenever you enable checkpointing for streaming jobs, the cluster wide default restart strategy which is set in the configuration will always be overwritten. Either by manually setting a restart strategy or by automatically setting a FixedDelayRestartStrategy(Integer.MAX_VALUE, 10000) strategy in the StreamingJobGraphGenerator if nothing was specified. That is consistent with the previous behaviour where all default execution retry attempts set in the configuration where overwritten in case of a checkpointed streaming job.

Additionally, this PR sets the default retry delay to 10000 ms and disallows to set it to negative values. Before, the default execution retry delay would have been used if the delay in the ExecutionConfig was set to -1. However, with the new RestartStrategies there is no longer the possibility to set an explicit execution retry delay independent of the number of retries in the configuration file. Therefore it is no longer possible to set the number of execution retries in the configuration file and to specify the execution retry delay in the ExecutionConfig or vice versa. Both have to be defined either in the ExecutionConfig or the configuration file.

if (executionRetryDelay < 0 ) {
throw new IllegalArgumentException(
"The delay between reties must be non-negative, or -1 (use system default)");
"The delay between reties must be non-negative.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can fix this typo while we're at it: reties:retries

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Overlooked the typo. Will fix it.

@tillrohrmann
Copy link
Contributor Author

Thanks for the review @zentol. Will merge it.

tillrohrmann added a commit to tillrohrmann/flink that referenced this pull request Feb 23, 2016
@asfgit asfgit closed this in 6323ed4 Feb 23, 2016
subhankarb pushed a commit to subhankarb/flink that referenced this pull request Mar 17, 2016
@tillrohrmann tillrohrmann deleted the fixRestartStrategy branch August 19, 2016 12:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants