Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add offline_timeout_max option to control offline interval backoff #5252

Closed
wants to merge 4 commits into from

Conversation

deastoe
Copy link
Contributor

@deastoe deastoe commented Jul 21, 2020

The offline_timeout period determines the minimum time between
attempts of a data provider to go back online, if it is offline due
to eg. unreachable servers. Each time this check fails there is a
backoff factor applied meaning there can be up to 60 minutes between
these attempts.

Here we introduce the offline_timeout_max option which allows the
the maximum period between attempts to be defined in the configuration,
instead of the default 60 minutes; therefore providing more
flexibility.

Setting offline_timeout_max to 0 disables the backoff functionality.

Additionally fix two issues observed in be_ptask scheduling:

  • Backoff is not applied on the first re-schedule operation
  • The maximum backoff value (previously hardcoded) might not be reached.

src/providers/data_provider_be.c Outdated Show resolved Hide resolved
src/providers/data_provider_be.c Show resolved Hide resolved
src/providers/data_provider_be.c Outdated Show resolved Hide resolved
DEBUG(SSSDBG_CONF_SETTINGS,
"Failed to get offline_timeout_max from confdb. "
"Will use 3600 seconds.\n");
offline_timeout_max = 3600;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that changing those hardcoded "60" and "3600" everywhere may be a good candidate for next contribution.
They may look better as a some function local const or global define if it is worht it. Right now in future when someone change 60s to lets say 90s will have to go around multiple lines to fix it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, I've added some defines at the top of the file. Apologies for the delay getting back to this PR.

The offline_timeout period determines the minimum time between
attempts of a data provider to go back online, if it is offline due
to eg. unreachable servers. Each time this check fails there is a
backoff factor applied meaning there can be up to 60 minutes between
these attempts.

Here we introduce the offline_timeout_max option which allows the
the maximum period between attempts to be defined in the configuration,
instead of the default 60 minutes; therefore providing more
flexibility.

Setting offline_timeout_max to 0 disables the backoff functionality.
If the incremented delay value was greater than max_backoff then
the previous delay was used, rather than using max_backoff as a
ceiling value.
The task interval backoff is not applied on the first re-schedule
operation, since when scheduling the first run (BE_PTASK_FIRST_DELAY)
we do not calculate the backed off period for the next re-schedule.

Calculate the backed off period for the current scheduling operation,
rather than the next, to resolve this.
Replace hardcoded default value of 60 in a couple of places.
Copy link
Contributor

@elkoniu elkoniu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pbrezina pbrezina added the Ready to push Ready to push label Sep 23, 2020
@pbrezina
Copy link
Member

Pushed PR: #5252

  • master
    • 04ea422 - data_provider_be: Add OFFLINE_TIMEOUT_DEFAULT
    • 904ff17 - be_ptask: backoff not applied on first re-schedule
    • 7807ffd - be_ptask: max_backoff may not be reached
    • b1ef82b - data_provider_be: Configurable max offline time

@pbrezina pbrezina added Pushed and removed Accepted Ready to push Ready to push labels Sep 23, 2020
@pbrezina pbrezina closed this Sep 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants