Fix: sbd-inquisitor: Implement default delay start for diskless sbd#133
Conversation
|
Can one of the admins verify this patch? |
|
test this please |
| exit_status = ping_via_slots(argv[optind + 1], servants_leader); | ||
|
|
||
| } else if (strcmp(argv[optind], "watch") == 0) { | ||
| } else if (strcmp(argv[optind], "query-watchdog") == 0) { |
There was a problem hiding this comment.
guess we need this to be outside the disk-only-section
There was a problem hiding this comment.
Yep those two should be separated as before.
There was a problem hiding this comment.
Guess we can have one long if-then-else-if-then ... clause with the first part disabled when built without disk-support.
| timeout_watchdog_warn = calculate_timeout_watchdog_warn(timeout_watchdog); | ||
| } | ||
|
|
||
| cl_log(LOG_DEBUG, "Delay start: %s%s%s", |
There was a problem hiding this comment.
why don't we postpone that to after we've actually calculated the values that are being executed so that we can display them.
to simplify the logic and still not loose the information where the data came from we can set a pointer to a string constant when updating the delay from e.g. 'SBD_WATCHDOG_TIMEOUT * 2', 'msgwait', 'watchdog-timeout from cmdline * 2' (possible other source of watchdog would be the disk-header but this isn't relevant here as we would the go for msgwait taken from there) and add that as a hint to the log.
| timeout_watchdog_warn = calculate_timeout_watchdog_warn(timeout_watchdog); | ||
| } | ||
| } else { | ||
| cl_log(LOG_DEBUG, "Delay start (no)"); |
| } else if (strcmp(argv[optind], "ping") == 0) { | ||
| exit_status = ping_via_slots(argv[optind + 1], servants_leader); | ||
|
|
||
| } else { |
There was a problem hiding this comment.
thought we'd move that outside the with-disk-section right at the end.
} else if (strcmp(argv[optind], "ping") == 0) {
exit_status = ping_via_slots(argv[optind + 1], servants_leader);
} else
#endif
if (strcmp(argv[optind], "query-watchdog") == 0) {
...
} else {
exit_status = -2;
}
There was a problem hiding this comment.
Ah ok, and split the watch option from it right ?
There was a problem hiding this comment.
Ok I've pushed the commit with the white space fix to soon.
|
|
||
| if (delay_start && delay <= 0) { | ||
| delay = get_first_msgwait(servants_leader); | ||
| delay_source = "msgwait"; |
There was a problem hiding this comment.
probably just 'if (delay > 0)' as otherwise 2 * watchdog-timeout is taken (disk is probably not the source as otherwise reading msgwait would have worked).
even better would probably have been to bail out if msgwait can't be read but let's leave that to a maybe subsequent PR as it would mean a change in existing behavior and for 3 disks it shouldn't fail if the first disk fails.
| exit_status = watchdog_test(); | ||
| } else if (strcmp(argv[optind], "watch") == 0) { | ||
| /* sleep $(sbd $SBD_DEVICE_ARGS dump | grep -m 1 msgwait | awk '{print $4}') 2>/dev/null */ | ||
| sleep((unsigned long) delay); |
There was a problem hiding this comment.
sleep(0) is more than a NOP (something like sched_yield()) so we should probably keep it in the delay_start block.
There was a problem hiding this comment.
Will fix that, got mix up will the code move and all.
| exit_status = watchdog_info(); | ||
| } else if (strcmp(argv[optind], "test-watchdog") == 0) { | ||
| exit_status = watchdog_test(); | ||
| } else { |
There was a problem hiding this comment.
why not even further to the end after the "watch"-section?
| if (strcmp(argv[optind], "watch") == 0) { | ||
| /* sleep $(sbd $SBD_DEVICE_ARGS dump | grep -m 1 msgwait | awk '{print $4}') 2>/dev/null */ | ||
|
|
||
| const char *delay_source = "SBD_DELAY_START"; |
There was a problem hiding this comment.
not necessarily source of delay at that point still can be SBD_DELAY_START, SBD_WATCHDOG_TIMEOUT * 2, default watchdog-timeout * 2, watchdog-timeout from cmdline
There was a problem hiding this comment.
I'm a bit confused about that.
I see delay set only here after it has been set initially to 0.
value = get_env_option("SBD_DELAY_START");
if(value) {
delay_start = crm_is_true(value);
if (!delay_start) {
delay = crm_get_msec(value) / 1000;
if (delay > 0) {
delay_start = true;
}
}
}
After that point it has been set in the watch section.
There was a problem hiding this comment.
True. But if it is not set there the source is 'timeout_watchdog * 2' and that again can have different sources.
Tracking where certain settings are coming from - and logging e.g. in debug or even at higher level together with the value - is one of my projects.
Here debug had that implemented already half-way so I thought we should keep it.
So as to keep that here low effort we can go with 'watchdog-timeout * 2' for now in cases startup-delay is derived from watchdog timeout. We still can add tracking of where watchdog-timeout is coming from in a separate PR including the decision if with delay we still want to show with 'watchdog-timeout' or if we replace the string with the details.
There was a problem hiding this comment.
Thanks for the clarifications and for your patience to review my changes.
I will commit the relevant changes.
| if (delay <= 0) { | ||
| delay = get_first_msgwait(servants_leader); | ||
| } | ||
| if (delay_start && delay <= 0) { |
There was a problem hiding this comment.
Guess this should be in the block with (disc_count > 0) as get_first_msgwait is doomed to fail otherwise.
| if (delay_start) { | ||
| if (disk_count == 0 && delay <= 0) { | ||
| delay = 2 * timeout_watchdog; | ||
| delay_source = "SBD_WATCHDOG_TIMEOUT * 2"; |
There was a problem hiding this comment.
we don't know where watchdog-timeout is coming from - yet.
so we probably should stay with what delay_source already has.
|
test this please |
|
format-string issues detected by ubuntu builders ... |
Thanks for the hint, fixed. |
|
test this please |
|
Went through the final result again and I'm afraid the logic atm still has 2 holes. |
|
Maybe something like: |
I think it already does so in get_first_msgwait(). The function actually means to "get_msgwait_from_the_first_available_disk" :-) |
Sounds sensible and clear to me. |
Yep I guess the check just before the sleep() |
The same impression here. |
Shame on me - the loop is quite obvious ;-) |
Yep that would create the correct logic with minimum code. But we should spare a comment to state that we catch disk-read-issues here as well to remember later on. When we detect the correct reading of msgwait above (delay > 0) we can still state in a warning otherwise that we default to 2 * watchdog-timeout as the final log is just a debug. |
|
|
||
| if (delay > 0) { | ||
| delay_source = "msgwait"; | ||
| cl_log(LOG_WARNING, "Using 'msgwait' value for 'delay' starting instead of default '2 * watchdog-timeout'"); |
There was a problem hiding this comment.
Sorry if my unclear phrasing led to that ;-)
I guess we are in the branch here where everything runs as configured so no need for a warning.
I meant an else here that would warn about '2 * watchdog-timeout' being taken instead of unreadable 'msgwait'.
There was a problem hiding this comment.
No worries, actually this was my initial thought to. But I've understand the message a bit different.
Will fix that quick.
SBD sysconfig option `SBD_DELAY_START` is for addressing the case where cluster nodes reboot so fast even before fencing targeting it returns. This could be common for virtual machines. So far technically, a setting `SBD_DELAY_START=yes` takes effect only for sbd with shared disk, since the delay value is basically `msgwait` retrieved from sbd metadata. For diskless sbd, for now we'd have to specify `SBD_DELAY_START` with a explicit delay value. We need to implement to make `SBD_DELAY_START=yes` take effect for diskless sbd as well. In this case, the delay value should be `SBD_WATCHDOG_TIMEOUT * 2`
|
test this please |
|
Thank you for the contribution! |
SBD sysconfig option
SBD_DELAY_STARTis for addressing the case where cluster nodes reboot so fast even before fencing targeting it returns.This could be common for virtual machines.
So far technically, a setting
SBD_DELAY_START=yestakes effect only for sbd with shared disk, since the delay value is basicallymsgwaitretrieved from sbd metadata.For diskless sbd, for now we'd have to specify
SBD_DELAY_STARTwith a explicit delay value.We need to implement to make
SBD_DELAY_START=yestake effect for diskless sbd as well. In this case, the delay value should beSBD_WATCHDOG_TIMEOUT * 2