daemon: add watchdog checker #209
Can one of the admins verify this patch?
I may need help writing tests for this feature.
Can you add documentation for the method? Thanks!
@squeed should be done :)
An example of usage: https://github.com/guilhem/traefik/commit/1caf8d9d7e45dcc076aa8e584b07745f0402d5e2
Great, thanks.
// (0, err) - an error happened (e.g. error converting time).
// (time, nil) - watchdog is enabled and we can send ping.
// time is delay before inactive service will be killed.
func SdWatchdogEnabled() (time.Duration, error) {
sd_watchdog_enabled() takes an additional unset_environment bool which will unconditionally unset the two environment variables. I think it is worth having it here too.
@lucab yes, but we don't support it in sdnotify either:
https://github.com/coreos/go-systemd/blob/master/daemon/sdnotify.go#L29
https://github.com/systemd/systemd/blob/master/src/systemd/sd-daemon.h#L220
So I don't know what to do ^^
I personally think it is fine if the two are not aligned. @jonboulle @squeed may have a different opinion though.
I don't really see a reason to introduce the inconsistency; perhaps we could land this and follow up adding that variable? (or is it somehow more useful for this call than for sdnotify?)
Yeah, I think @jonboulle is right - might as well add the parameter. Sorry for the runaround.
@squeed tests are there :)
}
s, err := strconv.Atoi(wusec)
if err != nil {
	return 0, fmt.Errorf("Error converting WATCHDOG_USEC: %s", err)
please lower-case error strings (e.g. "error converting...")
Done
	return 0, fmt.Errorf("Error converting WATCHDOG_USEC: %s", err)
}
if s <= 0 {
	return 0, fmt.Errorf("Error WATCHDOG_USEC should be positive")
s/Error WATCHDOG_USEC should be positive/WATCHDOG_USEC must be a positive number/
Done
Add a function SdWatchdogEnabled() which checks whether the service has to report its status. This is inspired by the behavior of sd_watchdog_enabled.
Couple suggestions around the testing.
The implementation code looks okay to me.
// (time, nil)
err := os.Setenv("WATCHDOG_USEC", "100")
if err != nil {
	panic(err)
t.Fatal is preferred over panic to halt a test early.
I actually disagree on that one - my usual classification is:
- t.Error for test failures related to user code that can continue
- t.Fatal for test failures related to user code that cannot continue
- panic for test failures unrelated to user code, e.g. in setup etc. (very explicit that it's an environment failure/solar flare and not a code bug)
Agreeing with @jonboulle here, let's keep it as is.
delay, err := SdWatchdogEnabled()
if delay == 0 || err != nil {
	t.Errorf("TEST: Watchdog enabled FAILED")
If we're okay with this test being Go 1.7+ only, we could use subtests to clean up these repetitive tests and still have nice error reporting.
I don't know how much this project needs compatibility with older Go versions.
Failing that, I'd still rather have a slice of cases to loop through than this, though it's not a blocking concern.
Subtests are perhaps a bit of a stretch, but I agree on putting the test cases in a more tabular form if possible.
And so?.... 👼
Tests look great, thanks!
It looks like we agree this is mostly ok as it is. The only remaining concerns here are:
Neither of those is a blocker, so we can either do another iteration here or land this first and follow up with a cleanup PR, depending on @guilhem's availability. I'd suggest at least re-arranging the tests within the scope of this PR, and being ready for a change in the function signature soon after. @guilhem sounds ok?
@lucab I'm not very familiar with Go tests and don't have many ideas to improve those.
I need to do something mind-numbing after the US election so I'll take a shot.
@lucab ptal
LGTM, thanks everybody. I'll follow up with the