Feature idea: signal failure if command completes too quickly #5

cuu508 · 2020-05-13T10:20:17Z

Sometimes apps and scripts fail early, but still return exit code 0. And legacy systems can be hard to fix.

Perhaps there could be an optional feature where runitor measures the execution time of the command, and signals failure if the command completes too quickly? Something like:

# signals failure (the command finished in under 5 seconds)
runitor -min-time=5s -uuid 6e1fbf8f-c17e-4749-af44-0c81461bdd19 -- sleep 1

# signals success
runitor -min-time=5s -uuid 6e1fbf8f-c17e-4749-af44-0c81461bdd19 -- sleep 6
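For illustration, the proposed check can be approximated in plain shell. This is only a sketch of the idea, not runitor's implementation: `run_with_min_time` is a hypothetical helper name, and it measures whole seconds with `date +%s`, so it is coarser than what a real implementation would likely use.

```shell
# run_with_min_time MIN_SECONDS COMMAND [ARGS...]
# Hypothetical helper (not part of runitor): runs COMMAND and reports
# failure if COMMAND fails or finishes in fewer than MIN_SECONDS seconds.
run_with_min_time() (
    min_seconds=$1
    shift

    start=$(date +%s)
    status=0
    "$@" || status=$?
    elapsed=$(( $(date +%s) - start ))

    # A real failure keeps its own exit status.
    if [ "$status" -ne 0 ]; then
        exit "$status"
    fi

    # Exit code 0, but the run was suspiciously short: treat as failure.
    if [ "$elapsed" -lt "$min_seconds" ]; then
        echo "finished in ${elapsed}s, expected at least ${min_seconds}s" >&2
        exit 1
    fi
    exit 0
)
```

With this helper, `run_with_min_time 5 sleep 1` returns non-zero, while `run_with_min_time 5 sleep 6` returns zero.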

cuu508 · 2020-05-13T10:24:04Z

PS. This could also be done on the server side; there's a ticket for that: healthchecks/healthchecks#236

One benefit of doing it on client side is more precise timing. HTTP requests have latency – if you measure second-long events using HTTP requests that can also sometimes take seconds, you'll sometimes get false positives and false negatives.
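To make the latency argument concrete, here is a small arithmetic sketch. It assumes the server derives the duration from the arrival times of a start ping and a completion ping, so the measured value is the true run time plus the difference between the two request latencies. The function names are hypothetical and all values are in milliseconds.

```shell
# server_measured_ms TRUE_MS START_LATENCY_MS END_LATENCY_MS
# Duration as seen by a server that timestamps pings on arrival:
# true duration, plus the completion ping's latency, minus the start ping's.
server_measured_ms() {
    echo $(( $1 + $3 - $2 ))
}

# too_quick DURATION_MS MIN_MS
# Succeeds (exit 0) when the duration is below the minimum.
too_quick() {
    [ "$1" -lt "$2" ]
}
```

With a 5000 ms minimum, a 4000 ms run whose completion ping takes 2000 ms (start ping 100 ms) is measured as 5900 ms and is not flagged, even though it was too quick. A 6000 ms run whose start ping takes 2000 ms (completion ping 100 ms) is measured as 4100 ms and is flagged, even though it ran long enough.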

bdd · 2020-05-13T20:15:36Z

Huh. Initially I thought about a possible max-runtime feature to catch runaway processes but that's not very Unix when we already have the right tool for it, timeout from GNU coreutils.

This one is the other way around. I'm curious, how much of a common problem is this?

bdd · 2020-05-14T05:37:27Z

Here's a one-minute implementation: e0af841

Honestly it doesn't feel like this feature fits in along with the others. Other than stdout & stderr routing, runitor just implements healthchecks.io Ping API features. Nothing less, nothing more.

Would you consider client measured run duration to be passed as a parameter for success pings (and for symmetry also failure)? This way users can define "inverse of grace period" as mentioned in healthchecks/healthchecks#236, and the act of lifting the signal to failure is done on the server side.

cuu508 · 2020-05-14T07:22:47Z

> I'm curious, how much of a common problem is this?

It's not a common request. It has been requested a few times (in #236 and in email). Measuring run time on client was never explicitly suggested, that was my idea.

Also, if you have a legacy environment where you cannot fix the exit code, chances are you also cannot add a wrapper (runitor) around your command.

I think this feature would fit in if you're OK with a Swiss army knife kind of tool, one with various niche but sometimes useful features (examples: uwsgi, curl, ImageMagick, caddy). If you'd rather keep it small and focused, then it's not a good fit.

> Would you consider client measured run duration to be passed as a parameter for success pings (and for symmetry also failure)?

The API currently doesn't support that, but that could be added. Reported execution time would take priority over the execution time measured on the server. Dead Man Snitch's API and Field Agent work like that (no start signal, execution time measured on client and reported after the job completes).
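As a rough sketch of what a client-reported duration could look like, the helper below only builds a ping URL and sends nothing. The `duration` query parameter is invented here purely for illustration; as noted above, the API does not support such a parameter. `ping_url` is likewise a hypothetical name.

```shell
# ping_url UUID EXIT_STATUS DURATION_SECONDS
# Builds (but does not send) a ping URL carrying a client-measured duration.
ping_url() (
    url="https://hc-ping.com/$1"

    # Non-zero exit status maps to the failure endpoint.
    if [ "$2" -ne 0 ]; then
        url="$url/fail"
    fi

    # ASSUMPTION: "duration" is a made-up parameter name, not a real API field.
    echo "$url?duration=$3"
)
```

The server could then compare the reported value against a per-check minimum and lift a success ping to a failure on its side.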

cuu508 · 2022-10-27T11:34:56Z

> Honestly it doesn't feel like this feature fits in along with the others.

Yep, and there's always the option to wrap the legacy app in a custom shell script, with custom success/failure testing criteria (how much time it took? what output did it produce? did it produce file X on the filesystem? etc.).
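A minimal sketch of such a wrapper, combining the three example criteria (run time, output, and an expected file). `check_run` and its arguments are hypothetical names chosen for illustration; a real wrapper would use whatever criteria fit the legacy app.

```shell
# check_run MIN_SECONDS EXPECTED_FILE COMMAND [ARGS...]
# Hypothetical wrapper: succeeds only if COMMAND exits 0, runs for at
# least MIN_SECONDS, prints some output, and EXPECTED_FILE exists afterwards.
check_run() (
    min_seconds=$1
    expected_file=$2
    shift 2

    start=$(date +%s)
    # Capture stdout and stderr together; a non-zero exit is a failure.
    output=$("$@" 2>&1) || exit 1
    elapsed=$(( $(date +%s) - start ))

    [ "$elapsed" -ge "$min_seconds" ] || exit 1   # ran long enough?
    [ -n "$output" ] || exit 1                    # produced any output?
    [ -f "$expected_file" ] || exit 1             # produced the file?
)
```

A script that ends with a call to `check_run` can then be passed to runitor after the `--`, in place of the legacy command itself, so that its exit code reflects the custom criteria.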

bdd added a commit that referenced this issue May 14, 2020

Report failure when run duration is too short

c2afaaa

First pass implementation of the feature idea from #5. README.md and doc.go still needs updates.

bdd added a commit that referenced this issue May 14, 2020

Report failure when run duration is too short

e0af841

First pass implementation of the feature idea from #5. README.md and doc.go still needs updates.

cuu508 closed this as not planned on Oct 27, 2022
