Get notified if a job is run too often #17
Comments
Thanks for the suggestion!
Which one did you have in mind? Any ideas on how to distinguish between the two?
I was thinking about the second scenario. It probably has to be something you can turn on/off per check, since you still want to support checks for jobs that run at irregular intervals (but where you still want to be notified if the job hasn't run within X time). Perhaps a good solution would be the ability to specify a "minimum time between runs"?
The minimum time between runs could either default to 0s, in which case it wouldn't solve scenario 1. Or it could default to something like 75% of Period; in scenario 1, it would then notify the user that the check had been pinged too frequently, and the user could adjust the setting. I'm leaning towards defaulting it to 0s, since it doesn't feel like core functionality, and users trying out the service for the first time would quite likely trigger the warning notification just by opening the check URLs in a browser or with curl.
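The check described above could look something like this minimal sketch. The function name and the `min_gap` parameter are hypothetical, not actual Healthchecks settings; defaulting `min_gap` to zero disables the check, matching the 0s default discussed here.

```python
from datetime import datetime, timedelta, timezone

def check_ping(last_ping, now, min_gap):
    """Return an alert message if a ping arrives sooner than min_gap
    after the previous one, else None.

    min_gap is the hypothetical "minimum time between runs" setting;
    a default of timedelta(0) disables the check entirely.
    """
    if last_ping is not None and now - last_ping < min_gap:
        return "ping received too soon after the previous one"
    return None

# With the 0s default, a first ping never triggers the warning:
assert check_ping(None, datetime.now(timezone.utc), timedelta(0)) is None
```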
Just jumping in here to say I'd love to see this as a feature. I've had a couple of "runaway" crons recently that, for some reason, hit an error while running and start looping forever.
So I am looking into developing this feature. I think if the notification uses proper start/active endpoints plus
This feature request is from 2015. At the time we did not have "fail" signals (added in May 2018) or "start" signals (added in Dec 2018). I think these two help handle some variants of the infinitely looping job scenario.
There's a third case: the job completes successfully, but gets restarted right away. On the Healthchecks side this would look like a flood of either "start"/"success" signal pairs, or just a stream of "success" signals. We could add a configurable parameter for this: "if there are more than X success signals in time period Y, let the user know somehow". As usual, there is a tradeoff between having more flexibility and having a harder-to-use product. Right now, I do not want to go in this direction, seeing as there has been interest in this functionality from 3 people in 8 years.

On the hosted service, healthchecks.io, I sometimes deal with a slightly related problem, where someone sticks a hc-ping.com call in a function that runs frequently on many servers. The service then starts seeing 10, 50, 100 or more requests per second for a single UUID. To protect the database against this, the hosted service rate-limits at the nginx level.
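The hypothetical "more than X signals in period Y" rule could be sketched as a sliding-window counter. The class and parameter names below are illustrative, not part of Healthchecks:

```python
from collections import deque

class FloodDetector:
    """Flag when more than max_events signals arrive within window_seconds.

    A sketch of the hypothetical "more than X success signals in
    time period Y" rule discussed above; not a Healthchecks API.
    """
    def __init__(self, max_events, window_seconds):
        self.max_events = max_events
        self.window = window_seconds
        self.times = deque()

    def record(self, timestamp):
        """Record a signal; return True if the flood threshold is exceeded."""
        self.times.append(timestamp)
        # Drop signals that have fallen outside the window.
        while self.times and timestamp - self.times[0] > self.window:
            self.times.popleft()
        return len(self.times) > self.max_events
```

With `FloodDetector(3, 60)`, a fourth signal inside a 60-second span would trip the flag; spreading signals further apart keeps the window short and the flag off.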
IIRC, what prompted me to create the feature request back in 2015 was that I found out that I had a duplicate instance of a backup job that was dumping a database to the same file, causing the data to be corrupted. Both jobs were pinging Healthchecks, so it could have been noticed earlier if this feature had been available.
Completely understandable :).
Hey, thanks for a really neat project & service!
It would be nice to have the possibility to get notified if a job is run too often.
Unfortunately, I don't have the time to dig in and implement it myself, and apologies if you'd rather not receive feature requests as GitHub issues.