Reply: We will have #215.
Current Behavior
In pitchfork, daemons can be configured with a `retry` count to automatically restart on failure (e.g., `retry = 3` retries up to 3 times). However, once retries are exhausted, the daemon simply stays in an "Errored" state with no further action: no cleanup, notifications, or alternative recovery mechanisms. The retry count is tracked internally but isn't exposed to the daemon's run command.
This limits robustness for production-like scenarios where you might want to handle failures gracefully after all retries fail.
Proposed Enhancements
On-Failure Hook: Add a new config option like `on_failure = "cleanup.sh"` in `pitchfork.toml`. When retries are fully exhausted, execute this hook once (e.g., for logging, sending alerts, or running cleanup scripts).
Retry Count Injection: Pass the current retry attempt as an environment variable (e.g., `PITCHFORK_RETRIED=3`) to the daemon's run command, allowing the script to adjust behavior based on the attempt number (e.g., exponential backoff or different logic per retry).
Use Case Examples
Database Cleanup on API Failure: An API daemon retries 3 times on connection errors. After exhaustion, run an `on_failure` script to flush caches or notify DevOps.
Benefit: Prevents stale data buildup without manual intervention.
Progressive Retry Logic: A batch job script uses the injected retry count (e.g., `PITCHFORK_RETRIED`) to increase delays (e.g., sleep longer on higher attempts).
Benefit: Smarter backoff without pitchfork handling complex logic.
Alerting on Critical Failures: After retries fail for a critical service, trigger an alert via the hook.
Benefit: Integrates with external monitoring without polling logs.
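To make the Progressive Retry Logic case concrete, here is a rough sketch of what a daemon's run script could do if pitchfork injected the retry count. It assumes the variable is named `PITCHFORK_RETRIED` as suggested in this proposal (pitchfork does not set it today), and the helper names `retry_attempt` and `delay_for`, the 1-second base, and the 60-second cap are all made up for illustration.

```python
import os


def retry_attempt(env=None):
    """Read the retry attempt from the environment.

    PITCHFORK_RETRIED is the variable name proposed in this discussion,
    not an existing pitchfork feature. Missing, non-numeric, or negative
    values are treated as 0 (first run, no retry yet).
    """
    env = os.environ if env is None else env
    raw = env.get("PITCHFORK_RETRIED", "0")
    try:
        return max(0, int(raw))
    except ValueError:
        return 0


def delay_for(attempt, base=1.0, cap=60.0):
    """Exponential backoff in seconds: 0 on the first run, then
    base, 2*base, 4*base, ... for attempts 1, 2, 3, ..., capped at `cap`.
    The base and cap defaults are illustrative assumptions.
    """
    if attempt <= 0:
        return 0.0
    return min(cap, base * 2 ** (attempt - 1))
```

The run script would call something like `time.sleep(delay_for(retry_attempt()))` before starting its real work, so the backoff policy lives in the script and pitchfork only has to export one number.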
What do you think? Are there other ways to handle this?