Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default retry logic does not match PagerDuty documentation. #115

Closed
vectro opened this issue Jun 26, 2023 · 3 comments
Closed

Default retry logic does not match PagerDuty documentation. #115

vectro opened this issue Jun 26, 2023 · 3 comments

Comments

@vectro
Copy link

vectro commented Jun 26, 2023

From pdpyras.py:

The default behavior is to retry infinitely on a 429, and return the
response in any other case (assuming a HTTP response was received from the
server).

However, according to the API documentation, 5xx errors should also be retried.

@Deconstrained
Copy link
Collaborator

In my opinion we shouldn't make this the default, even for the Events API. If the end goal is to minimize dropped events in the event that the PagerDuty Events API isn't responding or is responding with 5XX, implementing a cache/queueing system for alerts (for instance pagerduty-agent uses the local filesystem) is more effective.

I did once a few years ago see an edge case involving extremely high frequency of POST requests to the users endpoint that led to database lock contention and some 5xx responses. That's what led to me making configurable retry logic so that this edge case could be more easily handled, i.e.

session.retry[500] = 2 # internal server error, 3 requests total
session.retry[502] = 4 # bad gateway, 5 requests total
session.retry[503] = 6 # service unavailable, 7 requests total

@vectro
Copy link
Author

vectro commented Jun 27, 2023

Seems odd to me that the official PD documentation says to treat 429 and 5xx responses the same while the official PD client library does not do so (by default); but if that's PagerDuty's position I'm not going to argue about it.

@Deconstrained
Copy link
Collaborator

I've given this more thought and you have a good point. It would be keeping in the client's scope of abstracting the most important aspects of API access described in the documentation to codify this expectation.

Probably the best way to implement this is in a minor release with something like the above three lines in the constructor of the events and change events API client classes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants