[CON-457] add retry and exponential backoff to ensurePrimaryMiddleware
#4171
hahah this will make things much worse - sorry the card was not very clear, but getReplicaSetSpIdsByUserId internally makes retries with a fixed delay (check out that code)
the scope of this is to actually replace that retry logic with asyncRetry
Oh god it does have its own retry logic RIP - reverting and taking another crack at it
need to change values, otherwise lgtm
this is a critical path change (every user write) - good to ensure things seem ok locally and on staging
Description
This PR adds retry logic to ensurePrimaryMiddleware with an exponential backoff to reduce strain on discovery nodes.
Tests
No extra tests needed since each individual piece seems to be tested well enough
Monitoring - How will this change be monitored? Are there sufficient logs / alerts?
This change can be monitored through content node logs, to confirm that it actually retries, and more generally through discovery provider Grafana metrics, to see whether there is in fact less strain.
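For readers unfamiliar with the pattern discussed in this PR, here is a minimal sketch of retrying with exponential backoff instead of a fixed delay. It is not the code from this PR: `backoffDelayMs`, `retryWithBackoff`, and the default values are hypothetical, and the option names (`retries`, `factor`, `minTimeout`, `maxTimeout`) are only chosen to resemble the ones the async-retry package accepts.

```javascript
// Illustrative sketch only; function names and defaults are assumptions,
// not taken from this PR or from the content node codebase.

// Delay before retry number `attempt` (0-based):
// minTimeout * factor^attempt, capped at maxTimeout.
function backoffDelayMs(
  attempt,
  { minTimeout = 1000, factor = 2, maxTimeout = 30000 } = {}
) {
  return Math.min(minTimeout * Math.pow(factor, attempt), maxTimeout)
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))

// Calls `fn` until it resolves or `retries` retries have been used up.
// `sleep` is injectable so the waiting can be stubbed out in tests.
async function retryWithBackoff(
  fn,
  { retries = 3, sleep = defaultSleep, ...backoff } = {}
) {
  let lastError
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn(attempt)
    } catch (err) {
      lastError = err
      // Only wait if another attempt is coming.
      if (attempt < retries) {
        await sleep(backoffDelayMs(attempt, backoff))
      }
    }
  }
  throw lastError
}
```

With these assumed defaults, a call that keeps failing waits 1000 ms, 2000 ms, and 4000 ms between its four attempts, so repeated failures put progressively less load on the service being queried than a fixed delay would.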