Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

router: Add RETRY_ON_RESET retry policy #7505

Merged
merged 2 commits into from
Jul 15, 2019

Conversation

matthoey-okta
Copy link
Contributor

@matthoey-okta matthoey-okta commented Jul 9, 2019

Signed-off-by: Matt Hoey matt.hoey@okta.com

For an explanation of how to fill out the fields, please see the relevant section
in PULL_REQUESTS.md

Description: Adds a retry policy for resets. Not a developer, let alone a c++ dev but gave it my best shot. Had a few questions:

  • Should this have a different, more descriptive, name? Perhaps connect-reset
  • Should I add something to codes.h? 503?
  • If so, I'm guessing I should add it to wouldRetryFromHeaders?
  • Didn't know if I should have deprecated gateway-error and 5xx for resets?
  • The code I felt okay on, but testing I was definitely lost. I'm guessing I have no coverage?

Risk Level: Low
Testing: I added one. I can try to do a build in the morning. Honestly I thought we were experiencing this problem, but I found out after I wrote it that we don't have the issue so I'm unsure how to test.
Docs Changes: Pretty much anywhere I found gateway error I added reset documentation. Open to feedback on it.
Release Notes: n/a
Fixes #6726
[Optional Deprecated:]

Signed-off-by: Matt Hoey <matt.hoey@okta.com>
@matthoey-okta
Copy link
Contributor Author

/retest

@repokitteh-read-only
Copy link

🤷‍♀️ nothing to rebuild.

🐱

Caused by: a #7505 (comment) was created by @matthoey-okta.

see: more, trace.

Copy link
Contributor

@snowp snowp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this seems good to me modulo the one doc comment. Coverage seems to be good according to the generated coverage report, so I don't think additional testing is required.

Re deprecation of the old policies I'm not sure if we have a good deprecation path for them, so I suspect we might need to keep them around for the foreseeable future to avoid breaking existing users. @alyssawilk @mattklein123 WDYT?

@@ -79,6 +79,9 @@ gateway-error
This policy is similar to the *5xx* policy but will only retry requests that result in a 502, 503,
or 504.

reset
Envoy will attempt a retry if the upstream server does not respond due to a reset.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The wording here is a bit off I think: the upstream might have started responding but trigger before it finishes (for example, the upstream server could crash mid-response). Maybe reuse the wording from the 5xx retry policy?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@snowp Which part of the wording for 5xx exactly? This doesn't trigger if any 5xx response code comes back, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i mean the language about resets, the does not respond at all (disconnect/reset/read timeout) bit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, gotcha. Updated.

@mattklein123
Copy link
Member

@alyssawilk @mattklein123 WDYT?

IMO it's not worth the user churn and I would probably just leave the existing functionality.

@snowp snowp added the waiting label Jul 10, 2019
Copy link
Contributor Author

@matthoey-okta matthoey-okta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm heading to Europe for a few weeks and won't be able to work on this. If I can get the change to the wording (or if someone else can make the change on my behalf, I'm good with that too,) I can make it in a few hours or I can pick this up when I get back?

@@ -79,6 +79,9 @@ gateway-error
This policy is similar to the *5xx* policy but will only retry requests that result in a 502, 503,
or 504.

reset
Envoy will attempt a retry if the upstream server does not respond due to a reset.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@snowp Which part of the wording for 5xx exactly? This doesn't trigger if any 5xx response code comes back, right?

Signed-off-by: Matt Hoey <matt.hoey@okta.com>
Copy link
Contributor

@snowp snowp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM.

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@matthoey-okta
Copy link
Contributor Author

You all left the blueprint. Happy to have contributed to one of my favorite projects.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Retry not firing as expected, even with retriable_status_codes
3 participants