xds: fix issue when skipping first request #39916
The bug (example failure: https://prow.istio.io/view/gs/istio-prow/pr-logs/pull/istio_istio/39896/integ-cni_istio/1546984256612339712)
* ingressgateway has 1 Gateway
* Gateway is removed
* envoy disconnects
* Envoy reconnects, requests RDS. This hits the INIT/RECONNECT flow.
* RDS hits 'Gateway missing for route' path and gives no response at all
* Next RDS request, we get a "stale nonce" since we have no previously sent nonce
* Envoy stuck forever

The fix:
* Remove code path to return empty route instead of no route (matching other paths)
* Add assertions to ensure that we don't send empty response to requests and that we never count a "stale nonce" if we somehow have no previously sent nonce, to ensure there aren't any other issues
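To make the failure mode concrete, here is a minimal Go sketch of the two ideas in the fix. It is not the code from this PR: `watchedResource`, `request`, `classify`, `routeConfig`, and `buildRoute` are hypothetical, simplified stand-ins, and the route name in `main` is made up. The sketch assumes a request can only be "stale" relative to a nonce that was actually sent, and that a missing gateway should yield an empty named route rather than no response.

```go
package main

import "fmt"

// watchedResource is a hypothetical, simplified stand-in for the per-type
// state a control plane keeps for one xDS connection.
type watchedResource struct {
	NonceSent string // nonce of the last response sent; empty if none was sent
}

// request is a hypothetical, trimmed-down discovery request.
type request struct {
	ResponseNonce string
}

type decision string

const (
	respond     decision = "respond"
	ignoreStale decision = "ignore-stale"
	ack         decision = "ack"
)

// classify decides how to treat a request. A request is only considered
// stale when a nonce was previously sent and the request does not match it.
func classify(w *watchedResource, req request) decision {
	if w == nil || w.NonceSent == "" {
		// Nothing was ever sent on this connection: treat the request as
		// init/reconnect and respond, never as a stale nonce.
		return respond
	}
	if req.ResponseNonce != w.NonceSent {
		return ignoreStale
	}
	return ack
}

// routeConfig is a hypothetical stand-in for an RDS route configuration.
type routeConfig struct {
	Name         string
	VirtualHosts []string
}

// buildRoute returns an empty, named route when no gateway backs routeName,
// so the proxy always gets a response to its request.
func buildRoute(routeName string, gateways map[string][]string) *routeConfig {
	hosts, ok := gateways[routeName]
	if !ok {
		return &routeConfig{Name: routeName, VirtualHosts: []string{}}
	}
	return &routeConfig{Name: routeName, VirtualHosts: hosts}
}

func main() {
	// First request was never answered, so NonceSent is still empty.
	w := &watchedResource{}
	fmt.Println(classify(w, request{ResponseNonce: "old"}))

	// The gateway was removed: still produce an (empty) route for the name.
	r := buildRoute("http.8080", nil)
	fmt.Println(r.Name, len(r.VirtualHosts))
}
```

In this model, the "Envoy stuck forever" state corresponds to `classify` returning `ignoreStale` while `NonceSent` is empty, which the first branch rules out.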
🤔 🐛 You appear to be fixing a bug in Go code, yet your PR doesn't include updates to any test files. Did you forget to add a test? Courtesy of your friendly test nag. |
/test integ-assertion_istio |
/retest |
This flag requires the control plane to also have support, and it's not enabled. This also breaks a (test only) assertion added in istio#39916.
When met |
No, the issue happens because the listener is still present for 45s (draining) after removal of the Gateway.
/test integ-assertion_istio (assertion failed due to an unrelated issue, fixed in another PR) |
I see. Should we fix Envoy so that it does not send RDS requests for a listener in the draining state? |
It seems reasonable to me for Envoy to do that; otherwise it cannot serve traffic on the draining listener appropriately. We only set drain to 45s, but some may set it to hours, where it makes even more sense. |
LGTM. Would be good if we can add a test case that covers this. But if you want to add later, I am OK to approve this. Let me know. |
Will follow up with more tests; right now I want to make sure we can unblock all the failing e2e tests.
|
/retest |
@howardjohn When we hit this case, is it right that we push an empty RDS response? istio/pilot/pkg/networking/core/v1alpha3/gateway.go Lines 354 to 360 in ecb615d
|
It's hard to say without more info. What did you actually do in terms of creating/deleting GW/VS? |