Ingress SDS missing event to push key/cert pair to proxy #18912
Comments
I vaguely recall this being a known bug that was fixed in one of the 1.3 patch releases. I highly recommend upgrading to 1.3.5, which has a number of high-severity fixes. @JimmyCYJ can verify whether this was fixed. |
Thanks! I couldn't find any reference to this in the release notes. @JimmyCYJ can you advise? |
Thanks @howardjohn! Yes, there is a bug in the Citadel agent in the 1.3.0 release. If your Citadel agent prints logs such as "warn secretFetcherLog unexpected server key/cert change in secret kyma-gateway-certs", that indicates the Citadel agent has hit this bug. The release notes are here: https://istio.io/news/2019/announcing-1.3.4/#minor-enhancements. Please upgrade to 1.3.5. |
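One way to check for the warning quoted above is to scan the agent's log output for that message. The snippet below is a minimal sketch and not part of Istio: the helper name and the first sample log line are made up for illustration, and only the warning text itself is taken from the comment above.

```python
# Substring of the warning quoted in the comment above.
# The secret name that follows it will differ per installation.
MARKER = "unexpected server key/cert change in secret"


def find_key_cert_warnings(log_lines):
    """Return the log lines that contain the Citadel agent warning."""
    return [line for line in log_lines if MARKER in line]


if __name__ == "__main__":
    # Hypothetical sample input; real lines would come from the agent's logs.
    sample = [
        "info sdsServiceLog some unrelated message",
        "warn secretFetcherLog unexpected server key/cert change in secret kyma-gateway-certs",
    ]
    for hit in find_key_cert_warnings(sample):
        print(hit)
```

An empty result only means this particular warning was not logged; as the following comments show, the symptom can appear without it.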
Thanks @JimmyCYJ. I'm not seeing logs like that anywhere in our setup. We are seeing these logs from our SDS container though:
Along with these:
Does it look like a different issue to you? |
I'm testing this against a cluster that's been upgraded to 1.3.5, but I find the original issue is very hard to reproduce. Is there something specific that I can exercise to hit the case above where SDS loses track of the connection? |
I'm closing this because @JimmyCYJ provided a suggestion to upgrade to 1.3.5. Please feel free to re-open if you can reproduce after upgrading to 1.3.5. |
I still see this issue with 1.3.5. Can we reopen this? @incfly I don't have perms to reopen the issue. Here's how I'm able to repro: Create a gateway that uses a secret like so:
Create the secret that the gateway uses:
Now delete the secret and gateway and create them again using the exact same values. This causes SDS to fail to push things over to Envoy. In 1.3.5 I see that even the root cert doesn't make it to Envoy so the TLS endpoint is completely broken:
|
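The reproduction steps in the comment above (create the gateway and secret, delete both, then recreate them with identical values) could be scripted roughly as follows. This is a sketch under assumptions: the manifest file names `gateway.yaml` and `secret.yaml` are hypothetical stand-ins for the gateway and secret definitions from that comment, and the helper functions are illustrative only. By default it only prints the `kubectl` commands instead of running them.

```python
import subprocess


def repro_commands(gateway_manifest, secret_manifest):
    """Build the kubectl calls for one create/delete/recreate cycle."""
    create = [
        ["kubectl", "apply", "-f", gateway_manifest],
        ["kubectl", "apply", "-f", secret_manifest],
    ]
    delete = [
        ["kubectl", "delete", "-f", secret_manifest],
        ["kubectl", "delete", "-f", gateway_manifest],
    ]
    # Create, delete, then create again using the exact same manifests.
    return create + delete + create


def run(commands, dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical file names; replace with the real manifests.
    run(repro_commands("gateway.yaml", "secret.yaml"))
```

Keeping the command list separate from its execution makes it easy to review the exact sequence before pointing it at a real cluster.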
I've also seen this issue with 1.3.5. |
I see the same with 1.4.0 (Kubernetes 1.16.2): istioctl version kubectl version While SIMPLE TLS for ingress works fine, MUTUAL TLS set up according to the task https://istio.io/docs/tasks/traffic-management/ingress/secure-ingress-sds/ behaves as described in this issue. |
@aaronjwood could you turn on debug logging in ingress-sds? You can do it via the /ctrlz endpoint. Please share a copy of the log, or send me a copy via the Slack channel. From the piece of log you shared, ingress-sds does not detect the secret. |
Okay I ran through this scenario again with cacheLog, sdsServiceLog, and secretFetcher debug logs turned on:
At a quick glance, the cleanup job that removes stalled clients seems to be the culprit. |
I should instead say that the recycle logic here
|
Also, for some context: the reason the logs show the secret not existing for some time (even in the scenarios that work correctly) is the behavior of our own software. We first program the gateway config into the cluster and then later (usually between 30 seconds and 1 minute) program the secret into the cluster. |
Saw an interesting scenario today:
The connection it's trying to use called
|
Hi, ingress sds: istio-proxy: It seems like the istio-proxy inside the ingress crashed and didn't recover (I looked for "info Envoy proxy is ready"). Can someone advise? Thanks, |
Note that SDS in 1.5 is not quite ready yet: #22443 |
@howardjohn should we reopen this ticket? @aaronjwood reported that the problem is still seen. |
I face the same problem in version 1.5. |
#24817 is the fix for the issue. |
@JimmyCYJ what release will this be targeted for? Will it be backported anywhere? |
It will be shipped with release-1.7, which will be announced on August 11th (https://github.com/istio/istio/wiki/Istio-Release-1.7). |
Will cherry-pick to 1.5 and 1.6 this week. |
Bug description
In rare cases SDS misses an event entirely, which causes only the root cert to be pushed and not the key/cert pair. Looking through the logs, we can see that the root cert gets pushed out once the secret is eventually found:
You'll see that no key/cert pair gets pushed. This causes TLS/mTLS gateways to break completely.
The way to recover from this is to delete the secret and add it back again. This results in the following logs:
Expected behavior
SDS should always push the required crypto material to the proxy.
Steps to reproduce the bug
Unknown, I've found this very hard to reproduce. It has only affected one user of ours AFAIK.
Version (include the output of istioctl version --remote and kubectl version)
How was Istio installed?
Imagine having a Makefile that does this where $ISTIO_VERSION is 1.3.0:
A kubectl apply is later done on the istio-crds.yaml and istio.yaml files.
Environment where bug was observed (cloud vendor, OS, etc)
EKS