Multiple runs of istio-init container leads to crashloopbackoff status #42792
Comments
Appears to be erroring when attempting to recreate a chain that already exists. Ideally, you shouldn't be deleting containers externally, which is why kubelet is rescheduling the init-container. There's a problem here as iptables is not declarative: there's no way to say "the rules should look like X" and only change what needs changing. But if we flush and rebuild, we risk breaking existing or new connections while the rules are being rebuilt.
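To make the risk concrete, a small illustration (ISTIO_OUTPUT and ISTIO_REDIRECT are the usual istio-init chain names, but the rules here are simplified for the example): between flushing a chain and re-appending its rules there is a window where the nat table has no redirect rule, so connections made in that window bypass the sidecar.

# Flush-and-rebuild window (illustrative, simplified rules):
iptables -t nat -F ISTIO_OUTPUT                      # rules gone here...
iptables -t nat -A ISTIO_OUTPUT -j ISTIO_REDIRECT    # ...until re-added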
Definitely cannot flush them. If someone has a simple proposal to make it work without flushing I would be open to it. Maybe split the "create chain" and "apply rules" logic? What happens when it fails? Do we apply no rules, all the valid rules, or all the valid rules before the error?
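A rough sketch of what that split could look like at the iptables level (illustrative shell, not the actual istio-init code; the chain names and the 15001 redirect port are just the usual Istio ones): phase 1 creates chains and treats "already exists" as success, phase 2 checks each rule with -C before appending it, so a re-run changes nothing and nothing is ever flushed.

# Phase 1: "create chain" -- idempotent, because a failing -N
# ("Chain already exists") is ignored on a re-run.
for chain in ISTIO_INBOUND ISTIO_OUTPUT ISTIO_REDIRECT; do
  iptables -t nat -N "$chain" 2>/dev/null || true
done

# Phase 2: "apply rules" -- -C tests whether a rule is already
# present; -A only runs when it is not.
rule="-p tcp -j REDIRECT --to-ports 15001"
iptables -t nat -C ISTIO_REDIRECT $rule 2>/dev/null ||
  iptables -t nat -A ISTIO_REDIRECT $rule

As for the failure question: with this shape, a failure midway through phase 2 leaves all the valid rules before the error applied, so it is still not atomic and the retry semantics would need to be pinned down.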
I believe at present it hits the failure to create the chains and stops. There's a config to not use the restore method, but the other path relies on RunOrFail, which would also stop applying rules as soon as it fails. I can take a look at this; I don't think there's an easy way around the restore method, but RunOrFail should be doable.
In that case my idea may be somewhat problematic. In k8s I'd imagine the rules applied never change, but on VMs anything could happen and we may get into weird inconsistent states? Although I guess that can happen today as well.
> you shouldn't be deleting containers externally, which is why kubelet is rescheduling the init-container.

That's true. We already have a task in the pipeline to stop doing that. Irrespective of that, I wanted to kick off a discussion around whether the init container is idempotent and whether it's straightforward enough to make it so.

> but if we flush and rebuild, we risk breaking existing or new connections while the rules are being rebuilt.

Oh, we definitely shouldn't be doing that. In fact, #16768 got us away from this deletion + re-creation combination as part of the init run, so we wouldn't want to go back there. There was a proposal at #18159 to add a step that checks whether iptables rules are already present and, if they are, just exits. I like option #3 proposed there, and @howardjohn you seem to be a part of that conversation too. We backed out of the change, but if the reason for backing out is no longer true (I am not sure what it was), can we revive the same solution? (I'm not sure what has changed with istio-init and the CNI plugin since then and whether the same solution would still help.)
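For reference, a minimal sketch of a check-and-exit guard along the lines of what #18159 discussed (the chain checked and the message are illustrative; the actual proposal may have tested something else):

# If the top-level Istio chain already exists in the nat table, assume
# a previous istio-init run succeeded and exit 0, so that a re-run is a
# no-op instead of ending in CrashLoopBackOff.
if iptables -t nat -L ISTIO_OUTPUT -n >/dev/null 2>&1; then
  echo "iptables rules already applied, exiting"
  exit 0
fi
# ...otherwise fall through to the normal rule installation...

The obvious hazard with such a guard is a partially applied state: if a previous run died halfway, the chain exists but the rules are incomplete, and the guard would wrongly skip the rest.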
Maybe we can see what k8s does; they have much more dynamic usage of iptables.
@dhawton have you already decided how we want to tackle this (assuming we want to tackle it)? Let me know if you need my bandwidth for any of the implementation or testing.
🚧 This issue or pull request has been closed due to not having had activity from an Istio team member since 2023-01-13. If you feel this issue or pull request deserves attention, please reopen the issue. Please see this wiki page for more information. Thank you for your contributions. Created by the issue and PR lifecycle manager.
I also ran into this issue.

$ istioctl version
client version: 1.17.1
control plane version: 1.17.1
data plane version: 1.17.1 (352 proxies)

$ kubectl version
Client Version: v1.29.3
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.4-gke.1447000
Bug Description

We are testing Istio in a small test setup before expanding it into our production cluster. With sidecar injection, the istio-init container in a pod completes just fine the first time, setting up all the iptables rules, and the sidecar proxy works fine thereafter too.

However, for reasons that we are aware of, the istio-init containers re-run without the pod restarting or anything. We run docker system prune -f as a cron job, which removes the exited istio-init container, and that tips off the kubelet to start the init container again. This situation has been discussed at length in the past, e.g. in kubernetes/kubernetes#67261. Note: even though the container and the pod show Init:CrashLoopBackOff status after that, the app container and proxy container keep working fine, but it's throwing us off.

I was under the impression that istio-init runs are idempotent, and a comment from around 3 years back indicates the same: #18159 (comment). Is that not the case anymore? We are still running iptables with the --noflush flag, which makes this whole thing idempotent as per that comment. All the first runs of the istio-init container across pods complete just fine, but every run from the second onward fails. Here is the log of the istio-init container from a failed run:
On our side, we can enhance the docker system prune -f job to not prune istio-init containers, or even get rid of this task entirely, but I believe even k8s suggests making init containers idempotent (from https://kubernetes.io/docs/concepts/workloads/pods/init-containers/#detailed-behavior).
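For example (an untested sketch; kubelet-managed Docker containers carry io.kubernetes.* labels, but the exact label and value should be verified locally), the cron job could exclude istio-init containers with a label filter:

$ docker container prune -f --filter 'label!=io.kubernetes.container.name=istio-init'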
Can I get some help/direction here?
Version
Additional Information
No response