hostPort iptables rule is lost after node restarts #17464
Comments
|
@knobunc Any updates? I can also help by providing an environment that can reproduce this issue. |
|
Any chance the priority of this issue can be bumped, please? It happens on 1.5/3.6/3.7 when the origin-node service is restarted. @smarterclayton, are you aware of any Kubernetes bug related to this issue? I couldn't find anything. |
|
@DanyC97 I think the information provided here is sufficient to reproduce this issue. It seems to me that the priority of this issue is low. |
|
@sdodson any chance you can help with increasing the priority of this issue, please? |
|
I guess I had no luck grabbing anyone's attention, no matter how much I tried... oh well. |
|
Investigated and was able to reproduce locally (at least a variant of the issue) using the nginx daemonset and restarting docker. Analysis:
It's currently unclear what should be done about this; it's a completely upstream problem. We've fixed a number of upstream issues with PLEG status racing with SyncPod in the past. |
|
One question for the reporter: does the container ever become ready and start but without the hostport rules? Or does the container never become ready? |
|
@dcbw Thanks for your investigation. Regarding your question, I need time to confirm. I will follow up later. |
|
@dcbw in my case it is the former situation: pod up, no iptables rule. |
|
Do either of you see a line like this in your openshift-node process logs? (e.g. journalctl -b -u openshift-node)
|
|
@dcbw yes, I do. |
|
@dcbw any luck, or do you need more info from my side? |
|
@dcbw sorry to keep nudging you; any chance you can spare some time to get to the bottom of this, please? (I know this might be in your spare time, so the extra effort will be much appreciated.) |
|
@danwinship @smarterclayton @dcbw @liggitt I don't know who else to tag. I'm screaming for help here, please: can one of you keep looking into this issue? It hurts me a lot running Origin internally. I'm very surprised that this issue doesn't get much attention; I wonder whether it also happens in OCP? This is a very common scenario that can trigger a disaster in production, since there is currently no solution that monitors the iptables rules to check whether they are still present. |
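For anyone who needs a stopgap while this is open, here is a minimal sketch of such a check. It is not part of OpenShift or Kubernetes. It assumes the hostport rules appear in `iptables-save -t nat` output as rules that match `--dport <port>` and belong to, or jump to, a chain whose name starts with `KUBE-HP-`; that prefix, the function names, and the sample dump are my own assumptions for illustration, so adjust them to what you actually see on your nodes.

```python
import shlex
import subprocess


def hostport_rule_present(nat_dump, port, chain_prefix="KUBE-HP-"):
    """Return True if the dump contains an appended rule that matches
    --dport <port> and whose chain or jump target starts with chain_prefix.

    chain_prefix is an assumed naming convention, not read from kubelet."""
    for line in nat_dump.splitlines():
        line = line.strip()
        if not line.startswith("-A "):
            continue  # skip table headers, chain declarations, comments
        tokens = shlex.split(line)  # handles the quoted --comment value
        if "--dport" not in tokens[:-1]:
            continue
        if tokens[tokens.index("--dport") + 1] != str(port):
            continue
        chain = tokens[1]
        target = tokens[tokens.index("-j") + 1] if "-j" in tokens[:-1] else ""
        if chain.startswith(chain_prefix) or target.startswith(chain_prefix):
            return True
    return False


def dump_nat_table():
    """Read the live nat table; needs root and the iptables-save binary."""
    result = subprocess.run(
        ["iptables-save", "-t", "nat"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


# Hand-written sample for illustration only, not captured from a real node.
SAMPLE = """*nat
:KUBE-HOSTPORTS - [0:0]
:KUBE-HP-EXAMPLE - [0:0]
-A KUBE-HOSTPORTS -p tcp -m comment --comment "nginx hostport 8080" -m tcp --dport 8080 -j KUBE-HP-EXAMPLE
-A KUBE-HP-EXAMPLE -p tcp -m tcp -j DNAT --to-destination 10.128.0.5:80
COMMIT
"""

if __name__ == "__main__":
    print(hostport_rule_present(SAMPLE, 8080))
```

On a node you would call `hostport_rule_present(dump_nat_table(), <your hostPort>)` from a cron job or a monitoring probe and alert when it returns False. This only detects the missing rule; it does not restore it.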
|
@DanyC97 does the node eventually settle down and start the container, or does it never happen? Also, if you can run openshift-node with --loglevel=5, reproduce the issue, and then mail "journalctl -b -u openshift-node" (might be called "atomic-openshift-node" too) to me at [dcbw at redhat dot com] I can analyze and see if your issue is the same as the one I've diagnosed. |
|
@dcbw I'll email all the info to you next week with a reproducible test case. Thanks a bunch! |
|
This appears to be an issue in 3.9 as well and I am seeing it with the prometheus daemonset. |
|
@dcbw FYI, I just sent you an email; again, sorry for the very long delay. |
|
Hi, what is the status of this? |
|
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting /remove-lifecycle stale. If this issue is safe to close now please do so with /close. /lifecycle stale |
|
Stale issues rot after 30d of inactivity. Mark the issue as fresh by commenting /remove-lifecycle rotten. If this issue is safe to close now please do so with /close. /lifecycle rotten |
|
Rotten issues close after 30d of inactivity. Reopen the issue by commenting /reopen. /close |
|
@openshift-bot: Closing this issue. |
The iptables rule for a hostPort mapped by a DaemonSet disappears after the node (or just docker) restarts.
I am not sure whether the problem is still present in the latest version.
Version
OpenShift origin v3.6.1+008f2d5
Steps To Reproduce
KUBE-HP-* chain of nat table.
Current Result
After several minutes, the hostPort on that node becomes unreachable and the iptables rule in the KUBE-HP-* chain disappears.
Expected Result
The hostPort is mapped to the new Pod.
Additional Information