New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revert "conntrack e2e test debug connections" #94800
Conversation
/assign @wojtek-t |
@@ -328,25 +297,3 @@ func dumpConntrack(cs clientset.Interface) { | |||
} | |||
} | |||
} | |||
|
|||
func dumpIptables(cs clientset.Interface) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we revert only the iptables part?
This is problematic because we log it for all nodes (5k), but logging conntrack should be fine as we log that for two nodes...
@aojea - FYI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah sure, , sorry,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wojtek-t, I'm not sure if I understand your comment: I think both dumpIptables and dumpConntrack iterates over all nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry - I think this has changed in the meantime, I think we were dumping only particular nodes in the past.
@aojea - I'm going to approve it to fix scalability tests; we should re-revert by dumping iptables/conntrack only for nodes that are interested for us. /lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mborsz, wojtek-t The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/lgtm |
/retest |
@danehans this is breaking scsalability tests, we need milestone here please |
Added milestone. |
/retest |
1 similar comment
/retest |
After revert kubernetes/kubernetes#94800, the test takes ~12-15GiB now, so 48GiB should be really safe.
Reverts #92643
After this PR has been merged, gce-master-scale-correctness started OOMing: https://screenshot.googleplex.com/773vvp4MTsfMxv6
I'm not 100% it's caused by this change, but it seems to be likely given that this PR adds dumping iptables-save for each node pair + submission time correlates with the time when OOMs has started.