Move iptables logging in kubeproxy from Errorf to V(2).Infof #48085
Conversation
/lgtm
@gmarek We can try turning on fluentd again with this change.
If it's not that important, I'd do it at v3.
We can, but currently quite a few tests (and normal kube-up setups) run at v2, and having the iptables rules printed might be useful there for debugging. If need be, we can increase it in the future, IMO.
```diff
@@ -1571,7 +1571,8 @@ func (proxier *Proxier) syncProxyRules() {
 	glog.V(5).Infof("Restoring iptables rules: %s", proxier.iptablesData.Bytes())
 	err = proxier.iptables.RestoreAll(proxier.iptablesData.Bytes(), utiliptables.NoFlushTables, utiliptables.RestoreCounters)
 	if err != nil {
-		glog.Errorf("Failed to execute iptables-restore: %v\nRules:\n%s", err, proxier.iptablesData.Bytes())
+		glog.Errorf("Failed to execute iptables-restore: %v", err)
+		glog.V(2).Infof("Rules:\n%s", proxier.iptablesData.Bytes())
```
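For context, glog gates `V(n)` messages on the process's `-v` flag: `Errorf` always emits, while `V(2).Infof` is a no-op unless the process runs with `-v=2` or higher. A minimal standalone sketch of that gating (the payload and error string below are illustrative, not kube-proxy's actual values):

```go
package main

import (
	"flag"

	"github.com/golang/glog"
)

func main() {
	// glog registers the -v and -logtostderr flags; try:
	//   go run main.go -logtostderr        -> error line only
	//   go run main.go -logtostderr -v=2   -> error line plus rule dump
	flag.Parse()
	defer glog.Flush()

	rules := []byte("*filter\n-A FORWARD -j ACCEPT\nCOMMIT\n") // illustrative payload

	// Emitted at any verbosity, like the Errorf kept by this PR.
	glog.Errorf("Failed to execute iptables-restore: %v", "exit status 4")

	// Emitted only when -v >= 2, so the bulky rule dump becomes opt-in.
	glog.V(2).Infof("Rules:\n%s", rules)
}
```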
This is an error... it shouldn't be happening at all. Why is it happening so much that it causes problems?
ref: #48052 (comment)
Please don't merge this PR without also filing an issue about the actual bug (i.e., that apparently there are locking timeout issues causing some kube-proxy iptables updates to get dropped).
Filed #48107. Thanks.
SGTM
/lgtm I am OK to cherry-pick this one to 1.7 to help the scalability tests. But we need a real solution for production. :-)
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: bowei, dchen1107, shyamjvs
Associated issue: 48052
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing
Sounds good. Thanks @dchen1107!
Automatic merge from submit-queue (batch tested with PRs 44058, 48085, 48077, 48076, 47823)
@dchen1107 - I completely agree. The thing is that it's a complex problem with multiple actors involved, so it would require a very wide effort, which was never prioritized highly enough. There are at least two problems:
…#47986-#47152-#47860-#47945-#47961-#47986-#47993-#48012-#48085-upstream-release-1.7
Automatic merge from submit-queue
Automated cherry pick of #47986 #47152 #47860 #47945 #47961 #47986 #47993 #48012 #48085 on release-1.7.
#47986: Change service port to avoid collision
#47152: Kubelet doesn't override addrs from Cloud provider
#47860: Make fluentd log to stdio instead of a dedicated file
#47945: add level for print flags
#47961: Bumped Heapster to v1.4.0-beta.0
#47986: Change service port to avoid collision
#47993: Use a different env var to enable the ip-masq-agent addon. We
#48012: Extending timeout waiting for delete node to become ready
#48085: Move iptables logging in kubeproxy from Errorf to V(2).Infof
Commit found in the "release-1.7" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error, find help to get your PR picked.
Automatic merge from submit-queue

Log abridged set of rules at v2 in kube-proxy on error

**What this PR does / why we need it**: this is a follow-on to #48085

**Special notes for your reviewer**: we hit this in operations, where we typically run at v2, and would like to log an abridged set of output rather than the full output.

**Release note**:
```release-note
NONE
```
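A minimal sketch of what that abridging could look like; the helper name, byte limit, and truncation marker below are assumptions for illustration, not the follow-on PR's actual code:

```go
// abridge caps a rule dump at maxBytes so a v2 log line stays bounded
// even when the restore payload is large.
func abridge(rules []byte, maxBytes int) []byte {
	if len(rules) <= maxBytes {
		return rules
	}
	// Copy the prefix and mark the cut so readers know output was trimmed.
	out := append([]byte{}, rules[:maxBytes]...)
	return append(out, "...(truncated)"...)
}
```

At the call site from the diff above, this would read something like `glog.V(2).Infof("Rules (abridged):\n%s", abridge(proxier.iptablesData.Bytes(), 4096))`.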
Fixes #48052
This will stop fluentd from OOM'ing on kube-proxy's log output in reasonably large clusters with services. You'll still get the iptables rules printed on setups which run at >= v2, but we can at least opt out.
@bowei Does this look reasonable?
cc @kubernetes/sig-network-misc