Fix hostport duplicate chain names #55153

Merged
merged 3 commits into from
Nov 15, 2017

Contributor

@chenchun chenchun commented Nov 6, 2017

Fixes a bad conversion from int32 to string. Without this patch, getHostportChain/hostportChainName generate the same chain name for ports 57119/55429/56833 of the same pod.

closes #55771

Fix a bad conversion in the hostport chain name generating function that left some host ports unreachable.
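For readers who want to see the collision, here is a minimal Go sketch. In Go, converting an integer to a string yields the UTF-8 encoding of that code point, not its decimal digits. The three ports above all fall in the surrogate range 0xD800-0xDFFF, so each converts to "\uFFFD" and the hashed input is identical. The `KUBE-HP-` prefix matches the chain names quoted later in this thread; the helper names, the base32 encoding, and the 16-character truncation are assumptions made for illustration, not the kubelet code itself.

```go
package main

import (
	"crypto/sha256"
	"encoding/base32"
	"fmt"
	"strconv"
)

// chainPrefix and the 16-character truncation are assumptions for this sketch.
const chainPrefix = "KUBE-HP-"

// chainName hashes the key and keeps the first 16 base32 characters,
// which keeps the result well under the 28-character iptables limit.
func chainName(key string) string {
	hash := sha256.Sum256([]byte(key))
	encoded := base32.StdEncoding.EncodeToString(hash[:])
	return chainPrefix + encoded[:16]
}

// buggyKey mirrors string(pm.HostPort): the int32 is read as a code point.
// rune is an alias for int32, so the conversion behaves identically.
func buggyKey(port int32, protocol, pod string) string {
	return string(rune(port)) + protocol + pod
}

// fixedKey mirrors strconv.Itoa(int(pm.HostPort)): decimal digits.
func fixedKey(port int32, protocol, pod string) string {
	return strconv.Itoa(int(port)) + protocol + pod
}

func main() {
	for _, port := range []int32{57119, 55429, 56833} {
		fmt.Printf("%d buggy=%s fixed=%s\n", port,
			chainName(buggyKey(port, "tcp", "pod1_ns1")),
			chainName(fixedKey(port, "tcp", "pod1_ns1")))
	}
}
```

Running it should print the same buggy name on all three lines and three different fixed names.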

@k8s-ci-robot k8s-ci-robot added do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 6, 2017
@MrHohn
Member

MrHohn commented Nov 6, 2017

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Nov 6, 2017
@@ -198,3 +198,16 @@ func TestHostportManager(t *testing.T) {
assert.EqualValues(t, true, port.closed)
}
}

func TestGetHostportChain(t *testing.T) {
m := make(map[string]int)
Member

nit: what about map[string]bool?
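As a rough illustration of the suggestion, a `map[string]bool` works as a set when the test only needs to know whether a name was seen before. The helper name and the chain names below are hypothetical, not taken from the PR.

```go
package main

import "fmt"

// firstDuplicate returns the first name that appears twice,
// using a map[string]bool as a set of names already seen.
func firstDuplicate(names []string) (string, bool) {
	seen := make(map[string]bool, len(names))
	for _, n := range names {
		if seen[n] {
			return n, true
		}
		seen[n] = true
	}
	return "", false
}

func main() {
	// Hypothetical chain names, for illustration only.
	names := []string{"KUBE-HP-AAAA", "KUBE-HP-BBBB", "KUBE-HP-AAAA"}
	dup, found := firstDuplicate(names)
	fmt.Println(dup, found)
}
```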

@@ -247,7 +248,7 @@ func (hm *hostportManager) closeHostports(hostportMappings []*PortMapping) error
// WARNING: Please do not change this function. Otherwise, HostportManager may not be able to
// identify existing iptables chains.
Member

Does that mean that by changing the chain name we are orphaning the old chains/rules? Is that acceptable, or is there any way to work around it?

@kubernetes/sig-network-pr-reviews

Contributor Author

Added a commit to clean up these old chains/rules as well.

Member

In fact, thinking about this again, I wonder why changing this function would be an issue. Can't we assume that upon node/kubelet upgrade no iptables rules will be retained, because the node has been restarted? Am I suggesting an unnecessary cleanup? It would be great to have guidance from @thockin and @freehan.

Contributor Author

Can't we assume that upon node/kubelet upgrade no iptables rules will be retained, because the node has been restarted?

😕 Is this true? I don't think everyone restarts the node upon upgrade. Is this a required step of upgrading to the next release or something?

Member

At least that is what happens with gce/upgrade.sh :)

I'm not aware of any supported per-system-component upgrade mechanism in k8s yet; not sure whether anyone already supports that...

Member

I think this is better. It's low cost.

@k8s-ci-robot k8s-ci-robot added the sig/network Categorizes an issue or PR as relevant to SIG Network. label Nov 6, 2017
@chenchun
Contributor Author

chenchun commented Nov 8, 2017

/retest

Member

@MrHohn MrHohn left a comment

Thanks, seems to be a viable solution. Some comments about testing.

}

// TODO remove this, please refer https://github.com/kubernetes/kubernetes/pull/55153
func getBugyHostportChain(id string, pm *PortMapping) utiliptables.Chain {
Member

Might be worth also adding a one-line description of the issue?

"-A KUBE-HP-63UPIDJXVRSZGSUZ -m comment --comment \"pod1_ns1 hostport 8081\" -s 10.1.1.2/32 -j KUBE-MARK-MASQ": true,
"-A KUBE-HP-63UPIDJXVRSZGSUZ -m comment --comment \"pod1_ns1 hostport 8081\" -m udp -p udp -j DNAT --to-destination 10.1.1.2:81": true,
"-A KUBE-HP-WFBOALXEP42XEMJK -m comment --comment \"pod3_ns1 hostport 8443\" -s 10.1.1.4/32 -j KUBE-MARK-MASQ": true,
"-A KUBE-HP-WFBOALXEP42XEMJK -m comment --comment \"pod3_ns1 hostport 8443\" -m tcp -p tcp -j DNAT --to-destination 10.1.1.4:443": true,
Member

Could we validate the cleanup logic in a unit test so that we have more confidence?

@@ -142,7 +143,7 @@ func writeLine(buf *bytes.Buffer, words ...string) {
// this because IPTables Chain Names must be <= 28 chars long, and the longer
// they are the harder they are to read.
func hostportChainName(pm *PortMapping, podFullName string) utiliptables.Chain {
- hash := sha256.Sum256([]byte(string(pm.HostPort) + string(pm.Protocol) + podFullName))
+ hash := sha256.Sum256([]byte(strconv.Itoa(int(pm.HostPort)) + string(pm.Protocol) + podFullName))
Member

@MrHohn MrHohn Nov 9, 2017

For hostport_syncer, was the cleanup logic baked in originally? It would be great to verify it in a unit test as well.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Nov 9, 2017
@chenchun
Contributor Author

chenchun commented Nov 9, 2017

@MrHohn Addressed all your comments, PTAL

assert.True(t, ok)
// check KUBE-HOSTPORTS chain should be cleaned up
hostportChain, ok := natTable.chains["KUBE-HOSTPORTS"]
assert.True(t, ok, "%s %v", string(hostportChain.name))
Member

nit:

assert.True(t, ok, "%s", string(hostportChain.name))

assert.True(t, ok)
// check pod1's rules in KUBE-HOSTPORTS chain should be cleaned up
hostportChain, ok := natTable.chains["KUBE-HOSTPORTS"]
assert.True(t, ok, "%s %v", string(hostportChain.name))
Member

nit:

assert.True(t, ok, "%s", string(hostportChain.name))

@@ -274,6 +275,11 @@ func (f *fakeIPTables) restore(restoreTableName utiliptables.Table, data []byte,
}
}
_, _ = f.ensureChain(tableName, chainName)
if !strings.Contains(allLines, "-X "+string(chainName)) {
if err := f.FlushChain(tableName, chainName); err != nil {
Member

Sorry, I don't quite get the logic here: why should we flush the chain when there is no -X CHAIN_NAME? An explanation would be appreciated.

Contributor Author

I think I should update this to flush only user-defined chains.
The --noflush option for iptables-restore only applies to builtin chains; it doesn't work for user-defined chains such as TESTCHAIN. https://unix.stackexchange.com/questions/134687/how-to-combine-iptables-rulesets

I did a test to confirm this. In the example below, the POSTROUTING chain didn't get flushed, but the KUBE-NODEPORT chain did.

[root@kubernetes-master vagrant]# iptables -t nat -S
-P PREROUTING ACCEPT
-P INPUT ACCEPT
-P OUTPUT ACCEPT
-P POSTROUTING ACCEPT
-N KUBE-HP-5N7UH5JAXCVP5UJR
-N KUBE-NODEPORT
-A POSTROUTING -p tcp -m comment --comment "pod3_ns1 hostport 8443" -m tcp --dport 8443 -j KUBE-HP-5N7UH5JAXCVP5UJR
-A KUBE-NODEPORT -p tcp -m comment --comment "pod3_ns1 hostport 8443" -m tcp --dport 8443 -j KUBE-HP-5N7UH5JAXCVP5UJR
[root@kubernetes-master vagrant]# cat nat.log 
# Generated by iptables-save v1.4.21 on Mon Nov 13 03:14:59 2017
*nat
:PREROUTING ACCEPT [65:3900]
:INPUT ACCEPT [65:3900]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:KUBE-HP-5N7UH5JAXCVP5UJR - [0:0]
:KUBE-NODEPORT - [0:0]
COMMIT
# Completed on Mon Nov 13 03:14:59 2017
[root@kubernetes-master vagrant]# iptables-restore --table=nat --noflush < nat.log 
[root@kubernetes-master vagrant]# iptables -t nat -S
-P PREROUTING ACCEPT
-P INPUT ACCEPT
-P OUTPUT ACCEPT
-P POSTROUTING ACCEPT
-N KUBE-HP-5N7UH5JAXCVP5UJR
-N KUBE-NODEPORT
-A POSTROUTING -p tcp -m comment --comment "pod3_ns1 hostport 8443" -m tcp --dport 8443 -j KUBE-HP-5N7UH5JAXCVP5UJR

And this behavior is needed to clean up rules in KUBE-NODEPORT in my tests.
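The behavior in the session above can be modeled with a small Go sketch. It is a toy model of what was observed, not the real iptables-restore logic and not the PR's fakeIPTables code. The `natTable` type and `restoreNoFlush` method are hypothetical names. The rule it encodes is that a `:CHAIN` declaration line flushes a user-defined chain but leaves the rules of a builtin chain alone.

```go
package main

import (
	"fmt"
	"strings"
)

// builtinNAT lists the builtin chains of the nat table.
var builtinNAT = map[string]bool{
	"PREROUTING": true, "INPUT": true, "OUTPUT": true, "POSTROUTING": true,
}

// natTable maps a chain name to its rules (hypothetical toy model).
type natTable map[string][]string

// restoreNoFlush applies iptables-save style input the way the session
// above behaved with --noflush.
func (t natTable) restoreNoFlush(input string) {
	for _, line := range strings.Split(input, "\n") {
		line = strings.TrimSpace(line)
		switch {
		case strings.HasPrefix(line, ":"):
			fields := strings.Fields(line[1:])
			if len(fields) == 0 {
				continue
			}
			if !builtinNAT[fields[0]] {
				t[fields[0]] = nil // declared user-defined chain: create or flush
			}
		case strings.HasPrefix(line, "-A "):
			chain := strings.Fields(line)[1]
			t[chain] = append(t[chain], line)
		case strings.HasPrefix(line, "-X "):
			delete(t, strings.Fields(line)[1])
		}
	}
}

func main() {
	tbl := natTable{
		"POSTROUTING":              {"-A POSTROUTING -j KUBE-HP-5N7UH5JAXCVP5UJR"},
		"KUBE-NODEPORT":            {"-A KUBE-NODEPORT -j KUBE-HP-5N7UH5JAXCVP5UJR"},
		"KUBE-HP-5N7UH5JAXCVP5UJR": nil,
	}
	tbl.restoreNoFlush("*nat\n:POSTROUTING ACCEPT [0:0]\n:KUBE-NODEPORT - [0:0]\nCOMMIT")
	fmt.Println(len(tbl["POSTROUTING"]), len(tbl["KUBE-NODEPORT"]))
}
```

In this model, POSTROUTING keeps its rule while KUBE-NODEPORT ends up empty, matching the `iptables -t nat -S` output after the restore.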

Member

Thanks for the investigation, now I get it. Interesting that this behavior is not documented in the iptables manual.

That said, I just learned that user-defined chains are not always flushed: they are flushed only when explicitly mentioned in the input of iptables-restore --noflush, which is what the comment below refers to:

// We must (as per iptables) write a chain-line for it, which has
// the nice effect of flushing the chain. Then we can remove the
// chain.
writeLine(natChains, existingNATChains[chain])
writeLine(natRules, "-X", chainString)

In short I believe your fix will work as expected, but the implementation of fakeIPTables.restore() still seems problematic as it flushes more than it should...

Member

Oops, I scanned too fast; it looks like that is exactly what you have implemented. Could you update it to flush only user-defined chains? Thanks.

Contributor Author

@chenchun chenchun Nov 14, 2017

Done. Also added a test, TestRestoreFlushRules, for fakeIPTables.restore().

Member

Awesome, thanks!

Member

@MrHohn MrHohn left a comment

Thanks, appreciate your work!

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 14, 2017
@MrHohn
Member

MrHohn commented Nov 14, 2017

This might be worth a release note.
/release-note

/assign @thockin
for approval.

@k8s-ci-robot
Contributor

@MrHohn: the /release-note and /release-note-action-required commands have been deprecated.
Please edit the release-note block in the PR body text to include the release note. If the release note requires additional action include the string action required in the release note. For example:

```release-note
Some release note with action required.
```

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Nov 14, 2017
@@ -82,6 +82,7 @@ type Table string
const (
TableNAT Table = "nat"
TableFilter Table = "filter"
TableMangle Table = "mangle"
Member

Why is this part of the same PR? Seems unrelated?

Contributor Author

It's used by NewFakeIPTables to cache builtin chains.

@@ -177,6 +178,8 @@ func (hm *hostportManager) Remove(id string, podPortMapping *PodPortMapping) (er
chainsToRemove := []utiliptables.Chain{}
for _, pm := range hostportMappings {
chainsToRemove = append(chainsToRemove, getHostportChain(id, pm))
// TODO remove this, please refer https://github.com/kubernetes/kubernetes/pull/55153
chainsToRemove = append(chainsToRemove, getBugyHostportChain(id, pm))
Member

s/Bugy/Buggy/
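The cleanup idea in this hunk can be sketched as follows: when removing a pod's hostports, list both the chain name produced by the fixed function and the one the old conversion would have produced, so chains created before the fix are removed too. This is a minimal sketch under assumptions. The `PortMapping` struct is a trimmed-down stand-in, and the key layout, helper names, and de-duplication step are hypothetical rather than the PR's actual code.

```go
package main

import (
	"crypto/sha256"
	"encoding/base32"
	"fmt"
	"strconv"
)

// PortMapping is a trimmed-down, hypothetical stand-in for the real type.
type PortMapping struct {
	HostPort int32
	Protocol string
}

// hashChain turns a key into a short chain name (assumed layout).
func hashChain(key string) string {
	hash := sha256.Sum256([]byte(key))
	return "KUBE-HP-" + base32.StdEncoding.EncodeToString(hash[:])[:16]
}

// fixedChain uses the decimal form of the port.
func fixedChain(id string, pm PortMapping) string {
	return hashChain(id + strconv.Itoa(int(pm.HostPort)) + pm.Protocol)
}

// legacyChain reproduces the old int32-to-string conversion.
func legacyChain(id string, pm PortMapping) string {
	return hashChain(id + string(rune(pm.HostPort)) + pm.Protocol)
}

// chainsToRemove returns the current and legacy names for every mapping,
// skipping names that were already added.
func chainsToRemove(id string, mappings []PortMapping) []string {
	seen := map[string]bool{}
	var out []string
	for _, pm := range mappings {
		for _, name := range []string{fixedChain(id, pm), legacyChain(id, pm)} {
			if !seen[name] {
				seen[name] = true
				out = append(out, name)
			}
		}
	}
	return out
}

func main() {
	mappings := []PortMapping{
		{HostPort: 57119, Protocol: "tcp"},
		{HostPort: 55429, Protocol: "tcp"},
	}
	for _, name := range chainsToRemove("pod1", mappings) {
		fmt.Println(name)
	}
}
```

For the two ports in `main`, the list holds three names: two distinct fixed names plus the single legacy name they used to share.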

@@ -177,6 +178,8 @@ func (hm *hostportManager) Remove(id string, podPortMapping *PodPortMapping) (er
chainsToRemove := []utiliptables.Chain{}
for _, pm := range hostportMappings {
chainsToRemove = append(chainsToRemove, getHostportChain(id, pm))
// TODO remove this, please refer https://github.com/kubernetes/kubernetes/pull/55153
Member

... after release 1.9

@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 15, 2017
@thockin
Member

thockin commented Nov 15, 2017

This needs an associated issue and a closes #12345 in the FIRST COMMENT, or it cannot be approved.

@thockin
Member

thockin commented Nov 15, 2017

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 15, 2017
@k8s-github-robot k8s-github-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Nov 15, 2017
@MrHohn
Member

MrHohn commented Nov 15, 2017

Comments are addressed and an associated issue is created.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 15, 2017
@k8s-github-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chenchun, MrHohn, thockin

Associated issue: 55771

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot

Automatic merge from submit-queue (batch tested with PRs 54436, 53148, 55153, 55614, 55484). If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 7ad180a into kubernetes:master Nov 15, 2017
@chenchun chenchun deleted the fix branch November 16, 2017 01:32
k8s-github-robot pushed a commit that referenced this pull request Jan 9, 2018
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

remove duplicate function getBuggyHostportChain

**What this PR does / why we need it**:
Removes the `TODO remove this after release 1.9`; please refer to #55153.
The function `getBuggyHostportChain` does a bad conversion of HostPort from int32 to string. Now that `getHostportChain` does the conversion correctly, we remove `getBuggyHostportChain`.

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #

**Special notes for your reviewer**:

**Release note**:

```release-note
NONE
```
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/network Categorizes an issue or PR as relevant to SIG Network. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Some hostports are unreachable
6 participants