Deleting cilium pods messes with hostport mappings #6499

Closed
felipejfc opened this issue Dec 21, 2018 · 1 comment · Fixed by #6502
Labels
kind/bug This is a bug in the Cilium logic. kind/community-report This was reported by a user in the Cilium community, eg via Slack.

Hi guys, we've been using Cilium in production for some time now, and our Kubernetes use case includes a lot of pods that run with hostPort (using the portmap plugin).

We observed today that these pods became unreachable via their hostPort after a Cilium update. After some research, I found that whenever I restart the Cilium pods, the iptables rules that hostPort needs in order to work get deleted. To be more specific, these are the rules in iptables before deleting the Cilium pod:

-A CNI-DN-c20774ee0bf22c7e19507 -p udp -m udp --dport 41529 -j DNAT --to-destination 10.56.4.152:7788
-A CNI-DN-9b37bc50e0ab952351885 -p udp -m udp --dport 43407 -j DNAT --to-destination 10.56.4.13:7788
-A CNI-HOSTPORT-DNAT -m comment --comment "dnat name: \"cilium-portmap\" id: \"883b02f84c3cdd3e5e8432fecab3306a6a180c32aa0a6c78a0378d532cbe9d13\"" -j CNI-DN-c20774ee0bf22c7e19507
-A CNI-HOSTPORT-DNAT -m comment --comment "dnat name: \"cilium-portmap\" id: \"00f324f7e68994294c2a2edde64e823d93d7338ba6e9646f252ea412d8ae1163\"" -j CNI-DN-9b37bc50e0ab952351885
-A CNI-HOSTPORT-SNAT -m comment --comment "snat name: \"cilium-portmap\" id: \"883b02f84c3cdd3e5e8432fecab3306a6a180c32aa0a6c78a0378d532cbe9d13\"" -j CNI-SN-c20774ee0bf22c7e19507
-A CNI-HOSTPORT-SNAT -m comment --comment "snat name: \"cilium-portmap\" id: \"00f324f7e68994294c2a2edde64e823d93d7338ba6e9646f252ea412d8ae1163\"" -j CNI-SN-9b37bc50e0ab952351885
-A CNI-SN-9b37bc50e0ab952351885 -s 127.0.0.1/32 -d 10.56.4.13/32 -p udp -m udp --dport 7788 -j MASQUERADE
-A CNI-SN-c20774ee0bf22c7e19507 -s 127.0.0.1/32 -d 10.56.4.152/32 -p udp -m udp --dport 7788 -j MASQUERADE

After deleting the Cilium pods and waiting for them to be recreated, only these two rules exist:

-A CNI-DN-9b37bc50e0ab952351885 -p udp -m udp --dport 43407 -j DNAT --to-destination 10.56.4.13:7788
-A CNI-DN-c20774ee0bf22c7e19507 -p udp -m udp --dport 41529 -j DNAT --to-destination 10.56.4.152:7788

So the chain is being messed up somehow when I delete the Cilium pods.

General Information

How to reproduce the issue

  1. Run Cilium with portmap enabled.
  2. Create a pod that has a hostPort configured.
  3. Verify that it's accessible.
  4. Delete the Cilium pod and wait for it to be recreated.
  5. See that the pod is no longer accessible via the host port.
  6. If you delete the pod and wait for it to be rescheduled, it is accessible again.
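
One way to confirm the symptom on a node is to look for CNI-DN-* chains that still hold a DNAT rule but are no longer referenced from CNI-HOSTPORT-DNAT. The following is a minimal sketch, not part of Cilium or the portmap plugin. It assumes the rules are available as text in the `-A ...` form shown above (for example from `iptables -t nat -S`), and the function name `orphaned_dnat_chains` is made up for illustration.

```python
import re

# Rule listings copied from the report above.
BEFORE = r'''
-A CNI-DN-c20774ee0bf22c7e19507 -p udp -m udp --dport 41529 -j DNAT --to-destination 10.56.4.152:7788
-A CNI-DN-9b37bc50e0ab952351885 -p udp -m udp --dport 43407 -j DNAT --to-destination 10.56.4.13:7788
-A CNI-HOSTPORT-DNAT -m comment --comment "dnat name: \"cilium-portmap\" id: \"883b02f84c3cdd3e5e8432fecab3306a6a180c32aa0a6c78a0378d532cbe9d13\"" -j CNI-DN-c20774ee0bf22c7e19507
-A CNI-HOSTPORT-DNAT -m comment --comment "dnat name: \"cilium-portmap\" id: \"00f324f7e68994294c2a2edde64e823d93d7338ba6e9646f252ea412d8ae1163\"" -j CNI-DN-9b37bc50e0ab952351885
'''

AFTER = r'''
-A CNI-DN-9b37bc50e0ab952351885 -p udp -m udp --dport 43407 -j DNAT --to-destination 10.56.4.13:7788
-A CNI-DN-c20774ee0bf22c7e19507 -p udp -m udp --dport 41529 -j DNAT --to-destination 10.56.4.152:7788
'''


def orphaned_dnat_chains(rules_text):
    """Return CNI-DN-* chains that hold rules but have no jump from CNI-HOSTPORT-DNAT."""
    populated = set()   # chains that contain at least one rule
    referenced = set()  # chains that CNI-HOSTPORT-DNAT jumps to
    for line in rules_text.splitlines():
        line = line.strip()
        m = re.match(r"-A (CNI-DN-\S+) ", line)
        if m:
            populated.add(m.group(1))
        m = re.match(r"-A CNI-HOSTPORT-DNAT .*-j (CNI-DN-\S+)$", line)
        if m:
            referenced.add(m.group(1))
    return sorted(populated - referenced)


print(orphaned_dnat_chains(BEFORE))
print(orphaned_dnat_chains(AFTER))
```

With the listings from this report, the first call finds no orphaned chains and the second reports both CNI-DN chains, which matches the observation that the DNAT rules survive while the jumps into them disappear.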
tgraf added the kind/bug, needs/triage, and kind/community-report labels Dec 21, 2018
tgraf added this to Proposed in 1.4 via automation Dec 21, 2018
tgraf added this to the 1.4-bugfix milestone Dec 21, 2018
tgraf self-assigned this Dec 26, 2018
tgraf removed the needs/triage label Dec 26, 2018

tgraf commented Dec 26, 2018

The issue has been tracked down to the following logic:
When Cilium starts up, it removes all iptables rules that contain the word cilium. This is done to remove rules which were installed by prior versions of Cilium but are no longer in use. Because the hostPort plugin configuration is named cilium-portmap, its rules contain that word and are removed incorrectly.
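
To illustrate why such a match catches portmap rules, here is a minimal sketch of an over-broad selection. It is not Cilium's actual code: the function name, the restriction to `-A`/`-I` rules, and the case-insensitive comparison are all assumptions made for this example. The two rules are taken from the report above.

```python
def matches_broad_cleanup(rule: str) -> bool:
    """Sketch of an over-broad selection: any -A/-I rule mentioning "cilium"."""
    # Case-insensitive matching and the -A/-I prefix check are assumptions.
    return rule.startswith(("-A", "-I")) and "cilium" in rule.lower()


# Jump rule installed by the portmap plugin; the comment embeds the CNI
# configuration name "cilium-portmap".
JUMP_RULE = r'-A CNI-HOSTPORT-DNAT -m comment --comment "dnat name: \"cilium-portmap\" id: \"883b02f84c3cdd3e5e8432fecab3306a6a180c32aa0a6c78a0378d532cbe9d13\"" -j CNI-DN-c20774ee0bf22c7e19507'

# DNAT rule in the per-pod chain; nothing in it mentions "cilium".
DNAT_RULE = "-A CNI-DN-c20774ee0bf22c7e19507 -p udp -m udp --dport 41529 -j DNAT --to-destination 10.56.4.152:7788"

print(matches_broad_cleanup(JUMP_RULE))
print(matches_broad_cleanup(DNAT_RULE))
```

The jump rule is selected only because of the name inside its comment, while the DNAT rule is left alone. That is consistent with the report, where the per-pod DNAT rules survive a restart but the jumps into them do not.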

tgraf added a commit that referenced this issue Dec 26, 2018
The existing legacy rule removal logic on bootstrap removed all rules which
contained the word "cilium". While this removed Cilium-relevant rules, it also
incorrectly removed rules installed by the portmap/hostport plugin if the
plugin was configured with a name that contained the string cilium.

Example CNI configuration:
```
    {
      "cniVersion": "0.3.1",
      "name": "cilium-portmap",
      "plugins": [
        {
          "type": "cilium-cni"
        },
        {
          "type": "portmap",
          "capabilities": { "portMappings": true }
        }
      ]
    }
```

Example of incorrectly removed rule:
-A CNI-HOSTPORT-DNAT -m comment --comment "dnat name: \"cilium-portmap\" id: \"95dc537b9152da5f91be3fc5692bf91592bf1871b6e61755aed2056a03e98c4f\"" -j CNI-DN-258a52f03b4b7aa8abdc5

The fix is to be more restrictive in selecting rules to remove and limit it to
rules which contain the string "CILIUM_".

Fixes: #6499

Signed-off-by: Thomas Graf <thomas@cilium.io>
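
The narrower selection described in the commit message above can be sketched as follows. This is an illustration only, not the code from the commit: the function names are invented here, and the `CILIUM_FORWARD` rule is a hypothetical example of a rule that references a Cilium-owned chain. The portmap rule is the one quoted in the commit message.

```python
def is_cilium_rule(rule: str) -> bool:
    """Select only rules that contain the literal, upper-case string CILIUM_."""
    return "CILIUM_" in rule


def rules_to_remove(rules):
    """Return the subset of rules the narrowed cleanup would select."""
    return [r for r in rules if is_cilium_rule(r)]


# Rule from the commit message; it mentions "cilium-portmap" only in lower case.
PORTMAP_RULE = r'-A CNI-HOSTPORT-DNAT -m comment --comment "dnat name: \"cilium-portmap\" id: \"95dc537b9152da5f91be3fc5692bf91592bf1871b6e61755aed2056a03e98c4f\"" -j CNI-DN-258a52f03b4b7aa8abdc5'

# Hypothetical rule that jumps to a Cilium-owned chain (chain name assumed).
CILIUM_RULE = "-A FORWARD -j CILIUM_FORWARD"

print(rules_to_remove([PORTMAP_RULE, CILIUM_RULE]))
```

Because the comparison is case-sensitive and includes the trailing underscore, a CNI configuration named `cilium-portmap` no longer matches. A configuration name that itself contained `CILIUM_` would still be selected under this sketch.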
1.4 automation moved this from Proposed to Done Dec 28, 2018
tgraf added a commit that referenced this issue Dec 28, 2018
ianvernon pushed a commit that referenced this issue Jan 2, 2019
[ upstream commit 36cdd98 ]

Signed-off-by: Ian Vernon <ian@cilium.io>
ianvernon pushed a commit that referenced this issue Jan 2, 2019
[ upstream commit 36cdd98 ]

Signed-off-by: Ian Vernon <ian@cilium.io>
tgraf added a commit that referenced this issue Jan 3, 2019
[ upstream commit 36cdd98 ]

Signed-off-by: Ian Vernon <ian@cilium.io>
tgraf added a commit that referenced this issue Jan 3, 2019
[ upstream commit 36cdd98 ]

tgraf added a commit that referenced this issue Jan 4, 2019
[ upstream commit 36cdd98 ]

ianvernon pushed a commit that referenced this issue Jan 9, 2019
[ upstream commit 36cdd98 ]

Signed-off-by: Ian Vernon <ian@cilium.io>
ianvernon pushed a commit that referenced this issue Jan 14, 2019
[ upstream commit 36cdd98 ]

Signed-off-by: Ian Vernon <ian@cilium.io>
tgraf added a commit that referenced this issue Jan 15, 2019
[ upstream commit 36cdd98 ]

Signed-off-by: Ian Vernon <ian@cilium.io>