Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ipsec: Fix xfrm output mask #14381

Merged
merged 1 commit into from
Dec 15, 2020
Merged

Conversation

pchaigno
Copy link
Member

First commit temporarily uses our netlink fork to be able to set the xfrm output mask. Second commit sets said mask in our xfrm rules. See commit messages for details.

I tested by applying the following diff to our existing test and manually checking the source identity is preserved when making requests between nodes:

diff --git a/test/k8sT/DatapathConfiguration.go b/test/k8sT/DatapathConfiguration.go
index c11628f67..a91bc17bb 100644
--- a/test/k8sT/DatapathConfiguration.go
+++ b/test/k8sT/DatapathConfiguration.go
@@ -214,7 +214,8 @@ var _ = Describe("K8sDatapathConfig", func() {
 
                        deploymentManager.Deploy(helpers.CiliumNamespace, IPSecSecret)
                        deploymentManager.DeployCilium(map[string]string{
-                               "encryption.enabled": "true",
+                               "encryption.enabled":     "true",
+                               "bpf.monitorAggregation": "none",
                        }, DeployCiliumOptionsAndDNS)
                        validateBPFTunnelMap()
                        Expect(testPodConnectivityAcrossNodes(kubectl)).Should(BeTrue(), "Connectivity test with IPsec between nodes failed")
@@ -736,6 +737,7 @@ func testPodConnectivityAndReturnIP(kubectl *helpers.Kubectl, requireMultiNode b
        if !res.WasSuccessful() {
                return false, targetIP
        }
+       helpers.HoldEnvironment("Let's check the source identity!")
 
        // HTTP connectivity test
        res = kubectl.ExecPodCmd(randomNamespace, srcPod,
Fix missing packet mark mask that can cause policy deny drops in IPSec configuration. 

@pchaigno pchaigno added kind/bug This is a bug in the Cilium logic. release-note/bug This PR fixes an issue in a previous release of Cilium. needs-backport/1.8 labels Dec 11, 2020
@pchaigno pchaigno requested review from jrfastab and a team December 11, 2020 22:47
@pchaigno pchaigno requested a review from a team as a code owner December 11, 2020 22:47
@maintainer-s-little-helper maintainer-s-little-helper bot added this to In progress in 1.10.0 Dec 11, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.9.2 Dec 11, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.8.7 Dec 11, 2020
@pchaigno pchaigno force-pushed the pr/pchaigno/fix-xfrm-outputmark-mask branch 2 times, most recently from 54752d3 to 9758513 Compare December 11, 2020 23:01
@aditighag
Copy link
Member

Are there any valid cases where we would be sending over zero identities over tunnel header? In other words, can we distinguish between a set/unset identity?

@jrfastab
Copy link
Contributor

Other than small nit about using the default value RouteMarkMask LGTM

@pchaigno pchaigno force-pushed the pr/pchaigno/fix-xfrm-outputmark-mask branch from 9758513 to 3c0387e Compare December 12, 2020 08:31
@pchaigno pchaigno added the area/encryption Impacts encryption support such as IPSec, WireGuard, or kTLS. label Dec 12, 2020
@pchaigno
Copy link
Member Author

Are there any valid cases where we would be sending over zero identities over tunnel header? In other words, can we distinguish between a set/unset identity?

I can't think of any. If we fail to resolve an identity we should fallback to WORLD_ID.

Are you thinking we could fallback to an ipcache lookup at the destination node if the identity is unset? That might work, but (1) I'm not sure we always have all the identities on the remote ipcaches and (2) we would need to consider the complexity impact.

@aditighag
Copy link
Member

Are there any valid cases where we would be sending over zero identities over tunnel header? In other words, can we distinguish between a set/unset identity?

I can't think of any. If we fail to resolve an identity we should fallback to WORLD_ID.

Are you thinking we could fallback to an ipcache lookup at the destination node if the identity is unset? That might work, but (1) I'm not sure we always have all the identities on the remote ipcaches and (2) we would need to consider the complexity impact.

I was thinking more from detecting such error conditions. Perhaps we can log a warning?

@pchaigno
Copy link
Member Author

I was thinking more from detecting such error conditions. Perhaps we can log a warning?

Could be a topic for the next SIG Datapath meeting? It would definitely help uncovering such bugs in CI. It may be a bit hard to implement though because we either need to trace traffic from time to time and know which identity should be non-zero, or we need to extend the datapath to report such cases. Maybe there's a middle ground where we point those out in the cilium monitor output?

@jrfastab
Copy link
Contributor

jrfastab commented Dec 13, 2020

An observation,

        RouteMarkMask = 0xF00

and

        // IPsecMarkMask is the mask required for the IPsec SPI and encrypt/decrypt bits
        IPsecMarkMask = 0xFF00

So using IPsecMarkMask is slightly different from original patch. LGTM though.

@pchaigno pchaigno added the dont-merge/blocked Another PR must be merged before this one. label Dec 13, 2020
@pchaigno pchaigno force-pushed the pr/pchaigno/fix-xfrm-outputmark-mask branch from 3c0387e to b4bdba9 Compare December 14, 2020 09:55
@pchaigno pchaigno removed the dont-merge/blocked Another PR must be merged before this one. label Dec 14, 2020
Commit b6f2e75 set the xfrm output mark for the tunneling mode, but did
not set a mask. As a result, the packet mark is zeroed before writing
the new mark value, clearing our source security context.

In particular, this loss of security context happen when packets are
passed from the lxc devices to cilium_host via the stack. We then
receive a zero source identity in bpf_host and transmit it over the
tunnel to the destination nodes. That causes policy deny drops on remote
nodes when policies are enforced.

The bug can be observed by tracing the packets leaving the containers on
the source node. The source identity becomes unknown when it leaves
cilium_host (-> overlay trace event):

    <- endpoint 684 flow 0x10b975f7 identity 14765->unknown state new ifindex 0 orig-ip 0.0.0.0: 10.0.1.68:41856 -> 10.0.0.206:80 tcp SYN
    -> stack flow 0x10b975f7 identity 14765->30552 state new ifindex 0 orig-ip 0.0.0.0: 10.0.1.68:41856 -> 10.0.0.206:80 tcp SYN
    -> overlay flow 0x10b975f7 identity unknown->unknown state new ifindex cilium_vxlan orig-ip 0.0.0.0: 10.0.1.122 -> 10.0.0.184

This commit fixes the bug by setting the mask appropriately. That
requires us to temporarily use our cilium/netlink fork which has support
for setting the output mask [1], until that support is upstreamed.

1 - https://github.com/cilium/netlink/tree/outputmark-mask
Fixes: b6f2e75 ("cilium: set output mark for tunnel case")
Suggested-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Paul Chaignon <paul@cilium.io>
@pchaigno pchaigno force-pushed the pr/pchaigno/fix-xfrm-outputmark-mask branch from b4bdba9 to 616cef3 Compare December 14, 2020 10:08
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.9 in 1.9.2 Dec 15, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.9 in 1.9.2 Dec 15, 2020
@jibi jibi mentioned this pull request Dec 17, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.8 in 1.8.7 Dec 17, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.8 in 1.8.7 Dec 17, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.9 to Backport done to v1.9 in 1.9.2 Jan 5, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.9 to Backport done to v1.9 in 1.9.2 Jan 5, 2021
pchaigno added a commit that referenced this pull request Jan 5, 2021
When using IPSec with Cilium in tunneling mode, we need support for the
xfrm state output mask in the kernel (cf. #14381). This commit probes
for such kernel support, introduced upstream in 9b42c1f ("xfrm: Extend
the output_mark to support input direction and masking"), on startup
and fatals if the kernel is too old.

The lack of kernel support only breaks policy enforcement across nodes
and we can probably make it work in the future, but in the meantime,
it's best to cleanly fatal.

Signed-off-by: Paul Chaignon <paul@cilium.io>
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.8 to Backport done to v1.8 in 1.8.7 Jan 6, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.8 to Backport done to v1.8 in 1.8.7 Jan 6, 2021
pchaigno added a commit that referenced this pull request Jan 11, 2021
When using IPSec with Cilium in tunneling mode, we need support for the
xfrm state output mask in the kernel (cf. #14381). This commit probes
for such kernel support, introduced upstream in 9b42c1f ("xfrm: Extend
the output_mark to support input direction and masking"), on startup
and fatals if the kernel is too old.

The lack of kernel support only breaks policy enforcement across nodes
and we can probably make it work in the future, but in the meantime,
it's best to cleanly fatal.

Signed-off-by: Paul Chaignon <paul@cilium.io>
nathanjsweet pushed a commit that referenced this pull request Jan 11, 2021
[ upstream commit 1870713 ]

When using IPSec with Cilium in tunneling mode, we need support for the
xfrm state output mask in the kernel (cf. #14381). This commit probes
for such kernel support, introduced upstream in 9b42c1f ("xfrm: Extend
the output_mark to support input direction and masking"), on startup
and fatals if the kernel is too old.

The lack of kernel support only breaks policy enforcement across nodes
and we can probably make it work in the future, but in the meantime,
it's best to cleanly fatal.

Signed-off-by: Paul Chaignon <paul@cilium.io>
Signed-off-by: Nate Sweet <nathanjsweet@pm.me>
nathanjsweet pushed a commit that referenced this pull request Jan 11, 2021
[ upstream commit 1870713 ]

When using IPSec with Cilium in tunneling mode, we need support for the
xfrm state output mask in the kernel (cf. #14381). This commit probes
for such kernel support, introduced upstream in 9b42c1f ("xfrm: Extend
the output_mark to support input direction and masking"), on startup
and fatals if the kernel is too old.

The lack of kernel support only breaks policy enforcement across nodes
and we can probably make it work in the future, but in the meantime,
it's best to cleanly fatal.

Signed-off-by: Paul Chaignon <paul@cilium.io>
Signed-off-by: Nate Sweet <nathanjsweet@pm.me>
gandro pushed a commit that referenced this pull request Jan 13, 2021
[ upstream commit 1870713 ]

When using IPSec with Cilium in tunneling mode, we need support for the
xfrm state output mask in the kernel (cf. #14381). This commit probes
for such kernel support, introduced upstream in 9b42c1f ("xfrm: Extend
the output_mark to support input direction and masking"), on startup
and fatals if the kernel is too old.

The lack of kernel support only breaks policy enforcement across nodes
and we can probably make it work in the future, but in the meantime,
it's best to cleanly fatal.

Signed-off-by: Paul Chaignon <paul@cilium.io>
Signed-off-by: Nate Sweet <nathanjsweet@pm.me>
aanm pushed a commit that referenced this pull request Jan 13, 2021
[ upstream commit 1870713 ]

When using IPSec with Cilium in tunneling mode, we need support for the
xfrm state output mask in the kernel (cf. #14381). This commit probes
for such kernel support, introduced upstream in 9b42c1f ("xfrm: Extend
the output_mark to support input direction and masking"), on startup
and fatals if the kernel is too old.

The lack of kernel support only breaks policy enforcement across nodes
and we can probably make it work in the future, but in the meantime,
it's best to cleanly fatal.

Signed-off-by: Paul Chaignon <paul@cilium.io>
Signed-off-by: Nate Sweet <nathanjsweet@pm.me>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/encryption Impacts encryption support such as IPSec, WireGuard, or kTLS. kind/bug This is a bug in the Cilium logic. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
No open projects
1.8.7
Backport done to v1.8
1.9.2
Backport done to v1.9
Development

Successfully merging this pull request may close these issues.

None yet

7 participants