Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpf: don't answer ARP requests for endpoint IP #11533

Merged
merged 1 commit into from May 21, 2020

Conversation

jcaamano
Copy link
Contributor

Previously, bpf_lxc was answering to all ARP requests. This causes
an issue in an scenario where a duplicate address check is being
conducted inside the container by sending an ARP request for the
endpoint IP for which no answer is expected. This fixes it by answering
to ARP requests for all IPs except the endpoint IP.

Fixes: #10574

Signed-off-by: Jaime Caamaño Ruiz jcaamano@suse.com

@jcaamano jcaamano requested a review from a team May 14, 2020 15:33
@maintainer-s-little-helper
Copy link

Please set the appropriate release note label.

@coveralls
Copy link

coveralls commented May 14, 2020

Coverage Status

Coverage decreased (-0.02%) to 36.987% when pulling a4b39ac on jcaamano:arp-lxc-fix into 859e178 on cilium:master.

Copy link
Member

@joestringer joestringer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of minor nits below, but otherwise looks good.

bpf/lib/arp.h Outdated
ctx_store_bytes(ctx, 28, &ip, 4, 0) < 0 ||
ctx_store_bytes(ctx, 32, &smac, sizeof(smac), 0) < 0 ||
ctx_store_bytes(ctx, 38, &sip, sizeof(sip), 0) < 0)
ctx_store_bytes(ctx, 22, smac, 6, 0) < 0 ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noted from the previous PR that you stated that the verifier rejected the program if you used sizeof(*smac) here rather than 6. Is that correct?

In that case I think it'd be good to at least add a note here that states this is a verifier workaround. We do this in various places in the datapath, which allows us to easily collect such cases and use them to investigate & improve upstream LLVM/Linux and also gives us the context to know why it's written this way so eventually we can convert it to more standard C styling.

In addition I'd imagine there's a few BPF folks who would be interested to see the BPF instruction comparison generated by LLVM for the two cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added note, but I should have asked before if there some specific formats to these notes so they can be tracked down easier.

Anyway, by just replacing 6 by sizeof(*smac), this is the verifier output:

Verifier analysis:
0: (bf) r6 = r1
1: (18) r1 = 0xafdf
3: (6b) *(u16 *)(r10 -44) = r1
4: (18) r1 = 0x5cfb74c2
6: (63) *(u32 *)(r10 -48) = r1
7: (b7) r0 = 0
8: (61) r2 = *(u32 *)(r6 +80)
9: (61) r1 = *(u32 *)(r6 +76)
10: (bf) r3 = r1
11: (07) r3 += 42
12: (2d) if r3 > r2 goto pc+185
 R0=imm0,min_value=0,max_value=0 R1=pkt(id=0,off=0,r=42) R2=pkt_end R3=pkt(id=0,off=42,r=42) R6=ctx R10=fp
13: (69) r2 = *(u16 *)(r1 +20)
14: (55) if r2 != 0x100 goto pc+183
 R0=imm0,min_value=0,max_value=0 R1=pkt(id=0,off=0,r=42) R2=inv48,min_value=256,max_value=256 R3=pkt(id=0,off=42,r=42) R6=ctx R10=fp
15: (69) r2 = *(u16 *)(r1 +14)
16: (55) if r2 != 0x100 goto pc+181
 R0=imm0,min_value=0,max_value=0 R1=pkt(id=0,off=0,r=42) R2=inv48,min_value=256,max_value=256 R3=pkt(id=0,off=42,r=42) R6=ctx R10=fp
17: (61) r2 = *(u32 *)(r1 +0)
18: (18) r3 = 0xffffffff
20: (5d) if r2 != r3 goto pc+2
 R0=imm0,min_value=0,max_value=0 R1=pkt(id=0,off=0,r=42) R2=inv R3=inv R6=ctx R10=fp
21: (69) r3 = *(u16 *)(r1 +4)
22: (15) if r3 == 0xffff goto pc+10
 R0=imm0,min_value=0,max_value=0 R1=pkt(id=0,off=0,r=42) R2=inv R3=inv48 R6=ctx R10=fp
23: (18) r3 = 0x5cfb74c2
25: (67) r3 <<= 32
26: (77) r3 >>= 32
27: (5d) if r2 != r3 goto pc+170
 R0=imm0,min_value=0,max_value=0 R1=pkt(id=0,off=0,r=42) R2=inv R3=inv32 R6=ctx R10=fp
28: (18) r2 = 0xafdf
30: (57) r2 &= 65535
31: (69) r3 = *(u16 *)(r1 +4)
32: (5d) if r3 != r2 goto pc+165
 R0=imm0,min_value=0,max_value=0 R1=pkt(id=0,off=0,r=42) R2=inv48,min_value=0,max_value=65535 R3=inv48 R6=ctx R10=fp
33: (71) r3 = *(u8 *)(r1 +7)
34: (67) r3 <<= 8
35: (71) r2 = *(u8 *)(r1 +6)
36: (4f) r3 |= r2
37: (71) r2 = *(u8 *)(r1 +9)
38: (67) r2 <<= 8
39: (71) r4 = *(u8 *)(r1 +8)
40: (4f) r2 |= r4
41: (67) r2 <<= 16
42: (4f) r2 |= r3
43: (71) r3 = *(u8 *)(r1 +11)
44: (67) r3 <<= 8
45: (71) r4 = *(u8 *)(r1 +10)
46: (4f) r3 |= r4
47: (71) r4 = *(u8 *)(r1 +13)
48: (67) r4 <<= 8
49: (71) r5 = *(u8 *)(r1 +12)
50: (4f) r4 |= r5
51: (67) r4 <<= 16
52: (4f) r4 |= r3
53: (67) r4 <<= 32
54: (4f) r4 |= r2
55: (7b) *(u64 *)(r10 -56) = r4
56: (71) r3 = *(u8 *)(r1 +39)
57: (67) r3 <<= 8
58: (71) r2 = *(u8 *)(r1 +38)
59: (4f) r3 |= r2
60: (71) r4 = *(u8 *)(r1 +40)
61: (71) r2 = *(u8 *)(r1 +41)
62: (67) r2 <<= 8
63: (4f) r2 |= r4
64: (67) r2 <<= 16
65: (4f) r2 |= r3
66: (18) r3 = 0xf13c100a
68: (67) r3 <<= 32
69: (77) r3 >>= 32
70: (1d) if r2 == r3 goto pc+127
 R0=imm0,min_value=0,max_value=0 R1=pkt(id=0,off=0,r=42) R2=inv R3=inv32 R4=inv56 R5=inv56 R6=ctx R10=fp
71: (71) r3 = *(u8 *)(r1 +30)
72: (71) r4 = *(u8 *)(r1 +31)
73: (71) r5 = *(u8 *)(r1 +28)
74: (71) r1 = *(u8 *)(r1 +29)
75: (63) *(u32 *)(r10 -32) = r2
76: (67) r1 <<= 8
77: (4f) r1 |= r5
78: (67) r4 <<= 8
79: (4f) r4 |= r3
80: (67) r4 <<= 16
81: (4f) r4 |= r1
82: (63) *(u32 *)(r10 -8) = r4
83: (b7) r1 = 512
84: (6b) *(u16 *)(r10 -34) = r1
85: (bf) r3 = r10
86: (07) r3 += -48
87: (b7) r8 = 0
88: (bf) r1 = r6
89: (b7) r2 = 6
90: (b7) r4 = 6
91: (b7) r5 = 0
92: (85) call 9
93: (18) r7 = 0xffffff73
95: (67) r0 <<= 32
96: (c7) r0 s>>= 32
97: (6d) if r8 s> r0 goto pc+63
 R0=inv,min_value=0 R6=ctx R7=inv R8=imm0,min_value=0,max_value=0 R10=fp
98: (bf) r3 = r10
99: (07) r3 += -56
100: (bf) r1 = r6
101: (b7) r2 = 0
102: (b7) r4 = 6
103: (b7) r5 = 0
104: (85) call 9
105: (67) r0 <<= 32
106: (c7) r0 s>>= 32
107: (6d) if r8 s> r0 goto pc+53
 R0=inv,min_value=0 R6=ctx R7=inv R8=imm0,min_value=0,max_value=0 R10=fp
108: (bf) r3 = r10
109: (07) r3 += -34
110: (b7) r8 = 0
111: (bf) r1 = r6
112: (b7) r2 = 20
113: (b7) r4 = 2
114: (b7) r5 = 0
115: (85) call 9
116: (67) r0 <<= 32
117: (c7) r0 s>>= 32
118: (6d) if r8 s> r0 goto pc+42
 R0=inv,min_value=0 R6=ctx R7=inv R8=imm0,min_value=0,max_value=0 R10=fp
119: (bf) r3 = r10
120: (07) r3 += -48
121: (bf) r1 = r6
122: (b7) r2 = 22
123: (b7) r4 = 8
124: (b7) r5 = 0
125: (85) call 9
invalid indirect read from stack off -48+6 size 8

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the difference between the bad program above, and the non problematic one that uses 6 instead of sizeof(*smac):

120c120
<      123:	r4 = 8
---
>      123:	r4 = 6

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting, because I would expect sizeof(*smac) to be 6, because a mac is 6B. But apparently there's something I'm missing with this assumption. Maybe it's being padded out to alignment? Or the change was unintentionally taking the size of the pointer rather than the size of the structure?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got that backwards somehow? With 6, it's r4 = 6. With sizeof, it's r4 = 8. I would say yes, that would be a result of padding struct { u32; u16; }. In bpf/lib/eth.h, there is a constant defined ETH_ALEN used throughout the file. Let's use that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM. Does this resolve the issue you had with the verifier?

bpf/bpf_lxc.c Outdated Show resolved Hide resolved
Previously, bpf_lxc was answering to all ARP requests. This causes
an issue in an scenario where a duplicate address check is being
conducted inside the container by sending an ARP request for the
endpoint IP for which no answer is expected. This fixes it by answering
to ARP requests for all IPs except the endpoint IP.

Fixes: cilium#10574

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@suse.com>
@joestringer
Copy link
Member

test-me-please

@joestringer joestringer added the release-note/minor This PR changes functionality that users may find relevant to operating Cilium. label May 20, 2020
@joestringer joestringer added the sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. label May 20, 2020
@joestringer joestringer added this to the 1.8 milestone May 20, 2020
@joestringer
Copy link
Member

Looking at the failures, I see #11512 ; Also the k8s-1.18-kernel-4.9 test failed only on that build and doesn't seem directly related to this PR so I filed #11634 to track it as a flake.

CI is otherwise green. The previous iteration of this PR was reviewed also by @aanm and @qmonnet and this one has been out for some time now, so I think it's good to merge.

@joestringer joestringer merged commit 6e36360 into cilium:master May 21, 2020
1.8.0 automation moved this from In progress to Merged May 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note/minor This PR changes functionality that users may find relevant to operating Cilium. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
Projects
No open projects
1.8.0
  
Merged
Development

Successfully merging this pull request may close these issues.

ARP responder replying to requests for IP on the same endpoint that owns that IP.
3 participants