Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ICMP errors generated by tracked flows treated as related traffic #2247

Merged
merged 21 commits into from Apr 21, 2020

Conversation

tomastigera
Copy link
Contributor

@tomastigera tomastigera commented Apr 2, 2020

Description

Todos

  • Unit tests (full coverage)
  • Integration tests (delete as appropriate) In plan/Not needed/Done
  • Documentation
  • Backport
  • Release note

Release Note

The BPF dataplane now handles ICMP error messages as "related" traffic so that they follow the same path back through the dataplane as the packet they respond to. This improves compatibility with path MTU detection and other non-mainline traffic.

@tomastigera tomastigera force-pushed the tomas-icmp-related branch 2 times, most recently from 82cf3e2 to cda10e9 Compare April 6, 2020 22:10
@tomastigera tomastigera marked this pull request as ready for review April 7, 2020 00:50
@tomastigera tomastigera requested a review from a team as a code owner April 7, 2020 00:50
@@ -327,6 +332,14 @@ enum calico_ct_result_type {
CALI_CT_INVALID,
};

#define CALI_CT_RELATED (1 << 8)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i decided to pass the flag as part of the code (a) the result struct is part of the state and I did not want to change it and (b) it effectively creates a whole set of return values, which was another option, but they would only differ in related or not

static CALI_BPF_INLINE struct calico_ct_result calico_ct_v4_lookup(struct ct_ctx *ctx)
{
__u8 proto = ctx->proto;
__u8 proto_orig = ctx->proto;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ctx->proto changes once we decode the icmp payload

if (state->ip_proto == IPPROTO_ICMP && ct_related) {
/* do not fix up embedded L4 checksum for related ICMP */
} else {
switch (ip_header->protocol) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

state carries information about the entire packet, the state->ip_proto is still icmp, but we have moved the ip_header to point to the inner header. All the editing it done where point to, so we should use that info even in the case of regular NAT

@tomastigera tomastigera force-pushed the tomas-icmp-related branch 3 times, most recently from 09677d0 to b9e82c7 Compare April 9, 2020 19:43
Can be reused in other locations.
This is a prerequisite for letting host to handle TTL exceeded.
Sets up endpoints, routes etc, but does not run
to complement the fact that we can create an inactive workload
Check that the ports are fixed up after NATing the ICMP related back
Needs to update csum in the inner IP header instead of the outer.

When icmp generated by the next hop node, it needs to be placed in the
tunnel.
bpf-gpl/tc.c Outdated Show resolved Hide resolved
Copy link
Member

@fasaxc fasaxc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good, with some nice cleanups in the CT lookup key calculation; thanks for that!

There are a few areas marked with XXX; I guess those either need cleaning up or we need to discuss the right way forward?

}

if (!icmp_skb_get_hdr(skb, &icmp)) {
CALI_DEBUG("Ooops, we already passed one such a check!!!\n");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth making the log say what failed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps, but the function above also logs and this is more like this never happens, I'd panic if I could but did not feel like I should let this go completely unhandled.

bpf-gpl/tc.c Outdated Show resolved Hide resolved
bpf-gpl/tc.c Show resolved Hide resolved
This is triggered by missing tcpdump or bad filter
only felix-test image has tcpdump installed, extrnal client has not
if SNAT, we also need to fix up the source in the outer IP

FV tests when ICMP is returned from host and from the backing workload
When we generate an ICMP in a workload or at a host in response to
traffic that originated through a NP tunnel ammend the outer source IP
as if the reponse was from the original node as all the rest is internal
to the cluster.
@fasaxc fasaxc added this to the Calico v3.14.0 milestone Apr 17, 2020
We need to treat related ICMP from tunnel the same way as we do the
original traffic, otherwise it would go through the conntrack on the way
out and would create ICMP tracking record.
@fasaxc fasaxc merged commit e3ece8a into projectcalico:master Apr 21, 2020
@tomastigera tomastigera deleted the tomas-icmp-related branch April 21, 2020 20:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants