Wrong MTU in Fragmentation Needed/Packet too Big messages #170

Closed
ydahhrk opened this Issue Sep 11, 2015 · 5 comments

Projects

None yet

2 participants

@ydahhrk
Member
ydahhrk commented Sep 11, 2015

Jool's 3.3 series isn't compensating for the difference between the IPv4/v6 headers when generating the MTU field of 'Fragmentation Needed' and 'Packet too Big' ICMP errors.

This only affects ICMP errors generated at Jool. Translating ICMP errors get their MTU adjusted just fine.

This is supposed to be messing up Path MTU discovery... but if this is the case, it's weird that nobody seems to have tumbled into this problem.

Or maybe it's the actual cause of this comment? (I guess it's a stretch.)

I think I should tag this as critical, so we'll probably release 3.3.4 tomorrow.

@ydahhrk ydahhrk added this to the 3.3.4 milestone Sep 11, 2015
@toreanderson
Contributor

I'm guessing the prevalence of 1500-byte MTUs and TCP MSS prevents the need for PMTUD. The IPv6 speaker will advertise an TCP MSS of 1440, causing the IPv4 speaker to never send larger frames than 1480 bytes - which fits snugly into an 1500 bytes large IPv6 frame.

I doubt this issue has anything to do with issue #136.

@ydahhrk ydahhrk added a commit that referenced this issue Sep 11, 2015
@ydahhrk ydahhrk Issue #170, MTU adjustment.
I noticed another problem: Linux will refuse to send Jool's ICMPv4 errors when in SIIT mode, but likely only when it doesn't have a default gateway.
This appears to be because it doesn't have routes towards IPv4 translatable addresses.
Still thinking how to fix this.
145812b
@ydahhrk
Member
ydahhrk commented Sep 17, 2015

Found the other problem.

SIIT Jool cannot create ICMPv4 errors if IPv4 forwarding is disabled. (the routing table has nothing to do with it.)

Therefore, removing this step from the documentation was a mistake:

sudo sysctl -w net.ipv4.conf.all.forwarding=1

I repeat: IPv4 FORWARDING MUST BE ENABLED.

Ok, looks good. Will hopefully release 3.3.4 tomorrow.

@toreanderson
Contributor

Hm. I cannot reproduce that. My SIIT-DC BRs are all running Jool e7cdebc with no forwarding sysctl set for any interface, neither for IPv4 nor for IPv6. They are returning ICMP errors just fine. I quickly verified the following cases:

  • ICMPv4 TTL expired (by tracerouting from IPv4 to IPv6)
  • ICMPv6 HLIM expired (by tracerouting from IPv6 to IPv4)
  • ICMPv4 Frag Needed (by sending a 1500-byte ping with DF=1 from IPv4 to IPv6)
@ydahhrk
Member
ydahhrk commented Sep 18, 2015

A little background:

Before calling icmp_send() (the Linux function that handles creating ICMPv4 errors), Jool has to in-route (route4_input()) the offending (original) packet.

Why? I wish I knew. It's odd that by design the kernel prohibits ICMPv4 errors during the whole prerouting chain unless the original packet is explicitly in-routed (example).

route4_input() is a Jool wrapper for the Linux function ip_route_input(). ip_route_input() is a wrapper for ip_route_input_common() (ip_route_input_noref() in newer kernels), which (upon caching not yielding results) calls ip_route_input_slow(). This function is the one that cares about IPv4 forwarding.

Hm. I cannot reproduce that. My SIIT-DC BRs are all running Jool e7cdebc with no forwarding sysctl set for any interface, neither for IPv4 nor for IPv6. They are returning ICMP errors just fine.

Now that you mention it, the forwarding validation yields flat failure in lower kernels, but falls back to assume the packet is to be delivered locally in higher kernels. This assumption code ends up setting a (perhaps) misled, but valid, blackhole-ish destination on the packet which nevertheless prevents icmp_send() from failing.

(I'm assuming you're using one of the "higher" kernels.)

Honestly, I don't know if this is good or bad. In an ideal world, icmp_send() would not even require the packet to have a known destination. (why would it? icmpv6_send() doesn't.) Failing that, Jool would ideally define its own "dummy" destination (thereby removing the need to call route4_input() entirely), but the kernel doesn't seem to export a way to create one the destination constructor requires the official IPv4-destination-options, which are private.

Therefore, relying on the "fake local" destination seems to be the way to go. Until we get out of prerouting, that is. Man, I really need to get that migration started already.

Summary (also tl;dr)

So, from a quick overview of Linux's code, IPv4 forwarding is mandatory in kernels 3.5 and lower. IPv4 forwarding is not needed in kernels 3.6 and higher. I can reliably toggle ICMPv4 errors in Ubuntu 12.04 (kernel 3.2) by tweaking IPv4 forwarding, but nothing happens if I do the same in Ubuntu 14.04 (kernel 3.13).

(After the migration, both IP forwardings will be required anyway due to the cross-interface routing.)

@ydahhrk
Member
ydahhrk commented Sep 18, 2015

This assumption code ends up setting a (perhaps) misled, but valid, blackhole-ish destination on the packet which nevertheless prevents icmp_send() from failing.

It seems this isn't a coincidence.

The commit where this happens is torvalds/linux@251da41.

The author wanted the destination to be cached, which does happen, but then success is returned and the destiny of the packet is deferred until the ip_error() function as a result or side effect. ip_error() normally happens during the actual legitimate in-routing. (ie. after the prerouting chain.)

So the "fake local" destination is intended as a placeholder so an ICMP error can be built out of the packet later. In other words, it's exactly what Jool wants it for.

It seems this solution is more elegant that I originally thought. (lower kernel crybabying notwithstanding.)

@ydahhrk ydahhrk closed this Sep 22, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment