Externally initiated IPv4 sessions in Stateful NAT64 mode crash the kernel #137
I just had one of my Jool test nodes experience the following kernel panic:
The node is a KVM-based virtual machine, running x86_64 Ubuntu 14.04.2. Jool was loaded using
I will try to see if I can figure out a way to reproduce the crash.
It seems to crash quite quickly after bootup actually. Console messages from another crash with Jool built with debugging:
A tcpdump taken on the software bridge on the VM hypervisor shows the packets arriving on or being received from the Jool node in the moments leading up to the crash (I have the full PCAP if you want it):
There's also something really odd going on with those IPv4 packets that get translated to an IPv6 destination addr#port of
Ok, so with an IPv4 /24 routed to it and added to the IPv4 pool, it crashes within minutes just from background internet noise. Every crash I've seen is preceded by the
I've bisected it. The crashes start occurring with commit b2dbe4b:
However the bogus sessions to
However, they don't appear to do any actual damage (like crashing the kernel) until commit b2dbe4b; before it, the bogus sessions seem to expire just fine after the five-second V4_INIT state, with no ill effects.
I'm very sorry for not noticing this issue before the v3.3.0 release (my testing was mostly focused on the stateless mode).
changed the title from "Linux kernel panic (GPF) with Jool in Stateful NAT64 mode" to "Externally initiated IPv4 sessions in Stateful NAT64 mode crash the kernel" on Mar 10, 2015
From RFC 6146:
That's weird. I do remember testing this before releasing.
I guess if these sessions are really causing this, we should find another way to represent them and/or store them elsewhere.
"Cleaner name: TCP_SYN" happens when Jool is trying to delete expired sessions. I'm going to take a look at this code right away.
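For readers unfamiliar with what "deleting expired sessions" involves, here is a minimal userspace sketch of an expiry pass. It is not Jool's code: `struct session`, `expire_sessions` and `V4_INIT_TIMEOUT_SEC` are hypothetical names, and the five-second value is only taken from the V4_INIT lifetime mentioned earlier in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: lifetime of an externally initiated session, as reported
 * earlier in the thread. Not read from Jool's source. */
#define V4_INIT_TIMEOUT_SEC 5

/* Hypothetical session entry; a real one would carry addresses and ports. */
struct session {
    uint32_t id;          /* illustrative identifier */
    uint64_t created_sec; /* creation time, in seconds */
};

/*
 * Drops every session whose lifetime has elapsed by compacting the array
 * in place. Returns the number of sessions that survive.
 */
size_t expire_sessions(struct session *tbl, size_t n, uint64_t now_sec)
{
    size_t kept = 0;

    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        /* Keep the entry only while its deadline is still in the future. */
        if (tbl[i].created_sec + V4_INIT_TIMEOUT_SEC > now_sec)
            tbl[kept++] = tbl[i];
    }
    return kept;
}
```

Anything the expiry pass touches while dropping an entry has to still be valid at that point, which is why a cleaner routine is a typical place for a stale reference to surface.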
On the contrary, sorry for the trouble, and thank you for being so thorough with your testing.
I think I found it.
The "original packet" no longer exists; it's a bogus pointer.
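A stored entry that keeps a pointer to a packet after the packet has been freed is a classic dangling-pointer bug. The sketch below shows one common way to avoid it: copy the bytes the entry will need later, at store time. Every name here (`struct pkt`, `struct stored_session`, `session_store`, `session_release`) is hypothetical, and this is not necessarily how the issue137 branch fixes it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical packet: a buffer owned by whoever received it. */
struct pkt {
    size_t len;
    uint8_t *data;
};

/* Hypothetical session entry. It owns a private copy of the packet bytes
 * instead of pointing into memory the caller may free at any time. */
struct stored_session {
    size_t len;
    uint8_t *copy;
};

/* Copies the packet into the session. Returns 0 on success, -1 if the
 * allocation fails. */
int session_store(struct stored_session *s, const struct pkt *p)
{
    assert(s != NULL && p != NULL);

    s->copy = malloc(p->len ? p->len : 1);
    if (s->copy == NULL)
        return -1;
    if (p->len > 0)
        memcpy(s->copy, p->data, p->len);
    s->len = p->len;
    return 0;
}

/* Frees the private copy; safe to call once the session expires. */
void session_release(struct stored_session *s)
{
    assert(s != NULL);
    free(s->copy);
    s->copy = NULL;
    s->len = 0;
}
```

With this layout the caller can free its packet right after `session_store` returns, and the expiry path only ever reads memory the session itself owns.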
I just uploaded the likely solution to the issue137 branch. I can't test it yet because I have an appointment right now. Will be back in an hour.
This logic was also used in the fragment database, so I tweaked that too.
6c0773c does indeed appear to fix the crashes. (At least my test node hasn't crashed after 20 minutes of uptime with a /24 used as the IPv4 pool, which never happened before.)
This doesn't actually seem to work right, though. Maybe I wasn't too clear about it earlier, but these IPv6 packets destined for
(The IPv4 packet was generated on
The fact that Jool forwards such packets violates section 2.5.2 of RFC 4291, which states:
added a commit on Mar 10, 2015
Oops, sorry :/
It seems we had never realized this because we do not assign default routes to the NAT64 machine in the lab. I actually had to add it now to force the :: packet to reach the cord.
(Isn't it a little weird that Linux routes packets towards :: without complaining?)
Anyway, I just committed a fix and there no longer seems to be any bogus traffic. Please bounce back if you see any more quirks.
(The code is still at the issue137 branch)
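The RFC 4291 section 2.5.2 rule raised above (the unspecified address must not be used as an IPv6 destination) comes down to a one-line check before a translated packet is sent. The sketch below is userspace C using the standard `IN6_IS_ADDR_UNSPECIFIED` macro from `<netinet/in.h>`. The function name `has_unspecified_dst` is hypothetical, and this is not the code committed to the issue137 branch.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true if the IPv6 destination is the unspecified address (::),
 * in which case the packet should be dropped rather than sent.
 */
bool has_unspecified_dst(const struct in6_addr *dst)
{
    assert(dst != NULL);
    return IN6_IS_ADDR_UNSPECIFIED(dst);
}
```

A translator would call this on the outgoing IPv6 destination and discard the packet when it returns true, instead of handing it to the routing layer.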
The issue137 branch looks good to me now - no crashes, no packets to
Yes and no, I guess. One could argue that the Class E space in IPv4 was forever broken because host stacks enforced that those addresses shouldn't be used. Now it's impossible to change that because of the huge installed base of host stacks that refuse to work with those addresses. So by allowing packets destined for the unspecified address, I guess the door is kept open for someone to standardise some use of such packets...