When a Jool node in stateless mode receives an IPv4 packet with TTL=1 whose destination address is in its blacklist, it steals the packet from its intended recipient and returns an ICMPv4 error:
Catching IPv4 packet: 192.168.1.1->192.168.1.2
Translating the Packet.
Sending ICMPv4 error: ICMPERR_HOP_LIMIT, type: 11, code: 0.
Whether the destination address is in the implicit blacklist (due to being assigned to a local interface) or explicitly added with jool_siit --blacklist --add makes no difference.
The easiest way to reproduce this is to run ping -t 1 $joolnode from a machine in the same LAN segment. The moment you do modprobe jool_siit disabled=0 on «$joolnode», the ICMPv4 echo-replies turn into ttl-exceeded errors. After removing the module again, or running jool_siit --disable, the echo-replies come back.
For me, this is a blocker that keeps me from putting Jool in production, because I need the Jool node to speak OSPF with its upstream router to dynamically advertise the various pools to the network. The upstream router sends OSPF hellos to the OSPF multicast address 224.0.0.5 with TTL=1, but since I cannot figure out how to prevent Jool from stealing these packets from the BIRD routing daemon, the OSPF adjacency cannot form, and the required routes cannot be advertised to the network.
Here's a full list of everything I see that might go wrong, which should cause SIIT Jool to drop packets before realizing they weren't meant to be translated:
The most reliable and natural way to solve this is by switching frameworks. This issue is another argument in favor of #140.
Alternatively, I can move the address translation to somewhere earlier. I think the best tradeoff between fixing the problem and not changing the code too aggressively is to move it to the beginning of the translate submodule. This will only solve bullets 4 and 5.
Bullet 5 is the only one I really care about here due to the OSPF issue (RFC 2328 requires TTL=1, and I have found no way to override that in JUNOS). So if it's not too much trouble to make a quick hack that solves bullet 5, that would be much appreciated.
Issue #167, simple hack version.
Moved the address translation before the TTL translation. This only solves bullet 5.
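To illustrate why the ordering matters, here is a toy Python model of the two orderings. It is only a sketch and not Jool's actual code: the `Packet` class, the verdict strings, and the function names are all made up for illustration, and the address mapping is a placeholder.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical verdicts; Jool's real return codes are not modelled here.
ACCEPT = "accept: hand back to the kernel untranslated"
ICMP_ERROR = "drop: send ICMPv4 time exceeded (type 11, code 0)"
TRANSLATED = "translated"


@dataclass
class Packet:
    src: str
    dst: str
    ttl: int


def translate_address(dst: str, blacklist: set) -> Optional[str]:
    """Return a stand-in IPv6 address, or None if dst must not be translated."""
    if dst in blacklist:
        return None
    # Placeholder mapping only; real SIIT uses pool6/EAMT rules.
    return "2001:db8::" + dst


def ttl_first(pkt: Packet, blacklist: set) -> str:
    """Old order: the TTL is checked before we know whether the packet is ours."""
    if pkt.ttl <= 1:
        return ICMP_ERROR
    if translate_address(pkt.dst, blacklist) is None:
        return ACCEPT
    return TRANSLATED


def address_first(pkt: Packet, blacklist: set) -> str:
    """New order: blacklisted packets are released before the TTL check."""
    if translate_address(pkt.dst, blacklist) is None:
        return ACCEPT
    if pkt.ttl <= 1:
        return ICMP_ERROR
    return TRANSLATED


# The ping from the log above: blacklisted destination, TTL=1.
blacklist = {"192.168.1.2"}
ping = Packet(src="192.168.1.1", dst="192.168.1.2", ttl=1)
print(ttl_first(ping, blacklist))      # ICMP error: the packet is stolen
print(address_first(ping, blacklist))  # accepted: the kernel gets the packet
```

A packet that really is meant to be translated and arrives with TTL=1 still gets the ICMPv4 error in both orderings, so only the blacklisted case changes.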
It should work now. Let's not close this issue until the problem gets a more elegant fix, though.
The kame is dancing, so indeed, that made my OSPF sessions come up. Thanks a lot! :-)