
Jool when translating always drops the packet due to an error #382

Closed
aalmenar opened this issue Jun 17, 2022 · 26 comments

@aalmenar

Hi, I have been playing with Jool a bit, but when I use my own address space for NAT64 I always get this error:

Jool NAT64/ba23ed80/default: ===============================================
Jool NAT64/ba23ed80/default: Packet: 2001:xxx:518f->2001:xxx::1f2f:4d04
Jool NAT64/ba23ed80/default: TCP 41168->80
Jool NAT64/ba23ed80/default: Step 1: Determining the Incoming Tuple
Jool NAT64/ba23ed80/default: Tuple: 2001:xxx:518f#41168 -> 2001:xxx::1f2f:4d04#80 (TCP)
Jool NAT64/ba23ed80/default: Done step 1.
Jool NAT64/ba23ed80/default: Step 2: Filtering and Updating
Jool NAT64/ba23ed80/default: BIB entry: 2001:xxx:518f#41168 - 194.1.1.1#22154 (TCP)
Jool NAT64/ba23ed80/default: Session entry: 2001:xxx:518f#41168 - 2001:xxx::1f2f:4d04#80 | 194.1.1.1#22154 - 31.47.77.4#80 (TCP)
Jool NAT64/ba23ed80/default: Done: Step 2.
Jool NAT64/ba23ed80/default: Step 3: Computing the Outgoing Tuple
Jool NAT64/ba23ed80/default: Tuple: 194.1.1.1#22154 -> 31.47.77.4#80 (TCP)
Jool NAT64/ba23ed80/default: Done step 3.
Jool NAT64/ba23ed80/default: Step 4: Translating the Packet
Jool NAT64/ba23ed80/default: Translating packet addresses 2001:xxx:518f->2001:xxx::1f2f:4d04...
Jool NAT64/ba23ed80/default: Result: 194.1.1.1->31.47.77.4
Jool NAT64/ba23ed80/default: Packet routed via device 'eth0'.
Jool NAT64/ba23ed80/default: Done step 4.
Jool NAT64/ba23ed80/default: Sending packet.
Jool NAT64/ba23ed80/default: dst_output() returned errcode 1.
Jool: Dropping packet.

When Jool is not running, the server is able to reach anything in public IPv4 space.

I have completely run out of ideas.

My jool.conf:

{
	"instance": "default",
	"framework": "netfilter",
	"global": {
		"maximum-simultaneous-opens": 1000,
		"drop-externally-initiated-tcp": true,
		"pool6": "2001:xxx::/96"
	},
	"pool4": [
		{
			"comment": "mark, port range and max-iterations are optional.",
			"protocol": "TCP",
			"prefix": "194.1.1.1",
			"port range": "21001-65535"
		},
		{
			"protocol": "UDP",
			"prefix": "194.1.1.1",
			"port range": "21001-65535"
		},
		{
			"protocol": "ICMP",
			"prefix": "194.1.1.1",
			"port range": "21001-65535"
		}
	]
}

sysctl values:

net.ipv4.ip_local_port_range = 10000 20000
net.ipv4.conf.all.forwarding=1
net.ipv6.conf.all.forwarding=1

I don't know if it's a bug or something else.

I have tried this with:

Jool 4.1.5, 4.1.8, 4.2.0-rc2
Kernels 5.10, 5.15, 5.16

@ydahhrk
Member

ydahhrk commented Jun 17, 2022

Well

Error code 1 is "Operation not permitted." This looks like an environment problem.

Have you tried flushing all iptables/nftables rules in all relevant network namespaces? Obviously not permanently, but rather to check whether those are blocking the packet.
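
For reference, a non-destructive way to do that could look like the following sketch (assumes iptables and nftables are installed; run as root, and restore afterwards):

```shell
# Sketch: temporarily flush all firewall rules, keeping a restore point.
# The /tmp paths are just examples.
flush_firewall() {
    iptables-save  > /tmp/iptables.backup    # snapshot current IPv4 rules
    ip6tables-save > /tmp/ip6tables.backup   # snapshot current IPv6 rules
    nft list ruleset > /tmp/nftables.backup  # snapshot current nftables ruleset
    iptables -F; iptables -t nat -F; iptables -t mangle -F
    ip6tables -F; ip6tables -t nat -F; ip6tables -t mangle -F
    nft flush ruleset
}

restore_firewall() {
    iptables-restore  < /tmp/iptables.backup
    ip6tables-restore < /tmp/ip6tables.backup
    nft -f /tmp/nftables.backup
}
```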

@ydahhrk
Member

ydahhrk commented Jun 17, 2022

Might want to try disabling reverse path filtering too
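
For reference, a sketch of turning it off on every interface (run as root; the kernel applies the maximum of the "all" and per-interface values, so both must be relaxed):

```shell
# Sketch: disable IPv4 reverse path filtering on all interfaces.
disable_rp_filter() {
    for knob in /proc/sys/net/ipv4/conf/*/rp_filter; do
        echo 0 > "$knob"    # 0 = off; 2 = loose mode, also worth trying
    done
}
```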

@aalmenar
Author

Well

Error code 1 is "Operation not permitted." This looks like an environment problem.

Have you tried flushing all iptables/nftables rules in all relevant network namespaces? Obviously not permanently, but rather to check whether those are blocking the packet.

Neither iptables nor nftables is installed (just a plain Debian 11 install with jool-tools and jool-dkms). I have also disabled rp_filter, but it still happens.

@ydahhrk
Member

ydahhrk commented Jun 21, 2022

Which image did you use to install Debian?

@aalmenar
Author

aalmenar commented Jun 21, 2022

Which image did you use to install Debian?

This one:
debian-11.3.0-amd64-netinst.iso

@ydahhrk
Member

ydahhrk commented Jul 4, 2022

Sorry for taking so long, and double sorry for not coming up with anything.

I installed Debian 11 in a Virtualbox virtual machine, and I could not reproduce the problem.

My host machine:

sudo ip link set vboxnet0 up
sudo ip addr flush dev vboxnet0 scope global
sudo ip address add 2001:db8::8/96 dev vboxnet0

sudo ip link set vboxnet1 up
sudo ip addr flush dev vboxnet1 scope global
sudo ip address add 194.1.1.8/24 dev vboxnet1

sudo ip route add 64:ff9b::/96 via 2001:db8::1

My virtual machine:

sudo apt install ./jool-dkms_4.1.8-1_all.deb ./jool-tools_4.1.8-1_amd64.deb

INTERFACE1=enp0s3
INTERFACE2=enp0s8

sudo ip link set $INTERFACE1 up
sudo ip address flush dev $INTERFACE1 scope global
sudo ip address add 2001:db8::1/96 dev $INTERFACE1
sudo ip link set $INTERFACE2 up
sudo ip address flush dev $INTERFACE2 scope global
sudo ip address add 194.1.1.1/24 dev $INTERFACE2

sudo sysctl -w net.ipv4.conf.all.forwarding=1
sudo sysctl -w net.ipv6.conf.all.forwarding=1
sudo sysctl -w net.ipv4.ip_local_port_range="10000 20000"

sudo modprobe jool
sudo jool file handle jool.conf

In jool.conf, I changed 2001:xxx::/96 to 64:ff9b::/96. Ping bounces back without problems:

$ ping 64:ff9b::194.1.1.8
PING 64:ff9b::194.1.1.8(64:ff9b::c201:108) 56 data bytes
64 bytes from 64:ff9b::c201:108: icmp_seq=1 ttl=63 time=1.19 ms
64 bytes from 64:ff9b::c201:108: icmp_seq=2 ttl=63 time=0.893 ms
64 bytes from 64:ff9b::c201:108: icmp_seq=3 ttl=63 time=1.13 ms
64 bytes from 64:ff9b::c201:108: icmp_seq=4 ttl=63 time=1.01 ms
^C
--- 64:ff9b::194.1.1.8 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3003ms
rtt min/avg/max/mdev = 0.893/1.055/1.188/0.113 ms

Want to discuss your network? You can find my email in my GitHub profile, if you prefer a less public medium.

@ydahhrk ydahhrk added the Status: Stuck Development paused due to unavailable external input label Jul 21, 2022
@ydahhrk ydahhrk added the Support User needs help label Jan 25, 2023
@nosnilmot

I have the "same" problem. Is it OK to report here or should I open a new issue?

Feb 15 17:25:12 aleutia.gravel kernel: Jool NAT64/903a8b40/nat64: ===============================================
Feb 15 17:25:12 aleutia.gravel kernel: Jool NAT64/903a8b40/nat64: Packet: 2a02:xxxx:xxxx:x:cd80:c8c0:1459:22c->64:ff9b::zzz:zzzz
Feb 15 17:25:12 aleutia.gravel kernel: Jool NAT64/903a8b40/nat64: ICMPv6 type:128 code:0 id:18956
Feb 15 17:25:12 aleutia.gravel kernel: Jool NAT64/903a8b40/nat64: Step 1: Determining the Incoming Tuple
Feb 15 17:25:12 aleutia.gravel kernel: Jool NAT64/903a8b40/nat64: Tuple: 2a02:xxxx:xxxx:x:cd80:c8c0:1459:22c#18956 -> 64:ff9b::zzz:zzzz#18956 (ICMP)
Feb 15 17:25:12 aleutia.gravel kernel: Jool NAT64/903a8b40/nat64: Done step 1.
Feb 15 17:25:12 aleutia.gravel kernel: Jool NAT64/903a8b40/nat64: Step 2: Filtering and Updating
Feb 15 17:25:12 aleutia.gravel kernel: Jool NAT64/903a8b40/nat64: BIB entry: 2a02:xxxx:xxxx:x:cd80:c8c0:1459:22c#18956 - 172.26.0.27#65426 (ICMP)
Feb 15 17:25:12 aleutia.gravel kernel: Jool NAT64/903a8b40/nat64: Session entry: 2a02:xxxx:xxxx:x:cd80:c8c0:1459:22c#18956 - 64:ff9b::zzz:zzzz#18956 | 172.26.0.27#65426 - y.yy.yy.yy#65426 (ICMP)
Feb 15 17:25:12 aleutia.gravel kernel: Jool NAT64/903a8b40/nat64: Done: Step 2.
Feb 15 17:25:12 aleutia.gravel kernel: Jool NAT64/903a8b40/nat64: Step 3: Computing the Outgoing Tuple
Feb 15 17:25:12 aleutia.gravel kernel: Jool NAT64/903a8b40/nat64: Tuple: 172.26.0.27#65426 -> y.yy.yy.yy#65426 (ICMP)
Feb 15 17:25:12 aleutia.gravel kernel: Jool NAT64/903a8b40/nat64: Done step 3.
Feb 15 17:25:12 aleutia.gravel kernel: Jool NAT64/903a8b40/nat64: Step 4: Translating the Packet
Feb 15 17:25:12 aleutia.gravel kernel: Jool NAT64/903a8b40/nat64: Routing: 172.26.0.27->y.yy.yy.yy
Feb 15 17:25:12 aleutia.gravel kernel: Jool NAT64/903a8b40/nat64: Packet routed via device 'enp2s0'.
Feb 15 17:25:12 aleutia.gravel kernel: Jool NAT64/903a8b40/nat64: Done step 4.
Feb 15 17:25:12 aleutia.gravel kernel: Jool NAT64/903a8b40/nat64: Sending packet.
Feb 15 17:25:12 aleutia.gravel kernel: Jool NAT64/903a8b40/nat64: dst_output() returned errcode 1.
Feb 15 17:25:12 aleutia.gravel kernel: Jool: Jool: Dropping packet.

Working:
kernel-3.10.0-1160.83.1.el7.x86_64
jool-4.1.7

Not working:
kernel-lt-5.4.231-1.el7.elrepo.x86_64
jool-4.1.9 (also tried jool-4.1.7 + cherry-picked 61e6764, necessary to compile for this kernel)

Everything else is identical between configs:
OS: CentOS 7.9
Jool using "netfilter" framework
I am using iptables, but the same config works fine with the 3.10 kernel. There are no Jool/NAT64-specific iptables rules.
I have tried disabling rp_filter (0) and permissive rp_filter (2).
SELinux is 'Enforcing', but I have also tried setting it to 'Permissive'.

/etc/jool.conf:

{
	"comment": "Configuration for the systemd NAT64 Jool service.",

	"instance": "nat64",
	"framework": "netfilter",

	"global": {
		"comment": "Sample pool6 prefix",
		"pool6": "64:ff9b::/96"
	},

	"comment": "Sample pool4 table",
	"pool4": [
		{
			"protocol": "ICMP",
			"prefix": "172.26.0.27",
			"port range": "61001-65535"
		}, {
			"protocol": "TCP",
			"prefix": "172.26.0.27",
			"port range": "61001-65535"
		}, {
			"protocol": "UDP",
			"prefix": "172.26.0.27",
			"port range": "61001-65535"
		}
	]
}

@ydahhrk
Member

ydahhrk commented Feb 22, 2023

kernel-lt-5.4.231-1.el7.elrepo.x86_64
OS: CentOS 7.9

Please forgive my ignorance, but I'm having a lot of trouble replicating this environment. (I dropped support for CentOS < 8 fairly long ago, to reduce the release and maintenance overhead.) It seems the distro's official gcc (4.8.5) is too old to even understand the kernel headers, and at this point I'm just guessing what you did.

What's your compiler, and its version?

(If you can provide entire installation instructions, that'd be better)

@ydahhrk ydahhrk removed the Status: Stuck Development paused due to unavailable external input label Feb 22, 2023
ydahhrk added a commit that referenced this issue Feb 22, 2023
Since it looks like it's going to be tricky to reproduce.
@ydahhrk
Member

ydahhrk commented Feb 22, 2023

BTW: Since I'm not really sure I'm going to be able to reproduce the problem, I just uploaded a commit that prints the packet when the "Operation not permitted" error is found: 8980f79

If you can compile/install/run it and give me the output, I can analyze the packet and try to find issues with it.

Feel free to censor fields, but let me know what you're censoring. Sample output:

$ sudo dmesg -t
NAT64 Jool v4.1.9.0 module inserted.
Jool: Created instance 'default'.
=================
EPERM packet
    skb fields:
        prev:00000000ad680a96
        next:00000000ad680a96
        dev:00000000d727315d
        len:84
        data_len:0
        mac_len:0
        hdr_len:0
        truesize:768
        pkt_type:0
        ignore_df:0
        ip_summed:0 (CHECKSUM_NONE)
        csum_valid:0
        csum_start:54111
        csum_offset:55397
        mark:0
        inner_transport_header:0
        inner_network_header:0
        inner_mac_header:0
        protocol:2048 (IPv4)
        transport_header:64
        network_header:44
        mac_header:44
        head:00000000f108ca23
        data:44
        tail:128
        end:192
    Content:
        IPv4 header:
            Version: 4
            IHL: 5
            TOS: 0
            Total Length: 84
            Fragment ID: 47434
            Reserved:0 DF:0 MF:0 FragmentOffset: 0
            TTL: 63
            Protocol: 1
            Checksum: 0x3e57
            Source Address: 192.0.2.2
            Destination Address: 192.0.2.5
        ICMPv4 header:
            Type:8 Code:0
            Checksum: 0x4ab8
            Rest 1: 1182
            Rest 2: 1
        Payload:
            93 65 f6 63 00 (...) 33 34 35 36 37 
    shared info:
        nr_frags:0
        tx_flags:0
        gso_size:0
        gso_segs:0
        gso_type:0
        frags:

@nosnilmot

kernel-lt-5.4.231-1.el7.elrepo.x86_64
OS: CentOS 7.9

Please forgive my ignorance, but I'm having a lot of trouble replicating this environment.

I was originally trying out newer kernels on CentOS 7 because I needed a TCP BBR-capable kernel, and that was easier than a full OS upgrade, but I recognize that was a nonstandard configuration and unsupportable by you and others.

I have now upgraded the OS to RHEL 8 with standard RH kernel (4.18.0). Jool still doesn't work for me though.

I then installed a clean RHEL 8 in a VM and tried to replicate the setup (network interfaces, sysctl knobs, iptables rules, routes), and thought I had it working in the VM with everything identical to the physical host (where it was still not working). Then I tweaked some things and it stopped working in the VM too; I tried to reverse what I'd tweaked, and it continued not to work.

(I dropped support for CentOS < 8 fairly long ago, to reduce the release and maintenance overhead.)

That's understandable.

It seems the distro's official gcc (4.8.5) is too old to even understand the kernel headers, and at this point I'm just guessing what you did.
(If you can provide entire installation instructions, that'd be better)

This is no longer relevant, but I was using kernel-lt from elrepo.org, which was built with gcc 9 from devtoolset-9 package: yum install centos-release-scl-rh && yum install devtoolset-9

BTW: Since I'm not really sure I'm going to be able to reproduce the problem, I just uploaded a commit that prints the packet when the "Operation not permitted" error is found: 8980f79

I was thinking about "Operation not permitted" - wouldn't that normally be returned as -EPERM, i.e. -1, not 1? I suspect the 1 returned by dst_output might indicate something other than a permission issue, and may not even indicate complete failure (although obviously the packets are not getting where I want them). That doesn't make the logging any less useful, though.

I'm also not sure that dst_output always returns 1; I think sometimes the Jool logs indicate packets are being sent successfully, but they are not actually going anywhere. I'm assuming I should be able to see the outbound packets with tcpdump?
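
Something like this sketch could confirm whether the translated packets reach the wire (tcpdump needs root; enp2s0 and the POOL4IP variable are placeholders matching the censored values in this thread):

```shell
# Sketch: capture the translated IPv4 ICMP packets on the egress interface.
# Adjust the interface name and set POOL4IP to the real pool4 address.
capture_outbound() {
    tcpdump -ni enp2s0 "host $POOL4IP and icmp"
}
```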

If you can compile/install/run it and give me the output, I can analyze the packet and try to find issues with it.

I will certainly put it on both my physical host and VM and try to replicate.

@nosnilmot

After installing 8980f79 from the issue382 branch, I managed to capture a log with logging-debug: true where a packet looked like it was sent successfully (according to the log) but, as far as I can tell, was not actually sent out of the network interface (there was certainly no response to the ping):

IPV6PREFIX, POOL4IP, NAT64DST and IP4DST are censored, hopefully it is self-evident what they are

Jool NAT64/87d30a00/nat64: ===============================================
Jool NAT64/87d30a00/nat64: Packet: IPV6PREFIX:d9c2:11c6:8262:ee90->64:ff9b::NAT64DST
Jool NAT64/87d30a00/nat64: ICMPv6 type:128 code:0 id:47478
Jool NAT64/87d30a00/nat64: Step 1: Determining the Incoming Tuple
Jool NAT64/87d30a00/nat64: Tuple: IPV6PREFIX:d9c2:11c6:8262:ee90#47478 -> 64:ff9b::NAT64DST#47478 (ICMP)
Jool NAT64/87d30a00/nat64: Done step 1.
Jool NAT64/87d30a00/nat64: Step 2: Filtering and Updating
Jool NAT64/87d30a00/nat64: BIB entry: IPV6PREFIX:d9c2:11c6:8262:ee90#47478 - POOL4IP#63934 (ICMP)
Jool NAT64/87d30a00/nat64: Session entry: IPV6PREFIX:d9c2:11c6:8262:ee90#47478 - 64:ff9b::NAT64DST#47478 | POOL4IP#63934 - IP4DST#63934 (ICMP)
Jool NAT64/87d30a00/nat64: Done: Step 2.
Jool NAT64/87d30a00/nat64: Step 3: Computing the Outgoing Tuple
Jool NAT64/87d30a00/nat64: Tuple: POOL4IP#63934 -> IP4DST#63934 (ICMP)
Jool NAT64/87d30a00/nat64: Done step 3.
Jool NAT64/87d30a00/nat64: Step 4: Translating the Packet
Jool NAT64/87d30a00/nat64: Routing: POOL4IP->IP4DST
Jool NAT64/87d30a00/nat64: Packet routed via device 'enp2s0'.
Jool NAT64/87d30a00/nat64: Done step 4.
Jool NAT64/87d30a00/nat64: Sending packet.
Jool NAT64/87d30a00/nat64: Success.
Jool NAT64/87d30a00/nat64: ===============================================

I turned off logging-debug, but soon after (I left the ping running for no more than a few minutes) the EPERM logs started showing up (I haven't censored anything here, but the Content looks suspiciously empty):

=================
EPERM packet
    skb fields:
        prev:0000000000000000
        next:0000000000000000
        dev:00000000aa3d0c56
        len:3098544836
        data_len:1083562863
        mac_len:0
        hdr_len:0
        truesize:768
        pkt_type:0
        ignore_df:0
        ip_summed:2 (CHECKSUM_COMPLETE)
        csum_valid:1
        csum_start:42513
        csum_offset:25225
        mark:0
        inner_transport_header:0
        inner_network_header:0
        inner_mac_header:0
        protocol:2048 (IPv4)
        transport_header:118
        network_header:98
        mac_header:84
        head:000000002d2096c6
        data:84
        tail:134
        end:192
    Content:
        [Unknown layer 3 protocol: 0]
    shared info:
        nr_frags:0
        tx_flags:0
        gso_size:0
        gso_segs:0
        gso_type:0
        frags:

@ydahhrk
Member

ydahhrk commented Feb 23, 2023

I was thinking about "Operation not permitted" - wouldn't that normally be returned as -EPERM, i.e. -1, not 1? I suspect the 1 returned by dst_output might indicate something other than a permission issue, and may not even indicate complete failure (although obviously the packets are not getting where I want them).

Good point. There don't seem to be any IPv4 output functions that return positive EPERM. Although the POSTROUTING chain is a big black box.

I'm also not sure that dst_output always returns 1; I think sometimes the Jool logs indicate packets are being sent successfully, but they are not actually going anywhere. I'm assuming I should be able to see the outbound packets with tcpdump?

If they reach the network interface, yes. There's possibly a lot of code before that, is the thing.

I turned off logging-debug but then soon after (I left the ping running for no more than a few minutes), the EPERM logs started showing up (I haven't censored anything here, but the Content looks suspiciously empty):

Ok, this output is haunting. This is definitely a bug now.

@ydahhrk ydahhrk added Bug and removed Support User needs help labels Feb 23, 2023
@nosnilmot

I'm still trying to recreate the issue in a VM to provide you with a more concrete and minimal reproducer, but I'm not getting very far; currently it is working in the VM.

I have found one odd issue that has been distracting because externally it looks like I've reproduced the issue, but actually it's something else.

As soon as I load the iptables ruleset (there are no Jool-specific rules; Jool is in netfilter mode), the jool module unloads itself (or is unloaded):

[root@hostname ~]# systemctl start iptables
[root@hostname ~]# dmesg -t | tail -2
Jool: Deleted instance 'nat64'.
NAT64 Jool v4.1.9.0 module removed.

Translation starts working again after reloading module

[root@hostname ~]# modprobe jool
[root@hostname ~]# jool file handle /etc/jool.conf

@nosnilmot

As soon as I load the iptables ruleset (there are no Jool-specific rules; Jool is in netfilter mode), the jool module unloads itself (or is unloaded):

That was caused by the iptables systemd unit conflicting with, and stopping, firewalld, which has the config option below. The jool_common module depends on nf_defrag_ipv6 and nf_defrag_ipv4, so it was also getting unloaded. Not a Jool issue at all:

# Clean up kernel modules on exit
# If set to yes or true the firewall related kernel modules will be
# unloaded on exit or stop of firewalld. This might attempt to unload
# modules not originally loaded by firewalld.
# Default: yes
CleanupModulesOnExit=yes
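
The dependency chain can be confirmed without unloading anything; a sketch, assuming the modules are loaded:

```shell
# Sketch: inspect jool_common's module dependencies and current users.
show_jool_deps() {
    modinfo -F depends jool_common    # lists nf_defrag_ipv4 and nf_defrag_ipv6
    lsmod | grep -E 'jool|nf_defrag'  # "Used by" column shows the reverse links
}
```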

ydahhrk added a commit that referenced this issue Feb 23, 2023
@ydahhrk
Member

ydahhrk commented Feb 23, 2023

Debug commit, second version: issue382 branch

This time, it'll print the incoming packet, the translated packet, and two additional pieces of information in the middle:

=================
EPERM in packet
    skb fields:
        prev:0000000000000000
        next:0000000000000000
        dev:00000000e4cff640
        len:84
        data_len:0
        mac_len:14
        hdr_len:0
        truesize:768
        pkt_type:0
        ignore_df:0
        ip_summed:0 (CHECKSUM_NONE)
        csum_valid:0
        csum_start:25320
        csum_offset:41005
        mark:0
        inner_transport_header:0
        inner_network_header:0
        inner_mac_header:0
        protocol:2048 (IPv4)
        transport_header:36
        network_header:16
        mac_header:2
        head:00000000f79b743d
        data:16
        tail:100
        end:192
    Content:
        IPv4 header:
            Version: 4
            IHL: 5
            TOS: 0
            Total Length: 84
            Fragment ID: 2278
            Reserved:0 DF:0 MF:0 FragmentOffset: 0
            TTL: 64
            Protocol: 1
            Checksum: 0xedbb
            Source Address: 192.0.2.5
            Destination Address: 192.0.2.2
        ICMPv4 header:
            Type:0 Code:0
            Checksum: 0xe032
            Rest 1: 2505
            Rest 2: 1
        Payload:
            8d b1 f7 63 00 (...) 33 34 35 36 37 
    shared info:
        nr_frags:0
        tx_flags:0
        gso_size:0
        gso_segs:0
        gso_type:0
        frags:
out packet index: 0
state flags: 6
=================
EPERM out packet
    skb fields:
        prev:0000000027b49b98
        next:0000000027b49b98
        dev:000000009ab09f72
        len:104
        data_len:0
        mac_len:0
        hdr_len:0
        truesize:768
        pkt_type:0
        ignore_df:0
        ip_summed:0 (CHECKSUM_NONE)
        csum_valid:0
        csum_start:25320
        csum_offset:41005
        mark:0
        inner_transport_header:0
        inner_network_header:0
        inner_mac_header:0
        protocol:34525 (IPv6)
        transport_header:64
        network_header:24
        mac_header:10
        head:00000000bd9ae3b1
        data:24
        tail:128
        end:192
    Content:
        IPv6 header:
            Version: 6
            Priority: 0
            Flow Label: 0x000000
            Payload Length: 64
            Next Header: 58
            Hop Limit: 63
            Source Address: 64:ff9b::c000:205
            Destination Address: 2001:db8::5
        ICMPv6 header:
            Type:129 Code:0
            Checksum: 0x37bc
            Rest 1: 16641
            Rest 2: 1
        Payload:
            8d b1 f7 63 00 (...) 33 34 35 36 37 
    shared info:
        nr_frags:0
        tx_flags:0
        gso_size:0
        gso_segs:0
        gso_type:0
        frags:

If you can provide the new output, the review should be significantly easier.

@ydahhrk
Member

ydahhrk commented Feb 23, 2023

Cancel; I need to rethink this.

@ydahhrk
Member

ydahhrk commented Feb 23, 2023

Ok, the issue382 branch should work properly again.

@ydahhrk
Member

ydahhrk commented Feb 23, 2023

I'm sorry; I need to ask two very obnoxious questions.

Are you completely sure that there are no stray routes in your routing table?

ip route
ip -6 route
Jool NAT64/87d30a00/nat64: Packet routed via device 'enp2s0'.

Is the packet really meant to depart through this interface?

@nosnilmot

Ok, the issue382 branch should work properly again.

"WTF; I can't replicate anything. I literally wish my VM were more
unstable."

I feel your pain.

Yesterday, after much back and forth trying different things, with seemingly random things working and failing in the VM, I thought I had found a pattern. (It looks obvious when written like this, but it was not obvious when trying different things minutes apart, without realising that what changed the behaviour was not anything I had done, just the passing of time/packets.)

  1. Immediately after boot, Jool/NAT64 works fine
  2. After some period (typically 2-5 minutes), ICMP echo replies would stop arriving at the client; logging-debug true logs indicated ICMP echo request packets were being sent out, but I don't think they were
  3. After a short while longer, those logs would turn into 'dst_output returned 1' / EPERM errors
  4. From that point on, and until the next reboot (removing the nat64 instance and rmmod/modprobe jool did not clear it), it was impossible to use ping to that IP to test (always 'dst_output returned 1'), but other IPs would work (briefly, following the same pattern).

Of course, today it's not behaving like that at all, and I have a ping that has been running successfully via the VM for nearly 2 hours. Nothing in your latest commits was expected to fix it, right? (I haven't tried it on the physical host yet.)

I'm sorry; I need to ask two very obnoxious questions.

I don't think they're obnoxious.

Are you completely sure that there are no stray routes in your routing table?

Fairly sure on the translator side; those tables are quite straightforward. On the client side, it is macOS, so ... who knows 😄 There's a lot of stuff in there, though I don't think any of it should be affecting this (on the client, only the 64:ff9b::/96 route is relevant, right?):

$ netstat -rn | wc -l
     637

Is it the client's routing table you're looking for, or the translator's? If the client's, I'll try to clean it up and send it, or find a different client to test from. Details for the translator, both physical host and VM, are below.

ip route
ip -6 route

Censored fields:

IPV6PREFIX - local IPv6 network prefix
NAT64DST - IPv6 NAT64 address of destination of ping for testing
HWPOOL4IP - IPv4 pool4 address of physical host (public IP of translator)
HW2NDIP4 - Secondary IPv4 address of physical host (public)
VMPOOL4IP - IPv4 pool4 address of VM (public)
VM2NDIP4 - Secondary IPv4 address of VM (public)
IP4GW - IPv4 gateway address (public)
IP4NET - IPv4 network address (public)
HWPRVIP4 - Private IPv4 address of physical host
VMPRVIP4 - Private IPv4 address of VM
PRVIP4NET - Private IPv4 network address

Physical Host: ip route:

default via IP4GW dev enp2s0 proto static metric 100 
IP4NET/29 dev enp2s0 proto kernel scope link src HWPOOL4IP metric 100 
IP4NET/29 dev enp2s0 proto kernel scope link src HW2NDIP4 metric 100 
PRVIP4NET/16 dev enp3s0 proto kernel scope link src HWPRVIP4 metric 101 

Physical Host: ip -6 route

::1 dev lo proto kernel metric 256 pref medium
64:ff9b::/96 via fe80::20c:29ff:feea:663b dev enp3s0 proto ra metric 101 pref medium
IPV6PREFIX::/64 dev enp3s0 proto ra metric 101 pref medium
fe80::/64 dev enp2s0 proto kernel metric 1024 pref medium
fe80::/64 dev enp3s0 proto kernel metric 1024 pref medium
default via fe80::21d:aaff:fe84:2138 dev enp3s0 proto ra metric 101 pref medium

Note: the 64:ff9b route here is pointing to the VM currently, but is not when I am testing the physical host.

VM: ip route

default via IP4GW dev ens160 proto static metric 100 
IP4NET/29 dev ens160 proto kernel scope link src VMPOOL4IP metric 100 
IP4NET/29 dev ens160 proto kernel scope link src VM2NDIP4 metric 100 
PRVIP4NET/16 dev ens224 proto kernel scope link src VMPRVIP4 metric 101 

VM: ip -6 route

::1 dev lo proto kernel metric 256 pref medium
IPV6PREFIX::/64 dev ens224 proto ra metric 101 pref medium
fe80::/64 dev ens224 proto kernel metric 1024 pref medium
fe80::/64 dev ens160 proto kernel metric 1024 pref medium
default via fe80::21d:aaff:fe84:2138 dev ens224 proto ra metric 101 pref medium
Jool NAT64/87d30a00/nat64: Packet routed via device 'enp2s0'.

Is the packet really meant to depart through this interface?

Yes. For the purposes of Jool/NAT64:
enp2s0 (ens160 in VM) = IPv4
enp3s0 (ens224 in VM) = IPv6

@nosnilmot

Ok, the issue382 branch should work properly again.

I have reproduced on VM, here's an example from the logs:

=================
EPERM in packet
    skb fields:
        prev:0000000000000000
        next:0000000000000000
        dev:0000000073e11aa1
        len:56
        data_len:0
        mac_len:14
        hdr_len:0
        truesize:2304
        pkt_type:0
        ignore_df:0
        ip_summed:0 (CHECKSUM_NONE)
        csum_valid:0
        csum_start:0
        csum_offset:0
        mark:0
        inner_transport_header:0
        inner_network_header:0
        inner_mac_header:0
        protocol:34525 (IPv6)
        transport_header:118
        network_header:78
        mac_header:64
        head:00000000cb3c08f3
        data:78
        tail:134
        end:1728
    Content:
        IPv6 header:
            Version: 6
            Priority: 0
            Flow Label: 0x0a0200
            Payload Length: 16
            Next Header: 58
            Hop Limit: 64
            Source Address: IPV6PREFIX:748b:128f:8250:669d
            Destination Address: 64:ff9b::808:808
        ICMPv6 header:
            Type:128 Code:0
            Checksum: 0x3abe
            Rest 1: 9160
            Rest 2: 443
        Payload:
            63 f8 ac 96 00 07 83 76 
    shared info:
        nr_frags:0
        tx_flags:0
        gso_size:0
        gso_segs:0
        gso_type:0
        frags:
out packet index: 0
state flags: 1
=================
EPERM out packet
    skb fields:
        prev:0000000000000000
        next:0000000000000000
        dev:00000000408a0526
        len:36
        data_len:0
        mac_len:0
        hdr_len:0
        truesize:768
        pkt_type:0
        ignore_df:0
        ip_summed:0 (CHECKSUM_NONE)
        csum_valid:0
        csum_start:0
        csum_offset:0
        mark:0
        inner_transport_header:0
        inner_network_header:0
        inner_mac_header:0
        protocol:2048 (IPv4)
        transport_header:118
        network_header:98
        mac_header:98
        head:00000000bcfdc912
        data:98
        tail:134
        end:192
    Content:
        IPv4 header:
            Version: 4
            IHL: 5
            TOS: 0
            Total Length: 36
            Fragment ID: 15106
            Reserved:0 DF:0 MF:0 FragmentOffset: 0
            TTL: 63
            Protocol: 1
            Checksum: 0xf8f6
            Source Address: VMPOOL4IP
            Destination Address: 8.8.8.8
        ICMPv4 header:
            Type:8 Code:0
            Checksum: 0x6243
            Rest 1: 65524
            Rest 2: 443
        Payload:
            63 f8 ac 96 00 07 83 76 
    shared info:
        nr_frags:0
        tx_flags:0
        gso_size:0
        gso_segs:0
        gso_type:0
        frags:
=================

All of the log entries have the same state flags: 1

I think I've narrowed down all the configuration required to reproduce in a VM from a clean RHEL8 install, although I haven't gone as far as to start from scratch just yet.

Of course, it all comes down to the one reason I wanted an updated kernel in the first place: BBR support.

/etc/sysctl.d/90-bbr.conf:

net.core.default_qdisc=fq
net.ipv4.tcp_congestion_control=bbr

Reboot after setting those; I don't think they take effect after the link is up, and I didn't figure out how to reproduce when setting them after boot. (Watch out for these being included in the initrd too; otherwise they might still be applied.)
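
A quick sketch to confirm the knobs actually stuck after the reboot (ens160 is just the example interface name used below):

```shell
# Sketch: verify the congestion-control settings took effect.
check_bbr() {
    sysctl -n net.core.default_qdisc            # expect: fq
    sysctl -n net.ipv4.tcp_congestion_control   # expect: bbr
    tc qdisc show dev ens160                    # should list "fq" if applied before link-up
}
```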

/etc/jool.conf:

{
  "comment": "pool4 table",
  "instance": "nat64",
  "framework": "netfilter",
  "global": {
    "comment": "pool6 prefix",
    "pool6": "64:ff9b::/96"
  },
  "pool4": [
    {
      "protocol": "ICMP",
      "prefix": "VMPOOL4IP",
      "port range": "61001-65535"
    },
    {
      "protocol": "TCP",
      "prefix": "VMPOOL4IP",
      "port range": "61001-65535"
    },
    {
      "protocol": "UDP",
      "prefix": "VMPOOL4IP",
      "port range": "61001-65535"
    }
  ]
}

Configure network:

# set these
VMPOOL4IP=
IP4GW=
IPV6PREFIX=

IP4IF=ens160
IP6IF=ens224
VMIP6=$IPV6PREFIX:20c:29ff:feea:663b/64

sudo ip link set $IP4IF up
sudo ip address flush dev $IP4IF scope global
sudo ip address add $VMPOOL4IP/29 dev $IP4IF
sudo ip route add default via $IP4GW

sudo ip link set $IP6IF up
sudo ip address flush dev $IP6IF scope global
sudo ip address add $VMIP6 dev $IP6IF

(default IPv6 route discovered by RA)

Start jool:

sudo modprobe jool
sudo /usr/local/bin/jool file handle /etc/jool.conf

Then ping from an IPv6 client that has a route for 64:ff9b::/96 via this VM and wait ~5 minutes:

ping6 -n 64:ff9b::8.8.8.8

@ydahhrk
Member

ydahhrk commented Feb 24, 2023

  1. After some period (2-5 minutes typically) ICMP echo replies would stop arriving at the client; logging-debug true logs indicated ICMP echo request packets were being sent out, but I don't think they were
  2. After a short while longer, those logs would turn into 'dst_output returned 1' / EPERM errors

This reminds me of an old troubleshooting experience in which neighbors were disappearing from the neighbor table for some reason. It'd work for a while, then the neighbor entry would expire, and not get renewed automatically. Perhaps because Jool was eating ARP packets. So the ping would randomly stop working.

I'm still trying to replicate the bug, but maybe try monitoring that table:

$ ip neigh
192.168.1.174 dev wlp3s0 lladdr 84:16:f9:15:d5:4b STALE
192.168.1.1 dev wlp3s0 lladdr 64:09:ad:3a:fa:96 REACHABLE
fe80::e52e:76dc:64a7:cd95 dev wlp3s0 lladdr 84:16:f9:15:d5:4b STALE

The neighbor tables of the VM, host and gateway are all suspects.
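If it helps to automate that monitoring, a minimal sketch of flagging dead entries — the sample output below is inlined for illustration (addresses are made up); on the real machines, pipe the live output of `ip neigh` in instead. FAILED and INCOMPLETE are the standard iproute2 NUD states that indicate resolution broke:

```shell
# Flag neighbor entries stuck in FAILED or INCOMPLETE state.
# Inlined sample output; replace with live `ip neigh` output in practice.
sample='192.168.1.174 dev wlp3s0 lladdr 84:16:f9:15:d5:4b STALE
192.168.1.1 dev wlp3s0 lladdr 64:09:ad:3a:fa:96 REACHABLE
192.168.1.50 dev wlp3s0 FAILED'
echo "$sample" | awk '$NF == "FAILED" || $NF == "INCOMPLETE" { print $1 }'
```

Running it under `watch` on the translator, the host, and the gateway would show whether an entry for the pool4 address or the gateway goes dead around the time the ping stops.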

Nothing in your latest commits was expected to fix it, right? (haven't tried it on physical host yet)

Nothing at all.

(on client only the 64:ff9b::/96 route is relevant, right?)

As long as it's the most specific one for the relevant packets, I think it's safe to assume so.

Is it the client's route table you're looking for, or the translator's?

On the translator; I was thinking of another troubleshooting session in which Jool's routing table had a bunch of bogus routes of unknown origin, so packets bounced pointlessly on loopback until their TTL expired.

Although every other routing table is also relevant. (But the clients' are much harder to misconfigure, I imagine.)
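A quick way to scan for that symptom — routes that send translator traffic over loopback — is sketched below. The sample output is inlined for illustration (documentation-range addresses, not from this thread); on the translator, pipe `ip route show table all` in instead:

```shell
# Look for suspicious routes that steer traffic over loopback.
# Inlined sample output; use `ip route show table all` in practice.
routes='default via 192.0.2.1 dev ens160 proto static metric 100
198.51.100.0/29 dev ens160 proto kernel scope link
203.0.113.5 dev lo scope link'
echo "$routes" | grep -E ' dev lo ' | grep -vE '^(local|broadcast)'
```

The `local`/`broadcast` entries in the `local` table legitimately point at loopback, which is why they are filtered out.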

Details for translator, both physical host and VM, are below.

Oh, my network was different. Good to know.

Actually, I'm getting more confused now. Please bear with my cluelessness, as I'm a developer, not a network admin/designer. It seems this is the path the packet has to travel:

                                                    +-----------------+
                                                    | enp2s0 (ens224) |
+----------------+   +-----------------+            | VMPOOL4IP       |   +---------+
| Mac            |   | enp3s0 (ens160) |            | VM2NDIP4        |   | Gateway |   +------
| IPV6PREFIX::x1 |---| IPV6PREFIX::x2  |--- Jool ---| HWPOOL4IP       |---| IP4GW   |---| ...
+----------------+   +-----------------+            | HW2NDIP4        |   +---------+   +------
                                                    | HWPRVIP4        |
                                                    | VMPRVIP4        |
                                                    +-----------------+

HWPOOL4IP - IPv4 pool4 address of physical host (public IP of translator)
VMPOOL4IP - IPv4 pool4 address of VM (public)

What does this mean? Are you NATting VMPOOL4IP into HWPOOL4IP or something? And why? You seem to be using bridges.

HWPRVIP4 - Private IPv4 address of physical host
VMPRVIP4 - Private IPv4 address of VM
PRVIP4NET - Private IPv4 network address

What's the private network for? Something unrelated?

default via IP4GW dev ens160 proto static metric 100
IP4NET/29 dev ens160 proto kernel scope link src VMPOOL4IP metric 100
IP4NET/29 dev enp2s0 proto kernel scope link src HW2NDIP4 metric 100

Back to the neighbor theory: Is enp2s0 answering ARP requests for the pool4 addresses?

Some ARP tangle would explain the problem fairly well, because the routing table would yield a route for Jool to use, and then Linux would later fail to fetch the packet once it reaches layer 2.

Once communication breaks, is it possible for the physical machine and/or VM to ping the gateway using VMPOOL4IP as a source?

I think this would do it:

ping IP4GW -I VMPOOL4IP

And if it doesn't work, does it also remain broken after you remove Jool?

I have reproduced on VM, here's an example from the logs:

Ok, so it looks like the suspiciously empty packet was actually a consequence of my recent brain fart. We're back to not knowing if this is a Jool bug or not; the packets look fine.

Of course it all comes down to the one reason I wanted an updated kernel in the first place, BBR support:

Suspicious. A TCP-specific technology, breaking ICMP traffic?

Hmmmmmm

@ydahhrk ydahhrk added Support User needs help and removed Bug labels Feb 24, 2023
@nosnilmot

Actually, I'm getting more confused now. Please bear with my cluelessness, as I'm a developer, not a network admin/designer. It seems this is the path the packet has to travel:

Oh no! I've horribly confused you. It's late here, so I'm not going to reply to your whole message, but I have two parallel configurations I have been testing: (1) a physical host (HW), and (2) a virtual machine (VM).

They are installed in parallel on the same networks to try to replicate exactly the problem from HW on VM where there will be less network traffic to make diagnosis easier (well that was the plan).

So it looks EITHER like this:

                                                    +-----------------+
+----------------+   +-----------------+            | enp2s0          |   +---------+
| Mac            |   | enp3s0          |            | HWPOOL4IP       |   | Gateway |   +------
| IPV6PREFIX::x1 |---| IPV6PREFIX::x2  |--- Jool ---| HW2NDIP4        |---| IP4GW   |---| ...
+----------------+   +-----------------+            | HWPRVIP4        |   +---------+   +------
                                                    +-----------------+

OR like this:

                                                    +-----------------+
+----------------+   +-----------------+            | ens224          |   +---------+
| Mac            |   | ens160          |            | VMPOOL4IP       |   | Gateway |   +------
| IPV6PREFIX::x1 |---| IPV6PREFIX::x2  |--- Jool ---| VM2NDIP4        |---| IP4GW   |---| ...
+----------------+   +-----------------+            | VMPRVIP4        |   +---------+   +------
                                                    +-----------------+

... depending on whether radvd is running on the physical host or the VM to advertise the route for 64:ff9b::/96

@ydahhrk
Member

ydahhrk commented Feb 25, 2023

In the VM version, how are you connecting the interfaces to the physical network? Is it a bridge? Or the Mac is actually the host?

@nosnilmot

Brief status update from my side:

Spurred on by the comments here I have been rebuilding parts of my network to provide more robust isolation of network segments and eliminate the possibility of ARP issues (although I don't think there were any; just to be sure...).

Unfortunately it hasn't all gone to plan and I have encountered a few other unrelated issues that will slow down diagnosis of these Jool/NAT64 issues in the short term.

@zajdee

zajdee commented Apr 29, 2023

net.core.default_qdisc=fq

I didn't expect to end up here, but seeing fq here indicates that this issue is indeed the same as the one I reported in #400. @ydahhrk was kind enough to perform kernel debugging, and #400 now contains a precisely pinpointed spot in the kernel that is causing these packet drops. Unfortunately, at this moment we don't understand what the root cause is. Switching to a different qdisc than fq helped in my case.
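For anyone hitting this before the fix landed, the workaround described above amounts to setting any default qdisc other than fq; fq_codel below is just an example replacement, not something mandated by this thread. (Whether BBR still paces correctly without fq depends on the kernel; recent kernels have internal TCP pacing, but verify for your setup.)

```
# /etc/sysctl.d/90-bbr.conf (workaround: any default qdisc other than fq)
net.core.default_qdisc=fq_codel
net.ipv4.tcp_congestion_control=bbr
```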

ydahhrk added a commit that referenced this issue May 3, 2023
Appears to be the proper solution for #382 and #400.
@ydahhrk ydahhrk added Bug and removed Support User needs help labels Jun 11, 2023
@ydahhrk ydahhrk added this to the 4.1.10 milestone Jun 11, 2023
@ydahhrk
Member

ydahhrk commented Jun 12, 2023

4.1.10 released; closing.

@ydahhrk ydahhrk closed this as completed Jun 12, 2023