UDP6 checksum problem #1832

sconemad · 2017-02-07T15:00:39Z

I've found a problem in the way that the UDP checksum is calculated for outgoing UDP packets over IPv6 (I'm using the latest up-to-date Raspbian on a Pi 3 if that makes any difference)

I'm testing an application on the Pi which uses UDP transport over IPv6, sending datagrams to another machine. I've noticed that occasionally a datagram is sent from the Pi which is never received by the application on the other machine, even when repeatedly re-transmitted. Using wireshark, I was able to see the UDP packets arriving on the other machine, and noticed that the "missing" packets had a UDP checksum of 0.

According to RFC 2460, checksums are not optional for UDP packets over IPv6 like they are for IPv4, so the value 0 is invalid (in the case where the checksum is calculated to be 0, it's supposed to be replaced with 0xFFFF). I believe this is why the other machine is discarding these packets. Since the checksum is a 16 bit number, this would happen on average for 1 in every 65536 packets.

I've written a small test program in c to demonstrate this issue (attached as udpcs.c.txt).
You'll need a Pi with Raspbian, and another machine (e.g. Linux x86), connected over Ethernet with IPv6 supported.
Copy the test program onto both machines and build with:
gcc -o udpcs udpcs.c

On the other (non-pi) machine, run ./udpcs. This will start a simple UDP6 echo server.
On the Pi, run ./udpcs SOURCE DEST

Where SOURCE is the IPv6 address of the Pi, and DEST is the IPv6 address of the other machine. This will send test UDP6 datagrams to the server, consisting of a sequence number and random data, and wait for the datagram to be echoed back, retrying if there is no reply within 1 second. It also calculates and reports the UDP checksum for each packet.

When I run this, I get a steady stream of datagrams being sent and received, but at some point it will stop and keep retrying, where the packet checksum is 0xFFFF. Looking at the output on the other machine, you can see the packet in question is never received.

Here's an example of what I get on the client:

...
SEND(seq=18103 checksum=0d62 size=128) ... RECV(seq=18103 checksum=0d62 size=128)
SEND(seq=18104 checksum=6c9c size=128) ... RECV(seq=18104 checksum=6c9c size=128)
SEND(seq=18105 checksum=f37e size=128) ... RECV(seq=18105 checksum=f37e size=128)
SEND(seq=18106 checksum=bee2 size=128) ... RECV(seq=18106 checksum=bee2 size=128)
SEND(seq=18107 checksum=ffff size=128) ... retrying due to timeout
SEND(seq=18107 checksum=ffff size=128) ... retrying due to timeout
SEND(seq=18107 checksum=ffff size=128) ... retrying due to timeout
...repeats...

And on the server:

...
RECV(seq=18103 size=128) ... SEND(size=128)
RECV(seq=18104 size=128) ... SEND(size=128)
RECV(seq=18105 size=128) ... SEND(size=128)
RECV(seq=18106 size=128) ... SEND(size=128)
...stops here...

(obviously, the exact point where it stops will vary, since the UDP checksum depends on the source and destination IPv6 addresses and the random data).

By default on the Pi, UDP checksums are calculated on the NIC before sending (using checksum offloading), which can be disabled by installing "ethtool" and running sudo ethtool --offload eth0 rx off tx off. Doing this makes the problem go away, which would indicate that the NIC's checksum calculation algorithm is at fault. Does anyone know where this lives, and how to go about getting it fixed?

udpcs.c.txt

The text was updated successfully, but these errors were encountered:

P33M · 2017-02-07T15:31:24Z

If it's a function of a setting of the onboard NIC, then have you tried using a different USB-ethernet adapter?

There are multiple chipsets available on the open market from different manufacturers. This would tell you for definite if it's a function of hardware.

popcornmix · 2017-02-07T15:40:44Z

Do you believe checksums are correct except the case where the checksum is 0x0000 which should be replaced by 0xffff? Or are you getting many spurious 0x0000 checksums?

Ferroin · 2017-02-07T15:46:08Z

@P33M if he fixed it using ethtool, it's either the hardware, the firmware for the NIC (I don't recall whether the chip the Pi's use has loadable firmware), or the driver. In any of the three cases, if he gets an adapter with a different chipset, I can almost guarantee it will work (the almost referring to the possibility that he happens to find a different chipset that has exactly the same behavior).

popcornmix · 2017-02-07T16:05:02Z

I suspect the issues lies in here:
https://github.com/raspberrypi/linux/blob/rpi-4.4.y/drivers/net/usb/smsc95xx.c#L1894

I'm not sure if it needs a check for 0x0000 and replacement with 0xffff.

Ferroin · 2017-02-07T16:08:56Z

Possibly. I hadn't noticed the ethtool command used before, but that implies heavily that it's hardware or firmware, because that particular ethtool command disables checksum offloading.

@sconemad Do things work when you only disable offloading for transmit, or do you have to disable both to get it to work?

JamesH65 · 2017-02-07T16:24:50Z

Changing the smsc driver there will only fix for packets < length 45 as they use a SW checksum, otherwise I presume the HW inserts the checksum on transmit, after this function. IIR the code correctly!

Might be worth hacking it to always use SW checksum to see if it fixes the problem.

popcornmix · 2017-02-07T16:29:15Z

I changed the if (skb->len <= 45) to if (1) and it still failed.

Ferroin · 2017-02-07T16:42:40Z

The driver seems to default to having TX checksum offloading enabled, which would suggest that the if (skb->len <= 45) may be pointless to begin with.

@popcornmix any chance you could try that modified kernel and then run sudo ethtool --offload eth0 rx off tx off before doing the actual test? If that works, then that points at a possibly separate issue in the driver where it's not preventing packets it's checksummed in SW from being re checksummed in HW.

JamesH65 · 2017-02-07T16:49:54Z

@dom presumably you also put in a if (sum == 0) sum = 0xffff thingy in the SW calc section as well?

@Ferroin If checksums are enabled, won't it always get to that if < 45 section?

popcornmix · 2017-02-07T16:51:27Z

I'm just testing the replacement of 0 with 0xffff and it is running longer.
Just testing the SW section currently - need to check if HW block needs the same.

Ferroin · 2017-02-07T16:56:55Z

@JamesH65 If however the hardware is still checksumming the packets that would match that section for some reason, then there's no point in having the section as-is, since the hardware would be stomping on the software checksum anyway. Looking at the code again, if that is the case, then it's another (possibly related, possibly unrelated) hardware bug.

I could of course be completely off-base on this, most of my expertise regarding kernel code is very high-level stuff, not drivers...

sconemad · 2017-02-07T17:14:28Z

@popcornmix Yes, I think the problem is only with packets where the checksum is calculated to be 0, which should be replaced with 0xFFFF.

popcornmix · 2017-02-07T19:02:51Z

@Ferroin This patch fixes the problem with hardware checksums enabled (the default set up), but obviously wastes cpu always calculating crcs.
But it suggests this code is having an effect (i.e. presumably is called after HW checksum has occurred), or at least the HW is not overwriting this value.

Unfortunately I can't find the checksum in the not skb->len <= 45 code path. The same offset doesn't contain a value matching the software computed value, and trying to change 0->0xffff doesn't fix the bug.
Dumping out 64 bytes from skb->data doesn't allow me to spot the checksum, so I'm a bit stumped about exactly what is going on in this code.

With the patch present, running
sudo ethtool --offload eth0 rx off tx off
doesn't change behaviour - it is still happy.

popcornmix · 2017-02-07T19:05:47Z

BTW: I have emailed the author of this code and he has pinged microchip so perhaps we will get some advice from them.

sconemad · 2017-02-07T22:03:37Z

@Ferroin Yes, I only need to disable transmit offloading to make it work. I guess this means that the receive offloading works, since the returned datagram is going to have the same checksum in my test code, since only the source and destination addresses are flipped.

Thanks for looking into this everyone. Please let me know if you'd like me to test anything.

Electron752 · 2017-02-08T04:48:12Z

@sconemad: Is this for personal use? I went to Microchip's website and they have a newer driver but you have to agree to some legal terms to download which I think includes non-disclosure. I did download it and briefly looked at it.

If you aren't planning to distribute the source code or this is for a closed source device, you can probably just use Microchip's driver. Or at least test it and see if it fixes the issue.

Otherwise, you will probably just have to wait for the original author to fix it. The product appears to be now called LAN9514.

http://www.microchip.com/wwwproducts/en/LAN9514
Just look at the software library tab.

JamesH65 · 2017-02-08T09:20:17Z

My understanding of the code is that the line 1934

if (csum)
tx_cmd_b |= TX_CMD_B_CSUM_ENABLE;

Adds a flag to the packet to enable the HW checksum to be added later. Which is why you cannot see it in this function, and cannot correct it in this function. I have no idea how to get round that - it would appear that this is a firmware issue and will need to be fixed there.

JamesH65 · 2017-02-08T10:02:09Z

I should clarify; if we force SW checksums all the time we can fix it here (as per Dom's change), but if we want to use HW checksums, I suspect that needs a firmware change.

If no firmware change, it's a trade off between added CPU burden, vs losing a packet and requiring a retry on average every 65k. Might be a sweet spot dependent on packet size. ie over a particular size, use HW anyway and take the chance of a bad checksum, because the CPU burden is too high the rest of the time.

NWilson · 2017-02-08T10:05:54Z

The problem for our application (and the reason we noticed this bug), is that retrying failed packets won't work! When we resend the dropped packet, it's sent unchanged, and indeed, without corrupting the data we're trying to send it's not possible to change the data on retransmission. So, every retransmission also fails due to the checksum. Random packet loss with UDP is bearable, but systematic packet loss is crippling.

sconemad · 2017-02-08T10:22:52Z

(sorry, pressed the wrong button!)

I was basically going to say what @NWilson said above (he is the one who originally discovered this issue BTW).

This was found in testing RealVNC Server (included in Raspbian), so we're quite keen to see a general solution for this, although it's nice to know that there are ways to workaround it.

Ferroin · 2017-02-08T13:20:23Z

If someone could do some testing to get numbers both on how much forcing SW checksumming would impact network throughput and how much CPU load it would impose, that would be good.

Electron752 · 2017-02-08T19:39:17Z

@Ferroin - That kind of test may actually be a bunch of work to write. Lots of different configurations to test, etc. udp performance is hard to test anyway.

At this point, maybe it is best to wait and see what the driver author and Microchip say about this. I know Microchip is generally very open about hardware and firmware issues. They are also constantly tweaking the devices so this issue may in fact only happen on certain revisions or it may happen in only very specific cases which are easy to detect and correct in the driver. If it is indeed a firmware issue instead of a silicon issue, it may even be possible to update the firmware.

sconemad · 2017-02-09T10:04:59Z

I'm not sure how helpful this is, but I was able to do some fairly simple tests by modifying my test program to send and receive 100,000 x 1000 byte datagrams, and measuring the run time - which came out to be around 1 minute. (Luckily I didn't get any packets where the 0 checksum)

Running it quite a few times with checksum offloading on and off, I wasn't able to detect any significant difference in any of the timings.

JamesH65 · 2017-02-09T10:22:06Z

Do you have a Pi1 to try it on? The impact on a single core ARM of this era may be significant, and we need to ensure things still work OK on all Pi models.

…

On 9 February 2017 at 10:05, Andrew Wedgbury ***@***.***> wrote: I'm not sure how helpful this is, but I was able to do some fairly simple tests by modifying my test program to send and receive 100,000 x 1000 byte datagrams, and measuring the run time - which came out to be around 1 minute. (Luckily I didn't get any packets where the 0 checksum) Running it quite a few times with checksum offloading on and off, I wasn't able to detect any significant difference in any of the timings. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#1832 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/ADqrHexuYhD3UQpJdvAX7QypPu503Prfks5rauTSgaJpZM4L5mWh> .

-- James Hughes Principal Software Engineer, Raspberry Pi (Trading) Ltd

popcornmix · 2017-02-09T16:02:47Z

This is on a Pi1 at 700MHz

pi@domnfs:~ $ sudo ethtool --offload eth0 rx on tx on
pi@domnfs:~ $ dd if=/dev/zero bs=100M count=1 | nc -v -v -n 192.168.4.43 2000 & top -d30 -b -n2 | grep Cpu | tail -1
%Cpu(s):  0.9 us, 42.2 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi, 56.9 si,  0.0 st
104857600 bytes (105 MB) copied, 33.3049 s, 3.1 MB/s
pi@domnfs:~ $ 
pi@domnfs:~ $ dd if=/dev/zero bs=100M count=1 | nc -v -v -n 192.168.4.43 2000 & top -d30 -b -n2 | grep Cpu | tail -1
%Cpu(s):  1.4 us, 41.9 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi, 56.7 si,  0.0 st
104857600 bytes (105 MB) copied, 34.8077 s, 3.0 MB/s

pi@domnfs:~ $ sudo ethtool --offload eth0 rx off tx off
pi@domnfs:~ $ dd if=/dev/zero bs=100M count=1 | nc -v -v -n 192.168.4.43 2000 & top -d30 -b -n2 | grep Cpu | tail -1
%Cpu(s):  1.2 us, 41.2 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi, 57.7 si,  0.0 st
104857600 bytes (105 MB) copied, 34.7566 s, 3.0 MB/s
pi@domnfs:~ $ dd if=/dev/zero bs=100M count=1 | nc -v -v -n 192.168.4.43 2000 & top -d30 -b -n2 | %Cpu(s):  1.4 us, 41.3 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi, 57.3 si,  0.0 st
104857600 bytes (105 MB) copied, 34.9811 s, 3.0 MB/s

So with HW checksums on we get:
3.1 MB/s with CPU 42.2% (system) and 0.9% (user)
3.0 MB/s with CPU 41.9% (system) and 1.4% (user)

So with SW checksums on we get:
3.0 MB/s with CPU 41.2% (system) and 1.2% (user)
3.0 MB/s with CPU 41.3% (system) and 1.4% (user)

So I'm not measuring a difference with this test.

njh · 2017-04-02T14:53:43Z

I have just discovered a related issue that IPv6 UDP checksum calculation is failing for very small packets (see #1944). Switching off hardware offloading using ethtool fixes the problem.

Looks like the LAN9514 is a bit dodgy for IPv6...

popcornmix · 2017-04-02T14:58:02Z

From other issue:
Microchip has said they will be submitting a patch for #1832 (and a couple of other issues) upstream within about a month.

njh · 2017-05-03T18:07:21Z

Any news on a fix from Microchip?

popcornmix · 2017-05-03T19:04:46Z

No news. I've pinged them - I'll report here when there is any news.

popcornmix · 2017-05-03T19:18:11Z

Update: "The work is ongoing. Should see the patch next week."

kernel: irq_bcm2836: Send event when onlining sleeping cores kernel: ARM: dts: bcm283x: Reserve first page for firmware See: raspberrypi/linux#1989 kernel: smsc95xx: Avoid HW TX CSUM for IPV6 See: raspberrypi/linux#1832

popcornmix · 2017-05-09T19:15:47Z

Patch from microchip has been submitted to netdev mailing list.
I have cherry-picked it here: a9ad627
and latest rpi-update kernel includes this patch.
Please test this and let me know if you are happy. It appeared to fix sconemad's udpcs test for me.
My understanding is that is disables HW TX checksums for IPV6.

sconemad · 2017-05-10T15:00:56Z

Tested with kernel 4.9.27-v7+ #997 SMP Tue May 9 19:58:37 BST 2017 obtained using rpi-update.

Both my udpcs test app and RealVNC server no longer encounter this problem, so this appears to have fixed it.

Thanks!

sconemad closed this as completed Feb 8, 2017

sconemad reopened this Feb 8, 2017

popcornmix mentioned this issue Apr 2, 2017

IPv6 UDP checksum error when sending small packets #1944

Closed

P33M closed this as completed May 18, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

UDP6 checksum problem #1832

UDP6 checksum problem #1832

sconemad commented Feb 7, 2017

P33M commented Feb 7, 2017

popcornmix commented Feb 7, 2017

Ferroin commented Feb 7, 2017

popcornmix commented Feb 7, 2017

Ferroin commented Feb 7, 2017

JamesH65 commented Feb 7, 2017

popcornmix commented Feb 7, 2017

Ferroin commented Feb 7, 2017

JamesH65 commented Feb 7, 2017

popcornmix commented Feb 7, 2017

Ferroin commented Feb 7, 2017

sconemad commented Feb 7, 2017 •

edited

Loading

popcornmix commented Feb 7, 2017

popcornmix commented Feb 7, 2017

sconemad commented Feb 7, 2017

Electron752 commented Feb 8, 2017

JamesH65 commented Feb 8, 2017

JamesH65 commented Feb 8, 2017 •

edited

Loading

NWilson commented Feb 8, 2017 via email •

edited

Loading

sconemad commented Feb 8, 2017

Ferroin commented Feb 8, 2017

Electron752 commented Feb 8, 2017

sconemad commented Feb 9, 2017

JamesH65 commented Feb 9, 2017 via email

popcornmix commented Feb 9, 2017

njh commented Apr 2, 2017

popcornmix commented Apr 2, 2017

njh commented May 3, 2017

popcornmix commented May 3, 2017

popcornmix commented May 3, 2017

popcornmix commented May 9, 2017

sconemad commented May 10, 2017

UDP6 checksum problem #1832

UDP6 checksum problem #1832

Comments

sconemad commented Feb 7, 2017

P33M commented Feb 7, 2017

popcornmix commented Feb 7, 2017

Ferroin commented Feb 7, 2017

popcornmix commented Feb 7, 2017

Ferroin commented Feb 7, 2017

JamesH65 commented Feb 7, 2017

popcornmix commented Feb 7, 2017

Ferroin commented Feb 7, 2017

JamesH65 commented Feb 7, 2017

popcornmix commented Feb 7, 2017

Ferroin commented Feb 7, 2017

sconemad commented Feb 7, 2017 • edited Loading

popcornmix commented Feb 7, 2017

popcornmix commented Feb 7, 2017

sconemad commented Feb 7, 2017

Electron752 commented Feb 8, 2017

JamesH65 commented Feb 8, 2017

JamesH65 commented Feb 8, 2017 • edited Loading

NWilson commented Feb 8, 2017 via email • edited Loading

sconemad commented Feb 8, 2017

Ferroin commented Feb 8, 2017

Electron752 commented Feb 8, 2017

sconemad commented Feb 9, 2017

JamesH65 commented Feb 9, 2017 via email

popcornmix commented Feb 9, 2017

njh commented Apr 2, 2017

popcornmix commented Apr 2, 2017

njh commented May 3, 2017

popcornmix commented May 3, 2017

popcornmix commented May 3, 2017

popcornmix commented May 9, 2017

sconemad commented May 10, 2017

sconemad commented Feb 7, 2017 •

edited

Loading

JamesH65 commented Feb 8, 2017 •

edited

Loading

NWilson commented Feb 8, 2017 via email •

edited

Loading