Raspberry pi as router - very slow #301

Closed
goldie83 opened this Issue May 20, 2013 · 14 comments

goldie83 commented May 20, 2013

Hi,
Raspberry Pi Model B, 512 MB RAM, working mainly as a router (Raspbian wheezy).
Symptoms: slow internet (not going over 600 kiB/s); the Raspberry acts as if it were under heavy load.
The second NIC is on the USB bus. Watch the time spent servicing software interrupts (si) in the top output below.

top shows:
top - 00:19:56 up 4:28, 1 user, load average: 0.19, 0.49, 0.42
Tasks: 68 total, 1 running, 67 sleeping, 0 stopped, 0 zombie
%Cpu(s): 10.0 us, 7.1 sy, 0.0 ni, 34.3 id, 0.0 wa, 0.0 hi, 48.6 si, 0.0 st

pi@goldie ~ $ uname -a
Linux goldie 3.6.11+ #446 PREEMPT Fri May 10 20:17:25 BST 2013 armv6l GNU/Linux

pi@goldie ~ $ lsusb
Bus 001 Device 002: ID 0424:9512 Standard Microsystems Corp.
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 003: ID 0424:ec00 Standard Microsystems Corp.
Bus 001 Device 004: ID 0a46:9601 Davicom Semiconductor, Inc. DM9601 Fast Ethernet Adapter
Bus 001 Device 005: ID 04fc:0c25 Sunplus Technology Co., Ltd SATALink SPIF225A

pi@goldie ~ $ procinfo
Memory: Total Used Free Buffers
RAM: 383752 249500 134252 23276
Swap: 9767484 0 9767484

Bootup: Mon May 20 19:51:20 2013 Load average: 1.22 0.70 0.43 3/161 4946

user : 00:19:02.52 9.8% page in : 154808
nice : 00:00:00.00 0.0% page out: 66441
system: 00:04:22.47 2.3% page act: 19394
IOwait: 00:00:15.95 0.1% page dea: 3
hw irq: 00:00:02.54 0.0% page flt: 1683962
sw irq: 00:21:39.39 11.2% swap in : 0
idle : 02:28:00.45 76.5% swap out: 0
uptime: 04:24:12.00 context : 1968187

irq 3: 571046 irq 75: 1
irq 32: 382177216 irq 77: 224
irq 52: 0 irq 83: 20
irq 65: 31685 irq 84: 18204
irq 66: 1

mmcblk0 218r mmcblk0p2 80r
mmcblk0p1 50r sda 5830r 5096w

eth0 TX 1.03GiB RX 2.92GiB lo TX 48.25KiB RX 48.25KiB
eth1 TX 2.97GiB RX 1022.39MiB

and

pi@goldie ~ $ ls /proc/irq/32
dwc_otg dwc_otg_hcd:usb1 dwc_otg_pcd spurious
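
A minimal way to confirm where that 48.6% si figure is being spent (a suggested check for anyone reproducing this, not part of the original report; mpstat comes from the sysstat package):

# which softirq classes are busy; NET_RX/NET_TX are the relevant ones when routing
cat /proc/softirqs
# per-IRQ counters; irq 32 is the dwc_otg USB controller listed above
cat /proc/interrupts
# per-CPU utilisation including %soft, sampled every 2 seconds
mpstat -P ALL 2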

ghost assigned ghollingworth May 20, 2013

Ferroin (Contributor) commented May 20, 2013

At least part of this is probably due to bus contention between the Ethernet chips (and the other devices).
USB is also comparatively inefficient for devices like network adapters that need high I/O bandwidth, because each device has to deal with the bus management overhead (which can be very high if there are multiple hubs between the device and the bus controller).
Furthermore, all the data being relayed between the adapters has to traverse the bus twice, because the adapters can't transfer data directly to each other.

In fact, assuming that you are using a USB disk as root (I assume that's what the SATALink is for) and that all devices are running at high speed (480 Mb/s signaling rate), 600 kiB/s is actually very good.

popcornmix (Collaborator) commented May 21, 2013

@Ferroin
Are you sure you've got your units right?
600 kB/s = 4.8 Mb/s, which is 1% of the 480 Mb/s signalling rate.

I'm not sure what speed you should expect, but that seems too low to me.

goldie83 commented May 21, 2013

@Ferroin
You are right about the USB disk, but I didn't notice any disk activity while routing packets.
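
One way to double-check that (a suggestion, not goldie83's actual method) is to watch the block-device counters while traffic is being routed:

# per-device I/O statistics, refreshed every 2 seconds (needs the sysstat package)
iostat -d sda 2
# or watch the raw kernel counters directly
watch -n 2 cat /proc/diskstats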

stegmannt commented May 21, 2013

My Pi (Model B, revision 1) also serves as a router and firewall for a student residence network. Down- and upstream are usually around 3-5 MByte/s, which is roughly 50% of what I get out of my notebook. It's not high speed, but OK for the Pi, isn't it?

Your low speed may also be caused by the USB network adapter. Here's mine:

Bus 001 Device 004: ID 0b95:7720 ASIX Electronics Corp. AX88772
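
To check which adapter and driver a given interface is actually using (a generic check, not something stegmannt posted), something like this works:

# USB topology, including which kernel driver is bound to each device
lsusb -t
# driver behind a specific interface, e.g. eth1 (prints a path ending in dm9601, asix, smsc95xx, ...)
readlink /sys/class/net/eth1/device/driver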

Ferroin (Contributor) commented May 21, 2013

@popcornmix
As it is, the 480 Mb/s signal rate means maybe 240 Mb/s of actual throughput once the management overhead is accounted for. Then there is the fact that most USB network adapters (including the SMSC95xx in this list) are controlled by reading and writing device registers, which is not at all efficient over USB (I2C or SPI would be better for that), and they limit the USB packet size for such control data. On top of all this, many can only transfer a single Ethernet frame at a time on slower systems like the Pi.

Also, by 600 kiB/s I mean actual throughput on top of TCP/IP, not Ethernet-level throughput; TCP and IPv4 add a lot of processing overhead (especially if you are doing packet filtering, not just flat relaying between the interfaces).
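
Putting Ferroin's estimates together as a rough back-of-envelope calculation (his figures, not measurements):

# 480 Mb/s signalling, roughly half usable after USB protocol overhead,
# and routed traffic crosses the shared bus twice (in on one NIC, out on the other)
echo $(( 480 / 2 / 2 ))   # ~120 Mb/s ceiling, before TCP/IP, driver and register-access overhead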

goldie83 commented May 21, 2013

@stegmannt
Maybe it's not the best testing method, but:

  • eth1 is connected to the internet. Fetching a single file from the web peaks at 7 MB/s (written, of course, to the USB disk on the same bus).
  • Copying this file between two partitions on the disk runs at about 10 MB/s.
  • Fetching the same file from the Apache server on the Raspberry over eth0 (the on-board Ethernet card) runs at about 600 kB/s, and I think this is the problem. How do I investigate this? (One starting point is sketched after the output below.)
root@goldie:/home/pi# dd if=/home/pi/debian-testing-amd64-DVD-1.iso of=/var/www/debian-testing-amd64-DVD-1.iso bs=4096
973896+0 records in
973896+0 records out
3989078016 bytes (4.0 GB) copied, 385.982 s, 10.3 MB/s
pi@goldie ~ $ dmesg | grep eth1
[   13.408808] dm9601 1-1.2:1.0: eth1: register 'dm9601' at usb-bcm2708_usb-1.2, Davicom DM9601 USB Ethernet, 00:06:06:e0:0a:5f
pi@goldie ~ $ wget http://cdimage.debian.org/cdimage/weekly-builds/amd64/iso-dvd/debian-testing-amd64-DVD-1.iso
...
2013-05-21 15:03:19 (5.08 MB/s) - `debian-testing-amd64-DVD-1.iso' saved [3989078016/3989078016]
root@dlink-4378AA:/mnt/HD/HD_a2/tmp# wget http://192.168.20.1/debian-testing-amd64-DVD-1.iso
--2013-05-21 14:02:23--  http://192.168.20.1/debian-testing-amd64-DVD-1.iso
Connecting to 192.168.20.1:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3989078016 (3.7G) [application/x-iso9660-image]
Saving to: `debian-testing-amd64-DVD-1.iso'
 4% [>                                      ] 196,256,512  614K/s  eta 1h 41m
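
One way to start investigating the eth0 side (a suggestion, not something from the thread) is to watch the interface counters and the kernel log while the slow transfer runs:

# per-interface packet, error and drop counters
ip -s link show eth0
# driver messages for eth0; the smsc95xx driver logs "kevent ... may have been dropped" when it falls behind (see hvenzke's comment below)
dmesg | grep eth0
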
stegmannt commented May 22, 2013

@goldie83

I did some testing. What I found is that network speed depends heavily on the tool used to copy the file.

SCP with compression enabled

$ scp -C pi@192.168.xx.xx:debian-testing-amd64-CD-1.iso . 
debian-testing-amd64-CD-1.iso                                           10%   67MB   1.0MB/s   09:50 ETA

SCP with default options

$ scp pi@192.168.xx.xx:debian-testing-amd64-CD-1.iso . 
debian-testing-amd64-CD-1.iso                                           25%  168MB   3.4MB/s   02:21 ETA

SCP with RC4

$scp -c arcfour pi@192.168.xx.xx:debian-testing-amd64-CD-1.iso . 
debian-testing-amd64-CD-1.iso                                           10%   70MB   4.1MB/s   02:21 ETA

Netcat

$ dd if=/dev/zero bs=1024K count=512 | nc -v 192.168.xx.yy 2222
Connection to 192.168.xx.yy 2222 port [tcp/*] succeeded!
512+0 records in
512+0 records out
536870912 bytes (537 MB) copied, 115.15 s, 4.7 MB/s

In conclusion, I suspect Apache is your bottleneck. Have you tried any other tools?

Edit:

And I just learned about iperf:

$ iperf -c 192.168.xx.yy -f M -t 20
------------------------------------------------------------
Client connecting to 192.168.xx.yy, TCP port 5001
TCP window size: 0.02 MByte (default)
------------------------------------------------------------
[  3] local 192.168.xx.zz port 39326 connected with 192.168.xx.yy port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-20.0 sec   145 MBytes  7.24 MBytes/sec
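
The receiving ends of these tests aren't shown; assuming a stock netcat and iperf2 on the other host (and noting that netcat flag syntax differs between the traditional and OpenBSD variants), the listeners would look something like:

# sink for the dd | nc test (traditional netcat syntax)
nc -l -p 2222 > /dev/null
# iperf2 server on the default port 5001
iperf -s
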
goldie83 commented May 22, 2013

It won't go faster...

pi@goldie ~ $ dd if=/dev/zero bs=1024K count=16 | nc -v 192.168.20.30 2222
Connection to 192.168.20.30 2222 port [tcp/*] succeeded!
16+0 records in
16+0 records out
16777216 bytes (17 MB) copied, 26.5187 s, 633 kB/s
pi@goldie ~ $ iperf -c 192.168.20.30 -f M -t 20 -p 2222
------------------------------------------------------------
Client connecting to 192.168.20.30, TCP port 2222
TCP window size: 0.02 MByte (default)
------------------------------------------------------------
[  3] local 192.168.20.1 port 54523 connected with 192.168.20.30 port 2222
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-20.3 sec  12.4 MBytes  0.61 MBytes/sec

goldie83 closed this May 22, 2013

goldie83 reopened this May 22, 2013

popcornmix (Collaborator) commented May 22, 2013

@goldie83
What do you get from the on-board Ethernet interface with the external USB Ethernet adapter unplugged?

pi@raspberrypi:~ $ iperf -c 10.177.66.43 -f M -t 20
------------------------------------------------------------
Client connecting to 10.177.66.43, TCP port 5001
TCP window size: 0.02 MByte (default)
------------------------------------------------------------
[  3] local 10.177.71.41 port 57453 connected with 10.177.66.43 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-20.0 sec   225 MBytes  11.2 MBytes/sec

You could try the fiq_split kernel:
http://www.raspberrypi.org/phpBB3/viewtopic.php?f=28&t=39175&hilit=fiq_split&start=25#p345905

stegmannt commented May 23, 2013

Also, sdram_freq seems to have an impact on Ethernet (or USB?) performance.

With default options I get ~7 MBytes/s, but with sdram_freq=500 I suddenly get:

iperf -c 192.168.xx.yy -f M -t 20
------------------------------------------------------------
Client connecting to 192.168.xx.yy, TCP port 5001
TCP window size: 0.02 MByte (default)
------------------------------------------------------------
[  3] local 192.168.xx.yy port 39555 connected with 192.168.xx.yy port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-20.0 sec   205 MBytes  10.3 MBytes/sec

This effect is reproducible over here.
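
For anyone wanting to reproduce this, a sketch of the change (my wording, not stegmannt's exact steps): add the setting to /boot/config.txt and reboot. Raising the SDRAM clock is an overclock, and some boards may also need over_voltage_sdram to stay stable.

# append the overclock setting and reboot
echo "sdram_freq=500" | sudo tee -a /boot/config.txt
sudo reboot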

hvenzke commented May 24, 2013

Well, on my side I changed a few buffer settings, following
http://serverfault.com/questions/277009/what-does-net-ratelimit-44-callbacks-suppressed-mean-on-a-linux-based-router
to get rid of:

[ 1954.422186] smsc95xx 1-1.1:1.0: eth0: kevent 2 may have been dropped
[ 1954.422353] smsc95xx 1-1.1:1.0: eth0: kevent 2 may have been dropped
[ 1954.422440] smsc95xx 1-1.1:1.0: eth0: kevent 2 may have been dropped
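
The exact buffer settings aren't given; a hypothetical version of that kind of tuning (illustrative sysctl keys and values, not hvenzke's) might be:

# hypothetical networking buffer tuning of the kind described in the linked answer
sudo sysctl -w net.core.rmem_max=1048576
sudo sysctl -w net.core.wmem_max=1048576
sudo sysctl -w net.core.netdev_max_backlog=1000
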
goldie83 commented Jun 7, 2013

I made a mistake: eth1 was an external NIC (dm9601), and it was the bottleneck all along. I have now swapped the NIC for an:

Bus 001 Device 004: ID 0b95:772a ASIX Electronics Corp. AX88772A Fast Ethernet

and it is fast. Between two Raspberry Pis:

root@goldie:~# iperf -c 192.168.20.30 -f M -t 20
------------------------------------------------------------
Client connecting to 192.168.20.30, TCP port 5001
TCP window size: 0.02 MByte (default)
------------------------------------------------------------
[  3] local 192.168.20.1 port 42496 connected with 192.168.20.30 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-20.0 sec   149 MBytes  7.44 MBytes/sec

I'm closing this issue, because people have reported that the dm9601 has performance issues:
http://www.raspberrypi.org/phpBB3/viewtopic.php?p=117823&sid=6de6013e9cc3b222ac9bcee61607acd2#p117823
http://forum.odroid.com/viewtopic.php?f=10&t=1208

goldie83 closed this Jun 7, 2013

popcornmix (Collaborator) commented Jun 8, 2013

Thanks for the info.

demian2g commented Jan 29, 2018

Yep:
ID 0fe6:9700 Kontron (Industrial Computer Source / ICS Advent) DM9601 Fast Ethernet Adapter
Same problem on an OrangePi: <7 Mbit/s peak with iperf and speedtest (https://raw.githubusercontent.com/sivel/speedtest-cli/master/speedtest.py).
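
For reference, a minimal way to run that speedtest script (a usage sketch, assuming Python is installed; not necessarily demian2g's exact invocation):

wget https://raw.githubusercontent.com/sivel/speedtest-cli/master/speedtest.py
python speedtest.py --simple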

popcornmix pushed a commit that referenced this issue Feb 26, 2018

netfilter: drop outermost socket lock in getsockopt()
The Syzbot reported a possible deadlock in the netfilter area caused by
rtnl lock, xt lock and socket lock being acquired with a different order
on different code paths, leading to the following backtrace:
Reviewed-by: Xin Long <lucien.xin@gmail.com>

======================================================
WARNING: possible circular locking dependency detected
4.15.0+ #301 Not tainted
------------------------------------------------------
syzkaller233489/4179 is trying to acquire lock:
  (rtnl_mutex){+.+.}, at: [<0000000048e996fd>] rtnl_lock+0x17/0x20
net/core/rtnetlink.c:74

but task is already holding lock:
  (&xt[i].mutex){+.+.}, at: [<00000000328553a2>]
xt_find_table_lock+0x3e/0x3e0 net/netfilter/x_tables.c:1041

which lock already depends on the new lock.
===

Since commit 3f34cfae1230 ("netfilter: on sockopt() acquire sock lock
only in the required scope"), we already acquire the socket lock in
the innermost scope, where needed. In such commit I forgot to remove
the outer-most socket lock from the getsockopt() path, this commit
addresses the issues dropping it now.

v1 -> v2: fix bad subj, added relavant 'fixes' tag

Fixes: 22265a5 ("netfilter: xt_TEE: resolve oif using netdevice notifiers")
Fixes: 202f59a ("netfilter: ipt_CLUSTERIP: do not hold dev")
Fixes: 3f34cfae1230 ("netfilter: on sockopt() acquire sock lock only in the required scope")
Reported-by: syzbot+ddde1c7b7ff7442d7f2d@syzkaller.appspotmail.com
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

nathanchance pushed a commit to nathanchance/pi-kernel that referenced this issue Mar 3, 2018

netfilter: drop outermost socket lock in getsockopt()

popcornmix pushed a commit that referenced this issue Mar 8, 2018

netfilter: drop outermost socket lock in getsockopt()

ED6E0F17 pushed a commit to ED6E0F17/linux that referenced this issue Mar 15, 2018

netfilter: drop outermost socket lock in getsockopt()