This repository has been archived by the owner on Apr 18, 2024. It is now read-only.

Out-of-tree MPTCP uses only 8 interfaces out of 16 #406

Closed
arter97 opened this issue Jan 27, 2021 · 12 comments

Comments

@arter97
Contributor

arter97 commented Jan 27, 2021

Possibly related to #128, but the description and comments there don't quite match what I'm seeing.

We recently had the opportunity to upgrade the server environment from 8 Ethernet ports to 16, but MPTCP doesn’t scale beyond 8 interfaces.

As the server has real users/clients, it's quite hard to run experiments on it, so I created 2 VMs to replicate the issue. The same problem shows up on the VMs as well.

VM 1 has 17 virtio NICs (eth0-16), each throttled to 30 Mbps.
VM 2 has 1 virtio NIC (eth0), unthrottled.
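
(For reference, a per-NIC cap like that can also be approximated on the host side with tc; the snippet below is only a rough sketch with an example tap device name, the libvirt definitions linked at the end of this comment are what I actually used.)

# throttle one guest NIC's host-side tap device to roughly 30 Mbps
tc qdisc add dev vnet0 root tbf rate 30mbit burst 32k latency 400ms
# remove the cap again
tc qdisc del dev vnet0 root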

VM 1:

# ifconfig|grep 'eth[0-9]\|192'
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.122.216  netmask 255.255.255.0  broadcast 192.168.122.255
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.122.221  netmask 255.255.255.0  broadcast 192.168.122.255
eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.122.222  netmask 255.255.255.0  broadcast 192.168.122.255
eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.122.223  netmask 255.255.255.0  broadcast 192.168.122.255
eth4: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.122.224  netmask 255.255.255.0  broadcast 192.168.122.255
eth5: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.122.225  netmask 255.255.255.0  broadcast 192.168.122.255
eth6: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.122.226  netmask 255.255.255.0  broadcast 192.168.122.255
eth7: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.122.227  netmask 255.255.255.0  broadcast 192.168.122.255
eth8: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.122.228  netmask 255.255.255.0  broadcast 192.168.122.255
eth9: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.122.229  netmask 255.255.255.0  broadcast 192.168.122.255
eth10: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.122.236  netmask 255.255.255.0  broadcast 192.168.122.255
eth11: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.122.237  netmask 255.255.255.0  broadcast 192.168.122.255
eth12: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.122.238  netmask 255.255.255.0  broadcast 192.168.122.255
eth13: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.122.239  netmask 255.255.255.0  broadcast 192.168.122.255
eth14: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.122.240  netmask 255.255.255.0  broadcast 192.168.122.255
eth15: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.122.241  netmask 255.255.255.0  broadcast 192.168.122.255
eth16: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.122.242  netmask 255.255.255.0  broadcast 192.168.122.255

VM 2:

# ifconfig|grep 'eth[0-9]\|192'
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.122.211  netmask 255.255.255.0  broadcast 192.168.122.255

VM 1 initiates an MPTCP connection to VM 2 via SSH:

# ssh arter97@192.168.122.211 cat /dev/urandom | pv > /dev/null
 211MiB 0:00:08 [27.0MiB/s] [                 <=>                                             ]

For some reason, MPTCP uses eth0, eth1 and eth10-15, but nothing else.
(Checked via ifconfig's TX packet counters.)

The issue happens on both mptcp_v0.95 (Linux v4.19) and mptcp_trunk (Linux v5.4).
Linux v5.10's upstream MPTCP (v1) uses only 1 interface (eth0) and the throughput is capped at 3.41 MiB/s.
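
A quick way to double-check the interface usage is to sample the per-interface TX counters from sysfs before and after the transfer:

# snapshot the TX byte counters of every ethN interface; run once before and
# once after the transfer to see which counters actually moved
for dev in /sys/class/net/eth*; do
        printf '%s %s\n' "${dev##*/}" "$(cat "$dev/statistics/tx_bytes")"
done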

Here are the relevant kernel configs:

CONFIG_MPTCP=y
CONFIG_MPTCP_PM_ADVANCED=y
CONFIG_MPTCP_FULLMESH=y
CONFIG_MPTCP_NDIFFPORTS=y
CONFIG_MPTCP_BINDER=y
CONFIG_MPTCP_NETLINK=y
CONFIG_DEFAULT_MPTCP_PM="fullmesh"
CONFIG_MPTCP_SCHED_ADVANCED=y
# CONFIG_MPTCP_BLEST is not set
CONFIG_MPTCP_ROUNDROBIN=y
CONFIG_MPTCP_REDUNDANT=y
# CONFIG_MPTCP_ECF is not set
CONFIG_DEFAULT_MPTCP_SCHED="default"
# grep . /proc/sys/net/mptcp/*
/proc/sys/net/mptcp/mptcp_checksum:1
/proc/sys/net/mptcp/mptcp_debug:1
/proc/sys/net/mptcp/mptcp_enabled:1
/proc/sys/net/mptcp/mptcp_path_manager:fullmesh
/proc/sys/net/mptcp/mptcp_scheduler:default
/proc/sys/net/mptcp/mptcp_syn_retries:3
/proc/sys/net/mptcp/mptcp_version:0
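
For completeness, the path manager, scheduler and debug output can also be switched at runtime through the sysctls listed above (new connections should pick up the change; I simply set the defaults in the kernel config):

sysctl -w net.mptcp.mptcp_enabled=1
sysctl -w net.mptcp.mptcp_path_manager=fullmesh
sysctl -w net.mptcp.mptcp_scheduler=default
sysctl -w net.mptcp.mptcp_debug=1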

Here are the logs after turning on mptcp_debug.
VM 1:

[ 1410.105894] mptcp_alloc_mpcb: created mpcb with token 0x17fc0de1
[ 1410.106836] mptcp_add_sock: token 0x17fc0de1 pi 1, src_addr:192.168.122.216:50656 dst_addr:192.168.122.211:22
[ 1410.108194] mptcp_add_sock: token 0x17fc0de1 pi 2, src_addr:0.0.0.0:0 dst_addr:0.0.0.0:0
[ 1410.109259] __mptcp_init4_subsockets: token 0x17fc0de1 pi 2 src_addr:192.168.122.241:0 dst_addr:192.168.122.211:22 ifidx: 17
[ 1410.110444] mptcp_add_sock: token 0x17fc0de1 pi 3, src_addr:0.0.0.0:0 dst_addr:0.0.0.0:0
[ 1410.112260] __mptcp_init4_subsockets: token 0x17fc0de1 pi 3 src_addr:192.168.122.221:0 dst_addr:192.168.122.211:22 ifidx: 3
[ 1410.113987] mptcp_add_sock: token 0x17fc0de1 pi 4, src_addr:0.0.0.0:0 dst_addr:0.0.0.0:0
[ 1410.115746] __mptcp_init4_subsockets: token 0x17fc0de1 pi 4 src_addr:192.168.122.236:0 dst_addr:192.168.122.211:22 ifidx: 12
[ 1410.116987] mptcp_add_sock: token 0x17fc0de1 pi 5, src_addr:0.0.0.0:0 dst_addr:0.0.0.0:0
[ 1410.118232] __mptcp_init4_subsockets: token 0x17fc0de1 pi 5 src_addr:192.168.122.237:0 dst_addr:192.168.122.211:22 ifidx: 13
[ 1410.119481] mptcp_add_sock: token 0x17fc0de1 pi 6, src_addr:0.0.0.0:0 dst_addr:0.0.0.0:0
[ 1410.120748] __mptcp_init4_subsockets: token 0x17fc0de1 pi 6 src_addr:192.168.122.238:0 dst_addr:192.168.122.211:22 ifidx: 14
[ 1410.122033] mptcp_add_sock: token 0x17fc0de1 pi 7, src_addr:0.0.0.0:0 dst_addr:0.0.0.0:0
[ 1410.123242] __mptcp_init4_subsockets: token 0x17fc0de1 pi 7 src_addr:192.168.122.239:0 dst_addr:192.168.122.211:22 ifidx: 15
[ 1410.124547] mptcp_add_sock: token 0x17fc0de1 pi 8, src_addr:0.0.0.0:0 dst_addr:0.0.0.0:0
[ 1410.125695] __mptcp_init4_subsockets: token 0x17fc0de1 pi 8 src_addr:192.168.122.240:0 dst_addr:192.168.122.211:22 ifidx: 16

SSH process interrupted (^C):

[ 1417.701211] mptcp_close: Close of meta_sk with tok 0x17fc0de1
[ 1417.702439] mptcp_del_sock: Removing subsock tok 0x17fc0de1 pi:8 state 7 is_meta? 0
[ 1417.703854] mptcp_del_sock: Removing subsock tok 0x17fc0de1 pi:7 state 7 is_meta? 0
[ 1417.704928] mptcp_del_sock: Removing subsock tok 0x17fc0de1 pi:4 state 7 is_meta? 0
[ 1417.706020] mptcp_del_sock: Removing subsock tok 0x17fc0de1 pi:6 state 7 is_meta? 0
[ 1417.707097] mptcp_del_sock: Removing subsock tok 0x17fc0de1 pi:2 state 7 is_meta? 0
[ 1417.708414] mptcp_del_sock: Removing subsock tok 0x17fc0de1 pi:3 state 7 is_meta? 0
[ 1417.709302] mptcp_del_sock: Removing subsock tok 0x17fc0de1 pi:1 state 7 is_meta? 0
[ 1417.710163] mptcp_del_sock: Removing subsock tok 0x17fc0de1 pi:5 state 7 is_meta? 0
[ 1417.711122] mptcp_sock_destruct destroying meta-sk token 0x17fc0de1

VM 2:

[ 1465.399436] mptcp_alloc_mpcb: created mpcb with token 0x735227f5
[ 1465.399525] mptcp_add_sock: token 0x735227f5 pi 1, src_addr:192.168.122.211:22 dst_addr:192.168.122.216:50656
[ 1465.405560] mptcp_add_sock: token 0x735227f5 pi 2, src_addr:192.168.122.211:22 dst_addr:192.168.122.241:44461
[ 1465.408522] mptcp_add_sock: token 0x735227f5 pi 3, src_addr:192.168.122.211:22 dst_addr:192.168.122.221:52675
[ 1465.411203] mptcp_add_sock: token 0x735227f5 pi 4, src_addr:192.168.122.211:22 dst_addr:192.168.122.236:47681
[ 1465.413732] mptcp_add_sock: token 0x735227f5 pi 5, src_addr:192.168.122.211:22 dst_addr:192.168.122.237:46163
[ 1465.416097] mptcp_add_sock: token 0x735227f5 pi 6, src_addr:192.168.122.211:22 dst_addr:192.168.122.238:50525
[ 1465.418678] mptcp_add_sock: token 0x735227f5 pi 7, src_addr:192.168.122.211:22 dst_addr:192.168.122.239:39503
[ 1465.418951] mptcp_add_sock: token 0x735227f5 pi 8, src_addr:192.168.122.211:22 dst_addr:192.168.122.240:57097

SSH process interrupted (^C):

[ 1472.993924] mptcp_del_sock: Removing subsock tok 0x735227f5 pi:8 state 7 is_meta? 0
[ 1472.994392] mptcp_del_sock: Removing subsock tok 0x735227f5 pi:7 state 7 is_meta? 0
[ 1472.994442] mptcp_del_sock: Removing subsock tok 0x735227f5 pi:6 state 7 is_meta? 0
[ 1472.994475] mptcp_del_sock: Removing subsock tok 0x735227f5 pi:4 state 7 is_meta? 0
[ 1472.994505] mptcp_del_sock: Removing subsock tok 0x735227f5 pi:3 state 7 is_meta? 0
[ 1472.994551] mptcp_del_sock: Removing subsock tok 0x735227f5 pi:2 state 7 is_meta? 0
[ 1472.994596] mptcp_del_sock: Removing subsock tok 0x735227f5 pi:1 state 7 is_meta? 0
[ 1472.994622] mptcp_del_sock: Removing subsock tok 0x735227f5 pi:5 state 7 is_meta? 0
[ 1472.994653] mptcp_close: Close of meta_sk with tok 0x735227f5
[ 1472.994710] mptcp_sock_destruct destroying meta-sk token 0x735227f5

Here are the libvirt definitions for both VMs, in case you want to try this setup:
VM 1: https://pastebin.com/VeWCLmac
VM 2: https://pastebin.com/NXXmz9tj

Thanks in advance :)

@arter97
Contributor Author

arter97 commented Jan 27, 2021

Mainline kernel's MPTCP config:

# cat /boot/config-5.10.10-051010-generic | grep -i mptcp
CONFIG_MPTCP=y
CONFIG_INET_MPTCP_DIAG=m
CONFIG_MPTCP_IPV6=y
# cat /proc/sys/net/mptcp/enabled 
1

@matttbe
Member

matttbe commented Jan 27, 2021

Hello,

I see that you are using the Fullmesh PM. This PM has a hard limit: https://github.com/multipath-tcp/mptcp/blob/mptcp_v0.95/net/mptcp/mptcp_fullmesh.c#L23

Is your goal to use more than 8 addresses per connection? We already talked about that in the past, and it was hard for us to find a realistic use case for so many subflows :-)

You can check the addresses picked by the PM by looking at /proc/net/mptcp_fullmesh. Does it correspond to what you see?

@arter97
Contributor Author

arter97 commented Jan 28, 2021

Is your goal to use more than 8 addresses per connection?

Yup.

We already talked about that in the past, and it was hard for us to find a realistic use case for so many subflows :-)

Yeah, I admit my use-case won't be the primary example of MPTCP.

You can check the addresses picked by the PM by looking at /proc/net/mptcp_fullmesh. Does it correspond to what you see?

Yup, it matches.

I see that you are using the Fullmesh PM. This PM has a hard limit: https://github.com/multipath-tcp/mptcp/blob/mptcp_v0.95/net/mptcp/mptcp_fullmesh.c#L23

Thanks for the pointer.
I played around with it for a few hours and managed to raise the limit to 16.

The throughput of SSH increased linearly, now reaching 54.0 MiB/s.

I can see why the limit of 8 was chosen: struct mptcp_cb's u8 mptcp_pm[MPTCP_PM_SIZE] grows quite drastically, from 608 to 720 bytes.
The same principle applies as described here: https://github.com/multipath-tcp/mptcp_net-next/wiki#overview

sk_buff structure size can't get bigger. It's already large and, if anything, the maintainers hope to reduce its size. Changes to the data structure size are amplified by the large number of instances in a busy system.

So I can understand that 8 is a reasonable limit.

For those who're interested though, I'll leave the commit here:
arter97/x86-kernel@443fcdf
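
If you'd rather locate the relevant spots in the tree yourself, something like this should get you close (MPTCP_MAX_ADDR is an approximate name from memory; MPTCP_PM_SIZE is the one mentioned above, and the commit has the exact change):

# find the fullmesh address cap and the per-connection PM state size
# (macro names are approximate; see the commit above for the real change)
grep -rnE "MPTCP_MAX_ADDR|MPTCP_PM_SIZE" include/net/ net/mptcp/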

Thanks for the help!

arter97 closed this as completed Jan 28, 2021
@matttbe
Member

matttbe commented Jan 28, 2021

Thank you for trying this and for sharing the modified code! It can help others :)

By any chance, could you share your use case? Maintaining more than 8 addresses, with possibly 8x8 subflows, is a lot :-)

@arter97
Contributor Author

arter97 commented Feb 2, 2021

Hey, sorry for the late reply, got caught up with work recently.

I don't think I can share the details of the company's internal networking infrastructure, but if I were to make an analogy, we're in the odd position of being able to get as many IP addresses from the ISP as we want, but with each limited to < 50 Mbps.

We know for a fact that the total switching capacity well exceeds the combined throughput of all those addresses, so we deployed an MPTCP setup that relays traffic through a SOCKS5 proxy server running on an unthrottled machine outside, to get faster Internet access.

We're currently using WireGuard with MPTCP, microsocks and redsocks2 for the entire setup.
It works well(ish), but when it doesn't, it's usually microsocks's or redsocks2's fault, not MPTCP's :)
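
Roughly, the redirection glue looks like the following; this is just a generic sketch with example addresses and ports, not our exact rules. redsocks2 listens locally and forwards into microsocks (the SOCKS5 server on the remote end, reachable over the MPTCP/WireGuard path), and iptables pushes outgoing TCP into it:

# redirect locally-originated TCP to redsocks2's listener (example port 12345),
# leaving LAN traffic and the SOCKS5 server itself (example 10.0.0.2) alone
iptables -t nat -N REDSOCKS
iptables -t nat -A REDSOCKS -d 192.168.0.0/16 -j RETURN
iptables -t nat -A REDSOCKS -d 10.0.0.2 -j RETURN
iptables -t nat -A REDSOCKS -p tcp -j REDIRECT --to-ports 12345
iptables -t nat -A OUTPUT -p tcp -j REDSOCKS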

@matttbe
Member

matttbe commented Feb 2, 2021

I see why you need to use more addresses now. Thank you for the explanation, it's an interesting use-case!

And nice to see it works well with all these proxies! Can we force WireGuard to use TCP? Or I guess MPTCP is in a tunnel managed by WireGuard.

@arter97
Contributor Author

arter97 commented Feb 4, 2021

Yeah, MPTCP is living inside WireGuard tunnels.

I haven't run an experiment yet to see which is better: "multiple WireGuard-tunneled interfaces with MPTCP and an unencrypted microsocks proxy" or "unencrypted interfaces with MPTCP and an encrypted SOCKS5 proxy (e.g., ssh or shadowsocks)".

I opted for WireGuard as it naturally parallelizes across multiple CPU cores, but who knows, maybe the latter can outperform it ¯\_(ツ)_/¯

I should experiment with that sooner or later.

@arter97
Contributor Author

arter97 commented May 2, 2021

Just leaving here an update on our use-case :)

We settled on using WireGuard + MPTCP + shadowsocks-rust (without encryption: plain), and it has been rock solid for months now.

If we don't use WireGuard, something goes wrong with shadowsocks-rust and TCP connections randomly hang, which I don't believe is the fault of either MPTCP or shadowsocks-rust itself.
Setting up WireGuard and forcing our Internet connections to go through UDP fixed everything. Since the connections are already encrypted by WireGuard, we simply switched shadowsocks-rust to the plain (no encryption) cipher in its configuration.

@matttbe
Member

matttbe commented May 3, 2021

Thank you for sharing this, it's always useful from our development point of view to know how MPTCP is used :)

@starkovv

@arter97 What versions of the kernel and MPTCP do you use in your setup?

@arter97
Contributor Author

arter97 commented May 20, 2021

@starkovv I use a custom kernel based on v5.4 with the mptcp_trunk branch merged in.

The notable change is arter97/x86-kernel@443fcdf, as mentioned in the comment above.

https://github.com/arter97/x86-kernel/tree/5.4
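
For anyone wanting to reproduce the tree, the rough recipe is just merging the out-of-tree branch into a v5.4-based kernel tree (conflict resolution depends on the base tree and isn't shown here):

git remote add mptcp https://github.com/multipath-tcp/mptcp.git
git fetch mptcp mptcp_trunk
git merge FETCH_HEAD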

@arinc9
Contributor

arinc9 commented Dec 31, 2021

Just leaving here an update on our use-case :)

We settled on using WireGuard + MPTCP + shadowsocks-rust (without encryption: plain), and it has been rock solid for months now.

If we don't use WireGuard, something goes wrong with shadowsocks-rust and TCP connections randomly hang, which I don't believe is the fault of either MPTCP or shadowsocks-rust itself. Setting up WireGuard and forcing our Internet connections to go through UDP fixed everything. Since the connections are already encrypted by WireGuard, we simply switched shadowsocks-rust to the plain (no encryption) cipher in its configuration.

This is more or less the setup I have at home. I can get as many 100 Mbps links as I want from the ISP, so I plan to use 10 subflows to get a 1 Gbps connection.

I use WireGuard to take care of all the non-TCP traffic over the most stable link (especially helpful for encrypting DNS traffic and for delay-sensitive use cases). iptables picks up TCP traffic and forwards it to the proxy (I use v2ray's vless for that), which goes over multiple links in plaintext.

The reason I use a little-known Chinese protocol is that my home router cannot handle high throughput with encryption. And where I live, I'm pretty sure the ISP uses their firewall to track SOCKS traffic, so I believe using an obscure protocol like vless keeps me under the radar.

@arter97 @matttbe
