
Server crash on client reconnect with fragmentation enabled #400

Closed
WIPocket opened this issue Aug 29, 2023 · 11 comments

@WIPocket

Describe the bug
When a client reconnects, the server exits on signal 8 (SIGFPE).
This is caused by a division by zero in optimal_fragment_size: frame_calculate_fragment is never called, so frame->max_fragment_size is zero.

To Reproduce
Set fragment and mssfix. Connect with a client, disconnect, connect again right away.
Relevant server configuration:

local ...
proto udp
port 1194

persist-tun
persist-key
user openvpn
group openvpn

dev tun0

ca ...
cert ...
key ...
dh ...
crl-verify ...

server 10.33.82.0 255.255.255.0

ccd-exclusive
client-config-dir ...

verb 3

keepalive 10 60
ping-timer-rem

fragment 1300
mssfix 1300

Expected behavior
The instance should be properly initialized and frame_calculate_fragment should be called. The server should not crash.

Version information (please complete the following information):
Server:

  • OS: FreeBSD 13.2-RELEASE-p2
  • OpenVPN version: openvpn-2.6.6
    Client:
  • OS: Windows 10
  • OpenVPN version: OpenVPN 2.5.7 [git:release/2.5/3d792ae9557b959e] Windows-MSVC [SSL (OpenSSL)] [LZO] [LZ4] [PKCS11] [AEAD] built on Oct 28 2022

Additional context
On the first connection I get:

openvpn: ... [...] Peer Connection Initiated with ...
openvpn: ... MULTI_sva: pool returned IPv4=10.33.82.6, IPv6=(Not enabled)
openvpn: ... OPTIONS IMPORT: reading client specific options from: ...
openvpn: ... MULTI: Learn: 10.33.82.25 -> ...
openvpn: ... MULTI: primary virtual IP for ...: 10.33.82.25

On the second connection, this doesn't happen. Instead, I get this first line:

openvpn: ... TLS: Initial packet from ...

Crash backtrace:

Program received signal SIGFPE, Arithmetic exception.
Integer divide by zero.
0x0000000000272abb in optimal_fragment_size (len=16, max_frag_size=0) at fragment.c:307
307     fragment.c: No such file or directory.
(gdb) where
#0  0x0000000000272abb in optimal_fragment_size (len=16, max_frag_size=0) at fragment.c:307
#1  0x00000000002728d3 in fragment_outgoing (f=0x8273aa000, buf=0x82736b148, frame=0x82736aeb0) at fragment.c:340
#2  0x000000000026ac65 in encrypt_sign (c=0x82736a1c0, comp_frag=true) at forward.c:641
#3  0x00000000002d0b1d in check_ping_send_dowork (c=0x82736a1c0) at ping.c:88
#4  0x0000000000271515 in check_ping_send (c=0x82736a1c0) at ./ping.h:84
#5  0x0000000000270ad9 in process_coarse_timers (c=0x82736a1c0) at forward.c:809
#6  0x000000000026e414 in check_coarse_timers (c=0x82736a1c0) at forward.c:831
#7  0x000000000026e2c3 in pre_select (c=0x82736a1c0) at forward.c:1986
#8  0x000000000029c7e9 in multi_process_post (m=0x820aba660, mi=0x82736a000, flags=5) at multi.c:3042
#9  0x00000000002a0447 in multi_process_timeout (m=0x820aba660, mpp_flags=5) at multi.c:3678
#10 0x0000000000299c6b in tunnel_server_udp (top=0x820abb920) at mudp.c:514
#11 0x00000000002a180f in tunnel_server (top=0x820abb920) at multi.c:4159
#12 0x00000000002a817c in openvpn_main (argc=11, argv=0x820abcad8) at openvpn.c:319
#13 0x00000000002a7e32 in main (argc=11, argv=0x820abcad8) at openvpn.c:396

This configuration worked on openvpn-2.4.9_3, FreeBSD 12.2-RELEASE-p14.

@cron2 cron2 added the bug label Sep 8, 2023
@darkbasic

I think this might be a similar issue: #417

In my case the client is not trying to reconnect; instead, someone else on the internet sends a malformed early-negotiation packet and crashes the whole server.

@dsommers
Member

dsommers commented Sep 27, 2023

I'm not touching the crux of this issue, just pointing out that this seems to be related to the use of the fragment option.

The fragment feature is not supported by ovpn-dco nor implemented in the OpenVPN 3 Core library. So configurations wanting to be forward compatible with newer technologies should avoid using fragment in general.

That said, this issue deserves a fix. But also consider moving away from fragment in the longer term.

@darkbasic

But also consider moving away from fragment in the longer term.

I cannot do so: my provider breaks PMTUD by doing some kind of tunneling while not using jumbo frames: https://forum.fibra.click/d/44630-lista-nera-dei-provider-che-multiplexano-le-sessioni-ppp-via-tunnel-l2tp

@dsommers
Member

I believe setting tun-mtu 1300 and mssfix 1300 would resolve this; the local network stack would then take care of fragmentation instead of OpenVPN doing additional fragmentation.
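As a concrete sketch of this suggestion (values taken from this thread; adjust them to your actual path MTU), the server config would swap the fragment directive for a lowered tun-mtu:

```
# instead of: fragment 1300
tun-mtu 1300
mssfix 1300
```

With the tun device MTU lowered, packets entering the tunnel already fit the link, so OpenVPN's own fragmentation layer is never needed.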

@schwabe
Contributor

schwabe commented Sep 27, 2023

I have a hard time reproducing this. Just using the config and connecting/reconnecting a client does not trigger any crash for me.

@darkbasic

darkbasic commented Sep 27, 2023

I believe setting tun-mtu 1300 and mssfix 1300 would resolve this; the local network stack would then take care of fragmentation instead of OpenVPN doing additional fragmentation.

This is a layer 2 tunnel, so that won't work for hosts other than the OpenVPN client itself: my tap interface is bridged with a whole VLAN interface.

@WIPocket
Author

WIPocket commented Nov 1, 2023

I managed to bisect this crash to 0d96997.
Backtrace after client reconnect:

Program received signal SIGFPE, Arithmetic exception.
Integer divide by zero.
0x000000000026785b in optimal_fragment_size (len=16, max_frag_size=0) at fragment.c:310
310         const int div = len / mfs_aligned;
(gdb) where
#0  0x000000000026785b in optimal_fragment_size (len=16, max_frag_size=0) at fragment.c:310
#1  0x0000000000267673 in fragment_outgoing (f=0x801139000, buf=0x8010f0fe0, frame=0x8010f0d60) at fragment.c:343
#2  0x00000000002600d2 in encrypt_sign (c=0x8010f01c0, comp_frag=true) at forward.c:554
#3  0x00000000002b8add in check_ping_send_dowork (c=0x8010f01c0) at ping.c:90
#4  0x0000000000266395 in check_ping_send (c=0x8010f01c0) at ./ping.h:84
#5  0x0000000000265b41 in process_coarse_timers (c=0x8010f01c0) at forward.c:694
#6  0x0000000000263629 in check_coarse_timers (c=0x8010f01c0) at forward.c:709
#7  0x00000000002634e3 in pre_select (c=0x8010f01c0) at forward.c:1808
#8  0x000000000028bfc9 in multi_process_post (m=0x7fffffffc470, mi=0x8010f0000, flags=5) at multi.c:2930
#9  0x000000000028ea1c in multi_process_incoming_link (m=0x7fffffffc470, instance=0x0, mpp_flags=5) at multi.c:3276
#10 0x000000000028a120 in multi_process_io_udp (m=0x7fffffffc470) at mudp.c:211
#11 0x0000000000289c96 in tunnel_server_udp (top=0x7fffffffd590) at mudp.c:325
#12 0x0000000000290a5f in tunnel_server (top=0x7fffffffd590) at multi.c:3928
#13 0x00000000002964f4 in openvpn_main (argc=7, argv=0x7fffffffe5e0) at openvpn.c:313
#14 0x00000000002961d2 in main (argc=7, argv=0x7fffffffe5e0) at openvpn.c:390
(gdb) print *(struct frame *) 0x8010f0d60
$1 = {buf = {payload_size = 1736, headroom = 136, tailroom = 136}, link_mtu = 1625, mss_fix = 0, max_fragment_size = 0, extra_frame = 125, tun_mtu = 1500, extra_buffer = 0,
  extra_tun = 0, extra_link = 3}

@schwabe
Contributor

schwabe commented Nov 1, 2023

Thanks, I already have a patch for that which will hopefully be committed in the next few days. It is currently under review.

@WIPocket
Author

WIPocket commented Nov 8, 2023

Thanks, I already have a patch for that which will hopefully be committed in the next few days. It is currently under review.

Any updates on this? Can you please post the patch (I failed to find it in your fork of openvpn or any mailing list), so I can check if the issue is fixed?

@cron2
Contributor

cron2 commented Nov 10, 2023

Hi. The patches are in the 2.6.7 release that hit GitHub yesterday. Due to some internal coordination and availability issues, it was not coordinated as well as it could have been.

The commits are

commit 57a5cd1
Author: Arne Schwabe <arne@rfc2549.org>
Date: Fri Oct 27 14:19:37 2023 +0200

Fix using to_link buffer after freed

and

commit 1cfca65
Author: Arne Schwabe <arne@rfc2549.org>
Date: Thu Oct 19 15:14:33 2023 +0200

Remove saving initial frame code

The second one fixes the div-by-zero; the first one was found while trying to figure out what happens when replying to a corrupt inbound TLS packet.

@WIPocket
Author

The crash seems to be fixed, thanks.
