Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

App reconnects every 2 minutes after update from 0.7.36 to 0.7.37 #1486

Closed
andronov-alexey opened this issue May 16, 2022 · 21 comments
Closed

Comments

@andronov-alexey
Copy link

General information

  1. Android Version: 12
  2. Version of the app (version number/play store version/self-built): GooglePlay 0.7.37
  3. Server: SoftEther

Hello,

After latest app update to version 0.7.37 app started reconnecting to server every 2 minutes.
The issue goes away when I install previous version 0.7.36

Here's the client logs:

[Server] Inactivity timeout (--ping-restart), restarting
SIGUSR1[soft,ping-restart] received, process restarting

So it seems like new app version (0.7.37) sets some kind of reconnect timeout, for example:

ping-restart 0
keepalive 10 120

Regards

@schwabe
Copy link
Owner

schwabe commented May 16, 2022

  • Is that on an existing profile or a freshly imported profile?
  • are those reconnect timeouts thing you see in the generated config or that you speculate?
  • can you also share a log?

@andronov-alexey
Copy link
Author

andronov-alexey commented May 16, 2022

Thanks for the answer

Is that on an existing profile or a freshly imported profile?

It was an existing profile

are those reconnect timeouts thing you see in the generated config or that you speculate?

It's my guess based on timestamps from UI log

While I was saving a log I found out it was more descriptive than the one shown in app UI
There (see client_log.txt) I found the following entry:

PUSH: Received control message: 'PUSH_REPLY,ping 3,ping-restart 10, ...'

So turns out server pushes timeout setting and server is to blame
To solve the issue I added the following setting to vpn config on the client:
pull-filter ignore "ping-restart"

But the issue still there although now intervals became ~3 minutes (see client_log ignore ping-restart.txt):

2022-05-16 23:40:17 [Finland] Inactivity timeout (--ping-restart), restarting
2022-05-16 23:42:52 [Finland] Inactivity timeout (--ping-restart), restarting
2022-05-16 23:46:58 [Finland] Inactivity timeout (--ping-restart), restarting
2022-05-16 23:50:32 [Finland] Inactivity timeout (--ping-restart), restarting
2022-05-16 23:53:03 [Finland] Inactivity timeout (--ping-restart), restarting

And additionally there two run-time exceptions:
1.

java.lang.NoSuchMethodException: com.android.org.bouncycastle.asn1.x509.X509Name.getInstance [class java.lang.Object]
at java.lang.Class.getMethod(Class.java:2103)
at java.lang.Class.getMethod(Class.java:1724)
at de.blinkt.openvpn.core.X509Utils.getCertificateFriendlyName(X509Utils.java:136)
at de.blinkt.openvpn.core.X509Utils.getCertificateFriendlyName(X509Utils.java:82)
at de.blinkt.openvpn.views.FileSelectLayout.setData(FileSelectLayout.java:155)
at de.blinkt.openvpn.fragments.Settings_Basic.loadPreferences(Settings_Basic.java:192)
at de.blinkt.openvpn.fragments.KeyChainSettingsFragment.onStart(KeyChainSettingsFragment.kt:208)
at androidx.fragment.app.Fragment.performStart(Fragment.java:3053)
at androidx.fragment.app.FragmentStateManager.start(FragmentStateManager.java:587)
at androidx.fragment.app.FragmentStateManager.moveToExpectedState(FragmentStateManager.java:279)
at androidx.fragment.app.FragmentStore.moveToExpectedState(FragmentStore.java:113)
at androidx.fragment.app.FragmentManager.moveToState(FragmentManager.java:1374)
at androidx.fragment.app.FragmentManager.dispatchStateChange(FragmentManager.java:2841)
at androidx.fragment.app.FragmentManager.dispatchStart(FragmentManager.java:2791)
at androidx.fragment.app.FragmentController.dispatchStart(FragmentController.java:273)
at androidx.fragment.app.FragmentActivity.onStart(FragmentActivity.java:486)
at androidx.appcompat.app.AppCompatActivity.onStart(AppCompatActivity.java:246)
at android.app.Instrumentation.callActivityOnStart(Instrumentation.java:1467)
at android.app.Activity.performStart(Activity.java:8079)
at android.app.ActivityThread.handleStartActivity(ActivityThread.java:3732)
at android.app.servertransaction.TransactionExecutor.performLifecycleSequence(TransactionExecutor.java:221)
at android.app.servertransaction.TransactionExecutor.cycleToPath(TransactionExecutor.java:201)
at android.app.servertransaction.TransactionExecutor.executeLifecycleState(TransactionExecutor.java:173)
at android.app.servertransaction.TransactionExecutor.execute(TransactionExecutor.java:97)
at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2253)
at android.os.Handler.dispatchMessage(Handler.java:106)
at android.os.Looper.loopOnce(Looper.java:201)
at android.os.Looper.loop(Looper.java:288)
at android.app.ActivityThread.main(ActivityThread.java:7870)
at java.lang.reflect.Method.invoke(Native Method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:548)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1003)

java.lang.NoSuchMethodException: com.android.org.bouncycastle.asn1.x509.X509Name.getInstance [class java.lang.Object]
at java.lang.Class.getMethod(Class.java:2103)
at java.lang.Class.getMethod(Class.java:1724)
at de.blinkt.openvpn.core.X509Utils.getCertificateFriendlyName(X509Utils.java:136)
at de.blinkt.openvpn.core.X509Utils.getCertificateFriendlyName(X509Utils.java:82)
at de.blinkt.openvpn.views.FileSelectLayout.setData(FileSelectLayout.java:155)
at de.blinkt.openvpn.fragments.Settings_Basic.loadPreferences(Settings_Basic.java:192)
at de.blinkt.openvpn.fragments.KeyChainSettingsFragment.onStart(KeyChainSettingsFragment.kt:208)
at androidx.fragment.app.Fragment.performStart(Fragment.java:3053)
at androidx.fragment.app.FragmentStateManager.start(FragmentStateManager.java:587)
at androidx.fragment.app.FragmentStateManager.moveToExpectedState(FragmentStateManager.java:279)
at androidx.fragment.app.FragmentManager.executeOpsTogether(FragmentManager.java:1840)
at androidx.fragment.app.FragmentManager.removeRedundantOperationsAndExecute(FragmentManager.java:1758)
at androidx.fragment.app.FragmentManager.execSingleAction(FragmentManager.java:1670)
at androidx.fragment.app.BackStackRecord.commitNowAllowingStateLoss(BackStackRecord.java:323)
at androidx.fragment.app.FragmentPagerAdapter.finishUpdate(FragmentPagerAdapter.java:249)
at androidx.viewpager.widget.ViewPager.populate(ViewPager.java:1244)
at androidx.viewpager.widget.ViewPager.populate(ViewPager.java:1092)
at androidx.viewpager.widget.ViewPager.onMeasure(ViewPager.java:1622)
at android.view.View.measure(View.java:25833)
at android.view.ViewGroup.measureChildWithMargins(ViewGroup.java:6980)
at android.widget.LinearLayout.measureChildBeforeLayout(LinearLayout.java:1552)
at android.widget.LinearLayout.measureVertical(LinearLayout.java:842)
at android.widget.LinearLayout.onMeasure(LinearLayout.java:721)
at android.view.View.measure(View.java:25833)
at android.view.ViewGroup.measureChildWithMargins(ViewGroup.java:6980)
at android.widget.FrameLayout.onMeasure(FrameLayout.java:194)
at androidx.appcompat.widget.ContentFrameLayout.onMeasure(ContentFrameLayout.java:145)
at android.view.View.measure(View.java:25833)
at android.view.ViewGroup.measureChildWithMargins(ViewGroup.java:6980)
at androidx.appcompat.widget.ActionBarOverlayLayout.onMeasure(ActionBarOverlayLayout.java:496)
at android.view.View.measure(View.java:25833)
at android.view.ViewGroup.measureChildWithMargins(ViewGroup.java:6980)
at android.widget.FrameLayout.onMeasure(FrameLayout.java:194)
at android.view.View.measure(View.java:25833)
at android.view.ViewGroup.measureChildWithMargins(ViewGroup.java:6980)
at android.widget.LinearLayout.measureChildBeforeLayout(LinearLayout.java:1552)
at android.widget.LinearLayout.measureVertical(LinearLayout.java:842)
at android.widget.LinearLayout.onMeasure(LinearLayout.java:721)
at android.view.View.measure(View.java:25833)
at android.view.ViewGroup.measureChildWithMargins(ViewGroup.java:6980)
at android.widget.FrameLayout.onMeasure(FrameLayout.java:194)
at com.android.internal.policy.DecorView.onMeasure(DecorView.java:758)
at android.view.View.measure(View.java:25833)
at android.view.ViewRootImpl.performMeasure(ViewRootImpl.java:3676)
at android.view.ViewRootImpl.measureHierarchy(ViewRootImpl.java:2460)
at android.view.ViewRootImpl.performTraversals(ViewRootImpl.java:2739)
at android.view.ViewRootImpl.doTraversal(ViewRootImpl.java:2179)
at android.view.ViewRootImpl$TraversalRunnable.run(ViewRootImpl.java:8787)
at android.view.Choreographer$CallbackRecord.run(Choreographer.java:1037)
at android.view.Choreographer.doCallbacks(Choreographer.java:845)
at android.view.Choreographer.doFrame(Choreographer.java:780)
at android.view.Choreographer$FrameDisplayEventReceiver.run(Choreographer.java:1022)
at android.os.Handler.handleCallback(Handler.java:938)
at android.os.Handler.dispatchMessage(Handler.java:99)
at android.os.Looper.loopOnce(Looper.java:201)
at android.os.Looper.loop(Looper.java:288)
at android.app.ActivityThread.main(ActivityThread.java:7870)
at java.lang.reflect.Method.invoke(Native Method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:548)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1003)

So I'm wondering:

  1. Why old 0.7.36 version worked with "ping-restart 10" option from server?
  2. Why new 0.7.37 still reconnects with option pull-filter ignore "ping-restart"?
  3. Should I create new issues for run-time exceptions?

@schwabe
Copy link
Owner

schwabe commented May 17, 2022

The runtime expections are red herrings. Ignore them for this ticket.

The default is to still restart a connection, so ignoring ping-restart just gives you the default which is still USR1. The ping restart of 3 10 is very agressive. That means that if there is any 10s hickup, the client will disconnect. And yes for UDP there is a 120s default. You can going back to 0.7.36 but this looks more like some other problem like bad server connection or mismatched control parameter being negotiated so pings are not getting through. So a complete log to look for those problems would be good.

@andronov-alexey
Copy link
Author

Okay, I'll keep using the previous version then
Thanks

@schwabe
Copy link
Owner

schwabe commented May 17, 2022

@andronov-alexey well, if you using the old version, this bug will probably not be fixed since I have no idea what is happening in your setup.

@andronov-alexey
Copy link
Author

I looked at server logs and found out that SoftEther server breaks the connection after 20 seconds and then the client drops it too after another 10 seconds.
In the following example connection established at 06:23:19, server breaks it at 06:23:42 and client breaks it at 10:23:51:

Server:

06:23:18.415 Connection "CID-23390": Successfully authenticated as user "user".
06:23:18.415 Connection "CID-23390": The new session "SID-USER-[OPENVPN_L3]-2914" has been created. (IP address: 111.111.111.111, Port number: 15074, Physical underlying protocol: "Legacy VPN - OPENVPN_L3")
06:23:18.415 Session "SID-USER-[OPENVPN_L3]-2914": The parameter has been set. Max number of TCP connections: 1, Use of encryption: Yes, Use of compression: No, Use of Half duplex communication: No, Timeout: 20 seconds.
06:23:19.452 OpenVPN Session 4285 (111.111.111.111:15074 -> 111.111.111.112:222) Channel 0: The full strings replied: "PUSH_REPLY,ping 3,ping-restart 10,ifconfig 192.168.30.29 192.168.30.30,dhcp-option DOMAIN server.net.domain,dhcp-option DNS 94.140.14.14,dhcp-option DNS 94.140.15.15,route-gateway 192.168.30.30,redirect-gateway def1"
06:23:42.514 Session "SID-USER-[OPENVPN_L3]-2914": The session has been terminated. The statistical information is as follows: Total outgoing data size: 4775 bytes, Total incoming data size: 5779 bytes.
06:23:42.544 Connection "CID-23390" terminated by the cause "The VPN session has been deleted. It is possible that either the administrator disconnected the session or the connection from the client to the VPN Server has been disconnected." (code 11).
06:23:42.544 Connection "CID-23390" has been terminated.

Client:

10:23:19 MANAGEMENT: >STATE:1652854999,CONNECTED,SUCCESS,192.168.30.29,111.111.111.112,222,,
10:23:51 [Finland] Inactivity timeout (--ping-restart), restarting

So I guess the server cant ping the client and drops the connection after timeout.
I tried to disable firewall on server to no avail.

Also "OpenVPN Connect" is working fine with the same config file

@schwabe
Copy link
Owner

schwabe commented May 18, 2022

Can you post a whole log? There is something fishy going on but I have no idea what it is. Does it reproduce if you use OpenVPN3 core under general settings?

@andronov-alexey
Copy link
Author

I provided two full logs at the earlier reply
They are all pretty similar (look for "Inactivity timeout (--ping-restart), restarting" lines)

With OpenVPN3 enabled I can't connect at all with the error:
Client exception in transport_recv: crypto_alg: AES-128-CBC: bad cipher for data channel use

@andronov-alexey
Copy link
Author

I can also send you my ".ovpn" config file if that would help

@yan12125
Copy link

I believe I got the same issue. I tried building the APK, and git bisect identified schwabe/openvpn@5356017 as the first bad commit. With schwabe/openvpn@6013396, I don't get any restart after several minutes.

Here are some general information:

General information

  1. Android version: 9
  2. Android Vendor/Custom ROM: HTC stock ROM + Magisk
  3. Device: HTC U11
  4. (version number/play store version/self-built): F-Droid 0.7.37. The issue also appears with self-built APK

Description of the issue

TCP/UDP connections are restarted after roughly 10s/30s, respectively

Logs (from self-built APK)

Configuration file

ics-openvpn-config.txt (exported from the app; key & cert removed)

Additional information

@schwabe
Copy link
Owner

schwabe commented Jun 12, 2022

If that commit in OpenVPN is really the offending one, Softether does not implement the OpenVPN protocol correctly. Having repeated ACKs should not trigger a complete meltdown in the protocol

@schwabe
Copy link
Owner

schwabe commented Jun 12, 2022

Can you capture a tcpdump of the connection as pcap?

@yan12125
Copy link

Can you capture a tcpdump of the connection as pcap?

Sure!

ics-openvpn-tcp-filtered.pcap.gz
ics-openvpn-udp-filtered.pcap.gz

Both are captured on the Wi-Fi router using WireShark (I'm more familiar with it than tcpdump) and filtered to keep traffic to/from the VPN server only (ip.src == 140.109.17.11 || ip.dst == 140.109.17.11). Please tell me if you need complete traffic.

@schwabe
Copy link
Owner

schwabe commented Jun 12, 2022

Yeah, I found the problem. Basically Softether never implemented the protocol correctly:

In its code (Proto_OpenVPN.c) there is a code fragement:

	if (ret->NumAck > 4)
	{
		goto LABEL_ERROR;
	}

while OpenVPN always had up to 8 acks per P_ACK_V1 packet.

So basically Softether ignores valid OpenVPN ACK packets and then drops the connection after 2 minutes.

schwabe added a commit to schwabe/SoftEtherVPN that referenced this issue Jun 12, 2022
OpenVPN always allowed 8 ACKs in P_ACK_V1 packets but only used
up to 4 in other control packets. Since Softether drops all packets with
more than 4 ACKs it also drops legimate P_ACK_V1.

See also this issue: schwabe/ics-openvpn#1486
JackieKu pushed a commit to JackieKu/SoftEtherVPN that referenced this issue Jun 13, 2022
OpenVPN always allowed 8 ACKs in P_ACK_V1 packets but only used
up to 4 in other control packets. Since Softether drops all packets with
more than 4 ACKs it also drops legimate P_ACK_V1.

See also this issue: schwabe/ics-openvpn#1486
@JackieKu
Copy link

Tried SoftEtherVPN/SoftEtherVPN#1610, works for me.

@yan12125
Copy link

Thank you very much for the fix! It may take some time for a new release of SoftEther VPN and for network administrators at our department to upgrade, so I need a workaround for now. Besides using an older commit, is limiting the number of ACKs a valid workaround?

diff --git a/src/openvpn/reliable.h b/src/openvpn/reliable.h
index 0f3caa34..5455d42e 100644
--- a/src/openvpn/reliable.h
+++ b/src/openvpn/reliable.h
@@ -41,7 +41,7 @@
  *  @{ */
 
 
-#define RELIABLE_ACK_SIZE 8     /**< The maximum number of packet IDs
+#define RELIABLE_ACK_SIZE 4     /**< The maximum number of packet IDs
                                  *   waiting to be acknowledged which can
                                  *   be stored in one \c reliable_ack
                                  *   structure. */

schwabe added a commit to schwabe/SoftEtherVPN that referenced this issue Jun 13, 2022
OpenVPN always allowed 8 ACKs in P_ACK_V1 packets but only used
up to 4 in other control packets. Since Softether drops all packets with
more than 4 ACKs it also drops legimate P_ACK_V1.

See also this issue: schwabe/ics-openvpn#1486
@schwabe
Copy link
Owner

schwabe commented Jun 13, 2022

No. That is absolutley not a good workaround. That would make OpenVPN misbehave like Softether and break with other OpenVPN servers.

@yan12125
Copy link

Thank you very much for clarafication! I will switch to 0.7.36 instead.

@schwabe
Copy link
Owner

schwabe commented Jun 14, 2022

@yan12125 try this:

diff --git a/src/openvpn/ssl_pkt.c b/src/openvpn/ssl_pkt.c
index e86fbc1b..d6e5c323 100644
--- a/src/openvpn/ssl_pkt.c
+++ b/src/openvpn/ssl_pkt.c
@@ -177,6 +177,15 @@ write_control_auth(struct tls_session *session,
 {
     uint8_t header = ks->key_id | (opcode << P_OPCODE_SHIFT);
 
+    /* Workaround for Softether servers. Softether has a bug that it only
+     * allows 4 ACks in packets and drops packets if more ACKs are contained
+     * in a packet (see commit a14d812dcbc in Softether) */
+    if (session->tls_wrap.mode == TLS_WRAP_NONE && !session->opt->server
+            && !(session->opt->crypto_flags & CO_USE_TLS_KEY_MATERIAL_EXPORT))
+    {
+            max_ack = min_int(max_ack, 4);
+    }
+
     ASSERT(link_socket_actual_defined(&ks->remote_addr));
     ASSERT(reliable_ack_write
                (ks->rec_ack, ks->lru_acks, buf, &ks->session_id_remote,

@yan12125
Copy link

This works for both TCP and UDP connections, thanks!

@schwabe
Copy link
Owner

schwabe commented Jul 22, 2022

Closed by commit 02eabb0

@schwabe schwabe closed this as completed Jul 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants