IPsec traffic stalling on 20.7 #118

@fraenki

Describe the bug
After upgrading from 20.1.4 to 20.7.2, IPsec phase 2 tunnels randomly stall (IKEv1, tunnel mode, IPv4). Only restarting strongswan seems to fix this issue, and only temporarily.

Several other people have reported the same issue on 20.7. More reports, details, hardware specs, etc. are available in the forums:
https://forum.opnsense.org/index.php?topic=18918.0

To Reproduce
Steps to reproduce the behavior:

  1. Establish an IPsec tunnel between two physical OPNsense 20.7 boxes
  2. Use IKEv2 if possible; the issue was easier to reproduce with it (at least in my case)
  3. Send enough traffic through the tunnel (200-400 MB in my case)
  4. The tunnel completely stalls, no traffic gets through
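To report the point at which a transfer stalls (steps 3 and 4 above), a small helper can scan periodic byte-counter samples. This is only a sketch: the `find_stall` name, the sample format, and the 30-second idle threshold are assumptions for illustration, not part of OPNsense or strongswan. The samples could come from any cumulative byte counter on the receiving side.

```python
def find_stall(samples, min_idle_seconds=30.0):
    """Return the (seconds, cumulative_bytes) sample after which the byte
    counter stopped increasing for at least min_idle_seconds, or None.

    samples: iterable of (seconds, cumulative_bytes), sorted by time.
    The 30 s default is an arbitrary assumption, not a strongswan value.
    """
    last_progress = None  # most recent sample at which the counter grew
    for t, total in samples:
        if last_progress is None or total > last_progress[1]:
            last_progress = (t, total)
        elif t - last_progress[0] >= min_idle_seconds:
            return last_progress
    return None


# Hypothetical samples: the transfer stops growing at 250 MB.
samples = [(0, 0), (10, 100_000_000), (20, 250_000_000),
           (30, 250_000_000), (60, 250_000_000)]
stall = find_stall(samples)
if stall is not None:
    print(f"stalled at t={stall[0]}s after {stall[1] / 1e6:.0f} MB")
```

Recording the byte count at the stall makes it easier to compare reports, for example against the 200-400 MB range mentioned above.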

Downgrading to OPNsense 20.1 immediately fixes this issue (according to other reporters in the forum).

However, @mimugmail tried to reproduce this issue on OPNsense virtual machines and did not succeed, so it is probably related to specific vendors, hardware, or drivers. So far, all reporters have been using Intel NICs.

Expected behavior
Traffic should get through the IPsec tunnel; it should not stall after a certain amount of data has been transferred.

Relevant log files
When using IKEv2, strongswan is actually able to detect the problem and restart the SA after a while:

Sep  8 17:03:05 charon[62985]: 05[IKE] <con5|1> giving up after 10 path probings
Sep  8 17:03:05 charon[62985]: 05[IKE] <con5|1> restarting CHILD_SA con5 

Note that with IKEv1, strongswan still considers the tunnel active and alive, so you need to restart strongswan manually.
Of course, neither case really helps, because the tunnel gets stuck again after very little traffic.
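For anyone who wants to count how often this happens, the two charon messages shown above can be picked out of a log mechanically. The following is a minimal sketch that assumes the syslog line layout is exactly as in the excerpt; the `find_stall_events` name and the event labels are made up for illustration.

```python
import re

# Matches charon IKE lines in the layout shown in the log excerpt, e.g.
#   Sep  8 17:03:05 charon[62985]: 05[IKE] <con5|1> giving up after 10 path probings
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+charon\[\d+\]:\s+"
    r"\d+\[IKE\]\s+<(?P<conn>[^|>]+)\|\d+>\s+(?P<msg>.*\S)\s*$"
)


def find_stall_events(lines):
    """Return (timestamp, connection, kind) for each probe-failure or
    CHILD_SA-restart line; all other lines are ignored."""
    events = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        msg = m.group("msg")
        if msg.startswith("giving up after") and msg.endswith("path probings"):
            kind = "probe_failure"
        elif msg.startswith("restarting CHILD_SA"):
            kind = "child_sa_restart"
        else:
            continue
        events.append((m.group("ts"), m.group("conn"), kind))
    return events


# The two lines from the excerpt above.
SAMPLE = [
    "Sep  8 17:03:05 charon[62985]: 05[IKE] <con5|1> giving up after 10 path probings",
    "Sep  8 17:03:05 charon[62985]: 05[IKE] <con5|1> restarting CHILD_SA con5 ",
]
for event in find_stall_events(SAMPLE):
    print(event)
```

This only helps with IKEv2: as noted, IKEv1 does not log anything when the tunnel stalls.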

Additional context
I've manually built packages for strongswan 5.8.3 and 5.9.0 and tested them on OPNsense 20.7.2, but this did not change anything.

Other reporters and I tested several different IPsec settings, but this did not change anything (details of the various settings can be found in the forum). Changing the options in Interfaces: Settings did not help either.

An interesting observation: when one box was still running OPNsense 20.1 and only the other box was on OPNsense 20.7, this issue did not occur at all. Only after upgrading both boxes to 20.7 did we first experience it. Also, this issue does not affect IPsec connections between OPNsense and other firewalls; those are 100% stable. Only OPNsense-to-OPNsense connections are affected.

Environment
My setup:
OPNsense 20.7.2 / 20.7.3 (amd64, OpenSSL).
Supermicro A2SDi-2C-HLN4F (Intel C3338 CPU, Intel C3000 NIC)

Other reporters:
Landitec scope7-1510 (Intel Atom C3558)
Supermicro A2SDi-4C-HLN4F (almost the same as mine)

Labels

upstream (Third party issue)
