IPSec Route missing after WAN DHCP Renew (#3414 related?) #5263

Closed
2 tasks done
HamburgerJungeJr opened this issue Oct 8, 2021 · 25 comments · Fixed by #5286

@HamburgerJungeJr

HamburgerJungeJr commented Oct 8, 2021

Important notices

Our forum is located at https://forum.opnsense.org, please consider joining discussions there instead of using GitHub for these matters.

Before you ask a new question, we ask you kindly to acknowledge the following:

Hi,

I have a routed IPSec-Tunnel to our corporate network.
After a WAN DHCP renew, the routes to the IPsec tunnel are no longer listed under Status, but they are still active under Configuration.

If I disable the route and re-enable it, the IPsec tunnel works again, so I'm pretty sure it's an issue with the route.

I think it's the same issue as in #3414.

I'm running OPNsense 21.7.2_1. Do I need to apply some patches, or are the patches mentioned in #3414 (9046727, 3e4df24) already applied?

@HamburgerJungeJr HamburgerJungeJr added the support Community support label Oct 8, 2021
@fichtner
Member

fichtner commented Oct 9, 2021

Hi @HamburgerJungeJr,

Patches are all there. I was sort of expecting this edge case to come up eventually. Can you tell me which interface ipsec is running on? Which interface are your broken routes on? Which interface(s) are your WAN(s)?

Thanks,
Franco

@HamburgerJungeJr
Author

Hi,
I'm not quite sure what you mean by 'interface', but here's my complete VPN config:

VPNConfig Phase1

2021-10-09_11-22
2021-10-09_11-22_1
2021-10-09_11-22_2

VPNConfig Phase2

2021-10-09_11-23
2021-10-09_11-23_1

Gateway config

2021-10-09_11-23_2
2021-10-09_11-24

Interface Assignments

2021-10-09_11-24_1

Route config

2021-10-09_11-25

@rfc4711
Contributor

rfc4711 commented Oct 18, 2021

On my OPNsense 21.7.3_3-amd64 with a static route to the IPsec gateway, the route needs to be re-applied when the IPsec tunnel is restarted. I tried a config very similar to @HamburgerJungeJr's.

@fichtner
Member

Merged via 35992e7 to the development version. It might not be patchable using opnsense-patch, as there are more IPsec changes in the pipeline.

@fichtner fichtner self-assigned this Oct 18, 2021
@fichtner fichtner added bug Production bug cleanup Low impact changes and removed support Community support bug Production bug labels Oct 18, 2021
@fichtner fichtner added this to the 22.1 milestone Oct 18, 2021
@fichtner fichtner added feature Adding new functionality and removed cleanup Low impact changes labels Oct 18, 2021
@fichtner
Member

GitHub closed this, but I'll leave it open for later feedback.

@fichtner fichtner reopened this Oct 18, 2021
@HamburgerJungeJr
Author

I applied the patch and it seems to work now.
I will keep an eye on the problem but for now it seems to be fixed.

@fichtner
Member

fichtner commented Oct 18, 2021

@HamburgerJungeJr thanks so far! patches on top of 21.7.3 indeed... if someone else wants to try:

# opnsense-patch 35992e7

Cheers,
Franco

@rfc4711
Contributor

rfc4711 commented Oct 18, 2021

@fichtner will we be able to get your patch via 21.7 updates, or do I need to wait for the next major release?

I don't have experience applying patches to a live system. Can I log in as root and issue "opnsense-patch 35992e7" to fix the IPsec static route reactivation after an interface flap? And after applying the temporary patch, will it be overwritten by the mainline updates?

@fichtner
Member

@rfc4711 on top of 21.7.3 you can install the patch from the command line using:

# opnsense-patch 35992e7

Judging by early feedback it might make 21.7.4 or 21.7.5. We will discuss it later this week and all further feedback is highly appreciated.

Cheers,
Franco

@rfc4711
Contributor

rfc4711 commented Oct 18, 2021

Hi @fichtner, I applied the patch on both sides, however, it does not seem to work for restarting the IPSEC tunnel.

root@brick:~ # opnsense-patch 35992e7
Fetched 35992e7 via https://github.com/opnsense/core
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|From 35992e7003537c9d53ba70cfb652c5d02684563d Mon Sep 17 00:00:00 2001
|From: Franco Fichtner <franco@opnsense.org>
|Date: Mon, 18 Oct 2021 09:24:03 +0200
|Subject: [PATCH] ipsec: derive required route interfaces for dynamic changes
| #5263
|
|---
| src/etc/inc/plugins.inc.d/ipsec.inc | 39 +++++++++++++++++++++++------
| src/etc/rc.newwanip                 |  8 ++++++
| 2 files changed, 40 insertions(+), 7 deletions(-)
|
|diff --git a/src/etc/inc/plugins.inc.d/ipsec.inc b/src/etc/inc/plugins.inc.d/ipsec.inc
|index 890eb56324..a675a79c47 100644
|--- a/src/etc/inc/plugins.inc.d/ipsec.inc
|+++ b/src/etc/inc/plugins.inc.d/ipsec.inc
--------------------------
Patching file etc/inc/plugins.inc.d/ipsec.inc using Plan A...
Hunk #1 succeeded at 341 (offset 11 lines).
Hunk #2 succeeded at 1818 (offset -9 lines).
Hunk #3 succeeded at 1859 (offset 11 lines).
Hunk #4 succeeded at 1854 with fuzz 2 (offset -9 lines).
Hunk #5 succeeded at 1909 (offset 23 lines).
Hmm...  The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|diff --git a/src/etc/rc.newwanip b/src/etc/rc.newwanip
|index a050d5804a..888aa82f1b 100755
|--- a/src/etc/rc.newwanip
|+++ b/src/etc/rc.newwanip
--------------------------
Patching file etc/rc.newwanip using Plan A...
Hunk #1 succeeded at 98.
Hunk #2 succeeded at 140.
done
All patches have been applied successfully.  Have a nice day.

Configured the far gateway and the static routes to the loopback addresses, tunnel is up.

ipsec3: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1400
	tunnel inet 192.168.0.140 --> 108.48.45.xxx
	inet6 fe80::227c:14ff:fea0:4d1b%ipsec3 prefixlen 64 scopeid 0xd
	inet 10.0.7.2 --> 10.0.7.1 netmask 0xfffffffc
	groups: ipsec
	reqid: 3
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>

sbuxhofer@brick:~ % ping 10.0.7.1
PING 10.0.7.1 (10.0.7.1): 56 data bytes
64 bytes from 10.0.7.1: icmp_seq=0 ttl=64 time=125.897 ms
64 bytes from 10.0.7.1: icmp_seq=1 ttl=64 time=125.530 ms
^C
--- 10.0.7.1 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 125.530/125.713/125.897/0.184 ms

The ping to addresses via static routes is working from both sides.

root@brick:~ # ping 172.17.17.2
PING 172.17.17.2 (172.17.17.2): 56 data bytes
64 bytes from 172.17.17.2: icmp_seq=0 ttl=64 time=125.660 ms
64 bytes from 172.17.17.2: icmp_seq=1 ttl=64 time=123.244 ms
64 bytes from 172.17.17.2: icmp_seq=2 ttl=64 time=123.938 ms
64 bytes from 172.17.17.2: icmp_seq=3 ttl=64 time=127.699 ms
64 bytes from 172.17.17.2: icmp_seq=4 ttl=64 time=125.303 ms
^C
--- 172.17.17.2 ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 123.244/125.169/127.699/1.542 ms
root@brick:~ # 
root@brick:~ # ping 10.169.3.2
PING 10.169.3.2 (10.169.3.2): 56 data bytes
64 bytes from 10.169.3.2: icmp_seq=3 ttl=64 time=224.233 ms
64 bytes from 10.169.3.2: icmp_seq=4 ttl=64 time=225.982 ms
64 bytes from 10.169.3.2: icmp_seq=9 ttl=64 time=221.876 ms
64 bytes from 10.169.3.2: icmp_seq=13 ttl=64 time=222.173 ms
64 bytes from 10.169.3.2: icmp_seq=15 ttl=64 time=222.469 ms
64 bytes from 10.169.3.2: icmp_seq=16 ttl=64 time=222.023 ms
64 bytes from 10.169.3.2: icmp_seq=17 ttl=64 time=224.253 ms
64 bytes from 10.169.3.2: icmp_seq=18 ttl=64 time=225.047 ms
64 bytes from 10.169.3.2: icmp_seq=25 ttl=64 time=225.439 ms
64 bytes from 10.169.3.2: icmp_seq=28 ttl=64 time=222.251 ms
64 bytes from 10.169.3.2: icmp_seq=30 ttl=64 time=227.249 ms


^C
--- 10.169.3.2 ping statistics ---
520 packets transmitted, 11 packets received, 97.9% packet loss
round-trip min/avg/max/stddev = 221.876/223.909/227.249/1.780 ms

Restarted the IPsec process on "brick"; after sequence 30 there is no more connectivity. The route from brick to 10.169.3.2/32 (loopback on shadow) is gone.

root@brick:~ # netstat -rn | grep 169
10.169.1.3         link#12            UH          lo1
root@brick:~ # 

The tunnel is up and one-sided traffic is possible: I can ping from the remote side on shadow to the loopback 10.169.1.3/32 on brick.

root@shadow:~ # ping 10.169.1.3
PING 10.169.1.3 (10.169.1.3): 56 data bytes
64 bytes from 10.169.1.3: icmp_seq=0 ttl=64 time=122.812 ms
64 bytes from 10.169.1.3: icmp_seq=1 ttl=64 time=126.546 ms
64 bytes from 10.169.1.3: icmp_seq=2 ttl=64 time=125.386 ms
64 bytes from 10.169.1.3: icmp_seq=3 ttl=64 time=124.573 ms
64 bytes from 10.169.1.3: icmp_seq=4 ttl=64 time=124.069 ms
64 bytes from 10.169.1.3: icmp_seq=5 ttl=64 time=125.351 ms
^C
--- 10.169.1.3 ping statistics ---
6 packets transmitted, 6 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 122.812/124.790/126.546/1.172 ms

-> went into System->Routes->Configuration and hit Apply on "brick".

root@brick:~ # netstat -rn | grep 169
10.169.1.3         link#12            UH          lo1
10.169.3.2/32      10.0.7.1           UGS      ipsec3

root@brick:~ # ping  10.169.3.2
PING 10.169.3.2 (10.169.3.2): 56 data bytes
64 bytes from 10.169.3.2: icmp_seq=0 ttl=64 time=122.451 ms
64 bytes from 10.169.3.2: icmp_seq=1 ttl=64 time=127.319 ms
64 bytes from 10.169.3.2: icmp_seq=2 ttl=64 time=124.286 ms
64 bytes from 10.169.3.2: icmp_seq=3 ttl=64 time=126.308 ms
64 bytes from 10.169.3.2: icmp_seq=4 ttl=64 time=124.635 ms
^C
--- 10.169.3.2 ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 122.451/125.000/127.319/1.687 ms
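
The manual check above (grep the routing table, then re-apply the routes) can be scripted. The following is a minimal sketch, not part of OPNsense: `has_route` is a hypothetical helper name, and it only assumes that the destination sits in the first column of `netstat -rn` output, as in the transcripts above.

```shell
#!/bin/sh
# has_route DEST
# Reads routing-table text (e.g. from `netstat -rn`) on stdin and succeeds
# only if some line has DEST as its first column. An exact match is used so
# that 10.169.3.2/32 does not match 10.169.3.20/32.
# "has_route" is a hypothetical name chosen for this sketch.
has_route() {
    awk -v dest="$1" '$1 == dest { found = 1 } END { exit !found }'
}

# Example call on the firewall (commented out so the sketch has no side effects):
#   netstat -rn | has_route 10.169.3.2/32 || echo "static route missing"
```

Matching on the first column instead of a plain `grep 169` avoids false positives from unrelated entries such as the local loopback 10.169.1.3.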

I was thinking of adding static routes to each FW loopback address and then configuring BGP neighbors; with this method I could back up the IPsec tunnel with OpenVPN without having to change routing.

@AdSchellevis
Member

It shouldn't restart a service, only re-apply static routes. Keeping sessions after failure is a responsibility of the service in question (IPsec) using options like dead peer detection.

@fichtner
Member

@rfc4711 doesn't sound like the same problem. Might be better to discuss your use case in the forum first.

@HamburgerJungeJr
Author

After a few days of testing I think there is still an issue.

Just now I tried to get a new WAN IP, which didn't change; I think I have to reboot the modem to get a new IP address from my provider. But that should not be the issue.

The problem is that after an IP renew the VPN gets reconnected and the route is applied again, but I can ping the remote network only from my machine. It is not possible to ping the remote network from the OPNsense itself. Therefore Unbound can't resolve the DNS names forwarded to the remote DNS server.

I checked the routes from the opnsense shell with netstat -r and the route is configured.

@fichtner
Member

@HamburgerJungeJr I'm not sure what the issue is, but since this ticket only deals with routing and the route is there, it's difficult to extrapolate what could be wrong now.

@AdSchellevis
Member

@HamburgerJungeJr is your remote endpoint a hostname? and if so, can you check if the issue disappears when choosing a static ip address.

@HamburgerJungeJr
Author

@HamburgerJungeJr is your remote endpoint a hostname? and if so, can you check if the issue disappears when choosing a static ip address.

The remote IPsec endpoint has a static IP; the local IPsec endpoint has a dynamic IP address with a DynDNS hostname.

@AdSchellevis
Member

I would expect there are some messages in the system log in that case about the removal of the ipsec vti device (which also holds ownership of the routes).

AdSchellevis added a commit that referenced this issue Oct 28, 2021
When local or remote isn't set to an ip address every configure will start removing the current device (and thus routes), although hostnames will likely always be a bit wacky (needs resolving, might change in which case the underlying components likely miss the event). It's probably still a good idea to resolve when no addresses are used before concluding a device has changed.

In the process change ipsec_resolve() to support both IPv4 and IPv6, but to limit risk, keep callers at IPv4 (which was the old behaviour), since it's not entirely sure we can use the phase 1 protocol for the tunnel itself as well.
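
The idea in this commit message, resolving a hostname before deciding that the tunnel device has changed, can be illustrated roughly as follows. This is a sketch under assumptions and not the PHP code from the commit: `is_ipv4`, `resolve_ipv4` and `endpoint_changed` are hypothetical helper names, IPv4 only (mirroring the "keep callers at IPv4" note), and `getent hosts` stands in for whatever resolver the real code uses.

```shell
#!/bin/sh
# is_ipv4 STR: succeed if STR looks like a dotted-quad address.
# Octet ranges are deliberately not validated in this sketch.
is_ipv4() {
    printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

# resolve_ipv4 NAME: print NAME unchanged if it is already an address,
# otherwise print the first IPv4 address the system resolver returns.
# Prints nothing if the name cannot be resolved.
resolve_ipv4() {
    if is_ipv4 "$1"; then
        printf '%s\n' "$1"
    else
        getent hosts "$1" | awk '$1 ~ /^[0-9.]+$/ { print $1; exit }'
    fi
}

# endpoint_changed CURRENT_IP CONFIGURED
# Succeeds only if CONFIGURED, once resolved, differs from CURRENT_IP.
# A failed lookup counts as "unchanged" so the device (and its routes)
# is not torn down; that is a design choice of this sketch.
endpoint_changed() {
    resolved=$(resolve_ipv4 "$2")
    [ -n "$resolved" ] && [ "$resolved" != "$1" ]
}
```

Comparing the raw configured string against the device address would always report a change when a hostname is configured, which matches the symptom described above: every reconfigure removed the device and with it the routes.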
@AdSchellevis
Member

might be 2202b02

@AdSchellevis
Member

One of our customers with the same issue reports back that opnsense-patch 2202b02 fixes it on his end (after applying 35992e7 as well to avoid patch issues). We can probably drop 35992e7, since it's only trying to fight the effect and not the cause (could have been a nice work around, but shouldn't be needed).
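
For anyone reproducing that order on top of 21.7.3, a small wrapper along these lines applies the commits one at a time and stops at the first failure. It is only a sketch: `apply_in_order` and the `PATCH_CMD` override are hypothetical conveniences added here for dry runs, and whether these commits still apply cleanly on later releases is not confirmed in this thread.

```shell
#!/bin/sh
# PATCH_CMD is a hypothetical override for dry runs (e.g. PATCH_CMD=echo);
# left unset, the real opnsense-patch tool is used.
PATCH_CMD="${PATCH_CMD:-opnsense-patch}"

# apply_in_order COMMIT...
# Applies each commit with a separate invocation, in the order given, and
# returns non-zero as soon as one of them fails.
apply_in_order() {
    for commit in "$@"; do
        $PATCH_CMD "$commit" || {
            echo "patch failed on $commit, stopping" >&2
            return 1
        }
    done
}

# Order reported above (commented out so the sketch has no side effects):
#   apply_in_order 35992e7 2202b02
```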

@HamburgerJungeJr
Author

I just checked the patch. Now the OPNsense itself can ping a remote host after a new WAN IP was issued.

Beforehand I updated to 21.7.4, so I can't tell when the issue occurred, but now I have four new unassigned interfaces in the interface overview:
2021-10-29_11-15

The first two are marked as up; the third and fourth are marked as down.

Also I can't see them directly under Interfaces or under Assignment.

@AdSchellevis
Member

Showing unassigned hosts has no relation to this patch; it's just an improvement in the latest version so you can easily inspect the status of these interfaces (which is particularly practical for lag interfaces and their children). So, sounds like case closed then?

@HamburgerJungeJr
Author

Thanks for the clarification.

From my point of view the issue seems to be fixed, so it can be closed.

@AdSchellevis
Member

thanks for the feedback, let's close this then. We'll discuss next week what to do with 35992e7

@fichtner
Member

@AdSchellevis if it's not needed it should be removed of course

@AdSchellevis
Member

@fichtner let's check next week or so and weigh the pros and cons, there's no rush.

fichtner pushed a commit that referenced this issue Nov 24, 2021
When local or remote isn't set to an ip address every configure will start removing the current device (and thus routes), although hostnames will likely always be a bit wacky (needs resolving, might change in which case the underlying components likely miss the event). It's probably still a good idea to resolve when no addresses are used before concluding a device has changed.

In the process change ipsec_resolve() to support both IPv4 and IPv6, but to limit risk, keep callers at IPv4 (which was the old behaviour), since it's not entirely sure we can use the phase 1 protocol for the tunnel itself as well.

(cherry picked from commit 2202b02)
(cherry picked from commit 27d30a7)
(cherry picked from commit 35992e7)