🐛 BUG: host won't reconnect to lighthouse successfully behind an IP-changed NAT #889

Open
JimLee1996 opened this issue Jun 3, 2023 · 8 comments
Labels
NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.

Comments

@JimLee1996

What version of nebula are you using?

1.7.2

What operating system are you using?

Linux

Describe the Bug

  1. Lighthouse: has a static public IP
  2. Host: behind a NAT whose public IP may change
  3. The bug happens exactly when the NAT's public IP changes: the nebula host's reconnection to the lighthouse fails

Maybe there should be a retry counter or similar mechanism that reloads the service when handshake attempts keep failing.
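
For example, a rough watchdog sketch that could be run periodically from outside nebula (the lighthouse nebula IP, thresholds, and service name below are placeholders for my setup, not anything built into nebula):

#!/bin/sh
# Watchdog sketch: restart nebula if the lighthouse stops answering over the tunnel.
# Assumptions: the lighthouse's nebula IP is 192.168.100.1 and the host runs nebula.service.
LIGHTHOUSE_VPN_IP=192.168.100.1
MAX_FAILS=3
fails=0

while [ "$fails" -lt "$MAX_FAILS" ]; do
    if ping -c 1 -W 2 "$LIGHTHOUSE_VPN_IP" >/dev/null 2>&1; then
        exit 0   # tunnel is healthy, nothing to do
    fi
    fails=$((fails + 1))
    sleep 10
done

echo "lighthouse unreachable after $MAX_FAILS checks, restarting nebula"
systemctl restart nebula.service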

Logs from affected hosts

Jun 03 08:17:10 N1 nebula[279798]: level=info msg="Handshake message sent" handshake="map[stage:1 style:ix_psk0]" initiatorIndex=2265257244 localIndex=2265257244 remoteIndex=0 udpAddrs="[*.*.*.*:4242]" vpnIp=192.168.100.1
Jun 03 08:17:17 N1 nebula[279798]: level=info msg="Handshake timed out" durationNs=6891732272 handshake="map[stage:1 style:ix_psk0]" initiatorIndex=2265257244 localIndex=2265257244 remoteIndex=0 udpAddrs="[*.*.*.*:4242]" vpnIp=192.168.100.1
Jun 03 08:18:10 N1 nebula[279798]: level=info msg="Handshake message sent" handshake="map[stage:1 style:ix_psk0]" initiatorIndex=789404569 localIndex=789404569 remoteIndex=0 udpAddrs="[*.*.*.*:4242]" vpnIp=192.168.100.1
Jun 03 08:18:17 N1 nebula[279798]: level=info msg="Handshake timed out" durationNs=6688333707 handshake="map[stage:1 style:ix_psk0]" initiatorIndex=789404569 localIndex=789404569 remoteIndex=0 udpAddrs="[*.*.*.*:4242]" vpnIp=192.168.100.1

# ============================
# a lot of same logs here
# ============================

Jun 03 09:32:52 N1 nebula[279798]: level=info msg="Handshake message sent" handshake="map[stage:1 style:ix_psk0]" initiatorIndex=215861812 localIndex=215861812 remoteIndex=0 udpAddrs="[*.*.*.*:4242]" vpnIp=192.168.100.1
Jun 03 09:32:59 N1 nebula[279798]: level=info msg="Handshake timed out" durationNs=6664379861 handshake="map[stage:1 style:ix_psk0]" initiatorIndex=215861812 localIndex=215861812 remoteIndex=0 udpAddrs="[*.*.*.*:4242]" vpnIp=192.168.100.1
Jun 03 09:32:59 N1 nebula[279798]: level=info msg="Handshake timed out" durationNs=6962423141 handshake="map[stage:1 style:ix_psk0]" initiatorIndex=2959928072 localIndex=2959928072 remoteIndex=0 udpAddrs="[]" vpnIp=192.168.100.10


# ============================
# manually restart nebula host
# ============================

Jun 03 09:33:00 N1 nebula[279798]: level=info msg="Caught signal, shutting down" signal=terminated
Jun 03 09:33:00 N1 nebula[279798]: level=info msg=Goodbye
Jun 03 09:33:00 N1 systemd[1]: Stopping Nebula overlay networking tool...
Jun 03 09:33:00 N1 systemd[1]: nebula.service: Succeeded.
Jun 03 09:33:00 N1 systemd[1]: Stopped Nebula overlay networking tool.
Jun 03 09:33:00 N1 systemd[1]: nebula.service: Consumed 30min 44.211s CPU time.
Jun 03 09:33:00 N1 systemd[1]: Started Nebula overlay networking tool.
Jun 03 09:33:00 N1 nebula[299280]: level=info msg="Firewall rule added" firewallRule="map[caName: caSha: direction:outgoing endPort:0 groups:[] host:any ip: localIp: proto:0 startPort:0]"
Jun 03 09:33:00 N1 nebula[299280]: level=info msg="Firewall rule added" firewallRule="map[caName: caSha: direction:incoming endPort:0 groups:[] host:any ip: localIp: proto:0 startPort:0]"
Jun 03 09:33:00 N1 nebula[299280]: level=info msg="Firewall started" firewallHash=498215dec4e5687a2353f51c10838c113bd1af35ef72b8e8c9f536986ada5417
Jun 03 09:33:00 N1 nebula[299280]: level=info msg="Main HostMap created" network=192.168.100.2/24 preferredRanges="[]"
Jun 03 09:33:00 N1 nebula[299280]: level=info msg="punchy enabled"
Jun 03 09:33:00 N1 nebula[299280]: level=info msg="Loaded send_recv_error config" sendRecvError=always
Jun 03 09:33:00 N1 nebula[299280]: level=info msg="Nebula interface is active" boringcrypto=false build=1.7.2 interface=tun0 network=192.168.100.2/24 udpAddr="0.0.0.0:44710"
Jun 03 09:33:00 N1 nebula[299280]: level=info msg="DNS results changed for host list" newSet="map[*.*.*.*:4242:{}]" origSet="&map[]"

# ============================
# now it's back to normal
# ============================

Jun 03 09:33:00 N1 nebula[299280]: level=info msg="Handshake message sent" handshake="map[stage:1 style:ix_psk0]" initiatorIndex=1175697647 localIndex=1175697647 remoteIndex=0 udpAddrs="[*.*.*.*:4242]" vpnIp=192.168.100.1
Jun 03 09:33:00 N1 nebula[299280]: level=info msg="Handshake message received" certName=ICL durationNs=327741706 fingerprint=a01937d6e07d050ba2cfc91fd2f56ec3f008b33690b7931f3a5bfe99f835f67a handshake="map[stage:2 style:ix_psk0]" initiatorIndex=1175697647 issuer=33768094d6855b7ca53962932dd41ce99b11347d220ff89a33d1f01f0f5ab578 remoteIndex=1175697647 responderIndex=3925596160 sentCachedPackets=1 udpAddr="*.*.*.*:4242" vpnIp=192.168.100.1
Jun 03 09:33:03 N1 nebula[299280]: level=info msg="Handshake message received" certName=Macbook fingerprint=bd3d7b77768b32aa25b5ce82c2cc67a4620b78aaf1ed95999c3e93016c8795f5 handshake="map[stage:1 style:ix_psk0]" initiatorIndex=2451724569 issuer=33768094d6855b7ca53962932dd41ce99b11347d220ff89a33d1f01f0f5ab578 remoteIndex=0 responderIndex=0 udpAddr="192.168.123.10:61939" vpnIp=192.168.100.10
Jun 03 09:33:03 N1 nebula[299280]: level=info msg="Handshake message sent" certName=Macbook fingerprint=bd3d7b77768b32aa25b5ce82c2cc67a4620b78aaf1ed95999c3e93016c8795f5 handshake="map[stage:2 style:ix_psk0]" initiatorIndex=2451724569 issuer=33768094d6855b7ca53962932dd41ce99b11347d220ff89a33d1f01f0f5ab578 remoteIndex=0 responderIndex=3070221539 sentCachedPackets=0 udpAddr="192.168.123.10:61939" vpnIp=192.168.100.10
Jun 03 09:33:03 N1 nebula[299280]: level=info msg="Handshake message received" certName=Macbook fingerprint=bd3d7b77768b32aa25b5ce82c2cc67a4620b78aaf1ed95999c3e93016c8795f5 handshake="map[stage:1 style:ix_psk0]" initiatorIndex=2451724569 issuer=33768094d6855b7ca53962932dd41ce99b11347d220ff89a33d1f01f0f5ab578 remoteIndex=0 responderIndex=0 udpAddr="172.16.16.10:61939" vpnIp=192.168.100.10
Jun 03 09:33:03 N1 nebula[299280]: level=info msg="Handshake message sent" cached=true handshake="map[stage:2 style:ix_psk0]" udpAddr="172.16.16.10:61939" vpnIp=192.168.100.10
issuer=33768094d6855b7ca53962932dd41ce99b11347d220ff89a33d1f01f0f5ab578 remoteIndex=0 responderIndex=0 udpAddr="172.16.16.11:56979" vpnIp=192.168.100.11
issuer=33768094d6855b7ca53962932dd41ce99b11347d220ff89a33d1f01f0f5ab578 remoteIndex=0 responderIndex=0 udpAddr="192.168.123.11:56979" vpnIp=192.168.100.11
Jun 03 09:40:23 N1 nebula[299280]: level=info msg="Handshake message sent" cached=true handshake="map[stage:2 style:ix_psk0]" udpAddr="192.168.123.11:56979" vpnIp=192.168.100.11

Config files from affected hosts

pki:
  ca: /root/bin/nebula/cert/ca.crt
  cert: /root/bin/nebula/cert/SY.crt
  key: /root/bin/nebula/cert/SY.key

static_host_map:
  "192.168.100.1": ["example.com:4242"] # hidden

lighthouse:
  am_lighthouse: false
  interval: 60
  hosts:
    - "192.168.100.1"

listen:
  host: 0.0.0.0
  port: 0

punchy:
  punch: true
  respond: true
  delay: 1s
  respond_delay: 5s

cipher: aes

tun:
  disabled: false
  tx_queue: 500
  mtu: 1300

  # Unsafe routes allows you to route traffic over nebula to non-nebula nodes
  # Unsafe routes should be avoided unless you have hosts/services that cannot run nebula
  # NOTE: The nebula certificate of the "via" node *MUST* have the "route" defined as a subnet in its certificate
  # `mtu`: will default to tun mtu if this option is not specified
  # `metric`: will default to 0 if this option is not specified
  # `install`: will default to true, controls whether this route is installed in the systems routing table.
  # unsafe_routes:
  #   - route: 192.168.1.0/24
  #     via: 192.168.100.1
  #     mtu: 1300
  #     install: true

logging:
  level: info
  format: text
  disable_timestamp: true

firewall:
  outbound_action: drop
  inbound_action: drop

  conntrack:
    tcp_timeout: 12m
    udp_timeout: 3m
    default_timeout: 10m

  # The firewall is default deny. There is no way to write a deny rule.
  # Rules are comprised of a protocol, port, and one or more of host, group, or CIDR
  # Logical evaluation is roughly: port AND proto AND (ca_sha OR ca_name) AND (host OR group OR groups OR cidr)
  # - port: Takes `0` or `any` as any, a single number `80`, a range `200-901`, or `fragment` to match second and further fragments of fragmented packets (since there is no port available).
  #   code: same as port but makes more sense when talking about ICMP, TODO: this is not currently implemented in a way that works, use `any`
  #   proto: `any`, `tcp`, `udp`, or `icmp`
  #   host: `any` or a literal hostname, ie `test-host`
  #   group: `any` or a literal group name, ie `default-group`
  #   groups: Same as group but accepts a list of values. Multiple values are AND'd together and a certificate would have to contain all groups to pass
  #   cidr: a remote CIDR, `0.0.0.0/0` is any.
  #   local_cidr: a local CIDR, `0.0.0.0/0` is any. This could be used to filter destinations when using unsafe_routes.
  #   ca_name: An issuing CA name
  #   ca_sha: An issuing CA shasum

  outbound:
    - port: any
      proto: any
      host: any

  inbound:
    - port: any
      proto: any
      host: any
@brad-defined
Collaborator

Can you share the lighthouse logs?

What should happen in this case: when the host's NATed IP address changes, the traffic it sends to the lighthouse carries the new IP address. The lighthouse receives these packets, authenticates them, and updates its discovered IP address for that peer.

That doesn't seem to be happening in your case, so something interesting is going on at your lighthouse. I'm hoping the logs give a clue.

@JimLee1996
Author

JimLee1996 commented Jun 5, 2023

Thanks for your reply.

I've checked the lighthouse logs, and it seems that the host's NATed public IP did not change.

Jun 02 10:33:48 ICL nebula[349]: level=info msg="Handshake message received" certName=SY fingerprint=9976ea029ee4b5a0916f79aefa760154847eac3448af92cb7d314ca66f195cff handshake="map[stage:1 style:ix_psk0]" initiatorIndex=2307497560 issuer=33768094d6855b7ca53962932dd41ce99b11347d220ff89a33d1f01f0f5ab578 remoteIndex=0 responderIndex=0 udpAddr="ip1:36740" vpnIp=192.168.100.2
Jun 02 10:33:48 ICL nebula[349]: level=info msg="Handshake message sent" certName=SY fingerprint=9976ea029ee4b5a0916f79aefa760154847eac3448af92cb7d314ca66f195cff handshake="map[stage:2 style:ix_psk0]" initiatorIndex=2307497560 issuer=33768094d6855b7ca53962932dd41ce99b11347d220ff89a33d1f01f0f5ab578 remoteIndex=0 responderIndex=127293448 sentCachedPackets=0 udpAddr="ip1:36740" vpnIp=192.168.100.2
Jun 03 03:34:44 ICL nebula[349]: level=info msg="Tunnel status" certName=SY localIndex=127293448 remoteIndex=2307497560 tunnelCheck="map[method:active state:dead]" vpnIp=192.168.100.2
Jun 03 09:32:59 ICL nebula[349]: level=info msg="Handshake message received" certName=SY fingerprint=9976ea029ee4b5a0916f79aefa760154847eac3448af92cb7d314ca66f195cff handshake="map[stage:1 style:ix_psk0]" initiatorIndex=1175697647 issuer=33768094d6855b7ca53962932dd41ce99b11347d220ff89a33d1f01f0f5ab578 remoteIndex=0 responderIndex=0 udpAddr="ip1:44710" vpnIp=192.168.100.2
Jun 03 09:32:59 ICL nebula[349]: level=info msg="Handshake message sent" certName=SY fingerprint=9976ea029ee4b5a0916f79aefa760154847eac3448af92cb7d314ca66f195cff handshake="map[stage:2 style:ix_psk0]" initiatorIndex=1175697647 issuer=33768094d6855b7ca53962932dd41ce99b11347d220ff89a33d1f01f0f5ab578 remoteIndex=0 responderIndex=3925596160 sentCachedPackets=0 udpAddr="ip1:44710" vpnIp=192.168.100.2

However, here's another clue: the router (the NAT device) has a cron job that restarts the WAN connection. Its WAN IP happened to stay the same (ip1), but the NAT table must have been flushed.

Jun  3 03:30:00 cron: restart wan
Jun  3 03:30:00 K2P: Perform WAN manual reconnect
Jun  3 03:30:00 pppd[23884]: Terminating on signal 15
Jun  3 03:30:00 pppd[23884]: Connect time 1440.0 minutes.
Jun  3 03:30:00 pppd[23884]: Sent 3035167251 bytes, received 2273066402 bytes.
Jun  3 03:30:00 K2P: WAN down (ppp0)
Jun  3 03:30:00 PPPoE: Disconnected
Jun  3 03:30:00 pppd[23884]: Connection terminated.
Jun  3 03:30:00 pppd[23884]: Sent PADT
Jun  3 03:30:00 pppd[23884]: Exit.
Jun  3 03:30:01 inadyn[23926]: STOP command received, exiting.
Jun  3 03:30:01 kernel: eth3: ===> VirtualIF_close
Jun  3 03:30:01 kernel: eth3: ===> VirtualIF_open
Jun  3 03:30:01 K2P: Hardware NAT/Routing: Disabled
Jun  3 03:30:01 dnsmasq[19617]: read /etc/hosts - 3 addresses
Jun  3 03:30:01 dnsmasq[19617]: read /etc/storage/dnsmasq/hosts - 0 addresses
Jun  3 03:30:01 dnsmasq-dhcp[19617]: read /etc/dnsmasq/dhcp/dhcp-hosts.rc
Jun  3 03:30:01 dnsmasq-dhcp[19617]: read /etc/storage/dnsmasq/dhcp.conf
Jun  3 03:30:01 pppd[27602]: Plugin rp-pppoe.so loaded.
Jun  3 03:30:01 pppd[27602]: RP-PPPoE plugin version 3.12 compiled against pppd 2.4.7
Jun  3 03:30:01 pppd[27603]: pppd 2.4.7 started by admin, uid 0
Jun  3 03:30:01 pppd[27603]: PPP session is 51130 (0xc7ba)
Jun  3 03:30:01 pppd[27603]: Connected to 4c:09:b4:fc:49:10 via interface eth3
Jun  3 03:30:01 pppd[27603]: Using interface ppp0
Jun  3 03:30:01 pppd[27603]: Connect: ppp0 <--> eth3
Jun  3 03:30:01 pppd[27603]: Remote message: 0;User(07199747900) Authenticate OK, Request Accept by hb.cn
Jun  3 03:30:01 pppd[27603]: PAP authentication succeeded
Jun  3 03:30:01 pppd[27603]: peer from calling number 4C:09:B4:FC:49:10 authorized
Jun  3 03:30:01 pppd[27603]: local  IP address ip1
Jun  3 03:30:01 K2P: WAN up (ppp0)

After that, I double-checked the other logs and you are right: when the NATed public IP changed to ip2 on another cron run, it reconnected successfully.

Jun 05 03:31:46 ICL nebula[349]: level=info msg="Handshake message received" certName=SY fingerprint=9976ea029ee4b5a0916f79aefa760154847eac3448af92cb7d314ca66f195cff handshake="map[stage:1 style:ix_psk0]" initiatorIndex=1272481155 issuer=33768094d6855b7ca53962932dd41ce99b11347d220ff89a33d1f01f0f5ab578 remoteIndex=0 responderIndex=0 udpAddr="ip2:44710" vpnIp=192.168.100.2
Jun 05 03:31:46 ICL nebula[349]: level=info msg="Taking new handshake" certName=SY localIndex=3159396583 remoteIndex=575398791 vpnIp=192.168.100.2
Jun 05 03:31:46 ICL nebula[349]: level=info msg="Handshake message sent" certName=SY fingerprint=9976ea029ee4b5a0916f79aefa760154847eac3448af92cb7d314ca66f195cff handshake="map[stage:2 style:ix_psk0]" initiatorIndex=1272481155 issuer=33768094d6855b7ca53962932dd41ce99b11347d220ff89a33d1f01f0f5ab578 remoteIndex=0 responderIndex=4021870838 sentCachedPackets=0 udpAddr="ip2:44710" vpnIp=192.168.100.2
Jun 05 03:31:58 ICL nebula[349]: level=info msg="Tunnel status" certName=SY localIndex=3159396583 remoteIndex=575398791 tunnelCheck="map[method:active state:dead]" vpnIp=192.168.100.2

So the weird bug happens when the host is behind a NAT device that keeps the same public IP but flushes its NAT table.

@brad-defined
Collaborator

In the broken flow, it appears that Nebula is still sending its UDP packets to the lighthouse (from the same NAT IP, but from a different NAT port number), and the lighthouse is responding to the new IP/Port. However, the Nebula client is never receiving those Lighthouse replies.

I'm suspicious that the NAT, after that cron job runs, is no longer routing return packets through to the client. If this is what's happening, it would explain why Nebula is unable to reconnect; the solution would be to fix the NAT device to make sure return traffic is routed correctly. Further debugging steps might include running tcpdump on the Nebula client behind the NAT to verify that return Nebula traffic is not arriving as expected, and then proceeding to debug the NAT device itself.
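
For example, something along these lines (interface names and the lighthouse address are placeholders; the client's source port is ephemeral since listen.port is 0):

# On the nebula client behind the NAT: watch for replies coming back from the lighthouse
tcpdump -ni eth0 udp and host lighthouse_ip and port 4242

# On the lighthouse: confirm the client's packets arrive and replies are sent out
tcpdump -ni eth0 udp port 4242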

@JimLee1996
Author

Thanks a lot! It's very helpful.

@JimLee1996
Author

JimLee1996 commented Jun 6, 2023

I've run tcpdump on both the nebula client and the lighthouse. It shows the nebula client sending packets to the lighthouse from the same source port, while the lighthouse never receives them.

# client tcpdump logs
listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
09:19:38.415469 IP 192.168.2.2.54810 > lighthouse_ip.4242: UDP, length 232
09:19:38.600421 IP 192.168.2.2.54810 > lighthouse_ip.4242: UDP, length 232
09:19:38.901614 IP 192.168.2.2.54810 > lighthouse_ip.4242: UDP, length 232
09:19:39.301481 IP 192.168.2.2.54810 > lighthouse_ip.4242: UDP, length 232
09:19:39.901343 IP 192.168.2.2.54810 > lighthouse_ip.4242: UDP, length 232
09:19:40.500644 IP 192.168.2.2.54810 > lighthouse_ip.4242: UDP, length 232
09:19:41.201538 IP 192.168.2.2.54810 > lighthouse_ip.4242: UDP, length 232
09:19:42.001439 IP 192.168.2.2.54810 > lighthouse_ip.4242: UDP, length 232
09:19:42.901285 IP 192.168.2.2.54810 > lighthouse_ip.4242: UDP, length 232
09:19:43.901181 IP 192.168.2.2.54810 > lighthouse_ip.4242: UDP, length 232

# lighthouse tcpdump logs
# nothing received

After restarting the nebula client, things are back to normal.

# client tcpdump logs
10:45:27.457736 IP 192.168.2.2.51713 > lighthouse_ip.4242: UDP, length 232
10:45:27.483897 IP lighthouse_ip.4242 > 192.168.2.2.51713: UDP, length 275
10:45:27.489602 IP 192.168.2.2.51713 > lighthouse_ip.4242: UDP, length 66
10:45:27.514848 IP lighthouse_ip.4242 > 192.168.2.2.51713: UDP, length 42
10:45:38.658354 IP 192.168.2.2.51713 > lighthouse_ip.4242: UDP, length 1
10:45:44.158125 IP 192.168.2.2.51713 > lighthouse_ip.4242: UDP, length 1
10:45:49.657637 IP 192.168.2.2.51713 > lighthouse_ip.4242: UDP, length 1
10:45:55.158556 IP 192.168.2.2.51713 > lighthouse_ip.4242: UDP, length 1
10:46:01.157115 IP 192.168.2.2.51713 > lighthouse_ip.4242: UDP, length 1
10:46:07.157549 IP 192.168.2.2.51713 > lighthouse_ip.4242: UDP, length 1
10:46:12.658183 IP 192.168.2.2.51713 > lighthouse_ip.4242: UDP, length 1
10:46:18.157539 IP 192.168.2.2.51713 > lighthouse_ip.4242: UDP, length 1
10:46:23.657866 IP 192.168.2.2.51713 > lighthouse_ip.4242: UDP, length 1
10:46:27.158744 IP 192.168.2.2.51713 > lighthouse_ip.4242: UDP, length 66
10:46:27.183994 IP lighthouse_ip.4242 > 192.168.2.2.51713: UDP, length 42
10:46:35.157989 IP 192.168.2.2.51713 > lighthouse_ip.4242: UDP, length 1
10:46:40.657130 IP 192.168.2.2.51713 > lighthouse_ip.4242: UDP, length 1

# lighthouse tcpdump logs
10:45:25.790125 IP client_wanip.51713 > lighthouse_ip.4242: UDP, length 232
10:45:25.790881 IP lighthouse_ip.4242 > client_wanip.51713: UDP, length 275
10:45:25.821700 IP client_wanip.51713 > lighthouse_ip.4242: UDP, length 66
10:45:25.821880 IP lighthouse_ip.4242 > client_wanip.51713: UDP, length 42
10:45:36.979362 IP client_wanip.51713 > lighthouse_ip.4242: UDP, length 1
10:45:42.473716 IP client_wanip.51713 > lighthouse_ip.4242: UDP, length 1
10:45:47.967715 IP client_wanip.51713 > lighthouse_ip.4242: UDP, length 1
10:45:53.463544 IP client_wanip.51713 > lighthouse_ip.4242: UDP, length 1
10:45:59.455749 IP client_wanip.51713 > lighthouse_ip.4242: UDP, length 1
10:46:05.450467 IP client_wanip.51713 > lighthouse_ip.4242: UDP, length 1
10:46:10.946000 IP client_wanip.51713 > lighthouse_ip.4242: UDP, length 1
10:46:16.439597 IP client_wanip.51713 > lighthouse_ip.4242: UDP, length 1
10:46:21.934627 IP client_wanip.51713 > lighthouse_ip.4242: UDP, length 1
10:46:25.432274 IP client_wanip.51713 > lighthouse_ip.4242: UDP, length 66
10:46:25.432437 IP lighthouse_ip.4242 > client_wanip.51713: UDP, length 42
10:46:33.423356 IP client_wanip.51713 > lighthouse_ip.4242: UDP, length 1
10:46:38.916960 IP client_wanip.51713 > lighthouse_ip.4242: UDP, length 1

It's true that a problematic NAT device is what causes this issue. Still, it may be better for nebula to handle such corner cases, since iptables NAT (as in my case) is commonly used.
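
For anyone else debugging this: if the NAT box is Linux with iptables/netfilter and has the conntrack-tools package available (as mine does), the UDP mapping for the nebula flow can be inspected before and after the WAN-restart cron job to confirm the flush, e.g.:

# On the NAT/router: list conntrack entries for UDP traffic using the lighthouse port
# (the nebula flow's entry disappearing here would confirm the NAT table was flushed)
conntrack -L -p udp 2>/dev/null | grep 4242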

@JimLee1996
Author

@brad-defined Hi, sorry to bother you again. I've noticed that n2n (a legacy P2P VPN similar to nebula) handles this complex NAT situation. I think it is a necessary feature.
https://github.com/ntop/n2n/blob/dev/src/edge_utils.c#L1615

@johnmaguire johnmaguire added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Jul 19, 2023
@johnmaguire
Collaborator

Possible dupe of #515?

@JimLee1996
Author

JimLee1996 commented Jul 29, 2023

Thank you @johnmaguire.

It seems a little different: in that scenario, the client's NATed external IP changed and the issue happened randomly.

In my scenario, the issue happens even though the IP does not change, and everything comes back to normal after restarting nebula. I believe the problem is on the incoming direction of the NAT, which can no longer forward packets to the nebula client. (Maybe it loses part of its table after the restart and doesn't fully rebuild it when the WAN IP hasn't changed?)

However, I can't control this firewall/NAT, so having nebula reload or re-establish its connection on its own would be very helpful.
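
Until then, I'll probably just schedule the watchdog sketch from my earlier comment with a transient systemd timer (the unit name, script path, and interval below are my own choices, not anything nebula provides):

# Run the health-check/restart script every 5 minutes via a transient timer unit
systemd-run --on-calendar='*:0/5' --unit=nebula-watchdog /root/bin/nebula/watchdog.sh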
