Regression: Router Discovery broken after ifup lan #135

Closed
aertsk opened this issue Jul 10, 2019 · 1 comment

aertsk commented Jul 10, 2019

Description

The following commit:
c6dae8e#diff-25d902c24283ab8cfbac54dfa101ad31

introduces an issue in odhcpd: after 'ifup lan', Router Discovery is broken until odhcpd is restarted. Periodic Router Advertisements are still sent out.

The following describes the reproduction on a TP-Link 1750:

rdisc6, part of the ndisc6 package (apt install ndisc6), is used to trigger a Router Solicitation from a host toward the gateway.
host# rdisc6 -1 eth0

REPRODUCTION CONFIG

The following config was used on a TP-Link 1750:
The config introduces a second bridge 'lan2' as the test bridge, next to the existing 'lan' bridge that is used to control the gateway.
odhcpd runs on top of lan2 to handle Router Solicitations/Advertisements.

root@OpenWrt:~# cat /etc/config/network 

config interface 'loopback'
	option ifname 'lo'
	option proto 'static'
	option ipaddr '127.0.0.1'
	option netmask '255.0.0.0'

config globals 'globals'
	option ula_prefix 'fd34:c595:be3b::/48'

config interface 'lan'
	option type 'bridge'
	option ifname 'eth1.1'
	option proto 'static'
	option ipaddr '192.168.1.1'
	option netmask '255.255.255.0'
	option ip6assign '60'

config interface 'wan'
	option ifname 'eth0.2'
	option proto 'dhcp'

config interface 'wan6'
	option ifname 'eth0.2'
	option proto 'dhcpv6'

config switch
	option name 'switch0'
	option reset '1'
	option enable_vlan '1'

config switch_vlan
	option device 'switch0'
	option vlan '1'
	option ports '2 3 0t'

config switch_vlan
	option device 'switch0'
	option vlan '2'
	option ports '1 6t'

config switch_vlan
	option device 'switch0'
	option vlan '3'
	option ports '4 5 0t'

config interface 'lan2'
	option type 'bridge'
	option ifname 'eth1.3'
	option proto 'static'
	option netmask '255.255.255.0'
	option ipaddr '192.168.2.1'
	option ip6assign '60'

root@OpenWrt:~# 

root@OpenWrt:~# cat /etc/config/dhcp 

config dnsmasq
	option domainneeded '1'
	option boguspriv '1'
	option filterwin2k '0'
	option localise_queries '1'
	option rebind_protection '1'
	option rebind_localhost '1'
	option local '/lan/'
	option domain 'lan'
	option expandhosts '1'
	option nonegcache '0'
	option authoritative '1'
	option readethers '1'
	option leasefile '/tmp/dhcp.leases'
	option resolvfile '/tmp/resolv.conf.auto'
	option nonwildcard '1'
	option localservice '1'

config dhcp 'lan'
	option interface 'lan'
	option start '100'
	option limit '150'
	option leasetime '12h'
	option dhcpv6 'server'
	option ra 'server'

config dhcp 'lan2'
	option interface 'lan2'
	option start '100'
	option limit '150'
	option leasetime '12h'
	option dhcpv6 'server'
	option ra 'server'

config dhcp 'wan'
	option interface 'wan'
	option ignore '1'

config odhcpd 'odhcpd'
	option maindhcp '0'
	option leasefile '/tmp/hosts/odhcpd'
	option leasetrigger '/usr/sbin/odhcpd-update'
	option loglevel '7'

root@OpenWrt:~# 

STEPS TO REPRODUCE

STEP 1:

Apply this config, reboot and connect a test host to switch port 3 or 4

STEP 2:

Make sure that Router Solicitation works at this point on lan2

root@testuser-Latitude-E5410:/usr/localdisk/openwrt/bin/targets/ar71xx/generic# rdisc6 -1 eth0
Soliciting ff02::2 (ff02::2) on eth0...

Hop limit                 :           64 (      0x40)
Stateful address conf.    :          Yes
Stateful other conf.      :          Yes
Mobile home agent         :           No
Router preference         :       medium
Neighbor discovery proxy  :           No
Router lifetime           :            0 (0x00000000) seconds
Reachable time            :  unspecified (0x00000000)
Retransmit time           :  unspecified (0x00000000)
 Source link-layer address: EC:08:6B:27:45:A0
 MTU                      :         1500 bytes (valid)
 Prefix                   : fd34:c595:be3b:10::/64
  On-link                 :          Yes
  Autonomous address conf.:          Yes
  Valid time              :     infinite (0xffffffff)
  Pref. time              :     infinite (0xffffffff)
 Route                    : fd34:c595:be3b::/48
  Route preference        :       medium
  Route lifetime          :         1800 (0x00000708) seconds
 Recursive DNS server     : fd34:c595:be3b:10::1
  DNS server lifetime     :         1800 (0x00000708) seconds
root@testuser-Latitude-E5410:/usr/localdisk/openwrt/bin/targets/ar71xx/generic#

STEP 3:

Execute ifup lan2 on the target

┌──[aertsk@cplx1015]──[/home/users/aertsk/Downloads]──────────                                                                             ──────[13:40:29]──[0.19]──────
$ ssh root@192.168.1.1


BusyBox v1.31.0 () built-in shell (ash)

  _______                     ________        __
 |       |.-----.-----.-----.|  |  |  |.----.|  |_
 |   -   ||  _  |  -__|     ||  |  |  ||   _||   _|
 |_______||   __|_____|__|__||________||__|  |____|
          |__| W I R E L E S S   F R E E D O M
 -----------------------------------------------------
 OpenWrt SNAPSHOT, r10444-5c094ff660
 -----------------------------------------------------
=== WARNING! =====================================
There is no root password defined on this device!
Use the "passwd" command to set up a new password
in order to prevent unauthorized SSH logins.
--------------------------------------------------
root@OpenWrt:~# 
root@OpenWrt:~# 
root@OpenWrt:~# ifup lan2
root@OpenWrt:~# 

STEP 4:

Trigger another Router Solicitation toward the gateway; it should fail at this point

root@testuser-Latitude-E5410:/usr/localdisk/openwrt/bin/targets/ar71xx/generic# rdisc6 -1 eth0
Soliciting ff02::2 (ff02::2) on eth0...
Timed out.
Timed out.
Timed out.
No response.
root@testuser-Latitude-E5410:/usr/localdisk/openwrt/bin/targets/ar71xx/generic#
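When repeating STEP 2 and STEP 4 in a script, the two rdisc6 outcomes shown above can be told apart from the captured text. The following is only a minimal sketch: the helper name `ra_received` is hypothetical, and it relies solely on the strings visible in the transcripts in this report ("Hop limit" and "No response."), not on any documented rdisc6 output contract.

```python
def ra_received(rdisc6_output: str) -> bool:
    """Guess from captured rdisc6 text whether a Router Advertisement came back.

    Assumption: the output looks like the transcripts in this report.
    """
    lines = [line.strip() for line in rdisc6_output.splitlines() if line.strip()]
    # A failed run ends with "No response." after the timeouts.
    if any(line == "No response." for line in lines):
        return False
    # A successful run prints the RA fields, starting with the hop limit.
    return any(line.startswith("Hop limit") for line in lines)


# Shortened copies of the two outputs shown in this report.
WORKING = """\
Soliciting ff02::2 (ff02::2) on eth0...

Hop limit                 :           64 (      0x40)
Stateful address conf.    :          Yes
"""

BROKEN = """\
Soliciting ff02::2 (ff02::2) on eth0...
Timed out.
Timed out.
Timed out.
No response.
"""

if __name__ == "__main__":
    print("before ifup lan2:", ra_received(WORKING))
    print("after ifup lan2: ", ra_received(BROKEN))
```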

Internally, the Recv-Q of the raw socket keeps increasing until odhcpd is restarted manually.

root@OpenWrt:~# netstat -lnp | grep odhcpd
udp        0      0 :::547                  :::*                                2508/odhcpd
udp        0      0 :::547                  :::*                                2508/odhcpd
raw     3968      0 ::%2143929572:58        ::%4381825:*            58          2508/odhcpd
raw        0      0 ::%2143929572:58        ::%4381825:*            58          2508/odhcpd
root@OpenWrt:~#
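The Recv-Q check above can also be automated. The sketch below parses text in the column layout of the `netstat -lnp` output shown here and reports raw sockets owned by odhcpd that have a non-zero Recv-Q. The names `RawSocket` and `stalled_raw_sockets` are hypothetical, and the column positions are an assumption taken from this one transcript.

```python
from typing import List, NamedTuple


class RawSocket(NamedTuple):
    recv_q: int
    send_q: int
    owner: str  # last column, "PID/program", as printed by netstat


def stalled_raw_sockets(netstat_output: str, program: str = "odhcpd") -> List[RawSocket]:
    """Return raw sockets owned by `program` whose Recv-Q is non-zero."""
    stalled = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: proto, Recv-Q, Send-Q, local, foreign, state, PID/program
        if len(fields) < 6 or not fields[0].startswith("raw"):
            continue
        if not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        owner = fields[-1]
        if not owner.endswith("/" + program):
            continue
        recv_q, send_q = int(fields[1]), int(fields[2])
        if recv_q > 0:
            stalled.append(RawSocket(recv_q, send_q, owner))
    return stalled


# The netstat lines from this report.
SAMPLE = """\
udp        0      0 :::547                  :::*                                2508/odhcpd
udp        0      0 :::547                  :::*                                2508/odhcpd
raw     3968      0 ::%2143929572:58        ::%4381825:*            58          2508/odhcpd
raw        0      0 ::%2143929572:58        ::%4381825:*            58          2508/odhcpd
"""

if __name__ == "__main__":
    for sock in stalled_raw_sockets(SAMPLE):
        print(f"{sock.owner}: Recv-Q={sock.recv_q}")
```

Running the check twice a few seconds apart and comparing the values would show whether the queue is still growing, which is the symptom described above.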

RECOVER THE GATEWAY:

root@OpenWrt:~# /etc/init.d/odhcpd restart
dedeckeh added a commit that referenced this issue Aug 8, 2019
In case setting one of the socket options fails; make sure the raw
socket is removed from the uloop file descriptor list before the
socket is closed.
In case this is not done and a new raw socket is created with the
same fd value odhcpd will not be triggered by uloop in case RS messages
are received on the socket as reported in #135

Signed-off-by: Hans Dedecker <dedeckeh@gmail.com>
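The failure mode described in this commit message can be illustrated with a toy model. This is not uloop's or odhcpd's actual code: `ToyLoop`, `Handle`, and `simulate` are hypothetical names, and the model assumes that a handle still marked as registered is never (re)added to the polled set, while closing an fd silently drops it from that set.

```python
class Handle:
    """Per-socket bookkeeping kept by the application (toy model)."""

    def __init__(self, fd):
        self.fd = fd
        self.registered = False


class ToyLoop:
    """Toy stand-in for an event loop; NOT uloop's real implementation."""

    def __init__(self):
        self.watched = {}  # fd number -> callback: what is actually polled

    def add(self, handle, callback):
        # Modelling assumption: a handle that still claims to be
        # registered is not added to the polled set again.
        if handle.registered:
            return False
        self.watched[handle.fd] = callback
        handle.registered = True
        return True

    def delete(self, handle):
        self.watched.pop(handle.fd, None)
        handle.registered = False

    def kernel_close(self, fd):
        # Closing an fd drops it from the polled set; the handle's
        # `registered` flag is left untouched.
        self.watched.pop(fd, None)

    def deliver(self, fd):
        callback = self.watched.get(fd)
        if callback is None:
            return False  # nobody is woken up; data stays queued
        callback()
        return True


def simulate(delete_before_close: bool) -> bool:
    """Return True if an RS on the re-created socket reaches the handler."""
    loop = ToyLoop()
    handled = []
    sock = Handle(fd=7)
    loop.add(sock, lambda: handled.append("RS"))

    # Error path: the socket is torn down.
    if delete_before_close:
        loop.delete(sock)
    loop.kernel_close(sock.fd)

    # Retry: the new socket happens to get the same fd number.
    sock.fd = 7
    loop.add(sock, lambda: handled.append("RS"))

    loop.deliver(7)
    return bool(handled)


if __name__ == "__main__":
    print("close without delete:", simulate(delete_before_close=False))
    print("delete, then close:  ", simulate(delete_before_close=True))
```

In this model, skipping the delete leaves the stale `registered` flag in place, so the re-created socket is never polled and incoming solicitations only pile up, which matches the growing Recv-Q reported in this issue.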

dedeckeh commented Aug 8, 2019

Fixed in commit https://git.openwrt.org/?p=project/odhcpd.git;a=commit;h=000182fe4f94a5a6ec139456a2b74f0cdea13b9c . Thank you for the detailed report and reproduction scenario.

@dedeckeh dedeckeh closed this as completed Aug 8, 2019
aertsk added a commit to aertsk/odhcpd that referenced this issue Aug 13, 2019
Make sure the socket is closed in a case where the bridge goes down
as a result of NO-CARRIER on the bridge. If not present Router Discovery and Router Advertisement will break permanently after the bridge went down.

Related to  openwrt#135

Signed-off-by: Koen Aerts <aertskoen5@gmail.com>
aertsk added a commit to aertsk/odhcpd that referenced this issue Aug 14, 2019
router: close socket upon NETEV_IFINDEX_CHANGE (squashed)

    Make sure the socket is closed in a case where the bridge goes down
    as a result of NO-CARRIER on the bridge. If not present Router Discovery and Router Advertisement will break permanently after the bridge went down.

    Related to  openwrt#135

    Signed-off-by: Koen Aerts <aertskoen5@gmail.com>
dedeckeh pushed a commit that referenced this issue Aug 16, 2019
Make sure the socket is closed in a case where the bridge goes down
as a result of NO-CARRIER on the bridge.
If not present Router Discovery and Router Advertisement will break
permanently after the bridge went down.

Related to  #135

Signed-off-by: Koen Aerts <aertskoen5@gmail.com>
aertsk added a commit to aertsk/odhcpd that referenced this issue Aug 19, 2019
make sure the raw socket is removed from the uloop file descriptor list before the
socket is closed. As introduced in openwrt@000182f

Related to  openwrt#135

Signed-off-by: Koen Aerts <aertskoen5@gmail.com>
dedeckeh pushed a commit that referenced this issue Aug 19, 2019
Make sure the raw socket is removed from the uloop file descriptor
list before the socket is closed as introduced in
000182f

Related to  #135

Signed-off-by: Koen Aerts <aertskoen5@gmail.com>