Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disconnections still exist on SNAPSHOT with latest fixes. Also internet loss. MT7621|MT7603 #391

Open
Openwrtfunboy opened this issue Apr 10, 2020 · 15 comments

Comments

@Openwrtfunboy
Copy link

Openwrtfunboy commented Apr 10, 2020

Newifi3 D2

SNAPSHOT from 8 APR.

OpenWrt SNAPSHOT r12861-d8bde3687a / LuCI Master git-20.099.75691-88ec332

Disconnections still exists.

I sat next to the router and watched the live broadcast from YouTube.

Fri Apr 10 04:40:30 2020 daemon.notice hostapd: wlan0: AP-STA-DISCONNECTED 00:0a:f5:9e:17:84
Fri Apr 10 04:40:31 2020 daemon.notice hostapd: wlan0: STA 00:0a:f5:9e:17:84 IEEE 802.11: did not acknowledge authentication response
Fri Apr 10 04:40:41 2020 daemon.info hostapd: wlan0: STA 00:0a:f5:9e:17:84 IEEE 802.11: authenticated
Fri Apr 10 04:40:41 2020 daemon.info hostapd: wlan0: STA 00:0a:f5:9e:17:84 IEEE 802.11: associated (aid 4)
Fri Apr 10 04:40:41 2020 daemon.notice hostapd: wlan0: AP-STA-CONNECTED 00:0a:f5:9e:17:84
Fri Apr 10 04:40:41 2020 daemon.info hostapd: wlan0: STA 00:0a:f5:9e:17:84 WPA: pairwise key handshake completed (RSN)
Fri Apr 10 04:40:42 2020 daemon.info dnsmasq-dhcp[2650]: DHCPDISCOVER(br-lan) 00:0a:f5:9e:17:84
Fri Apr 10 04:40:42 2020 daemon.info dnsmasq-dhcp[2650]: DHCPOFFER(br-lan) 192.168.1.232 00:0a:f5:9e:17:84
Fri Apr 10 04:40:42 2020 daemon.info dnsmasq-dhcp[2650]: DHCPREQUEST(br-lan) 192.168.1.232 00:0a:f5:9e:17:84
Fri Apr 10 04:40:42 2020 daemon.info dnsmasq-dhcp[2650]: DHCPACK(br-lan) 192.168.1.232 00:0a:f5:9e:17:84 Network
Fri Apr 10 04:41:26 2020 daemon.notice hostapd: wlan0: AP-STA-DISCONNECTED 00:0a:f5:9e:17:84
Fri Apr 10 04:41:26 2020 daemon.info hostapd: wlan0: STA 00:0a:f5:9e:17:84 IEEE 802.11: authenticated
Fri Apr 10 04:41:33 2020 daemon.info hostapd: wlan0: STA 00:0a:f5:9e:17:84 IEEE 802.11: authenticated
Fri Apr 10 04:41:33 2020 daemon.info hostapd: wlan0: STA 00:0a:f5:9e:17:84 IEEE 802.11: associated (aid 4)
Fri Apr 10 04:41:33 2020 daemon.notice hostapd: wlan0: AP-STA-CONNECTED 00:0a:f5:9e:17:84
Fri Apr 10 04:41:33 2020 daemon.info hostapd: wlan0: STA 00:0a:f5:9e:17:84 WPA: pairwise key handshake completed (RSN)
Fri Apr 10 04:41:33 2020 daemon.info dnsmasq-dhcp[2650]: DHCPDISCOVER(br-lan) 00:0a:f5:9e:17:84
Fri Apr 10 04:41:33 2020 daemon.info dnsmasq-dhcp[2650]: DHCPOFFER(br-lan) 192.168.1.232 00:0a:f5:9e:17:84
Fri Apr 10 04:41:33 2020 daemon.info dnsmasq-dhcp[2650]: DHCPREQUEST(br-lan) 192.168.1.232 00:0a:f5:9e:17:84
Fri Apr 10 04:41:33 2020 daemon.info dnsmasq-dhcp[2650]: DHCPACK(br-lan) 192.168.1.232 00:0a:f5:9e:17:84 Network
Fri Apr 10 04:43:10 2020 daemon.notice hostapd: wlan0: AP-STA-DISCONNECTED 00:0a:f5:9e:17:84
Fri Apr 10 04:43:11 2020 daemon.notice hostapd: wlan0: STA 00:0a:f5:9e:17:84 IEEE 802.11: did not acknowledge authentication response
Fri Apr 10 04:43:11 2020 daemon.info hostapd: wlan0: STA 00:0a:f5:9e:17:84 IEEE 802.11: associated (aid 4)
Fri Apr 10 04:43:11 2020 daemon.notice hostapd: wlan0: AP-STA-CONNECTED 00:0a:f5:9e:17:84
Fri Apr 10 04:43:11 2020 daemon.info hostapd: wlan0: STA 00:0a:f5:9e:17:84 WPA: pairwise key handshake completed (RSN)
Fri Apr 10 04:43:11 2020 daemon.info dnsmasq-dhcp[2650]: DHCPDISCOVER(br-lan) 00:0a:f5:9e:17:84
Fri Apr 10 04:43:11 2020 daemon.info dnsmasq-dhcp[2650]: DHCPOFFER(br-lan) 192.168.1.232 00:0a:f5:9e:17:84
Fri Apr 10 04:43:11 2020 daemon.info dnsmasq-dhcp[2650]: DHCPREQUEST(br-lan) 192.168.1.232 00:0a:f5:9e:17:84
Fri Apr 10 04:43:11 2020 daemon.info dnsmasq-dhcp[2650]: DHCPACK(br-lan) 192.168.1.232 00:0a:f5:9e:17:84 Network
Fri Apr 10 04:43:14 2020 daemon.notice hostapd: wlan0: AP-STA-DISCONNECTED 00:0a:f5:9e:17:84
Fri Apr 10 04:43:15 2020 daemon.info hostapd: wlan0: STA 00:0a:f5:9e:17:84 IEEE 802.11: authenticated
Fri Apr 10 04:43:15 2020 daemon.info hostapd: wlan0: STA 00:0a:f5:9e:17:84 IEEE 802.11: associated (aid 4)
Fri Apr 10 04:43:20 2020 daemon.notice hostapd: wlan0: STA 00:0a:f5:9e:17:84 IEEE 802.11: did not acknowledge authentication response
Fri Apr 10 04:43:24 2020 daemon.info hostapd: wlan0: STA 00:0a:f5:9e:17:84 IEEE 802.11: deauthenticated due to inactivity (timer DEAUTH/REMOVE)
Fri Apr 10 04:43:47 2020 daemon.info hostapd: wlan0: STA 00:0a:f5:9e:17:84 IEEE 802.11: authenticated
Fri Apr 10 04:43:47 2020 daemon.info hostapd: wlan0: STA 00:0a:f5:9e:17:84 IEEE 802.11: associated (aid 4)
Fri Apr 10 04:43:47 2020 daemon.notice hostapd: wlan0: AP-STA-CONNECTED 00:0a:f5:9e:17:84
Fri Apr 10 04:43:47 2020 daemon.info hostapd: wlan0: STA 00:0a:f5:9e:17:84 WPA: pairwise key handshake completed (RSN)
Fri Apr 10 04:43:47 2020 daemon.info dnsmasq-dhcp[2650]: DHCPDISCOVER(br-lan) 00:0a:f5:9e:17:84
Fri Apr 10 04:43:47 2020 daemon.info dnsmasq-dhcp[2650]: DHCPOFFER(br-lan) 192.168.1.232 00:0a:f5:9e:17:84
Fri Apr 10 04:43:47 2020 daemon.info dnsmasq-dhcp[2650]: DHCPREQUEST(br-lan) 192.168.1.232 00:0a:f5:9e:17:84
Fri Apr 10 04:43:47 2020 daemon.info dnsmasq-dhcp[2650]: DHCPACK(br-lan) 192.168.1.232 00:0a:f5:9e:17:84 Network
Fri Apr 10 04:45:17 2020 daemon.notice netifd: wan (2386): udhcpc: sending renew to 10.8.0.1
Fri Apr 10 04:45:18 2020 daemon.notice netifd: wan (2386): udhcpc: lease of 10.8.7.99 obtained, lease time 600
Fri Apr 10 04:48:56 2020 daemon.notice hostapd: wlan0: AP-STA-DISCONNECTED 00:0a:f5:9e:17:84
Fri Apr 10 04:48:57 2020 daemon.info hostapd: wlan0: STA 00:0a:f5:9e:17:84 IEEE 802.11: authenticated
Fri Apr 10 04:48:57 2020 daemon.info hostapd: wlan0: STA 00:0a:f5:9e:17:84 IEEE 802.11: associated (aid 4)
Fri Apr 10 04:48:57 2020 daemon.notice hostapd: wlan0: AP-STA-CONNECTED 00:0a:f5:9e:17:84
Fri Apr 10 04:48:57 2020 daemon.info hostapd: wlan0: STA 00:0a:f5:9e:17:84 WPA: pairwise key handshake completed (RSN)
Fri Apr 10 04:48:57 2020 daemon.info dnsmasq-dhcp[2650]: DHCPDISCOVER(br-lan) 00:0a:f5:9e:17:84
Fri Apr 10 04:48:57 2020 daemon.info dnsmasq-dhcp[2650]: DHCPOFFER(br-lan) 192.168.1.232 00:0a:f5:9e:17:84
Fri Apr 10 04:48:57 2020 daemon.info dnsmasq-dhcp[2650]: DHCPREQUEST(br-lan) 192.168.1.232 00:0a:f5:9e:17:84
Fri Apr 10 04:48:57 2020 daemon.info dnsmasq-dhcp[2650]: DHCPACK(br-lan) 192.168.1.232 00:0a:f5:9e:17:84 Network

@nbd168

@Openwrtfunboy Openwrtfunboy changed the title Disconnections still exist on SNAPSHOT with latest fixes MT7621|MT7603 Disconnections still exist on SNAPSHOT with latest fixes. Also internet loss. MT7621|MT7603 Apr 10, 2020
@Openwrtfunboy
Copy link
Author

At the time of the tests, I noticed that when there are several clients nearby and a little bit further, then the clients who are nearby for some reason lose their Internet connection.

@Openwrtfunboy
Copy link
Author

Also i have same on latest stable 19.07.2

@Openwrtfunboy
Copy link
Author

Empirically, I found out that the loss of the Internet occurs even with clients connected via LAN. 4 clients were connected to the 2.4 network, 5 one was connected to the network, a laptop was connected via wire, which downloaded a 14 gigabyte file and periodically either the laptop or the rest of the clients lost the connection. Moreover, the clients were not next to the router.

@JFtico
Copy link

JFtico commented Apr 10, 2020

Also seeing similar on both 19.07.2 and snapshot r12529-228bb84744 .
See kernel log I posted in #375

@shir474
Copy link

shir474 commented Apr 17, 2020

Try moving the MT7621's switch interrupt to another core.

@JFtico
Copy link

JFtico commented Apr 17, 2020

@shir474 Thanks for the suggestion, how might I implement that?

@Openwrtfunboy
Copy link
Author

@shir474 Thanks for the suggestion, how might I implement that?

Are there any successes, @JFtico ?

@Openwrtfunboy
Copy link
Author

@shir474
Copy link

shir474 commented Apr 29, 2020

It is working for me. You'll have to figure out which interrupt you need to change the smp_affinity for.

move switch interrupt to CPU2

echo 4 > /proc/irq/21/smp_affinity

move one of mt7615 interrupt to CPU1

echo 2 > /proc/irq/26/smp_affinity

cat /proc/interrupts

       CPU0       CPU1       CPU2       CPU3

8: 44463125 44463236 44463197 44463199 MIPS GIC Local 1 timer
9: 17446 0 0 0 MIPS GIC 63 IPI call
10: 0 5948 0 0 MIPS GIC 64 IPI call
11: 0 0 11662 0 MIPS GIC 65 IPI call
12: 0 0 0 5550 MIPS GIC 66 IPI call
13: 525638 0 0 0 MIPS GIC 67 IPI resched
14: 0 18989179 0 0 MIPS GIC 68 IPI resched
15: 0 0 1249439 0 MIPS GIC 69 IPI resched
16: 0 0 0 1283756 MIPS GIC 70 IPI resched
17: 0 0 0 0 MIPS GIC 19 1e000600.gpio-bank0, 1e000600.gpio-bank1, 1e000600.gpio-bank2
19: 4227 0 0 0 MIPS GIC 33 ttyS1
20: 0 0 0 0 MIPS GIC 29 xhci-hcd:usb1
21: 27 0 964175 0 MIPS GIC 10 1e100000.ethernet
23: 16352529 0 0 0 MIPS GIC 11 mt7615e
24: 9527128 0 0 0 MIPS GIC 31 mt7615e
27: 0 0 0 0 1e000600.gpio 17 keys
28: 0 0 0 0 1e000600.gpio 12 keys

@JFtico
Copy link

JFtico commented Apr 29, 2020

shir474, thanks for that. And I've added the echo commands to my rc.local so it applies even after a reboot.
Is there a better place to persist this in a running box ?

This is what I put in rc.local

# Set ethernet on CPU 2 (live core)
echo 4 >/proc/irq/22/smp_affinity
# Move mt76x2e 5GHz radio to CPU 1
echo 2 >/proc/irq/25/smp_affinity

I'll let it run for a few days and see how it goes.

Any idea why this works? Seems fragile.

@shir474
Copy link

shir474 commented Apr 29, 2020

It works because all the load is not on a single core. It would be better if MSI-X was implemented by the driver so that the interrupts were distributed across all cores like IPI.

PS. There's a mistake on my previous post. The second line should have been
echo 2 > /proc/irq/26/smp_affinity
Didn't realize that interrupts where moved in the latest code.

@JFtico
Copy link

JFtico commented Apr 29, 2020

Thanks, funny as on my hardware (ZBT WE3526), interrupts were already distributed, ethernet on CPU0, 5Ghz on CPU2 and 2.4 on CPU3.
Basically all I tried is moving the switch to CPU2 and 5Ghz to CPU1, so now the switch and the 2.4 share a core and 5Ghz plus whatever other interrupts are serviced by CPU0 are on the first core.
I'm thinking I was probably OK as far as core load balancing unless I'm missing something?

@Bartvz
Copy link

Bartvz commented May 1, 2020

It works because all the load is not on a single core. It would be better if MSI-X was implemented by the driver so that the interrupts were distributed across all cores like IPI.

Which hopefully comes to mt7612/mt7603 since it came for mt7615 in these commits:

@nbd168 is that possible?

@JFtico
Copy link

JFtico commented May 4, 2020

After implementing the processor affinity changes, still seeing this after a day or so, with high >1000, proc interrupts error count.

[186702.786100] mt76x2e 0000:01:00.0: Firmware Version: 0.0.00
[186702.786121] mt76x2e 0000:01:00.0: Build: 1
[186702.786128] mt76x2e 0000:01:00.0: Build Time: 201507311614____
[186702.804804] mt76x2e 0000:01:00.0: Firmware running!
[186702.814922] ieee80211 phy1: Hardware restart was requested
[186704.545982] mt76x2e 0000:01:00.0: Firmware Version: 0.0.00
[186704.546002] mt76x2e 0000:01:00.0: Build: 1
[186704.546010] mt76x2e 0000:01:00.0: Build Time: 201507311614____
[186704.564774] mt76x2e 0000:01:00.0: Firmware running!
[186704.574825] ieee80211 phy1: Hardware restart was requested
[186706.405876] mt76x2e 0000:01:00.0: Firmware Version: 0.0.00
[186706.405895] mt76x2e 0000:01:00.0: Build: 1
[186706.405903] mt76x2e 0000:01:00.0: Build Time: 201507311614____
[186706.424680] mt76x2e 0000:01:00.0: Firmware running!
[186706.434810] ieee80211 phy1: Hardware restart was requested
[198203.032760] ------------[ cut here ]------------
[198203.037512] WARNING: CPU: 3 PID: 0 at target-mipsel_24kc_musl/linux-ramips_mt7621/mt76-2020-03-10-a94e43f2/mt7603/mac.c:205 mt7603_filter_tx+0x104/0x16c [mt7603e]
[198203.052056] Modules linked in: pppoe ppp_async pppox ppp_generic nf_conntrack_ipv6 mt76x2e mt76x2_common mt76x02_lib mt7603e mt76 mac80211 iptable_nat ipt_REJECT ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_state xt_nat xt_multiport xt_mark xt_mac xt_limit xt_conntrack xt_comment xt_TCPMSS xt_REDIRECT xt_LOG xt_FLOWOFFLOAD xt_CT slhc nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 nf_conntrack_ipv4 nf_nat_ipv4 nf_nat nf_log_ipv4 nf_flow_table_hw nf_flow_table nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_rtcache nf_conntrack iptable_mangle iptable_filter ip_tables crc_ccitt compat ledtrig_usbport ledtrig_heartbeat nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 mmc_block mtk_sd mmc_core leds_gpio xhci_plat_hcd xhci_pci xhci_mtk xhci_hcd
[198203.122487]  gpio_button_hotplug usbcore nls_base usb_common
[198203.128227] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.14.172 #0
[198203.134377] Stack : 00000000 8ee50000 00000001 80077704 80630000 805ce338 00000000 00000000
[198203.142799]         80597020 8fc0fcd4 8fc3d30c 8060aa67 80591d60 00000001 8fc0fc78 ac07f5b5
[198203.151222]         00000000 00000000 80780000 00000000 807784a0 00000138 00000005 00000000
[198203.159640]         00000000 00000000 0001f4e3 70617773 00000000 80630000 00000000 8ee442a0
[198203.168059]         8ee48bc4 000000cd 00000009 8ee50000 00000008 802e8690 0000000c 8077000c
[198203.176484]         ...
[198203.179010] Call Trace:
[198203.181560] [<8000c4d4>] show_stack+0x58/0x100
[198203.186081] [<804c7784>] dump_stack+0xa4/0xe0
[198203.190514] [<8002fd60>] __warn+0xe0/0x140
[198203.194680] [<8002fa04>] warn_slowpath_null+0x1c/0x28
[198203.199806] [<8ee442a0>] mt7603_filter_tx+0x104/0x16c [mt7603e]
[198203.205798] [<8ee4440c>] mt7603_wtbl_set_ps+0xa0/0x12c [mt7603e]
[198203.211873] [<8ee418e8>] mt7603_sta_ps+0x38/0x3bc [mt7603e]
[198203.217533] [<8f7e4090>] mt76_rx_poll_complete+0x418/0x52c [mt76]
[198203.223700] [<8f7e0dbc>] mt76_dma_attach+0x858/0xf50 [mt76]
[198203.229400] ---[ end trace e49018ec2de3bc08 ]---

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants