latest iwd cannot find any network in congested area #4

Closed
bdantas opened this issue Dec 8, 2022 · 10 comments

@bdantas

bdantas commented Dec 8, 2022

I am on Tiny Core Linux x86_64 with kernel version 5.15.10. eiwd version 1.20-1 was working perfectly. Yesterday I upgraded to eiwd version 2.0-1 and noticed strange behavior: I can authenticate with all APs that have a password, but only with some open APs. Unfortunately, the open AP at my workplace is the one that I cannot authenticate with.

Here is the result of an AP scan using sudo iw dev wlan0 scan flush. It includes an open AP that works fine (AndroidAP-bd) and also shows the open AP that I am having trouble authenticating with (ARWCVisitor): https://pastebin.com/WWqcGxMf

I am trying to authenticate like this:

# touch /var/lib/iwd/ARWCVisitor.open
# iwd -i wlan0 &

Can you please help me figure out how to authenticate with ARWCVisitor when using eiwd 2.0-1?

@bdantas
Author

bdantas commented Dec 8, 2022

I suspect the issue is that iw detects this particular AP but the new iwd does not detect it. If that's the case, the problem may have to do with how ARWCVisitor broadcasts its ssid rather than with the fact that it is an open network. But it's just a guess. I'm stuck at this point.
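One way to test this guess is to compare the network names that iw reports with the names that iwd lists. The following is only a minimal sketch: the helper names iw_ssids and missing_ssids and the /tmp paths are made up for illustration, and it assumes that iw prints each network name on an indented "SSID: <name>" line.

```shell
#!/bin/sh
# Extract the non-empty SSIDs from `iw dev <iface> scan` output on stdin.
# Assumption: iw prints each name on an indented "SSID: <name>" line.
iw_ssids() {
    sed -n 's/^[[:space:]]*SSID: \(..*\)$/\1/p' | LC_ALL=C sort -u
}

# Print the names that appear in file $1 but not in file $2.
# Both files hold one SSID per line, sorted in the C locale.
missing_ssids() {
    LC_ALL=C comm -23 "$1" "$2"
}

# Possible usage (hypothetical paths; not executed here):
#   sudo iw dev wlan0 scan flush | iw_ssids > /tmp/iw.txt
#   # /tmp/iwd.txt: names copied by hand from
#   # `iwctl station wlan0 get-networks`, one per line, then sorted
#   # with `LC_ALL=C sort -u`
#   missing_ssids /tmp/iw.txt /tmp/iwd.txt
```

Any name printed by missing_ssids is a network that iw saw but iwd did not list.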

@illiliti
Owner

illiliti commented Dec 9, 2022

Hi. Could you try to reproduce the bug in the original iwd (the one that needs dbus)? That would help me understand whether the bug affects only eiwd or is a generic bug that is worth fixing and debugging in upstream iwd.

@bdantas
Author

bdantas commented Dec 9, 2022

Hi, illiliti. I was able to reproduce the bug in the original iwd. Something about my workplace causes iwd to not be able to detect any networks at all.

Here is me running the original iwd on my laptop when I'm at home, where everything works perfectly:

$ tce-load -i ./iwd.tcz 
iwd.tcz: OK

$ sudo /usr/local/etc/init.d/dbus start
starting dbus daemon...

$ sudo iwd &
No Diffie-Hellman support found, WPS will not be available
The following options are missing in the kernel:
	CONFIG_KEY_DH_OPERATIONS
Wireless daemon version 2.0
station: Network configuration is disabled.
Wiphy: 0, Name: phy0
	Permanent Address: 00:16:eb:99:80:18
	2.4Ghz Band:
		Bitrates (non-HT):
			 1.0 Mbps
			 2.0 Mbps
			 5.5 Mbps
			11.0 Mbps
			 6.0 Mbps
			 9.0 Mbps
			12.0 Mbps
			18.0 Mbps
			24.0 Mbps
			36.0 Mbps
			48.0 Mbps
			54.0 Mbps
		HT Capabilities:
			HT40
			Short GI for 20Mhz
			Short GI for 40Mhz
		HT RX MCS indexes:
			0-15
	5Ghz Band:
		Bitrates (non-HT):
			 6.0 Mbps
			 9.0 Mbps
			12.0 Mbps
			18.0 Mbps
			24.0 Mbps
			36.0 Mbps
			48.0 Mbps
			54.0 Mbps
		HT Capabilities:
			HT40
			Short GI for 20Mhz
			Short GI for 40Mhz
		HT RX MCS indexes:
			0-15
		VHT Capabilities:
			Short GI for 80Mhz
			Max RX MCS: 0-9 for NSS: 2
			Max TX MCS: 0-9 for NSS: 2
	Ciphers: BIP-CMAC-128 CCMP-128 TKIP
	Supported iftypes: ad-hoc station ap p2p-client p2p-go p2p-device

$ sudo iwctl station wlan0 get-networks
                               Available networks                              
--------------------------------------------------------------------------------
      Network name                      Security            Signal
--------------------------------------------------------------------------------
      bibliotheca_alexandrina           psk                 ****    
      AndroidAP-bd                      open                ****    
      donkey                            psk                 ****    
      DIRECT-A7-HP OfficeJet Pro 6970   psk                 ****    

$ sudo iwctl station wlan0 connect donkey
Type the network passphrase for donkey psk.
Passphrase: *********

$ sudo udhcpc -S -i wlan0 >/dev/null

$ ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=115 time=39.405 ms
64 bytes from 8.8.8.8: seq=1 ttl=115 time=21.158 ms
64 bytes from 8.8.8.8: seq=2 ttl=115 time=21.211 ms
^C
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 21.158/27.258/39.405 ms

At work, things break down at the sudo iwctl station wlan0 get-networks step: it says "No networks available", which is not true. I work at a big company where there are many wireless networks. Maybe too many: iwd seems to get overwhelmed and doesn't even detect the open network being generated by my phone (AndroidAP-bd), which is sitting right next to my laptop.

Can you help me troubleshoot this or are upstream issues outside your purview?

P.S. I also tried running sudo iwctl station wlan0 scan prior to sudo iwctl station wlan0 get-networks but it makes no difference. In fact, sudo iwctl station wlan0 scan just seems to hang and never returns.

@bdantas bdantas changed the title from "latest eiwd cannot authenticate with some open networks" to "latest iwd cannot find any network in congested area" Dec 9, 2022
@bdantas
Author

bdantas commented Dec 9, 2022

For the time being, I am using iw to scan and wpa_supplicant to authenticate. iw's scans work well regardless of the number of APs around.
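The exact commands behind this workaround are not shown in the thread. The following is a minimal sketch, assuming the standard wpa_supplicant configuration syntax for an open network (key_mgmt=NONE). The helper name open_network_conf and the path /tmp/open.conf are made up for illustration.

```shell
#!/bin/sh
# Print a minimal wpa_supplicant network block for an open (no password) AP.
# $1 is the SSID. It is written verbatim between double quotes, so this
# sketch does not handle SSIDs that themselves contain a double quote.
open_network_conf() {
    printf 'network={\n\tssid="%s"\n\tkey_mgmt=NONE\n}\n' "$1"
}

# Possible usage (hypothetical path; not executed here):
#   open_network_conf ARWCVisitor > /tmp/open.conf
#   sudo wpa_supplicant -B -i wlan0 -c /tmp/open.conf
#   sudo udhcpc -S -i wlan0
```

Here -B runs wpa_supplicant in the background, -i selects the interface, and -c names the configuration file. The udhcpc step mirrors the one used earlier in this thread.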

@illiliti
Owner

illiliti commented Dec 9, 2022

If it was working before, then I guess some commit between 1.20 and 2.0 broke your setup. You could try to debug it yourself using git bisect, which will help you find the commit that caused the regression. Afterwards, report your finding directly to upstream, either via IRC [1] or the mailing list [2].

Footnotes

  1. https://git.kernel.org/pub/scm/network/wireless/iwd.git/tree/README#n274

  2. https://git.kernel.org/pub/scm/network/wireless/iwd.git/tree/README#n270

@illiliti
Owner

illiliti commented Dec 9, 2022

One more nuance: if you fail to find the regression in iwd, try bisecting ell [1] (between 0.53 and 0.54). iwd uses it as a low-level library, so the bug might be there as well.

In any event, contact upstream and they will help you (they will suggest a better fix than I would, after all).

Footnotes

  1. https://git.kernel.org/pub/scm/libs/ell/ell.git

@bdantas
Author

bdantas commented Dec 9, 2022

Sounds like a plan. Thank you for your help and direction.

@bdantas bdantas closed this as completed Dec 9, 2022
@illiliti
Owner

Let's keep it open until the bug is resolved.

@illiliti illiliti reopened this Dec 10, 2022
@bdantas
Author

bdantas commented Dec 12, 2022

After much pain and suffering testing both iwd and ell commits, I figured out that the regression was in iwd (ell is innocent) between iwd 1.29 (last good release) and iwd 1.30 (first bad release). I did a git bisect which says:

dfcba7c83c60f1a99ecf6f1424e97bf89fcab38e is the first bad commit
commit dfcba7c83c60f1a99ecf6f1424e97bf89fcab38e
Author: James Prestwood <prestwoj@gmail.com>
Date:   Thu Aug 4 11:51:06 2022 -0700

    wiphy: track GET_REG ID
    
    Keep track of the GET_REG call so it can be canceled if the wiphy
    goes down.

 src/wiphy.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

I'll let the iwd developers know.

@bdantas
Author

bdantas commented Dec 14, 2022

This issue was fixed upstream with this commit:
https://git.kernel.org/pub/scm/network/wireless/iwd.git/commit/?id=ea93e74a0650f5f0355919184c265ae747e52a37

illiliti, thanks again for your guidance and for maintaining eiwd!

@bdantas bdantas closed this as completed Dec 14, 2022
illiliti pushed a commit that referenced this issue Jan 28, 2023
Under the following conditions IWD can accidentally trigger a second
roam scan while one is already in progress:

 - A low RSSI condition is met. This starts the roam rearm timer.
 - A packet loss condition is met, which triggers a roam scan.
 - The roam rearm timer fires and starts another roam scan while
   also overwriting the first roam scan ID.
 - Then, if IWD gets disconnected the overwritten roam scan gets
   canceled, and the roam state is cleared which NULL's
   station->connected_network.
 - The initial roam scan results then come in with the assumption
   that IWD is still connected which results in a crash trying to
   reference station->connected_network.

This can be fixed by adding a station_cannot_roam check in the rearm
timer. If IWD is already doing a roam scan station->preparing_roam
should be set which will cause it to return true and stop any further
action.

Aborting (signal 11) [/usr/libexec/iwd]
iwd[426]: ++++++++ backtrace ++++++++
iwd[426]: #0  0x7f858d7b2090 in /lib/x86_64-linux-gnu/libc.so.6
iwd[426]: #1  0x443df7 in network_get_security() at ome/locus/workspace/iwd/src/network.c:287
iwd[426]: #2  0x421fbb in station_roam_scan_notify() at ome/locus/workspace/iwd/src/station.c:2516
iwd[426]: #3  0x43ebc1 in scan_finished() at ome/locus/workspace/iwd/src/scan.c:1861
iwd[426]: #4  0x43ecf2 in get_scan_done() at ome/locus/workspace/iwd/src/scan.c:1891
iwd[426]: #5  0x4cbfe9 in destroy_request() at ome/locus/workspace/iwd/ell/genl.c:676
iwd[426]: #6  0x4cc98b in process_unicast() at ome/locus/workspace/iwd/ell/genl.c:954
iwd[426]: #7  0x4ccd28 in received_data() at ome/locus/workspace/iwd/ell/genl.c:1052
iwd[426]: #8  0x4c79c9 in io_callback() at ome/locus/workspace/iwd/ell/io.c:120
iwd[426]: #9  0x4c62e3 in l_main_iterate() at ome/locus/workspace/iwd/ell/main.c:476
iwd[426]: #10 0x4c6426 in l_main_run() at ome/locus/workspace/iwd/ell/main.c:519
iwd[426]: #11 0x4c6752 in l_main_run_with_signal() at ome/locus/workspace/iwd/ell/main.c:645
iwd[426]: #12 0x405987 in main() at ome/locus/workspace/iwd/src/main.c:600
iwd[426]: #13 0x7f858d793083 in /lib/x86_64-linux-gnu/libc.so.6
iwd[426]: +++++++++++++++++++++++++++
illiliti pushed a commit that referenced this issue Dec 7, 2023
If the FT-Authenticate frame has been sent then a deauth is received
the work item for sending the FT-Associate frame is never canceled.
When this runs station->connected_network is NULL which causes a
crash:

src/station.c:station_try_next_transition() 7, target xx:xx:xx:xx:xx:xx
src/wiphy.c:wiphy_radio_work_insert() Inserting work item 5843
src/wiphy.c:wiphy_radio_work_insert() Inserting work item 5844
src/wiphy.c:wiphy_radio_work_done() Work item 5842 done
src/wiphy.c:wiphy_radio_work_next() Starting work item 5843
src/netdev.c:netdev_mlme_notify() MLME notification Remain on Channel(55)
src/ft.c:ft_send_authenticate()
src/netdev.c:netdev_mlme_notify() MLME notification Frame TX Status(60)
src/netdev.c:netdev_link_notify() event 16 on ifindex 7
src/netdev.c:netdev_mlme_notify() MLME notification Del Station(20)
src/netdev.c:netdev_mlme_notify() MLME notification Deauthenticate(39)
src/netdev.c:netdev_deauthenticate_event()
src/netdev.c:netdev_mlme_notify() MLME notification Disconnect(48)
src/netdev.c:netdev_disconnect_event()
Received Deauthentication event, reason: 7, from_ap: true
src/station.c:station_disconnect_event() 7
src/station.c:station_disassociated() 7
src/station.c:station_reset_connection_state() 7
src/station.c:station_roam_state_clear() 7
src/netconfig.c:netconfig_event_handler() l_netconfig event 2
src/netconfig-commit.c:netconfig_commit_print_addrs() removing address: yyy.yyy.yyy.yyy
src/resolve.c:resolve_systemd_revert() ifindex: 7
[DHCPv4] l_dhcp_client_stop:1264 Entering state: DHCP_STATE_INIT
src/station.c:station_enter_state() Old State: connected, new state: disconnected
src/station.c:station_enter_state() Old State: disconnected, new state: autoconnect_quick
src/wiphy.c:wiphy_radio_work_insert() Inserting work item 5845
src/netdev.c:netdev_mlme_notify() MLME notification Cancel Remain on Channel(56)
src/wiphy.c:wiphy_radio_work_done() Work item 5843 done
src/wiphy.c:wiphy_radio_work_next() Starting work item 5844

"Program terminated with signal SIGSEGV, Segmentation fault.",
"#0  0x0000565359ee3f54 in network_bss_find_by_addr ()",
"#0  0x0000565359ee3f54 in network_bss_find_by_addr ()",
"#1  0x0000565359ec9d23 in station_ft_work_ready ()",
"#2  0x0000565359ec0af0 in wiphy_radio_work_next ()",
"#3  0x0000565359f20080 in offchannel_mlme_notify ()",
"#4  0x0000565359f4416b in received_data ()",
"#5  0x0000565359f40d90 in io_callback ()",
"#6  0x0000565359f3ff4d in l_main_iterate ()",
"#7  0x0000565359f4001c in l_main_run ()",
"#8  0x0000565359f40240 in l_main_run_with_signal ()",
"#9  0x0000565359eb3888 in main ()"