Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

segfault while connecting to wifi #1

Closed
illiliti opened this issue Jun 17, 2021 · 1 comment
Closed

segfault while connecting to wifi #1

illiliti opened this issue Jun 17, 2021 · 1 comment

Comments

@illiliti
Copy link
Owner

Due to unknown circumstances(something related to dhcp/ipv6), eiwd may cause segfault while connecting to wifi.

Further investigation revealed that it was caused by passing NULL to inet_ntop function.

[0] The problem is here:
https://git.kernel.org/pub/scm/libs/ell/ell.git/tree/ell/icmp6.c#n291

[1] NULL is set to client->ra by l_icmp6_client_stop:
https://git.kernel.org/pub/scm/libs/ell/ell.git/tree/ell/icmp6.c#n516

[2] Which may be called by l_dhcp6_client_stop:
https://git.kernel.org/pub/scm/libs/ell/ell.git/tree/ell/dhcp6.c#n1854

[3] Which is called by dhcp6_client_icmp6_event:
https://git.kernel.org/pub/scm/libs/ell/ell.git/tree/ell/dhcp6.c#n1479

[4] Which is called via client->event_handler:
https://git.kernel.org/pub/scm/libs/ell/ell.git/tree/ell/icmp6.c#n268

[5] client->event_handler is set by l_icmp6_client_set_event_handler:
https://git.kernel.org/pub/scm/libs/ell/ell.git/tree/ell/dhcp6.c#n1513

[6] The actual problem is here:
https://git.kernel.org/pub/scm/libs/ell/ell.git/tree/ell/dhcp6.c#n1478

If managed and other is false, client->ra will be set to NULL and later deferenced by inet_ntop.

https://git.kernel.org/pub/scm/libs/ell/ell.git/tree/ell/icmp6.c#n333:

if (!client->ra) {
    client->ra = r;
    // client->ra will be set to NULL if `managed` and `other` is false[6]
    icmp6_client_event_notify(client, L_ICMP6_CLIENT_EVENT_ROUTER_FOUND);
    // client->ra will deferenced by `inet_ntop` in `icmp6_client_setup_routes`[0]
    icmp6_client_setup_routes(client); 
    return 0;
}

The fix is trivial:

diff --git a/ell/icmp6.c b/ell/icmp6.c
index fe8a506..c7f1458 100644
--- a/ell/icmp6.c
+++ b/ell/icmp6.c
@@ -288,6 +288,10 @@ static void icmp6_client_setup_routes(struct l_icmp6_client *client)
 	char buf[INET6_ADDRSTRLEN];
 	unsigned int i;
 
+	if (!ra) {
+	    return;
+	}
+
 	rt = l_rtnl_route_new_gateway(inet_ntop(AF_INET6, ra->address,
 							buf, sizeof(buf)));
 	if (!rt) {

Looks like upstream is affected too? If so, i will try to send patch to mailing list.

illiliti pushed a commit that referenced this issue Aug 2, 2021
ERROR: AddressSanitizer: global-buffer-overflow on address 0x000000512c08 at pc 0x00000041848d bp 0x7ffcdde71870 sp 0x7ffcdde71860
READ of size 8 at 0x000000512c08 thread T0
    #0 0x41848c in print_attributes monitor/nlmon.c:6268
    #1 0x42ac53 in print_message monitor/nlmon.c:6544
    #2 0x438968 in nlmon_message monitor/nlmon.c:6698
    #3 0x43d5e4 in nlmon_receive monitor/nlmon.c:7658
    #4 0x4b3cd0 in io_callback ell/io.c:120
    #5 0x4b085a in l_main_iterate ell/main.c:478
    #6 0x4b0ee3 in l_main_run ell/main.c:525
    #7 0x4b0ee3 in l_main_run ell/main.c:507
    #8 0x4b13ac in l_main_run_with_signal ell/main.c:647
    #9 0x4072fe in main monitor/main.c:811
@illiliti
Copy link
Owner Author

illiliti commented Nov 6, 2021

Can't reproduce anymore

@illiliti illiliti closed this as completed Nov 6, 2021
ioraff pushed a commit to ioraff/eiwd that referenced this issue Feb 25, 2022
Commit 4d2176d ("handshake: Allow event handler to free handshake")
introduced a re-entrancy guard so that handshake_state objects that are
destroyed as a result of the event do not cause a crash.  It rightly
used a temporary object to store the passed in handshake.  Unfortunately
this caused variable shadowing which resulted in crashes fixed by commit
d22b174 ("handshake: use _hs directly in handshake_event").
However, since the temporary was no longer used, this fix itself caused
a crash:

 #0  0x00005555f0ba8b3d in eapol_handle_ptk_1_of_4 (sm=sm@entry=0x5555f2b4a920, ek=0x5555f2b62588, ek@entry=0x16, unencrypted=unencrypted@entry=false) at src/eapol.c:1236
1236				handshake_event(sm->handshake,
(gdb) bt
 #0  0x00005555f0ba8b3d in eapol_handle_ptk_1_of_4 (sm=sm@entry=0x5555f2b4a920, ek=0x5555f2b62588, ek@entry=0x16, unencrypted=unencrypted@entry=false) at src/eapol.c:1236
 illiliti#1  0x00005555f0bab118 in eapol_key_handle (unencrypted=<optimized out>, frame=<optimized out>, sm=0x5555f2b4a920) at src/eapol.c:2343
 illiliti#2  eapol_rx_packet (proto=<optimized out>, from=<optimized out>, frame=<optimized out>, unencrypted=<optimized out>, user_data=0x5555f2b4a920) at src/eapol.c:2665
 illiliti#3  0x00005555f0bac497 in __eapol_rx_packet (ifindex=62, src=src@entry=0x5555f2b62574 "x\212 J\207\267", proto=proto@entry=34958, frame=frame@entry=0x5555f2b62588 "\002\003",
   len=len@entry=121, noencrypt=noencrypt@entry=false) at src/eapol.c:3017
 illiliti#4  0x00005555f0b8c617 in netdev_control_port_frame_event (netdev=0x5555f2b64450, msg=0x5555f2b62588) at src/netdev.c:5574
 illiliti#5  netdev_unicast_notify (msg=msg@entry=0x5555f2b619a0, user_data=<optimized out>) at src/netdev.c:5613
 illiliti#6  0x00007f60084c9a51 in dispatch_unicast_watches (msg=0x5555f2b619a0, id=<optimized out>, genl=0x5555f2b3fc80) at ell/genl.c:954
 #7  process_unicast (nlmsg=0x7fff61abeac0, genl=0x5555f2b3fc80) at ell/genl.c:973
 #8  received_data (io=<optimized out>, user_data=0x5555f2b3fc80) at ell/genl.c:1098
 #9  0x00007f60084c61bd in io_callback (fd=<optimized out>, events=1, user_data=0x5555f2b3fd20) at ell/io.c:120
 #10 0x00007f60084c536d in l_main_iterate (timeout=<optimized out>) at ell/main.c:478
 #11 0x00007f60084c543e in l_main_run () at ell/main.c:525
 #12 l_main_run () at ell/main.c:507
 #13 0x00007f60084c5670 in l_main_run_with_signal (callback=callback@entry=0x5555f0b89150 <signal_handler>, user_data=user_data@entry=0x0) at ell/main.c:647
 #14 0x00005555f0b886a4 in main (argc=<optimized out>, argv=<optimized out>) at src/main.c:532

This happens when the driver does not support rekeying, which causes iwd to
attempt a disconnect and re-connect.  The disconnect action is
taken during the event callback and destroys the underlying eapol state
machine.  Since a temporary isn't used, attempting to dereference
sm->handshake results in a crash.

Fix this by introducing a UNIQUE_ID macro which should prevent shadowing
and using a temporary variable as originally intended.

Fixes: d22b174 ("handshake: use _hs directly in handshake_event")
Fixes: 4d2176d ("handshake: Allow event handler to free handshake")
Reported-By: Toke Høiland-Jørgensen <toke@toke.dk>
Tested-by: Toke Høiland-Jørgensen <toke@toke.dk>
illiliti pushed a commit that referenced this issue Sep 11, 2022
Certain module dependencies were missing, which could cause a crash on
exit under (very unlikely) circumstances.

  #0  l_queue_peek_head (queue=<optimized out>) at ../iwd-1.28/ell/queue.c:241
  #1  0x0000aaaab752f2a0 in wiphy_radio_work_done (wiphy=0xaaaac3a129a0, id=6)
      at ../iwd-1.28/src/wiphy.c:2013
  #2  0x0000aaaab7523f50 in netdev_connect_free (netdev=netdev@entry=0xaaaac3a13db0)
      at ../iwd-1.28/src/netdev.c:765
  #3  0x0000aaaab7526208 in netdev_free (data=0xaaaac3a13db0) at ../iwd-1.28/src/netdev.c:909
  #4  0x0000aaaab75a3924 in l_queue_clear (queue=queue@entry=0xaaaac3a0c800,
      destroy=destroy@entry=0xaaaab7526190 <netdev_free>) at ../iwd-1.28/ell/queue.c:107
  #5  0x0000aaaab75a3974 in l_queue_destroy (queue=0xaaaac3a0c800,
      destroy=destroy@entry=0xaaaab7526190 <netdev_free>) at ../iwd-1.28/ell/queue.c:82
  #6  0x0000aaaab7522050 in netdev_exit () at ../iwd-1.28/src/netdev.c:6653
  #7  0x0000aaaab7579bb0 in iwd_modules_exit () at ../iwd-1.28/src/module.c:181

In this particular case, wiphy module was de-initialized prior to the
netdev module:

Jul 14 18:14:39 localhost iwd[2867]: ../iwd-1.28/src/wiphy.c:wiphy_free() Freeing wiphy phy0[0]
Jul 14 18:14:39 localhost iwd[2867]: ../iwd-1.28/src/netdev.c:netdev_free() Freeing netdev wlan0[45]
illiliti pushed a commit that referenced this issue Jan 28, 2023
Under the following conditions IWD can accidentally trigger a second
roam scan while one is already in progress:

 - A low RSSI condition is met. This starts the roam rearm timer.
 - A packet loss condition is met, which triggers a roam scan.
 - The roam rearm timer fires and starts another roam scan while
   also overwriting the first roam scan ID.
 - Then, if IWD gets disconnected the overwritten roam scan gets
   canceled, and the roam state is cleared which NULL's
   station->connected_network.
 - The initial roam scan results then come in with the assumption
   that IWD is still connected which results in a crash trying to
   reference station->connected_network.

This can be fixed by adding a station_cannot_roam check in the rearm
timer. If IWD is already doing a roam scan station->preparing_roam
should be set which will cause it to return true and stop any further
action.

Aborting (signal 11) [/usr/libexec/iwd]
iwd[426]: ++++++++ backtrace ++++++++
iwd[426]: #0  0x7f858d7b2090 in /lib/x86_64-linux-gnu/libc.so.6
iwd[426]: #1  0x443df7 in network_get_security() at ome/locus/workspace/iwd/src/network.c:287
iwd[426]: #2  0x421fbb in station_roam_scan_notify() at ome/locus/workspace/iwd/src/station.c:2516
iwd[426]: #3  0x43ebc1 in scan_finished() at ome/locus/workspace/iwd/src/scan.c:1861
iwd[426]: #4  0x43ecf2 in get_scan_done() at ome/locus/workspace/iwd/src/scan.c:1891
iwd[426]: #5  0x4cbfe9 in destroy_request() at ome/locus/workspace/iwd/ell/genl.c:676
iwd[426]: #6  0x4cc98b in process_unicast() at ome/locus/workspace/iwd/ell/genl.c:954
iwd[426]: #7  0x4ccd28 in received_data() at ome/locus/workspace/iwd/ell/genl.c:1052
iwd[426]: #8  0x4c79c9 in io_callback() at ome/locus/workspace/iwd/ell/io.c:120
iwd[426]: #9  0x4c62e3 in l_main_iterate() at ome/locus/workspace/iwd/ell/main.c:476
iwd[426]: #10 0x4c6426 in l_main_run() at ome/locus/workspace/iwd/ell/main.c:519
iwd[426]: #11 0x4c6752 in l_main_run_with_signal() at ome/locus/workspace/iwd/ell/main.c:645
iwd[426]: #12 0x405987 in main() at ome/locus/workspace/iwd/src/main.c:600
iwd[426]: #13 0x7f858d793083 in /lib/x86_64-linux-gnu/libc.so.6
iwd[426]: +++++++++++++++++++++++++++
illiliti pushed a commit that referenced this issue Dec 7, 2023
If the FT-Authenticate frame has been sent then a deauth is received
the work item for sending the FT-Associate frame is never canceled.
When this runs station->connected_network is NULL which causes a
crash:

src/station.c:station_try_next_transition() 7, target xx:xx:xx:xx:xx:xx
src/wiphy.c:wiphy_radio_work_insert() Inserting work item 5843
src/wiphy.c:wiphy_radio_work_insert() Inserting work item 5844
src/wiphy.c:wiphy_radio_work_done() Work item 5842 done
src/wiphy.c:wiphy_radio_work_next() Starting work item 5843
src/netdev.c:netdev_mlme_notify() MLME notification Remain on Channel(55)
src/ft.c:ft_send_authenticate()
src/netdev.c:netdev_mlme_notify() MLME notification Frame TX Status(60)
src/netdev.c:netdev_link_notify() event 16 on ifindex 7
src/netdev.c:netdev_mlme_notify() MLME notification Del Station(20)
src/netdev.c:netdev_mlme_notify() MLME notification Deauthenticate(39)
src/netdev.c:netdev_deauthenticate_event()
src/netdev.c:netdev_mlme_notify() MLME notification Disconnect(48)
src/netdev.c:netdev_disconnect_event()
Received Deauthentication event, reason: 7, from_ap: true
src/station.c:station_disconnect_event() 7
src/station.c:station_disassociated() 7
src/station.c:station_reset_connection_state() 7
src/station.c:station_roam_state_clear() 7
src/netconfig.c:netconfig_event_handler() l_netconfig event 2
src/netconfig-commit.c:netconfig_commit_print_addrs() removing address: yyy.yyy.yyy.yyy
src/resolve.c:resolve_systemd_revert() ifindex: 7
[DHCPv4] l_dhcp_client_stop:1264 Entering state: DHCP_STATE_INIT
src/station.c:station_enter_state() Old State: connected, new state: disconnected
src/station.c:station_enter_state() Old State: disconnected, new state: autoconnect_quick
src/wiphy.c:wiphy_radio_work_insert() Inserting work item 5845
src/netdev.c:netdev_mlme_notify() MLME notification Cancel Remain on Channel(56)
src/wiphy.c:wiphy_radio_work_done() Work item 5843 done
src/wiphy.c:wiphy_radio_work_next() Starting work item 5844

"Program terminated with signal SIGSEGV, Segmentation fault.",
"#0  0x0000565359ee3f54 in network_bss_find_by_addr ()",
"#0  0x0000565359ee3f54 in network_bss_find_by_addr ()",
"#1  0x0000565359ec9d23 in station_ft_work_ready ()",
"#2  0x0000565359ec0af0 in wiphy_radio_work_next ()",
"#3  0x0000565359f20080 in offchannel_mlme_notify ()",
"#4  0x0000565359f4416b in received_data ()",
"#5  0x0000565359f40d90 in io_callback ()",
"#6  0x0000565359f3ff4d in l_main_iterate ()",
"#7  0x0000565359f4001c in l_main_run ()",
"#8  0x0000565359f40240 in l_main_run_with_signal ()",
"#9  0x0000565359eb3888 in main ()"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant