Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

connectd: don't crash if connect() fails immediately. #4360

Merged
merged 1 commit into from
Feb 1, 2021

Conversation

rustyrussell
Copy link
Contributor

Took me a while (stressing under valgrind) to reproduce this,
then longer to figure out how it happened.

Turns out io_new_conn() can fail if the init function fails.
In our case, this can happen if connect() immediately returns
an error (inside io_connect). But we've already set the finish
function, which (if this was the last address), will free connect,
making the assignment connect->conn = ... write to a freed address.

Either way, if it fails, try_connect_one_addr() has taken care to
update connect->conn, or free connect, and the caller should not do it.

Here's the valgrind trace:

==384981== Invalid write of size 8
==384981==    at 0x11127C: try_connect_one_addr (connectd.c:880)
==384981==    by 0x112BA1: destroy_io_conn (connectd.c:708)
==384981==    by 0x141459: destroy_conn (poll.c:244)
==384981==    by 0x14147F: destroy_conn_close_fd (poll.c:250)
==384981==    by 0x149EB9: notify (tal.c:240)
==384981==    by 0x149F8B: del_tree (tal.c:402)
==384981==    by 0x14A51A: tal_free (tal.c:486)
==384981==    by 0x140036: io_close (io.c:450)
==384981==    by 0x1400B3: do_plan (io.c:401)
==384981==    by 0x140134: io_ready (io.c:423)
==384981==    by 0x141A57: io_loop (poll.c:445)
==384981==    by 0x112CB0: main (connectd.c:1703)
==384981==  Address 0x4d67020 is 64 bytes inside a block of size 160 free'd
==384981==    at 0x483CA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==384981==    by 0x14A020: del_tree (tal.c:421)
==384981==    by 0x14A51A: tal_free (tal.c:486)
==384981==    by 0x1110C5: try_connect_one_addr (connectd.c:806)
==384981==    by 0x112BA1: destroy_io_conn (connectd.c:708)
==384981==    by 0x141459: destroy_conn (poll.c:244)
==384981==    by 0x14147F: destroy_conn_close_fd (poll.c:250)
==384981==    by 0x149EB9: notify (tal.c:240)
==384981==    by 0x149F8B: del_tree (tal.c:402)
==384981==    by 0x14A51A: tal_free (tal.c:486)
==384981==    by 0x140036: io_close (io.c:450)
==384981==    by 0x1405DC: io_connect_ (io.c:345)
==384981==  Block was alloc'd at
==384981==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==384981==    by 0x149CF1: allocate (tal.c:250)
==384981==    by 0x14A3C6: tal_alloc_ (tal.c:428)
==384981==    by 0x1114F2: try_connect_peer (connectd.c:1526)
==384981==    by 0x111717: connect_to_peer (connectd.c:1558)
==384981==    by 0x1124F5: recv_req (connectd.c:1627)
==384981==    by 0x1188B2: handle_read (daemon_conn.c:31)
==384981==    by 0x13FBCB: next_plan (io.c:59)
==384981==    by 0x140076: do_plan (io.c:407)
==384981==    by 0x140113: io_ready (io.c:417)
==384981==    by 0x141A57: io_loop (poll.c:445)
==384981==    by 0x112CB0: main (connectd.c:1703)

Signed-off-by: Rusty Russell rusty@rustcorp.com.au
Fixes: #4343

Took me a while (stressing under valgrind) to reproduce this,
then longer to figure out how it happened.

Turns out io_new_conn() can fail if the init function fails.
In our case, this can happen if connect() immediately returns
an error (inside io_connect).  But we've already set the finish
function, which (if this was the last address), will free connect,
making the assignment `connect->conn = ...` write to a freed address.

Either way, if it fails, try_connect_one_addr() has taken care to
update connect->conn, or free connect, and the caller should not do it.

Here's the valgrind trace:
```
==384981== Invalid write of size 8
==384981==    at 0x11127C: try_connect_one_addr (connectd.c:880)
==384981==    by 0x112BA1: destroy_io_conn (connectd.c:708)
==384981==    by 0x141459: destroy_conn (poll.c:244)
==384981==    by 0x14147F: destroy_conn_close_fd (poll.c:250)
==384981==    by 0x149EB9: notify (tal.c:240)
==384981==    by 0x149F8B: del_tree (tal.c:402)
==384981==    by 0x14A51A: tal_free (tal.c:486)
==384981==    by 0x140036: io_close (io.c:450)
==384981==    by 0x1400B3: do_plan (io.c:401)
==384981==    by 0x140134: io_ready (io.c:423)
==384981==    by 0x141A57: io_loop (poll.c:445)
==384981==    by 0x112CB0: main (connectd.c:1703)
==384981==  Address 0x4d67020 is 64 bytes inside a block of size 160 free'd
==384981==    at 0x483CA3F: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==384981==    by 0x14A020: del_tree (tal.c:421)
==384981==    by 0x14A51A: tal_free (tal.c:486)
==384981==    by 0x1110C5: try_connect_one_addr (connectd.c:806)
==384981==    by 0x112BA1: destroy_io_conn (connectd.c:708)
==384981==    by 0x141459: destroy_conn (poll.c:244)
==384981==    by 0x14147F: destroy_conn_close_fd (poll.c:250)
==384981==    by 0x149EB9: notify (tal.c:240)
==384981==    by 0x149F8B: del_tree (tal.c:402)
==384981==    by 0x14A51A: tal_free (tal.c:486)
==384981==    by 0x140036: io_close (io.c:450)
==384981==    by 0x1405DC: io_connect_ (io.c:345)
==384981==  Block was alloc'd at
==384981==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==384981==    by 0x149CF1: allocate (tal.c:250)
==384981==    by 0x14A3C6: tal_alloc_ (tal.c:428)
==384981==    by 0x1114F2: try_connect_peer (connectd.c:1526)
==384981==    by 0x111717: connect_to_peer (connectd.c:1558)
==384981==    by 0x1124F5: recv_req (connectd.c:1627)
==384981==    by 0x1188B2: handle_read (daemon_conn.c:31)
==384981==    by 0x13FBCB: next_plan (io.c:59)
==384981==    by 0x140076: do_plan (io.c:407)
==384981==    by 0x140113: io_ready (io.c:417)
==384981==    by 0x141A57: io_loop (poll.c:445)
==384981==    by 0x112CB0: main (connectd.c:1703)
```

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: Occasional crash in connectd due to use-after-free
Fixes: ElementsProject#4343
@cdecker
Copy link
Member

cdecker commented Feb 1, 2021

ACK fe8ba49

@cdecker cdecker merged commit 82ed71d into ElementsProject:master Feb 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Crash at init exchange (on EXPERIMENTAL=1 rc1 with rusty's node)
2 participants