Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a Segfault during aaft #19178

Closed
wants to merge 2 commits into from
Closed

Fix a Segfault during aaft #19178

wants to merge 2 commits into from

Conversation

thymol0
Copy link
Contributor

@thymol0 thymol0 commented Oct 6, 2021

Steps to reproduce the Segfault:

$ echo 'main(){}' >m.c
$ gcc -o m m.c
$ r2 -e bin.cache=1 -e anal.detectwrites=1 -A m
[x] Analyze all flags starting with sym. and entry0 (aa)
[x] Analyze function calls (aac)
[x] Analyze len bytes of instructions for references (aar)
[x] Finding and parsing C++ vtables (avrr)
segmentation fault

Environment:
OS: ubuntu-18.04 docker image
GCC: gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)

Checklist

  • Closing issues: #issue
  • Mark this if you consider it ready to merge
  • I've added tests (optional)
  • I wrote some lines in the radare2book

@trufae
Copy link
Collaborator

trufae commented Oct 6, 2021

Uhm. I dont see where it is modified inside this foreach to be the reason for safe to be used.

@trufae
Copy link
Collaborator

trufae commented Oct 6, 2021

Can you add a test? may be good to see the backtrace from asan too

@thymol0
Copy link
Contributor Author

thymol0 commented Oct 6, 2021

Deep down that foreach loop body, free is called with the iterator. The segfault is a use after free.

@thymol0
Copy link
Contributor Author

thymol0 commented Oct 6, 2021

I'll see if I can make a test case.

@trufae
Copy link
Collaborator

trufae commented Oct 7, 2021

Yes please, provide the reproducer file at least. ive re-read the function a couple of times but maybe im blind and i dont see any reference to it anywhere. i mean it doesnt hurt to use that _safe. it just makes the loop a little slower, but im trying to understand the root cause of this for a proper fix that doesnt involves saving the iterator.

thanks!

@trufae trufae added this to the 5.5.0 milestone Oct 7, 2021
@thymol0
Copy link
Contributor Author

thymol0 commented Oct 7, 2021

Ok. I have reduced it to this:

r2 - <<EOF
e bin.cache=1
e anal.detectwrites=1
s 0x0
f func
w \x7f\x45\x4c\x46\x02\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x03\x00\x3e\x00\x01\x00\x00\x00\xf0\x04\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\xe0\x18\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x40\x00\x38\x00\x09\x00\x40\x00\x1c\x00\x1b\x00\x06\x00\x00\x00\x04\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\xf8\x01\x00\x00\x00\x00\x00\x00\xf8\x01\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x03
af
aeim
aft
EOF

It can probably be reduced further but I think this is good enough.

How do you make a test case for a Segfault?

@thymol0
Copy link
Contributor Author

thymol0 commented Oct 7, 2021

Ok. I made a GDB script to stop where the iterator is freed and print the backtrace. This is how to run it:

cat >i <<EOF
"e bin.cache=1"
"e anal.detectwrites=1"
"w \x7f\x45\x4c\x46\x02\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x03\x00\x3e\x00\x01\x00\x00\x00\xf0\x04\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\xe0\x18\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x40\x00\x38\x00\x09\x00\x40\x00\x1c\x00\x1b\x00\x06\x00\x00\x00\x04\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\xf8\x01\x00\x00\x00\x00\x00\x00\xf8\x01\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x03"
"s 0x0"
"f func"
"af"
"aeim"
"aft"
EOF

cat >g <<'EOF'
set breakpoint pending on
b anal_tp.c:520
ignore 1 1
r
set $it=it
b *free+0 if $rdi==$it
c
bt
EOF

gdb -x g --args r2 -i i -

Breakpoint 1, r_core_anal_type_match (core=0x7ffff5e41010, fcn=0x55555583cae0) at ../libr/core/anal_tp.c:520
520			ut64 addr = bb->addr;
Breakpoint 2 at 0x7ffff7ca7760

Breakpoint 2, 0x00007ffff7ca7760 in free () from /lib64/libc.so.6
#0  0x00007ffff7ca7760 in free () from /lib64/libc.so.6
#1  0x00007ffff7ed7c81 in r_list_delete (list=0x5555556969a0, iter=0x555555667310) at ../libr/util/list.c:126
#2  0x00007ffff7ed7b9e in r_list_delete_data (list=0x5555556969a0, ptr=0x55555566bd10) at ../libr/util/list.c:112
#3  0x00007ffff6a60231 in r_anal_function_remove_block (fcn=0x55555583cae0, bb=0x55555566bd10) at ../libr/anal/function.c:294
#4  0x00007ffff6a8a750 in update_analysis (anal=0x5555555a9170, fcns=0x55555566d790, reachable=0x55555569a9c0) at ../libr/anal/fcn.c:2267
#5  0x00007ffff6a8abaf in r_anal_update_analysis_range (anal=0x5555555a9170, addr=0, size=1) at ../libr/anal/fcn.c:2328
#6  0x00007ffff72024b5 in ev_iowrite_cb (ev=0x5555555e50d0, type=11, user=0x7ffff5e41010, data=0x7fffffffbd80) at ../libr/core/core.c:2746
#7  0x00007ffff7ecc75b in r_event_send (ev=0x5555555e50d0, type=11, data=0x7fffffffbd80) at ../libr/util/event.c:123
#8  0x00007ffff799abe3 in r_io_plugin_write (desc=0x555555832fc0, buf=0x7fffffffc070 "\376", len=1) at ../libr/io/io_plugin.c:168
#9  0x00007ffff7999983 in r_io_desc_write (desc=0x555555832fc0, buf=0x7fffffffc070 "\376", len=1) at ../libr/io/io_desc.c:190
#10 0x00007ffff799a0ed in r_io_desc_write_at (desc=0x555555832fc0, addr=0, buf=0x7fffffffc070 "\376", len=1) at ../libr/io/io_desc.c:342
#11 0x00007ffff7992d10 in r_io_fd_write_at (io=0x5555555d9fe0, fd=3, addr=0, buf=0x7fffffffc070 "\376", len=1) at ../libr/io/io_fd.c:70
#12 0x00007ffff7990706 in fd_write_at_wrap (io=0x5555555d9fe0, fd=3, addr=0, buf=0x7fffffffc070 "\376", len=1, map=0x555555694ec0, user=0x0) at ../libr/io/io.c:15
#13 0x00007ffff7990a78 in on_map_skyline (io=0x5555555d9fe0, vaddr=0, buf=0x7fffffffc070 "\376", len=1, match_flg=2, op=0x7ffff79906ca <fd_write_at_wrap>, prefix_mode=false) at ../libr/io/io.c:68
#14 0x00007ffff7991559 in r_io_vwrite_at (io=0x5555555d9fe0, vaddr=0, buf=0x7fffffffc070 "\376", len=1) at ../libr/io/io.c:289
#15 0x00007ffff7991b28 in r_io_write_at (io=0x5555555d9fe0, addr=0, buf=0x7fffffffc070 "\376", len=1) at ../libr/io/io.c:399
#16 0x00007ffff6a6d292 in internal_esil_mem_write (esil=0x5555556aa930, addr=0, buf=0x7fffffffc070 "\376", len=1) at ../libr/anal/esil.c:326
#17 0x00007ffff6a6d5ed in r_anal_esil_mem_write (esil=0x5555556aa930, addr=0, buf=0x7fffffffc070 "\376", len=1) at ../libr/anal/esil.c:381
#18 0x00007ffff6a720d1 in esil_poke_n (esil=0x5555556aa930, bits=8) at ../libr/anal/esil.c:2014
#19 0x00007ffff6a730bf in esil_mem_addeq_n (esil=0x5555556aa930, bits=8) at ../libr/anal/esil.c:2373
#20 0x00007ffff6a7315c in esil_mem_addeq1 (esil=0x5555556aa930) at ../libr/anal/esil.c:2388
#21 0x00007ffff6a76ab1 in runword (esil=0x5555556aa930, word=0x7fffffffc1a0 "+=[1]") at ../libr/anal/esil.c:3527
#22 0x00007ffff6a771a5 in r_anal_esil_parse (esil=0x5555556aa930, str=0x55555565d33c ",7,$o,of,:=,7,$s,sf,:=,$z,zf,:=,7,$c,cf,:=,$p,pf,:=,3,$c,af,:=") at ../libr/anal/esil.c:3676
#23 0x00007ffff6a7addb in r_anal_esil_trace_op (esil=0x5555556aa930, op=0x7fffffffc4e0) at ../libr/anal/esil_trace.c:238
#24 0x00007ffff7011131 in r_debug_trace_op (dbg=0x5555555f2050, op=0x7fffffffc4e0) at ../libr/debug/trace.c:182
#25 0x00007ffff7188fcd in r_core_esil_step (core=0x7ffff5e41010, until_addr=18446744073709551615, until_expr=0x0, prev_addr=0x0, stepOver=false) at ../libr/core/cmd_anal.c:5281
#26 0x00007ffff70f2184 in r_core_anal_type_match (core=0x7ffff5e41010, fcn=0x55555583cae0) at ../libr/core/anal_tp.c:553
#27 0x00007ffff717868b in type_cmd (core=0x7ffff5e41010, input=0x555555808093 "") at ../libr/core/cmd_anal.c:1034
#28 0x00007ffff7185065 in cmd_anal_fcn (core=0x7ffff5e41010, input=0x555555808091 "ft") at ../libr/core/cmd_anal.c:4159
#29 0x00007ffff719f6ea in cmd_anal (data=0x7ffff5e41010, input=0x555555808091 "ft") at ../libr/core/cmd_anal.c:11387
#30 0x00007ffff71f4e94 in r_cmd_call (cmd=0x5555555effa0, input=0x555555808090 "aft") at ../libr/core/cmd_api.c:354
#31 0x00007ffff71ecb18 in r_core_cmd_subst_i (core=0x7ffff5e41010, cmd=0x5555556ea7f1 "aft", colon=0x0, tmpseek=0x7fffffffd6e7) at ../libr/core/cmd.c:3468
#32 0x00007ffff71ec09e in r_core_cmd_subst (core=0x7ffff5e41010, cmd=0x5555556ea7f0 "\"aft") at ../libr/core/cmd.c:3236
#33 0x00007ffff71f2b4c in run_cmd_depth (core=0x7ffff5e41010, cmd=0x55555569e7d0 "\"aft\"") at ../libr/core/cmd.c:5229
#34 0x00007ffff71f2f61 in r_core_cmd (core=0x7ffff5e41010, cstr=0x55555567125e "\"aft\"", log=false) at ../libr/core/cmd.c:5312
#35 0x00007ffff71f311c in r_core_cmd_lines (core=0x7ffff5e41010, lines=0x55555583da00 "\"e bin.cache=1\"\n\"e anal.detectwrites=1\"\n\"w \\x7f\\x45\\x4c\\x46\\x02\\x01\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x03\\x00\\x3e\\x00\\x01\\x00\\x00\\x00\\xf0\\x04\\x00\\x00\\x00\\x00\\x00\\x00\\x40\\x00\\x00\\x00\\x00\\x00\\x00\\"...) at ../libr/core/cmd.c:5351
#36 0x00007ffff71f32dd in r_core_cmd_file (core=0x7ffff5e41010, file=0x7fffffffe200 "i") at ../libr/core/cmd.c:5395
#37 0x00007ffff71e6bb6 in r_core_run_script (core=0x7ffff5e41010, file=0x7fffffffe200 "i") at ../libr/core/cmd.c:1296
#38 0x00007ffff7e27820 in run_commands (r=0x7ffff5e41010, cmds=0x5555555842a0, files=0x555555584300, quiet=false, do_analysis=0) at ../libr/main/radare2.c:248
#39 0x00007ffff7e2bc07 in r_main_radare2 (argc=4, argv=0x7fffffffde38) at ../libr/main/radare2.c:1406
#40 0x00005555555606ad in main (argc=4, argv=0x7fffffffde38) at ../binr/radare2/radare2.c:96

r2 should be compiled with debug info.

Steps to reproduce the Segfault:
```sh
$ echo 'main(){}' >m.c
$ gcc -m m.c
$ r2 -e bin.cache=1 -e anal.detectwrites=1 -A m
[x] Analyze all flags starting with sym. and entry0 (aa)
[x] Analyze function calls (aac)
[x] Analyze len bytes of instructions for references (aar)
[x] Finding and parsing C++ vtables (avrr)
segmentation fault
```
Environment:
OS: `ubuntu-18.04 docker image`
GCC: `gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)`
@thymol0
Copy link
Contributor Author

thymol0 commented Oct 7, 2021

Added a test case.

@trufae
Copy link
Collaborator

trufae commented Oct 7, 2021

Thanks for the reproducer. the iterator is not freeed.. what it's freed is the basic block, and this happens in this line:

553                                 r_core_esil_step (core, UT64_MAX, NULL, NULL, false);

And this happens because the fcn.c in update_analysis() in fcn.c is calling this:

2267                         r_anal_function_remove_block (fcn, bb);

So i would say that your fix is working just because of this specific case, but it will crash in other situations still. The solution to this would be:

  • duplicate the list of basic blocks and iterate over this list. this will result on skipped bbs that have been removed to be analized which is also wrong
  • create a list of the offsets of the basic blocks. and then iterate over that liist using RAnal.get_bb_at()
  • not touching the list of basic blocks while iterating should be a thumb rule, so another option would be to move this removal of unreachable blocks to a separate functoin and call it right after the loop. But the problem is that if there are new basic blocks included while looping the analysis will miss that
  • The reason why some bbs got unreachable is because of a bug and bad assumption in the code that gets called when emulating. this is, your test uses e anal.detectwrites=1, which basically forces the analysis to update the function boundaries when any byte inside the function gets modified. And obviously this speicifc case is failing because is modifying its own address space. without setting those flags to 1.

imho anal.detectwrites is something that should die. ive never used it and i didnt even knew it existed. actually i was planning to do that by computing hashes of the function and comparing it later when the user wants, not everytime on every single iteration of the emulation.. do you really need to set this flag to 1?

All the type matching code is full of scary things, so thanks for pointing it out :) im fine for a solution that makes it not crash, because for correctness i plan to change much of it to make it work reliabily.

But this bug is because of the automatic-reanalysis when the esil operates a write instruction. So i would probably go on disabling and ill probably end up removing this code that is just making everything slow and buggy and was introduced right before the fork.

@thymol0
Copy link
Contributor Author

thymol0 commented Oct 7, 2021

do you really need to set this flag to 1?

I thought it was the solution for the following message which happens with VV after an analysis with esil:

Rendering graph...Function was modified. Reanalyze? (Y/n)

I was looking for an option that does auto-reanalyze and that's how I found it.

@trufae
Copy link
Collaborator

trufae commented Oct 7, 2021

Another option would be to disable this option when the analysis is running.. because that should only apply on user-driven write ops.. not emulation it analysis

@thymol0
Copy link
Contributor Author

thymol0 commented Oct 7, 2021

Ok. If there is no need to reanalyze after running the esil analysis then I don't need this option.

@Lazula
Copy link
Collaborator

Lazula commented Oct 7, 2021

Ok. I made a GDB script to stop where the iterator is freed and print the backtrace. This is how to run it:

In the future, it may be easier to diagnose such memory errors by compiling with ASan:

SANITIZE=address sys/sanitize.sh
r2 - <<EOF
e bin.cache=1
e anal.detectwrites=1
s 0x0
f func
w \x7f\x45\x4c\x46\x02\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x03\x00\x3e\x00\x01\x00\x00\x00\xf0\x04\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\xe0\x18\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x40\x00\x38\x00\x09\x00\x40\x00\x1c\x00\x1b\x00\x06\x00\x00\x00\x04\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\x40\x00\x00\x00\x00\x00\x00\x00\xf8\x01\x00\x00\x00\x00\x00\x00\xf8\x01\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x03
af
aeim
aft
EOF
=================================================================
==2087562==ERROR: AddressSanitizer: heap-use-after-free on address 0x6110000200e8 at pc 0x7f4c5cefee56 bp 0x7fff3e29c4a0 sp 0x7fff3e29c490
READ of size 8 at 0x6110000200e8 thread T0
    #0 0x7f4c5cefee55 in r_core_anal_type_match /home/lazula/radare2/libr/core/anal_tp.c:531
    #1 0x7f4c5cc3c0a9 in type_cmd /home/lazula/radare2/libr/core/cmd_anal.c:1034
    #2 0x7f4c5cc55c89 in cmd_anal_fcn /home/lazula/radare2/libr/core/cmd_anal.c:4159
    #3 0x7f4c5cc8ea0f in cmd_anal /home/lazula/radare2/libr/core/cmd_anal.c:11387
    #4 0x7f4c5cdff638 in r_cmd_call /home/lazula/radare2/libr/core/cmd_api.c:537
    #5 0x7f4c5cd38a8f in r_core_cmd_subst_i /home/lazula/radare2/libr/core/cmd.c:4373
    #6 0x7f4c5cd2fe7a in r_core_cmd_subst /home/lazula/radare2/libr/core/cmd.c:3268
    #7 0x7f4c5cd3f340 in run_cmd_depth /home/lazula/radare2/libr/core/cmd.c:5260
    #8 0x7f4c5cd3fb9c in r_core_cmd /home/lazula/radare2/libr/core/cmd.c:5343
    #9 0x7f4c5cba1064 in r_core_prompt_exec /home/lazula/radare2/libr/core/core.c:3275
    #10 0x7f4c5cb9f809 in r_core_prompt_loop /home/lazula/radare2/libr/core/core.c:3096
    #11 0x7f4c5fa5f231 in r_main_radare2 /home/lazula/radare2/libr/main/radare2.c:1459
    #12 0x55dd514376ff in main /home/lazula/radare2/binr/radare2/radare2.c:96
    #13 0x7f4c5f8490b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)
    #14 0x55dd5143728d in _start (/home/lazula/radare2/binr/radare2/radare2+0x128d)
<snip>

@thymol0
Copy link
Contributor Author

thymol0 commented Oct 7, 2021

In the future, it may be easier to diagnose such memory errors by compiling with ASan

Thanks for the tip.

@trufae
Copy link
Collaborator

trufae commented Oct 13, 2021

ill look into that today

@trufae
Copy link
Collaborator

trufae commented Oct 30, 2021

Found some more segfaults in the aaft code. guess it needs to be rewritten, but not for 5.5.. i wanted to do so after the release, so ill fix the bugs for now. sorry for the delay. ill come back to this next week. but anyone is welcome to help on this

@radare
Copy link
Collaborator

radare commented Nov 2, 2021

I decided to implement the simplest solution which is to keep a list of the bb addresses and fetch the basic block associated on each iteration, this way its able to detect that the cfg has changed and re-loops to compute the list of basic blocks again.

i used an rlist for the PoC, but rvector would work better, ill fix that and merge it.

Thanks for reporting and researching the crash details and writing the test file! I'll ping u back when i merge the PR, so you can rebase your PR with the testcase. Sounds good?

@radare
Copy link
Collaborator

radare commented Nov 2, 2021

Nvm i just took your test and commited it under your name :) very nice catch! it should work fine now

aemmitt-ns pushed a commit to aemmitt-ns/radare2 that referenced this pull request Jan 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants