Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Out of Bounds Read In static void elm_close(tree_node_t *nodo) #19

Closed
Halcy0nic opened this issue Oct 24, 2022 · 2 comments
Closed

Out of Bounds Read In static void elm_close(tree_node_t *nodo) #19

Halcy0nic opened this issue Oct 24, 2022 · 2 comments

Comments

@Halcy0nic
Copy link

Hi there!

Great work on html2xhtml, I find myself using it quite often. While I was using the tool I created some fuzz tests to run in the background. A couple of test cases led to a segfault when using the '-t frameset' option, which led me to further investigate the crash.

Valgrind

I started with Valgrind, which reported an invalid read of size 4 in each of the test cases:

==1040381== Memcheck, a memory error detector
==1040381== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1040381== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==1040381== Command: ./src/html2xhtml -t frameset report/vuln/id:000000,sig:11,src:001386+001369,time:12081510,execs:2336913,op:splice,rep:16
==1040381== 
==1040381== Invalid read of size 4
==1040381==    at 0x40E911: elm_close (procesador.c:944)
==1040381==    by 0x410617: err_html_struct (procesador.c:1889)
==1040381==    by 0x40F20A: err_content_invalid (procesador.c:0)
==1040381==    by 0x40F20A: elm_close (procesador.c:959)
==1040381==    by 0x40E7C4: saxEndDocument (procesador.c:233)
==1040381==    by 0x40DF7A: main (html2xhtml.c:117)
==1040381==  Address 0x6f20d4 is not stack'd, malloc'd or (recently) free'd
==1040381== 
==1040381== 
==1040381== Process terminating with default action of signal 11 (SIGSEGV)
==1040381==  Access not within mapped region at address 0x6F20D4
==1040381==    at 0x40E911: elm_close (procesador.c:944)
==1040381==    by 0x410617: err_html_struct (procesador.c:1889)
==1040381==    by 0x40F20A: err_content_invalid (procesador.c:0)
==1040381==    by 0x40F20A: elm_close (procesador.c:959)
==1040381==    by 0x40E7C4: saxEndDocument (procesador.c:233)
==1040381==    by 0x40DF7A: main (html2xhtml.c:117)
==1040381==  If you believe this happened as a result of a stack
==1040381==  overflow in your program's main thread (unlikely but
==1040381==  possible), you can try to increase the size of the
==1040381==  main thread stack using the --main-stacksize= flag.
==1040381==  The main thread stack size used in this run was 8388608.
==1040381== 
==1040381== HEAP SUMMARY:
==1040381==     in use at exit: 88,190 bytes in 13 blocks
==1040381==   total heap usage: 22 allocs, 9 frees, 2,218,413 bytes allocated
==1040381== 
==1040381== LEAK SUMMARY:
==1040381==    definitely lost: 0 bytes in 0 blocks
==1040381==    indirectly lost: 0 bytes in 0 blocks
==1040381==      possibly lost: 0 bytes in 0 blocks
==1040381==    still reachable: 88,190 bytes in 13 blocks
==1040381==         suppressed: 0 bytes in 0 blocks
==1040381== Rerun with --leak-check=full to see details of leaked memory
==1040381== 
==1040381== For lists of detected and suppressed errors, rerun with: -s
==1040381== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
==1040419== Memcheck, a memory error detector
==1040419== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1040419== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==1040419== Command: ./src/html2xhtml -t frameset report/vuln/id:000001,sig:11,src:001386+001369,time:12316330,execs:2651995,op:splice,rep:16
==1040419== 
==1040419== Invalid read of size 4
==1040419==    at 0x40E911: elm_close (procesador.c:944)
==1040419==    by 0x410617: err_html_struct (procesador.c:1889)
==1040419==    by 0x40F20A: err_content_invalid (procesador.c:0)
==1040419==    by 0x40F20A: elm_close (procesador.c:959)
==1040419==    by 0x40E7C4: saxEndDocument (procesador.c:233)
==1040419==    by 0x40DF7A: main (html2xhtml.c:117)
==1040419==  Address 0x6efc84 is not stack'd, malloc'd or (recently) free'd
==1040419== 
==1040419== 
==1040419== Process terminating with default action of signal 11 (SIGSEGV)
==1040419==  Access not within mapped region at address 0x6EFC84
==1040419==    at 0x40E911: elm_close (procesador.c:944)
==1040419==    by 0x410617: err_html_struct (procesador.c:1889)
==1040419==    by 0x40F20A: err_content_invalid (procesador.c:0)
==1040419==    by 0x40F20A: elm_close (procesador.c:959)
==1040419==    by 0x40E7C4: saxEndDocument (procesador.c:233)
==1040419==    by 0x40DF7A: main (html2xhtml.c:117)
==1040419==  If you believe this happened as a result of a stack
==1040419==  overflow in your program's main thread (unlikely but
==1040419==  possible), you can try to increase the size of the
==1040419==  main thread stack using the --main-stacksize= flag.
==1040419==  The main thread stack size used in this run was 8388608.
==1040419== 
==1040419== HEAP SUMMARY:
==1040419==     in use at exit: 88,190 bytes in 13 blocks
==1040419==   total heap usage: 22 allocs, 9 frees, 2,218,413 bytes allocated
==1040419== 
==1040419== LEAK SUMMARY:
==1040419==    definitely lost: 0 bytes in 0 blocks
==1040419==    indirectly lost: 0 bytes in 0 blocks
==1040419==      possibly lost: 0 bytes in 0 blocks
==1040419==    still reachable: 88,190 bytes in 13 blocks
==1040419==         suppressed: 0 bytes in 0 blocks
==1040419== Rerun with --leak-check=full to see details of leaked memory
==1040419== 
==1040419== For lists of detected and suppressed errors, rerun with: -s
==1040419== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
==1040433== Memcheck, a memory error detector
==1040433== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1040433== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==1040433== Command: ./src/html2xhtml -t frameset report/vuln/id:000002,sig:11,src:001386+001369,time:43142960,execs:4309058,op:splice,rep:8
==1040433== 
==1040433== Invalid read of size 4
==1040433==    at 0x40E911: elm_close (procesador.c:944)
==1040433==    by 0x410617: err_html_struct (procesador.c:1889)
==1040433==    by 0x40F20A: err_content_invalid (procesador.c:0)
==1040433==    by 0x40F20A: elm_close (procesador.c:959)
==1040433==    by 0x40E7C4: saxEndDocument (procesador.c:233)
==1040433==    by 0x40DF7A: main (html2xhtml.c:117)
==1040433==  Address 0x6efc84 is not stack'd, malloc'd or (recently) free'd
==1040433== 
==1040433== 
==1040433== Process terminating with default action of signal 11 (SIGSEGV)
==1040433==  Access not within mapped region at address 0x6EFC84
==1040433==    at 0x40E911: elm_close (procesador.c:944)
==1040433==    by 0x410617: err_html_struct (procesador.c:1889)
==1040433==    by 0x40F20A: err_content_invalid (procesador.c:0)
==1040433==    by 0x40F20A: elm_close (procesador.c:959)
==1040433==    by 0x40E7C4: saxEndDocument (procesador.c:233)
==1040433==    by 0x40DF7A: main (html2xhtml.c:117)
==1040433==  If you believe this happened as a result of a stack
==1040433==  overflow in your program's main thread (unlikely but
==1040433==  possible), you can try to increase the size of the
==1040433==  main thread stack using the --main-stacksize= flag.
==1040433==  The main thread stack size used in this run was 8388608.
==1040433== 
==1040433== HEAP SUMMARY:
==1040433==     in use at exit: 92,286 bytes in 14 blocks
==1040433==   total heap usage: 23 allocs, 9 frees, 2,222,509 bytes allocated
==1040433== 
==1040433== LEAK SUMMARY:
==1040433==    definitely lost: 0 bytes in 0 blocks
==1040433==    indirectly lost: 0 bytes in 0 blocks
==1040433==      possibly lost: 0 bytes in 0 blocks
==1040433==    still reachable: 92,286 bytes in 14 blocks
==1040433==         suppressed: 0 bytes in 0 blocks
==1040433== Rerun with --leak-check=full to see details of leaked memory
==1040433== 
==1040433== For lists of detected and suppressed errors, rerun with: -s
==1040433== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
==1040439== Memcheck, a memory error detector
==1040439== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1040439== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==1040439== Command: ./src/html2xhtml -t frameset report/vuln/id:000003,sig:11,src:001386+001369,time:43143048,execs:4309129,op:splice,rep:8
==1040439== 
==1040439== Invalid read of size 4
==1040439==    at 0x40E911: elm_close (procesador.c:944)
==1040439==    by 0x410617: err_html_struct (procesador.c:1889)
==1040439==    by 0x40F20A: err_content_invalid (procesador.c:0)
==1040439==    by 0x40F20A: elm_close (procesador.c:959)
==1040439==    by 0x40E7C4: saxEndDocument (procesador.c:233)
==1040439==    by 0x40DF7A: main (html2xhtml.c:117)
==1040439==  Address 0x6e7074 is not stack'd, malloc'd or (recently) free'd
==1040439== 
==1040439== 
==1040439== Process terminating with default action of signal 11 (SIGSEGV)
==1040439==  Access not within mapped region at address 0x6E7074
==1040439==    at 0x40E911: elm_close (procesador.c:944)
==1040439==    by 0x410617: err_html_struct (procesador.c:1889)
==1040439==    by 0x40F20A: err_content_invalid (procesador.c:0)
==1040439==    by 0x40F20A: elm_close (procesador.c:959)
==1040439==    by 0x40E7C4: saxEndDocument (procesador.c:233)
==1040439==    by 0x40DF7A: main (html2xhtml.c:117)
==1040439==  If you believe this happened as a result of a stack
==1040439==  overflow in your program's main thread (unlikely but
==1040439==  possible), you can try to increase the size of the
==1040439==  main thread stack using the --main-stacksize= flag.
==1040439==  The main thread stack size used in this run was 8388608.
==1040439== 
==1040439== HEAP SUMMARY:
==1040439==     in use at exit: 92,286 bytes in 14 blocks
==1040439==   total heap usage: 23 allocs, 9 frees, 2,222,509 bytes allocated
==1040439== 
==1040439== LEAK SUMMARY:
==1040439==    definitely lost: 0 bytes in 0 blocks
==1040439==    indirectly lost: 0 bytes in 0 blocks
==1040439==      possibly lost: 0 bytes in 0 blocks
==1040439==    still reachable: 92,286 bytes in 14 blocks
==1040439==         suppressed: 0 bytes in 0 blocks
==1040439== Rerun with --leak-check=full to see details of leaked memory
==1040439== 
==1040439== For lists of detected and suppressed errors, rerun with: -s
==1040439== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

GDB Backtrace and Source Code

I attached gdb to html2xhtml in an attempt to find where the Out of Bounds Read was taking place:

backtrace_1

backtrace_2

Taking a look at the segfault in GDB led me to the following function:

static void elm_close(tree_node_t *nodo)

A user could provide a malformed document with an invalid 'ELM_PTR(nodo).contenttype[doctype]', resulting in the following comparison in assembly:

cmp    dword ptr [rbp + rax*4 + 0xc], 4

This could be leveraged to read locations that they should not have access to. I have attached multiple crash files to help reproduce the issue.

Thanks again!

crashes.zip

jfisteus added a commit that referenced this issue Oct 25, 2022
Reported in issue #19. The error happened when the node passed to elm_close
wasn't an element as it was supposed to.

In the example files reported with the issue, the node causing the crash
was a comment.

Now, elm_close will exit with an error if the node it receives isn't
an element. Some calls that could potentially call it with something else
check the node type before calling it.
@jfisteus
Copy link
Owner

Thanks for testing html2xhtml and for reporting the issue you've found. The detail of your report and your example files have been quite helpful to triage the bug.

I've found the cause of the issue: the node received by elm_close was expected to be of type element but was of type comment instead. I've fixed it and your files don't cause a segmentation fault anymore.

@Halcy0nic
Copy link
Author

Awesome, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants