wasm2c hangs on certain inputs and cannot finish execution for a while. #2180

Closed
khagankhan opened this issue Mar 27, 2023 · 5 comments · Fixed by #2182 or #2183

Comments


khagankhan commented Mar 27, 2023

Describe the bug

A particular input, hang.wasm, causes wasm2c to run in what appears to be an infinite loop. wasm2c seems to access memory it is not permitted to access and, for a long while, does not report the type mismatch error it should.

wasm2c --version: 1.0.32 (git~1.0.32-46-g47a589a1)


Content of the file that causes the issue:

vim hang.wasm:

^@asm^A^@^@^@^A^G^A`^B{^?^A~^C^B^A^@^@^L^A^H^@^@^@#^@^@^@^@^@^?
        ^A^G^Aàÿÿ^O^?^K

cat hang.wasm:

asm`{~
     #
	???

Steps to reproduce:

Here is the file for the bug:

hang.wasm.txt (Remove .txt extension and save as .wasm)

  • Install and build WABT
  • Run wasm2c on the test file: wasm2c hang.wasm
  • Observe the hang: wasm2c does not finish execution.
    wasm-validate hang.wasm output: Segmentation fault
    Output of gdb wasm-validate followed by run hang.wasm:
	Program received signal SIGSEGV, Segmentation fault.
	0x000000000047d8c6 in std::vector<wabt::TypeChecker::Label, std::allocator<wabt::TypeChecker::Label> >::_M_realloc_insert<wabt::LabelType&, std::vector<wabt::Type, std::allocator<wabt::Type> > const&, std::vector<wabt::Type, std::allocator<wabt::Type> > const&, unsigned long> (this=this@entry=0x7fffffffdd10, __position=__position@entry=non-dereferenceable iterator for std::vector, __args=@0x7fffffffd8d8: 0, __args=@0x7fffffffd8d8: 0, __args=@0x7fffffffd8d8: 0, __args=@0x7fffffffd8d8: 0) at /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:1762
1762		return (__len < size() || __len > max_size()) ? max_size() : __len;

Expected Behavior:

The expected output is an error message like the one other related tools produce. For example, run wasm2wat on the test file:
wasm2wat hang.wasm:

The output will be:

Expected_Behavior/hang.wasm:0000027: error: type mismatch in implicit return, expected [i64] but got []

Additional information

The input that caused the crash was produced by a combination of afl-fuzz++4.03a and Wasmlike, an Xsmith-based random program generator. https://www.flux.utah.edu/project/xsmith

Member

keithw commented Mar 28, 2023

I can't replicate the segfault or infinite loop (on GNU/Linux with default ulimit); the wasm2c command runs for 28 seconds, grows its heap to 4.3 GiB, and then spits out:

/tmp/hang.wasm:0000027: error: type mismatch in implicit return, expected [i64] but got []

What OS, compiler, and standard C++ library are you running/compiling this with, and with what stack and memory limits?

I do get similar results (the very long execution) from wasm2wat --generate-names, so I don't think this is a wasm2c problem specifically.

I think the root cause here is that:

  • this is an invalid module that happens to declare a function with 33,554,400 locals
  • wasm2c always generates names for everything in the module (before checking the validity), whereas other WABT tools only do so with the --generate-names option (and might bomb out on an invalid module before generating names)
  • GenerateNames takes a long time (and the BindingHash takes a ton of RAM) when there are 33 million locals, because each name is stored in the unordered_multimap (at apparently an average of ~140 bytes per entry), and each new name is compared for uniqueness against every name already in the BindingHash.
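
As a rough cross-check of the numbers above, the sketch below decodes the local count from the dump in the report and estimates the heap needed for the names. It assumes the characters `àÿÿ^O` in the vim dump are the bytes 0xE0 0xFF 0xFF 0x0F (a Latin-1 reading, which is an assumption) and that they form the unsigned LEB128 local count. The function names are illustrative and are not taken from WABT.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Decode an unsigned LEB128 value, the encoding WebAssembly uses for counts.
// Each byte contributes its low 7 bits; a set high bit means more bytes follow.
uint64_t DecodeULEB128(const std::vector<uint8_t>& bytes) {
  assert(!bytes.empty());
  uint64_t result = 0;
  int shift = 0;
  for (uint8_t b : bytes) {
    if (shift >= 64) break;  // ignore bytes beyond 64 bits in this sketch
    result |= static_cast<uint64_t>(b & 0x7F) << shift;
    if ((b & 0x80) == 0) break;  // last byte of the value
    shift += 7;
  }
  return result;
}

// Assumed bytes of the local count, read from the vim dump as Latin-1.
uint64_t LocalCount() { return DecodeULEB128({0xE0, 0xFF, 0xFF, 0x0F}); }

// Rough heap estimate using the ~140 bytes per entry quoted above.
double EstimatedGiB(uint64_t entries, uint64_t bytes_per_entry = 140) {
  return static_cast<double>(entries * bytes_per_entry) /
         (1024.0 * 1024.0 * 1024.0);
}
```

Under those assumptions, `LocalCount()` gives 33,554,400, and `EstimatedGiB(LocalCount())` comes to roughly 4.4 GiB, in line with the heap growth observed above.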

We could probably make this faster with the knowledge that generated names will only conflict with pre-existing names (not other generated names), but not sure if it's worth it. We're still going to be throwing all those names into a BindingHash in the end.
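
A minimal sketch of that idea, assuming a naming scheme of `$l<index>` with a `_<n>` suffix on conflict. This is a hypothetical helper, not WABT's GenerateNames, and the scheme is an assumption made for illustration. Each candidate is checked only against the names that existed before generation started.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not WABT code): names `count` locals "$l0", "$l1", ...
// and consults only the set of pre-existing names.
std::vector<std::string> GenerateLocalNames(
    const std::unordered_set<std::string>& existing, std::size_t count) {
  std::vector<std::string> names;
  names.reserve(count);
  for (std::size_t i = 0; i < count; ++i) {
    const std::string base = "$l" + std::to_string(i);
    std::string candidate = base;
    // Every generated name starts with its own index followed by either the
    // end of the string or '_', so two generated names can never be equal.
    // Only clashes with pre-existing names need resolving.
    for (std::size_t suffix = 1; existing.count(candidate) != 0; ++suffix) {
      candidate = base + "_" + std::to_string(suffix);
    }
    assert(existing.count(candidate) == 0);
    names.push_back(candidate);
  }
  return names;
}
```

With pre-existing names `$l1` and `$l1_1`, a call for three locals would yield `$l0`, `$l1_2`, `$l2`. The result still has to be stored somewhere, so this reduces the comparison work but not the memory footprint.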

We could also make wasm2c check module validity before running GenerateNames, which at least would improve treatment of these invalid modules. Still would be a lot of RAM consumption for a module that happens to be valid and also has millions of locals in a function. OTOH, if the problem is just RAM consumption exceeding a limit, I would expect to get a more graceful failure than a segfault.
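
The reordering can be sketched as follows. `Module`, `Result`, `Validate`, `GenerateNames`, and `Convert` here are hypothetical stand-ins for illustration, not WABT types or functions; the point is only that an invalid module is rejected before the costly naming pass runs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a parsed module.
struct Module {
  bool valid;              // what a real validator would compute
  std::size_t num_locals;  // drives the cost of name generation
};

// Hypothetical result of a conversion attempt.
struct Result {
  bool ok;
  std::string error;
  std::size_t names_generated;
};

bool Validate(const Module& m) { return m.valid; }

// Stand-in for the costly pass; returns how many names it would create.
std::size_t GenerateNames(const Module& m) {
  assert(m.valid);  // in this ordering it only runs on validated modules
  return m.num_locals;
}

Result Convert(const Module& m) {
  // Validate first, so an invalid module fails fast with no names generated.
  if (!Validate(m)) return {false, "validation failed", 0};
  return {true, "", GenerateNames(m)};
}
```

A valid module with millions of locals would still pay the full naming cost in this ordering, as noted above.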

@khagankhan
Author

Thank you for the comment, Professor. Initially, I thought the cause was wasm2c trying to access a memory region it is not permitted to access, because wasm2wat gave an immediate type mismatch error but wasm2c did not. However, after your explanation I realized that wasm2c does finish execution, about 28-30 seconds later, which is why the fuzzer sees it as a hang. Given that run time, I am inclined to think that validating the input first can resolve the issue. Thanks for the explanation.

Ack: The initial title, which included "infinite loop", was not right. The execution does finish after some time.

Best,
Khan.

khagankhan changed the title from "wasm2c hangs (an infinite loop) on certain inputs and cannot finish execution." to "wasm2c hangs on certain inputs and cannot finish execution." on Mar 31, 2023
khagankhan changed the title from "wasm2c hangs on certain inputs and cannot finish execution." to "wasm2c hangs on certain inputs and cannot finish execution for a while." on Mar 31, 2023
Member

keithw commented Mar 31, 2023

Thanks for checking further! Are you able to reproduce the segfault? I would love to understand where that comes from but wasn't able to replicate it locally.

Author

khagankhan commented Mar 31, 2023

Yes, Professor. I checked multiple times and the segfault occurred every time. Running wasm-validate hang.wasm segfaults. Under gdb it gave the same output as in the report:

Program received signal SIGSEGV, Segmentation fault.
	0x000000000047d8c6 in std::vector<wabt::TypeChecker::Label, std::allocator<wabt::TypeChecker::Label> >::_M_realloc_insert<wabt::LabelType&, std::vector<wabt::Type, std::allocator<wabt::Type> > const&, std::vector<wabt::Type, std::allocator<wabt::Type> > const&, unsigned long> (this=this@entry=0x7fffffffdd10, __position=__position@entry=non-dereferenceable iterator for std::vector, __args=@0x7fffffffd8d8: 0, __args=@0x7fffffffd8d8: 0, __args=@0x7fffffffd8d8: 0, __args=@0x7fffffffd8d8: 0) at /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:1762
1762		return (__len < size() || __len > max_size()) ? max_size() : __len;

Output of gdb wasm2c followed by run hang.wasm (after waiting for a while): error: type mismatch in implicit return, expected [i64] but got []

That is one of the reasons I thought it was a memory-related issue.

Member

keithw commented Mar 31, 2023

Thanks -- unfortunately I haven't been able to replicate the segfault (with GCC 12 on Ubuntu 22.10) so I'm a little stuck. Are you able to provide a little more information?

  1. How are you compiling wabt? make gcc-debug? make gcc-release? Something else?
  2. What operating system is this on?
  3. What exact Git commit of wabt is this?
  4. Does the segfault still happen when compiled with make gcc-debug?
  5. Are you able to provide a stack backtrace from gdb? That would help pinpoint better what is going on.

Thanks much!
