Skip to content

Unhandled Exceptions

Juergen Donnerstag edited this page Nov 22, 2021 · 5 revisions

While developing V-Rosie I occasionally had "unhandled exception" errors. Usually with a vlang backtrace, but occassionally without. Even -cg did not print any additional info. Usually the culprit was quickly identified and the problem fixed. V-lang is not yet production ready and you need to expect edge cases which don't yet work. So you re-arrange your source code a little, review the generated C-source code, and the problem is fixed (and a git issue created to fix it in V).

However, this time it wasn't that easy. I had a mut symbols []string variable in a struct. The array was mutable to support initializing the array with data retrieved during some (non trivial) initialisation process. But afterwards, the array was meant to be immutable.

The unhandled exception happened in a V-test function: assert parser.symbols[0] == "main.*". Absolutely nothing special about this assertion which simply compares two strings. So I added debug-prints directly after the initialisation phase, and before the assertion. Now it crashed with my last debug-print, just before the original assertion. I've added more debug-prints after each processing step, in between initialisation and the assertion. I came closer, but some of the processing steps are complex functions, and I thought may be there are more intelligent ways to identify the root cause.

I compiled my test case on WSL/Ubuntu with

../v/v -cg -keepc -showcc -modules/rosie/compiler_vm_backend/capture_test.v

First with tcc, but tcc does not add line numbers to the debug information, so that gdb, valgrind, etc. are only print function names. Then I switched to gcc

../v/v -cg -keepc -showcc -cc gcc modules/rosie/compiler_vm_backend/capture_test.v

and run gdb. The idea was: gdb will stop execution (gdb run) upon unhandled exception, do bt to print the backtrack. But that didn't work. The program exited with return code 1 and bt was empty. I did this multiple times, but the exception did not occur.

I thought, it must be something related to memory access. Somehow I either read from or write to some invalid memory block. I must admit I'm not an expert for this, but because people previously talked about Valgrind, I tried Valgrind with the executable:

valgrind -v --tool=memcheck --leak-check=yes --log-file=valgrind.log ./modules/rosie/compiler_vm_backend/capture_test

And the log indeed listed several memory leaks and access problems. The first error entry looked like:

==531== Invalid read of size 2
==531==    at 0x4C38750: memmove (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==531==    by 0x43339F: vmemcpy (capture_test.tmp.c:10367)
==531==    by 0x43A0BF: string_clone (capture_test.tmp.c:12301)
==531==    by 0x43F7F6: StrIntpData_get_fmt_format (capture_test.tmp.c:13830)
==531==    by 0x440D90: str_intp (capture_test.tmp.c:14255)
==531==    by 0x4C924E: rosie__compiler_vm_backend__test_05 (capture_test.tmp.c:32818)
==531==    by 0x4E1187: main (capture_test.tmp.c:35098)
==531==  Address 0x5685b10 is 0 bytes inside a block of size 7 free'd
==531==    at 0x4C32D3B: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==531==    by 0x431DBD: _v_free (capture_test.tmp.c:10082)
==531==    by 0x43DD2E: string_free (capture_test.tmp.c:13364)
==531==    by 0x432399: VAssertMetaInfo_free (capture_test.tmp.c:10181)
==531==    by 0x473488: main__cb_assertion_ok (capture_test.tmp.c:22732)
==531==    by 0x4C90A8: rosie__compiler_vm_backend__test_05 (capture_test.tmp.c:32798)
==531==    by 0x4E1187: main (capture_test.tmp.c:35098)
==531==  Block was alloc'd at
==531==    at 0x4C31B0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==531==    by 0x4316AA: malloc_noscan (capture_test.tmp.c:9987)
==531==    by 0x43A09F: string_clone (capture_test.tmp.c:12299)
==531==    by 0x4880CA: rosie__runtime_v2__Symbols_add (capture_test.tmp.c:24831)
==531==    by 0x4D051E: rosie__compiler_vm_backend__Compiler_add_open_capture (capture_test.tmp.c:33689)
==531==    by 0x4CF7E5: rosie__compiler_vm_backend__Compiler_compile (capture_test.tmp.c:33574)
==531==    by 0x4D68DB: rosie__compiler_vm_backend__parse_and_compile (capture_test.tmp.c:34585)
==531==    by 0x4A4D22: rosie__compiler_vm_backend__prepare_test (capture_test.tmp.c:28090)
==531==    by 0x4C8650: rosie__compiler_vm_backend__test_05 (capture_test.tmp.c:32711)
==531==    by 0x4E1187: main (capture_test.tmp.c:35098)

At least it confirmed my impression that someone is changing my string data.

Then I got the hint to use clang instead, like

../v/v -cg -keepc -showcc -cc clang -cflags -fsanitize=memory modules/rosie/compiler_vm_backend/capture_test.v

and the output was:

> C compiler cmd: clang "@/tmp/v_1000/capture_test.tmp.c.rsp"
> C compiler response file "/tmp/v_1000/capture_test.tmp.c.rsp":
  -fsanitize=memory -std=gnu99 -D_DEFAULT_SOURCE -g -O0 -o "/home/pz65n8/vlang-rosie/modules/rosie/compiler_vm_backend/capture_test" "/tmp/v_1000/capture_test.tmp.c" -rdynamic -lm
Parse and compile: '"a"*' ----------------------------------------
1 - main.* - 0x701000001db0
2 - main.* - 0x701000001db0
3 - main.* - 0x701000001db0
Uninitialized bytes in __interceptor_write at offset 4 inside [0x702000000940, 28)
==705==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x511dac in _write_buf_to_fd /tmp/v_1000/capture_test.tmp.c:9881:12
    #1 0x511b20 in _writeln_to_fd /tmp/v_1000/capture_test.tmp.c:9862:3
    #2 0x510855 in eprintln /tmp/v_1000/capture_test.tmp.c:9791:3
    #3 0x6c7231 in rosie__compiler_vm_backend__test_05 /tmp/v_1000/capture_test.tmp.c:28163:2
    #4 0x706001 in main /tmp/v_1000/capture_test.tmp.c:30144:30
    #5 0x7fe6e7e21bf6 in __libc_start_main /build/glibc-S9d2JN/glibc-2.27/csu/../csu/libc-start.c:310
    #6 0x431819 in _start (/home/pz65n8/vlang-rosie/modules/rosie/compiler_vm_backend/capture_test+0x431819)

SUMMARY: MemorySanitizer: use-of-uninitialized-value /tmp/v_1000/capture_test.tmp.c:9881:12 in _write_buf_to_fd
Exiting

Again: a V string function (eprintln) failing while writing the content of the string. "use of uninitialized value".

Then an expert from the V community suggested it might be a stack overflow, and I should increase the limit. On WSL it a little tricky:

> ulimit -s
8192
> sudo prlimit --stack=unlimited --pid=$$; ulimit -s 196000
> ulimit -s
196000

But the output has not changed. No difference :(

Several hours later: ../v/v -cg -keepc -showcc -cc clang -cflags -fsanitize=address,undefined -cstrict modules/rosie/compiler_vm_backend/capture_test.v creates the following output

==1333==ERROR: AddressSanitizer: heap-use-after-free on address 0x602000003d10 at pc 0x0000004efd12 bp 0x7fffe0f7e2c0 sp 0x7fffe0f7da70
READ of size 6 at 0x602000003d10 thread T0
    #0 0x4efd11 in __asan_memcpy (/home/pz65n8/vlang-rosie/modules/rosie/compiler_vm_backend/capture_test+0x4efd11)
    #1 0x581f87 in vmemcpy /tmp/v_1000/capture_test.tmp.c:10261:10
    #2 0x57530c in string_clone /tmp/v_1000/capture_test.tmp.c:12195:3
    #3 0x62c925 in StrIntpData_get_fmt_format /tmp/v_1000/capture_test.tmp.c:13724:9
    #4 0x561073 in str_intp /tmp/v_1000/capture_test.tmp.c:14149:5
    #5 0x87e705 in rosie__compiler_vm_backend__test_05 /tmp/v_1000/capture_test.tmp.c:28147:12
    #6 0x8f5ce8 in main /tmp/v_1000/capture_test.tmp.c:29963:30
    #7 0x7fbd3cc21bf6 in __libc_start_main /build/glibc-S9d2JN/glibc-2.27/csu/../csu/libc-start.c:310
    #8 0x430f99 in _start (/home/pz65n8/vlang-rosie/modules/rosie/compiler_vm_backend/capture_test+0x430f99)

0x602000003d10 is located 0 bytes inside of 7-byte region [0x602000003d10,0x602000003d17)
freed by thread T0 here:
    #0 0x4f0c80 in __interceptor_free.localalias.0 (/home/pz65n8/vlang-rosie/modules/rosie/compiler_vm_backend/capture_test+0x4f0c80)
    #1 0x53d424 in _v_free /tmp/v_1000/capture_test.tmp.c:9976:3
    #2 0x57ce8e in string_free /tmp/v_1000/capture_test.tmp.c:13258:2
    #3 0x5ba054 in VAssertMetaInfo_free /tmp/v_1000/capture_test.tmp.c:10075:3
    #4 0x73c844 in main__cb_assertion_ok /tmp/v_1000/capture_test.tmp.c:22626:2
    #5 0x87d5a5 in rosie__compiler_vm_backend__test_05 /tmp/v_1000/capture_test.tmp.c:28129:3
    #6 0x8f5ce8 in main /tmp/v_1000/capture_test.tmp.c:29963:30
    #7 0x7fbd3cc21bf6 in __libc_start_main /build/glibc-S9d2JN/glibc-2.27/csu/../csu/libc-start.c:310

previously allocated by thread T0 here:
    #0 0x4f0e50 in malloc (/home/pz65n8/vlang-rosie/modules/rosie/compiler_vm_backend/capture_test+0x4f0e50)
    #1 0x54a572 in malloc_noscan /tmp/v_1000/capture_test.tmp.c:9881:9
    #2 0x575063 in string_clone /tmp/v_1000/capture_test.tmp.c:12193:29
    #3 0x7b7f57 in rosie__runtime_v2__Symbols_add /tmp/v_1000/capture_test.tmp.c:24726:51
    #4 0x88864f in rosie__compiler_vm_backend__Compiler_add_open_capture /tmp/v_1000/capture_test.tmp.c:28559:12
    #5 0x892d57 in rosie__compiler_vm_backend__Compiler_compile /tmp/v_1000/capture_test.tmp.c:28444:3
    #6 0x873228 in rosie__compiler_vm_backend__parse_and_compile /tmp/v_1000/capture_test.tmp.c:29455:21
    #7 0x87042c in rosie__compiler_vm_backend__prepare_test /tmp/v_1000/capture_test.tmp.c:27989:39
    #8 0x874727 in rosie__compiler_vm_backend__test_05 /tmp/v_1000/capture_test.tmp.c:28006:39
    #9 0x8f5ce8 in main /tmp/v_1000/capture_test.tmp.c:29963:30
    #10 0x7fbd3cc21bf6 in __libc_start_main /build/glibc-S9d2JN/glibc-2.27/csu/../csu/libc-start.c:310

Allocation looks good: #3 in the 3rd block But why is it freed while cleaning up an assertion? #4 in the 2nd block. That is strange. The object is definitely allocated long before the assertion. 28006 < 28129 That also explains heap use after free, because #5 in block 1 ist executed after the assertion. 28129 < 28147 The referenced statement line 28129 is main__cb_assertion_ok(&v_assert_meta_info__t10); The only aspect I don't understand with the meta data is v_assert_meta_info__t10.lvalue = _t9 /* typeof: v.ast.CTempVar type: ast.Type(0x14 = 20) */ ; What makes me suspecious is CTempVar which may mean shall be cleaned up afterwards

And yes, it does; try moving the assert condition in its own bool var before the assert And then asserting on the new temporary variable. And that did the trick.

Lessons learnt:

  • For now use only use simple expressions in assert statements (e.g. maps don't work either). In doubt, assign it to a temp variable before the assertion
  • Use clang and the sanitizers as shown above (or Valgrind. The output looks very similar). Concentrate on heap-use-after-free and heap-buffer-overflow errors. And if you see some indication that an assertion is involved when freeing the memory, then you know what to do.

Taking a 2nd look at the initial Valgrind output, it is pretty similar to clang with sanitizer, and already provided the necessary information. Only I was not yet able to relate it to V-lang and its assert statement.

Clone this wiki locally