Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ExternalModulesTests.testReturningAValue: segfault in math_sqrt() #70

Closed
saper opened this issue Jul 18, 2015 · 13 comments
Closed

ExternalModulesTests.testReturningAValue: segfault in math_sqrt() #70

saper opened this issue Jul 18, 2015 · 13 comments
Assignees

Comments

@saper
Copy link
Contributor

saper commented Jul 18, 2015

m> env CC=cc CXX=c++ python tests/tests.py ExternalModulesTests.testReturningAValue
E
======================================================================
ERROR: testReturningAValue (__main__.ExternalModulesTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests/tests.py", line 580, in testReturningAValue
    runTest(self, 'sqrt.asm', 1.73, 0, lambda o: round(float(o.strip()), 2))
  File "tests/tests.py", line 92, in runTest
    excode, output = run(compiled_path)
  File "tests/tests.py", line 84, in run
    raise ViuaCPUError('{0} [{1}]: {2}'.format(path, exit_code, output.decode('utf-8').strip()))
ViuaCPUError: ./tests/compiled/sample_asm_external_sqrt.asm.bin [-11]: 

----------------------------------------------------------------------
Ran 1 test in 4.106s

FAILED (errors=1)
m> env CC=cc CXX=c++ python tests/tests.py ExternalModulesTests.testReturningAValue
E
======================================================================
ERROR: testReturningAValue (__main__.ExternalModulesTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests/tests.py", line 580, in testReturningAValue
    runTest(self, 'sqrt.asm', 1.73, 0, lambda o: round(float(o.strip()), 2))
  File "tests/tests.py", line 92, in runTest
    excode, output = run(compiled_path)
  File "tests/tests.py", line 84, in run
    raise ViuaCPUError('{0} [{1}]: {2}'.format(path, exit_code, output.decode('utf-8').strip()))
ViuaCPUError: ./tests/compiled/sample_asm_external_sqrt.asm.bin [-11]: 

----------------------------------------------------------------------
Ran 1 test in 4.106s

FAILED (errors=1)


m> gdb ./build/bin/vm/cpu cpu.core 
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
Core was generated by `cpu'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/libc++.so.1...done.
Loaded symbols for /usr/lib/libc++.so.1
Reading symbols from /lib/libcxxrt.so.1...done.
Loaded symbols for /lib/libcxxrt.so.1
Reading symbols from /lib/libm.so.5...done.
Loaded symbols for /lib/libm.so.5
Reading symbols from /lib/libgcc_s.so.1...done.
Loaded symbols for /lib/libgcc_s.so.1
Reading symbols from /lib/libc.so.7...done.
Loaded symbols for /lib/libc.so.7
Reading symbols from ./math.so...done.
Loaded symbols for ./math.so
Reading symbols from /libexec/ld-elf.so.1...done.
Loaded symbols for /libexec/ld-elf.so.1
#0  0x0000000801565b19 in math_sqrt () from ./math.so
(gdb) bt
#0  0x0000000801565b19 in math_sqrt () from ./math.so
#1  0x000000000045e47a in CPU::excall (this=0x7fffffffd568, addr=0x801c16137 "<")
    at src/cpu/instr/linking.cpp:93
#2  0x000000000043afb8 in CPU::dispatch (this=0x7fffffffd568, addr=0x801c16126 "Q")
    at src/cpu/dispatch.cpp:223
#3  0x000000000042326b in CPU::tick (this=0x7fffffffd568) at src/cpu/cpu.cpp:251
#4  0x0000000000425870 in CPU::run (this=0x7fffffffd568) at src/cpu/cpu.cpp:374
#5  0x0000000000405c0e in main (argc=2, argv=0x7fffffffe8d0) at src/front/cpu.cpp:99
(gdb) frame 1
#1  0x000000000045e47a in CPU::excall (this=0x7fffffffd568, addr=0x801c16137 "<")
    at src/cpu/instr/linking.cpp:93
93      (*callback)(frame, 0, regset);
(gdb) list
88      /* FIXME: second parameter should be a pointer to static registers or
89       *        0 if function does not have static registers registered
90       * FIXME: should external functions always have static registers allocated?
91       */
92      ExternalFunction* callback = external_functions.at(call_name);
93      (*callback)(frame, 0, regset);
94  
95      // FIXME: woohoo! segfault!
96      Type* returned = 0;
97      bool returned_is_reference = false;

(gdb) print frame[0]
$2 = {return_address = 0x801c16137 "<", args = 0x801c0b160, regset = 0x801c0b180, 
  place_return_value_in = 2, resolve_return_value_register = false, 
  function_name = {<std::__1::__basic_string_common<true>> = {<No data fields>}, 
    __r_ = {<std::__1::__libcpp_compressed_pair_imp<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::__rep, std::__1::allocator<char>, 2>> = {<std::__1::allocator<char>> = {<No data fields>}, __first_ = {{__l = {__cap_ = 8303013083314482452, __size_ = 7631473, 
              __data_ = 0x0}, __s = {{__size_ = 20 '\024', __lx = 20 '\024'}, 
              __data_ = "math::sqrt", '\0' <repeats 12 times>}, __r = {__words = {
                8303013083314482452, 7631473, 0}}}}}, <No data fields>}, 
    static npos = <optimized out>}}
(gdb) print frame[1]
$3 = {return_address = 0x0, args = 0x801800248, regset = 0xf700000006, place_return_value_in = -60, 
  resolve_return_value_register = 255, 
  function_name = {<std::__1::__basic_string_common<true>> = {<No data fields>}, 
    __r_ = {<std::__1::__libcpp_compressed_pair_imp<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::__rep, std::__1::allocator<char>, 2>> = {<std::__1::allocator<char>> = {<No data fields>}, __first_ = {{__l = {__cap_ = 18446744073709551615, 
              __size_ = 18446744073709551615, 
              __data_ = 0xfffffffffffffff <Address 0xfffffffffffffff out of bounds>}, __s = {{
                __size_ = 255 'ÿ', __lx = -1 'ÿ'}, __data_ = 'ÿ' <repeats 22 times>, "\017"}, __r = {
              __words = {18446744073709551615, 18446744073709551615, 
                1152921504606846975}}}}}, <No data fields>}, static npos = <optimized out>}}

(gdb) print regset->registers
$6 = (Type **) 0x801c18800
(gdb) print regset->registers[0]
$7 = (Type *) 0x0
(gdb) print regset->registers[1]
$8 = (Type *) 0x801c0b0e0
(gdb) print regset->registers[2]
$9 = (Type *) 0x0
(gdb) print regset->registers[3]
$10 = (Type *) 0x0
@marekjm marekjm added the bug label Jul 18, 2015
@marekjm marekjm self-assigned this Jul 18, 2015
@saper
Copy link
Contributor Author

saper commented Jul 19, 2015

This is because dynamic_cast at https://github.com/marekjm/wudoovm/blob/23b5e76dbc6abf25152b830f4f8b1e4f7cb61db4/sample/asm/external/math.cpp#L12 fails:

Program received signal SIGSEGV, Segmentation fault.
0x0000000801565b19 in math_sqrt (frame=0x801c09fc0) at ./sample/asm/external/math.cpp:13
13      float square_root = sqrt(flt->value());
(gdb) bt
#0  0x0000000801565b19 in math_sqrt (frame=0x801c09fc0) at ./sample/asm/external/math.cpp:13
#1  0x000000000045e3da in CPU::excall (this=0x7fffffffd548, addr=0x801c16137 "<")
    at src/cpu/instr/linking.cpp:93
#2  0x000000000043af18 in CPU::dispatch (this=0x7fffffffd548, addr=0x801c16126 "Q")
    at src/cpu/dispatch.cpp:223
#3  0x00000000004231cb in CPU::tick (this=0x7fffffffd548) at src/cpu/cpu.cpp:251
#4  0x00000000004257d0 in CPU::run (this=0x7fffffffd548) at src/cpu/cpu.cpp:374
#5  0x0000000000405c0e in main (argc=2, argv=0x7fffffffe8b8) at src/front/cpu.cpp:99
(gdb) print flt
$1 = (Float *) 0x0

@saper
Copy link
Contributor Author

saper commented Jul 19, 2015

Somewhat related info: https://gcc.gnu.org/faq.html#dso

@saper
Copy link
Contributor Author

saper commented Jul 19, 2015

libstdc++ works it around by applying vague linkage and checking typeinfo symbols to find out if objects are to be treated the same.

But whenever I am using clang C++ runtime library, it does not work, apparently because (https://llvm.org/bugs/show_bug.cgi?id=10178#c12) clang toolkit does compare types by address, not by typeinfo contents. It may also be the problem with https://github.com/pathscale/libcxxrt used by FreeBSD.

There is a One Definition Rule in the [http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3797.pdf](C++11 standard %28liked n3337%29), but my reading of paragraph 3.2 allows for the case we have here.

Until this is researched properly it looks like creating plugins using wudoovm types may be not portable.

@saper
Copy link
Contributor Author

saper commented Jul 19, 2015

Contents of relevant sections of build/bin/vm/cpu:

(from nm(1))

0000000000477770 V _ZTI5Float
0000000000477768 V _ZTS5Float
0000000000477710 V _ZTV5Float

From objdump -s --start=0x477710 --stop=0x47778f ./build/bin/vm/cpu

Contents of section .rodata:
 477710 00000000 00000000 70774700 00000000  ........pwG.....
 477720 c0cc4600 00000000 40cd4600 00000000  ..F.....@.F.....
 477730 e0a64300 00000000 00d24600 00000000  ..C.......F.....
 477740 60974300 00000000 e09a4300 00000000  `.C.......C.....
 477750 30d24600 00000000 a0d24600 00000000  0.F.......F.....
 477760 c0d24600 00000000 35466c6f 61740000  ..F.....5Float..
 477770 70576800 00000000 68774700 00000000  pWh.....hwG.....
 477780 28674700 00000000 00000000 00000000  (gG.............

@marekjm
Copy link
Owner

marekjm commented Jul 19, 2015

Well, the thing I am about to suggest may be stupid but...

How about instead of relying on standard library's typechecking abilities we use machine's own?

The object that function math::sqrt receives as a parameter is sure to provide type() function which can be used to inspect its type (by string comparison). If it is not Float - throw exception.
If it is - use static_cast<Float*>(...) to cast down the inheritance hierarchy (static_cast allows downcasting).

My reasoning is that if static_cast allows downcasting it can be used to convert Type* to Float*, and if it does not perform any runtime type-checking, we will get a non-zero pointer to our desired type.

What do you think?

@saper
Copy link
Contributor Author

saper commented Jul 19, 2015

@marekjm
Copy link
Owner

marekjm commented Jul 19, 2015

Kinda. Objects in Viua provide virtual type() method which returns name of their type as a std::string. It can be used to inspect real type of passed Type pointer.

Could you check if commit ce04c1d works for you?

@saper
Copy link
Contributor Author

saper commented Jul 19, 2015

Yes, the test passes. Still trying to find out the cause for this (I think I've had some strange dynamic_cast problems in some other project)

@marekjm
Copy link
Owner

marekjm commented Jul 19, 2015

OK, so "the failing test" part of the issue is fixed.

I will leave the issue open, though, and look for info why this should fail with dynamic_cast.

@saper
Copy link
Contributor Author

saper commented Jul 21, 2015

Here is a little piece of code to test whether dynamic_cast<> passes through the runtime linker. It seems it doesn't on my FreeBSD:

https://github.com/saper/dynamic_cast

Output on Linux:

Parent::something() = 1
Child::something() = 2
library dynamic_cast<Child *>(Parent *) = (nil)
library dynamic_cast<Child *>(Child *) = 0x7fffb2f50a60
local dynamic_cast<Child *>(Parent *) = (nil)
local dynamic_cast<Child *>(Child *) = 0x7fffb2f50a60

Output on FreeBSD:

Parent::something() = 1
Child::something() = 2
library dynamic_cast<Child *>(Parent *) = 0x0
library dynamic_cast<Child *>(Child *) = 0x0
local dynamic_cast<Child *>(Parent *) = 0x0
local dynamic_cast<Child *>(Child *) = 0x7fffffffe710

More precisely, it also works on FreeBSD when compiling with g++ and using newer libstdc++.so.6 - tested with gcc 4.8 and 4.9, the one included with gcc 4.2.1 is too old.

@saper
Copy link
Contributor Author

saper commented Jul 23, 2015

saper added a commit to saper/wudoovm that referenced this issue Jul 23, 2015
When we pass -Wl,--dynamic-list-cpp-typeinfo (or -Wl,--export-dynamic)
then the run time type information for C++ classes is exported
from the program ("cpu", "dis" etc.) to any dynamically shared
libraries. Therefore C++ type definitions are available and
consistent with what the plugin may use.

We can even produce a specific list if desired, by using -Wl,--dynamic-list

More information:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=201784

Big thanks to David Chisnall (@davidchisnall) for helping
to point this out.

Fixes: marekjm#70
marekjm pushed a commit that referenced this issue Jul 25, 2015
When we pass -Wl,--dynamic-list-cpp-typeinfo (or -Wl,--export-dynamic)
then the run time type information for C++ classes is exported
from the program ("cpu", "dis" etc.) to any dynamically shared
libraries. Therefore C++ type definitions are available and
consistent with what the plugin may use.

We can even produce a specific list if desired, by using -Wl,--dynamic-list

More information:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=201784

Big thanks to David Chisnall (@davidchisnall) for helping
to point this out.

Fixes: #70
@marekjm
Copy link
Owner

marekjm commented May 13, 2016

I'm closing the issue.

Thanks for bringing up that there were some issues with the VM running on FreeBSD.
I'll try to allocate some time to setting up a FreeBSD instance in VirtualBox for testing.

@marekjm marekjm closed this as completed May 13, 2016
@saper
Copy link
Contributor Author

saper commented May 14, 2016

It actually turns out not to be a bug and can be mitigated by controlling dynamically exported symbols, which is a good thing that helps to keep stable APIs between modules anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants