Windows test failures with c++ exception: "Access violation - no RTTI data!" #1522

Closed
2 tasks done
scpeters opened this issue Oct 19, 2020 · 9 comments · Fixed by #1541
Labels
type: bug Indicates an unexpected problem or unintended behavior

Comments

@scpeters
Collaborator

Bug Report

  • I checked the documentation and the forum but found no answer.
  • I checked to make sure that this issue has not already been filed.

Environment

  • DART version: master
  • OS name and version name(or number): Windows 10
  • Compiler name and version number: Visual Studio 2019

Expected Behavior

All tests that pass on Linux and macOS should pass on Windows.

Current Behavior

Certain tests fail on Windows with the following console message:

unknown file: error: C++ exception with description "Access violation - no RTTI data!" thrown in the test body.

Steps to Reproduce


  1. Re-enable the tests that were disabled in dc4056a
  2. Run the tests on Windows
  3. Observe the failures

Code to Reproduce

I've enabled some tests in scpeters@685e00c (branch scpeters:windows_ci_debug) that were previously disabled in #1513 (dc4056a), and the failures can be seen in the GitHub actions CI output.

@scpeters scpeters added the type: bug Indicates an unexpected problem or unintended behavior label Oct 19, 2020
@jslee02
Member

jslee02 commented Oct 19, 2020

Thank you for the report! Yeah, the tests were disabled mostly due to this issue. One example is this line:

bn->createShapeNodeWith<
    dart::dynamics::VisualAspect,
    dart::dynamics::CollisionAspect,
    dart::dynamics::DynamicsAspect>(
    std::make_shared<dart::dynamics::BoxShape>(Eigen::Vector3d::Ones()));

It seems the access violation happens when a composite is dynamic_cast to the concrete type (e.g., FixedFrame) in its constructor, at this line:

mComposite = dynamic_cast<CompositeType*>(newComposite);

I'm not sure whether this is a bug in MSVC or whether the implementation doesn't comply with the standard. I will keep investigating this.

@scpeters
Collaborator Author

yes, I just collected a backtrace that implicates createShapeNodeWith and that dynamic_cast in Aspect.hpp

@jslee02
Member

jslee02 commented Oct 25, 2020

Hm, I wasn't able to figure out the solution to this problem. Maybe @mxgrey could shed some light here. Until then let me find a workaround to avoid the dynamic_cast in the constructor.

@mxgrey
Member

mxgrey commented Oct 26, 2020

If en.cppreference.com's explanation of The Standard can be trusted, I think this is an error in MSVC:

  1. When dynamic_cast is used in a constructor or a destructor (directly or indirectly), and expression refers to the object that's currently under construction/destruction, the object is considered to be the most derived object. If new-type is not a pointer or reference to the constructor's/destructor's own class or one of its bases, the behavior is undefined.

As far as I can tell, the problematic cases are being called from the constructors of classes that the dynamic_cast is trying to cast newComposite into. This should be perfectly acceptable, unless I'm misunderstanding something.
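
The case described above can be sketched in isolation. This is a minimal illustration, not DART's actual code: `Composite`, `Aspect`, `Frame`, and `FixedFrame` here are hypothetical stand-ins, and only the `dynamic_cast` line mirrors the one quoted earlier. The cast is made from inside `Frame`'s constructor and its target is `Frame` itself, which is the situation the cppreference text treats as defined.

```cpp
#include <cassert>

// Hypothetical stand-in for the polymorphic composite base class.
struct Composite {
  virtual ~Composite() = default;
};

// Hypothetical stand-in for an aspect that stores a typed composite pointer.
template <class CompositeType>
struct Aspect {
  CompositeType* mComposite = nullptr;

  // Mirrors the quoted line: downcast the generic composite pointer.
  void setComposite(Composite* newComposite) {
    mComposite = dynamic_cast<CompositeType*>(newComposite);
  }
};

struct Frame : Composite {
  Aspect<Frame> aspect;

  Frame() {
    // `this` is still under construction here. The cast target (Frame) is
    // the constructor's own class, so the cast is well-defined.
    aspect.setComposite(this);
  }
};

// While Frame() runs for a FixedFrame, the object is treated as a Frame.
struct FixedFrame : Frame {};

// Returns true if the cast made during construction yielded the Frame
// subobject of the finished FixedFrame.
bool castSucceedsDuringConstruction() {
  FixedFrame f;
  return f.aspect.mComposite == static_cast<Frame*>(&f);
}
```

Calling `castSucceedsDuringConstruction()` from a test or from `main` is expected to return `true` on a conforming compiler. Whether this reduced form triggers the MSVC failure seen in DART has not been checked here.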

But supposing this is an MSVC problem that we need to accommodate, the only solution I can think of is to refactor every constructor (or at least the constructors that are problematic) to make them private/protected and replace them with an equivalent T::create(...) function that constructs the class and then adds the aspects after construction is complete. That should be a guaranteed way to fix this problem, but it does break the API.
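
A rough sketch of that `T::create(...)` idea follows. All names are hypothetical and the real DART classes are more involved; the point is only that the `dynamic_cast` runs after the constructor has returned, so the object is no longer under construction when the aspect is attached.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the polymorphic composite base class.
struct Composite {
  virtual ~Composite() = default;
};

// Hypothetical stand-in for an aspect that stores a typed composite pointer.
template <class CompositeType>
struct Aspect {
  CompositeType* mComposite = nullptr;

  void setComposite(Composite* newComposite) {
    mComposite = dynamic_cast<CompositeType*>(newComposite);
  }
};

class Frame : public Composite {
public:
  // Hypothetical factory replacing the public constructor.
  static std::shared_ptr<Frame> create() {
    std::shared_ptr<Frame> frame(new Frame());
    // Construction is complete at this point, so the cast inside
    // setComposite() no longer runs on an object under construction.
    frame->mAspect.setComposite(frame.get());
    return frame;
  }

  // True once the aspect points back at this object.
  bool hasComposite() const { return mAspect.mComposite == this; }

protected:
  // No longer public: callers must go through create().
  Frame() = default;

private:
  Aspect<Frame> mAspect;
};
```

With this shape, `Frame::create()->hasComposite()` should hold, while code that previously wrote `Frame f;` or `std::make_shared<Frame>()` no longer compiles, which is the API break mentioned above.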

@scpeters
Collaborator Author

if we want to report it as a compiler bug, we should try to create a minimal reproduction example

@traversaro
Contributor

traversaro commented Nov 10, 2020

If the problem is indeed due to MSVC, a possible workaround on Windows could be to build dartsim, and any library that uses dartsim's headers, with clang-cl. This would still give us MSVC-compatible binaries, and would allow MSVC to be used again for any downstream project that does not directly include dartsim's headers. For example, we would need to compile dartsim and ignition-physics with clang-cl, but an MSVC-built program could then link to ign-physics without problems and run using the clang-cl-compiled ign-physics plugin.

@traversaro
Contributor

However, this needs to be tested: the problem may be linked to how MSVC handles RTTI data, in which case clang-cl could be affected as well.

@traversaro
Contributor

traversaro commented Nov 11, 2020

It was worth a try, but apparently with clang-cl 10 test_Aspect fails even earlier:

(dome-dev) C:\src\dome-dev\workspace\build\DART>ctest -VV -R test_Aspect
UpdateCTestConfiguration  from :C:/src/dome-dev/workspace/build/DART/DartConfiguration.tcl
UpdateCTestConfiguration  from :C:/src/dome-dev/workspace/build/DART/DartConfiguration.tcl
Test project C:/src/dome-dev/workspace/build/DART
Constructing a list of tests
Done constructing a list of tests
Updating test list for fixtures
Added 0 tests to meet fixture requirements
Checking test dependency graph...
Checking test dependency graph end
test 1
    Start 1: test_Aspect

1: Test command: C:\src\dome-dev\workspace\build\DART\unittests\unit\test_Aspect.exe
1: Test timeout computed to be: 10000000
1: Running main() from C:\src\dome-dev\workspace\src\dart\unittests\gtest\src\gtest_main.cc
1: [==========] Running 9 tests from 1 test case.
1: [----------] Global test environment set-up.
1: [----------] 9 tests from Aspect
1: [ RUN      ] Aspect.Generic
1: [       OK ] Aspect.Generic (1 ms)
1: [ RUN      ] Aspect.Specialized
1: [       OK ] Aspect.Specialized (0 ms)
1: [ RUN      ] Aspect.Releasing
1: [       OK ] Aspect.Releasing (0 ms)
1: [ RUN      ] Aspect.StateAndProperties
1: [       OK ] Aspect.StateAndProperties (2 ms)
1: [ RUN      ] Aspect.Construction
1: [       OK ] Aspect.Construction (0 ms)
1: [ RUN      ] Aspect.Joints
1/1 Test #1: test_Aspect ......................Exit code 0xc0000374
***Exception:   0.07 sec

0% tests passed, 1 tests failed out of 1

Total Test time (real) =   0.10 sec

The following tests FAILED:
          1 - test_Aspect (Exit code 0xc0000374
)
Errors while running CTest

@traversaro
Contributor

Hopefully this can be fixed by #1540. The fix was actually already proposed in #1431, but I had not read that issue in detail. According to the documentation (https://docs.microsoft.com/en-us/cpp/build/reference/vd-disable-construction-displacements?view=msvc-160), the effect of the /vd2 option is:

Allows you to use dynamic_cast Operator on an object being constructed. For example, a dynamic_cast from a virtual base class to a derived class.

So this indeed could be related.
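
For reference, the kind of cast the quoted documentation describes (a `dynamic_cast` from a virtual base to a derived class, on an object still being constructed) looks roughly like the sketch below. `Base`, `Middle`, and `Derived` are hypothetical names, and it is an assumption that this shape resembles the DART classes involved. Whether this exact snippet misbehaves under MSVC without /vd2 has not been verified here.

```cpp
#include <cassert>

// Hypothetical polymorphic virtual base.
struct Base {
  virtual ~Base() = default;
};

struct Middle : virtual Base {
  Middle* seenInCtor = nullptr;

  Middle() {
    // The virtual base is already constructed, so this conversion is fine.
    Base* asBase = this;
    // Cast from the virtual base back to the constructor's own class while
    // the object is still under construction.
    seenInCtor = dynamic_cast<Middle*>(asBase);
  }
};

// Constructing a Derived runs Middle() on a partially built object.
struct Derived : Middle {};

// Returns true if the cast made inside Middle() found the Middle subobject.
bool virtualBaseCastSucceeds() {
  Derived d;
  return d.seenInCtor == static_cast<Middle*>(&d);
}
```

On a conforming compiler `virtualBaseCastSucceeds()` is expected to return `true`, since the target type is the constructor's own class.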
