-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[example] BulletBouncingBoxDynamic.py segfaults #24
Comments
I got the error by using the ASAN sanitizer with gcc (you have to compile siconos with -DUSE_SANITIZER=asan to activate with gcc or clang). Otherwise try ``siconos -P valgrind BulletBouncingBoxDynamic.cpp'' Below is the first error. There are quite a few, so I won't copy them here, but I can mail you the full log. ==2218== Invalid read of size 8 This bug may not manifest itself (maybe more often when called from python), but there is definitevely an issue that may bit us. Also there is an issue with the Newton loop and the result does not match the reference file as shown here: http://cdash-bipop.inrialpes.fr/testDetails.php?test=329006&build=18034 |
Ok, Valgrind reports some reachable memory in BulletBouncingBoxDynamics.cpp but not in BulletBouncingBox.cpp, so there may be something dangling there. If I manage to get ASAN running I'll look more into it. |
I'm not sure that this issue is still present, but the examples are not running on cdash. @radarsat1 did you look into this? |
The example works. At least, It does not produce a segfault, but the output is rather different from the reference file. |
These examples are out of date. The "normal" version produces ok output, but there appears something wrong with the dynamic output. (Which tests adding one DS to the graph at step 100.) In both cases the reference files need updating, probably due to changes in the integrator/collision detection since a few months ago. But also both examples use BulletSpaceFilter, I'd like to update them to use SiconosBulletCollisionManager instead. |
I have updated the examples and I'm deciding whether to update the reference file, so I am examining the differences. If I use a SiconosPlane positioned exactly where the original reference file comes to rest (which curiously is at ~1.04, allowing for some penetration which is ok I guess), the global view is more or less fine: However a zoomed-in view near the time of settling reveals some differences: In particular something to notice is that the bounce positions of the red line (new behaviour) are closer to the resting position. I propose that this is due to the changes regarding where updateInteractions() is called (i.e. collision detection is performed during the Newton loop, instead of before.) It might be worth testing if this is the cause, but there are sufficient changes to the integrator that this would be difficult, and I think the new behaviour is desired, so I'll just update the reference file to reflect the change if there are no arguments. However, I will also take the opportunity to update the example to put the "ground" such that the bouncing position is at 0, and change it for SiconosPlane instead of a box. As for the Dynamic version, the behaviour is sufficiently complex that I'm not sure even having a reference file is worth it, but I'll update it to match. It appears to be more or less the same as before in global behaviour, comparable differences only at small scale. |
Okay something strange. I was just investigating why some of the peaks in the last graph above are missing, and in fact I don't understand. In BulletBouncingBox, there is a series of "if" statements checking that there are exactly 4 contacts, etc, and only collects the lambda(1) value if they are all true. But, weirdly, removing all conditions, I changed the code to,
and I still get missing impulses: But if I change the code to collect p() instead (which is just the sum of all lambda() over the DS, after change of reference frame, right?) then I get:
I get the expected graph (with a different amplitude of course): Anyways, I will push the code but I guess I will not push new reference files until this is understood. |
As I thought, it turns out the missing impulses are due to how I changed So, this brings up the issue of whether this is correct. I thought it should be controversial to be adding and removing interactions during the Newton loop. What happens here is that if there is another iteration of the Newton loop, and the contacts are resolved, then positions are updated and Bullet is called again, which removes the Interactions. So they cannot be found by traversing the indexSet1. Two possible solutions:
|
We should be able to keep the list of interactions within the Newton loop but to be able to update the values of the contact point and local frame at contact when an update of the position is made during the newton loop. |
Yes okay, it makes sense, it should be more robust, only it will be a bit of work to implement but I'll give it a try. Do you think the contact engine should be allowed to add new contacts during the Newton loop, or only at the beginning? In any case I'll add a parameter to the InteractionManager indicating whether it is being called at the beginning or in the middle of the loop. |
To be clear, I should mention that the reason I was inspired to add updateInteractions() in the middle of the Newton loop is because previously there was already a call to updateWorldFromDS there, to update the collision engine data structures after MoreauJeanOSI::updatePosition, however the collision engine was not subsequently being called. So I had the choice of either removing the useless updateWorldFromDS, or adding a call to the collision engine. It seemed to me a good idea to update the broadphase whenever updateState/updatePosition was called, but perhaps not. Better to just adjust the contact points according to the new DS positions, without performing new a detection sweep. Not sure how it will impact rolling/sliding/contact behaviour, I will have to test. |
I think we should be able to parametrize the behavior:
It is possible to update the relation calling Bullet without calling the broad page an changing the list of interactions. We also have to work with bullet to stabilize the list of contact points. It is nice but also quite expensive in terms of memory management and convergence of nsgs. |
I agree that configurable is the way to go, at least until some empirical or theoretical conclusions can be made. I'm not completely convinced that the "no update" idea is valid, it seems to me that if q() changes, then the contact point locations must change as well -- I assumed this was previously a source of bugs. Nonetheless it is easy to add, so it can be an option. (Idea: Perhaps we should store contact points in NewtonEulerFrom1DLocalFrameR as relative to q(), instead of absolute coordinates!) I'll look into finer control with Bullet, otherwise moving the contact points and distance shouldn't be too difficult, I was just avoiding it. In fact I think it might be worth providing at the SiconosCollisionManager level instead of in SiconosBulletCollisionManager, so that the same logic can easily be applied to other collision engines (eventually). The calculations themselves could go in NewtonEulerFrom1DLocalFrameR or NewtonEulerFrom3DLocalFrameR. |
Great. I trust you for this idea... and the implementation. |
Ok there is an initial implementation in 01a06f2. Please review. It seems to generate the impulses for BulletBouncingBox as expected and with my testing so far it doesn't introduce errors. Basically I add _relPc1 and _relPc2 to NewtonEulerFrom1DLocalFrame and use these to update Pc1 and Pc2 in a default implementation of computeh(). BulletR also calls this before calculating a new contact distance. Good results so far with slide_roll.py. I will continue to test. |
I found that cube_directproj.py was broken with is update. It's an interesting example.. it seems the calculations for updating the contact points were correct, but since it uses such a large timestep, only the corner of the cube enters the surface and it needs to add more contact points via the intra-Newton updateInteractions for it to be stable, otherwise it starts to oscillate instead of sitting steady on the surface. This (a) shows a use of having this behaviour optional, and (b) would be an interesting test case for add-Interactions only behaviour. (i.e, the broadphase is called but we don't remove contact points, only add them -- this could lead to runaway behaviour where too many Interactions are generated, but in combination with Bullet's "breaking threshold" which should block generation of new contact points within a certain distance, i believe it should stabilize as updates get closer together.) Anyways, fixed in 67ba02b |
The one reference file that is still not matching very well is NE_3DS_3Knee_1Prism_GMP.cpp, which gives a very close match (7.867185e-05) but far enough to fail the test. After a git bisect I determined that this started happening in the commit where I fixed the one-DS case for PrismaticJointR. e28c8e2 Looking at the reference file it appears that there is some very small movement in beam3 in one of the constrained axes of the prismatic joint, but it is exactly zero in the reference file. I will investigate before closing this. |
I am not sure whether to be concerned or not, but I will put a small analysis of what I have understood so far. Basically my changes to PrismaticJointR calculations seem to have introduced a small-amplitude (1e-17) change in the response of one axis in NE_3DS_3Knee_1Prism_GMP. All other axes are identical to before. The image below shows how axis 15 of the dat/ref files (corresponding to the X position of beam3) compared before and after my changes. Before it was exactly zero, and it seems that my modifications to PrismaticJointR introduced some play in that axis. This changes a bit if you re-orient the axis, e.g. the 4th plot here is the same as the 3rd one, but changed I can't really explain this except that I think it must come from the interaction of this joint with this particular kinematic configuration. It may be that the good behaviour in the GMP case is the exception and not the rule. As evidence, I include the ref files for MLCP and MLCP_MoreauJeanCombinedProjection cases, which also show some play on that axis. What I do find concerning is that if this is just a question of precision, it's strange that no amount of increasing the tolerance requirements of the solvers seems to improve it. Additionally, if I decrease the time step, the signal still looks the same, which indicates to me that it is a "real" simulation artifact, i.e., something wrong with the equality, and not just a precision issue. |
After reviewing more details of this final issue, it seems that I was wrong about the 15th column containing all the differences. In fact, if you calculate the norm of the difference for each column of the ref individually, one gets an error of 1e-5 for the new version. If you change the computation of jachq to closer to what it was before, by setting the elements of jachq(0,3-6) and jachq(1,3-6) to zero (c.f. old PrismaticJointR.cpp, you get an error of 1e-14 for each column. In other words the change with respect to the reference is a true reflection of the new computations in jachq that take into account the orientation with respect to change in position. I still don't know why there is more "play" in beam3-X than previously, but at an amplitude of 1e-17 it does not contribute significantly to the difference with respect to the ref file, even if it has an interesting structure. So, I would conclude here that updating the ref is the correct action to take. |
Fine. I trust you for the updating of the ref file. Concerning the error of amplitude 1e-17, it may perhaps depends how the constraints is taken into account into the GMP solvers and especially its order. |
See discussion in issue #24, problems with missing impacts were resolved.
Looks similar to the issue fixed in 83ce528
==28140==ERROR: AddressSanitizer: SEGV on unknown address 0x7fc453000053 (pc 0x7fc431d7d627 sp 0x7ffe34e177d0 bp 0x000000f16b40 T0)
#0 0x7fc431d7d626 in btCollisionWorld::~btCollisionWorld() (/usr/lib64/libBulletCollision.so.2.83+0x6f626)
#1 0x7fc431d7d6b8 in btCollisionWorld::~btCollisionWorld() (/usr/lib64/libBulletCollision.so.2.83+0x6f6b8)
#2 0x7fc4179ba28b in void boost::checked_delete(btCollisionWorld_) /usr/include/boost/core/checked_delete.hpp:34
#3 0x7fc4179c762e in boost::detail::sp_counted_impl_p::dispose() /usr/include/boost/smart_ptr/detail/sp_counted_impl.hpp:78
#4 0x7fc4179180bd in boost::detail::sp_counted_base::release() /usr/include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp:146
#5 0x7fc417918304 in boost::detail::shared_count::~shared_count() /usr/include/boost/smart_ptr/detail/shared_count.hpp:443
#6 0x7fc41795e0ff in boost::shared_ptr::~shared_ptr() /usr/include/boost/smart_ptr/shared_ptr.hpp:323
#7 0x7fc417962edd in BulletSpaceFilter::~BulletSpaceFilter() /home/xhub/siconos/mechanics/src/contactDetection/bullet/BulletSpaceFilter.hpp:26
#8 0x7fc417962fab in BulletSpaceFilter::~BulletSpaceFilter() /home/xhub/siconos/mechanics/src/contactDetection/bullet/BulletSpaceFilter.hpp:26
#9 0x7fc4179be577 in void boost::checked_delete(BulletSpaceFilter_) /usr/include/boost/core/checked_delete.hpp:34
#10 0x7fc4179c612e in boost::detail::sp_counted_impl_p::dispose() /usr/include/boost/smart_ptr/detail/sp_counted_impl.hpp:78
#11 0x7fc4179180bd in boost::detail::sp_counted_base::release() /usr/include/boost/smart_ptr/detail/sp_counted_base_gcc_x86.hpp:146
#12 0x7fc417918304 in boost::detail::shared_count::~shared_count() /usr/include/boost/smart_ptr/detail/shared_count.hpp:443
#13 0x7fc41791a2c7 in boost::shared_ptr::~shared_ptr() /usr/include/boost/smart_ptr/shared_ptr.hpp:323
#14 0x7fc4178fbfa5 in _wrap_delete_BulletSpaceFilter /build/siconos/wrap/siconos/mechanics/contact_detection/bulletPYTHON_wrap.cxx:115918
#15 0x7fc4385762a2 in PyObject_Call (/usr/lib64/libpython2.7.so.1.0+0x4c2a2)
#16 0x7fc438576b33 in PyObject_CallFunctionObjArgs (/usr/lib64/libpython2.7.so.1.0+0x4cb33)
#17 0x7fc417668c9f in SwigPyObject_dealloc /build/siconos/wrap/siconos/mechanics/contact_detection/bulletPYTHON_wrap.cxx:1721
#18 0x7fc4385ac49a (/usr/lib64/libpython2.7.so.1.0+0x8249a)
#19 0x7fc4385c8363 (/usr/lib64/libpython2.7.so.1.0+0x9e363)
#20 0x7fc4385ab53e (/usr/lib64/libpython2.7.so.1.0+0x8153e)
#21 0x7fc4385acfaf (/usr/lib64/libpython2.7.so.1.0+0x82faf)
#22 0x7fc4385b190b in _PyModule_Clear (/usr/lib64/libpython2.7.so.1.0+0x8790b)
#23 0x7fc438621dfa in PyImport_Cleanup (/usr/lib64/libpython2.7.so.1.0+0xf7dfa)
#24 0x7fc43862d82d in Py_Finalize (/usr/lib64/libpython2.7.so.1.0+0x10382d)
#25 0x7fc43863ee43 in Py_Main (/usr/lib64/libpython2.7.so.1.0+0x114e43)
#26 0x7fc437f8c6df in __libc_start_main (/lib64/libc.so.6+0x206df)
#27 0x400758 in _start (/usr/bin/python2.7+0x400758)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV ??:0 btCollisionWorld::~btCollisionWorld()
==28140==ABORTING
/build/siconos/scripts/siconos: /home/xhub/siconos/examples/Mechanics/ContactDetection/BulletBouncingBox/BulletBouncingBoxDynamic.py failed, Command '['python', '/home/xhub/siconos/examples/Mechanics/ContactDetection/BulletBouncingBox/BulletBouncingBoxDynamic.py']' returned non-zero exit status 1
The text was updated successfully, but these errors were encountered: