-
Notifications
You must be signed in to change notification settings - Fork 16.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AP_HAL_SITL: disable SITL stack checking due to memory corruption #13276
AP_HAL_SITL: disable SITL stack checking due to memory corruption #13276
Conversation
There appears to be issues with pthread_attr_setstack.
I'm pretty hesitant about this. I can easily believe it's a platform dependent problem, and should be disabled on some, but since both scripting and FTP use threads (FTP even uses a 1K thread as well) and they both seem to be reliable on a Linux system I don't think we should kill this entirely. Stack corruption is just to insidious to justify turning off any protection, even if it's as weak as this. |
@WickedShell the failure is on Linux. I can't reproduce this on macos or windows. |
@WickedShell the failure is on Linux. I can't reproduce this on macos or windows. Also currently this makes some of the tests incredibly unreliable - so unless we can fix it it's causing more problems than it's worth. Note that I don't think this stops our stack overflow protection from working so we are covered from that perspective. |
I think we should apply this now. I would like to find a way to do the stack checking reliably in SITL, but for now the benefit of valgrind working properly is worth the change |
Wait, what's the valgrind failure? Valgrind is running fine for me without any problems with the stack checking. If I'm reading the related issues wrong this is a problem with the vagrant enviorment more then anything else. FRSky D in SITL (without anything else happening) works in valgrind for me (with the stack stuff). I can't run the autotests due to the test being incompatible with python3, so I'll have to come back to that. This check is quite nice and valuable in for both scripting and FTP when I've been working on it which is why I'm pretty opposed to losing it. |
@WickedShell I did not realize @peterbarker had removed the stack check as well. For me this didn't appear necessary for the test to pass. That said I see wide discrepencies between stack required in SITL and stack required on an STM32 - so it's going to be necessary to have SITL specific stacks everywhere if we go this route (I'm still guessing this is 32bit vs 64bit OS) which doesn't seem very useful. I think it would be more useful if we did the test but simply reported the stack usage at the end. |
@WickedShell @peterbarker @tridge #13253 now passes with this merged. The only change there is to increase the stack size of the frsky thread |
On Wed, 15 Jan 2020, Andy Piper wrote:
@WickedShell @peterbarker @tridge #13253 now passes with this merged. The only change there is to increase the stack size of the frsky thread
Does that need to be done now, 'though?
|
On Wed, 15 Jan 2020, Andy Piper wrote:
@WickedShell I did not realize @peterbarker had removed the stack check
as well. For me this didn't appear necessary for the test to pass. That
The stack checking is pointless without the setstack FWIU.
|
On Wed, 15 Jan 2020, WickedShell wrote:
Wait, what's the valgrind failure? Valgrind is running fine for me without any problems with the stack checking. If I'm reading the related issues wrong
this is a problem with the vagrant enviorment more then anything else. FRSky D in SITL (without anything else happening) works in valgrind for me (with
the stack stuff). I can't run the autotests due to the test being incompatible with python3, so I'll have to come back to that.
This check is quite nice and valuable in for both scripting and FTP when I've been working on it which is why I'm pretty opposed to losing it.
My understanding is that whether you see the issue or not comes down to
whether your implentation of pthread_attr_setstack takes into account
stack guard pages or not - the semantics have changed over time.
This did cause Valgrind failures.
But it caused out-and-out aborts for me without Valgrind in play as libc
realised its memory allocation metadata was corrupt. It was also crashing
in CI - with the same symptoms you would expect for that sort of crash.
We don't want to lose it. But it has caused Andy a rather large amount of
pain, and we do NOT want to annoy developers with flawed SITL features.
|
No - with your change there is no need to increase the stack in SITL. Everything works for me on macos. |
There appears to be issues with pthread_attr_setstack.
@andyp1per