[sensorfwd] Skip deadlocking libgbinder cleanup. Fixes JB#60298 #7
That's typically a symptom of a reference (and hence memory) leak.
While there seems to be some reference counting wobble with auto-release objects, Or at least I managed to get the thing to unwind by adding dummy synchronous |
@monich Can you take a peek? Is it really that gbinder_client_cancel(), and is there something a bit more generic that could be done in libgbinder in case there are similar pending transactions elsewhere too?
|
So far I haven't been able to reproduce the problem but running sensorfwd under valgrind gives me lots of these when I switch between the apps (which I guess causes connections to be established and terminated): I don't think it's related but who knows. In any case it's worth fixing IMO. |
For the hang to occur, the device needs to be one that uses the Android sensor API 1.x, for example the XA2.
There has been a long standing suspicion about thread safety of stuff starting from
In all likelihood not directly contributing to the immediate problem, but I did (finally) create a separate ticket about that.
Stopping the sensorfwd systemd service takes 15 seconds. This happens because libgbinder cleanup gets deadlocked by some kind of orphaned transaction of unknown origin that never finishes - which then requires systemd to go through the TERM-wait-timeout-KILL cycle.

As the part of at-exit cleanup we really care about is getting sensor hw turned off: Refactor hybrisManager instance logic so that the cleanup code that stops sensors gets executed along with exit from the mainloop instead of after returning from main(). Then skip all further cleanup code that would be executed after return from main() by using _exit().

Signed-off-by: Simo Piiroinen <simo.piiroinen@jolla.com>
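The exit ordering described in this commit message can be sketched roughly as below. This is not the sensorfwd code: `stopSensors()`, `lateCleanup()`, `shutdownAndExit()` and `childExitStatus()` are hypothetical stand-ins, and the atexit handler merely changes the exit code so that it is observable whether late cleanup ran. The only real calls are POSIX `fork()`, `waitpid()` and `_exit()` plus `std::atexit()`.

```cpp
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for turning the sensor hardware off.
static bool g_sensorsStopped = false;
static void stopSensors() { g_sensorsStopped = true; }

// Hypothetical stand-in for the late cleanup that could deadlock.
// It only changes the exit code, so the parent can tell whether it ran.
static void lateCleanup() { _exit(3); }

// Do the cleanup that matters while still in control, then leave with
// _exit(), which skips atexit handlers and static destructors.
[[noreturn]] static void shutdownAndExit() {
    stopSensors();
    _exit(g_sensorsStopped ? 0 : 1);
}

// Forks a child that registers lateCleanup() and then shuts down.
// Returns the child's exit status, or -1 on failure.
int childExitStatus() {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        std::atexit(lateCleanup);
        shutdownAndExit();
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

If `shutdownAndExit()` used `exit()` instead, the registered handler would run and the child would report status 3; with `_exit()` the handler is skipped, which is the behaviour the commit relies on to avoid the deadlocking cleanup.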
By the looks of it, the deadlock at gbinder_ipc_exit() is caused by delayed processing of gbinder_client_cancel() actions. Use a dummy synchronous transaction to trigger cancellation and thus unblock the way to exit.

While at it, omit extra references for autorelease objects, add a potentially missing reply unref and streamline object cleanup.

Signed-off-by: Simo Piiroinen <simo.piiroinen@jolla.com>
Sensorfwd crashes on exit on devices using the Android sensors API 2.x. The crash occurs within the event reader thread.

The number of events read in one go is already limited to what fits into the fixed-size buffers. However, event processing is done as if all of the available events had been read - which then leads to problems when garbage data from beyond the buffer end is processed too.

Process only the number of events that were read instead of how many would have been available.

Signed-off-by: Simo Piiroinen <simo.piiroinen@jolla.com>
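A minimal sketch of the fix described in this commit message, under stated assumptions: the `Event` struct, the `kBufferCapacity` value and the `processAvailable()` helper are all hypothetical and do not reflect the actual sensorfwd data structures. The point is only that the processing loop is bounded by the count that was actually read, not by the count that was available.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical event record; the real sensor event layout differs.
struct Event {
    int sensor;
    float value;
};

// Assumed size of the fixed buffer used by the reader.
constexpr std::size_t kBufferCapacity = 4;

// Reads at most kBufferCapacity events from `pending` into a fixed
// buffer, then processes exactly the events that were read.
// Returns the sum of processed values; `processed` receives the count.
float processAvailable(const std::vector<Event>& pending,
                       std::size_t& processed) {
    Event buffer[kBufferCapacity];

    // The read is clamped to the buffer size ...
    const std::size_t numRead = std::min(pending.size(), kBufferCapacity);
    std::copy_n(pending.begin(), numRead, buffer);

    // ... so the loop must use numRead, not pending.size(), or it would
    // walk past the end of `buffer`.
    float sum = 0.0f;
    for (std::size_t i = 0; i < numRead; ++i)
        sum += buffer[i].value;

    processed = numRead;
    return sum;
}
```

With ten pending events and a four-slot buffer, only four events are handled per call; the rest would be picked up by later reads.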
Crashes occur during sensorfwd exit-time cleanup. Backtraces point towards objects related to cancellation of death notification already being invalidated.

Documentation suggests that the underlying glib signal connections can end up being torn down automatically, but it is a bit unclear where and how that might happen. Doing cancellation of death and other async notifications first seems to help.

Signed-off-by: Simo Piiroinen <simo.piiroinen@jolla.com>
dc4df4c to 8fae0c2
Debug commits dropped. Review commits squashed.