Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Release the GIL when performing SDL_GL_SwapWindow call. #8025

Merged
merged 1 commit into from Oct 3, 2022

Conversation

misl6
Copy link
Member

@misl6 misl6 commented Oct 2, 2022

Maintainer merge checklist

  • Title is descriptive/clear for inclusion in release notes.
  • Applied a Component: xxx label.
  • Applied the api-deprecation or api-break label.
  • Applied the release-highlight label to be highlighted in release notes.
  • Added to the milestone version it was merged into.
  • Unittests are included in PR.
  • Properly documented, including versionadded, versionchanged as needed.

TL;DR; On Android, SDL implemented an Activity lock guard (SDL_LockMutex(Android_ActivityMutex);), and that does not fit well with Python GIL ( for a specific reason).

This change is not needed with the current python-for-android release, but it will be needed when upgrading to a newer SDL2 version: kivy/python-for-android#2673.

Thanks to the SDL team who helped with the debugging, I found a deadlock condition that was causing the app to freeze.

Preface:

We use SDL_SetEventFilter so we can catch some events before they are added to the queue. (And that may need some rework)

SDL_SetEventFilter(_event_filter, <void *>self)

The callback passed to SDL_SetEventFilter requires the GIL.
cdef int _event_filter(void *userdata, SDL_Event *event) with gil:
return (<_WindowSDL2Storage>userdata).cb_event_filter(event)

Analysis:

On SDL side, the onNativeTouch callback is called, but Android_ActivityMutex is already locked by Android_GLES_SwapWindow (from a different thread), so it waits until Android_ActivityMutex gets unlocked.

10-01 13:51:11.327  5356  5356 V SDL     : onNativeTouch()-Lock
10-01 13:51:11.328  5356  5386 V SDL     : Android_GLES_SwapWindow()-Unlock
10-01 13:51:11.328  5356  5386 V SDL     : Android_GLES_SwapWindow()-UnlockED

The Android_GLES_SwapWindow finishes and unlocks Android_ActivityMutex so onNativeTouch can now lock Android_ActivityMutex. The touch event then gets pushed via SDL_PushEvent, where

10-01 13:51:11.328  5356  5356 V SDL     : onNativeTouch()-LockED
10-01 13:51:11.328  5356  5356 V SDL     : in SDL_PushEvent -> Type: 1024
10-01 13:51:11.328  5356  5356 V SDL     : in SDL_LockMutex(SDL_event_watchers_lock) -> Type: 1024
10-01 13:51:11.328  5356  5356 V SDL     : in SDL_LockMutex(SDL_event_watchers_lock) [LOCKED] -> Type: 1024

the SDL_EventOK.callback (which as said before requires the GIL on the Python side), gets called but looks stuck.

if (!SDL_event_watchers_lock || SDL_LockMutex(SDL_event_watchers_lock) == 0) {
        if (SDL_EventOK.callback && !SDL_EventOK.callback(SDL_EventOK.userdata, event)) {  \\ <--- We are now here
            if (SDL_event_watchers_lock) {
                SDL_UnlockMutex(SDL_event_watchers_lock);
            }
            return 0;
        }

Why? Well, meanwhile on thread 5386 our flip method, which is calling SDL_GL_SwapWindow have the GIL locked, and can't release it as Android_GLES_SwapWindow is stuck on SDL_LockMutex(Android_ActivityMutex);, waiting for onNativeTouch to unlock the Android_ActivityMutex.

10-01 13:51:11.332  5356  5386 V SDL     : Android_GLES_SwapWindow()-Lock

Ouch. deadlocked.

For more info, see: libsdl-org/SDL#3679

Results after these changes:

Why you resurrected an old test app from issue kivy/python-for-android#2002 ? (https://github.com/jere-botes/KivyBench)

Well looks like we previously reverted a PR (Original: #4219, Reverts it: #6578) that added nogil when calling SDL_GL_SwapWindow, so I was afraid to re-introduce an issue.

From my tests, the nogil on SDL_GL_SwapWindow does not create any performance drop. (Tested on Samsung S10e Android 12 and Android Emulator (arm64): Pixel 6 Pro API 33 (even with 200 widgets runs at 60fps).

Before these changes (freeze):

Screen.Recording.2022-10-02.at.09.33.31.mov

After these changes (no freeze and 60fps with 200 widgets):

Screen.Recording.2022-10-02.at.09.42.01.mov

@misl6 misl6 added this to the 2.2.0 milestone Oct 2, 2022
@misl6
Copy link
Member Author

misl6 commented Oct 2, 2022

Added as reviewers everyone from core-devs who have been deeply involved in previous PRs targeting the performance issues so we can make sure nothing bad happens. Feel free to ask for a pre-built APK to test on your devices. 😄

Copy link
Member

@tshirtman tshirtman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, nice catch!

Copy link
Member

@AndreMiras AndreMiras left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work, thanks for all the digging and explanation 👏

Copy link
Member

@matham matham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

5 participants