Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-104530: Use native Win32 condition variables #104531

Merged
merged 6 commits into from Feb 2, 2024
Merged

Conversation

adr26
Copy link
Contributor

@adr26 adr26 commented May 16, 2023

Update Python to using native Win32 condition variable support, by setting _PY_EMULATED_WIN_CV to 0.

A small fix to PyCOND_TIMEDWAIT() was required to correctly report condition variable timeouts.

After doing this, I found that _listdir_windows_no_opendir() was crashing because Py_END_ALLOW_THREADS is overwriting Win32's GLE and the error code ERROR_NO_MORE_FILES from FindNextFileW() was being wiped out:

        Py_BEGIN_ALLOW_THREADS
        result = FindNextFileW(hFindFile, &wFileData);
        Py_END_ALLOW_THREADS
        /* FindNextFile sets error to ERROR_NO_MORE_FILES if
           it got to the end of the directory. */
        if (!result && GetLastError() != ERROR_NO_MORE_FILES) {

I debugged this (using iDNA/TTD) and found that SleepConditionVariableSRW() was clearing the Win32 GLE in Py_END_ALLOW_THREADS:

0:008> t
Time Travel Position: 6D3A9:38
ntdll!RtlSetLastWin32Error+0x46:
00007ffb`26d43316 894868          mov     dword ptr [rax+68h],ecx ds:0000000f`a9a8c068=00000012

0:008> !gle
LastErrorValue: (Win32) 0x12 (18) - There are no more files.
LastStatusValue: (NTSTATUS) 0 - STATUS_SUCCESS

0:008> t
Breakpoint 0 hit
Time Travel Position: 6D3A9:39
ntdll!RtlSetLastWin32Error+0x49:
00007ffb`26d43319 8b442450        mov     eax,dword ptr [rsp+50h] ss:0000000f`ab52d400=00000000

0:008> !gle
LastErrorValue: (Win32) 0 (0) - The operation completed successfully.
LastStatusValue: (NTSTATUS) 0 - STATUS_SUCCESS

0:008> kn7
# Child-SP          RetAddr               Call Site
00 0000000f`ab52d3b0 00007ffb`24766f9d     ntdll!RtlSetLastWin32Error+0x49
01 0000000f`ab52d400 00007ffb`247b2b47     KERNELBASE!BaseSetLastNTError+0x1d
02 0000000f`ab52d430 00007ffa`722ff5d3     KERNELBASE!SleepConditionVariableSRW+0x37
03 (Inline Function) --------`--------     python312!PyCOND_TIMEDWAIT+0x23 [C:\src\cpython\Python\condvar.h @ 285] 
04 0000000f`ab52d470 00007ffa`72142665     python312!take_gil+0xb3 [C:\src\cpython\Python\ceval_gil.c @ 381] 
05 (Inline Function) --------`--------     python312!PyEval_RestoreThread+0x11 [C:\src\cpython\Python\ceval_gil.c @ 711] 
06 0000000f`ab52d4c0 00007ffa`7214883e     python312!_listdir_windows_no_opendir+0x315 [C:\src\cpython\Modules\posixmodule.c @ 4165]

I have fixed this by adding code to PyEval_RestoreThread to save and restore the Win32 GLE, and removing the code in PyThread_tss_get() and PyThread_get_key_value() that does the same (as it's no longer required).

cc: @ericsnowcurrently

@ericsnowcurrently
Copy link
Member

I'll defer to @zooba and other Windows experts on this one.

@zooba
Copy link
Member

zooba commented Feb 1, 2024

I definitely want the fixes in this PR, even if we don't change the default setting for CVs.

My main concern is destabilising things - @encukou has just spent days tracking down a Windows deadlock, and I'd hate to give him a whole lot of new ones ;)

But let's see what the buildbots say

@zooba zooba added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Feb 1, 2024
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @zooba for commit fb37931 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Feb 1, 2024
@encukou
Copy link
Member

encukou commented Feb 2, 2024

My main concern is destabilising things - @encukou has just spent days tracking down a Windows deadlock, and I'd hate to give him a whole lot of new ones ;)

Eh, I think there are more deadlocks waiting to be solved even without this PR.
If this is how things should be on Windows, now is the time to merge it.

@zooba
Copy link
Member

zooba commented Feb 2, 2024

Alright, let's YOLO it 😆

Easy enough to change the setting to disable later on if it turns out to be trouble, but alphas are definitely the time to turn it on.

@zooba zooba merged commit b3f0b69 into python:main Feb 2, 2024
115 of 117 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants