[Windows] allow clang to build tailcall vm, optimize hotpath in zend_atomic#21611
[Windows] allow clang to build tailcall vm, optimize hotpath in zend_atomic#21611henderkes wants to merge 9 commits intophp:masterfrom
Conversation
…ions on every zend_atomic_*_load_ex) # Conflicts: # win32/build/confutils.js
33b452c to
48c0ac5
Compare
|
test failures seem unrelated so I'm calling it a night |
| if ("no" == PHP_NATIVE_INTRINSICS || "yes" == PHP_NATIVE_INTRINSICS) { | ||
| PHP_NATIVE_INTRINSICS = default_enabled; | ||
| } |
There was a problem hiding this comment.
perhaps we should move the "no" case to the same as "disable"? I don't understand why --enable-native-intrinsics=no results in SSE2
| /* On this platform it is non-const due to Interlocked API */ | ||
| static zend_always_inline bool zend_atomic_bool_load_ex(zend_atomic_bool *obj) { | ||
| /* Or'ing with false won't change the value. */ | ||
| return InterlockedOr8(&obj->value, false); |
There was a problem hiding this comment.
this was the performance killer on tight loops, InterlockedOr8 became mfence with clang, but lock cmpxchg on msvc.
but... isn't this unnecessary on x86 anyways? I thought aligned loads were always atomic? sidestepping this code path by preferring C11 atomics now
There was a problem hiding this comment.
My understanding is that we need this to have a consistent view of memory, when EG(vm_interrupt) and other data may be changed by another thread. Without the atomic operation we could possibly see that EG(vm_interrupt) is set, but not see the related data. Maybe @morrisonlevi can tell more about this.
| if ("disabled" == PHP_NATIVE_INTRINSICS) { | ||
| ERROR("Can't enable intrinsics, --with-codegen-arch passed with an incompatible option. ") | ||
| } |
There was a problem hiding this comment.
that error message made no sense, or am I missing something?
There was a problem hiding this comment.
This is great!
The changes look good to me, but I have limited understanding of Windows specific things.
I would suggest to split the PR into:
- Support building with clang (
--with-toolset=clang,WINDOWS_CLANGjob, and changes required to make it work) - Atomics fix
- TAILCALL support
| #if defined(__cplusplus) && defined(__clang__) | ||
| #define TSRMLS_MAIN_CACHE_EXTERN() extern "C" { extern TSRM_TLS void *TSRMLS_CACHE TSRM_TLS_MODEL_ATTR; } | ||
| #define TSRMLS_CACHE_EXTERN() extern "C" { extern TSRM_TLS void *TSRMLS_CACHE; } |
There was a problem hiding this comment.
This code is already wrapped in extern "C" apparently. Could you comment on why this change is necessary?
There was a problem hiding this comment.
I honestly don't understand it either, but intl absolutely refuses to build with Clang because it sees the symbol with both C and C++ naming.
| } | ||
|
|
||
| #if defined(ZEND_WIN32) || defined(HAVE_SYNC_ATOMICS) | ||
| #if (defined(ZEND_WIN32) || defined(HAVE_SYNC_ATOMICS)) && !defined(HAVE_C11_ATOMICS) |
There was a problem hiding this comment.
Should we look for MSVC specifically here, rather than WIN32?
Edit: Possibly there are other places like this, where we use less optimal code because we assume that ZEND_WIN32 implies MSVC
There was a problem hiding this comment.
yes, I had planned on that for a separate follow-up PR, as well as a few other optimizations. e.g. clang also supports attribute cold, but many of these definitions are locked only behind gcc.
I've split it off into:
|
see #21608
closes #21563
cc @arnaud-lb because of his amazing work with the tailcall vm.
cc @php/windows-team any chance the official releases will be switched to clang? compiles a bit longer, but performance is improved vastly