New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[scsynth][supernova] fix Windows not guarding denormals #4504
[scsynth][supernova] fix Windows not guarding denormals #4504
Conversation
Travis is failing because it has CMake 3.9 instead of 3.10 Perhaps can be done without CMake this sets the flags
which are then used as in this PR |
CMakeLists.txt
Outdated
# so we have to set our own | ||
|
||
cmake_host_system_information(RESULT SC_SYSTEM_HAS_SSE QUERY HAS_SSE) | ||
cmake_host_system_information(RESULT SC_SYSTEM_HAS_SSE2 QUERY HAS_SSE2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
every x86 that doesn't have sse1/sse2 should probably go to a museum (especially they are available on all 64bit intel CPUs)
it might be better to port the feature detection macros to boost.predef
to ensure portability:
https://www.boost.org/doc/libs/1_66_0/doc/html/predef/reference.html#predef.reference.boost_hw_hardware_macros
Except for requiring cmake 3.10 @jsurge solución is the simplest one. |
after discussions with other devs, the case for bumping CMake is not strong enough here. I'm currently re-writing to incorporate @sonoro1234's solution and remove the build system from this change. |
13b5ac2
to
a2c3fb9
Compare
@jrsurge why not use boost.predef as @timblechmann suggested? then we don't need to add an entirely new file that does the same thing. Also, just want to make sure this is clear to everyone -- these macros do not detect whether SSE/SSE2/AVX/etc. are available on the hardware; rather they are set by the compiler to indicate that the corresponding instruction set may be used during compilation. it's totally legitimate to disallow certain instruction sets even if they are available on the machine used for compiling; for instance, so that the build can be copied to another machine without those instruction sets. |
boost didn't seem to make this any easier, if not harder.
(with reference to the boost docs more specifically) the word detection is everywhere with reference to these macros. I can make the wording in commits less strong? I'm slightly out of my depth on this one, I'm just trying to do what I'm told to fix an issue. |
514a0ab
to
bda6a96
Compare
server/scsynth/SC_World.cpp
Outdated
#include <boost/predef/architecture.h> | ||
#include <boost/predef/hardware.h> | ||
|
||
#ifdef BOOST_HW_SIMD_AVAILABLE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be BOOST_HW_SIMD_X86 >= BOOST_HW_SIMD_X86_SSE_VERSION
(BOOST_HW_SIMD_AVAILABLE is for all SIMD sets, but xmmintrin.h
is x86-specific)
server/supernova/server/server.cpp
Outdated
@@ -207,7 +210,7 @@ static void name_current_thread(int thread_index) { | |||
} | |||
|
|||
static void set_daz_ftz(void) { | |||
#ifdef __SSE__ | |||
#if defined(BOOST_ARCH_X86_64) || (BOOST_HW_SIMD_X86 >= BOOST_X86_SSE_VERSION) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can simply be BOOST_HW_SIMD_X86 >= BOOST_X86_SSE_VERSION
, and you can remove the architecture.h
include
server/scsynth/SC_World.cpp
Outdated
|
||
void sc_SetDenormalFlags() { | ||
_MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON); | ||
_mm_setcsr(_mm_getcsr() | 0x40); // DAZ | ||
} | ||
|
||
#elif defined(__SSE__) | ||
# include <xmmintrin.h> | ||
#elif BOOST_HW_SIMD_X86 >= BOOST_X86_SSE_VERSION |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did some looking around -- this code can be changed to match the supernova implementation below. all the operations in it are available and work in the SSE instruction set. in fact the entire thing can be reduced to
static void sc_SetDenormalFlags() {
#if BOOST_HW_SIMD_X86 >= BOOST_HW_SIMD_X86_SSE_VERSION
_MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
_mm_setcsr(_mm_getcsr() | 0x40); // DAZ
#endif
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can also then remove the architecture.h
include above
thanks for the review @brianlheim! Working on it. |
this fixes denormals on Windows
this fixes denormals on Windows
bda6a96
to
ff2c03d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks!
Thanks, here's some feedback on 3.11.0-beta1. I tested my previous version (3.10.3) and the new beta version, and I can definitely see an improvement -- a major reduction in CPU spikes in the cases where I saw them before. And all this time I thought the issue was with my Windows laptop, doing CPU throttling or power management or something! (-; In some cases where I used to regularly experience CPU spikes (using VSTs and also doing ambisonic stuff with SC-HOA), I no longer seem to get them. The most obvious "test case" was with some ambisonic stuff (lots of convolution)...when I was actively spatializing sources, I would get (depending on the setup), say 20-25% CPU usage. When I stopped playing ambisonic sources so nothing was going to the decoder, the CPU usage would (unintuitively) go up, often to around 60%. Then when I resumed playing spatialized sources, the CPU would go back to 20-25%. When I stopped, back up to 60-65%. In other cases, the variation was from ~10-12% up to 18-20%. With the new version, the CPU never goes above the "regular" values, of say 25% in the first case (decoding with And with another test case from a live-coding set with extensive use of VSTPlugin, I would sometimes get CPU spikes of 70% or even beyond 100% (causing audible glitches). With this new version, so far I've never seen the CPU usage go above 20-25% in any of those cases... So I'm very happy, this seems to be from this fix -- thanks to those who figured this out! It's like I've got a new laptop... (-; |
Purpose and Motivation
[Edit] Updated in line with #4504 (comment)
This fixes #4046 - Windows not guarding denormals.
MSVC doesn't set the usual
__SSE__
and__SSE2__
etc definitions, so denormals aren't properly handled on Windows.This PR uses boost.predef to check for requested SIMD instructions.
Types of changes
Fixes scsynth on Windows does not guard denormals #4046
This changes how every platform works, so may have unintended consequences
To-do list
scsynth, yes. supernova doesn't compile on msvc
- [ ] Updated documentation