-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use std::atomic instead of the windows-only primitives for atomic operations in the song renderer #50
base: master
Are you sure you want to change the base?
Conversation
3837561
to
671640e
Compare
(I'll squash the commits once it's working.) |
301f3cc
to
67cb52c
Compare
WaveSabreCore/CMakeLists.txt
Outdated
#set_property(TARGET WaveSabreCore PROPERTY INTERPROCEDURAL_OPTIMIZATION ON) | ||
target_compile_options(WaveSabreCore | ||
PUBLIC -O2 -fno-exceptions -fno-rtti -fno-stack-protector -fno-stack-check -fno-unwind-tables -fno-asynchronous-unwind-tables -fomit-frame-pointer -fno-threadsafe-statics | ||
PRIVATE -ffast-math -march=nocona -ffunction-sections -fdata-sections -Wl,--gc-sections) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of these are only appropriate in the MinSizeRel config, so I suggest doing this instead:
target_compile_options(WaveSabreCore
PUBLIC $<$<CONFIG:MinSizeRel>:-O2 -fno-exceptions -fno-rtti -fno-stack-protector -fno-stack-check -fno-unwind-tables -fno-asynchronous-unwind-tables -fomit-frame-pointer -fno-threadsafe-statics>
PRIVATE $<$<CONFIG:MinSizeRel>:-ffast-math -march=nocona -ffunction-sections -fdata-sections -Wl,--gc-sections)
...That way you can also drop the extra conditional.
...that being said, I question some of the flags here:
O2
: This should already be set for release-builds, I think. But I would suspect we really want -Os? I think CMake sets better defaults than what you do here...ffast-math
: Are you sure? This is generally a really bad flag that break code. Perhaps-fno-math-errno
gives the part that you really care about?march=nocona
: I would imagine we want this one regardless of the config, so this should probably be set unconditionally. But the choice also seems a bit arbitrary... any proper reasoning for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks, I don't know that much about CMake.
O2: IME this usually produces better compressible code than Os, but this is for 4k code, so it migth be better the other way around here.
ffast-math: fair enough.
march=nocona: often produces the smallest code, found by experimentation, disables a number of new SIMD instructions, though.
56bc74b
to
d8d2dae
Compare
Minus those one (or two) nit(s) above, this looks pretty good to me. |
This one needs to be size-tested and verified working especially with squishy, which also doesn't support TLS (most 64k packers don't actually). I have a task in squishy to identify TLS usage and error in that case (a user has accidentally tripped on this and provided a test case) and I'd like to do that first before double-checking that this is OK. (if this becomes a bottleneck, this can be checked for manually btw). |
Ah, I see now that we've indeed split the impl between windows and other platforms - that makes me more confident about size and checking for TLS with squishy is actually a non-issue in that case. Still, I'd like to see a size comparison (though I expect this will not really be affected, as the code will be monomorphized and likely inlined). |
@yupferris: Yeah... We should probably make the CI report the size of the stand-alone player (or TestPlayer) or something, so we always know what's going on there. It's tempting to try to turn this into some sort of delta and make it fail if it grows etc, but tracking stats across CI runs is much more difficult than just reporting numbers in the log. So maybe just printing the size at the end is enough? |
For many changes we should still ideally run the thing ofc, but at least for a first approximation (and something we can link to for actual results/bookkeeping) it's a good idea to pack and report in CI. I don't think we need (or necessarily want) to do more than that. |
…rations in the song renderer
d8d2dae
to
3980638
Compare
I've started splitting up my patches into usable chunks, this is the first one.
Someone will have to test whether this compiles to the atomic instructions on x86 w/ MSVC, but I'm pretty sure it does do that.