Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use std::atomic instead of the windows-only primitives for atomic operations in the song renderer #50

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

PoroCYon
Copy link
Contributor

@PoroCYon PoroCYon commented Sep 9, 2020

I've started splitting up my patches into usable chunks, this is the first one.

Someone will have to test whether this compiles to the atomic instructions on x86 w/ MSVC, but I'm pretty sure it does do that.

@PoroCYon PoroCYon force-pushed the std-atomic branch 2 times, most recently from 3837561 to 671640e Compare September 9, 2020 15:50
@PoroCYon PoroCYon mentioned this pull request Sep 9, 2020
15 tasks
@PoroCYon
Copy link
Contributor Author

PoroCYon commented Sep 9, 2020

I have no clue why it doesn't compile, and emoon's new[] implementation doesn't seem to help either. I should try doing this stuff when I'm actually awake.

(I'll squash the commits once it's working.)

@PoroCYon PoroCYon force-pushed the std-atomic branch 2 times, most recently from 301f3cc to 67cb52c Compare September 9, 2020 16:46
WaveSabreCore/CMakeLists.txt Outdated Show resolved Hide resolved
WaveSabreCore/CMakeLists.txt Outdated Show resolved Hide resolved
WaveSabreCore/CMakeLists.txt Outdated Show resolved Hide resolved
WaveSabreCore/CMakeLists.txt Outdated Show resolved Hide resolved
#set_property(TARGET WaveSabreCore PROPERTY INTERPROCEDURAL_OPTIMIZATION ON)
target_compile_options(WaveSabreCore
PUBLIC -O2 -fno-exceptions -fno-rtti -fno-stack-protector -fno-stack-check -fno-unwind-tables -fno-asynchronous-unwind-tables -fomit-frame-pointer -fno-threadsafe-statics
PRIVATE -ffast-math -march=nocona -ffunction-sections -fdata-sections -Wl,--gc-sections)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of these are only appropriate in the MinSizeRel config, so I suggest doing this instead:

target_compile_options(WaveSabreCore
		PUBLIC $<$<CONFIG:MinSizeRel>:-O2 -fno-exceptions -fno-rtti -fno-stack-protector -fno-stack-check -fno-unwind-tables -fno-asynchronous-unwind-tables -fomit-frame-pointer -fno-threadsafe-statics>
		PRIVATE $<$<CONFIG:MinSizeRel>:-ffast-math -march=nocona -ffunction-sections -fdata-sections -Wl,--gc-sections)

...That way you can also drop the extra conditional.

...that being said, I question some of the flags here:

  • O2: This should already be set for release-builds, I think. But I would suspect we really want -Os? I think CMake sets better defaults than what you do here...
  • ffast-math: Are you sure? This is generally a really bad flag that break code. Perhaps -fno-math-errno gives the part that you really care about?
  • march=nocona: I would imagine we want this one regardless of the config, so this should probably be set unconditionally. But the choice also seems a bit arbitrary... any proper reasoning for it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks, I don't know that much about CMake.

O2: IME this usually produces better compressible code than Os, but this is for 4k code, so it migth be better the other way around here.

ffast-math: fair enough.

march=nocona: often produces the smallest code, found by experimentation, disables a number of new SIMD instructions, though.

@PoroCYon PoroCYon force-pushed the std-atomic branch 4 times, most recently from 56bc74b to d8d2dae Compare September 13, 2020 15:52
WaveSabreCore/CMakeLists.txt Outdated Show resolved Hide resolved
@kusma
Copy link
Contributor

kusma commented Sep 14, 2020

Minus those one (or two) nit(s) above, this looks pretty good to me.

@yupferris
Copy link
Member

yupferris commented Sep 14, 2020

This one needs to be size-tested and verified working especially with squishy, which also doesn't support TLS (most 64k packers don't actually). I have a task in squishy to identify TLS usage and error in that case (a user has accidentally tripped on this and provided a test case) and I'd like to do that first before double-checking that this is OK. (if this becomes a bottleneck, this can be checked for manually btw).

@yupferris
Copy link
Member

Ah, I see now that we've indeed split the impl between windows and other platforms - that makes me more confident about size and checking for TLS with squishy is actually a non-issue in that case. Still, I'd like to see a size comparison (though I expect this will not really be affected, as the code will be monomorphized and likely inlined).

@kusma
Copy link
Contributor

kusma commented Sep 14, 2020

@yupferris: Yeah... We should probably make the CI report the size of the stand-alone player (or TestPlayer) or something, so we always know what's going on there. It's tempting to try to turn this into some sort of delta and make it fail if it grows etc, but tracking stats across CI runs is much more difficult than just reporting numbers in the log. So maybe just printing the size at the end is enough?

@yupferris
Copy link
Member

For many changes we should still ideally run the thing ofc, but at least for a first approximation (and something we can link to for actual results/bookkeeping) it's a good idea to pack and report in CI. I don't think we need (or necessarily want) to do more than that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants