Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] MSVC does not optimize out the lambda created by the CPP2_UFCS macros. #185

Closed
prenaux opened this issue Dec 29, 2022 · 5 comments
Closed
Labels
bug Something isn't working

Comments

@prenaux
Copy link
Contributor

prenaux commented Dec 29, 2022

Describe the bug
As the title describe, MSVC 19 doesn't optimize out the UFCS macros. It seems to be a serious problem since the UFCS lambdas generate a fair bit of code and end up generated for every function call. Both GCC & Clang optimize them out even at the lowest optimization level (-Og).

This is not a bug in cppfront per say, but it is still a bug that will affect it directly imo - and maybe something that cppfront can work around. It is not clear where it should be fixed ultimately.

To Reproduce
https://godbolt.org/z/ravPdfjoG

Additional context
I was playing with the UFCS macros in regular C++ code and I wanted to see whether the lambdas got removed in debug builds, put them in godbolt, no compilers get rid of them with -O0 - which is understandable. GCC & Clang get rid of them at any optimization level I tried (-Og, -Os, -O1, -O2) but MSVC doesnt even get rid of them in any optimization level (neither /O2 nor /O3 - although I dont recall whether that's a legit level for MSVC or if it just gets pegged back to /O2).

@prenaux prenaux added the bug Something isn't working label Dec 29, 2022
@hsutter
Copy link
Owner

hsutter commented Dec 29, 2022

I didn't check that, and this is really helpful, thanks.

I think there is something I can do about this... I just googled around and found this StackOverflow answer quoting Jonathan Caves' answer of using [[msvc::forceinline]]. In your Godbolt example, adding that attribute before the lambda body has the desired effect, and enables inlining all the way through to the callee (no call instructions at all).

I'll try that and check that it passes regression testing in GCC and Clang... they should just ignore the attribute, but if they don't I'll need to hide it behind YAM (yet another macro).

Thanks!

@hsutter
Copy link
Owner

hsutter commented Dec 29, 2022

Update: As in that SO answer, GCC and Clang want __attribute__((always_inline)) so I'm doing that too, even though they were already inlining (adding that attribute doesn't seem to change the code gen in your example but it can't hurt). Again, thanks!

@prenaux
Copy link
Contributor Author

prenaux commented Dec 29, 2022

@hsutter I did try [[msvc::forceinline]] but it didn't get rid of the lambda - apologies I forgot to mention it. Its very possible I did it incorrectly so I updated the godbolt example with your updates at https://godbolt.org/z/caYnE4YGa but I still see the full lambda function in the output assembly in MSVC.

@hsutter
Copy link
Owner

hsutter commented Dec 30, 2022

It didn't get rid of the lambda sitting in the object image (MSVC still isn't as good as other compilers at not displaying unreachable code under Godbolt CE) but the lambda is never called in callTest, it's inlined and there's no call instruction.

Try this slight variation https://godbolt.org/z/459sMx9n4 which shows it as two functions that differ only in direct call vs. UFCS:

vec4f callTest() {
    vec4f v = {123,456,1,2};
    return square(v);
}

vec4f callTest_UFCS() {
    vec4f v = {123,456,1,2};
    return CPP2_UFCS_0(square,v);
}

On all three compilers at O2, these two functions have identical code gen. Does that look right?

Interesting thing: On O1, the UFCS function actually has better code gen on MSVC (and still identical code gen on GCC and Clang).

Thanks again for pointing it out!

@prenaux
Copy link
Contributor Author

prenaux commented Dec 30, 2022

I see great, thanks for the explanation I was just looking for the lambda in the assembly :)

The godbolt link you shared does look like what I'd expect (minus the lambda which is now dead code).

Azmah-Bad pushed a commit to Azmah-Bad/cppfront that referenced this issue Feb 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants