Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CMakeLists.txt forces -O2 to optimization flags #2970

Closed
janhubicka opened this issue Nov 25, 2023 · 6 comments
Closed

CMakeLists.txt forces -O2 to optimization flags #2970

janhubicka opened this issue Nov 25, 2023 · 6 comments
Labels
building/portability Platform-specific issues, build issues unrelated to 1.0 Things that need not be done before the 1.0 version milestone

Comments

@janhubicka
Copy link

The following bits:

Force build with optimizations in release mode.

set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} -O2")

Also optimize debug build; this is "unofficial", so it goes through ENV

if(DEFINED ENV{JPEGXL_OPT_DBG})
# Not the same thing as release_with_debug
set(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS_DEBUG} -O2")
endif()

force -O2 to compiler flags even if user asks for different. Both GCC and Clang benefits from using -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109811 (by about 15% for GCC and 9% for clang.

I think image encoding is typical candidate for -O3 or -Ofast optimization level and also I think there should be way for user to change it.

@Traneptora
Copy link
Contributor

-Ofast can break things. Particularly things with simd. (It's known to break zimg, for example.) -O3 is much safer, but doesn't always make code faster. See: https://wiki.gentoo.org/wiki/GCC_optimization#-O

@thesamesam
Copy link

-Ofast can break things. Particularly things with simd. (It's known to break zimg, for example.) -O3 is much safer, but doesn't always make code faster. See: wiki.gentoo.org/wiki/GCC_optimization#-O

The person you're replying to is one of the senior GCC maintainers.

@birdie-github
Copy link

-Ofast can break things. Particularly things with simd. (It's known to break zimg, for example.) -O3 is much safer, but doesn't always make code faster. See: https://wiki.gentoo.org/wiki/GCC_optimization#-O

As far as I understand this bug report is not about your choice of -O [optimization levels].

It's about the fact that you enforce them despite what people specify at build time.

I could be wrong.

@janhubicka
Copy link
Author

GCC defines -O2 as the optimization level fit to build most packages in the distribution. So it does optimizations that does not trade too much of code size for speed. Getting everything twice as big to squeeze few cycles out of internal loops is not a good idea.

-O3 is for CPU bound code. Encoding/decoding graphics fits to the -O3 category well (and benchmark on the GCC PR I listed shows 15% speedup on encoding that is nice).

Clang has -O2 tuned for more code size growthand those who care use -Os. There is still noticeable speedup for -O3 at clang, too.

-Ofast enables -ffast-math and relaxes assumptions about memory model to allow more vectorization.
For math it does: -fno-math-errno, -funsafe-math-optimizations, -ffinite-math-only, -fno-rounding-math, -fno-signaling-nans, -fcx-limited-range and -fexcess-precision=fast.
I do not know enough about jpegxl to tell if that could make difference. But if there is a good testsuite it should be easy to check if it affects output and why.

@janhubicka
Copy link
Author

So in short - I think -O3 would be better default and also there should be way for user to change the default.

@mo271 mo271 added building/portability Platform-specific issues, build issues unrelated to 1.0 Things that need not be done before the 1.0 version milestone labels Dec 6, 2023
@sboukortt
Copy link
Member

Thank you for this report. As of #2987, which has just been merged, we don’t force -O2 anymore.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
building/portability Platform-specific issues, build issues unrelated to 1.0 Things that need not be done before the 1.0 version milestone
Projects
None yet
Development

No branches or pull requests

6 participants