UCS/SYS: Disable "always inline" when -Og flag is present #11251
yosefe merged 6 commits into openucx:master
uct_pack_callback_t pack_cb, void *arg, const uct_iov_t *iov,
size_t iovcnt, unsigned flags)
{
    uint8_t elem_flags = 0;
#
# Check if -Og flag was supplied
#
AS_IF([echo " $BASE_CFLAGS $CFLAGS " | grep -q -- ' -Og '],
There was a problem hiding this comment.
maybe enhance --enable-compiler-opt to support 0,1,2,3,g since as far as I understand they are all mutually exclusive? and maybe AC_DEFINE HAVE_OG_OPTIMIZATION for --enable-compiler-opt=g?
There was a problem hiding this comment.
notice that -Og can be passed from CFLAGS which comes from env (see example in CI job).
I think that the existing block should remain as is because it has a single well-defined output (BASE_CFLAGS).
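A minimal standalone sketch of the check under discussion, assuming a hypothetical helper name `has_og_flag` (not part of the PR). It mirrors the grep from the diff: the flag string is padded with spaces so that `-Og` only matches as a whole flag, and it sees flags from both sources, including a CFLAGS value that comes from the environment.

```shell
#!/bin/sh
# Hypothetical helper mirroring the grep check in the diff above.
# $1: flags chosen by configure (BASE_CFLAGS in the diff)
# $2: flags supplied by the user (CFLAGS, possibly from the environment)
has_og_flag() {
    # Padding with spaces lets ' -Og ' match only a complete flag.
    echo " $1 $2 " | grep -q -- ' -Og '
}

if has_og_flag "-O3 -g -Wall" "-Og"; then
    echo "-Og found"
else
    echo "-Og not found"
fi
```

Note that this check only reports whether -Og appears somewhere in the flags; it does not say which -O option the compiler ends up applying.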
There was a problem hiding this comment.
BTW, which functions fail to compile with -Og ?
Maybe we could convert them to non-forced inline w/o performance impact
There was a problem hiding this comment.
Those are the failed functions:
- ucp_proto_request_bcopy_complete_success
- ucp_proto_put_offload_bcopy_send_func
- ucp_proto_rndv_put_common_flush_send
- ucp_proto_rndv_put_common_data_sent
- ucp_proto_rndv_put_mtype_send_func
how would you suggest converting them?
There was a problem hiding this comment.
or did you mean to convert only them? @yosefe
There was a problem hiding this comment.
But then if we don't recognize -O3, the impact will not be so bad: worst case, some functions may not be inlined, which may impact performance a little. A misdetection will not fail compilation.
There was a problem hiding this comment.
Notice that the built-in gcc macros __OPTIMIZE__ and __NO_INLINE__ only distinguish -O0 from all other optimization levels, so they are not useful for us.
So basically the plan is:
- Add a flag in compiler.m4 to identify a high optimization level (which includes O2 and O3).
- If this flag is set, we define UCS_F_INLINE_OPTIMIZED in compiler_def.h
- apply UCS_F_INLINE_OPTIMIZED to list of functions above
@yosefe is this correct?
There was a problem hiding this comment.
Notice that the built-in gcc macros __OPTIMIZE__ and __NO_INLINE__ only distinguish -O0 from all other optimization levels, so they are not useful for us.
Right
Regarding the plan:
- It's enough to do it as part of compiler-opt logic.
- If we set optimization level to 2 or higher as part of this macro - define UCS_F_INLINE_OPTIMIZED to always_inline. Otherwise, define it as regular inline
- Yes
There was a problem hiding this comment.
- It's enough to do it as part of compiler-opt logic.
without considering CFLAGS=-Og?
There was a problem hiding this comment.
Yes, just setting CFLAGS=-Og without compiler-opt will keep UCS_F_INLINE_OPTIMIZED as "inline"
regarding -Og passed as CFLAGS, if it is only about the line below, maybe we could fix this use-case to pass it using
so I guess the question is how you want to handle a contradiction between CFLAGS and --enable-compiler-opt.
can you explain this question?
maybe we can handle --enable-compiler-opt=g only, as I think we add -O3 by default anyway, which could already conflict with an -O specified from CFLAGS?
I understand that -Og can cause build failures for always inline. Now we are able to remove the failures by moving from always inline to inline. But we would still need to have a macro set when -Og is used, to downgrade always inline to inline.
won't it cause CFLAGS=-Og to fail compilation?
right, that's the idea
#
# Define OPTIMIZE_HIGH for optimization levels -O2 or -O3
#
AS_IF([echo " $BASE_CFLAGS $CFLAGS " | grep -qE -- ' -O(2|3) '],
There was a problem hiding this comment.
Maybe I am getting confused, but the case where CI sets -Og would not be handled correctly: as shown below, the flags contain -O3 followed by -Og, and AFAIU gcc will use -Og, in which case we should not force the inlines?
$ ./contrib/configure-devel CFLAGS=-Og
configure: CFLAGS: -O3 -g -Wall ... -Og
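The concern can be reproduced with a small sketch. GCC documents that when several -O options are given, the last one is the one that takes effect, so a grep for -O2 or -O3 anywhere in the flags can disagree with what the compiler applies. The flag string below is a shortened stand-in for the configure output quoted above (the elided part is left out), and the variable names are illustrative only.

```shell
#!/bin/sh
# Shortened stand-in for the configure output quoted in the comment.
flags="-O3 -g -Wall -Og"

# Check from the diff: true if -O2 or -O3 appears anywhere in the flags.
if echo " $flags " | grep -qE -- ' -O(2|3) '; then
    grep_says_high=yes
else
    grep_says_high=no
fi

# What the compiler applies: the last -O option on the command line.
effective=""
for flag in $flags; do
    case $flag in
        -O*) effective=$flag ;;
    esac
done

echo "grep check reports high optimization: $grep_says_high"
echo "effective optimization flag: $effective"
```

With these flags the grep-based check and the effective flag disagree, which is the mismatch described in the comment.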
@yosefe please review
opt_level=""
for flag in $BASE_CFLAGS $CFLAGS; do
    case $flag in -O*) opt_level=$flag;; esac
done
AS_IF([test "x$opt_level" = "x-O2" || test "x$opt_level" = "x-O3"],
      [AC_DEFINE([OPTIMIZE_HIGH], 1, [Compiled with high optimization level])])
There was a problem hiding this comment.
The approach is not bad. I would fine-tune the implementation:
- Use shell pattern substitution to set "opt_level" to the exact optimization level that follows the -O argument. The level can be a number or a string (like "z", "fast", "g", etc.).
- Define an OPTIMIZATION_LEVEL macro as the numeric optimization level. If the level is not a number, leave the macro undefined.
- In compiler_def.h we CANNOT use the OPTIMIZATION_LEVEL macro, because that header does not include config.h. We need to use compiler.h instead.
- In compiler.h, check that OPTIMIZATION_LEVEL is defined and that its value is >= 2.
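The first two points of this suggestion might look roughly like the plain-shell sketch below. The helper name `numeric_opt_level` is hypothetical, and the final `echo` only imitates what a definition in config.h could look like; in compiler.m4 the value would presumably be emitted through an Autoconf macro such as AC_DEFINE_UNQUOTED, and the bracket expression would need M4 quoting. This is an illustration under those assumptions, not the code that was merged.

```shell
#!/bin/sh
# Hypothetical helper: print the numeric level of the last -O flag.
# Prints nothing when the last -O flag is non-numeric (g, s, z, fast),
# when it is a bare -O, or when no -O flag is present.
numeric_opt_level() {
    level=""
    for flag in "$@"; do
        case $flag in
            -O*) level=${flag#-O} ;;   # pattern substitution strips "-O"
        esac
    done
    case $level in
        ''|*[!0-9]*) ;;                # empty or not a number: leave undefined
        *) echo "$level" ;;
    esac
}

# Usage: -Og is last here, so no numeric level is reported.
level=$(numeric_opt_level -O3 -g -Wall -Og)
if [ -n "$level" ]; then
    echo "#define OPTIMIZATION_LEVEL $level"
else
    echo "/* OPTIMIZATION_LEVEL left undefined */"
fi
```

Because the loop keeps only the last -O flag, the result follows the flag GCC applies rather than any -O flag that happens to appear earlier. A bare `-O` is treated as non-numeric in this sketch even though GCC handles it as -O1; that is a simplification.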
What?
Disable "always inline" when -Og flag is present
Why?
Support -Og optimization level