ARROW-15678: [C++] Add support for -DCMAKE_BUILD_TYPE=MinSizeRel #14342
If we build with -DCMAKE_BUILD_TYPE=MinSizeRel, our SIMD-related code may violate the one-definition rule (ODR). See also ARROW-15664 and https://issues.apache.org/jira/browse/ARROW-15678?focusedCommentId=17613909#comment-17613909 for details. This change prevents the ODR violation by forcing inlining as much as possible.
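To illustrate what "forcing inlining" can mean at the source level, here is a minimal sketch. It is not the change made in this PR (which works through compile flags in CMake); the macro `SKETCH_FORCE_INLINE` and both function names are hypothetical. The idea is that a header helper which is always inlined into its caller leaves no out-of-line copy for the linker to merge across files built with different SIMD flags.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical macro name, used only in this sketch.
#if defined(_MSC_VER)
#define SKETCH_FORCE_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#define SKETCH_FORCE_INLINE inline __attribute__((always_inline))
#else
#define SKETCH_FORCE_INLINE inline
#endif

// Header-style helper shared by several translation units. With a plain
// `inline`, a size-optimizing build may emit it out of line in each unit,
// and the linker then keeps a single copy for all callers.
SKETCH_FORCE_INLINE int CountBitsSketch(std::uint64_t word) {
  int count = 0;
  while (word != 0) {
    word &= word - 1;  // clear the lowest set bit
    ++count;
  }
  return count;
}

// Caller that, in a real project, might live in a file built with extra
// SIMD flags. The helper above is expanded in place here.
inline std::size_t CountSetBitsSketch(const std::uint64_t* words, std::size_t n) {
  assert(words != nullptr || n == 0);
  std::size_t total = 0;
  for (std::size_t i = 0; i < n; ++i) {
    total += static_cast<std::size_t>(CountBitsSketch(words[i]));
  }
  return total;
}
```

Whether a compiler honors a forced-inline request in every case depends on the compiler and the function, so this should be read as an illustration of the idea rather than a guarantee.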
@github-actions crossbow submit test-r-install-local-minsizerel
Revision: 22cdaf6 Submitted crossbow builds: ursacomputing/crossbow @ actions-de96a45a25
@github-actions crossbow submit test-r-install-local-minsizerel
Revision: 1f4d567 Submitted crossbow builds: ursacomputing/crossbow @ actions-1d3ee175be
@jonkeane It seems that the problem is fixed with this approach in our CI. Could you confirm that this approach also fixes CI for your downstream projects mentioned at https://issues.apache.org/jira/browse/ARROW-15678?focusedCommentId=17613507#comment-17613507?
Thanks for this! Let me try to reproduce it in my upstream project. It's a bit complicated since, AFAIK, I'll need to do a manual install from this branch to get there (and the dependency resolution of the upstream project doesn't make that super easy...), but I'll let you know once I've got a test done.
OK, I did manage to replicate this manually: before the fix in 1f4d567 I get the segfault, and after the fix it works.
Thanks again!
Thanks for confirming this!
I'm OK with this. At least now that we have a CI job, we can hopefully catch any future issues like this earlier.
Thanks @kou. This certainly looks good enough for now.
Should the configuration for other runtime-dispatched files be modified as well?
Maybe. I've opened ARROW-17981 for it.
Benchmark runs are scheduled for baseline = 73cfd2d and contender = 54d8560. 54d8560 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
['Python', 'R'] benchmarks have high level of regressions.
A huge thank you to @kou for this! 🎉
   set_source_files_properties(level_conversion_bmi2.cc
                               PROPERTIES SKIP_PRECOMPILE_HEADERS ON
                                          COMPILE_FLAGS
-                                         "${ARROW_AVX2_FLAG} -DARROW_HAVE_BMI2 -mbmi2")
+                                         "${ARROW_AVX2_FLAG} -DARROW_HAVE_BMI2 ${CXX_FLAGS_RELEASE}"
Oh... I removed -mbmi2 accidentally...
It caused a build failure on macOS High Sierra:
https://github.com/ursacomputing/crossbow/actions/runs/3225397138/jobs/5278904264#step:13:7034
/Users/voltrondata/tmp/hbtmp/apache-arrow-20221011-35535-3alkwv/cpp/src/parquet/level_conversion_inc.h:278:10: error: always_inline function '_pext_u64' requires target feature 'bmi2', but would be inlined into function 'ExtractBits' that is compiled without support for 'bmi2'
return _pext_u64(bitmap, select_bitmap);
^
I'll restore the flag.
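The error above comes from calling a BMI2 intrinsic in code compiled without BMI2 enabled. As a rough sketch of the dependency (not the code in `level_conversion_inc.h`; every function name below is hypothetical), the intrinsic can only be used when the compiler defines `__BMI2__`, which GCC and Clang do when `-mbmi2` is passed. The portable loop is an assumed fallback added here for illustration.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__BMI2__) && defined(__x86_64__)
#include <immintrin.h>
#endif

// Portable equivalent of PEXT: gathers the bits of `bitmap` selected by
// `select_bitmap` into the low bits of the result, lowest selected bit first.
inline std::uint64_t ExtractBitsPortableSketch(std::uint64_t bitmap,
                                               std::uint64_t select_bitmap) {
  std::uint64_t result = 0;
  int out_pos = 0;
  while (select_bitmap != 0) {
    // Isolate the lowest set bit of the mask.
    const std::uint64_t lowest = select_bitmap & (~select_bitmap + 1);
    if ((bitmap & lowest) != 0) {
      result |= (std::uint64_t{1} << out_pos);
    }
    ++out_pos;
    select_bitmap &= select_bitmap - 1;  // clear that bit
  }
  return result;
}

// Uses the intrinsic only when this file is compiled with BMI2 enabled
// (for example with -mbmi2); otherwise falls back to the loop above.
inline std::uint64_t ExtractBitsSketch(std::uint64_t bitmap,
                                       std::uint64_t select_bitmap) {
#if defined(__BMI2__) && defined(__x86_64__)
  return _pext_u64(bitmap, select_bitmap);
#else
  return ExtractBitsPortableSketch(bitmap, select_bitmap);
#endif
}

// Debug-only check that both paths agree for a given input.
inline void CheckExtractBitsSketch(std::uint64_t bitmap,
                                   std::uint64_t select_bitmap) {
  assert(ExtractBitsSketch(bitmap, select_bitmap) ==
         ExtractBitsPortableSketch(bitmap, select_bitmap));
}
```

Without the flag, a file that calls `_pext_u64` unconditionally fails in the way the log shows, which is why the flag has to stay on the BMI2-specific source file.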
#14342 removed the -mbmi2 flag accidentally. It should not be removed. If we remove it, a build error occurs on macOS High Sierra:
https://github.com/ursacomputing/crossbow/actions/runs/3225397138/jobs/5278904264#step:13:7034
/Users/voltrondata/tmp/hbtmp/apache-arrow-20221011-35535-3alkwv/cpp/src/parquet/level_conversion_inc.h:278:10: error: always_inline function '_pext_u64' requires target feature 'bmi2', but would be inlined into function 'ExtractBits' that is compiled without support for 'bmi2'
return _pext_u64(bitmap, select_bitmap);
^
Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
### Rationale for this change

Summary of this problem: #31132 (comment)

Why is this problem happening again? Because I removed `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"` in #36583.

The solution we chose in #14342 was to force `-O2` for SIMD-related code. It works for `-DCMAKE_BUILD_TYPE=MinSizeRel` but it doesn't work for Homebrew, because Homebrew's CC https://github.com/Homebrew/brew/blob/master/Library/Homebrew/shims/super/cc forces every compile to use the same `-O` flag. The default is `-Os`. If we specify `-O2`, Homebrew's CC replaces it with `-Os`. If we use `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"`, Homebrew's CC always uses `-O2`. So the solution we chose in #14342 doesn't take effect for Homebrew. But Homebrew considers `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"` a workaround, so we need another solution for Homebrew.

Here are candidate solutions:

1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE`
2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?`

"1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE`" works because we don't use the runtime SIMD dispatch feature (the problematic feature) at all.

"2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?`" works but I don't know why... If `ENV.runtime_cpu_detection` is called, Homebrew's CC stops replacing `-march=*`. If we call `ENV.runtime_cpu_detection`, `-march=haswell` is used for AVX2-related code and `-march=skylake-avx512` is used for AVX512 (including BMI2) related code. If we don't call `ENV.runtime_cpu_detection`, `-march=nehalem` is always used. (Note that SIMD-related flags such as `-mbmi2` aren't removed by Homebrew's CC, so I think that SIMD is still enabled.) I don't know why, but "the one-definition-rule violation" (see the summary for details: #31132 (comment)) doesn't happen.
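For readers unfamiliar with runtime SIMD dispatch, the following is a simplified sketch of the general pattern, not Arrow's implementation. All names are hypothetical, the "AVX2" variant is a plain C++ stand-in so the example runs on any CPU, and the SIMD level is passed in as a parameter instead of being detected. It shows why selecting a level of "none" sidesteps the problem: only the baseline function is ever chosen.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical SIMD levels for this sketch.
enum class SimdLevelSketch { kNone, kAvx2 };

using SumFnSketch = std::int64_t (*)(const std::int32_t*, std::size_t);

// Baseline implementation, safe on every CPU.
inline std::int64_t SumBaselineSketch(const std::int32_t* values, std::size_t n) {
  std::int64_t total = 0;
  for (std::size_t i = 0; i < n; ++i) {
    total += values[i];
  }
  return total;
}

// Stand-in for a variant that would live in a separate file compiled with
// AVX2 flags. It is plain C++ here so the sketch runs anywhere.
inline std::int64_t SumAvx2StandInSketch(const std::int32_t* values, std::size_t n) {
  std::int64_t lanes[4] = {0, 0, 0, 0};
  std::size_t i = 0;
  for (; i + 4 <= n; i += 4) {
    for (std::size_t lane = 0; lane < 4; ++lane) {
      lanes[lane] += values[i + lane];
    }
  }
  std::int64_t total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
  for (; i < n; ++i) {
    total += values[i];  // leftover elements
  }
  return total;
}

// Picks an implementation for the given level. With kNone, the variant
// built with wider instruction sets is never selected.
inline SumFnSketch SelectSumSketch(SimdLevelSketch level) {
  SumFnSketch fn = (level == SimdLevelSketch::kAvx2) ? &SumAvx2StandInSketch
                                                     : &SumBaselineSketch;
  assert(fn != nullptr);
  return fn;
}
```

In a real build the level would come from CPU detection at startup, and the problematic case is a shared inline helper compiled differently in each variant's file.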
FYI: CPU info for GitHub Actions macOS hosted-runner:

```console
$ sysctl hw.optional machdep.cpu
hw.optional.adx: 0
hw.optional.aes: 1
hw.optional.avx1_0: 1
hw.optional.avx2_0: 0
hw.optional.avx512bw: 0
hw.optional.avx512cd: 0
hw.optional.avx512dq: 0
hw.optional.avx512f: 0
hw.optional.avx512ifma: 0
hw.optional.avx512vbmi: 0
hw.optional.avx512vl: 0
hw.optional.bmi1: 0
hw.optional.bmi2: 0
hw.optional.enfstrg: 0
hw.optional.f16c: 1
hw.optional.floatingpoint: 1
hw.optional.fma: 0
hw.optional.hle: 0
hw.optional.mmx: 1
hw.optional.mpx: 0
hw.optional.rdrand: 1
hw.optional.rtm: 0
hw.optional.sgx: 0
hw.optional.sse: 1
hw.optional.sse2: 1
hw.optional.sse3: 1
hw.optional.sse4_1: 1
hw.optional.sse4_2: 1
hw.optional.supplementalsse3: 1
hw.optional.x86_64: 1
machdep.cpu.address_bits.physical: 43
machdep.cpu.address_bits.virtual: 48
machdep.cpu.arch_perf.events: 127
machdep.cpu.arch_perf.events_number: 7
machdep.cpu.arch_perf.fixed_number: 0
machdep.cpu.arch_perf.fixed_width: 0
machdep.cpu.arch_perf.number: 4
machdep.cpu.arch_perf.version: 1
machdep.cpu.arch_perf.width: 48
machdep.cpu.cache.L2_associativity: 8
machdep.cpu.cache.linesize: 64
machdep.cpu.cache.size: 256
machdep.cpu.mwait.extensions: 3
machdep.cpu.mwait.linesize_max: 4096
machdep.cpu.mwait.linesize_min: 64
machdep.cpu.mwait.sub_Cstates: 16
machdep.cpu.thermal.ACNT_MCNT: 0
machdep.cpu.thermal.core_power_limits: 0
machdep.cpu.thermal.dynamic_acceleration: 0
machdep.cpu.thermal.energy_policy: 0
machdep.cpu.thermal.fine_grain_clock_mod: 0
machdep.cpu.thermal.hardware_feedback: 0
machdep.cpu.thermal.invariant_APIC_timer: 1
machdep.cpu.thermal.package_thermal_intr: 0
machdep.cpu.thermal.sensor: 0
machdep.cpu.thermal.thresholds: 0
machdep.cpu.tlb.data.small: 64
machdep.cpu.tlb.inst.large: 8
machdep.cpu.tlb.inst.small: 64
machdep.cpu.tlb.shared: 512
machdep.cpu.tsc_ccc.denominator: 0
machdep.cpu.tsc_ccc.numerator: 0
machdep.cpu.xsave.extended_state: 7 832 832 0
machdep.cpu.xsave.extended_state1: 0 0 0 0
machdep.cpu.brand: 0
machdep.cpu.brand_string: Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz
machdep.cpu.core_count: 3
machdep.cpu.cores_per_package: 4
machdep.cpu.extfamily: 0
machdep.cpu.extfeature_bits: 4967106816
machdep.cpu.extfeatures: SYSCALL XD EM64T LAHF RDTSCP TSCI
machdep.cpu.extmodel: 3
machdep.cpu.family: 6
machdep.cpu.feature_bits: 18427078393948011519
machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH MMX FXSR SSE SSE2 SS HTT SSE3 PCLMULQDQ MON VMX SSSE3 CX16 SSE4.1 SSE4.2 x2APIC POPCNT AES VMM PCID XSAVE OSXSAVE TSCTMR AVX1.0 RDRAND F16C
machdep.cpu.leaf7_feature_bits: 643 0
machdep.cpu.leaf7_feature_bits_edx: 3154117632
machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET SMEP ERMS MDCLEAR IBRS STIBP L1DF ACAPMSR SSBD
machdep.cpu.logical_per_package: 4
machdep.cpu.max_basic: 13
machdep.cpu.max_ext: 2147483656
machdep.cpu.microcode_version: 1070
machdep.cpu.model: 58
machdep.cpu.processor_flag: 0
machdep.cpu.signature: 198313
machdep.cpu.stepping: 9
machdep.cpu.thread_count: 3
machdep.cpu.vendor: GenuineIntel
```

### What changes are included in this PR?

This uses "1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE`" because it's straightforward and "2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?`" may also disable runtime SIMD dispatch implicitly.

This also adds the following debug information to make future debugging easier:

* CPU information for GitHub Actions runner
* Homebrew's build logs

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.

* Closes: #36685

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>