Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-15678: [C++] Add support for -DCMAKE_BUILD_TYPE=MinSizeRel #14342

Merged
merged 3 commits into from
Oct 10, 2022

Conversation

kou
Copy link
Member

@kou kou commented Oct 7, 2022

If we build with -DCMAKE_BUILD_TYPE=MinSizeRel, our SIMD related code may violate the one-definition-rule. See also ARROW-15664 and https://issues.apache.org/jira/browse/ARROW-15678?focusedCommentId=17613909#comment-17613909 for details.

This change prevents the one-definition-rule violation by forcing inlining as much as possible.

If we build with -DCMAKE_BUILD_TYPE=MinSizeRel, our SIMD related code
may violate the one-definition-rule. See also ARROW-15664 and
https://issues.apache.org/jira/browse/ARROW-15678?focusedCommentId=17613909#comment-17613909
for details.

This change prevents the one-definition-rule violation by forcing
inlining as much as possible.
@kou
Copy link
Member Author

kou commented Oct 7, 2022

@github-actions crossbow submit test-r-install-local-minsizerel

@github-actions
Copy link

github-actions bot commented Oct 7, 2022

@github-actions
Copy link

github-actions bot commented Oct 7, 2022

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

@github-actions
Copy link

github-actions bot commented Oct 7, 2022

Revision: 22cdaf6

Submitted crossbow builds: ursacomputing/crossbow @ actions-de96a45a25

Task Status
test-r-install-local-minsizerel Github Actions

@kou
Copy link
Member Author

kou commented Oct 7, 2022

@github-actions crossbow submit test-r-install-local-minsizerel

@github-actions
Copy link

github-actions bot commented Oct 7, 2022

Revision: 1f4d567

Submitted crossbow builds: ursacomputing/crossbow @ actions-1d3ee175be

Task Status
test-r-install-local-minsizerel Github Actions

@kou
Copy link
Member Author

kou commented Oct 7, 2022

@jonkeane It seems that the problem is fixed with this approach in our CI. Could you confirm that this approach also fixes CI for your downstream projects mentioned at https://issues.apache.org/jira/browse/ARROW-15678?focusedCommentId=17613507#comment-17613507?

@jonkeane
Copy link
Member

jonkeane commented Oct 8, 2022

Thanks for this! Lemme try to reproduce it in my upstream project — it's a bit complicated since AFAIK, I'll need to do a manual install from this branch to get there (and the dependency resolution of the upstream project doesn't make that super easy...) — but I'll let you know once I've got a test done.

Copy link
Member

@jonkeane jonkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I did manage to replicate this manually, before the fix in 1f4d567, I get the segfault, but after it works

Thanks again!

@kou
Copy link
Member Author

kou commented Oct 9, 2022

Thanks for confirming this!
I'll update comments in source before we merge this.

@kou
Copy link
Member Author

kou commented Oct 9, 2022

@pitrou @lidavidm Can we use this approach for 10.0.0?

Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm OK with this. At least now that we have a CI job, we can hopefully catch any future issues like this earlier.

Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kou . This certainly looks good enough for now.
Should the configuration for other runtime-dispatched files be modified as well?

@kou
Copy link
Member Author

kou commented Oct 10, 2022

Should the configuration for other runtime-dispatched files be modified as well?

Maybe. I've opened ARROW-17981 for it.

@kou kou merged commit 54d8560 into apache:master Oct 10, 2022
@kou kou deleted the cpp-cmake-minsizerel branch October 10, 2022 21:18
@ursabot
Copy link

ursabot commented Oct 11, 2022

Benchmark runs are scheduled for baseline = 73cfd2d and contender = 54d8560. 54d8560 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.56% ⬆️0.0%] test-mac-arm
[Failed ⬇️0.55% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.29% ⬆️0.0%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 54d85602 ec2-t3-xlarge-us-east-2
[Failed] 54d85602 test-mac-arm
[Failed] 54d85602 ursa-i9-9960x
[Finished] 54d85602 ursa-thinkcentre-m75q
[Finished] 73cfd2d0 ec2-t3-xlarge-us-east-2
[Failed] 73cfd2d0 test-mac-arm
[Failed] 73cfd2d0 ursa-i9-9960x
[Finished] 73cfd2d0 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@ursabot
Copy link

ursabot commented Oct 11, 2022

['Python', 'R'] benchmarks have high level of regressions.
ursa-i9-9960x

@jonkeane
Copy link
Member

A huge thank you to @kou for this! 🎉

set_source_files_properties(level_conversion_bmi2.cc
PROPERTIES SKIP_PRECOMPILE_HEADERS ON
COMPILE_FLAGS
"${ARROW_AVX2_FLAG} -DARROW_HAVE_BMI2 -mbmi2")
"${ARROW_AVX2_FLAG} -DARROW_HAVE_BMI2 ${CXX_FLAGS_RELEASE}"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh... I removed -mbmi2 accidentally...
It caused a build failure on macOS High Sierra:

https://github.com/ursacomputing/crossbow/actions/runs/3225397138/jobs/5278904264#step:13:7034

/Users/voltrondata/tmp/hbtmp/apache-arrow-20221011-35535-3alkwv/cpp/src/parquet/level_conversion_inc.h:278:10: error: always_inline function '_pext_u64' requires target feature 'bmi2', but would be inlined into function 'ExtractBits' that is compiled without support for 'bmi2'
  return _pext_u64(bitmap, select_bitmap);
         ^

I'll restore the flag.

kou added a commit to kou/arrow that referenced this pull request Oct 11, 2022
apache#14342 removed -mbmi2 flag
accidentally. It should not be removed. If we remove it, a build error
is occurred on macOS High Sierra.

https://github.com/ursacomputing/crossbow/actions/runs/3225397138/jobs/5278904264#step:13:7034

    /Users/voltrondata/tmp/hbtmp/apache-arrow-20221011-35535-3alkwv/cpp/src/parquet/level_conversion_inc.h:278:10:
    error: always_inline function '_pext_u64' requires target feature
    'bmi2', but would be inlined into function 'ExtractBits' that is
    compiled without support for 'bmi2'
      return _pext_u64(bitmap, select_bitmap);
             ^
kou added a commit that referenced this pull request Oct 11, 2022
#14342 removed -mbmi2 flag accidentally. It should not be removed. If we remove it, a build error is occurred on macOS High Sierra.

https://github.com/ursacomputing/crossbow/actions/runs/3225397138/jobs/5278904264#step:13:7034

    /Users/voltrondata/tmp/hbtmp/apache-arrow-20221011-35535-3alkwv/cpp/src/parquet/level_conversion_inc.h:278:10:
    error: always_inline function '_pext_u64' requires target feature
    'bmi2', but would be inlined into function 'ExtractBits' that is
    compiled without support for 'bmi2'
      return _pext_u64(bitmap, select_bitmap);
             ^

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
kou added a commit that referenced this pull request Jul 30, 2023
### Rationale for this change

Summary of this problem: #31132 (comment)

Why this problem is happen again? Because I removed `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"` in #36583. The solution we chose by #14342 was forcing to use `-O2` for SIMD related code. It works for `-DCMAKE_BUILD_TYPE=MinSizeRel` but it doesn't work for Homebrew.

Because Homebrew's CC https://github.com/Homebrew/brew/blob/master/Library/Homebrew/shims/super/cc forces to use the same `-O` flag. The default is `-Os`. If we specify `-O2`, Homebrew's CC replaces it to `-Os`. If we use  `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"`, Homebrew's CC always use `-O2`. So the solution we chose by #14342 isn't used for Homebrew.

But Homebrew thinks that `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"` is a workaround. So we need another solution for Homebrew.

Here are candidate solutions:
1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE`
2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?`

"1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE`" works because we don't use the runtime SIMD dispatch feature (the problematic feature) entirely.

"2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?`" works but I don't know why... If `ENV.runtime_cpu_detection` is called, Homebrew's CC stops replacing `-march=*`. If we call `ENV.runtime_cpu_detection`, `-march=haswell` is used for AVX2 related code and `-march=skylake-avx512` is used for AVX512 including BMI2 related code. If we don't call `ENV.runtime_cpu_detection`, `-march=nehalem` is always used. (Note that SIMD related flags such as `-mbmi2` aren't removed by Homebrew's CC. So I think that SIMD is enabled.) I don't know why but "the one-definition-rule violation" (see the summary for details: #31132 (comment) ) isn't happen.

FYI: CPU info for GitHub Actions macOS hosted-runner:

```console
$ sysctl hw.optional machdep.cpu
hw.optional.adx: 0
hw.optional.aes: 1
hw.optional.avx1_0: 1
hw.optional.avx2_0: 0
hw.optional.avx512bw: 0
hw.optional.avx512cd: 0
hw.optional.avx512dq: 0
hw.optional.avx512f: 0
hw.optional.avx512ifma: 0
hw.optional.avx512vbmi: 0
hw.optional.avx512vl: 0
hw.optional.bmi1: 0
hw.optional.bmi2: 0
hw.optional.enfstrg: 0
hw.optional.f16c: 1
hw.optional.floatingpoint: 1
hw.optional.fma: 0
hw.optional.hle: 0
hw.optional.mmx: 1
hw.optional.mpx: 0
hw.optional.rdrand: 1
hw.optional.rtm: 0
hw.optional.sgx: 0
hw.optional.sse: 1
hw.optional.sse2: 1
hw.optional.sse3: 1
hw.optional.sse4_1: 1
hw.optional.sse4_2: 1
hw.optional.supplementalsse3: 1
hw.optional.x86_64: 1
machdep.cpu.address_bits.physical: 43
machdep.cpu.address_bits.virtual: 48
machdep.cpu.arch_perf.events: 127
machdep.cpu.arch_perf.events_number: 7
machdep.cpu.arch_perf.fixed_number: 0
machdep.cpu.arch_perf.fixed_width: 0
machdep.cpu.arch_perf.number: 4
machdep.cpu.arch_perf.version: 1
machdep.cpu.arch_perf.width: 48
machdep.cpu.cache.L2_associativity: 8
machdep.cpu.cache.linesize: 64
machdep.cpu.cache.size: 256
machdep.cpu.mwait.extensions: 3
machdep.cpu.mwait.linesize_max: 4096
machdep.cpu.mwait.linesize_min: 64
machdep.cpu.mwait.sub_Cstates: 16
machdep.cpu.thermal.ACNT_MCNT: 0
machdep.cpu.thermal.core_power_limits: 0
machdep.cpu.thermal.dynamic_acceleration: 0
machdep.cpu.thermal.energy_policy: 0
machdep.cpu.thermal.fine_grain_clock_mod: 0
machdep.cpu.thermal.hardware_feedback: 0
machdep.cpu.thermal.invariant_APIC_timer: 1
machdep.cpu.thermal.package_thermal_intr: 0
machdep.cpu.thermal.sensor: 0
machdep.cpu.thermal.thresholds: 0
machdep.cpu.tlb.data.small: 64
machdep.cpu.tlb.inst.large: 8
machdep.cpu.tlb.inst.small: 64
machdep.cpu.tlb.shared: 512
machdep.cpu.tsc_ccc.denominator: 0
machdep.cpu.tsc_ccc.numerator: 0
machdep.cpu.xsave.extended_state: 7 832 832 0
machdep.cpu.xsave.extended_state1: 0 0 0 0
machdep.cpu.brand: 0
machdep.cpu.brand_string: Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz
machdep.cpu.core_count: 3
machdep.cpu.cores_per_package: 4
machdep.cpu.extfamily: 0
machdep.cpu.extfeature_bits: 4967106816
machdep.cpu.extfeatures: SYSCALL XD EM64T LAHF RDTSCP TSCI
machdep.cpu.extmodel: 3
machdep.cpu.family: 6
machdep.cpu.feature_bits: 18427078393948011519
machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH MMX FXSR SSE SSE2 SS HTT SSE3 PCLMULQDQ MON VMX SSSE3 CX16 SSE4.1 SSE4.2 x2APIC POPCNT AES VMM PCID XSAVE OSXSAVE TSCTMR AVX1.0 RDRAND F16C
machdep.cpu.leaf7_feature_bits: 643 0
machdep.cpu.leaf7_feature_bits_edx: 3154117632
machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET SMEP ERMS MDCLEAR IBRS STIBP L1DF ACAPMSR SSBD
machdep.cpu.logical_per_package: 4
machdep.cpu.max_basic: 13
machdep.cpu.max_ext: 2147483656
machdep.cpu.microcode_version: 1070
machdep.cpu.model: 58
machdep.cpu.processor_flag: 0
machdep.cpu.signature: 198313
machdep.cpu.stepping: 9
machdep.cpu.thread_count: 3
machdep.cpu.vendor: GenuineIntel
```

### What changes are included in this PR?

"1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE`" because it's straightforward and  "2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?`" may also disable runtime SIMD dispatch implicitly.

This also adds the following debug information for easy to debug in future:
* CPU information for GitHub Actions runner
* Homebrew's build logs

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.

* Closes: #36685

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
R-JunmingChen pushed a commit to R-JunmingChen/arrow that referenced this pull request Aug 20, 2023
…ache#36705)

### Rationale for this change

Summary of this problem: apache#31132 (comment)

Why this problem is happen again? Because I removed `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"` in apache#36583. The solution we chose by apache#14342 was forcing to use `-O2` for SIMD related code. It works for `-DCMAKE_BUILD_TYPE=MinSizeRel` but it doesn't work for Homebrew.

Because Homebrew's CC https://github.com/Homebrew/brew/blob/master/Library/Homebrew/shims/super/cc forces to use the same `-O` flag. The default is `-Os`. If we specify `-O2`, Homebrew's CC replaces it to `-Os`. If we use  `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"`, Homebrew's CC always use `-O2`. So the solution we chose by apache#14342 isn't used for Homebrew.

But Homebrew thinks that `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"` is a workaround. So we need another solution for Homebrew.

Here are candidate solutions:
1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE`
2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?`

"1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE`" works because we don't use the runtime SIMD dispatch feature (the problematic feature) entirely.

"2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?`" works but I don't know why... If `ENV.runtime_cpu_detection` is called, Homebrew's CC stops replacing `-march=*`. If we call `ENV.runtime_cpu_detection`, `-march=haswell` is used for AVX2 related code and `-march=skylake-avx512` is used for AVX512 including BMI2 related code. If we don't call `ENV.runtime_cpu_detection`, `-march=nehalem` is always used. (Note that SIMD related flags such as `-mbmi2` aren't removed by Homebrew's CC. So I think that SIMD is enabled.) I don't know why but "the one-definition-rule violation" (see the summary for details: apache#31132 (comment) ) isn't happen.

FYI: CPU info for GitHub Actions macOS hosted-runner:

```console
$ sysctl hw.optional machdep.cpu
hw.optional.adx: 0
hw.optional.aes: 1
hw.optional.avx1_0: 1
hw.optional.avx2_0: 0
hw.optional.avx512bw: 0
hw.optional.avx512cd: 0
hw.optional.avx512dq: 0
hw.optional.avx512f: 0
hw.optional.avx512ifma: 0
hw.optional.avx512vbmi: 0
hw.optional.avx512vl: 0
hw.optional.bmi1: 0
hw.optional.bmi2: 0
hw.optional.enfstrg: 0
hw.optional.f16c: 1
hw.optional.floatingpoint: 1
hw.optional.fma: 0
hw.optional.hle: 0
hw.optional.mmx: 1
hw.optional.mpx: 0
hw.optional.rdrand: 1
hw.optional.rtm: 0
hw.optional.sgx: 0
hw.optional.sse: 1
hw.optional.sse2: 1
hw.optional.sse3: 1
hw.optional.sse4_1: 1
hw.optional.sse4_2: 1
hw.optional.supplementalsse3: 1
hw.optional.x86_64: 1
machdep.cpu.address_bits.physical: 43
machdep.cpu.address_bits.virtual: 48
machdep.cpu.arch_perf.events: 127
machdep.cpu.arch_perf.events_number: 7
machdep.cpu.arch_perf.fixed_number: 0
machdep.cpu.arch_perf.fixed_width: 0
machdep.cpu.arch_perf.number: 4
machdep.cpu.arch_perf.version: 1
machdep.cpu.arch_perf.width: 48
machdep.cpu.cache.L2_associativity: 8
machdep.cpu.cache.linesize: 64
machdep.cpu.cache.size: 256
machdep.cpu.mwait.extensions: 3
machdep.cpu.mwait.linesize_max: 4096
machdep.cpu.mwait.linesize_min: 64
machdep.cpu.mwait.sub_Cstates: 16
machdep.cpu.thermal.ACNT_MCNT: 0
machdep.cpu.thermal.core_power_limits: 0
machdep.cpu.thermal.dynamic_acceleration: 0
machdep.cpu.thermal.energy_policy: 0
machdep.cpu.thermal.fine_grain_clock_mod: 0
machdep.cpu.thermal.hardware_feedback: 0
machdep.cpu.thermal.invariant_APIC_timer: 1
machdep.cpu.thermal.package_thermal_intr: 0
machdep.cpu.thermal.sensor: 0
machdep.cpu.thermal.thresholds: 0
machdep.cpu.tlb.data.small: 64
machdep.cpu.tlb.inst.large: 8
machdep.cpu.tlb.inst.small: 64
machdep.cpu.tlb.shared: 512
machdep.cpu.tsc_ccc.denominator: 0
machdep.cpu.tsc_ccc.numerator: 0
machdep.cpu.xsave.extended_state: 7 832 832 0
machdep.cpu.xsave.extended_state1: 0 0 0 0
machdep.cpu.brand: 0
machdep.cpu.brand_string: Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz
machdep.cpu.core_count: 3
machdep.cpu.cores_per_package: 4
machdep.cpu.extfamily: 0
machdep.cpu.extfeature_bits: 4967106816
machdep.cpu.extfeatures: SYSCALL XD EM64T LAHF RDTSCP TSCI
machdep.cpu.extmodel: 3
machdep.cpu.family: 6
machdep.cpu.feature_bits: 18427078393948011519
machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH MMX FXSR SSE SSE2 SS HTT SSE3 PCLMULQDQ MON VMX SSSE3 CX16 SSE4.1 SSE4.2 x2APIC POPCNT AES VMM PCID XSAVE OSXSAVE TSCTMR AVX1.0 RDRAND F16C
machdep.cpu.leaf7_feature_bits: 643 0
machdep.cpu.leaf7_feature_bits_edx: 3154117632
machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET SMEP ERMS MDCLEAR IBRS STIBP L1DF ACAPMSR SSBD
machdep.cpu.logical_per_package: 4
machdep.cpu.max_basic: 13
machdep.cpu.max_ext: 2147483656
machdep.cpu.microcode_version: 1070
machdep.cpu.model: 58
machdep.cpu.processor_flag: 0
machdep.cpu.signature: 198313
machdep.cpu.stepping: 9
machdep.cpu.thread_count: 3
machdep.cpu.vendor: GenuineIntel
```

### What changes are included in this PR?

"1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE`" because it's straightforward and  "2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?`" may also disable runtime SIMD dispatch implicitly.

This also adds the following debug information for easy to debug in future:
* CPU information for GitHub Actions runner
* Homebrew's build logs

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.

* Closes: apache#36685

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
…ache#36705)

### Rationale for this change

Summary of this problem: apache#31132 (comment)

Why this problem is happen again? Because I removed `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"` in apache#36583. The solution we chose by apache#14342 was forcing to use `-O2` for SIMD related code. It works for `-DCMAKE_BUILD_TYPE=MinSizeRel` but it doesn't work for Homebrew.

Because Homebrew's CC https://github.com/Homebrew/brew/blob/master/Library/Homebrew/shims/super/cc forces to use the same `-O` flag. The default is `-Os`. If we specify `-O2`, Homebrew's CC replaces it to `-Os`. If we use  `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"`, Homebrew's CC always use `-O2`. So the solution we chose by apache#14342 isn't used for Homebrew.

But Homebrew thinks that `ENV["HOMEBREW_OPTIMIZATION_LEVEL"] = "O2"` is a workaround. So we need another solution for Homebrew.

Here are candidate solutions:
1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE`
2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?`

"1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE`" works because we don't use the runtime SIMD dispatch feature (the problematic feature) entirely.

"2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?`" works but I don't know why... If `ENV.runtime_cpu_detection` is called, Homebrew's CC stops replacing `-march=*`. If we call `ENV.runtime_cpu_detection`, `-march=haswell` is used for AVX2 related code and `-march=skylake-avx512` is used for AVX512 including BMI2 related code. If we don't call `ENV.runtime_cpu_detection`, `-march=nehalem` is always used. (Note that SIMD related flags such as `-mbmi2` aren't removed by Homebrew's CC. So I think that SIMD is enabled.) I don't know why but "the one-definition-rule violation" (see the summary for details: apache#31132 (comment) ) isn't happen.

FYI: CPU info for GitHub Actions macOS hosted-runner:

```console
$ sysctl hw.optional machdep.cpu
hw.optional.adx: 0
hw.optional.aes: 1
hw.optional.avx1_0: 1
hw.optional.avx2_0: 0
hw.optional.avx512bw: 0
hw.optional.avx512cd: 0
hw.optional.avx512dq: 0
hw.optional.avx512f: 0
hw.optional.avx512ifma: 0
hw.optional.avx512vbmi: 0
hw.optional.avx512vl: 0
hw.optional.bmi1: 0
hw.optional.bmi2: 0
hw.optional.enfstrg: 0
hw.optional.f16c: 1
hw.optional.floatingpoint: 1
hw.optional.fma: 0
hw.optional.hle: 0
hw.optional.mmx: 1
hw.optional.mpx: 0
hw.optional.rdrand: 1
hw.optional.rtm: 0
hw.optional.sgx: 0
hw.optional.sse: 1
hw.optional.sse2: 1
hw.optional.sse3: 1
hw.optional.sse4_1: 1
hw.optional.sse4_2: 1
hw.optional.supplementalsse3: 1
hw.optional.x86_64: 1
machdep.cpu.address_bits.physical: 43
machdep.cpu.address_bits.virtual: 48
machdep.cpu.arch_perf.events: 127
machdep.cpu.arch_perf.events_number: 7
machdep.cpu.arch_perf.fixed_number: 0
machdep.cpu.arch_perf.fixed_width: 0
machdep.cpu.arch_perf.number: 4
machdep.cpu.arch_perf.version: 1
machdep.cpu.arch_perf.width: 48
machdep.cpu.cache.L2_associativity: 8
machdep.cpu.cache.linesize: 64
machdep.cpu.cache.size: 256
machdep.cpu.mwait.extensions: 3
machdep.cpu.mwait.linesize_max: 4096
machdep.cpu.mwait.linesize_min: 64
machdep.cpu.mwait.sub_Cstates: 16
machdep.cpu.thermal.ACNT_MCNT: 0
machdep.cpu.thermal.core_power_limits: 0
machdep.cpu.thermal.dynamic_acceleration: 0
machdep.cpu.thermal.energy_policy: 0
machdep.cpu.thermal.fine_grain_clock_mod: 0
machdep.cpu.thermal.hardware_feedback: 0
machdep.cpu.thermal.invariant_APIC_timer: 1
machdep.cpu.thermal.package_thermal_intr: 0
machdep.cpu.thermal.sensor: 0
machdep.cpu.thermal.thresholds: 0
machdep.cpu.tlb.data.small: 64
machdep.cpu.tlb.inst.large: 8
machdep.cpu.tlb.inst.small: 64
machdep.cpu.tlb.shared: 512
machdep.cpu.tsc_ccc.denominator: 0
machdep.cpu.tsc_ccc.numerator: 0
machdep.cpu.xsave.extended_state: 7 832 832 0
machdep.cpu.xsave.extended_state1: 0 0 0 0
machdep.cpu.brand: 0
machdep.cpu.brand_string: Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz
machdep.cpu.core_count: 3
machdep.cpu.cores_per_package: 4
machdep.cpu.extfamily: 0
machdep.cpu.extfeature_bits: 4967106816
machdep.cpu.extfeatures: SYSCALL XD EM64T LAHF RDTSCP TSCI
machdep.cpu.extmodel: 3
machdep.cpu.family: 6
machdep.cpu.feature_bits: 18427078393948011519
machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH MMX FXSR SSE SSE2 SS HTT SSE3 PCLMULQDQ MON VMX SSSE3 CX16 SSE4.1 SSE4.2 x2APIC POPCNT AES VMM PCID XSAVE OSXSAVE TSCTMR AVX1.0 RDRAND F16C
machdep.cpu.leaf7_feature_bits: 643 0
machdep.cpu.leaf7_feature_bits_edx: 3154117632
machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET SMEP ERMS MDCLEAR IBRS STIBP L1DF ACAPMSR SSBD
machdep.cpu.logical_per_package: 4
machdep.cpu.max_basic: 13
machdep.cpu.max_ext: 2147483656
machdep.cpu.microcode_version: 1070
machdep.cpu.model: 58
machdep.cpu.processor_flag: 0
machdep.cpu.signature: 198313
machdep.cpu.stepping: 9
machdep.cpu.thread_count: 3
machdep.cpu.vendor: GenuineIntel
```

### What changes are included in this PR?

"1. `-DARROW_RUNTIME_SIMD_LEVEL=NONE`" because it's straightforward and  "2. Remove `ENV.runtime_cpu_detection if Hardware::CPU.intel?`" may also disable runtime SIMD dispatch implicitly.

This also adds the following debug information for easy to debug in future:
* CPU information for GitHub Actions runner
* Homebrew's build logs

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.

* Closes: apache#36685

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants