ARROW-17436: [C++] Use -O2 instead of -O3 for RELEASE builds #13661

Merged: 5 commits into apache:master on Aug 17, 2022

Conversation

wesm (Member) commented Jul 20, 2022

Motivated by investigation in #13654. To be discussed

wesm (Member Author) commented Jul 20, 2022

@ursabot benchmark please

ursabot commented Jul 20, 2022

Supported benchmark command examples:

@ursabot benchmark help

To run all benchmarks:
@ursabot please benchmark

To filter benchmarks by language:
@ursabot please benchmark lang=Python
@ursabot please benchmark lang=C++
@ursabot please benchmark lang=R
@ursabot please benchmark lang=Java
@ursabot please benchmark lang=JavaScript

To filter Python and R benchmarks by name:
@ursabot please benchmark name=file-write
@ursabot please benchmark name=file-write lang=Python
@ursabot please benchmark name=file-.*

To filter C++ benchmarks by archery --suite-filter and --benchmark-filter:
@ursabot please benchmark command=cpp-micro --suite-filter=arrow-compute-vector-selection-benchmark --benchmark-filter=TakeStringRandomIndicesWithNulls/262144/2 --iterations=3

For other command=cpp-micro options, please see https://github.com/ursacomputing/benchmarks/blob/main/benchmarks/cpp_micro_benchmarks.py

wesm (Member Author) commented Jul 20, 2022

@ursabot please benchmark

ursabot commented Jul 20, 2022

Benchmark runs are scheduled for baseline = 1214083 and contender = 46e3195. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Failed ⬇️0.0% ⬆️0.0% ⚠️ Contender and baseline run contexts do not match] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.0% ⬆️0.0% ⚠️ Contender and baseline run contexts do not match] test-mac-arm
[Failed ⬇️0.0% ⬆️0.0% ⚠️ Contender and baseline run contexts do not match] ursa-i9-9960x
[Finished ⬇️0.0% ⬆️0.0% ⚠️ Contender and baseline run contexts do not match] ursa-thinkcentre-m75q
Buildkite builds:
[Failed] 46e31957 ec2-t3-xlarge-us-east-2
[Failed] 46e31957 test-mac-arm
[Failed] 46e31957 ursa-i9-9960x
[Finished] 46e31957 ursa-thinkcentre-m75q
[Failed] 1214083f ec2-t3-xlarge-us-east-2
[Failed] 1214083f test-mac-arm
[Failed] 1214083f ursa-i9-9960x
[Finished] 1214083f ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

wesm (Member Author) commented Jul 20, 2022

cc @pitrou

wesm (Member Author) commented Jul 20, 2022

Here's a dump of symbols that shrink the most in -O2:

https://gist.github.com/wesm/4a2815077ed37b671d6160b8abec5e7c

I'd be interested to see if e.g. unsafe numeric casts are significantly affected by this

cyb70289 (Contributor) commented:

Looks great.

I believe the big regressions in some tests are not real.
E.g., arrow-bit-util-benchmark: BenchmarkBitmapVisitUInt8And/32768/0 drops from 13.983 GiB/s (O3) to 503.375 MiB/s (O2).
Tested on my local host with clang-12, the result is 134 MB/s for both O2 and O3. The huge gap is probably due to aggressive inlining and optimization that makes the micro-benchmark far from reality.

One catch is that gcc -O2 disables vectorization, while clang -O2 keeps it. We may need additional -f flags if we want to keep some useful features.

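As a concrete illustration of the vectorization difference described above (a sketch, not code from this PR; the function and file names are made up): a simple elementwise loop like the one below is auto-vectorized by gcc at -O3, or at -O2 only when -ftree-vectorize is added (before gcc 12), while clang already vectorizes it at -O2. gcc's -fopt-info-vec option reports which loops were vectorized.

#include <cstdint>

// Compile and compare the vectorizer reports, e.g.:
//   g++ -O2 -c add.cc -fopt-info-vec                   # typically nothing vectorized (gcc < 12)
//   g++ -O2 -ftree-vectorize -c add.cc -fopt-info-vec  # loop vectorized
//   g++ -O3 -c add.cc -fopt-info-vec                   # loop vectorized
// __restrict rules out aliasing so no runtime alias checks are needed.
void AddInt32(const int32_t* __restrict a, const int32_t* __restrict b,
              int32_t* __restrict out, int64_t n) {
  for (int64_t i = 0; i < n; ++i) {
    out[i] = a[i] + b[i];
  }
}
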
pitrou (Member) commented Jul 21, 2022

Hmm, perhaps the bit-util micro-benchmarks are a bit pathological, but other regressions seem real and quite significant...

cyb70289 (Contributor) commented:

Not sure which gcc version is used in conbench.
From the gcc-10 man page:

-O3 Optimize yet more.  -O3 turns on all optimizations specified by -O2 and also turns on the following optimization
    flags:

    -fgcse-after-reload -finline-functions -fipa-cp-clone -floop-interchange -floop-unroll-and-jam -fpeel-loops
    -fpredictive-commoning -fsplit-paths -ftree-loop-distribute-patterns -ftree-loop-distribution -ftree-loop-vectorize
    -ftree-partial-pre -ftree-slp-vectorize -funswitch-loops -fvect-cost-model -fversion-loops-for-strides

Perhaps try -O2 -ftree-loop-vectorize? This does impact the arrow-bit-util-benchmark gcc results, per my test.

pitrou (Member) commented Jul 21, 2022

There's -ftree-vectorize which apparently enables all vectorization (both "tree" and "SLP").

pitrou (Member) commented Jul 21, 2022

Also, apparently with gcc 12 the following flags are enabled with -O2: -ftree-loop-vectorize -ftree-slp-vectorize -fvect-cost-model=very-cheap

Inline review comments on this part of the diff:

if(NOT MSVC)
string(REPLACE "-O3 -DNDEBUG" "" CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE}")

Reviewer (Member): Why remove -DNDEBUG here?

Reviewer (Member): Especially, RelWithDebInfo needs -DNDEBUG, otherwise runtime assertions are enabled.

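For context on the review comments above (a minimal illustration, not Arrow code): NDEBUG controls the standard assert macro, so dropping -DNDEBUG from the release flags re-enables any runtime checks that go through assert.

#include <cassert>

int Divide(int a, int b) {
  // With -DNDEBUG on the command line (as in CMake's default Release and
  // RelWithDebInfo flags), this assert expands to nothing; without it, the
  // check runs at runtime and aborts on failure.
  assert(b != 0 && "divisor must be non-zero");
  return a / b;
}
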
pitrou (Member) commented Jul 21, 2022

@ursabot please benchmark lang=C++

ursabot commented Jul 21, 2022

Benchmark runs are scheduled for baseline = 1214083 and contender = e4e430b. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Only ['Python'] langs are supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.0% ⬆️0.0% ⚠️ Contender and baseline run contexts do not match] test-mac-arm
[Skipped ⚠️ Only ['JavaScript', 'Python', 'R'] langs are supported on ursa-i9-9960x] ursa-i9-9960x
[Finished ⬇️0.0% ⬆️0.0% ⚠️ Contender and baseline run contexts do not match] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] e4e430b4 test-mac-arm
[Finished] e4e430b4 ursa-thinkcentre-m75q
[Failed] 1214083f ec2-t3-xlarge-us-east-2
[Failed] 1214083f test-mac-arm
[Failed] 1214083f ursa-i9-9960x
[Finished] 1214083f ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

pitrou (Member) commented Jul 21, 2022

OK, so I ran locally with gcc 9.4.0 (Ubuntu 20.04) on AMD Zen 2.

Build times (with ccache disabled)

  • -O3:
real	1m25,888s
user	21m58,527s
sys	0m48,888s
  • -O2:
real	1m21,713s
user	20m57,462s
sys	0m48,843s
  • -O2 -ftree-vectorize:
real	1m23,461s
user	21m11,970s
sys	0m49,485s

Lib sizes

  • -O3:
   text	   data	    bss	    dec	    hex	filename
23230116	 326105	2197509	25753730	188f882	build-bundled-release/release/libarrow.so
1276669	  24928	   2346	1303943	 13e587	build-bundled-release/release/libarrow_testing.so
  • -O2:
22574451	 326665	2197509	25098625	17ef981	build-bundled-release/release/libarrow.so
1268546	  25144	   2346	1296036	 13c6a4	build-bundled-release/release/libarrow_testing.so
  • -O2 -ftree-vectorize:
22639315	 326713	2197509	25163537	17ff711	build-bundled-release/release/libarrow.so
1257242	  25144	   2346	1284732	 139a7c	build-bundled-release/release/libarrow_testing.so

Compute benchmarks

All in all, in this case:

  • the size and build time reductions are measurable but small (less than 5%)
  • -ftree-vectorize brings a net performance improvement over bare -O2
  • performance seems globally comparable between -O2 -ftree-vectorize and -O3, despite some large disparities in individual micro-benchmarks

pitrou (Member) commented Jul 21, 2022

Hmm, I notice that -DNDEBUG was not passed anymore, so perhaps that affected some benchmarks :-(

Edit: re-ran benchmarks and updated the gists above.

pitrou (Member) commented Jul 21, 2022

@ursabot please benchmark lang=C++

ursabot commented Jul 21, 2022

Benchmark runs are scheduled for baseline = 1214083 and contender = 6dd7bab. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Only ['Python'] langs are supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.0% ⬆️0.0% ⚠️ Contender and baseline run contexts do not match] test-mac-arm
[Skipped ⚠️ Only ['JavaScript', 'Python', 'R'] langs are supported on ursa-i9-9960x] ursa-i9-9960x
[Finished ⬇️0.0% ⬆️0.0% ⚠️ Contender and baseline run contexts do not match] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 6dd7bab5 test-mac-arm
[Finished] 6dd7bab5 ursa-thinkcentre-m75q
[Failed] 1214083f ec2-t3-xlarge-us-east-2
[Failed] 1214083f test-mac-arm
[Failed] 1214083f ursa-i9-9960x
[Finished] 1214083f ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

pitrou (Member) commented Jul 21, 2022

cc @save-buffer @westonpace for further opinions

wesm (Member Author) commented Jul 21, 2022

I would guess there are places that benefit from loop unswitching also.

pitrou (Member) commented Jul 21, 2022

> I would guess there are places that benefit from loop unswitching also.

Hmm, I don't think it's our duty to micro-optimize compiler options, though. There are too many moving parts (compiler brand, compiler version, architecture, etc.).

wesm (Member Author) commented Jul 21, 2022

I would say that we should just keep O3 and keep an eye on symbol sizes in case we need to intervene occasionally. On the whole I think the symbol sizes we have are not too bad.

wesm (Member Author) commented Jul 21, 2022

@ursabot please benchmark lang=C++

ursabot commented Jul 21, 2022

Benchmark runs are scheduled for baseline = 8a2acaa and contender = 7e5ca1a. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Only ['Python'] langs are supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.0% ⬆️0.0% ⚠️ Contender and baseline run contexts do not match] test-mac-arm
[Skipped ⚠️ Only ['JavaScript', 'Python', 'R'] langs are supported on ursa-i9-9960x] ursa-i9-9960x
[Finished ⬇️0.0% ⬆️0.0% ⚠️ Contender and baseline run contexts do not match] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 7e5ca1a5 test-mac-arm
[Finished] 7e5ca1a5 ursa-thinkcentre-m75q
[Failed] 8a2acaa4 ec2-t3-xlarge-us-east-2
[Failed] 8a2acaa4 test-mac-arm
[Failed] 8a2acaa4 ursa-i9-9960x
[Finished] 8a2acaa4 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

wesm (Member Author) commented Jul 21, 2022

I added -ftree-vectorize for gcc

pitrou (Member) commented Jul 21, 2022

You undid the changes I had already pushed :-(

Review threads on cpp/cmake_modules/SetupCxxFlags.cmake (resolved; two outdated).
save-buffer (Contributor) commented Jul 21, 2022

I've been trying to get caught up on the context here - I took a look at #13654. My current understanding is:

  • The problem we are trying to solve is the insanely large functions generated by the codegen framework when using -O3.
  • The theory is that it has to do with -O3 applying tons of aggressive optimizations that lead to lots of bloat due to too much vectorized code.

Does that sound right?

So looking at the results, -O3 adds about 1 MB (to ~22 MB) to the total binary size, so I don't think that's an issue in itself. However, there is something to be said about bloating individual kernels. Reading the other PR, it seems like one of the kernels was 40 KB? That's quite alarming, as chips these days have about 32 KB of icache; in the worst case, that's quite a bit of thrashing.
That particular disassembly looks to me like the compiler is vectorizing the loop and then unrolling the vectorized version.

As for solutions: Looking at the benchmarks, it seems like the current code is pretty unstable with regards to what the compiler generates when it comes to flags. I'm not sure messing with compiler flags will be one-size-fits-all as each combination of flags causes large changes in the generated code. I did like the changes in #13654.

I really liked this point, which very much aligns with my experience and intuition that abstract templates lead to unstable code generation:

> our approach (so much for "zero cost abstractions") for generalizing to abstract between writing to an array versus packing a bitmap is causing too much code to be generated.

So in my mind, two solutions we could have are:

  • Keep the existing code and compilation flags but explicitly disable them for problematic kernels (using something like #pragma GCC push_options and #pragma GCC pop_options, though I'm not sure if there's a way to do this on MSVC) -- see the sketch after this list.
  • Change the code to use fewer templates and more raw for loops. If we're feeling really adventurous, we could write a Python or Jinja script that generates the kernels as the simplest possible for loop (I know this is the approach used in a lot of databases). I have never seen a problem with this style of code even on -O3.

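A minimal sketch of the first option above; the kernel below is hypothetical, and the pragmas are GCC-specific (Clang and MSVC have their own, different mechanisms):

#include <cstdint>

// Dial back optimization for one problematic kernel while the rest of the
// translation unit keeps the project-wide flags (e.g. -O3).
#pragma GCC push_options
#pragma GCC optimize("O2")
void PopCountBitmap(const uint8_t* bitmap, int64_t length, int64_t* out_count) {
  int64_t count = 0;
  for (int64_t i = 0; i < length; ++i) {
    count += (bitmap[i / 8] >> (i % 8)) & 1;
  }
  *out_count = count;
}
#pragma GCC pop_options
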
wesm (Member Author) commented Jul 21, 2022

@save-buffer thanks for your comments

> Change the code to use fewer templates and more raw for loops. If we're feeling really adventurous, we could write a Python or Jinja script that generates the kernels as the simplest possible for loop (I know this is the approach used in a lot of databases). I have never seen a problem with this style of code even on -O3.

I also agree with this -- I know that some feel that manually generating code when you could have templates "do it for you" is an antipattern, but it seems that the code in compute/kernels/codegen_internal.h has gone a little too far in introducing abstractions where we put too much blind faith in the compiler (e.g. the "OutputAdapter").

Not a priority by any means among our myriad priorities but perhaps something for us to occasionally hack at in our idle moments (I did #13654 when I was bored on an airplane)

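As a toy illustration of the trade-off discussed above (this is not Arrow's actual codegen_internal.h code, just a sketch of the shape of the abstraction versus a raw loop):

#include <cstdint>

// Template-parameterized "output writer" style: each kernel is stamped out
// once per writer type, and compact code depends on the compiler fully
// collapsing the abstraction.
template <typename OutputWriter>
void AddKernelGeneric(const int32_t* a, const int32_t* b, int64_t n, OutputWriter write) {
  for (int64_t i = 0; i < n; ++i) {
    write(i, a[i] + b[i]);
  }
}

// The "raw for loop" alternative: one concrete function with predictable
// code generation at any optimization level.
void AddKernelPlain(const int32_t* a, const int32_t* b, int32_t* out, int64_t n) {
  for (int64_t i = 0; i < n; ++i) {
    out[i] = a[i] + b[i];
  }
}
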
wesm (Member Author) commented Jul 21, 2022

In https://conbench.ursa.dev/compare/runs/e938638743e84794ad829524fae04fbd...20727b1b390e4b30be10f49db7f06f3f/ it seems that there are several hundred microbenchmarks with > 10% performance regressions, but also over 100 microbenchmarks with > 10% performance improvements. I'd say it's a coin toss whether to move to -O2 (with -ftree-vectorize) versus -O3.

westonpace (Member) commented:

I would very much like to run the TPC-H benchmarks on this change. They are failing in conbench at the moment. There is a fix for these benchmarks in a PR right now (#13679), so maybe we can run them after that lands. That will at least give us some sense of the impact at a macro level.

wesm (Member Author) commented Jul 22, 2022

Makes sense. Let’s get more data and make a decision after 9.0.0 goes out.

pitrou (Member) commented Aug 9, 2022

@westonpace Are you planning to get/report TPC-H benchmark numbers for this?

westonpace (Member) commented:

@ursabot please benchmark lang=R

ursabot commented Aug 9, 2022

Benchmark runs are scheduled for baseline = 8a2acaa and contender = 47fcf77. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Only ['Python'] langs are supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.0% ⬆️0.0% ⚠️ Contender and baseline run contexts do not match] test-mac-arm
[Failed ⬇️0.0% ⬆️0.0% ⚠️ Contender and baseline run contexts do not match] ursa-i9-9960x
[Skipped ⚠️ Only ['C++', 'Java'] langs are supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
[Failed] 47fcf771 test-mac-arm
[Failed] 47fcf771 ursa-i9-9960x
[Failed] 8a2acaa4 ec2-t3-xlarge-us-east-2
[Failed] 8a2acaa4 test-mac-arm
[Failed] 8a2acaa4 ursa-i9-9960x
[Finished] 8a2acaa4 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

westonpace (Member) commented:

Ah, I think a rebase is needed to get these passing. I'll do that real quick.

westonpace (Member) commented:

@pitrou I got the results here: https://conbench.ursa.dev/compare/runs/b724609840e242afbf4e1e26682afbe3...b742cce58407420db4da8e461604a1db/

There were no significant changes (one of the queries was 15% faster and everything else was within +/- 5%) so I think I'm +1 for this change.

pitrou changed the title from "[C++][DONOTMERGE] Use -O2 instead of -O3 for RELEASE builds" to "ARROW-17436: [C++] Use -O2 instead of -O3 for RELEASE builds" on Aug 16, 2022
pitrou requested a review from cyb70289 on August 16, 2022 17:11
pitrou (Member) commented Aug 16, 2022

@github-actions crossbow submit -g cpp -g python -g r

github-actions commented:

Revision: de53440

Submitted crossbow builds: ursacomputing/crossbow @ actions-0c44f1532a

Task Status
conda-linux-gcc-py37-cpu-r40 Azure
conda-linux-gcc-py37-cpu-r41 Azure
conda-osx-clang-py37-r40 Azure
conda-osx-clang-py37-r41 Azure
conda-win-vs2017-py37-r40 Azure
conda-win-vs2017-py37-r41 Azure
homebrew-r-autobrew Github Actions
homebrew-r-brew Github Actions
r-binary-packages Github Actions
test-alpine-linux-cpp Github Actions
test-build-cpp-fuzz Github Actions
test-conda-cpp Github Actions
test-conda-cpp-valgrind Azure
test-conda-python-3.10 Github Actions
test-conda-python-3.7 Github Actions
test-conda-python-3.7-hdfs-2.9.2 Github Actions
test-conda-python-3.7-hdfs-3.2.1 Github Actions
test-conda-python-3.7-kartothek-latest Github Actions
test-conda-python-3.7-kartothek-master Github Actions
test-conda-python-3.7-pandas-0.24 Github Actions
test-conda-python-3.7-pandas-latest Github Actions
test-conda-python-3.7-spark-v3.1.2 Github Actions
test-conda-python-3.8 Github Actions
test-conda-python-3.8-hypothesis Github Actions
test-conda-python-3.8-pandas-latest Github Actions
test-conda-python-3.8-pandas-nightly Github Actions
test-conda-python-3.8-spark-v3.2.0 Github Actions
test-conda-python-3.9 Github Actions
test-conda-python-3.9-dask-latest Github Actions
test-conda-python-3.9-dask-master Github Actions
test-conda-python-3.9-pandas-master Github Actions
test-conda-python-3.9-spark-master Github Actions
test-debian-10-cpp-amd64 Github Actions
test-debian-10-cpp-i386 Github Actions
test-debian-11-cpp-amd64 Github Actions
test-debian-11-cpp-i386 Github Actions
test-debian-11-python-3 Azure
test-fedora-35-cpp Github Actions
test-fedora-35-python-3 Azure
test-fedora-r-clang-sanitizer Azure
test-r-arrow-backwards-compatibility Github Actions
test-r-depsource-bundled Azure
test-r-depsource-system Github Actions
test-r-dev-duckdb Github Actions
test-r-devdocs Github Actions
test-r-gcc-11 Github Actions
test-r-gcc-12 Github Actions
test-r-install-local Github Actions
test-r-linux-as-cran Github Actions
test-r-linux-rchk Github Actions
test-r-linux-valgrind Azure
test-r-minimal-build Azure
test-r-offline-maximal Github Actions
test-r-offline-minimal Azure
test-r-rhub-debian-gcc-devel-lto-latest Azure
test-r-rhub-debian-gcc-release-custom-ccache Azure
test-r-rhub-ubuntu-gcc-release-latest Azure
test-r-rocker-r-base-latest Azure
test-r-rstudio-r-base-4.1-opensuse153 Azure
test-r-rstudio-r-base-4.2-centos7-devtoolset-8 Azure
test-r-rstudio-r-base-4.2-focal Azure
test-r-ubuntu-22.04 Github Actions
test-r-versions Github Actions
test-ubuntu-18.04-cpp Github Actions
test-ubuntu-18.04-cpp-release Github Actions
test-ubuntu-18.04-cpp-static Github Actions
test-ubuntu-18.04-r-sanitizer Azure
test-ubuntu-20.04-cpp Github Actions
test-ubuntu-20.04-cpp-14 Github Actions
test-ubuntu-20.04-cpp-17 Github Actions
test-ubuntu-20.04-cpp-bundled Github Actions
test-ubuntu-20.04-cpp-thread-sanitizer Github Actions
test-ubuntu-20.04-python-3 Azure
test-ubuntu-22.04-cpp Github Actions

cyb70289 (Contributor) left a review comment:

LGTM

pitrou merged commit 9d1bbaf into apache:master on Aug 17, 2022
ursabot commented Aug 17, 2022

Benchmark runs are scheduled for baseline = 682c63a and contender = 9d1bbaf. 9d1bbaf is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0% ⚠️ Contender and baseline run contexts do not match] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.0% ⬆️0.0% ⚠️ Contender and baseline run contexts do not match] test-mac-arm
[Failed ⬇️0.0% ⬆️0.0% ⚠️ Contender and baseline run contexts do not match] ursa-i9-9960x
[Finished ⬇️0.0% ⬆️0.0% ⚠️ Contender and baseline run contexts do not match] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 9d1bbaff ec2-t3-xlarge-us-east-2
[Finished] 9d1bbaff test-mac-arm
[Failed] 9d1bbaff ursa-i9-9960x
[Finished] 9d1bbaff ursa-thinkcentre-m75q
[Finished] 682c63a3 ec2-t3-xlarge-us-east-2
[Failed] 682c63a3 test-mac-arm
[Failed] 682c63a3 ursa-i9-9960x
[Finished] 682c63a3 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

kou (Member) commented Aug 22, 2022

I don't know why, but it seems that the "AMD64 Windows R 3.6 RTools 35" CI job has been failing since this change was merged into master:

https://github.com/apache/arrow/runs/7866259947?check_suite_focus=true#step:11:699

Error: Error: package or namespace load failed for 'arrow' in inDL(x, as.logical(local), as.logical(now), ...):
 unable to load shared object 'D:/a/arrow/arrow/r/check/arrow.Rcheck/00LOCK-arrow/00new/arrow/libs/i386/arrow.dll':
  LoadLibrary failure:  A dynamic link library (DLL) initialization routine failed.

wesm (Member Author) commented Aug 22, 2022

That’s really strange. Where is the log for the job that builds the artifacts that it depends on? Is the tarball it is downloading stale, by chance?

pitrou (Member) commented Aug 22, 2022

I don't know. Perhaps @paleolimbot wants to take a look.

But regardless, we're now having a discussion on the ML about dropping RTools 3.5, so I'm not sure that matters much.

paleolimbot (Member) commented:

As you noted, we're discussing dropping support for the failing platform. I don't currently have a development environment for RTools 3.5; while I could set one up, I'm not keen to spend a bunch of time doing that if we're about to drop support. I'll open a discussion with the R developers about how we'll solve the issue.

zagto pushed a commit to zagto/arrow that referenced this pull request on Oct 7, 2022:

ARROW-17436: [C++] Use -O2 instead of -O3 for RELEASE builds (apache#13661)

Motivated by investigation in apache#13654. To be discussed

Lead-authored-by: Antoine Pitrou <antoine@python.org>
Co-authored-by: Wes McKinney <wesm@apache.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>