GH-35636: [C++] Extract two expensive test suites from compute-vector-test #36401

Merged: 3 commits merged on Jul 3, 2023

@felipecrv
Contributor

felipecrv commented Jun 29, 2023

Rationale for this change

arrow-compute-vector-test is too big, so it takes a long time to run.

What changes are included in this PR?

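Extracts two expensive test suites from arrow-compute-vector-test into their own test executables: arrow-compute-vector-sort-test and arrow-compute-vector-selection-test.

Timings on my machine (Debug builds with ASAN):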

debug/arrow-compute-vector-test > /dev/null  11.54s user 0.47s system 99% cpu 12.023 total
debug/arrow-compute-vector-sort-test > /dev/null  13.30s user 0.26s system 99% cpu 13.579 total
debug/arrow-compute-vector-selection-test > /dev/null  6.97s user 0.22s system 99% cpu 7.207 total

Are these changes tested?

Yes.

Are there any user-facing changes?

No.

@pitrou
Member

pitrou commented Jun 30, 2023

@github-actions crossbow submit -g cpp

add_arrow_compute_test(vector_selection_test
                       SOURCES
                       vector_selection_test.cc
                       test_util.cc)
Member

At some point we might want to create an object library for test_util.cc instead of recompiling it for each target. That can be left for a subsequent PR.

Contributor Author

I was tempted to do it, but I was afraid it would make things worse because test_util.cc is so small.

Member

That's possible. We can revisit later :-)
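
For reference, the object-library idea raised in this thread could look roughly like the sketch below. It uses plain CMake commands rather than Arrow's `add_arrow_compute_test` helper, and the target name `compute_test_util` is hypothetical. How this would be wired into Arrow's own macros is not shown here and would depend on their options.

```cmake
# Hypothetical sketch, not Arrow's actual build code.
cmake_minimum_required(VERSION 3.16)
project(object_library_sketch CXX)

# Compile test_util.cc once into an object library.
# (compute_test_util is an illustrative name.)
add_library(compute_test_util OBJECT test_util.cc)

# Each test executable reuses the already-compiled objects
# instead of listing test_util.cc among its own sources.
add_executable(vector_sort_test vector_sort_test.cc
                                $<TARGET_OBJECTS:compute_test_util>)
add_executable(vector_selection_test vector_selection_test.cc
                                     $<TARGET_OBJECTS:compute_test_util>)
```

The object library has to be built with the same include directories and compile flags as the tests that consume it. As noted above, test_util.cc is small, so the saving in compile time may be negligible.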

@github-actions

Revision: 021fb90

Submitted crossbow builds: ursacomputing/crossbow @ actions-ccd2ce168e

Task Status
test-alpine-linux-cpp Github Actions
test-build-cpp-fuzz Github Actions
test-conda-cpp Github Actions
test-conda-cpp-valgrind Azure
test-cuda-cpp Github Actions
test-debian-11-cpp-amd64 Github Actions
test-debian-11-cpp-i386 Github Actions
test-fedora-35-cpp Github Actions
test-ubuntu-20.04-cpp Github Actions
test-ubuntu-20.04-cpp-20 Github Actions
test-ubuntu-20.04-cpp-bundled Github Actions
test-ubuntu-20.04-cpp-minimal-with-formats Github Actions
test-ubuntu-20.04-cpp-thread-sanitizer Github Actions
test-ubuntu-22.04-cpp Github Actions

@github-actions github-actions bot added awaiting committer review Awaiting committer review and removed awaiting review Awaiting review labels Jun 30, 2023
@pitrou
Member

pitrou commented Jun 30, 2023

I think you need to run cmake-format for the lint step to pass... or apply the following patch:

diff --git a/cpp/src/arrow/compute/kernels/CMakeLists.txt b/cpp/src/arrow/compute/kernels/CMakeLists.txt
index a84b36a52..9084f279a 100644
--- a/cpp/src/arrow/compute/kernels/CMakeLists.txt
+++ b/cpp/src/arrow/compute/kernels/CMakeLists.txt
@@ -76,14 +76,9 @@ add_arrow_compute_test(vector_test
                        select_k_test.cc
                        test_util.cc)
 
-add_arrow_compute_test(vector_sort_test
-                       SOURCES
-                       vector_sort_test.cc
-                       test_util.cc)
+add_arrow_compute_test(vector_sort_test SOURCES vector_sort_test.cc test_util.cc)
 
-add_arrow_compute_test(vector_selection_test
-                       SOURCES
-                       vector_selection_test.cc
+add_arrow_compute_test(vector_selection_test SOURCES vector_selection_test.cc
                        test_util.cc)
 
 add_arrow_benchmark(vector_hash_benchmark PREFIX "arrow-compute")

@pitrou
Member

pitrou commented Jun 30, 2023

Nice timings here too:

/build/build-test ~/arrow/dev/cpp
Test project /build/build-test
    Start 26: arrow-compute-vector-sort-test
    Start 25: arrow-compute-vector-test
    Start 27: arrow-compute-vector-selection-test
1/3 Test #27: arrow-compute-vector-selection-test ...   Passed    2.94 sec
2/3 Test #25: arrow-compute-vector-test .............   Passed    4.84 sec
3/3 Test #26: arrow-compute-vector-sort-test ........   Passed    6.15 sec

100% tests passed, 0 tests failed out of 3

Label Time Summary:
arrow_compute    =  13.94 sec*proc (3 tests)
unittest         =  13.94 sec*proc (3 tests)

Total Test time (real) =   6.15 sec
~/arrow/dev/cpp

real	0m6,166s
user	0m13,900s
sys	0m0,092s

@felipecrv
Contributor Author

> I think you need to run cmake-format for the lint step to pass... or apply the following patch:

I ran the CMake linter locally and was so confused: it only says that the checks don't pass.

@felipecrv felipecrv requested a review from pitrou June 30, 2023 14:26
@pitrou
Member

pitrou commented Jul 3, 2023

The CI failures seem unrelated, so I'll merge. Thanks @felipecrv!

@pitrou pitrou merged commit 1bfa241 into apache:main Jul 3, 2023
31 of 34 checks passed
@pitrou pitrou removed the awaiting committer review Awaiting committer review label Jul 3, 2023
@felipecrv felipecrv deleted the split_bench branch July 4, 2023 17:41
@conbench-apache-arrow

Conbench analyzed the 6 benchmark runs on commit 1bfa241b.

There were 9 benchmark results indicating a performance regression.

The full Conbench report has more details.

westonpace pushed a commit to westonpace/arrow that referenced this pull request Jul 7, 2023
…vector-test (apache#36401)

* Closes: apache#35636

Authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>