Skip to content

ARROW-7785: [C++] Improve compilation performance of sparse tensor related code#6533

Closed
mrkn wants to merge 7 commits intoapache:masterfrom
mrkn:ARROW-7785
Closed

ARROW-7785: [C++] Improve compilation performance of sparse tensor related code#6533
mrkn wants to merge 7 commits intoapache:masterfrom
mrkn:ARROW-7785

Conversation

@mrkn
Copy link
Member

@mrkn mrkn commented Mar 4, 2020

Extract SparseTensorConverter related things from sparse_tensor.cc to tensor/converter.cc and break into multiple compilation units which can be compiled in parallel. This should reduce the code wall clock time of compiling sparse_tensor.cc.

@mrkn mrkn requested a review from pitrou March 4, 2020 09:05
@mrkn
Copy link
Member Author

mrkn commented Mar 4, 2020

This is the result of ClangBuildAnalyzer.

Before:

**** Files that took longest to codegen (compiler backend):
  6516 ms: ../../apache/arrow/cpp/build.clang-9.debug/src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/take.cc.o
  4033 ms: ../../apache/arrow/cpp/build.clang-9.debug/src/arrow/CMakeFiles/arrow_objlib.dir/sparse_tensor.cc.o
  3943 ms: ../../apache/arrow/cpp/build.clang-9.debug/src/arrow/CMakeFiles/arrow_objlib.dir/array/diff.cc.o
  2124 ms: ../../apache/arrow/cpp/build.clang-9.debug/src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/filter.cc.o
  2054 ms: ../../apache/arrow/cpp/build.clang-9.debug/src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/compare.cc.o
  1895 ms: ../../apache/arrow/cpp/build.clang-9.debug/src/arrow/CMakeFiles/arrow_objlib.dir/scalar.cc.o
  1586 ms: ../../apache/arrow/cpp/build.clang-9.debug/src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/cast.cc.o
  1268 ms: ../../apache/arrow/cpp/build.clang-9.debug/src/arrow/CMakeFiles/arrow_objlib.dir/ipc/json_simple.cc.o
  1060 ms: ../../apache/arrow/cpp/build.clang-9.debug/src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/hash.cc.o
   909 ms: ../../apache/arrow/cpp/build.clang-9.debug/src/arrow/CMakeFiles/arrow_objlib.dir/array.cc.o

After:

**** Files that took longest to codegen (compiler backend):
  6576 ms: ../../apache/arrow/cpp/build.clang-9.debug/src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/take.cc.o
  4134 ms: ../../apache/arrow/cpp/build.clang-9.debug/src/arrow/CMakeFiles/arrow_objlib.dir/array/diff.cc.o
  3015 ms: ../../apache/arrow/cpp/build.clang-9.debug/src/arrow/CMakeFiles/arrow_objlib.dir/sparse_tensor/converter.cc.o
  2115 ms: ../../apache/arrow/cpp/build.clang-9.debug/src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/filter.cc.o
  2017 ms: ../../apache/arrow/cpp/build.clang-9.debug/src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/compare.cc.o
  1598 ms: ../../apache/arrow/cpp/build.clang-9.debug/src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/cast.cc.o
  1418 ms: ../../apache/arrow/cpp/build.clang-9.debug/src/arrow/CMakeFiles/arrow_objlib.dir/type.cc.o
  1417 ms: ../../apache/arrow/cpp/build.clang-9.debug/src/arrow/CMakeFiles/arrow_objlib.dir/scalar.cc.o
  1280 ms: ../../apache/arrow/cpp/build.clang-9.debug/src/arrow/CMakeFiles/arrow_objlib.dir/sparse_tensor.cc.o
  1266 ms: ../../apache/arrow/cpp/build.clang-9.debug/src/arrow/CMakeFiles/arrow_objlib.dir/ipc/json_simple.cc.o

@github-actions
Copy link

github-actions bot commented Mar 4, 2020

@pitrou
Copy link
Member

pitrou commented Mar 4, 2020

Would you have the timings in release mode?

@mrkn
Copy link
Member Author

mrkn commented Mar 4, 2020

This is the result on release mode. I need additional work to reduce the codegen time of sparse_tensor/converter.cc.

Before:

**** Files that took longest to codegen (compiler backend):
 75997 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/sparse_tensor.cc.o
 45778 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/take.cc.o
 17773 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/array/diff.cc.o
 13087 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/filter.cc.o
 10313 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/compare.cc.o
  9937 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/cast.cc.o
  8628 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/hash.cc.o
  6480 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/scalar.cc.o
  6371 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/ipc/json_internal.cc.o
  6157 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/builder.cc.o

After:

**** Files that took longest to codegen (compiler backend):
 62183 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/sparse_tensor/converter.cc.o
 46198 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/take.cc.o
 17761 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/array/diff.cc.o
 13090 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/filter.cc.o
 11373 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/sparse_tensor.cc.o
 10288 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/compare.cc.o
  9948 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/cast.cc.o
  8703 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/hash.cc.o
  6400 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/scalar.cc.o
  6305 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/ipc/json_internal.cc.o

@mrkn mrkn changed the title ARROW-7785: [C++] sparse_tensor.cc is extremely slow to compile ARROW-7785: WIP: [C++] sparse_tensor.cc is extremely slow to compile Mar 4, 2020
@pitrou
Copy link
Member

pitrou commented Mar 4, 2020

Yes, the main problem is the codegen time in this case. I would hope it's possible to make some portions non-templated and only keep the hot conversion loops as template functions.

@mrkn
Copy link
Member Author

mrkn commented Mar 5, 2020

Finally, I got the following result.

**** Files that took longest to codegen (compiler backend):
 45358 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/take.cc.o
 26609 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/sparse_tensor/csf_converter.cc.o
 18784 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/array/diff.cc.o
 12986 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/filter.cc.o
 12112 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/sparse_tensor/coo_converter.cc.o
 11550 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/sparse_tensor.cc.o
 10641 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/sparse_tensor/csr_converter.cc.o
 10521 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/sparse_tensor/csc_converter.cc.o
 10315 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/compare.cc.o
  9921 ms: ../../apache/arrow/cpp/build.clang-9.release/src/arrow/CMakeFiles/arrow_objlib.dir/compute/kernels/cast.cc.o

@mrkn mrkn changed the title ARROW-7785: WIP: [C++] sparse_tensor.cc is extremely slow to compile ARROW-7785: [C++] sparse_tensor.cc is extremely slow to compile Mar 5, 2020
@mrkn
Copy link
Member Author

mrkn commented Mar 6, 2020

@pitrou Could you review this again?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you run IWYU on this file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a commit to fix for IWYU, but the following reports are remaining. I guess these reports are false positive.

include-what-you-use 0.13 (git:06e88ef) based on clang version 9.0.0-2~ubuntu18.04.2 (tags/RELEASE_900/final)
/home/mrkn/src/github.com/apache/arrow/cpp/src/arrow/tensor/csf_converter.cc should add these lines:
namespace arrow { template <typename T, typename Enable = void> class TypedBufferBuilder; }

/home/mrkn/src/github.com/apache/arrow/cpp/src/arrow/tensor/csf_converter.cc should remove these lines:
- #include "arrow/buffer_builder.h"  // lines 27-27

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems so

@mrkn
Copy link
Member Author

mrkn commented Mar 6, 2020

@wesm Thank you for reviewing this. I've added commits to fix things you pointed out.

Copy link
Member

@wesm wesm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 I fixed the linting issues and did some minor cleaning of the includes. Thanks for addressing my comments!

@wesm wesm changed the title ARROW-7785: [C++] sparse_tensor.cc is extremely slow to compile ARROW-7785: [C++] Improve compilation performance of sparse tensor related code Mar 6, 2020
@wesm
Copy link
Member

wesm commented Mar 6, 2020

Here's an appveyor build on my fork: https://ci.appveyor.com/project/wesm/arrow/builds/31299374

@wesm wesm closed this in f6a41a4 Mar 7, 2020
@mrkn mrkn deleted the ARROW-7785 branch April 8, 2020 04:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

Comments