Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C++][CMake] gRPCAlt takes 15 minutes on linux #33882

Closed
h-vetinari opened this issue Jan 26, 2023 · 9 comments · Fixed by #34019
Closed

[C++][CMake] gRPCAlt takes 15 minutes on linux #33882

h-vetinari opened this issue Jan 26, 2023 · 9 comments · Fixed by #34019

Comments

@h-vetinari
Copy link
Contributor

Describe the bug, including details regarding any error messages, version, and platform.

In conda-forge, running the code related to (or around) gRPCAlt is near-instantaneous on osx & windows,

2023-01-26T06:03:55.9218860Z -- Providing CMake module for gRPCAlt as part of Arrow CMake package
2023-01-26T06:03:55.9220610Z -- pkg-config package for grpc++ for static link isn't found

while on linux it takes 15 minutes

2023-01-26T06:02:28.6762155Z -- Providing CMake module for gRPCAlt as part of Arrow CMake package
2023-01-26T06:17:03.7331385Z -- Using pkg-config package for grpc++ for static link

This flew under the radar a bit because the linux builds are generally the fastest of the bunch, but it still looks like a massive waste of time (almost a third of the build time).

It happens on all versions 7-10; I didn't test earlier versions, and haven't tested 11 yet.

Component(s)

Continuous Integration

@kou
Copy link
Member

kou commented Jan 27, 2023

Hmm. It's strange. It's happen in https://github.com/apache/arrow/blob/master/cpp/cmake_modules/ThirdpartyToolchain.cmake#L288-L305 .

Can we add debug logs (message(...)) in the block or enable something general CMake's debug log to debug this?

@h-vetinari
Copy link
Contributor Author

h-vetinari commented Jan 27, 2023

Hmm. It's strange. It's happen in https://github.com/apache/arrow/blob/master/cpp/cmake_modules/ThirdpartyToolchain.cmake#L288-L305 .

Can we add debug logs (message(...)) in the block or enable something general CMake's debug log to debug this?

That code is really hard to understand (for me). I'm happy to try with a patch, but not sure what/where.

Just now I got worried because on linux, pkg-config claims to find a static grpc++ somehwere, though the build coming in through conda-forge definitely does not contain any static libs. However, it looks like that's just the log-message being unaware of which flavour of dynamic/static is actually being found by pkg-config.

Can we just disable the whole pkg-config search completely? Our builds have CMake metadata, and find_package(${PACKAGE_NAME} REQUIRED) should work in the vast majority of cases (even moreso with CONFIG mode), and especially for gprc.

@kou
Copy link
Member

kou commented Jan 28, 2023

Could you try this?

diff --git a/cpp/cmake_modules/ThirdpartyToolchain.cmake b/cpp/cmake_modules/ThirdpartyToolchain.cmake
index 4310d6018b..5a88900fe9 100644
--- a/cpp/cmake_modules/ThirdpartyToolchain.cmake
+++ b/cpp/cmake_modules/ThirdpartyToolchain.cmake
@@ -286,14 +286,17 @@ macro(resolve_dependency DEPENDENCY_NAME)
   endif()
   if(${DEPENDENCY_NAME}_SOURCE STREQUAL "SYSTEM" AND ARG_IS_RUNTIME_DEPENDENCY)
     provide_find_module(${PACKAGE_NAME} "Arrow")
+    message(STATUS "Before find_package(PkgConfig)")
     list(APPEND ARROW_SYSTEM_DEPENDENCIES ${PACKAGE_NAME})
     find_package(PkgConfig QUIET)
+    message(STATUS "After find_package(PkgConfig)")
     foreach(ARG_PC_PACKAGE_NAME ${ARG_PC_PACKAGE_NAMES})
       pkg_check_modules(${ARG_PC_PACKAGE_NAME}_PC
                         ${ARG_PC_PACKAGE_NAME}
                         NO_CMAKE_PATH
                         NO_CMAKE_ENVIRONMENT_PATH
                         QUIET)
+      message(STATUS "After pkg_check_modules(${ARG_PC_PACKAGE_NAME})")
       if(${${ARG_PC_PACKAGE_NAME}_PC_FOUND})
         message(STATUS "Using pkg-config package for ${ARG_PC_PACKAGE_NAME} for static link"
         )

Can we just disable the whole pkg-config search completely? Our builds have CMake metadata, and find_package(${PACKAGE_NAME} REQUIRED) should work in the vast majority of cases (even moreso with CONFIG mode), and especially for gprc.

We can add an option to disable pkg-config support. But if we disable pkg-config support, conda users can't find Apache Arrow C++ by pkg-config. I'm not familiar with conda but I think that there are users who use pkg-config to find external packages.

@h-vetinari
Copy link
Contributor Author

Could you try this?

Done in conda-forge/arrow-cpp-feedstock#941; I'll get back with timing infos as soon as the CI completes.

We can add an option to disable pkg-config support. But if we disable pkg-config support, conda users can't find Apache Arrow C++ by pkg-config. I'm not familiar with conda but I think that there are users who use pkg-config to find external packages.

AFAIU there's a difference between looking for pkgconfig metadata of other packages (not necessary for us) and producing it for arrow's consumers (that would still be possible). It might be all bundled up in the pkg_check_modules function, but conceptually those things are very different (and need quite different CMake code).

@h-vetinari
Copy link
Contributor Author

Here are the timings including the debug statements from the faster agent:

2023-01-28T06:58:13.4947939Z -- Found utf8proc: /home/conda/feedstock_root/build_artifacts/apache-arrow_1674888786737/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehol/lib/libutf8proc.so (found suitable version "2.8.0", minimum required is "2.2.0") 
2023-01-28T06:58:13.4949077Z -- Providing CMake module for utf8proc as part of Arrow CMake package
2023-01-28T06:58:13.4949545Z -- Before find_package(PkgConfig)
2023-01-28T06:58:13.4995903Z -- After find_package(PkgConfig)
2023-01-28T06:58:13.5629462Z -- After pkg_check_modules(libutf8proc)
2023-01-28T06:58:13.5630196Z -- Using pkg-config package for libutf8proc for static link
2023-01-28T06:58:13.5684245Z -- Found c-ares: /home/conda/feedstock_root/build_artifacts/apache-arrow_1674888786737/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehol/lib/cmake/c-ares/c-ares-config.cmake (found version "1.18.1") 
2023-01-28T06:58:13.5837826Z -- Providing CMake module for gRPCAlt as part of Arrow CMake package
2023-01-28T06:58:13.5838361Z -- Before find_package(PkgConfig)
2023-01-28T06:58:13.5885242Z -- After find_package(PkgConfig)
2023-01-28T07:13:49.8029312Z -- After pkg_check_modules(grpc++)
2023-01-28T07:13:49.8030662Z -- Using pkg-config package for grpc++ for static link
2023-01-28T07:13:49.8059077Z -- Found nlohmann_json: /home/conda/feedstock_root/build_artifacts/apache-arrow_1674888786737

@kou
Copy link
Member

kou commented Feb 3, 2023

Thanks. I noticed that pkg-config grpc++ is slow because it has many absl_* dependencies.

OK. We can disable pkg-config search only with -DARROW_BUILD_STATIC=OFF. Does conda package specify it?

@h-vetinari
Copy link
Contributor Author

OK. We can disable pkg-config search only with -DARROW_BUILD_STATIC=OFF. Does conda package specify it?

Yes, it does (and has for a long time, so all the recent runs I looked at would have had it).

kou added a commit to kou/arrow that referenced this issue Feb 3, 2023
@kou kou changed the title [CMake] gRPCAlt takes 15 minutes on linux [C++][CMake] gRPCAlt takes 15 minutes on linux Feb 3, 2023
@h-vetinari
Copy link
Contributor Author

So the results from #34019 look great, now takes <1 sec rather than 15min - thanks a lot! :)

2023-02-04T03:42:59.9003207Z -- Providing CMake module for utf8proc as part of Arrow CMake package
2023-02-04T03:42:59.9065300Z -- Found c-ares: [...]/lib/cmake/c-ares/c-ares-config.cmake (found version "1.18.1") 
2023-02-04T03:42:59.9213892Z -- Providing CMake module for gRPCAlt as part of Arrow CMake package
2023-02-04T03:42:59.9237717Z -- Found nlohmann_json: [...]/share/cmake/nlohmann_json/nlohmann_jsonConfig.cmake (found version "3.11.2") 

PS: The hunk for the changes around BZip2 did not apply, so I dropped it while backporting the patch to v11.0.0

@kou kou closed this as completed in #34019 Feb 4, 2023
kou added a commit that referenced this issue Feb 4, 2023
)

### Rationale for this change

Because they are needless and `pkg-config grpc++` is slow.

### What changes are included in this PR?

Don't find .pc files with `ARROW_BUILD_STATIC=OFF`.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* Closes: #33882

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
@kou kou added this to the 12.0.0 milestone Feb 4, 2023
@kou
Copy link
Member

kou commented Feb 4, 2023

Thanks. I've merged it.

sjperkins pushed a commit to sjperkins/arrow that referenced this issue Feb 10, 2023
apache#34019)

### Rationale for this change

Because they are needless and `pkg-config grpc++` is slow.

### What changes are included in this PR?

Don't find .pc files with `ARROW_BUILD_STATIC=OFF`.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* Closes: apache#33882

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
gringasalpastor pushed a commit to gringasalpastor/arrow that referenced this issue Feb 17, 2023
apache#34019)

### Rationale for this change

Because they are needless and `pkg-config grpc++` is slow.

### What changes are included in this PR?

Don't find .pc files with `ARROW_BUILD_STATIC=OFF`.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* Closes: apache#33882

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
fatemehp pushed a commit to fatemehp/arrow that referenced this issue Feb 24, 2023
apache#34019)

### Rationale for this change

Because they are needless and `pkg-config grpc++` is slow.

### What changes are included in this PR?

Don't find .pc files with `ARROW_BUILD_STATIC=OFF`.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* Closes: apache#33882

Authored-by: Sutou Kouhei <kou@clear-code.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
h-vetinari added a commit to regro-cf-autotick-bot/arrow-cpp-feedstock that referenced this issue Jul 10, 2023
At the time, I only backported the fix for apache/arrow#33882
to 11.0.x (because it didn't apply cleanly to 10.0.x or earlier), but now that the time
spent for that has ballooned to almost an hour, fix the conflicts, to avoid wasting
massive amounts of CI time on this branch.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants