Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-4834: [R] Feature flag when building parquet #3870

Closed

Conversation

@javierluraschi
Copy link
Contributor

javierluraschi commented Mar 12, 2019

Support ARROW_R_PARQUET_OFF build flag to disable Parquet for R builds.

Fix for: https://issues.apache.org/jira/browse/ARROW-4834

CC: @jeroen

@jeroen

This comment has been minimized.

Copy link
Contributor

jeroen commented Mar 12, 2019

I think libarrow should define macros in it's headers to specify which features are enabled. This way the bindings can conditionally opt in.

@javierluraschi

This comment has been minimized.

Copy link
Contributor Author

javierluraschi commented Mar 12, 2019

That seems reasonable. However, changing the arrow headers would probably affect all languages, I much rather leave this change scoped to R for the time being.

Let me know if this change does not work for you r-winlib, but I understand we can set this in makevars.win by adding to PKG_CPPFLAGS -DARROW_R_PARQUET_OFF

Arrow folks can comment further but I'm hoping this can get merged and help us move things forward. If arrow folks need a change in the PR, let me know how to help.

@wesm

This comment has been minimized.

Copy link
Member

wesm commented Mar 12, 2019

So far we have only developed header-level feature flags for features with very small scopes, like optional compression libraries such as lz4 or zstd. This is a different situation where there's a whole set of Parquet-related headers and a separate libparquet.so library

@jeroen

This comment has been minimized.

Copy link
Contributor

jeroen commented Mar 12, 2019

Ah okay. If parquet is a different library we should use pkg-config --exists parquet in the configure script of the R package in order to define ARROW_R_PARQUET_OFF.

Copy link
Contributor

romainfrancois left a comment

I don't mind this if this makes @javierluraschi life easier for now. I'll start working on parquet integration pretty soon, so it's easier to deal with this now, than in a few weeks.

r/src/parquet.cpp Outdated Show resolved Hide resolved
@jeroen

This comment has been minimized.

Copy link
Contributor

jeroen commented Mar 12, 2019

The configure script should check for pkg-config --exists parquet and if that fails it should write PKG_CPPFLAGS="-DARROW_R_PARQUET_OFF" to ./src/Makevars

@xhochy
xhochy approved these changes Mar 12, 2019
Copy link
Member

xhochy left a comment

+1, LGTM

@xhochy

This comment has been minimized.

Copy link
Member

xhochy commented Mar 12, 2019

I think libarrow should define macros in it's headers to specify which features are enabled. This way the bindings can conditionally opt in.

Created:

@wesm

This comment has been minimized.

Copy link
Member

wesm commented Mar 12, 2019

Hm, I think we should change the flag to use the "$COMPONENT_WITH_$THING" semantics like

https://github.com/apache/arrow/blob/master/cpp/src/arrow/util/compression.cc#L22

so

-DARROW_R_WITH_PARQUET

Opting in makes a bit more sense to be than having to opt out

@javierluraschi

This comment has been minimized.

Copy link
Contributor Author

javierluraschi commented Mar 12, 2019

Updated.

@kou

This comment has been minimized.

Copy link
Member

kou commented Mar 12, 2019

The configure script should check for pkg-config --exists parquet and if that fails it should write PKG_CPPFLAGS="-DARROW_R_PARQUET_OFF" to ./src/Makevars

I think this is a good approach too. For example:

diff --git a/r/configure b/r/configure
index 19f4d2c6..290e2f2b 100755
--- a/r/configure
+++ b/r/configure
@@ -32,13 +32,24 @@ PKG_RPM_NAME="arrow"
 PKG_CSW_NAME="arrow"
 PKG_BREW_NAME="apache-arrow"
 PKG_TEST_HEADER="<arrow/api.h>"
-PKG_LIBS="-larrow -lparquet"
+PKG_LIBS=""
 
 # Use pkg-config if available
 pkg-config --version >/dev/null 2>&1
 if [ $? -eq 0 ]; then
-  PKGCONFIG_CFLAGS=`pkg-config --cflags --silence-errors ${PKG_CONFIG_NAME}`
-  PKGCONFIG_LIBS=`pkg-config --libs ${PKG_CONFIG_NAME}`
+  PKGCONFIG_CFLAGS=$(pkg-config --cflags arrow)
+  if [ $? -ne 0 ]; then
+    echo "TODO: no arrow message"
+    exit 1
+  fi
+  PKGCONFIG_LIBS=$(pkg-config --libs arrow)
+  PKGCONFIG_CFLAGS_PARQUET=$(pkg-config --cflags parquet)
+  if [ $? -eq 0 ]; then
+    PKGCONFIG_CFLAGS="${PKGCONFIG_CFLAGS} ${PKGCONFIG_CFLAGS_PARQUET} -DARROW_R_WITH_PARQUET"
+    PKGCONFIG_LIBS="${PKGCONFIG_LIBS} $(pkg-config --libs parquet)"
+  fi
+else
+  PKG_LIBS="-larrow -lparquet"
 fi
 
 # Note that cflags may be empty in case of success
@javierluraschi

This comment has been minimized.

Copy link
Contributor Author

javierluraschi commented Mar 12, 2019

Makes sense, configure can be updated as a follow up PR. There is also additional work to remove read_parquet() appropriately in the R build, etc. I'm hoping this PR can be scoped to unblocking the build. Thanks!

Is there anything missing here to merge?

@wesm
wesm approved these changes Mar 13, 2019
Copy link
Member

wesm left a comment

This is definitely feels like a hack and isn't going to "scale" well as work on more fully-fledged Parquet support for R commences. Is there any way to conditionally include parquet.cpp in the Rcpp compilation? I'm merging this for now

@wesm wesm closed this in 95d62ca Mar 13, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.