cmake: add submodule for Apache Arrow at v6.0.1 #44696
jenkins test make check
# only build static library
list(APPEND arrow_CMAKE_ARGS -DARROW_BUILD_SHARED=OFF)
list(APPEND arrow_CMAKE_ARGS -DARROW_BUILD_STATIC=ON)
why the static linkage?
yes, more broadly, what is built statically, with this option? does it leave the arrow and parquet targets as static libs? if so, yeah, I am skeptical.
i just wanted to avoid packaging those shared libraries with radosgw. it seems like that would lead to conflicts if/when distros provide them. is shared linkage worth this risk?
> yes, more broadly, what is built statically, with this option? does it leave the arrow and parquet targets as static libs? if so, yeah, I am skeptical.
the intent was to build arrow and parquet statically, but leave their dependencies as shared. we'll just need to make sure that arrow's cmake finds the same libraries ceph is using, which might be tricky if ceph is building them from submodules
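For illustration, the intent described above could be wired up roughly as follows. This is a minimal sketch, not the PR's actual cmake: the `arrow_ext` target name and the install prefix are hypothetical, `src/arrow/cpp` is assumed from the paths in the build logs, and `ARROW_PARQUET` is Arrow's own switch for building Parquet. Nothing here changes how Arrow locates its dependencies, so they would resolve however Arrow's cmake finds them.

```cmake
include(ExternalProject)

# hypothetical locations; adjust to wherever the submodule is checked out
set(arrow_SOURCE_DIR "${CMAKE_SOURCE_DIR}/src/arrow/cpp")
set(arrow_INSTALL_PREFIX "${CMAKE_BINARY_DIR}/arrow-install")

set(arrow_CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${arrow_INSTALL_PREFIX})
list(APPEND arrow_CMAKE_ARGS -DARROW_PARQUET=ON)

# only build static library
list(APPEND arrow_CMAKE_ARGS -DARROW_BUILD_SHARED=OFF)
list(APPEND arrow_CMAKE_ARGS -DARROW_BUILD_STATIC=ON)

# build the already checked-out submodule; no download step is needed
ExternalProject_Add(arrow_ext
  SOURCE_DIR "${arrow_SOURCE_DIR}"
  CMAKE_ARGS ${arrow_CMAKE_ARGS})
```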
ok, I'd like to optimize later, but do whatever is cleanest for now
force-pushed from 75e48f6 to 1204d05
i bumped arrow to v6.0.1, because 4.0.1 doesn't build on recent compilers. once i add a WITH_SYSTEM_ARROW, that should still be able to link against Kaleb's centos package for arrow-4.0.1
the jenkins builds (ubuntu) are working pretty well now
i'm still working through some shaman issues, one due to our builders setting CMAKE_BUILD_TYPE=None, and another when arrow looks for our system boost
i haven't figured out whether or not libre2 and libutf8proc are runtime dependencies yet. centos doesn't have a utf8proc, so the builds will fail there until we add Kaleb's repo everywhere
it looks like i've made it past the boost issues for ubuntu, but now the arrow submodule is failing to find headers from its own xsimd submodule:
the command line shows the right include dir for it:
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved
force-pushed from bc688e3 to 3a13898
force-pushed from 3a13898 to b917109
it looks like we can't get utf8proc into EPEL for centos, so i submoduled that too. the centos build in shaman succeeded without requiring any extra repos
force-pushed from b917109 to a152a38
i have not been able to reproduce this on an ubuntu focal vm with the exact same cmake options
still trying to debug the ubuntu build. i forked arrow and added a commit in https://github.com/cbodley/arrow/commits/ceph-6.0.1-xsimd-debug that lists everything under this include directory after the xsimd install step
force-pushed from a152a38 to 9472f8c
wow, okay. so when arrow's cmake builds the xsimd library as a submodule, it passes a
however, xsimd's install step installs the header to:
this path has an extra prefix of
so we need a way to hide that environment variable from arrow's build
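One way to hide an environment variable from an external project's build and install steps is to wrap the commands in `cmake -E env --unset=...`. The sketch below is an assumption about how that could look, not necessarily the fix that landed here: `HYPOTHETICAL_VAR` is a placeholder for whichever variable is at fault, and the `arrow_ext` target name is made up.

```cmake
include(ExternalProject)

# placeholder: name of the environment variable to hide from arrow's build
set(arrow_HIDDEN_ENV "HYPOTHETICAL_VAR")

# prefix that runs a command with that variable removed from its environment
set(arrow_CLEAN_ENV ${CMAKE_COMMAND} -E env --unset=${arrow_HIDDEN_ENV})

ExternalProject_Add(arrow_ext
  SOURCE_DIR "${CMAKE_SOURCE_DIR}/src/arrow/cpp"
  # <BINARY_DIR> is substituted by ExternalProject with the build directory
  BUILD_COMMAND ${arrow_CLEAN_ENV} ${CMAKE_COMMAND} --build <BINARY_DIR>
  INSTALL_COMMAND ${arrow_CLEAN_ENV} ${CMAKE_COMMAND} --build <BINARY_DIR> --target install)
```

Overriding the build command this way bypasses the default `$(MAKE)` invocation under Makefile generators, so parallelism would have to be passed explicitly if it matters.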
force-pushed from 7f4a8b6 to c73e1fd
@galsalomon66 do you know whether we need this flag enabled?
this flag was OFF (with ORC=OFF)
force-pushed from 02a74bb to a9e805b
the centos package is 4.0.1 because that's what was in the apache repo that Gal was using. I will update to 6.0.1 at some point.
I tried building it. Got the following:
I did install
force-pushed from e94876c to ad0d8aa
thanks @ivancich, i updated the Findutf8proc.cmake module that i took from arrow, and built successfully on fedora with the default WITH_SYSTEM_UTF8PROC=ON. the jenkins builds still succeed, as do the shaman builds in https://shaman.ceph.com/builds/ceph/wip-arrow-submodule-utf8proc/
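As a rough picture of the WITH_SYSTEM_UTF8PROC switch mentioned above, a sketch under stated assumptions: the option text and the `src/utf8proc` submodule path are hypothetical, and the `find_package` call assumes a Findutf8proc.cmake module is reachable through CMAKE_MODULE_PATH. The PR builds the non-system copy as a static library; how that is enforced is not shown here.

```cmake
# default ON, as noted above; packaging can pass -DWITH_SYSTEM_UTF8PROC=OFF
option(WITH_SYSTEM_UTF8PROC "require and build with system utf8proc" ON)

if(WITH_SYSTEM_UTF8PROC)
  # resolved through a Findutf8proc.cmake module on CMAKE_MODULE_PATH
  find_package(utf8proc REQUIRED)
else()
  # hypothetical submodule location; EXCLUDE_FROM_ALL keeps its targets out
  # of the default build unless something links against them
  add_subdirectory(src/utf8proc EXCLUDE_FROM_ALL)
endif()
```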
force-pushed from ad0d8aa to d334652
adds an arrow submodule. when WITH_RADOSGW_SELECT_PARQUET is enabled, the submodule is built as an external project and rgw links against its imported Arrow::Parquet target
Signed-off-by: Casey Bodley <cbodley@redhat.com>
adds utf8proc submodule, needed by the arrow submodule in centos. add a WITH_SYSTEM_UTF8PROC option that controls whether or not utf8proc is built from submodule
non-system utf8proc is built as a static library to avoid conflicts with system-provided libraries
ceph.spec.in sets WITH_SYSTEM_UTF8PROC=OFF until it's available in centos
Signed-off-by: Casey Bodley <cbodley@redhat.com>
the arrow submodule builds some C sources that trip up on _FORTIFY_SOURCE in debug builds
[ 79%] Building C object src/arrow/CMakeFiles/arrow_objlib.dir/vendored/musl/strptime.c.o
In file included from /usr/include/time.h:25,
                 from /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-10531-gc73e1fda/rpm/el8/BUILD/ceph-17.0.0-10531-gc73e1fda/src/arrow/cpp/src/arrow/vendored/strptime.h:20,
                 from /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-10531-gc73e1fda/rpm/el8/BUILD/ceph-17.0.0-10531-gc73e1fda/src/arrow/cpp/src/arrow/vendored/musl/strptime.c:4:
/usr/include/features.h:381:4: error: #warning _FORTIFY_SOURCE requires compiling with optimization (-O) [-Werror=cpp]
  381 | # warning _FORTIFY_SOURCE requires compiling with optimization (-O)
      | ^~~~~~~
cc1: all warnings being treated as errors
make[5]: *** [src/arrow/CMakeFiles/arrow_objlib.dir/build.make:2543: src/arrow/CMakeFiles/arrow_objlib.dir/vendored/musl/strptime.c.o] Error 1
Signed-off-by: Casey Bodley <cbodley@redhat.com>
relies on a hack to find the installed ParquetConfig.cmake Signed-off-by: Casey <cbodley@redhat.com>
Signed-off-by: Casey <cbodley@redhat.com>
Signed-off-by: Casey <cbodley@redhat.com>
Signed-off-by: Casey <cbodley@redhat.com>
Signed-off-by: Casey <cbodley@redhat.com>
force-pushed from d334652 to e8460cb
jenkins test make check
@galsalomon66 if you think this is ready for merge and quincy backport, could you please approve?
approved
I'm requesting support for Arrow Flight be added.
hi @ivancich, this PR is a blocker for s3select in quincy so i'd really like to keep Arrow Flight out of scope for now. we're forcing the whole ceph project to build this submodule, so we should only enable the features that we actually depend on. i'm happy to work with you to enable it for testing in the meantime
I understand. I'm only asking that it be built conditionally.
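A conditional build along these lines could be as small as the sketch below. The option name is hypothetical and not part of this PR; `ARROW_FLIGHT` is Arrow's own cmake switch for the Flight component, and `arrow_CMAKE_ARGS` is the argument list already used for the submodule build.

```cmake
# hypothetical option name; OFF by default so regular builds are unaffected
option(WITH_RADOSGW_ARROW_FLIGHT "build the arrow submodule with Arrow Flight" OFF)

if(WITH_RADOSGW_ARROW_FLIGHT)
  # only ask arrow for Flight when someone opts in
  list(APPEND arrow_CMAKE_ARGS -DARROW_FLIGHT=ON)
endif()
```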
Looking forward to getting Arrow Flight built through this more general Arrow enablement.
sure, thanks Eric! this PR still needs an approval to merge, cc @galsalomon66
@cbodley this PR seems to have introduced a regression for CentOS / RHEL operating systems: https://tracker.ceph.com/issues/55114
adds an arrow submodule. when WITH_RADOSGW_SELECT_PARQUET is enabled, the submodule is built as an external project and rgw links against its imported Arrow::Parquet target
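To make the "imported Arrow::Parquet target" idea concrete, here is one generic way to expose a library installed by an external project. This is only a sketch under assumptions and differs from the PR, which locates the installed ParquetConfig.cmake instead: `arrow_ext`, `arrow_INSTALL_PREFIX`, the `lib/libparquet.a` location, and the `rgw_common` consumer are all placeholders, and a real static link would also need libarrow and Arrow's own dependencies.

```cmake
# assumes an ExternalProject target named arrow_ext (hypothetical) that
# installs static libraries and headers under arrow_INSTALL_PREFIX
set(arrow_INSTALL_PREFIX "${CMAKE_BINARY_DIR}/arrow-install")
set(arrow_INCLUDE_DIR "${arrow_INSTALL_PREFIX}/include")

# imported targets reject include dirs that don't exist at configure time
file(MAKE_DIRECTORY "${arrow_INCLUDE_DIR}")

if(WITH_RADOSGW_SELECT_PARQUET)
  add_library(Arrow::Parquet STATIC IMPORTED)
  set_target_properties(Arrow::Parquet PROPERTIES
    # the library directory may be lib64 on some distros
    IMPORTED_LOCATION "${arrow_INSTALL_PREFIX}/lib/libparquet.a"
    INTERFACE_INCLUDE_DIRECTORIES "${arrow_INCLUDE_DIR}")
  # make consumers wait for the external project to build and install
  add_dependencies(Arrow::Parquet arrow_ext)

  # rgw_common stands in for whichever rgw target needs parquet
  target_link_libraries(rgw_common PRIVATE Arrow::Parquet)
endif()
```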
TODO: