ARROW-16340: [C++][Python] Move all Python related code into PyArrow #13311

Merged
merged 138 commits into from Aug 26, 2022

Member

AlenkaF commented Jun 3, 2022

This PR moves the src/arrow/python directory into pyarrow and arranges for PyArrow to build it. The build on the Python side is done in two steps:

  1. _run_cmake_pyarrow_cpp() where the C++ part of pyarrow is built first (the part that was moved in this refactoring)
  2. _run_cmake() where pyarrow is built as before
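
As a rough illustration of the two steps, the sketch below assembles the two cmake configure commands in the order they would run. It is not the code in setup.py: the helper names `cmake_configure_command` and `two_step_configure`, the directory layout, and the shared install prefix are assumptions made for this example. Only the step names `_run_cmake_pyarrow_cpp()` and `_run_cmake()` and their ordering come from this PR.

```python
def cmake_configure_command(source_dir, install_prefix, build_type="release", options=None):
    """Return a cmake configure command as a list of arguments (nothing is executed)."""
    command = [
        "cmake",
        f"-DCMAKE_INSTALL_PREFIX={install_prefix}",
        f"-DCMAKE_BUILD_TYPE={build_type}",
    ]
    # Turn boolean feature flags into -DNAME=on/off, in a stable order.
    for name, enabled in sorted((options or {}).items()):
        command.append(f"-D{name}={'on' if enabled else 'off'}")
    command.append(source_dir)
    return command


def two_step_configure(python_dir, options=None):
    """Return the configure commands for both steps, in the order they would run."""
    prefix = f"{python_dir}/build/dist"  # assumed install prefix, for illustration only
    # Step 1 (_run_cmake_pyarrow_cpp): the C++ part of pyarrow, configured first.
    cpp_step = cmake_configure_command(
        f"{python_dir}/pyarrow/src_arrow", prefix, options=options
    )
    # Step 2 (_run_cmake): pyarrow itself, configured as before.
    pyarrow_step = cmake_configure_command(python_dir, prefix, options=options)
    return [cpp_step, pyarrow_step]
```

Calling `two_step_configure("/arrow/python", {"PYARROW_WITH_DATASET": True})` returns two argument lists that could be handed to `subprocess.run` one after the other; the second step only makes sense once the first has been built and installed.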

No changes are needed in the build process from the user side to successfully build pyarrow after this refactoring. The tests for the C++ part of PyArrow will, however, be moved into Cython; for now they can be run with:

>>> pushd python/build/dist/temp 
>>> ctest

github-actions bot commented Jun 3, 2022

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

arrow_find_package(ARROW_PYTHON
"${ARROW_HOME}"
"${CPYARROW_HOME}"
Contributor

This smells a bit hackish. It feels a bit like we are building two totally different projects.

In the long term we should probably have libarrow_python.so just be one of the shared objects constituting pyarrow. It should probably be possible to integrate libarrow_python.so with the rest of the built files using add_subdirectory or something equivalent.

Member

> In the long term we should probably have libarrow_python.so just be one of the shared objects constituting pyarrow.

That's actually what is already happening, I think. The libarrow_python.so gets copied into the pyarrow directory (either the repo itself for an in-place build, or the build directory, before that one gets copied to site-packages for a normal install).
But for building the Cython extensions, we need to point to where libarrow_python.so can be found (which is now a different location from where libarrow.so can be found). Doing that with an argument to cmake seems like a good way to do it? (Alternatively, we might need to edit arrow_find_package to work with this new situation?)

Member Author

AlenkaF commented Jun 28, 2022

@kou there is one more issue I am struggling with, and I would be happy to get your opinion.

On the CI I get linker errors for python flight module (not locally on M1):
https://github.com/apache/arrow/runs/7087113174?check_suite_focus=true

I am facing a similar issue locally when trying to build the tests for the C++ part of PyArrow with GTest:

[100%] Linking CXX executable arrow-python-test
ld: library not found for -larrow_testing_shared
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [arrow-python-test] Error 1
make[1]: *** [CMakeFiles/arrow-python-test.dir/all] Error 2
make: *** [all] Error 2
error: command '/usr/bin/make' failed with exit code 2
~/repos/arrow-cmake

What could be the setup work I am missing so that the libraries would be linked correctly?

Member

kou commented Jun 28, 2022

> On the CI I get linker errors for python flight module (not locally on M1):
> https://github.com/apache/arrow/runs/7087113174?check_suite_focus=true

It seems that this link is wrong. (This doesn't have the error message.)

> What could be the setup work I am missing so that the libraries would be linked correctly?

We need find_package(ArrowTesting REQUIRED) to use the arrow_testing_shared target.

Member Author

AlenkaF commented Jun 29, 2022

> > On the CI I get linker errors for python flight module (not locally on M1):
> > https://github.com/apache/arrow/runs/7087113174?check_suite_focus=true
>
> It seems that this link is wrong. (This doesn't have the error message.)

Maybe this way would be easier:

Error with make build of C PyArrow
-- Running cmake for C pyarrow
  cmake -DCMAKE_INSTALL_PREFIX=/arrow/python/build/dist -DCMAKE_BUILD_TYPE=debug -DPYARROW_WITH_DATASET=on -DPYARROW_WITH_PARQUET_ENCRYPTION=on -DPYARROW_WITH_HDFS=on /arrow/python/pyarrow/src_arrow
  -- CMAKE_MODULE_PATH /arrow/python/cmake_modules;/opt/conda/envs/arrow/lib/cmake/arrow
  -- The C compiler identification is GNU 10.3.0
  -- The CXX compiler identification is GNU 10.3.0
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working C compiler: /opt/conda/envs/arrow/bin/x86_64-conda-linux-gnu-cc - skipped
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: /opt/conda/envs/arrow/bin/x86_64-conda-linux-gnu-c++ - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Found PkgConfig: /opt/conda/envs/arrow/bin/pkg-config (found version "0.29.2")
  -- Found Arrow: /opt/conda/envs/arrow/include (found version "9.0.0")
  -- Arrow version: 9.0.0 (HOME: /opt/conda/envs/arrow)
  -- Arrow SO and ABI version: 900
  -- Arrow full SO version: 900.0.0
  -- Found the Arrow core shared library: /opt/conda/envs/arrow/lib/libarrow.so
  -- Found the Arrow core import library: /opt/conda/envs/arrow/lib/libarrow.so
  -- Found the Arrow core static library: ARROW_static_lib-NOTFOUND
  -- Found Python3: /opt/conda/envs/arrow/bin/python3.9 (found suitable version "3.9.13", minimum required is "3.7") found components: Interpreter Development.Module NumPy
  -- Found Python3Alt: /opt/conda/envs/arrow/bin/python3.9 (Required is at least version "3.7")
  CMake Warning (dev) at CMakeLists.txt:55 (option):
    Policy CMP0077 is not set: option() honors normal variables.  Run "cmake
    --help-policy CMP0077" for policy details.  Use the cmake_policy command to
    set the policy and suppress this warning.
    For compatibility with older versions of CMake, option is clearing the
    normal variable 'ARROW_BUILD_SHARED'.
  This warning is for project developers.  Use -Wno-dev to suppress it.
  -- Found ArrowDataset: /opt/conda/envs/arrow/include (found version "9.0.0")
  -- Found the Arrow Dataset by HOME: /opt/conda/envs/arrow
  -- Found the Arrow Dataset shared library: /opt/conda/envs/arrow/lib/libarrow_dataset.so
  -- Found the Arrow Dataset import library: /opt/conda/envs/arrow/lib/libarrow_dataset.so
  -- Found the Arrow Dataset static library: ARROW_DATASET_static_lib-NOTFOUND
  -- Performing Test CXX_LINKER_SUPPORTS_VERSION_SCRIPT
  -- Performing Test CXX_LINKER_SUPPORTS_VERSION_SCRIPT - Success
  -- Found ArrowFlight: /opt/conda/envs/arrow/include (found version "9.0.0")
  -- Found the Arrow Flight by HOME: /opt/conda/envs/arrow
  -- Found the Arrow Flight shared library: /opt/conda/envs/arrow/lib/libarrow_flight.so
  -- Found the Arrow Flight import library: /opt/conda/envs/arrow/lib/libarrow_flight.so
  -- Found the Arrow Flight static library: ARROW_FLIGHT_static_lib-NOTFOUND
  -- Found ZLIB: /opt/conda/envs/arrow/lib/libz.so (found version "1.2.12")
  -- Found Protobuf: /opt/conda/envs/arrow/lib/libprotobuf.so;-lpthread (found version "3.19.4")
  -- Found OpenSSL: /opt/conda/envs/arrow/lib/libcrypto.so (found version "1.1.1o")
  -- Found c-ares: /opt/conda/envs/arrow/lib/cmake/c-ares/c-ares-config.cmake (found version "1.18.1")
  -- Found Threads: TRUE
  -- Check if compiler accepts -pthread
  -- Check if compiler accepts -pthread - yes
  -- Found RE2 via CMake.
  -- Configuring done
  -- Generating done
  -- Build files have been written to: /arrow/python/build/dist/temp
  -- Finished cmake for C pyarrow
  -- Running make build and install for C pyarrow
  make -j4
  [  3%] Generating /arrow/cpp/build/src/arrow/flight/Flight.pb.cc, /arrow/cpp/build/src/arrow/flight/Flight.pb.h, /arrow/cpp/build/src/arrow/flight/Flight.grpc.pb.cc, /arrow/cpp/build/src/arrow/flight/Flight.grpc.pb.h
  make[2]: I/arrow/format: No such file or directory
  make[2]: I/arrow/format: No such file or directory
  [  7%] Building CXX object CMakeFiles/arrow_python_objlib.dir/arrow_to_pandas.cc.o
  [ 11%] Building CXX object CMakeFiles/arrow_python_objlib.dir/benchmark.cc.o
  [ 15%] Building CXX object CMakeFiles/arrow_python_objlib.dir/common.cc.o
  [ 15%] Built target flight_grpc_gen
  [ 19%] Building CXX object CMakeFiles/arrow_python_objlib.dir/datetime.cc.o
  [ 23%] Building CXX object CMakeFiles/arrow_python_objlib.dir/decimal.cc.o
  [ 26%] Building CXX object CMakeFiles/arrow_python_objlib.dir/deserialize.cc.o
  [ 30%] Building CXX object CMakeFiles/arrow_python_objlib.dir/extension_type.cc.o
  [ 34%] Building CXX object CMakeFiles/arrow_python_objlib.dir/gdb.cc.o
  [ 38%] Building CXX object CMakeFiles/arrow_python_objlib.dir/helpers.cc.o
  [ 42%] Building CXX object CMakeFiles/arrow_python_objlib.dir/inference.cc.o
  [ 46%] Building CXX object CMakeFiles/arrow_python_objlib.dir/init.cc.o
  [ 50%] Building CXX object CMakeFiles/arrow_python_objlib.dir/io.cc.o
  [ 53%] Building CXX object CMakeFiles/arrow_python_objlib.dir/ipc.cc.o
  [ 57%] Building CXX object CMakeFiles/arrow_python_objlib.dir/numpy_convert.cc.o
  [ 61%] Building CXX object CMakeFiles/arrow_python_objlib.dir/numpy_to_arrow.cc.o
  [ 65%] Building CXX object CMakeFiles/arrow_python_objlib.dir/python_to_arrow.cc.o
  [ 69%] Building CXX object CMakeFiles/arrow_python_objlib.dir/pyarrow.cc.o
  [ 73%] Building CXX object CMakeFiles/arrow_python_objlib.dir/serialize.cc.o
  [ 76%] Building CXX object CMakeFiles/arrow_python_objlib.dir/udf.cc.o
  [ 80%] Building CXX object CMakeFiles/arrow_python_objlib.dir/csv.cc.o
  [ 84%] Building CXX object CMakeFiles/arrow_python_objlib.dir/filesystem.cc.o
  [ 88%] Building CXX object CMakeFiles/arrow_python_objlib.dir/parquet_encryption.cc.o
  [ 88%] Built target arrow_python_objlib
  [ 92%] Linking CXX shared library libarrow_python.so
  /opt/conda/envs/arrow/bin/../lib/gcc/x86_64-conda-linux-gnu/10.3.0/../../../../x86_64-conda-linux-gnu/bin/ld: cannot find -lparquet_shared
  collect2: error: ld returned 1 exit status
  make[2]: *** [CMakeFiles/arrow_python_shared.dir/build.make:126: libarrow_python.so.900.0.0] Error 1
  make[1]: *** [CMakeFiles/Makefile2:202: CMakeFiles/arrow_python_shared.dir/all] Error 2
  make: *** [Makefile:136: all] Error 2
  error: command '/opt/conda/envs/arrow/bin/make' failed with exit code 2
  error: subprocess-exited-with-error

Member

kou commented Jun 29, 2022

We need find_package(Parquet REQUIRED) for parquet_shared.
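
Both linker failures in this thread follow the same pattern: a name such as parquet_shared that is not a known CMake target at configure time is most likely handed to the linker as a plain -l flag, and the link step then fails. As a small debugging aid, the sketch below scans a build log for such missing libraries and suggests the matching find_package() call. The function name `suggest_find_package` and the lookup table are hypothetical; the table contains only the two mappings given in the replies above, and other libraries would need their own entries.

```python
import re

# Library name -> CMake package, taken from the replies in this thread only.
PACKAGE_FOR_LIBRARY = {
    "arrow_testing_shared": "ArrowTesting",
    "parquet_shared": "Parquet",
}

# Matches "library not found for -lfoo" (macOS ld) and "cannot find -lfoo" (GNU ld).
MISSING_LIBRARY = re.compile(r"(?:library not found for|cannot find) -l(\w+)")


def suggest_find_package(build_log):
    """Return find_package() suggestions for missing libraries found in a build log."""
    packages = []
    for library in MISSING_LIBRARY.findall(build_log):
        package = PACKAGE_FOR_LIBRARY.get(library)
        # Skip libraries we have no mapping for, and avoid duplicate suggestions.
        if package is not None and package not in packages:
            packages.append(package)
    return [f"find_package({package} REQUIRED)" for package in packages]
```

Fed the two error lines quoted earlier in this conversation, the function returns `find_package(ArrowTesting REQUIRED)` and `find_package(Parquet REQUIRED)` respectively; an unknown library yields an empty list rather than a guess.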

Member Author

AlenkaF commented Jun 29, 2022

Thank you @kou!

Member Author

AlenkaF commented Jul 1, 2022

I would like to mark this PR as ready for review.

There are still two CI failures I am investigating (see ⬇️), and test_cython_api() from test_cython.py is broken, so I am looking into that at the moment. But I think this shouldn't block the PR from getting initial reviews 🙏 @kou @pitrou @kszucs @jorisvandenbossche

  • Could NOT find Python3 (missing: Python3_NumPy_INCLUDE_DIRS NumPy)
    I have seen this error in some other failing builds and am not sure if it is connected to this PR or not.
  CMake Error at /opt/conda/envs/arrow/share/cmake-3.23/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
    Could NOT find Python3 (missing: Python3_NumPy_INCLUDE_DIRS NumPy) (found
    suitable version "3.9.13", minimum required is "3.7")
  Call Stack (most recent call first):
    /opt/conda/envs/arrow/share/cmake-3.23/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE)
    /opt/conda/envs/arrow/share/cmake-3.23/Modules/FindPython/Support.cmake:3192 (find_package_handle_standard_args)
    /opt/conda/envs/arrow/share/cmake-3.23/Modules/FindPython3.cmake:490 (include)
    /arrow/python/cmake_modules/FindPython3Alt.cmake:56 (find_package)
    CMakeLists.txt:64 (find_package)
  • fatal error: arrow/flight/Flight.pb.h: No such file or directory
    This error is definitely connected to this PR. I ran into it locally for the Python Flight build, but it disappeared after I continued making changes to the code.
  In file included from /arrow/cpp/src/arrow/flight/serialization_internal.h:22,
                   from /arrow/python/pyarrow/src_arrow/flight.cc:21:
  /arrow/cpp/src/arrow/flight/protocol_internal.h:23:10: fatal error: arrow/flight/Flight.pb.h: No such file or directory
     23 | #include "arrow/flight/Flight.pb.h"  // IWYU pragma: export
        |          ^~~~~~~~~~~~~~~~~~~~~~~~~~
  compilation terminated.
  make[2]: *** [CMakeFiles/arrow_python_flight_objlib.dir/build.make:76: CMakeFiles/arrow_python_flight_objlib.dir/flight.cc.o] Error 1
  make[1]: *** [CMakeFiles/Makefile2:255: CMakeFiles/arrow_python_flight_objlib.dir/all] Error 2
  make: *** [Makefile:136: all] Error 2
  error: command '/opt/conda/envs/arrow/bin/make' failed with exit code 2
  error: subprocess-exited-with-error

@AlenkaF AlenkaF marked this pull request as ready for review July 1, 2022 04:39
Member

pitrou commented Jul 1, 2022

Hmm, we should try to make sure the Python Flight bindings can compile without including some internal headers.

Member Author

AlenkaF commented Jul 1, 2022

> Hmm, we should try to make sure the Python Flight bindings can compile without including some internal headers.

Are you thinking in terms of this PR or in general?

Member

pitrou commented Jul 1, 2022

I mean in general, though that could be done as part of this PR (unless I'm missing something that makes internal headers mandatory here, but that would be surprising).

Member

lidavidm commented Jul 1, 2022

I think it should be easy enough to remove the need for serialization_internal.h inside python/flight.cc (use Make instead of directly constructing the structure)

Member Author

AlenkaF commented Jul 1, 2022

Thanks for the comments!
In that case I will try to remove the internal headers for Python Flight and see how it goes.

Member Author

AlenkaF commented Jul 4, 2022

Is there something similar I can do with this part:

Status CreateSchemaResult(const std::shared_ptr<arrow::Schema>& schema,
                          std::unique_ptr<arrow::flight::SchemaResult>* out) {
  std::string schema_in;
  RETURN_NOT_OK(arrow::flight::internal::SchemaToString(*schema, &schema_in));
  arrow::flight::SchemaResult value(schema_in);
  *out = std::unique_ptr<arrow::flight::SchemaResult>(
      new arrow::flight::SchemaResult(value));
  return Status::OK();
}

that also uses arrow::flight::internal::SchemaToString? @lidavidm

I have successfully changed the construction of FlightInfo for Status CreateFlightInfo in python/flight.cc.

Member

lidavidm commented Jul 4, 2022

Hmm, looks like we need to add a factory method for SchemaResult. There isn't anything available already.

Member Author

AlenkaF commented Aug 25, 2022

I have rebased the PR and corrected the arrow.spec.in file. I hope I got it right, @kou? The changes can be found here.
If so, I can run the nightly checks again.

Member

kou commented Aug 25, 2022

Yes. Thanks.
Sorry, I missed one more thing. Could you also remove python*-devel and python*-numpy from dev/tasks/linux-packages/apache-arrow/yum/*/Dockerfiles?

Member Author

AlenkaF commented Aug 25, 2022

@github-actions crossbow submit -g nightly-tests -g nightly-packaging -g nightly-release

@github-actions

Revision: 0860306

Submitted crossbow builds: ursacomputing/crossbow @ actions-58857300ac

Task Status
almalinux-8-amd64 Github Actions
almalinux-8-arm64 TravisCI
almalinux-9-amd64 Github Actions
almalinux-9-arm64 TravisCI
amazon-linux-2-amd64 Github Actions
amazon-linux-2-arm64 TravisCI
centos-7-amd64 Github Actions
centos-8-stream-amd64 Github Actions
centos-8-stream-arm64 TravisCI
centos-9-stream-amd64 Github Actions
centos-9-stream-arm64 TravisCI
conan-maximum Github Actions
conan-minimum Github Actions
conda-clean Azure
conda-linux-gcc-py310-arm64 Azure
conda-linux-gcc-py310-cpu Azure
conda-linux-gcc-py310-cuda Azure
conda-linux-gcc-py310-ppc64le Azure
conda-linux-gcc-py37-arm64 Azure
conda-linux-gcc-py37-cpu-r40 Azure
conda-linux-gcc-py37-cpu-r41 Azure
conda-linux-gcc-py37-cuda Azure
conda-linux-gcc-py37-ppc64le Azure
conda-linux-gcc-py38-arm64 Azure
conda-linux-gcc-py38-cpu Azure
conda-linux-gcc-py38-cuda Azure
conda-linux-gcc-py38-ppc64le Azure
conda-linux-gcc-py39-arm64 Azure
conda-linux-gcc-py39-cpu Azure
conda-linux-gcc-py39-cuda Azure
conda-linux-gcc-py39-ppc64le Azure
conda-osx-arm64-clang-py310 Azure
conda-osx-arm64-clang-py38 Azure
conda-osx-arm64-clang-py39 Azure
conda-osx-clang-py310 Azure
conda-osx-clang-py37-r40 Azure
conda-osx-clang-py37-r41 Azure
conda-osx-clang-py38 Azure
conda-osx-clang-py39 Azure
conda-win-vs2017-py310 Azure
conda-win-vs2017-py37-r40 Azure
conda-win-vs2017-py37-r41 Azure
conda-win-vs2017-py38 Azure
conda-win-vs2017-py39 Azure
debian-bookworm-amd64 Github Actions
debian-bookworm-arm64 TravisCI
debian-bullseye-amd64 Github Actions
debian-bullseye-arm64 TravisCI
example-cpp-minimal-build-static Github Actions
example-cpp-minimal-build-static-system-dependency Github Actions
example-python-minimal-build-fedora-conda Github Actions
example-python-minimal-build-ubuntu-venv Github Actions
homebrew-cpp Github Actions
homebrew-r-autobrew Github Actions
homebrew-r-brew Github Actions
java-jars Github Actions
nuget Github Actions
python-sdist Github Actions
r-binary-packages Github Actions
test-alpine-linux-cpp Github Actions
test-build-cpp-fuzz Github Actions
test-build-vcpkg-win Github Actions
test-conda-cpp Github Actions
test-conda-cpp-valgrind Azure
test-conda-python-3.10 Github Actions
test-conda-python-3.7 Github Actions
test-conda-python-3.7-hdfs-2.9.2 Github Actions
test-conda-python-3.7-hdfs-3.2.1 Github Actions
test-conda-python-3.7-kartothek-latest Github Actions
test-conda-python-3.7-kartothek-master Github Actions
test-conda-python-3.7-pandas-0.24 Github Actions
test-conda-python-3.7-pandas-latest Github Actions
test-conda-python-3.7-spark-v3.1.2 Github Actions
test-conda-python-3.8 Github Actions
test-conda-python-3.8-hypothesis Github Actions
test-conda-python-3.8-pandas-latest Github Actions
test-conda-python-3.8-pandas-nightly Github Actions
test-conda-python-3.8-spark-v3.2.0 Github Actions
test-conda-python-3.9 Github Actions
test-conda-python-3.9-dask-latest Github Actions
test-conda-python-3.9-dask-master Github Actions
test-conda-python-3.9-pandas-master Github Actions
test-conda-python-3.9-spark-master Github Actions
test-debian-10-cpp-amd64 Github Actions
test-debian-10-cpp-i386 Github Actions
test-debian-11-cpp-amd64 Github Actions
test-debian-11-cpp-i386 Github Actions
test-debian-11-go-1.16 Azure
test-debian-11-python-3 Azure
test-debian-c-glib Github Actions
test-debian-ruby Github Actions
test-fedora-35-cpp Github Actions
test-fedora-35-python-3 Azure
test-fedora-r-clang-sanitizer Azure
test-r-arrow-backwards-compatibility Github Actions
test-r-depsource-bundled Azure
test-r-depsource-system Github Actions
test-r-dev-duckdb Github Actions
test-r-devdocs Github Actions
test-r-gcc-11 Github Actions
test-r-gcc-12 Github Actions
test-r-install-local Github Actions
test-r-linux-as-cran Github Actions
test-r-linux-rchk Github Actions
test-r-linux-valgrind Azure
test-r-minimal-build Azure
test-r-offline-maximal Github Actions
test-r-offline-minimal Azure
test-r-rhub-debian-gcc-devel-lto-latest Azure
test-r-rhub-debian-gcc-release-custom-ccache Azure
test-r-rhub-ubuntu-gcc-release-latest Azure
test-r-rocker-r-base-latest Azure
test-r-rstudio-r-base-4.1-opensuse153 Azure
test-r-rstudio-r-base-4.2-centos7-devtoolset-8 Azure
test-r-rstudio-r-base-4.2-focal Azure
test-r-ubuntu-22.04 Github Actions
test-r-versions Github Actions
test-skyhook-integration Github Actions
test-ubuntu-18.04-cpp Github Actions
test-ubuntu-18.04-cpp-release Github Actions
test-ubuntu-18.04-cpp-static Github Actions
test-ubuntu-18.04-r-sanitizer Azure
test-ubuntu-20.04-cpp Github Actions
test-ubuntu-20.04-cpp-14 Github Actions
test-ubuntu-20.04-cpp-17 Github Actions
test-ubuntu-20.04-cpp-bundled Github Actions
test-ubuntu-20.04-cpp-thread-sanitizer Github Actions
test-ubuntu-20.04-python-3 Azure
test-ubuntu-22.04-cpp Github Actions
test-ubuntu-c-glib Github Actions
test-ubuntu-default-docs Azure
test-ubuntu-ruby Github Actions
ubuntu-bionic-amd64 Github Actions
ubuntu-bionic-arm64 TravisCI
ubuntu-focal-amd64 Github Actions
ubuntu-focal-arm64 TravisCI
ubuntu-jammy-amd64 Github Actions
ubuntu-jammy-arm64 TravisCI
verify-rc-source-cpp-linux-almalinux-8-amd64 Github Actions
verify-rc-source-cpp-linux-conda-latest-amd64 Github Actions
verify-rc-source-cpp-linux-ubuntu-18.04-amd64 Github Actions
verify-rc-source-cpp-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-cpp-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-cpp-macos-amd64 Github Actions
verify-rc-source-cpp-macos-arm64 Github Actions
verify-rc-source-cpp-macos-conda-amd64 Github Actions
verify-rc-source-csharp-linux-almalinux-8-amd64 Github Actions
verify-rc-source-csharp-linux-conda-latest-amd64 Github Actions
verify-rc-source-csharp-linux-ubuntu-18.04-amd64 Github Actions
verify-rc-source-csharp-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-csharp-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-csharp-macos-amd64 Github Actions
verify-rc-source-csharp-macos-arm64 Github Actions
verify-rc-source-go-linux-almalinux-8-amd64 Github Actions
verify-rc-source-go-linux-conda-latest-amd64 Github Actions
verify-rc-source-go-linux-ubuntu-18.04-amd64 Github Actions
verify-rc-source-go-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-go-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-go-macos-amd64 Github Actions
verify-rc-source-go-macos-arm64 Github Actions
verify-rc-source-integration-linux-almalinux-8-amd64 Github Actions
verify-rc-source-integration-linux-conda-latest-amd64 Github Actions
verify-rc-source-integration-linux-ubuntu-18.04-amd64 Github Actions
verify-rc-source-integration-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-integration-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-integration-macos-amd64 Github Actions
verify-rc-source-integration-macos-arm64 Github Actions
verify-rc-source-integration-macos-conda-amd64 Github Actions
verify-rc-source-java-linux-almalinux-8-amd64 Github Actions
verify-rc-source-java-linux-conda-latest-amd64 Github Actions
verify-rc-source-java-linux-ubuntu-18.04-amd64 Github Actions
verify-rc-source-java-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-java-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-java-macos-amd64 Github Actions
verify-rc-source-js-linux-almalinux-8-amd64 Github Actions
verify-rc-source-js-linux-conda-latest-amd64 Github Actions
verify-rc-source-js-linux-ubuntu-18.04-amd64 Github Actions
verify-rc-source-js-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-js-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-js-macos-amd64 Github Actions
verify-rc-source-js-macos-arm64 Github Actions
verify-rc-source-python-linux-almalinux-8-amd64 Github Actions
verify-rc-source-python-linux-conda-latest-amd64 Github Actions
verify-rc-source-python-linux-ubuntu-18.04-amd64 Github Actions
verify-rc-source-python-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-python-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-python-macos-amd64 Github Actions
verify-rc-source-python-macos-arm64 Github Actions
verify-rc-source-python-macos-conda-amd64 Github Actions
verify-rc-source-ruby-linux-almalinux-8-amd64 Github Actions
verify-rc-source-ruby-linux-conda-latest-amd64 Github Actions
verify-rc-source-ruby-linux-ubuntu-18.04-amd64 Github Actions
verify-rc-source-ruby-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-ruby-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-ruby-macos-amd64 Github Actions
verify-rc-source-ruby-macos-arm64 Github Actions
verify-rc-source-windows Github Actions
wheel-macos-big-sur-cp310-arm64 Github Actions
wheel-macos-big-sur-cp310-universal2 Github Actions
wheel-macos-big-sur-cp38-arm64 Github Actions
wheel-macos-big-sur-cp39-arm64 Github Actions
wheel-macos-big-sur-cp39-universal2 Github Actions
wheel-macos-high-sierra-cp310-amd64 Github Actions
wheel-macos-high-sierra-cp37-amd64 Github Actions
wheel-macos-high-sierra-cp38-amd64 Github Actions
wheel-macos-high-sierra-cp39-amd64 Github Actions
wheel-macos-mavericks-cp310-amd64 Github Actions
wheel-macos-mavericks-cp37-amd64 Github Actions
wheel-macos-mavericks-cp38-amd64 Github Actions
wheel-macos-mavericks-cp39-amd64 Github Actions
wheel-manylinux2014-cp310-amd64 Github Actions
wheel-manylinux2014-cp310-arm64 TravisCI
wheel-manylinux2014-cp37-amd64 Github Actions
wheel-manylinux2014-cp37-arm64 TravisCI
wheel-manylinux2014-cp38-amd64 Github Actions
wheel-manylinux2014-cp38-arm64 TravisCI
wheel-manylinux2014-cp39-amd64 Github Actions
wheel-manylinux2014-cp39-arm64 TravisCI
wheel-windows-cp310-amd64 Github Actions
wheel-windows-cp37-amd64 Github Actions
wheel-windows-cp38-amd64 Github Actions
wheel-windows-cp39-amd64 Github Actions

Member Author

AlenkaF commented Aug 26, 2022

Lots of green 😍 😆
@kou @pitrou the failures do not look related, all issues from before in the nightlies should be fixed 👍

Member

kou left a comment

+1

The centos-9-stream-amd64 failure is a new failure but I think that it's not related to this pull request. We can fix it as a follow-up task.

https://github.com/ursacomputing/crossbow/runs/8027904112?check_suite_focus=true#step:6:4839

FAILED: gandiva-glib/Gandiva-1.0.gir 
/usr/bin/meson --internal exe --unpickle /build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/meson-private/meson_exe_g-ir-scanner_f62efc76b62635482d998451ca9762c4e34e9776.dat
while executing ['/usr/bin/g-ir-scanner', '--no-libtool', '--namespace=Gandiva', '--nsversion=1.0', '--warn-all', '--output', 'gandiva-glib/Gandiva-1.0.gir', '--c-include=gandiva-glib/gandiva-glib.h', '--warn-all', '--include-uninstalled=./arrow-glib/Arrow-1.0.gir', '-I/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/gandiva-glib', '-I/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/gandiva-glib', '-I/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/.', '-I/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/.', '-I/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/../cpp/redhat-linux-build/src', '-I/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/../cpp/redhat-linux-build/src', '-I/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/../cpp/src', '-I/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/../cpp/src', '-I/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/.', '-I/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/.', '-I/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/../cpp/redhat-linux-build/src', '-I/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/../cpp/redhat-linux-build/src', '-I/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/../cpp/src', '-I/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/../cpp/src', '--filelist=/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/gandiva-glib/libgandiva-glib.so.1000.0.0.p/Gandiva_1.0_gir_filelist', '--include=Arrow-1.0', '--symbol-prefix=ggandiva', '--identifier-prefix=GGandiva', '--pkg-export=gandiva-glib', '--cflags-begin', '-I/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/.', '-I/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/.', '-I/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/../cpp/redhat-linux-build/src', '-I/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/../cpp/redhat-linux-build/src', 
'-I/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/../cpp/src', '-I/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/../cpp/src', '-I/usr/include/glib-2.0', '-I/usr/lib64/glib-2.0/include', '-I/usr/include/sysprof-4', '-I/usr/include/gobject-introspection-1.0', '--cflags-end', '--add-include-path=/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/arrow-glib', '--add-include-path=/usr/share/gir-1.0', '-L/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/gandiva-glib', '--library', 'gandiva-glib', '-L/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/arrow-glib', '-L/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/../../cpp/redhat-linux-build/release', '--extra-library=gobject-2.0', '--extra-library=glib-2.0', '--extra-library=girepository-1.0', '--sources-top-dirs', '/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/subprojects/', '--sources-top-dirs', '/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/subprojects/', '--warn-error']
--- stdout ---
Package arrow was not found in the pkg-config search path.
Perhaps you should add the directory containing `arrow.pc'
to the PKG_CONFIG_PATH environment variable
Package 'arrow', required by 'arrow-glib-uninstalled', not found

g-ir-scanner: link: gcc -o /build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/tmp-introspecttrutbm6g/Gandiva-1.0 -O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -march=x86-64-v2 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection /build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/tmp-introspecttrutbm6g/Gandiva-1.0.o -L. -Wl,-rpath,. -Wl,--no-as-needed -L/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/gandiva-glib -Wl,-rpath,/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/gandiva-glib -L/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/arrow-glib -Wl,-rpath,/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/arrow-glib -L/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/../../cpp/redhat-linux-build/release -Wl,-rpath,/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/../../cpp/redhat-linux-build/release -lgandiva-glib -lgobject-2.0 -lglib-2.0 -lgirepository-1.0 -lgio-2.0 -lgobject-2.0 -Wl,--export-dynamic -lgmodule-2.0 -pthread -lglib-2.0 -lglib-2.0 -Wl,-z,relro -Wl,--as-needed -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1

--- stderr ---
/usr/bin/ld: /build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/../../cpp/redhat-linux-build/release/libgandiva.so.1000: undefined reference to `std::__glibcxx_assert_fail(char const*, int, char const*, char const*)'
collect2: error: ld returned 1 exit status
linking of temporary binary failed: Command '['gcc', '-o', '/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/tmp-introspecttrutbm6g/Gandiva-1.0', '-O2', '-flto=auto', '-ffat-lto-objects', '-fexceptions', '-g', '-grecord-gcc-switches', '-pipe', '-Wall', '-Werror=format-security', '-Wp,-D_FORTIFY_SOURCE=2', '-Wp,-D_GLIBCXX_ASSERTIONS', '-specs=/usr/lib/rpm/redhat/redhat-hardened-cc1', '-fstack-protector-strong', '-specs=/usr/lib/rpm/redhat/redhat-annobin-cc1', '-m64', '-march=x86-64-v2', '-mtune=generic', '-fasynchronous-unwind-tables', '-fstack-clash-protection', '-fcf-protection', '/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/tmp-introspecttrutbm6g/Gandiva-1.0.o', '-L.', '-Wl,-rpath,.', '-Wl,--no-as-needed', '-L/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/gandiva-glib', '-Wl,-rpath,/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/gandiva-glib', '-L/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/arrow-glib', '-Wl,-rpath,/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/arrow-glib', '-L/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/../../cpp/redhat-linux-build/release', '-Wl,-rpath,/build/rpmbuild/BUILD/apache-arrow-9.0.0.dev783/c_glib/build/../../cpp/redhat-linux-build/release', '-lgandiva-glib', '-lgobject-2.0', '-lglib-2.0', '-lgirepository-1.0', '-lgio-2.0', '-lgobject-2.0', '-Wl,--export-dynamic', '-lgmodule-2.0', '-pthread', '-lglib-2.0', '-lglib-2.0', '-Wl,-z,relro', '-Wl,--as-needed', '-Wl,-z,now', '-specs=/usr/lib/rpm/redhat/redhat-hardened-ld', '-specs=/usr/lib/rpm/redhat/redhat-annobin-cc1']' returned non-zero exit status 1.

@kou kou changed the title ARROW-16340: [Python] Move all Python related code into PyArrow ARROW-16340: [C++][Python] Move all Python related code into PyArrow Aug 26, 2022
@kou
Member

kou commented Aug 26, 2022

Could you update the description of this pull request before we merge this?

@AlenkaF
Member Author

AlenkaF commented Aug 26, 2022

I will also open a JIRA issue for the centos-9-stream-amd64 failure.

@AlenkaF
Member Author

AlenkaF commented Aug 26, 2022

@kou The description is updated, and the JIRA issue for the centos-9-stream-amd64 failure can be found here: https://issues.apache.org/jira/browse/ARROW-17536.

@kou
Member

kou commented Aug 26, 2022

Thanks!
I'll merge this.

@kou kou merged commit b832853 into apache:master Aug 26, 2022
@ursabot

ursabot commented Aug 26, 2022

Benchmark runs are scheduled for baseline = 7e7b8e1 and contender = b832853. b832853 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed] test-mac-arm
[Failed ⬇️6.03% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.25% ⬆️0.39%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] b832853b ec2-t3-xlarge-us-east-2
[Failed] b832853b test-mac-arm
[Failed] b832853b ursa-i9-9960x
[Finished] b832853b ursa-thinkcentre-m75q
[Finished] 7e7b8e1f ec2-t3-xlarge-us-east-2
[Finished] 7e7b8e1f test-mac-arm
[Failed] 7e7b8e1f ursa-i9-9960x
[Finished] 7e7b8e1f ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@ursabot

ursabot commented Aug 26, 2022

['Python', 'R'] benchmarks have a high level of regressions.
ursa-i9-9960x

anjakefala pushed a commit to anjakefala/arrow that referenced this pull request Aug 31, 2022
…pache#13311)

This PR moves the `src/arrow/python` directory into `pyarrow` and arranges for PyArrow to build it. The build on the Python side is done in two steps:

1. `_run_cmake_pyarrow_cpp()`, where the C++ part of pyarrow is built first (the part that was moved in the refactoring)
2. `_run_cmake()`, where pyarrow is built as before

Users do not need to change anything in their build process to build pyarrow successfully after this refactoring. The tests for PyArrow C++ will, however, be moved into Cython; for now they can be run with:

```shell
>>> pushd python/build/dist/temp 
>>> ctest
```

Lead-authored-by: Alenka Frim <frim.alenka@gmail.com>
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
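
For readers unfamiliar with this kind of layout, the two-step ordering described above can be sketched roughly as follows. This is only an illustration of "configure and build the C++ layer first, then the main project"; it is not the code in PyArrow's `setup.py`. The function name `two_step_build`, the `cpp_part` and `main` directory names, and the injectable `run` parameter are all assumptions made for the example.

```python
import subprocess
from pathlib import Path


def two_step_build(source_dir, build_dir, run=subprocess.check_call):
    """Configure and build a C++ layer first, then the main project.

    Hypothetical sketch: the directory names are made up and do not
    match the paths used by PyArrow's own build.
    """
    source_dir, build_dir = Path(source_dir), Path(build_dir)
    steps = [
        # Step 1: the C++ part (the role the PR gives _run_cmake_pyarrow_cpp()).
        (source_dir / "cpp_part", build_dir / "cpp_part"),
        # Step 2: the rest of the package (the role the PR gives _run_cmake()).
        (source_dir, build_dir / "main"),
    ]
    commands = []
    for src, out in steps:
        # Configure, then build, each step in its own build directory.
        commands.append(["cmake", "-S", str(src), "-B", str(out)])
        commands.append(["cmake", "--build", str(out)])
    for cmd in commands:
        # With the default runner, a failing command raises
        # CalledProcessError, so step 2 never starts if step 1 fails.
        run(cmd)
    return commands
```

Passing a different `run` callable (for example `print`) shows the command sequence without invoking CMake, which makes the ordering easy to inspect.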
@AlenkaF AlenkaF deleted the ARROW-16340 branch September 16, 2022 06:35
zagto pushed a commit to zagto/arrow that referenced this pull request Oct 7, 2022
…pache#13311)

jorisvandenbossche pushed a commit that referenced this pull request Oct 10, 2022
…es (#14275)

This PR is a follow-up to #13311, where it was decided to change the base directory for PyArrow C++ headers to avoid top-level inclusions. See #13311 (comment)

Authored-by: Alenka Frim <frim.alenka@gmail.com>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
fatemehp pushed a commit to fatemehp/arrow that referenced this pull request Oct 17, 2022
…pache#13311)

fatemehp pushed a commit to fatemehp/arrow that referenced this pull request Oct 17, 2022
…es (apache#14275)

kou pushed a commit that referenced this pull request Oct 25, 2022
…14498)

Remove all instances of the `ARROW_BUILD_DIR` environment variable that was added in the PyArrow refactoring PR #13311.

Authored-by: Alenka Frim <frim.alenka@gmail.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>