[Python] ERROR: Failed building wheel for pyarrow on Linux #36965

Closed
vwbusguy opened this issue Aug 1, 2023 · 9 comments

@vwbusguy

vwbusguy commented Aug 1, 2023

Describe the bug, including details regarding any error messages, version, and platform.

Pyarrow stopped building in our CI a few hours ago (it was still building this morning). The error is: ERROR: Could not build wheels for pyarrow, which is required to install pyproject.toml-based projects

Container source reference: https://github.com/ucsb-pstat/pstat-174-container-image

This is based on Jupyter upstream base images for Ubuntu 22.04.

I believe this is a duplicate of #36963 but was asked to file a separate issue since this isn't FreeBSD.

      -- Running cmake for pyarrow
      cmake -DPYTHON_EXECUTABLE=/opt/conda/bin/python3.11 -DPython3_EXECUTABLE=/opt/conda/bin/python3.11 "" -DPYARROW_BUILD_CUDA=off -DPYARROW_BUILD_FLIGHT=off -DPYARROW_BUILD_GANDIVA=off -DPYARROW_BUILD_DATASET=off -DPYARROW_BUILD_ORC=off -DPYARROW_BUILD_PARQUET=off -DPYARROW_BUILD_PARQUET_ENCRYPTION=off -DPYARROW_BUILD_PLASMA=off -DPYARROW_BUILD_S3=off -DPYARROW_BUILD_HDFS=off -DPYARROW_USE_TENSORFLOW=off -DPYARROW_BUNDLE_ARROW_CPP=off -DPYARROW_BUNDLE_BOOST=off -DPYARROW_GENERATE_COVERAGE=off -DPYARROW_BOOST_USE_SHARED=on -DPYARROW_PARQUET_USE_SHARED=on -DCMAKE_BUILD_TYPE=release /tmp/pip-install-pr092oko/pyarrow_c75dd9cd7746480bb33e6fef1767af9b
      -- The C compiler identification is GNU 11.4.0
      -- The CXX compiler identification is GNU 11.4.0
      -- Detecting C compiler ABI info
      -- Detecting C compiler ABI info - done
      -- Check for working C compiler: /usr/bin/cc - skipped
      -- Detecting C compile features
      -- Detecting C compile features - done
      -- Detecting CXX compiler ABI info
      -- Detecting CXX compiler ABI info - done
      -- Check for working CXX compiler: /usr/bin/c++ - skipped
      -- Detecting CXX compile features
      -- Detecting CXX compile features - done
      -- System processor: x86_64
      -- Performing Test CXX_SUPPORTS_SSE4_2
      -- Performing Test CXX_SUPPORTS_SSE4_2 - Success
      -- Performing Test CXX_SUPPORTS_AVX2
      -- Performing Test CXX_SUPPORTS_AVX2 - Success
      -- Performing Test CXX_SUPPORTS_AVX512
      -- Performing Test CXX_SUPPORTS_AVX512 - Success
      -- Arrow build warning level: PRODUCTION
      Using ld linker
      Configured for RELEASE build (set with cmake -DCMAKE_BUILD_TYPE={release,debug,...})
      -- Build Type: RELEASE
      -- Generator: Unix Makefiles
      -- Build output directory: /tmp/pip-install-pr092oko/pyarrow_c75dd9cd7746480bb33e6fef1767af9b/build/temp.linux-x86_64-cpython-311/release
      -- Found Python3: /opt/conda/bin/python3.11 (found version "3.11.4") found components: Interpreter Development.Module NumPy
      -- Found Python3Alt: /opt/conda/bin/python3.11
      -- Found PkgConfig: /opt/conda/bin/pkg-config (found version "0.29.2")
      -- Could NOT find Arrow (missing: Arrow_DIR)
      -- Checking for module 'arrow'
      --   No package 'arrow' found
      CMake Error at /usr/share/cmake-3.22/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
        Could NOT find Arrow (missing: ARROW_INCLUDE_DIR ARROW_LIB_DIR
        ARROW_FULL_SO_VERSION ARROW_SO_VERSION)
      Call Stack (most recent call first):
        /usr/share/cmake-3.22/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE)
        cmake_modules/FindArrow.cmake:450 (find_package_handle_standard_args)
        cmake_modules/FindArrowPython.cmake:46 (find_package)
        CMakeLists.txt:231 (find_package)
      
      
      -- Configuring incomplete, errors occurred!
      See also "/tmp/pip-install-pr092oko/pyarrow_c75dd9cd7746480bb33e6fef1767af9b/build/temp.linux-x86_64-cpython-311/CMakeFiles/CMakeOutput.log".
      error: command '/usr/bin/cmake' failed with exit code 1
      [end of output]
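
For anyone debugging a similar failure: the "Checking for module 'arrow'" line in the log above is a pkg-config lookup, so one quick check is whether pkg-config can see an Arrow C++ installation at all. Below is a minimal Python sketch, assuming `pkg-config` is on `PATH`. The helper name `arrow_cpp_visible` is made up for illustration, and the sketch does not cover the CMake config-file lookup (`Arrow_DIR`) that the build tries first.

```python
import shutil
import subprocess


def arrow_cpp_visible(module: str = "arrow") -> bool:
    """Return True if pkg-config reports that `module` is installed.

    Hypothetical helper: it only mirrors the pkg-config step seen in the
    log, not the CMake package-config search that precedes it.
    """
    pkg_config = shutil.which("pkg-config")
    if pkg_config is None:
        # Without pkg-config there is nothing to query.
        return False
    # `pkg-config --exists NAME` exits with 0 only when the module is found.
    result = subprocess.run([pkg_config, "--exists", module], check=False)
    return result.returncode == 0


if __name__ == "__main__":
    print("Arrow C++ visible to pkg-config:", arrow_cpp_visible())
```

If this prints `False`, a source build of pyarrow will likely stop at the same configure step unless Arrow C++ is installed or its location is passed to CMake explicitly.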

Component(s)

Python

@kou
Member

kou commented Aug 1, 2023

If you want to use conda, you should use https://anaconda.org/conda-forge/pyarrow .
If you don't want to use https://anaconda.org/conda-forge/pyarrow , you can use https://anaconda.org/conda-forge/libarrow to install Apache Arrow C++.

@vwbusguy
Author

vwbusguy commented Aug 1, 2023

Edit - I'll try installing it with conda instead of pip.

@vwbusguy
Author

vwbusguy commented Aug 1, 2023

Installing via conda gets the same error as via pip. It's curious that this has been building for the past four months up until today on that Containerfile spec.

      --   No package 'arrow-python' found
      CMake Error at /usr/share/cmake-3.22/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
        Could NOT find ArrowPython (missing: ARROW_PYTHON_INCLUDE_DIR
        ARROW_PYTHON_LIB_DIR) (found version "12.0.1")
      Call Stack (most recent call first):
        /usr/share/cmake-3.22/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE)
        cmake_modules/FindArrowPython.cmake:76 (find_package_handle_standard_args)
        CMakeLists.txt:231 (find_package)
      
      
      -- Configuring incomplete, errors occurred!
      See also "/tmp/pip-install-re2qkyh5/pyarrow_d94b37a86fab4d5883549be9791b492c/build/temp.linux-x86_64-cpython-311/CMakeFiles/CMakeOutput.log".
      error: command '/usr/bin/cmake' failed with exit code 1
      [end of output]

@kou
Member

kou commented Aug 1, 2023

https://anaconda.org/conda-forge/pyarrow is a binary package.
So building pyarrow from source is strange. Are you really using https://anaconda.org/conda-forge/pyarrow ?

@vwbusguy
Author

vwbusguy commented Aug 1, 2023

That's a very good point. Let me do some digging.

@vwbusguy
Author

vwbusguy commented Aug 1, 2023

Looks like this error actually happens when installing gluonts, which pulls in pyarrow. Installing pyarrow standalone works fine with either pip or conda. However, installing gluonts from pip tries to build pyarrow from source even if pyarrow is already installed, whereas installing it through conda-forge is content to use the existing binary.

More curious, the build this morning pulled the wheel directly:

Collecting pyarrow~=8.0
 Downloading pyarrow-8.0.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (29.4 MB)
    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 29.4/29.4 MB 289.0 MB/s eta 0:00:00

The builds as of this afternoon, however, started pulling pyarrow in as a tar.gz. Perhaps something happened in PyPI land today, and the source is now being downloaded instead of the binary?

Collecting pyarrow~=8.0 (from gluonts[R,mxnet,pro,torch])
  Downloading pyarrow-8.0.0.tar.gz (846 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 846.6/846.6 kB 114.6 MB/s eta 0:00:00
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
  Preparing metadata (pyproject.toml): started
  Preparing metadata (pyproject.toml): finished with status 'done'

It is also reproducible directly from the current upstream Jupyter image: podman run -it --pull=Always jupyter/r-notebook pip install gluonts[R,mxnet,pro,torch] while installing pyarrow standalone works from the same image: podman run -it --pull=Always jupyter/r-notebook pip install pyarrow

That said, since pyarrow installs fine on its own with pip or conda, I'm much less confident that this is an issue with pyarrow; it might be something funny today with PyPI and/or gluonts. Switching everything to conda-forge works. I'm happy to help with testing things on PyPI if you'd like, but otherwise I think we can close this. Thanks for walking through this with me.

@vwbusguy
Author

vwbusguy commented Aug 1, 2023

I just noticed that gluonts is pulling in an older version (8.0), whereas the standalone install pulls in a newer one (12.0). pip install pyarrow~=8.0 appears to be the problem, and I think I see why: Jupyter is now on Python 3.11, and the newest binaries for 8.0 are for 3.10. I guess the story here is that conda-forge has a 3.11-compatible binary for its gluonts package and PyPI does not, and pyarrow 8.0 appears to be incompatible with Python 3.11 when built from the PyPI source distribution. As to why builds started failing for us today: another dev bumped our base image's Jupyter release stream for R images since this morning, which brought in 3.11 in the process.
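
The two conditions described above (the `~=8.0` pin excludes 12.x, and the 8.0.0 wheel is tagged for CPython 3.10 only) can be sketched in Python. This is a simplified illustration, not pip's actual resolver logic: the helper names are hypothetical, the version check handles plain numeric versions only, and the tag check ignores cases such as `abi3` or `py3` wheels that can match several interpreters.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Parse a plain numeric version such as '8.0.0' (no pre/post releases)."""
    return tuple(int(part) for part in text.split("."))


def matches_compatible_release(candidate: str, base: str) -> bool:
    """Approximate '~=base': candidate >= base, with all but the last
    component of base held fixed (so '~=8.0' means >=8.0 and ==8.*)."""
    cand, ref = parse_version(candidate), parse_version(base)
    # Pad the candidate with zeros so that '8' compares equal to '8.0'.
    cand = cand + (0,) * max(0, len(ref) - len(cand))
    prefix = ref[:-1]
    return cand >= ref and cand[: len(prefix)] == prefix


def wheel_python_tags(filename: str) -> set[str]:
    """Extract the interpreter tags from a wheel filename.

    Wheel names end in '-<python tag>-<abi tag>-<platform tag>.whl', so the
    python tag is the third field from the end.
    """
    fields = filename.removesuffix(".whl").split("-")
    return set(fields[-3].split("."))


# Wheel filename taken from the morning build log in this thread.
WHEEL = "pyarrow-8.0.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"

print(matches_compatible_release("8.0.0", "8.0"))   # True
print(matches_compatible_release("12.0.1", "8.0"))  # False
print("cp310" in wheel_python_tags(WHEEL))          # True
print("cp311" in wheel_python_tags(WHEEL))          # False
```

With 12.x excluded by the pin and no `cp311` wheel available for 8.0.0, pip falls back to the source tarball, which matches the afternoon log. pip's real compatibility check is more thorough than this string comparison, so treat the sketch as a reading aid only.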

@kou
Member

kou commented Aug 1, 2023

OK. I'll close this.

@kou closed this as not planned on Aug 1, 2023
@assignUser
Member

Thanks for the thorough investigation on your end @vwbusguy !
