Build fails on Linux with GCC 9, 10, and 12 #63

Closed
wookayin opened this issue Nov 13, 2022 · 17 comments

@wookayin

wookayin commented Nov 13, 2022

https://google.github.io/tensorstore/installation.html says Build dependencies are GCC 9 or later, but it seems that tensorstore cannot be built using GCC 9 or 10. What is the minimum required version of GCC/G++?

Environment: Ubuntu 20.04 Focal LTS
tensorstore: 0.1.28 (08-11-2022)

GCC 9.x: ./tensorstore/box.h:214:44: error: expected template-name before '<' token
  Use --sandbox_debug to see verbose messages from the sandbox
  In file included from ./tensorstore/box.h:30,
                   from ./tensorstore/internal/box_difference.h:20,
                   from tensorstore/internal/box_difference.cc:15:
  ./tensorstore/internal/multi_vector.h: In substitution of 'template<long int Extent, class ... Ts> using MultiVectorStorage = tensorstore::internal::MultiVectorStorageImpl<tensorstore::RankConstraint::FromInlineRank(Extent).tensorstore::RankConstraint::operator tensorstore::DimensionIndex(), tensorstore::InlineRankLimit(Extent), Ts ...> [with long int Extent = Rank; Ts = {long int, long int}]':
  ./tensorstore/box.h:137:67:   required from here
  ./tensorstore/internal/multi_vector.h:105:58: error: taking address of rvalue [-fpermissive]
    105 |     MultiVectorStorageImpl<RankConstraint::FromInlineRank(Extent),
        |                            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~
  ./tensorstore/internal/multi_vector.h:106:58: error: no matching function for call to 'tensorstore::RankConstraint::operator tensorstore::DimensionIndex(tensorstore::RankConstraint*)'
    106 |                            InlineRankLimit(Extent), Ts...>;
        |                                                          ^
  In file included from ./tensorstore/internal/multi_vector.h:29,
                   from ./tensorstore/box.h:30,
                   from ./tensorstore/internal/box_difference.h:20,
                   from tensorstore/internal/box_difference.cc:15:
  ./tensorstore/rank.h:127:13: note: candidate: 'constexpr tensorstore::RankConstraint::operator tensorstore::DimensionIndex() const'
    127 |   constexpr operator DimensionIndex() const { return rank; }
        |             ^~~~~~~~
  ./tensorstore/rank.h:127:13: note:   candidate expects 0 arguments, 1 provided
  In file included from ./tensorstore/internal/box_difference.h:20,
                   from tensorstore/internal/box_difference.cc:15:
  ./tensorstore/box.h:214:44: error: expected template-name before '<' token
    214 | class Box : public internal_box::BoxStorage<Rank> {
        |                                            ^
  ./tensorstore/box.h:214:44: error: expected '{' before '<' token
  cc1plus: warning: unrecognized command line option '-Wno-unknown-warning-option'
  Target //python/tensorstore:_tensorstore__shared_objects failed to build
GCC 10.x: python_headers/object.h:136:30: error: lvalue required as left operand of assignment
  ERROR: /tmp/pip-install-ltft0rab/tensorstore_0bc800fcb17142a7ba2120da8a360a3f/python/tensorstore/BUILD:381:20: Compiling python/tensorstore/bfloat16.cc failed:
 (Exit 1): gcc-10 failed: error executing command                                                                                                                
    (cd $HOME/.cache/bazel/_bazel_$USER/4694eb6f528e602d9a898e0775c25c1f/sandbox/linux-sandbox/1851/execroot/com_google_tensorstore && \                   
    exec env - \                                                                                                                                                 
      PATH=/bin:/usr/bin:/usr/local/bin \                                                                                                                        
      PWD=/proc/self/cwd \                                                                                                                                       

  (... omitted...)
                                                                                               
  In file included from bazel-out/k8-opt/bin/external/local_config_python/_virtual_includes/python_headers/Python.h:44,                                          
                   from bazel-out/k8-opt/bin/external/com_github_pybind_pybind11/_virtual_includes/pybind11/pybind11/detail/../detail/common.h:208,              
                   from bazel-out/k8-opt/bin/external/com_github_pybind_pybind11/_virtual_includes/pybind11/pybind11/detail/../attr.h:13,                        
                   from bazel-out/k8-opt/bin/external/com_github_pybind_pybind11/_virtual_includes/pybind11/pybind11/detail/class.h:12,                          
                   from bazel-out/k8-opt/bin/external/com_github_pybind_pybind11/_virtual_includes/pybind11/pybind11/pybind11.h:13,                              
                   from ./python/tensorstore/numpy.h:35,                                                                                                         
                   from python/tensorstore/bfloat16.cc:15:                                                                                                       
  python/tensorstore/bfloat16.cc: In function 'bool tensorstore::internal_python::{anonymous}::Initialize()':                                                    
  bazel-out/k8-opt/bin/external/local_config_python/_virtual_includes/python_headers/object.h:136:30: error: lvalue required as left operand of assignment       
    136 | #  define Py_TYPE(ob) Py_TYPE(_PyObject_CAST(ob))                                                                                                      
        |                       ~~~~~~~^~~~~~~~~~~~~~~~~~~~
  python/tensorstore/bfloat16.cc:774:3: note: in expansion of macro 'Py_TYPE'
    774 |   Py_TYPE(&NPyBfloat16_Descr) = &PyArrayDescr_Type;
        |   ^~~~~~~
  At global scope:
  cc1plus: note: unrecognized command-line option '-Wno-unknown-warning-option' may have been intended to silence earlier diagnostics
  Target //python/tensorstore:_tensorstore__shared_objects failed to build
GCC 12.x: com_google_boringssl/src/crypto/refcount_c11.c:29:15: error: expected declaration specifiers or '...'
In file included from external/com_google_boringssl/src/crypto/refcount_c11.c:15:
external/com_google_boringssl/src/crypto/internal.h: In function 'CRYPTO_load_word_be':
external/com_google_boringssl/src/crypto/internal.h:923:3: warning: implicit declaration of function 'static_assert' [-Wimplicit-function-declaration]
  923 |   static_assert(sizeof(v) == 8, "crypto_word_t has unexpected size");
      |   ^~~~~~~~~~~~~
In file included from external/com_google_boringssl/src/crypto/internal.h:130:
external/com_google_boringssl/src/crypto/refcount_c11.c: At top level:
external/com_google_boringssl/src/crypto/refcount_c11.c:29:15: error: expected declaration specifiers or '...' before '_Alignof'
   29 | static_assert(alignof(CRYPTO_refcount_t) == alignof(_Atomic CRYPTO_refcount_t),
      |               ^~~~~~~
external/com_google_boringssl/src/crypto/refcount_c11.c:30:15: error: expected declaration specifiers or '...' before string constant
   30 |               "_Atomic alters the needed alignment of a reference count");
      |               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
external/com_google_boringssl/src/crypto/refcount_c11.c:31:15: error: expected declaration specifiers or '...' before 'sizeof'
   31 | static_assert(sizeof(CRYPTO_refcount_t) == sizeof(_Atomic CRYPTO_refcount_t),
      |               ^~~~~~
external/com_google_boringssl/src/crypto/refcount_c11.c:32:15: error: expected declaration specifiers or '...' before string constant
   32 |               "_Atomic alters the size of a reference count");
      |               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
external/com_google_boringssl/src/crypto/refcount_c11.c:34:15: error: expected declaration specifiers or '...' before '(' token
   34 | static_assert((CRYPTO_refcount_t)-1 == CRYPTO_REFCOUNT_MAX,
      |               ^
external/com_google_boringssl/src/crypto/refcount_c11.c:35:15: error: expected declaration specifiers or '...' before string constant
   35 |               "CRYPTO_REFCOUNT_MAX is incorrect");
      |               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: note: unrecognized command-line option '-Wno-unknown-warning-option' may have been intended to silence earlier diagnostics
Target //python/tensorstore:_tensorstore__shared_objects failed to build
INFO: Elapsed time: 41.252s, Critical Path: 27.78s
INFO: 1243 processes: 162 internal, 1081 linux-sandbox.
FAILED: Build did NOT complete successfully

Command I ran in the above: pip install tensorstore

@laramiel
Collaborator

laramiel commented Nov 14, 2022

I think that our current build machine uses gcc 12.2.0, as does my workspace (gcc (Debian 12.2.0-3) 12.2.0).
Can you run the following in each build environment and tell me what it says:

./bazelisk.py build tensorstore/...
`./bazelisk.py info output_base`/external/local_config_cc/cc_wrapper.sh --version

Were these errors from pip install tensorstore? What is your OS, and what is the rest of your environment like? We should have prebuilt packages from GitHub CI.

@wookayin
Author

cc_wrapper.sh --version prints the same version as the gcc --version found on $PATH.

(on the master branch)

GCC 10.3 (The system g++ installed on this Linux machine):

❯❯❯ which -a gcc g++
gcc: aliased to nocorrect gcc
/bin/gcc
/usr/bin/gcc
/bin/g++
/usr/bin/g++

❯❯❯ $(./bazelisk.py info output_base)/external/local_config_cc/cc_wrapper.sh --version
gcc (Ubuntu 10.3.0-1ubuntu1~20.04) 10.3.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

GCC 12.2 (conda gcc/g++)

❯❯❯ which g++
$HOME/.miniforge3/bin/g++         # g++ (conda-forge gcc 12.2.0-19) 12.2.0

❯❯❯ $(./bazelisk.py info output_base)/external/local_config_cc/cc_wrapper.sh --version 

x86_64-conda-linux-gnu-cc (conda-forge gcc 12.2.0-19) 12.2.0
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

With those bazel commands I somehow run into a different error:

ERROR: .../tensorstore/tensorstore/serialization/BUILD:64:20: Linking tensorstore/serialization/function_test failed: (Exit 1): x86_64-conda-linux-gnu-cc failed: error executing command $CONDA_PREFIX/bin/x86_64-conda-linux-gnu-cc @bazel-out/k8-fastbuild/bin/tensorstore/serialization/function_test-2.params

Use --sandbox_debug to see verbose messages from the sandbox
bazel-out/k8-fastbuild/bin/_solib_k8/libexternal_Scom_Ugoogle_Uabsl_Sabsl_Stime_Slibtime.so: error: undefined reference to 'clock_gettime'

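This particular error seems related to the -lrt flag: since glibc 2.17, clock_gettime is provided by libc itself and no longer requires linking with -lrt, whereas older glibc versions still need it.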
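To check which side of that 2.17 boundary a machine is on, a small sketch like the following can help. The helper names (parse_glibc_version, clock_gettime_needs_librt, running_glibc_version) are made up for illustration. It also only reports the glibc of the running system, which is an important assumption: a conda toolchain may link against its own bundled sysroot instead, so the result is a hint rather than a diagnosis.

```python
import os
import re
from typing import Optional, Tuple


def parse_glibc_version(text: str) -> Tuple[int, int]:
    """Extract (major, minor) from a string such as 'glibc 2.31'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no glibc version found in: {text!r}")
    return int(match.group(1)), int(match.group(2))


def clock_gettime_needs_librt(glibc_version: Tuple[int, int]) -> bool:
    """clock_gettime moved from librt into libc in glibc 2.17."""
    return glibc_version < (2, 17)


def running_glibc_version() -> Optional[Tuple[int, int]]:
    """glibc version of the running system, or None if it cannot be determined."""
    try:
        # CS_GNU_LIBC_VERSION is a glibc-specific confstr name (e.g. 'glibc 2.31').
        return parse_glibc_version(os.confstr("CS_GNU_LIBC_VERSION"))
    except (AttributeError, ValueError, OSError, TypeError):
        return None
```

If running_glibc_version() reports 2.17 or newer but the link still fails, that suggests the compiler is linking against an older glibc than the one installed on the system.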

Note that the error messages in the OP occurred during installation via pip install . (which invokes bazel as python -u bazelisk.py build -c opt //python/tensorstore:_tensorstore__shared_objects --verbose_failures --copt=-fvisibility=hidden). In this case the same gcc and g++ on $PATH were used.

@jbms
Collaborator

jbms commented Nov 14, 2022

The gcc 9 failures are known and expected --- we just need to change the documentation.

The other issues ideally can be solved. Which Python version?

@wookayin
Author

wookayin commented Nov 14, 2022

I see, so the documentation could be updated. Having GCC 12 as the standard, recommended version sounds good, but because newer GCC versions can be incompatible with CUDA (https://stackoverflow.com/questions/6622454/cuda-incompatible-with-my-gcc-version), I hope that at least GCC 10.x can be supported. I think the conda-shipped gcc 12.2 should work, but the Linux system I'm using has a somewhat old libstdc++ and glibc.

Python: the build fails on both Python 3.10 and 3.11. I was actually trying to build tensorstore for Python 3.11 because no prebuilt wheel is available (it would also be great if official py311 support could be added!), but the build also fails on Python 3.10 in the same environment. In this issue, I am mainly hoping that the C++ build commands and/or documentation can be improved so that the bazel build works across different environment configurations.

@laramiel
Collaborator

The clock_gettime failure is from Abseil; maybe we should raise a bug there. I don't see any -lrt conditional flags in the build, and I haven't looked into which gcc / glibc versions require it.

I should try to set up an image to repro this. What does uname -a print? You're running Ubuntu; how did you install gcc?

@wookayin
Author

Yes, my env is Ubuntu 20.04 LTS (focal), with GCC 10 installed from the Ubuntu packages.
$ uname -a
Linux HOSTNAME 5.4.0-131-generic #147-Ubuntu SMP Fri Oct 14 17:07:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

$ cat /etc/lsb-release                                                                                                                  
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.5 LTS"

$ sudo apt install gcc-10 g++-10
# (Optional)
$ sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-10 10
$ sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-10 10

copybara-service bot pushed a commit that referenced this issue Nov 19, 2022
* gcc 10 build fails referencing static_rank field.
* Change minimum gcc version to g++-10

These should help with #63

PiperOrigin-RevId: 489580988
Change-Id: I603f0aa94da7d2789220621c713911f838dbf6b3
wookayin added a commit to wookayin/flax that referenced this issue Dec 7, 2022
tensorstore is an optional dependency for flax. In google#2520 it was added
back as the normal dependency, but on some Linux and macOS environments
tensorstore still fails to build (see google#2341, google/tensorstore#63).
@sameeul

sameeul commented Feb 17, 2023

Just wanted to report that, with GCC 10.4+ and GCC 11, the build process (both python setup.py develop and building executables from the examples) fails because boringssl fails to build. I can build cleanly with GCC 10.2 and GCC 10.3. I know this is not exactly a tensorstore issue but rather an issue coming from one of its dependencies. I was just curious whether any of you have come across this, apart from the author of this issue (the GCC 12.x log).

@jbms
Collaborator

jbms commented Feb 18, 2023

@sameeul Are you also using gcc from conda?

It appears that it lacks proper C11 support, which is required by boringssl. For example, after installing conda-forge:

echo -e "#include <assert.h>\nstatic_assert(1);" > test.c
gcc -std=c11 -c test.c

Fails with

t.c:2:15: error: expected declaration specifiers or '...' before numeric constant
    2 | static_assert(1);
      |               ^

In particular, if you look at mambaforge/x86_64-conda-linux-gnu/sysroot/usr/include/assert.h you will see that it is a very old version that lacks a #define static_assert line.
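A quick way to inspect a given assert.h for that line is sketched below. It is a rough check only: the function names are made up for illustration, the path you pass in is whatever sysroot header you want to inspect, and the regular expression simply looks for a #define of static_assert without evaluating any surrounding #if conditions.

```python
import re
from pathlib import Path

# Matches lines such as "#define static_assert ..." or "# define static_assert ...".
_STATIC_ASSERT_DEFINE = re.compile(
    r"^[ \t]*#[ \t]*define[ \t]+static_assert\b", re.MULTILINE
)


def defines_static_assert(header_text: str) -> bool:
    """Return True if the header text contains a #define for static_assert."""
    return _STATIC_ASSERT_DEFINE.search(header_text) is not None


def header_defines_static_assert(path: str) -> bool:
    """Read the header at `path` and check it for a static_assert #define."""
    return defines_static_assert(Path(path).read_text(errors="replace"))
```

For example, header_defines_static_assert("/usr/include/assert.h") can be compared against the result for the assert.h inside the conda sysroot.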

@jbms
Collaborator

jbms commented Feb 18, 2023

I filed upstream issue: conda-forge/linux-sysroot-feedstock#44

@sameeul

sameeul commented Feb 18, 2023

@jbms: Yes, I was using conda gcc in the cases where my build was failing. The cases where it worked used compilers that came directly from the debian-bullseye repo. Thanks for looking into it.

@alcarazolabs

Can someone help fix this problem? I'm trying to install tensorstore on an NVIDIA Jetson Orin Nano (arm64):

(.myenv) orin@ubuntu:~/Documents/tensorstore$ pip install tensorstore
Collecting tensorstore
  Using cached tensorstore-0.1.51.tar.gz (6.4 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: numpy>=1.16.0 in /home/orin/Documents/whisper/whisperjax/.myenv/lib/python3.9/site-packages (from tensorstore) (1.26.2)
Collecting ml-dtypes>=0.3.1 (from tensorstore)
  Using cached ml_dtypes-0.3.1-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata (20 kB)
Using cached ml_dtypes-0.3.1-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (202 kB)
Building wheels for collected packages: tensorstore
  Building wheel for tensorstore (pyproject.toml) ... error
  error: subprocess-exited-with-error
  
  × Building wheel for tensorstore (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [3245 lines of output]
      WARNING setuptools_scm._integration.setuptools pyproject.toml does not contain a tool.setuptools_scm section
      WARNING setuptools_scm.pyproject_reading toml section missing 'pyproject.toml does not contain a tool.setuptools_scm section'
      running bdist_wheel
      running build
      running build_py
      creating /tmp/tmpdqjypyim/lib.linux-aarch64-cpython-39
      creating /tmp/tmpdqjypyim/lib.linux-aarch64-cpython-39/tensorstore
      copying python/tensorstore/__init__.py -> /tmp/tmpdqjypyim/lib.linux-aarch64-cpython-39/tensorstore
      running build_ext
      /home/orin/Documents/whisper/whisperjax/.myenv/bin/python3.9 -u bazelisk.py build -c opt //python/tensorstore:_tensorstore__shared_objects --verbose_failures --action_env=PATH=/home/orin/Documents/whisper/whisperjax/.myenv/bin:/usr/local/cuda-11.4/bin:/home/orin/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin --copt=-fvisibility=hidden
      Starting local Bazel server and connecting to it...
      WARNING: ignoring LD_PRELOAD in environment.
      Loading:
      Loading:
      Loading:
      Loading: 0 packages loaded
      Analyzing: target //python/tensorstore:_tensorstore__shared_objects (1 packages loaded, 0 targets configured)
      Analyzing: target //python/tensorstore:_tensorstore__shared_objects (31 packages loaded, 9 targets configured)
      Analyzing: target //python/tensorstore:_tensorstore__shared_objects (34 packages loaded, 127 targets configured)
      Analyzing: target //python/tensorstore:_tensorstore__shared_objects (83 packages loaded, 290 targets configured)
      Analyzing: target //python/tensorstore:_tensorstore__shared_objects (131 packages loaded, 372 targets configured)
      Analyzing: target //python/tensorstore:_tensorstore__shared_objects (140 packages loaded, 1436 targets configured)
      Analyzing: target //python/tensorstore:_tensorstore__shared_objects (149 packages loaded, 2013 targets configured)
      Analyzing: target //python/tensorstore:_tensorstore__shared_objects (155 packages loaded, 2359 targets configured)
      Analyzing: target //python/tensorstore:_tensorstore__shared_objects (163 packages loaded, 3649 targets configured)
      Analyzing: target //python/tensorstore:_tensorstore__shared_objects (169 packages loaded, 4790 targets configured)
      Analyzing: target //python/tensorstore:_tensorstore__shared_objects (194 packages loaded, 6966 targets configured)
      Analyzing: target //python/tensorstore:_tensorstore__shared_objects (249 packages loaded, 7865 targets configured)
      INFO: Analyzed target //python/tensorstore:_tensorstore__shared_objects (250 packages loaded, 8109 targets configured).
       checking cached actions
......

      ./tensorstore/box.h:676:78: error: no matching function for call to 'tensorstore::RankConstraint::operator tensorstore::DimensionIndex(tensorstore::RankConstraint*)'
        676 | BoxView(const Box<Rank>& box) -> BoxView<RankConstraint::FromInlineRank(Rank)>;
            |                                                                              ^
      In file included from ./tensorstore/json_serialization_options_base.h:19,
                       from ./tensorstore/internal/json_binding/bindable.h:23,
                       from ./tensorstore/context_impl_base.h:37,
                       from ./tensorstore/context.h:29,
                       from ./tensorstore/kvstore/s3/s3_resource.h:25,
                       from tensorstore/kvstore/s3/s3_resource.cc:15:
      ./tensorstore/rank.h:126:13: note: candidate: 'constexpr tensorstore::RankConstraint::operator tensorstore::DimensionIndex() const'
        126 |   constexpr operator DimensionIndex() const { return rank; }
            |             ^~~~~~~~
      ./tensorstore/rank.h:126:13: note:   candidate expects 0 arguments, 1 provided
      cc1plus: warning: unrecognized command line option '-Wno-unknown-warning-option'
      Target //python/tensorstore:_tensorstore__shared_objects failed to build
      INFO: Elapsed time: 686.598s, Critical Path: 38.57s
      INFO: 2701 processes: 531 internal, 2166 linux-sandbox, 4 local.
      FAILED: Build did NOT complete successfully
      error: command '/home/orin/Documents/whisper/whisperjax/.myenv/bin/python3.9' failed with exit code 1
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for tensorstore
Failed to build tensorstore
ERROR: Could not build wheels for tensorstore, which is required to install pyproject.toml-based projects
(.myenv) orin@ubuntu:~/Documents/tensorstore$ 

I would appreciate any suggestions. Thanks so much.

@jbms
Collaborator

jbms commented Dec 7, 2023

What is your compiler version?

@alcarazolabs

alcarazolabs commented Dec 7, 2023

What is your compiler version?

gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0

A few minutes ago I updated gcc to 10.5, but now 9.4.0 shows up again.

I did:

sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt-get update
sudo apt-get install gcc-10

sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-10 10

@jbms
Collaborator

jbms commented Dec 7, 2023

It should work with GCC 10, but it will build using whatever is the default version of gcc and g++ in your path. You may need to do update-alternatives for both gcc and g++, and confirm their versions via gcc -v and g++ -v before trying pip install again.
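One way to run that check before retrying pip install is sketched below. This is only an illustration and not part of tensorstore: the helper names (parse_gcc_version, check_toolchain, compiler_banner) and the MIN_MAJOR constant are made up here, and it assumes the version is the last token on the first line of the --version output, as in the banners pasted in this thread.

```python
import re
import subprocess
from typing import List, Tuple

MIN_MAJOR = 10  # assumed minimum GCC major version, per the discussion in this issue


def parse_gcc_version(banner: str) -> Tuple[int, ...]:
    """Extract the trailing X.Y.Z from the first line of `gcc --version` output."""
    lines = banner.splitlines()
    first_line = lines[0] if lines else ""
    match = re.search(r"(\d+(?:\.\d+)+)\s*$", first_line)
    if match is None:
        raise ValueError(f"no version found in: {first_line!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def check_toolchain(gcc_banner: str, gxx_banner: str) -> List[str]:
    """Return a list of problems; an empty list means gcc and g++ look consistent."""
    problems = []
    gcc = parse_gcc_version(gcc_banner)
    gxx = parse_gcc_version(gxx_banner)
    if gcc[0] < MIN_MAJOR:
        problems.append(f"gcc {'.'.join(map(str, gcc))} is older than {MIN_MAJOR}")
    if gxx[0] < MIN_MAJOR:
        problems.append(f"g++ {'.'.join(map(str, gxx))} is older than {MIN_MAJOR}")
    if gcc != gxx:
        problems.append("gcc and g++ report different versions")
    return problems


def compiler_banner(name: str) -> str:
    """Run `<name> --version` for whichever compiler is first on PATH."""
    result = subprocess.run(
        [name, "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling check_toolchain(compiler_banner("gcc"), compiler_banner("g++")) returns an empty list when both compilers are at least GCC 10 and report the same version.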

@alcarazolabs

It should work with GCC 10, but it will build using whatever is the default version of gcc and g++ in your path. You may need to do update-alternatives for both gcc and g++, and confirm their versions via gcc -v and g++ -v before trying pip install again.

Thanks for the reply. Should I have the same version for gcc and g++?

Now I'm getting this error:

 # Configuration: 2e75659f83b112ea910d43e089378d23c32e14fbc15d5f986826762c5f40b8be
      # Execution platform: @local_config_platform//:host
      
      Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
      gcc: fatal error: cannot execute 'cc1plus': execvp: No such file or directory
      compilation terminated.
      Target //python/tensorstore:_tensorstore__shared_objects failed to build
      INFO: Elapsed time: 18.191s, Critical Path: 0.20s
      INFO: 440 processes: 436 internal, 1 linux-sandbox, 3 local.
      FAILED: Build did NOT complete successfully
      error: command '/home/orin/Documents/whisper/whisperjax/.myenv/bin/python3.9' failed with exit code 1
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for tensorstore

I have now updated the gcc version, and this time the change is permanent:

gcc (Ubuntu 10.5.0-1ubuntu1~20.04) 10.5.0

However, g++ has this version:

g++ (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0

Should I update g++ as well? Thanks.

@jbms
Collaborator

jbms commented Dec 7, 2023

Yes, you will need to upgrade g++ as well to the same version.

@alcarazolabs

Yes, you will need to upgrade g++ as well to the same version.

Thanks so much, now I was able to install tensorstore.
