Unable to build TensorFlow from source with avx2 configuration. #59563

Closed
ke1ding opened this issue Feb 4, 2023 · 5 comments
Labels: subtype: ubuntu/linux (Ubuntu/Linux Build/Installation Issues), TF 2.11 (Issues related to TF 2.11), type:build/install (Build and install issues)


ke1ding commented Feb 4, 2023

Issue Type

Bug

Have you reproduced the bug with TF nightly?

Yes

Source

source

Tensorflow Version

master and r2.12

Custom Code

No

OS Platform and Distribution

Linux Ubuntu 20.04.5 LTS (Focal Fossa)

Mobile device

No response

Python version

Python 3.8.10

Bazel version

bazel 5.3.0

GCC/Compiler version

gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0

CUDA/cuDNN version

No response

GPU model and memory

No response

Current Behaviour?

A bug happened!

Building TensorFlow with an AVX2 configuration (--copt=-march=haswell) fails.
6d71dc3ae89012d8c301b07d8635b41cfe6e6850 is the first bad commit.

The Eigen build also breaks after this commit.

Standalone code to reproduce the issue

python configure.py

bazel --bazelrc=.bazelrc build -c opt --copt=-march=haswell tensorflow/tools/pip_package:build_pip_package

Relevant log output

root@ebed50890c33:~/tensorflow# git bisect good
Bisecting: 5 revisions left to test after this (roughly 3 steps)
[6d71dc3ae89012d8c301b07d8635b41cfe6e6850] Update Eigen to commit:3460f3558e7b469efb8a225894e21929c8c77629
root@ebed50890c33:~/tensorflow# bazel --bazelrc=.bazelrc build -c opt --copt=-march=haswell tensorflow/tools/pip_package:build_pip_package
INFO: Options provided by the client:
  Inherited 'common' options: --isatty=1 --terminal_columns=176
INFO: Reading rc options for 'build' from /root/tensorflow/.bazelrc:
  Inherited 'common' options: --experimental_repo_remote_exec
INFO: Reading rc options for 'build' from /root/tensorflow/.bazelrc:
  'build' options: --define framework_shared_object=true --define tsl_protobuf_header_only=true --define=use_fast_cpp_protos=true --define=allow_oversize_protos=true --spawn_strategy=standalone -c opt --announce_rc --define=grpc_no_ares=true --noincompatible_remove_legacy_whole_archive --enable_platform_specific_config --define=with_xla_support=true --config=short_logs --config=v2 --define=no_aws_support=true --define=no_hdfs_support=true --experimental_cc_shared_library --experimental_link_static_libraries_once=false --incompatible_enforce_config_setting_visibility
INFO: Reading rc options for 'build' from /root/tensorflow/.tf_configure.bazelrc:
  'build' options: --action_env PYTHON_BIN_PATH=/usr/local/bin/python --action_env PYTHON_LIB_PATH=/usr/lib/python3/dist-packages --python_path=/usr/local/bin/python
INFO: Reading rc options for 'build' from /root/tensorflow/.bazelrc:
  'build' options: --deleted_packages=tensorflow/compiler/mlir/tfrt,tensorflow/compiler/mlir/tfrt/benchmarks,tensorflow/compiler/mlir/tfrt/jit/python_binding,tensorflow/compiler/mlir/tfrt/jit/transforms,tensorflow/compiler/mlir/tfrt/python_tests,tensorflow/compiler/mlir/tfrt/tests,tensorflow/compiler/mlir/tfrt/tests/ir,tensorflow/compiler/mlir/tfrt/tests/analysis,tensorflow/compiler/mlir/tfrt/tests/jit,tensorflow/compiler/mlir/tfrt/tests/lhlo_to_tfrt,tensorflow/compiler/mlir/tfrt/tests/lhlo_to_jitrt,tensorflow/compiler/mlir/tfrt/tests/tf_to_corert,tensorflow/compiler/mlir/tfrt/tests/tf_to_tfrt_data,tensorflow/compiler/mlir/tfrt/tests/saved_model,tensorflow/compiler/mlir/tfrt/transforms/lhlo_gpu_to_tfrt_gpu,tensorflow/core/runtime_fallback,tensorflow/core/runtime_fallback/conversion,tensorflow/core/runtime_fallback/kernel,tensorflow/core/runtime_fallback/opdefs,tensorflow/core/runtime_fallback/runtime,tensorflow/core/runtime_fallback/util,tensorflow/core/tfrt/eager,tensorflow/core/tfrt/eager/backends/cpu,tensorflow/core/tfrt/eager/backends/gpu,tensorflow/core/tfrt/eager/core_runtime,tensorflow/core/tfrt/eager/cpp_tests/core_runtime,tensorflow/core/tfrt/gpu,tensorflow/core/tfrt/run_handler_thread_pool,tensorflow/core/tfrt/runtime,tensorflow/core/tfrt/saved_model,tensorflow/core/tfrt/graph_executor,tensorflow/core/tfrt/saved_model/tests,tensorflow/core/tfrt/tpu,tensorflow/core/tfrt/utils
INFO: Reading rc options for 'build' from /root/.mkl.bazelrc:
  'build' options: --cxxopt=-D_GLIBCXX_USE_CXX11_ABI=0 --copt=-O3 --copt=-Wformat --copt=-Wformat-security --copt=-fstack-protector --copt=-fPIC --copt=-fpic --linkopt=-znoexecstack --linkopt=-zrelro --linkopt=-znow --linkopt=-fstack-protector --config=mkl --copt=-march=skylake-avx512
INFO: Found applicable config definition build:short_logs in file /root/tensorflow/.bazelrc: --output_filter=DONT_MATCH_ANYTHING
INFO: Found applicable config definition build:v2 in file /root/tensorflow/.bazelrc: --define=tf_api_version=2 --action_env=TF2_BEHAVIOR=1
INFO: Found applicable config definition build:mkl in file /root/tensorflow/.bazelrc: --define=build_with_mkl=true --define=enable_mkl=true --define=tensorflow_mkldnn_contraction_kernel=0 --define=build_with_openmp=true -c opt
INFO: Found applicable config definition build:linux in file /root/tensorflow/.bazelrc: --host_copt=-w --copt=-Wno-all --copt=-Wno-extra --copt=-Wno-deprecated --copt=-Wno-deprecated-declarations --copt=-Wno-ignored-attributes --copt=-Wno-array-bounds --copt=-Wunused-result --copt=-Werror=unused-result --copt=-Wswitch --copt=-Werror=switch --copt=-Wno-error=unused-but-set-variable --define=PREFIX=/usr --define=LIBDIR=$(PREFIX)/lib --define=INCLUDEDIR=$(PREFIX)/include --define=PROTOBUF_INCLUDE_PATH=$(PREFIX)/include --cxxopt=-std=c++17 --host_cxxopt=-std=c++17 --config=dynamic_kernels --distinct_host_configuration=false --experimental_guard_against_concurrent_changes
INFO: Found applicable config definition build:dynamic_kernels in file /root/tensorflow/.bazelrc: --define=dynamic_loaded_kernels=true --copt=-DAUTOLOAD_DYNAMIC_KERNELS
INFO: Analyzed target //tensorflow/tools/pip_package:build_pip_package (3 packages loaded, 5645 targets configured).
INFO: Found 1 target...
ERROR: /root/tensorflow/tensorflow/tsl/framework/contraction/BUILD:110:11: Compiling tensorflow/tsl/framework/contraction/eigen_contraction_kernel.cc failed: (Exit 1): gcc failed: error executing command /usr/bin/gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer -g0 -O2 '-D_FORTIFY_SOURCE=1' -DNDEBUG -ffunction-sections ... (remaining 60 arguments skipped)
In file included from ./tensorflow/tsl/framework/fixedpoint/FixedPoint.h:34,
                 from ./tensorflow/tsl/framework/contraction/eigen_contraction_kernel.h:37,
                 from tensorflow/tsl/framework/contraction/eigen_contraction_kernel.cc:16:
./tensorflow/tsl/framework/fixedpoint/MatMatProductAVX2.h: In member function 'void Eigen::internal::gemm_pack_lhs<Eigen::QInt16, Index, DataMapper, Pack1, Pack2, Eigen::QInt16, 0, Conjugate, PanelMode>::operator()(Eigen::QInt16*, const DataMapper&, Index, Index, Index, Index)':
./tensorflow/tsl/framework/fixedpoint/MatMatProductAVX2.h:176:5: error: there are no arguments to 'assert' that depend on a template parameter, so a declaration of 'assert' must be available [-fpermissive]
  176 |     assert(false &&
      |     ^~~~~~
./tensorflow/tsl/framework/fixedpoint/MatMatProductAVX2.h:176:5: note: (if you use '-fpermissive', G++ will accept your code, but allowing the use of an undeclared name is deprecated)
./tensorflow/tsl/framework/fixedpoint/MatMatProductAVX2.h: In member function 'void Eigen::internal::gemm_pack_rhs<Eigen::QInt16, Index, DataMapper, nr, 0, Conjugate, PanelMode>::operator()(Eigen::QInt16*, const DataMapper&, Index, Index, Index, Index)':
./tensorflow/tsl/framework/fixedpoint/MatMatProductAVX2.h:262:5: error: there are no arguments to 'assert' that depend on a template parameter, so a declaration of 'assert' must be available [-fpermissive]
  262 |     assert(false &&
      |     ^~~~~~
./tensorflow/tsl/framework/fixedpoint/MatMatProductAVX2.h: In member function 'void Eigen::internal::gebp_kernel<Eigen::QInt16, Eigen::QInt16, Index, DataMapper, mr, nr, ConjugateLhs, ConjugateRhs>::operator()(const DataMapper&, const Eigen::QInt16*, const Eigen::QInt16*, Index, Index, Index, Eigen::QInt32, Index, Index, Index, Index)':
./tensorflow/tsl/framework/fixedpoint/MatMatProductAVX2.h:363:5: error: there are no arguments to 'assert' that depend on a template parameter, so a declaration of 'assert' must be available [-fpermissive]
  363 |     assert(false &&
      |     ^~~~~~
Target //tensorflow/tools/pip_package:build_pip_package failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 9.326s, Critical Path: 6.23s
INFO: 179 processes: 115 internal, 64 local.
FAILED: Build did NOT complete successfully
@bhavani-subramanian
Contributor

Tagging @penpornk and @cantonios.

@ke1ding
Author

ke1ding commented Feb 4, 2023

The Eigen tests also break after the Eigen update.

Before:
Executed 4597 out of 4623 tests: 4050 tests pass, 26 fail to build and 547 fail locally.

After:
Executed 2640 out of 4629 tests: 2227 tests pass, 1989 fail to build and 413 fail locally.

@cantonios
Contributor

@ke1ding Thanks for letting me know. Something must be defining EIGEN_NO_DEBUG, and some code then uses assert without first including <cassert>.

Previously, Eigen included the <cassert> header unconditionally. However, we found that using assert in header files (Eigen being a header-only library) causes ODR issues that prevent it from compiling with C++20 modules. So Eigen now completely disables asserts if EIGEN_NO_DEBUG is defined, and we no longer include the header.

It seems that some files within TensorFlow use plain assert(...) without including the header themselves. I'll fix these. In the meantime, you can check your build flags and remove EIGEN_NO_DEBUG.
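
For readers hitting the same error, here is a minimal sketch of the failure mode described above. It assumes a hypothetical templated functor, PackLhs, standing in for the packing kernels in MatMatProductAVX2.h; it is not TensorFlow or Eigen code. The point it illustrates is that a header using assert inside a template should include <cassert> itself rather than rely on another header to do so.

```cpp
// Sketch only: PackLhs is a hypothetical name, not a TensorFlow or Eigen API.
#include <cassert>  // Declares the assert macro; do not rely on Eigen for it.
#include <cstddef>

// Hypothetical stand-in for a templated packing kernel defined in a header.
template <typename Scalar>
struct PackLhs {
  // Returns the number of elements that would be packed.
  std::size_t operator()(std::size_t rows, std::size_t depth) const {
    // The argument below does not depend on Scalar, so the name `assert`
    // is looked up when the template is defined. If <cassert> were not
    // included above, GCC would report: "there are no arguments to 'assert'
    // that depend on a template parameter, so a declaration of 'assert'
    // must be available".
    assert(depth % 2 == 0 && "depth must be even in this sketch");
    return rows * depth;
  }
};
```

With the include in place, a call such as PackLhs<short>()(4, 8) compiles regardless of which other headers happen to pull in <cassert>.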

@ke1ding
Author

ke1ding commented Feb 4, 2023

@cantonios Thanks

tiruk007 added the subtype: ubuntu/linux, TF 2.11, and type:build/install labels and removed the type:bug label on Feb 6, 2023.
tensorflow deleted a comment from saurabhmj11 on Feb 6, 2023.

tensorflow-jenkins pushed a commit that referenced this issue Feb 28, 2023
The plain assert macro can lead to ODR violations when used in headers.  Fixes #59563.

PiperOrigin-RevId: 507500117
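
To illustrate the ODR concern in that commit message: assert expands differently depending on whether NDEBUG is defined, so an inline or templated function in a header can end up with different bodies in translation units compiled with different flags. The following is a minimal sketch of one way to sidestep that, assuming a hypothetical helper named check_invariant whose definition never depends on NDEBUG. It is an illustration of the general idea, not the change made in the referenced commit, whose contents are not shown in this thread.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: the body never depends on NDEBUG, so every
// translation unit that includes this header sees the same definition.
inline void check_invariant(bool condition, const char* message) {
  if (!condition) {
    std::fprintf(stderr, "invariant failed: %s\n", message);
    std::abort();
  }
}

// Hypothetical header-only function that validates its input before use.
template <typename T>
inline T halve_even(T value) {
  check_invariant(value % 2 == 0, "value must be even");
  return value / 2;
}
```

The trade-off in this sketch is that the check always runs, including in optimized builds, whereas assert compiles away when NDEBUG is defined.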