
C++ compilation of rule '//tensorflow/python:bfloat16_lib' failed #40688

Closed
mmartial opened this issue Jun 22, 2020 · 23 comments

Labels: stat:awaiting tensorflower · subtype:ubuntu/linux · TF 2.2 · type:build/install

Comments
@mmartial

System information

  • OS Platform and Distribution: Linux Ubuntu 18.04 -- building inside Dockerfile with FROM nvidia/cuda:10.1-cudnn7-devel-ubuntu18.04
  • TensorFlow installed from: source
  • TensorFlow version: 2.2.0
  • Python version: 3.6.9
  • Installed using virtualenv? pip? conda?: No
  • Bazel version: 2.0.0 (extracted from _TF_MAX_BAZEL)
  • GCC/Compiler version: 7.4.0
  • CUDA/cuDNN version: 10.1 / 7
  • GPU model and memory: tested on Titan XP and RTX 2070 8GB

Describe the problem

Build fails with

tensorflow/python/lib/core/bfloat16.cc: In function 'bool tensorflow::{anonymous}::Initialize()':
tensorflow/python/lib/core/bfloat16.cc:636:36: error: no match for call to '(tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>) (const char [6], <unresolved overloaded function type>, const std::array<int, 3>&)'
                       compare_types)) {
                                    ^
tensorflow/python/lib/core/bfloat16.cc:610:60: note: candidate: tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>
                             const std::array<int, 3>& types) {
                                                            ^
tensorflow/python/lib/core/bfloat16.cc:610:60: note:   no known conversion for argument 2 from '<unresolved overloaded function type>' to 'PyUFuncGenericFunction {aka void (*)(char**, const long int*, const long int*, void*)}'
tensorflow/python/lib/core/bfloat16.cc:640:36: error: no match for call to '(tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>) (const char [10], <unresolved overloaded function type>, const std::array<int, 3>&)'
                       compare_types)) {
                                    ^
tensorflow/python/lib/core/bfloat16.cc:610:60: note: candidate: tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>
                             const std::array<int, 3>& types) {
                                                            ^
tensorflow/python/lib/core/bfloat16.cc:610:60: note:   no known conversion for argument 2 from '<unresolved overloaded function type>' to 'PyUFuncGenericFunction {aka void (*)(char**, const long int*, const long int*, void*)}'
tensorflow/python/lib/core/bfloat16.cc:643:77: error: no match for call to '(tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>) (const char [5], <unresolved overloaded function type>, const std::array<int, 3>&)'
   if (!register_ufunc("less", CompareUFunc<Bfloat16LtFunctor>, compare_types)) {
                                                                            ^
tensorflow/python/lib/core/bfloat16.cc:610:60: note: candidate: tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>
                             const std::array<int, 3>& types) {
                                                            ^
tensorflow/python/lib/core/bfloat16.cc:610:60: note:   no known conversion for argument 2 from '<unresolved overloaded function type>' to 'PyUFuncGenericFunction {aka void (*)(char**, const long int*, const long int*, void*)}'
tensorflow/python/lib/core/bfloat16.cc:647:36: error: no match for call to '(tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>) (const char [8], <unresolved overloaded function type>, const std::array<int, 3>&)'
                       compare_types)) {
                                    ^
tensorflow/python/lib/core/bfloat16.cc:610:60: note: candidate: tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>
                             const std::array<int, 3>& types) {
                                                            ^
tensorflow/python/lib/core/bfloat16.cc:610:60: note:   no known conversion for argument 2 from '<unresolved overloaded function type>' to 'PyUFuncGenericFunction {aka void (*)(char**, const long int*, const long int*, void*)}'
tensorflow/python/lib/core/bfloat16.cc:651:36: error: no match for call to '(tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>) (const char [11], <unresolved overloaded function type>, const std::array<int, 3>&)'
                       compare_types)) {
                                    ^
tensorflow/python/lib/core/bfloat16.cc:610:60: note: candidate: tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>
                             const std::array<int, 3>& types) {
                                                            ^
tensorflow/python/lib/core/bfloat16.cc:610:60: note:   no known conversion for argument 2 from '<unresolved overloaded function type>' to 'PyUFuncGenericFunction {aka void (*)(char**, const long int*, const long int*, void*)}'
tensorflow/python/lib/core/bfloat16.cc:655:36: error: no match for call to '(tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>) (const char [14], <unresolved overloaded function type>, const std::array<int, 3>&)'
                       compare_types)) {
                                    ^
tensorflow/python/lib/core/bfloat16.cc:610:60: note: candidate: tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>
                             const std::array<int, 3>& types) {
                                                            ^
tensorflow/python/lib/core/bfloat16.cc:610:60: note:   no known conversion for argument 2 from '<unresolved overloaded function type>' to 'PyUFuncGenericFunction {aka void (*)(char**, const long int*, const long int*, void*)}'
Target //tensorflow/tools/pip_package:build_pip_package failed to build
ERROR: /usr/local/src/tensorflow/tensorflow/tools/pip_package/BUILD:62:1 C++ compilation of rule '//tensorflow/python:bfloat16_lib' failed (Exit 1)
INFO: Elapsed time: 1828.057s, Critical Path: 881.14s
INFO: 13824 processes: 13824 local.
FAILED: Build did NOT complete successfully
FAILED: Build did NOT complete successfully
Command exited with non-zero status 1

Provide the exact sequence of commands / steps that you executed before running into the problem

Reproducible with the following Dockerfile

FROM nvidia/cuda:10.1-cudnn7-devel-ubuntu18.04

# Install system packages
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update -y \
  && apt-get install -y --no-install-recommends apt-utils \
  && apt-get install -y \
    build-essential \
    checkinstall \
    cmake \
    curl \
    g++ \
    gcc \
    git \
    locales \
    perl \
    pkg-config \
    protobuf-compiler \
    python3-dev \
    rsync \
    software-properties-common \
    unzip \
    wget \
    zip \
    zlib1g-dev \
  && apt-get clean

# UTF-8
RUN localedef -i en_US -c -f UTF-8 -A /usr/share/locale/locale.alias en_US.UTF-8
ENV LANG en_US.utf8

# Setup pip
RUN wget -q -O /tmp/get-pip.py --no-check-certificate https://bootstrap.pypa.io/get-pip.py \
  && python3 /tmp/get-pip.py \
  && pip3 install -U pip \
  && rm /tmp/get-pip.py
# Some TF tools expect a "python" binary
RUN ln -s $(which python3) /usr/local/bin/python

# /etc/ld.so.conf.d/nvidia.conf points to /usr/local/nvidia, which seems to be missing; point it at the CUDA install directory for libraries
RUN cd /usr/local && ln -s cuda nvidia
ARG CTO_CUDA_VERSION="10.1"
ARG CTO_CUDA_PRIMEVERSION="10.0"
ARG CTO_CUDA_APT="cuda-npp-${CTO_CUDA_VERSION} cuda-cublas-${CTO_CUDA_PRIMEVERSION} cuda-cufft-${CTO_CUDA_VERSION} cuda-libraries-${CTO_CUDA_VERSION} cuda-npp-dev-${CTO_CUDA_VERSION} cuda-cublas-dev-${CTO_CUDA_PRIMEVERSION} cuda-cufft-dev-${CTO_CUDA_VERSION} cuda-libraries-dev-${CTO_CUDA_VERSION}"
RUN apt-get install -y --no-install-recommends \
  time ${CTO_CUDA_APT} \
  && apt-get clean

ENV LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/extras/CUPTI/lib64"

# Install Python tools 
RUN pip3 install -U \
  mock \
  numpy \
  setuptools \
  six \
  wheel \
  && pip3 install 'future>=0.17.1' \
  && pip3 install -U keras_applications --no-deps \
  && pip3 install -U keras_preprocessing --no-deps \
  && rm -rf /root/.cache/pip

## Download & Building TensorFlow from source
ARG LATEST_BAZELISK=1.5.0
ARG CTO_TENSORFLOW_VERSION="2.2.0"
RUN curl -s -Lo /usr/local/bin/bazel https://github.com/bazelbuild/bazelisk/releases/download/v${LATEST_BAZELISK}/bazelisk-linux-amd64 \
  && chmod +x /usr/local/bin/bazel \
  && mkdir -p /usr/local/src \
  && cd /usr/local/src \
  && wget -q --no-check-certificate https://github.com/tensorflow/tensorflow/archive/v${CTO_TENSORFLOW_VERSION}.tar.gz \
  && tar xfz v${CTO_TENSORFLOW_VERSION}.tar.gz \
  && mv tensorflow-${CTO_TENSORFLOW_VERSION} tensorflow \
  && rm v${CTO_TENSORFLOW_VERSION}.tar.gz \
  && cd /usr/local/src/tensorflow \
  && fgrep _TF_MAX_BAZEL configure.py | grep '=' | perl -ne 'print $1 if (m%\=\s+.([\d\.]+).$+%)' > .bazelversion
RUN cd /usr/local/src/tensorflow \
  && TF_CUDA_CLANG=0 TF_CUDA_VERSION=${CTO_CUDA_VERSION} TF_CUDNN_VERSION=7 TF_DOWNLOAD_CLANG=0 TF_DOWNLOAD_MKL=0 TF_ENABLE_XLA=0 TF_NEED_AWS=0 TF_NEED_COMPUTECPP=0 TF_NEED_CUDA=1 TF_NEED_GCP=0 TF_NEED_GDR=0 TF_NEED_HDFS=0 TF_NEED_JEMALLOC=1 TF_NEED_KAFKA=0 TF_NEED_MKL=0 TF_NEED_MPI=0 TF_NEED_OPENCL=0 TF_NEED_OPENCL_SYCL=0 TF_NEED_ROCM=0 TF_NEED_S3=0 TF_NEED_TENSORRT=0 TF_NEED_VERBS=0 TF_SET_ANDROID_WORKSPACE=0 TF_CUDA_COMPUTE_CAPABILITIES="5.3,6.0,6.1,6.2,7.0,7.2,7.5" GCC_HOST_COMPILER_PATH=$(which gcc) CC_OPT_FLAGS="-march=native" PYTHON_BIN_PATH=$(which python) PYTHON_LIB_PATH="$(python -c 'import site; print(site.getsitepackages()[0])')" ./configure
RUN cd /usr/local/src/tensorflow \
  && time bazel build --verbose_failures --config=opt --config=v2 --config=cuda //tensorflow/tools/pip_package:build_pip_package \
  && time ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg \
  && time pip3 install /tmp/tensorflow_pkg/tensorflow-*.whl

CMD bash

Built using docker build --tag cto:test .

Note: tested with CUDA 10.1, 10.0, and 10.2.
Also occurs with TF 1.15.3.

Any other info / logs
I can provide the full build log if requested (91MB)

 ---> Running in 9690386205a5
2020/06/22 14:11:17 Downloading https://releases.bazel.build/2.0.0/release/bazel-2.0.0-linux-x86_64...
Extracting Bazel installation...
You have bazel 2.0.0 installed.
Found CUDA 10.1 in:
    /usr/local/cuda-10.1/lib64
    /usr/local/cuda-10.1/include
Found cuDNN 7 in:
    /usr/lib/x86_64-linux-gnu
    /usr/include


Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
        --config=mkl            # Build with MKL support.
        --config=monolithic     # Config for mostly static monolithic build.
        --config=ngraph         # Build with Intel nGraph support.
        --config=numa           # Build with NUMA support.
        --config=dynamic_kernels        # (Experimental) Build kernels into separate shared objects.
        --config=v2             # Build TensorFlow 2.x instead of 1.x.
Preconfigured Bazel build configs to DISABLE default on features:
        --config=noaws          # Disable AWS S3 filesystem support.
        --config=nogcp          # Disable GCP support.
        --config=nohdfs         # Disable HDFS support.
        --config=nonccl         # Disable NVIDIA NCCL support.
Configuration finished
Removing intermediate container 9690386205a5
 ---> 8910acc4d9c5
Step 19/20 : RUN cd /usr/local/src/tensorflow   && time bazel build --verbose_failures --config=opt --config=v2 --config=cuda //tensorflow/tools/pip_package:build_pip_package   && time ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg   && time pip3 install /tmp/tensorflow_pkg/tensorflow-*.whl
 ---> Running in 3b0267b1209d
Starting local Bazel server and connecting to it...
WARNING: The following configs were expanded more than once: [v2, cuda, using_cuda]. For repeatable flags, repeats are counted twice and may lead to unexpected behavior.
INFO: Options provided by the client:
  Inherited 'common' options: --isatty=0 --terminal_columns=80
INFO: Reading rc options for 'build' from /usr/local/src/tensorflow/.bazelrc:
  Inherited 'common' options: --experimental_repo_remote_exec
INFO: Reading rc options for 'build' from /usr/local/src/tensorflow/.bazelrc:
  'build' options: --apple_platform_type=macos --define framework_shared_object=true --define open_source_build=true --java_toolchain=//third_party/toolchains/java:tf_java_toolchain --host_java_toolchain=//third_party/toolchains/java:tf_java_toolchain --define=use_fast_cpp_protos=true --define=allow_oversize_protos=true --spawn_strategy=standalone -c opt --announce_rc --define=grpc_no_ares=true --noincompatible_remove_legacy_whole_archive --noincompatible_prohibit_aapt1 --enable_platform_specific_config --config=v2
INFO: Reading rc options for 'build' from /usr/local/src/tensorflow/.tf_configure.bazelrc:
  'build' options: --action_env PYTHON_BIN_PATH=/usr/local/bin/python --action_env PYTHON_LIB_PATH=/usr/local/lib/python3.6/dist-packages --python_path=/usr/local/bin/python --action_env TF_CUDA_VERSION=10.1 --action_env TF_CUDNN_VERSION=7 --action_env CUDA_TOOLKIT_PATH=/usr/local/cuda-10.1 --action_env TF_CUDA_COMPUTE_CAPABILITIES=5.3,6.0,6.1,6.2,7.0,7.2,7.5 --action_env LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/extras/CUPTI/lib64 --action_env GCC_HOST_COMPILER_PATH=/usr/bin/x86_64-linux-gnu-gcc-7 --config=cuda --action_env TF_CONFIGURE_IOS=0
INFO: Found applicable config definition build:v2 in file /usr/local/src/tensorflow/.bazelrc: --define=tf_api_version=2 --action_env=TF2_BEHAVIOR=1
INFO: Found applicable config definition build:cuda in file /usr/local/src/tensorflow/.bazelrc: --config=using_cuda --define=using_cuda_nvcc=true
INFO: Found applicable config definition build:using_cuda in file /usr/local/src/tensorflow/.bazelrc: --define=using_cuda=true --action_env TF_NEED_CUDA=1 --crosstool_top=@local_config_cuda//crosstool:toolchain
INFO: Found applicable config definition build:opt in file /usr/local/src/tensorflow/.tf_configure.bazelrc: --copt=-march=native --host_copt=-march=native --define with_default_optimizations=true
INFO: Found applicable config definition build:v2 in file /usr/local/src/tensorflow/.bazelrc: --define=tf_api_version=2 --action_env=TF2_BEHAVIOR=1
INFO: Found applicable config definition build:cuda in file /usr/local/src/tensorflow/.bazelrc: --config=using_cuda --define=using_cuda_nvcc=true
INFO: Found applicable config definition build:using_cuda in file /usr/local/src/tensorflow/.bazelrc: --define=using_cuda=true --action_env TF_NEED_CUDA=1 --crosstool_top=@local_config_cuda//crosstool:toolchain
INFO: Found applicable config definition build:linux in file /usr/local/src/tensorflow/.bazelrc: --copt=-w --define=PREFIX=/usr --define=LIBDIR=$(PREFIX)/lib --define=INCLUDEDIR=$(PREFIX)/include --cxxopt=-std=c++14 --host_cxxopt=-std=c++14 --config=dynamic_kernels
INFO: Found applicable config definition build:dynamic_kernels in file /usr/local/src/tensorflow/.bazelrc: --define=dynamic_loaded_kernels=true --copt=-DAUTOLOAD_DYNAMIC_KERNELS
Loading: 
Loading: 0 packages loaded
Loading: 0 packages loaded
Loading: 0 packages loaded
DEBUG: Rule 'io_bazel_rules_docker' indicated that a canonical reproducible form can be obtained by modifying arguments shallow_since = "1556410077 -0400"
DEBUG: Call stack for the definition of repository 'io_bazel_rules_docker' which is a git_repository (rule definition at /root/.cache/bazel/_bazel_root/bbcc73fcc5c2b01ab08b6bcf7c29e42e/external/bazel_tools/tools/build_defs/repo/git.bzl:195:18):
 - /root/.cache/bazel/_bazel_root/bbcc73fcc5c2b01ab08b6bcf7c29e42e/external/bazel_toolchains/repositories/repositories.bzl:37:9
 - /usr/local/src/tensorflow/WORKSPACE:37:1
Loading: 0 packages loaded
Loading: 0 packages loaded
Loading: 0 packages loaded
Loading: 0 packages loaded
    currently loading: tensorflow/tools/pip_package
DEBUG: /root/.cache/bazel/_bazel_root/bbcc73fcc5c2b01ab08b6bcf7c29e42e/external/bazel_tools/tools/cpp/lib_cc_configure.bzl:118:5: 
[...]


@mmartial added the type:build/install label Jun 22, 2020
@adk9

adk9 commented Jun 22, 2020

Possibly duplicate of #40654? I'm also seeing the same issue with v2.2.0 and GCC 7.5.0.

@mmartial
Author

mmartial commented Jun 22, 2020

Possibly duplicate of #40654? I'm also seeing the same issue with v2.2.0 and GCC 7.5.0.

Thank you :)
I looked at the PR, and am integrating this change into the Dockerfile:
&& perl -pi.bak -e 's%, CompareUFunc%, (PyUFuncGenericFunction) CompareUFunc%g' tensorflow/python/lib/core/bfloat16.cc \
right before the ./configure step.

Will report if this fixes the build
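
For context, the substitution only adds an explicit function-pointer cast at each registration call site in bfloat16.cc. Below is a minimal, self-contained sketch of why the original call fails and why the cast form compiles. It is illustrative only: Register and the typedef stand in for the real register_ufunc lambda and the numpy header, and npy_intp is spelled as long to match the "const long int*" in the compiler output above.

#include <cstdio>

// The post-1.19 numpy loop signature that the registration lambda expects.
using PyUFuncGenericFunction = void (*)(char**, const long*, const long*, void*);

// TF 2.2's template apparently uses the pre-1.19 signature (non-const pointers),
// so CompareUFunc<T> no longer converts implicitly to PyUFuncGenericFunction.
template <typename Functor>
void CompareUFunc(char** args, long* dimensions, long* steps, void* data) {
  std::printf("compare ufunc invoked\n");
}

struct Bfloat16LtFunctor {};

bool Register(const char* name, PyUFuncGenericFunction fn) { return fn != nullptr; }

int main() {
  // Register("less", CompareUFunc<Bfloat16LtFunctor>);  // fails: no known conversion
  // The Perl substitution turns the call into the cast form, which compiles:
  Register("less", (PyUFuncGenericFunction)CompareUFunc<Bfloat16LtFunctor>);
  return 0;
}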

@mmartial
Author

mmartial commented Jun 22, 2020

Confirming that this solves the build issue (for 2.2.0 and 10.1):

Target //tensorflow/tools/pip_package:build_pip_package up-to-date:
  bazel-bin/tensorflow/tools/pip_package/build_pip_package

Going to check 2.2.0 with 10.2, then 1.15.3 with 10.2.

@adk9

adk9 commented Jun 23, 2020

In our testing, we found that this issue breaks building from source for TF 1.15.x and 2.x.

The issue comes from the source build being incompatible with numpy 1.19.0, which has a breaking ABI change (numpy/numpy#15355) and was released two days ago.

Pinning numpy to a version below 1.19.0 fixes the issue:

pip install 'numpy<1.19.0'
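
For reference, the mismatch the compiler complains about comes down to a one-line signature change. Below is a sketch of the two typedefs; this is an assumption based on numpy/numpy#15355 and the "const long int*" in the error output, and the _pre119/_119 suffixes are hypothetical names added only to show both forms side by side (the real typedef is just PyUFuncGenericFunction).

// numpy < 1.19.0: dimensions and steps are non-const, which is what TF 2.2's
// CompareUFunc template appears to have been written against.
typedef void (*PyUFuncGenericFunction_pre119)(char**, long*, long*, void*);

// numpy >= 1.19.0: the pointers became const, so the old template no longer
// converts implicitly to the typedef the registration lambda expects.
typedef void (*PyUFuncGenericFunction_119)(char**, const long*, const long*, void*);

Building against numpy < 1.19.0 restores the old typedef, which is why pinning the version avoids the error without touching the TF sources.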

adk9 added a commit to adk9/docs that referenced this issue Jun 23, 2020
Fix numpy to pre-1.19.0 because of breaking ABI change in numpy 1.19.0 (numpy/numpy#15355)

See tensorflow/tensorflow#40688.
@mmartial
Author

mmartial commented Jun 23, 2020

Thank you, will force numpy<1.19.0 for the time being.

Also confirming that 2.2.0 and 10.2 compile with the PyUFuncGenericFunction fix.

@mmartial
Author

Confirming successful compilation of 2.2.0 with 10.2 and numpy<1.19.0.

Okay to close the issue.

I have different problems with 1.15.3 and nvlink (with 10.0 and 10.1), but if I cannot resolve them, I will open a different ticket.

@cbalint13
Contributor

@mmartial,

On behalf of #40654, thank you for investigating!

@amahendrakar
Contributor

Okay to close the issue.

Marking the issue as closed, as it is resolved. Please feel free to re-open the issue if required. Thanks!

@amahendrakar added the subtype:ubuntu/linux and TF 2.2 labels Jun 23, 2020
@xlnwel

xlnwel commented Jun 23, 2020

Hi, @mmartial. I also ran into the same issue. I've downgraded numpy to 1.18.5, but it did not fix the problem. Here's the error message I received:

tensorflow/python/lib/core/bfloat16.cc:610:60: note: candidate: tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>
const std::array<int, 3>& types) {
^
tensorflow/python/lib/core/bfloat16.cc:610:60: note: no known conversion for argument 2 from '<unresolved overloaded function type>' to 'PyUFuncGenericFunction {aka void (*)(char**, const long int*, const long int*, void*)}'
tensorflow/python/lib/core/bfloat16.cc:640:36: error: no match for call to '(tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>) (const char [10], <unresolved overloaded function type>, const std::array<int, 3>&)'
compare_types)) {
^
tensorflow/python/lib/core/bfloat16.cc:610:60: note: candidate: tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>
const std::array<int, 3>& types) {
^
tensorflow/python/lib/core/bfloat16.cc:610:60: note: no known conversion for argument 2 from '<unresolved overloaded function type>' to 'PyUFuncGenericFunction {aka void (*)(char**, const long int*, const long int*, void*)}'
tensorflow/python/lib/core/bfloat16.cc:643:77: error: no match for call to '(tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>) (const char [5], <unresolved overloaded function type>, const std::array<int, 3>&)'
if (!register_ufunc("less", CompareUFunc<Bfloat16LtFunctor>, compare_types)) {
^
tensorflow/python/lib/core/bfloat16.cc:610:60: note: candidate: tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>
const std::array<int, 3>& types) {
^
tensorflow/python/lib/core/bfloat16.cc:610:60: note: no known conversion for argument 2 from '<unresolved overloaded function type>' to 'PyUFuncGenericFunction {aka void (*)(char**, const long int*, const long int*, void*)}'
tensorflow/python/lib/core/bfloat16.cc:647:36: error: no match for call to '(tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>) (const char [8], <unresolved overloaded function type>, const std::array<int, 3>&)'
compare_types)) {
^
tensorflow/python/lib/core/bfloat16.cc:610:60: note: candidate: tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>
const std::array<int, 3>& types) {
^
tensorflow/python/lib/core/bfloat16.cc:610:60: note: no known conversion for argument 2 from '<unresolved overloaded function type>' to 'PyUFuncGenericFunction {aka void (*)(char**, const long int*, const long int*, void*)}'
tensorflow/python/lib/core/bfloat16.cc:651:36: error: no match for call to '(tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>) (const char [11], <unresolved overloaded function type>, const std::array<int, 3>&)'
compare_types)) {
^
tensorflow/python/lib/core/bfloat16.cc:610:60: note: candidate: tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>
const std::array<int, 3>& types) {
^
tensorflow/python/lib/core/bfloat16.cc:610:60: note: no known conversion for argument 2 from '<unresolved overloaded function type>' to 'PyUFuncGenericFunction {aka void (*)(char**, const long int*, const long int*, void*)}'
tensorflow/python/lib/core/bfloat16.cc:655:36: error: no match for call to '(tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>) (const char [14], <unresolved overloaded function type>, const std::array<int, 3>&)'
compare_types)) {
^
tensorflow/python/lib/core/bfloat16.cc:610:60: note: candidate: tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>
const std::array<int, 3>& types) {
^
tensorflow/python/lib/core/bfloat16.cc:610:60: note: no known conversion for argument 2 from '<unresolved overloaded function type>' to 'PyUFuncGenericFunction {aka void (*)(char**, const long int*, const long int*, void*)}'
Target //tensorflow/tools/pip_package:build_pip_package failed to build
Use --verbose_failures to see the command lines of failed build steps.
ERROR: /home/aptx4869/github/tensorflow/tensorflow/tools/pip_package/BUILD:62:1 C++ compilation of rule '//tensorflow/python:bfloat16_lib' failed (Exit 1)
INFO: Elapsed time: 24.977s, Critical Path: 13.97s
INFO: 2 processes: 2 local.
FAILED: Build did NOT complete successfully

It seems related to the PyUFuncGenericFunction fix you mentioned. How should I apply it?

Here's my environment information:

Ubuntu: 18.04
TF: r2.2 (trying to build from source but failed)
CUDA: 10.2
CuDNN: 7.6.5
python: 3.7.7
Bazel: 2.0.0

And here's the output of pip list in case you need it:

certifi 2020.6.20
decorator 4.4.0
future 0.18.2
h5py 2.10.0
Keras-Applications 1.0.8
Keras-Preprocessing 1.1.2
mock 4.0.2
numpy 1.18.5
pip 20.1.1
setuptools 47.3.1.post20200622
six 1.15.0
wheel 0.34.2

@mmartial
Author

mmartial commented Jun 23, 2020

tensorflow/python/lib/core/bfloat16.cc:655:36: error: no match for call to '(tensorflow::{anonymous}::Initialize()::<lambda(const char*, PyUFuncGenericFunction, const std::array<int, 3>&)>) (const char [14], <unresolved overloaded function type>, const std::array<int, 3>&)'
compare_types)) {

@xlnwel Looking at the above, I wonder: did you use both the PR change (or the Perl command) and the numpy<1.19.0 pin?

To make it work, I had to use one or the other of those.

I am putting the updated Dockerfile below, hoping it works for you:

ARG CTO_CUDA_VERSION="10.2"
FROM nvidia/cuda:${CTO_CUDA_VERSION}-cudnn7-devel-ubuntu18.04
ARG CTO_CUDA_VERSION="10.2"

# Install system packages
ENV DEBIAN_FRONTEND noninteractive
RUN apt-get update -y \
  && apt-get install -y --no-install-recommends apt-utils \
  && apt-get install -y \
    build-essential \
    checkinstall \
    cmake \
    curl \
    g++ \
    gcc \
    git \
    locales \
    perl \
    pkg-config \
    protobuf-compiler \
    python3-dev \
    rsync \
    software-properties-common \
    unzip \
    wget \
    zip \
    zlib1g-dev \
  && apt-get clean

# UTF-8
RUN localedef -i en_US -c -f UTF-8 -A /usr/share/locale/locale.alias en_US.UTF-8
ENV LANG en_US.utf8

# Setup pip
RUN wget -q -O /tmp/get-pip.py --no-check-certificate https://bootstrap.pypa.io/get-pip.py \
  && python3 /tmp/get-pip.py \
  && pip3 install -U pip \
  && rm /tmp/get-pip.py
# Some TF tools expect a "python" binary
RUN ln -s $(which python3) /usr/local/bin/python

# /etc/ld.so.conf.d/nvidia.conf points to /usr/local/nvidia, which seems to be missing; point it at the CUDA install directory for libraries
RUN cd /usr/local && ln -s cuda nvidia
ARG CTO_CUDA_PRIMEVERSION="10.0"
ARG CTO_CUDA_APT="cuda-npp-${CTO_CUDA_VERSION} cuda-cublas-${CTO_CUDA_PRIMEVERSION} cuda-cufft-${CTO_CUDA_VERSION} cuda-libraries-${CTO_CUDA_VERSION} cuda-npp-dev-${CTO_CUDA_VERSION} cuda-cublas-dev-${CTO_CUDA_PRIMEVERSION} cuda-cufft-dev-${CTO_CUDA_VERSION} cuda-libraries-dev-${CTO_CUDA_VERSION}"
RUN echo ${CTO_CUDA_APT}
RUN apt-get install -y --no-install-recommends \
  time ${CTO_CUDA_APT} \
  && apt-get clean

# Install TensorRT. Requires that libcudnn7 is installed
#RUN apt-get install -y --no-install-recommends \
#  libnvinfer6=6.0.1-1+cuda${CTO_CUDA_VERSION} \
#  libnvinfer-dev=6.0.1-1+cuda${CTO_CUDA_VERSION} \
#  libnvinfer-plugin6=6.0.1-1+cuda${CTO_CUDA_VERSION} \
#  && apt-get clean

ENV LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/extras/CUPTI/lib64"

# Install Python tools 
RUN pip3 install -U \
  mock \
  'numpy<1.19.0' \
  setuptools \
  six \
  wheel \
  && pip3 install 'future>=0.17.1' \
  && pip3 install -U keras_applications --no-deps \
  && pip3 install -U keras_preprocessing --no-deps \
  && rm -rf /root/.cache/pip

## Download & Building TensorFlow from source
ARG LATEST_BAZELISK=1.5.0
ARG CTO_TENSORFLOW_VERSION="2.2.0"
RUN curl -s -Lo /usr/local/bin/bazel https://github.com/bazelbuild/bazelisk/releases/download/v${LATEST_BAZELISK}/bazelisk-linux-amd64 \
  && chmod +x /usr/local/bin/bazel \
  && mkdir -p /usr/local/src \
  && cd /usr/local/src \
  && wget -q --no-check-certificate https://github.com/tensorflow/tensorflow/archive/v${CTO_TENSORFLOW_VERSION}.tar.gz \
  && tar xfz v${CTO_TENSORFLOW_VERSION}.tar.gz \
  && mv tensorflow-${CTO_TENSORFLOW_VERSION} tensorflow \
  && rm v${CTO_TENSORFLOW_VERSION}.tar.gz \
  && cd /usr/local/src/tensorflow \
  && fgrep _TF_MAX_BAZEL configure.py | grep '=' | perl -ne 'print $1 if (m%\=\s+.([\d\.]+).$+%)' > .bazelversion
RUN cd /usr/local/src/tensorflow \
  && TF_CUDA_CLANG=0 TF_CUDA_VERSION=${CTO_CUDA_VERSION} TF_CUDNN_VERSION=7 TF_DOWNLOAD_CLANG=0 TF_DOWNLOAD_MKL=0 TF_ENABLE_XLA=0 TF_NEED_AWS=0 TF_NEED_COMPUTECPP=0 TF_NEED_CUDA=1 TF_NEED_GCP=0 TF_NEED_GDR=0 TF_NEED_HDFS=0 TF_NEED_JEMALLOC=1 TF_NEED_KAFKA=0 TF_NEED_MKL=0 TF_NEED_MPI=0 TF_NEED_OPENCL=0 TF_NEED_OPENCL_SYCL=0 TF_NEED_ROCM=0 TF_NEED_S3=0 TF_NEED_TENSORRT=0 TF_NEED_VERBS=0 TF_SET_ANDROID_WORKSPACE=0 TF_CUDA_COMPUTE_CAPABILITIES="5.3,6.0,6.1,6.2,7.0,7.2,7.5" GCC_HOST_COMPILER_PATH=$(which gcc) CC_OPT_FLAGS="-march=native" PYTHON_BIN_PATH=$(which python) PYTHON_LIB_PATH="$(python -c 'import site; print(site.getsitepackages()[0])')" ./configure
RUN cd /usr/local/src/tensorflow \
  && time bazel build --verbose_failures --config=opt --config=v2 --config=cuda //tensorflow/tools/pip_package:build_pip_package \
  && time ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg \
  && time pip3 install /tmp/tensorflow_pkg/tensorflow-*.whl

CMD bash

@xlnwel

xlnwel commented Jun 23, 2020

Hi @mmartial. I built TF 2.2 following the official guide, without a Dockerfile. Do you mean I should execute fgrep _TF_MAX_BAZEL configure.py | grep '=' | perl -ne 'print $1 if (m%\=\s+.([\d\.]+).$+%)' > .bazelversion before bazel build? I tried it and then found that bazel build //tensorflow/tools/pip_package:build_pip_package seemed to have no effect at all.

@mmartial
Author

Hi @mmartial. I built TF 2.2 following the official guide, without a Dockerfile. Do you mean I should execute fgrep _TF_MAX_BAZEL configure.py | grep '=' | perl -ne 'print $1 if (m%\=\s+.([\d\.]+).$+%)' > .bazelversion before bazel build? I tried it and then found that bazel build //tensorflow/tools/pip_package:build_pip_package seemed to have no effect at all.

No, I was referring to #40688 (comment).
When I saw your error, I noticed the PyUFuncGenericFunction mismatch, which that command fixes.

Note that simply using 'numpy<1.19.0' in my pip install was sufficient to solve this issue.

@xlnwel

xlnwel commented Jun 24, 2020

Unfortunately it does not work for me. Maybe I have to open another issue.

tensorflow-copybara pushed a commit that referenced this issue Jun 24, 2020
See #40688.

PiperOrigin-RevId: 318122157
Change-Id: Ief46c5610f3aaf0cdd7d43ce1a10d6d87e8e8e01
geetachavan1 pushed a commit to geetachavan1/tensorflow that referenced this issue Jun 25, 2020
See tensorflow#40688.

PiperOrigin-RevId: 318122157
Change-Id: Ief46c5610f3aaf0cdd7d43ce1a10d6d87e8e8e01
meteorcloudy added a commit to bazelbuild/continuous-integration that referenced this issue Jun 26, 2020
The new version of numpy seems to have some API change that doesn't work with TF anymore.
See tensorflow/tensorflow#40688 (comment)
pjattke pushed a commit to pjattke/docker-he-transformer that referenced this issue Jul 31, 2020
@ebrevdo
Contributor

ebrevdo commented Aug 4, 2020

@amahendrakar this is still an issue on r2.3; I just tried to build the TF r2.3 branch on my Ubuntu system and ran into the same issue. The Perl rewrite works, but we should just fix the code to do a proper static cast. @penpornk, who's closest to this code?

@ebrevdo ebrevdo reopened this Aug 4, 2020
@penpornk
Member

penpornk commented Aug 4, 2020

@ebrevdo This is Python glue code so it probably belongs to TF Core folks. But the fixes are simple enough. I can do it.

@penpornk
Member

penpornk commented Aug 5, 2020

It seems @chsigg already fixed this in 75ea0b3 (Jun 26, 2020) by adding an overloaded function. I tried compiling with the latest code from master and didn't get the error anymore.

(It's too late to patch this into releases 2.2.0 and 2.3.0 now, so this issue will be fixed in release 2.4.0.)
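
For anyone curious what an overload-based fix looks like, here is a minimal, self-contained sketch of the general idea; it is an illustration only, not the actual contents of 75ea0b3. Register stands in for the registration lambda and npy_intp is spelled as long. Providing the wrapper in both signatures lets overload resolution pick whichever one matches the numpy headers being compiled against, so no cast is needed.

#include <cstdio>

// Whichever typedef the installed numpy headers provide (const since 1.19.0).
using PyUFuncGenericFunction = void (*)(char**, const long*, const long*, void*);

// Pre-1.19 signature.
template <typename Functor>
void CompareUFunc(char** args, long* dimensions, long* steps, void* data) {
  std::printf("pre-1.19 signature selected\n");
}

// 1.19+ signature: same body, const pointers.
template <typename Functor>
void CompareUFunc(char** args, const long* dimensions, const long* steps, void* data) {
  std::printf("1.19+ signature selected\n");
}

struct Bfloat16LtFunctor {};

bool Register(const char* name, PyUFuncGenericFunction fn) { return fn != nullptr; }

int main() {
  // The overloaded name resolves against the target function-pointer type,
  // so this compiles against either numpy version without an explicit cast.
  return Register("less", CompareUFunc<Bfloat16LtFunctor>) ? 0 : 1;
}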

@ebrevdo
Contributor

ebrevdo commented Aug 5, 2020 via email

@mmartial
Author

mmartial commented Aug 5, 2020

I will take this opportunity to update another part of the Perl glue (for the Bazel version): since in 2.3.0 the max Bazel version appears to be set to 3.99 (while the latest Bazel release is 3.4.1), I added the following version-capping logic to my 2.3.0 build (still in testing):

ARG LATEST_BAZEL=3.4.1
[...]
  && fgrep _TF_MAX_BAZEL configure.py | grep '=' | perl -ne '$lb="'${LATEST_BAZEL}'";$brv=$1 if (m%\=\s+.([\d\.]+).$+%); sub numit{@g=split(m%\.%,$_[0]);return(1000000*$g[0]+1000*$g[1]+$g[2]);}; if (&numit($brv) > &numit($lb)) { print "$lb" } else {print "$brv"};' > .bazelversion \
  && bazel clean \
[...]

@mightyroy

mightyroy commented Aug 29, 2020

Remember to run bazel clean after downgrading numpy. I downgraded to numpy 1.18 and it worked.

avdv added a commit to avdv/nixpkgs that referenced this issue Sep 16, 2020
Numpy introduced a breaking API change in version 1.19.x, see [1].

There is a simple fix [2] available in the master branch.

[1]: tensorflow/tensorflow#40688
[2]: tensorflow/tensorflow@75ea0b3
danieldk pushed a commit to danieldk/nixpkgs that referenced this issue Sep 21, 2020
Numpy introduced a breaking API change in version 1.19.x, see [1].

There is a simple fix [2] available in the master branch.

[1]: tensorflow/tensorflow#40688
[2]: tensorflow/tensorflow@75ea0b3

(cherry picked from commit 8f5bfd6)
avdv added a commit to avdv/nixpkgs that referenced this issue Sep 21, 2020
Numpy introduced a breaking API change in version 1.19.x, see [1].

There is a simple fix [2] available in the master branch.

[1]: tensorflow/tensorflow#40688
[2]: tensorflow/tensorflow@75ea0b3

(cherry picked from commit 8f5bfd6)
jonringer pushed a commit to NixOS/nixpkgs that referenced this issue Sep 22, 2020
Numpy introduced a breaking API change in version 1.19.x, see [1].

There is a simple fix [2] available in the master branch.

[1]: tensorflow/tensorflow#40688
[2]: tensorflow/tensorflow@75ea0b3

(cherry picked from commit 8f5bfd6)
PatriceVignola pushed a commit to microsoft/tensorflow-directml that referenced this issue Sep 26, 2020
…ericFunction.

See tensorflow/tensorflow#40688, tensorflow/tensorflow#40654.

PiperOrigin-RevId: 318452381
Change-Id: Icc5152f2b020ef19882a49e3c86ac80bbe048d64
@mihaimaruseac
Collaborator

Fixed by aafe25d


@vinhphat89

Confirmed that the r2.2 source is incompatible with numpy versions 1.18.5 and 1.19.0. Downgrading to numpy < 1.18.5 resolves the issue:

pip install 'numpy<1.18.5'

tensorflow locked as resolved and limited conversation to collaborators Oct 22, 2020