Build Error Windows, No results found for more than one instance of overloaded function "google::protobuf::Arena::CreateMessageInternal". #17067

Closed
apiszcz opened this Issue Feb 16, 2018 · 31 comments

@apiszcz

apiszcz commented Feb 16, 2018

No results found for more than one instance of overloaded function "google::protobuf::Arena::CreateMessageInternal".

:: - MSVC Community 2015
:: - ANACONDA 4.4.4 (Python 3.5.5)
:: - CMake 3.10.2
:: - SWIG 3.0.12
:: - GIT 2.15.1.windows.2
:: - NVIDIA CUDA 9.1, CUDNN 7.05

TensorFlow pulled from git repo on 2/12/2018

C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/protobuf/src/protobuf/src\google/protobuf/arena.h(719): error : more than one instance of overloaded function "google::protobuf::Arena::CreateMessageInternal" matches the argument list: [C:\g\tensorflow\tensorflow\tensorflow\contrib\cmake\build\tf_core_gpu_kernels.vcxproj]
function template "T *google::protobuf::Arena::CreateMessageInternal(google::protobuf::Arena *)"
function template "T *google::protobuf::Arena::CreateMessageInternal<T,Args...>(Args &&...)"
argument types are: (google::protobuf::Arena *)
detected during:
instantiation of "Msg *google::protobuf::Arena::CreateMaybeMessage(google::protobuf::Arena *, google::protobuf::internal::true_type) [with Msg=tensorflow::TensorShapeProto_Dim]"
(729): here
instantiation of "T *google::protobuf::Arena::CreateMaybeMessage(google::protobuf::Arena *) [with T=tensorflow::TensorShapeProto_Dim]"
C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/protobuf/src/protobuf/src\google/protobuf/repeated_field.h(648): here
instantiation of "GenericType *google::protobuf::internal::GenericTypeHandler::New(google::protobuf::Arena *) [with GenericType=tensorflow::TensorShapeProto_Dim]"
C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/protobuf/src/protobuf/src\google/protobuf/repeated_field.h(675): here
instantiation of "GenericType *google::protobuf::internal::GenericTypeHandler::NewFromPrototype(const GenericType *, google::protobuf::Arena *) [with GenericType=tensorflow::TensorShapeProto_Dim]"
C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/protobuf/src/protobuf/src\google/protobuf/repeated_field.h(1554): here
instantiation of "TypeHandler::Type *google::protobuf::internal::RepeatedPtrFieldBase::Add(TypeHandler::Type *) [with TypeHandler=google::protobuf::RepeatedPtrField::TypeHandler]"
C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/protobuf/src/protobuf/src\google/protobuf/repeated_field.h(2001): here
instantiation of "Element *google::protobuf::RepeatedPtrField::Add() [with Element=tensorflow::TensorShapeProto_Dim]"
C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build\tensorflow/core/framework/tensor_shape.pb.h(471): here

C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/protobuf/src/protobuf/src\google/protobuf/arena.h(719): error : more than one instance of overloaded function "google::protobuf::Arena::CreateMessageInternal" matches the argument list: [C:\g\tensorflow\tensorflow\tensorflow\contrib\cmake\build\tf_core_gpu_kernels.vcxproj]
function template "T *google::protobuf::Arena::CreateMessageInternal(google::protobuf::Arena *)"
function template "T *google::protobuf::Arena::CreateMessageInternal<T,Args...>(Args &&...)"
cwise_op_bitwise_and.cc
argument types are: (google::protobuf::Arena *)
detected during:
instantiation of "Msg *google::protobuf::Arena::CreateMaybeMessage(google::protobuf::Arena *, google::protobuf::internal::true_type) [with Msg=tensorflow::ResourceHandleProto]"
(729): here
instantiation of "T *google::protobuf::Arena::CreateMaybeMessage(google::protobuf::Arena *) [with T=tensorflow::ResourceHandleProto]"
C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/protobuf/src/protobuf/src\google/protobuf/repeated_field.h(648): here
instantiation of "GenericType *google::protobuf::internal::GenericTypeHandler::New(google::protobuf::Arena *) [with GenericType=tensorflow::ResourceHandleProto]"
C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/protobuf/src/protobuf/src\google/protobuf/repeated_field.h(675): here
instantiation of "GenericType *google::protobuf::internal::GenericTypeHandler::NewFromPrototype(const GenericType *, google::protobuf::Arena *) [with GenericType=tensorflow::ResourceHandleProto]"
C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/protobuf/src/protobuf/src\google/protobuf/repeated_field.h(1554): here
instantiation of "TypeHandler::Type *google::protobuf::internal::RepeatedPtrFieldBase::Add(TypeHandler::Type *) [with TypeHandler=google::protobuf::RepeatedPtrField::TypeHandler]"
C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/protobuf/src/protobuf/src\google/protobuf/repeated_field.h(2001): here
instantiation of "Element *google::protobuf::RepeatedPtrField::Add() [with Element=tensorflow::ResourceHandleProto]"
C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build\tensorflow/core/framework/tensor.pb.h(1091): here

C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/protobuf/src/protobuf/src\google/protobuf/arena.h(719): error : more than one instance of overloaded function "google::protobuf::Arena::CreateMessageInternal" matches the argument list: [C:\g\tensorflow\tensorflow\tensorflow\contrib\cmake\build\tf_core_gpu_kernels.vcxproj]
function template "T *google::protobuf::Arena::CreateMessageInternal(google::protobuf::Arena *)"
function template "T *google::protobuf::Arena::CreateMessageInternal<T,Args...>(Args &&...)"
argument types are: (google::protobuf::Arena *)
detected during:
instantiation of "Msg *google::protobuf::Arena::CreateMaybeMessage(google::protobuf::Arena *, google::protobuf::internal::true_type) [with Msg=tensorflow::VariantTensorDataProto]"
(729): here
instantiation of "T *google::protobuf::Arena::CreateMaybeMessage(google::protobuf::Arena *) [with T=tensorflow::VariantTensorDataProto]"
C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/protobuf/src/protobuf/src\google/protobuf/repeated_field.h(648): here
instantiation of "GenericType *google::protobuf::internal::GenericTypeHandler::New(google::protobuf::Arena *) [with GenericType=tensorflow::VariantTensorDataProto]"
C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/protobuf/src/protobuf/src\google/protobuf/repeated_field.h(675): here
instantiation of "GenericType *google::protobuf::internal::GenericTypeHandler::NewFromPrototype(const GenericType *, google::protobuf::Arena *) [with GenericType=tensorflow::VariantTensorDataProto]"
C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/protobuf/src/protobuf/src\google/protobuf/repeated_field.h(1554): here
instantiation of "TypeHandler::Type *google::protobuf::internal::RepeatedPtrFieldBase::Add(TypeHandler::Type *) [with TypeHandler=google::protobuf::RepeatedPtrField::TypeHandler]"
C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/protobuf/src/protobuf/src\google/protobuf/repeated_field.h(2001): here
instantiation of "Element *google::protobuf::RepeatedPtrField::Add() [with Element=tensorflow::VariantTensorDataProto]"
C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build\tensorflow/core/framework/tensor.pb.h(1121): here

C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/protobuf/src/protobuf/src\google/protobuf/arena.h(719): error : more than one instance of overloaded function "google::protobuf::Arena::CreateMessageInternal" matches the argument list: [C:\g\tensorflow\tensorflow\tensorflow\contrib\cmake\build\tf_core_gpu_kernels.vcxproj]
function template "T *google::protobuf::Arena::CreateMessageInternal(google::protobuf::Arena *)"
function template "T *google::protobuf::Arena::CreateMessageInternal<T,Args...>(Args &&...)"
argument types are: (google::protobuf::Arena *)
detected during:
instantiation of "Msg *google::protobuf::Arena::CreateMaybeMessage(google::protobuf::Arena *, google::protobuf::internal::true_type) [with Msg=tensorflow::TensorProto]"
(729): here
instantiation of "T *google::protobuf::Arena::CreateMaybeMessage(google::protobuf::Arena *) [with T=tensorflow::TensorProto]"
C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/protobuf/src/protobuf/src\google/protobuf/repeated_field.h(648): here
instantiation of "GenericType *google::protobuf::internal::GenericTypeHandler::New(google::protobuf::Arena *) [with GenericType=tensorflow::TensorProto]"
C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/protobuf/src/protobuf/src\google/protobuf/repeated_field.h(675): here
instantiation of "GenericType *google::protobuf::internal::GenericTypeHandler::NewFromPrototype(const GenericType *, google::protobuf::Arena *) [with GenericType=tensorflow::TensorProto]"
C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/protobuf/src/protobuf/src\google/protobuf/repeated_field.h(1554): here
instantiation of "TypeHandler::Type *google::protobuf::internal::RepeatedPtrFieldBase::Add(TypeHandler::Type *) [with TypeHandler=google::protobuf::RepeatedPtrField::TypeHandler]"
C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/protobuf/src/protobuf/src\google/protobuf/repeated_field.h(2001): here
instantiation of "Element *google::protobuf::RepeatedPtrField::Add() [with Element=tensorflow::TensorProto]"
C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build\tensorflow/core/framework/tensor.pb.h(1365): here

4 errors detected in the compilation of "C:/Users/user/AppData/Local/Temp/tmpxft_0002aa34_00000000-12_adjust_contrast_op_gpu.cu.compute_52.cpp1.ii".

@tensorflowbutler

Member

tensorflowbutler commented Feb 17, 2018

Thank you for your post. We noticed you have not filled out the following fields in the issue template. Could you update them if they are relevant in your case, or leave them as N/A? Thanks.
Have I written custom code
OS Platform and Distribution
TensorFlow installed from
TensorFlow version
Bazel version
CUDA/cuDNN version
GPU model and memory
Exact command to reproduce

@apiszcz

apiszcz commented Feb 17, 2018

:: - MSVC Community 2015
:: - ANACONDA 4.4.4 (Python 3.5.5)
:: - CMake 3.10.2
:: - SWIG 3.0.12
:: - GIT 2.15.1.windows.2
:: - NVIDIA CUDA 9.1, CUDNN 7.05
:: No BAZEL
:: TensorFlow version (latest from GIT repository)

@ningzhihui

ningzhihui commented Feb 17, 2018

I get the same error. Can you help me?

OS Platform and Distribution windows8.2 msys2
TensorFlow installed from source
TensorFlow version r1.6
Bazel version 0.10.0
CUDA/cuDNN version cuda 9.1 cuDNN 7.0
GPU model and memory NVIDIA GeForce GTX 850
Exact command to reproduce
bazel build -c opt --action_env=USE_MSVC_WRAPPER=1 --config=win-cuda //tensorflow/tools/pip_package:build_pip_package --verbose_failures

@apiszcz

apiszcz commented Feb 17, 2018

After a few more trials, I am seeing that error four times:

more than one instance of overloaded function "google::protobuf::Arena::CreateMessageInternal"

Here is the relevant section of arena.h. Line 719, called out in the error message, is the first return statement below:

  // CreateMessage<T> requires that T supports arenas, but this private method
  // works whether or not T supports arenas. These are not exposed to user code
  // as it can cause confusing API usages, and end up having double free in
  // user code. These are used only internally from LazyField and Repeated
  // fields, since they are designed to work in all mode combinations.
  template <typename Msg> GOOGLE_PROTOBUF_ATTRIBUTE_ALWAYS_INLINE
  static Msg* CreateMaybeMessage(Arena* arena, google::protobuf::internal::true_type) {
    return CreateMessageInternal<Msg>(arena);
  }

  template <typename T> GOOGLE_PROTOBUF_ATTRIBUTE_ALWAYS_INLINE
  static T* CreateMaybeMessage(Arena* arena, google::protobuf::internal::false_type) {
    return CreateInternal<T>(arena);
  }

  template <typename T> GOOGLE_PROTOBUF_ATTRIBUTE_ALWAYS_INLINE
  static T* CreateMaybeMessage(Arena* arena) {
    return CreateMaybeMessage<T>(arena, is_arena_constructable<T>());
  }

  // Just allocate the required size for the given type assuming the
  // type has a trivial constructor.
  template<typename T> GOOGLE_PROTOBUF_ATTRIBUTE_ALWAYS_INLINE
  T* CreateInternalRawArray(size_t num_elements) {
    GOOGLE_CHECK_LE(num_elements,
             std::numeric_limits<size_t>::max() / sizeof(T))
        << "Requested size is too large to fit into size_t.";
    const size_t n = internal::AlignUpTo8(sizeof(T) * num_elements);
    // Monitor allocation if needed.
    AllocHook(RTTI_TYPE_ID(T), n);
    return static_cast<T*>(impl_.AllocateAligned(n));
  }


C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/protobuf/src/protobuf/src\google/protobuf/arena.h(719): error : more than one instance of overloaded function "google::protobuf::Arena::CreateMessageInternal" matches the argument list: [C:\g\tensorflow\tensorflow\tensorflow\contrib\cmake\build\tf_core_gpu_kernels.vcxproj]
C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/protobuf/src/protobuf/src\google/protobuf/arena.h(719): error : more than one instance of overloaded function "google::protobuf::Arena::CreateMessageInternal" matches the argument list: [C:\g\tensorflow\tensorflow\tensorflow\contrib\cmake\build\tf_core_gpu_kernels.vcxproj]
C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/protobuf/src/protobuf/src\google/protobuf/arena.h(719): error : more than one instance of overloaded function "google::protobuf::Arena::CreateMessageInternal" matches the argument list: [C:\g\tensorflow\tensorflow\tensorflow\contrib\cmake\build\tf_core_gpu_kernels.vcxproj]
C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/protobuf/src/protobuf/src\google/protobuf/arena.h(719): error : more than one instance of overloaded function "google::protobuf::Arena::CreateMessageInternal" matches the argument list: [C:\g\tensorflow\tensorflow\tensorflow\contrib\cmake\build\tf_core_gpu_kernels.vcxproj]

@apiszcz

apiszcz commented Feb 17, 2018

The four templates for CreateMessageInternal appear at lines 852-881 in arena.h.

There are four argument signatures:
854:(Args&&... args)
862:()
869:(const Arg& arg)
877:(const Arg1& arg1, const Arg2& arg2)
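To make the two candidates named in the error message concrete, here is a minimal sketch of that overload shape. Everything in it is a hypothetical stand-in rather than the real protobuf code: the `Arena` struct, the integer return tags, `FakeProto`, and `which_overload` are assumptions for illustration, and both templates are made static for simplicity. I would expect a conforming host compiler to prefer the non-variadic template through partial ordering, so this sketch is not a reproduction of the nvcc error.

```cpp
// Hypothetical stand-in for google::protobuf::Arena; only the overload
// shape quoted in the error message is modelled here.
struct Arena {
  // Candidate 1: takes the arena pointer directly.
  template <typename T>
  static int CreateMessageInternal(Arena* /*arena*/) {
    return 1;  // tag: non-variadic overload selected
  }

  // Candidate 2: variadic forwarding overload.
  template <typename T, typename... Args>
  static int CreateMessageInternal(Args&&... /*args*/) {
    return 2;  // tag: variadic overload selected
  }

  // Mirrors the shape of the call on arena.h line 719.
  template <typename Msg>
  static int CreateMaybeMessage(Arena* arena) {
    return CreateMessageInternal<Msg>(arena);
  }
};

// Hypothetical message type standing in for tensorflow::TensorProto.
struct FakeProto {};

// Returns the tag of the overload the compiler selected.
int which_overload() {
  Arena arena;
  return Arena::CreateMaybeMessage<FakeProto>(&arena);
}
```

Both candidates accept an `Arena*` argument with an exact match, so the choice comes down to which template is more specialized. If a front end skips or mishandles that step, it would report the ambiguity shown in the log.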

@apiszcz

apiszcz commented Feb 17, 2018

File: tf_core_gpu_kernels.vcxproj, code block at lines 235-247:

cd C:\g\tensorflow\tensorflow\tensorflow\contrib\cmake\build\CMakeFiles\tf_core_gpu_kernels.dir\__\__\core\kernels
if %errorlevel% neq 0 goto :cmEnd
C:
if %errorlevel% neq 0 goto :cmEnd
"C:\Program Files\CMake\bin\cmake.exe" -E make_directory C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/CMakeFiles/tf_core_gpu_kernels.dir/__/__/core/kernels/$(Configuration)
if %errorlevel% neq 0 goto :cmEnd
"C:\Program Files\CMake\bin\cmake.exe" -D verbose:BOOL=OFF -D "CCBIN:PATH=$(VCInstallDir)bin" -D build_configuration:STRING=$(ConfigurationName) -D generated_file:STRING=C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/CMakeFiles/tf_core_gpu_kernels.dir/__/__/core/kernels/$(Configuration)/tf_core_gpu_kernels_generated_adjust_contrast_op_gpu.cu.cc.obj -D generated_cubin_file:STRING=C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/CMakeFiles/tf_core_gpu_kernels.dir/__/__/core/kernels/$(Configuration)/tf_core_gpu_kernels_generated_adjust_contrast_op_gpu.cu.cc.obj.cubin.txt -P C:/g/tensorflow/tensorflow/tensorflow/contrib/cmake/build/CMakeFiles/tf_core_gpu_kernels.dir/__/__/core/kernels/tf_core_gpu_kernels_generated_adjust_contrast_op_gpu.cu.cc.obj.Debug.cmake
if %errorlevel% neq 0 goto :cmEnd
:cmEnd
endlocal & call :cmErrorLevel %errorlevel% & goto :cmDone
:cmErrorLevel
exit /b %1
:cmDone
@ningzhihui

ningzhihui commented Feb 17, 2018

Is there a conflict between the Bazel version and TensorFlow?
This build was done on Windows 8. Is there GPU support?

c:\temp_bazel_nin\aki74rxt\execroot\org_tensorflow\external\eigen_archive\eigen\src/Core/ArrayWrapper.h(94): warning: __declspec attributes ignored

external/protobuf_archive/src\google/protobuf/arena_impl.h(57): warning: integer conversion resulted in a change of sign

external/protobuf_archive/src\google/protobuf/arena_impl.h(304): warning: integer conversion resulted in a change of sign

external/protobuf_archive/src\google/protobuf/arena_impl.h(305): warning: integer conversion resulted in a change of sign

external/protobuf_archive/src\google/protobuf/map.h(1030): warning: invalid friend declaration

external/protobuf_archive/src\google/protobuf/map_entry_lite.h(151): warning: exception specification for virtual function "google::protobuf::internal::MapEntryImpl<Derived, Base, Key, Value, kKeyFieldType, kValueFieldType, default_enum_value>::~MapEntryImpl [with Derived=T, Base=google::protobuf::MessageLite, Key=Key, Value=Value, kKeyFieldType=kKeyFieldType, kValueFieldType=kValueFieldType, default_enum_value=default_enum_value]" is incompatible with that of overridden function "google::protobuf::MessageLite::~MessageLite"

external/protobuf_archive/src\google/protobuf/arena.h(719): error: more than one instance of overloaded function "google::protobuf::Arena::CreateMessageInternal" matches the argument list:
function template "T *google::protobuf::Arena::CreateMessageInternal(google::protobuf::Arena *)"
function template "T *google::protobuf::Arena::CreateMessageInternal<T,Args...>(Args &&...)"
argument types are: (google::protobuf::Arena *)
detected during:
instantiation of "Msg *google::protobuf::Arena::CreateMaybeMessage(google::protobuf::Arena *, google::protobuf::internal::true_type) [with Msg=tensorflow::TensorShapeProto_Dim]"
(729): here
instantiation of "T *google::protobuf::Arena::CreateMaybeMessage(google::protobuf::Arena *) [with T=tensorflow::TensorShapeProto_Dim]"
external/protobuf_archive/src\google/protobuf/repeated_field.h(648): here
instantiation of "GenericType *google::protobuf::internal::GenericTypeHandler::New(google::protobuf::Arena *) [with GenericType=tensorflow::TensorShapeProto_Dim]"
external/protobuf_archive/src\google/protobuf/repeated_field.h(675): here
instantiation of "GenericType *google::protobuf::internal::GenericTypeHandler::NewFromPrototype(const GenericType *, google::protobuf::Arena *) [with GenericType=tensorflow::TensorShapeProto_Dim]"
external/protobuf_archive/src\google/protobuf/repeated_field.h(1554): here
instantiation of "TypeHandler::Type *google::protobuf::internal::RepeatedPtrFieldBase::Add(TypeHandler::Type *) [with TypeHandler=google::protobuf::RepeatedPtrField<tensorflow::TensorShapeProto_Dim>::TypeHandler]"
external/protobuf_archive/src\google/protobuf/repeated_field.h(2001): here
instantiation of "Element *google::protobuf::RepeatedPtrField::Add() [with Element=tensorflow::TensorShapeProto_Dim]"
bazel-out/x64_windows-py3-opt/genfiles\tensorflow/core/framework/tensor_shape.pb.h(471): here

external/protobuf_archive/src\google/protobuf/arena.h(719): error: more than one instance of overloaded function "google::protobuf::Arena::CreateMessageInternal" matches the argument list:
function template "T *google::protobuf::Arena::CreateMessageInternal(google::protobuf::Arena *)"
function template "T *google::protobuf::Arena::CreateMessageInternal<T,Args...>(Args &&...)"
argument types are: (google::protobuf::Arena *)
detected during:
instantiation of "Msg *google::protobuf::Arena::CreateMaybeMessage(google::protobuf::Arena *, google::protobuf::internal::true_type) [with Msg=tensorflow::ResourceHandleProto]"
(729): here
instantiation of "T *google::protobuf::Arena::CreateMaybeMessage(google::protobuf::Arena *) [with T=tensorflow::ResourceHandleProto]"
external/protobuf_archive/src\google/protobuf/repeated_field.h(648): here
instantiation of "GenericType *google::protobuf::internal::GenericTypeHandler::New(google::protobuf::Arena *) [with GenericType=tensorflow::ResourceHandleProto]"
external/protobuf_archive/src\google/protobuf/repeated_field.h(675): here
instantiation of "GenericType *google::protobuf::internal::GenericTypeHandler::NewFromPrototype(const GenericType *, google::protobuf::Arena *) [with GenericType=tensorflow::ResourceHandleProto]"
external/protobuf_archive/src\google/protobuf/repeated_field.h(1554): here
instantiation of "TypeHandler::Type *google::protobuf::internal::RepeatedPtrFieldBase::Add(TypeHandler::Type *) [with TypeHandler=google::protobuf::RepeatedPtrField<tensorflow::ResourceHandleProto>::TypeHandler]"
external/protobuf_archive/src\google/protobuf/repeated_field.h(2001): here
instantiation of "Element *google::protobuf::RepeatedPtrField::Add() [with Element=tensorflow::ResourceHandleProto]"
bazel-out/x64_windows-py3-opt/genfiles\tensorflow/core/framework/tensor.pb.h(1091): here

external/protobuf_archive/src\google/protobuf/arena.h(719): error: more than one instance of overloaded function "google::protobuf::Arena::CreateMessageInternal" matches the argument list:
function template "T *google::protobuf::Arena::CreateMessageInternal(google::protobuf::Arena *)"
function template "T *google::protobuf::Arena::CreateMessageInternal<T,Args...>(Args &&...)"
argument types are: (google::protobuf::Arena *)
detected during:
instantiation of "Msg *google::protobuf::Arena::CreateMaybeMessage(google::protobuf::Arena *, google::protobuf::internal::true_type) [with Msg=tensorflow::VariantTensorDataProto]"
(729): here
instantiation of "T *google::protobuf::Arena::CreateMaybeMessage(google::protobuf::Arena *) [with T=tensorflow::VariantTensorDataProto]"
external/protobuf_archive/src\google/protobuf/repeated_field.h(648): here
instantiation of "GenericType *google::protobuf::internal::GenericTypeHandler::New(google::protobuf::Arena *) [with GenericType=tensorflow::VariantTensorDataProto]"
external/protobuf_archive/src\google/protobuf/repeated_field.h(675): here
instantiation of "GenericType *google::protobuf::internal::GenericTypeHandler::NewFromPrototype(const GenericType *, google::protobuf::Arena *) [with GenericType=tensorflow::VariantTensorDataProto]"
external/protobuf_archive/src\google/protobuf/repeated_field.h(1554): here
instantiation of "TypeHandler::Type *google::protobuf::internal::RepeatedPtrFieldBase::Add(TypeHandler::Type *) [with TypeHandler=google::protobuf::RepeatedPtrField<tensorflow::VariantTensorDataProto>::TypeHandler]"
external/protobuf_archive/src\google/protobuf/repeated_field.h(2001): here
instantiation of "Element *google::protobuf::RepeatedPtrField::Add() [with Element=tensorflow::VariantTensorDataProto]"
bazel-out/x64_windows-py3-opt/genfiles\tensorflow/core/framework/tensor.pb.h(1121): here

external/protobuf_archive/src\google/protobuf/arena.h(719): error: more than one instance of overloaded function "google::protobuf::Arena::CreateMessageInternal" matches the argument list:
function template "T *google::protobuf::Arena::CreateMessageInternal(google::protobuf::Arena *)"
function template "T *google::protobuf::Arena::CreateMessageInternal<T,Args...>(Args &&...)"
argument types are: (google::protobuf::Arena *)
detected during:
instantiation of "Msg *google::protobuf::Arena::CreateMaybeMessage(google::protobuf::Arena *, google::protobuf::internal::true_type) [with Msg=tensorflow::TensorProto]"
(729): here
instantiation of "T *google::protobuf::Arena::CreateMaybeMessage(google::protobuf::Arena *) [with T=tensorflow::TensorProto]"
external/protobuf_archive/src\google/protobuf/repeated_field.h(648): here
instantiation of "GenericType *google::protobuf::internal::GenericTypeHandler::New(google::protobuf::Arena *) [with GenericType=tensorflow::TensorProto]"
external/protobuf_archive/src\google/protobuf/repeated_field.h(675): here
instantiation of "GenericType *google::protobuf::internal::GenericTypeHandler::NewFromPrototype(const GenericType *, google::protobuf::Arena *) [with GenericType=tensorflow::TensorProto]"
external/protobuf_archive/src\google/protobuf/repeated_field.h(1554): here
instantiation of "TypeHandler::Type *google::protobuf::internal::RepeatedPtrFieldBase::Add(TypeHandler::Type *) [with TypeHandler=google::protobuf::RepeatedPtrField<tensorflow::TensorProto>::TypeHandler]"
external/protobuf_archive/src\google/protobuf/repeated_field.h(2001): here
instantiation of "Element *google::protobuf::RepeatedPtrField::Add() [with Element=tensorflow::TensorProto]"
bazel-out/x64_windows-py3-opt/genfiles\tensorflow/core/framework/tensor.pb.h(1365): here

4 errors detected in the compilation of "C:/Users/NIN1/AppData/Local/Temp/nvcc_inter_files_tmp_dir/extract_image_patches_op_gpu.cu.compute_52.cpp1.ii".
Target //tensorflow/tools/pip_package:build_pip_package failed to build
ERROR: F:/machinelearning/tensorflow/tensorflow/tools/pip_package/BUILD:53:1 C++ compilation of rule '//tensorflow/core/kernels:extract_image_patches_op_gpu' failed (Exit 1): msvc_cl.bat failed: error executing command
cd C:/temp/_bazel_ningzhihui/aki74rxt/execroot/org_tensorflow
SET CUDA_COMPUTE_CAPABILITIE=None
SET CUDA_PATH=F:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v9.1
SET CUDA_TOOLKIT_PATH=F:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v9.1
SET CUDNN_INSTALL_PATH=F:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v9.1
SET INCLUDE=E:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\INCLUDE;E:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\ATLMFC\INCLUDE;C:\Program Files (x86)\Windows Kits\10\include\10.0.10240.0\ucrt;C:\Program Files (x86)\Windows Kits\NETFXSDK\4.6.1\include\um;C:\Program Files (x86)\Windows Kits\8.1\include\shared;C:\Program Files (x86)\Windows Kits\8.1\include\um;C:\Program Files (x86)\Windows Kits\8.1\include\winrt;
SET LIB=E:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\LIB\amd64;E:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\ATLMFC\LIB\amd64;C:\Program Files (x86)\Windows Kits\10\lib\10.0.10240.0\ucrt\x64;C:\Program Files (x86)\Windows Kits\NETFXSDK\4.6.1\lib\um\x64;C:\Program Files (x86)\Windows Kits\8.1\lib\winv6.3\um\x64;
SET NO_WHOLE_ARCHIVE_OPTION=1
SET PATH=F:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v9.1/bin;E:\Program Files (x86)\Microsoft Visual Studio 14.0\Common7\IDE\CommonExtensions\Microsoft\TestWindow;E:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\amd64;C:\windows\Microsoft.NET\Framework64\v4.0.30319;E:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\VCPackages;E:\Program Files (x86)\Microsoft Visual Studio 14.0\Common7\IDE;E:\Program Files (x86)\Microsoft Visual Studio 14.0\Common7\Tools;E:\Program Files (x86)\Microsoft Visual Studio 14.0\Team Tools\Performance Tools\x64;E:\Program Files (x86)\Microsoft Visual Studio 14.0\Team Tools\Performance Tools;C:\Program Files (x86)\Windows Kits\8.1\bin\x64;C:\Program Files (x86)\Windows Kits\8.1\bin\x86;C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.6.1 Tools\x64;;C:\windows\system32
SET PWD=/proc/self/cwd
SET PYTHON_BIN_PATH=E:/Python/Python35/python.exe
SET PYTHON_LIB_PATH=E:/Python/Python35/lib/site-packages
SET TEMP=C:\Users\NIN~1\AppData\Local\Temp
SET TF_CUDA_CLANG=0
SET TF_CUDA_COMPUTE_CAPABILITIES=5.0
SET TF_CUDA_VERSION=9.1
SET TF_CUDNN_VERSION=7
SET TF_NEED_CUDA=1
SET TF_NEED_OPENCL_SYCL=0
SET TMP=C:\Users\NIN~1\AppData\Local\Temp
SET USE_MSVC_WRAPPER=1
external/local_config_cc/wrapper/bin/msvc_cl.bat /c tensorflow/core/kernels/ex tract_image_patches_op_gpu.cu.cc /Fobazel-out/x64_windows-py3-opt/bin/tensorflow /core/kernels/objs/extract_image_patches_op_gpu/tensorflow/core/kernels/extract image_patches_op_gpu.cu.o /nologo /DCOMPILER_MSVC /DNOMINMAX /D_WIN32_WINNT=0x0 600 /D_CRT_SECURE_NO_DEPRECATE /D_CRT_SECURE_NO_WARNINGS /D_SILENCE_STDEXT_HASH DEPRECATION_WARNINGS /bigobj /Zm500 /J /Gy /GF /EHsc /wd4351 /wd4291 /wd4250 /wd 4996 -Xcompilation-mode=opt -DGEMMLOWP_ALLOW_SLOW_SCALAR_FALLBACK -w /I. /Ibazel -out/x64_windows-py3-opt/genfiles /Iexternal/nsync /Ibazel-out/x64_windows-py3-o pt/genfiles/external/nsync /Iexternal/bazel_tools /Ibazel-out/x64_windows-py3-op t/genfiles/external/bazel_tools /Iexternal/eigen_archive /Ibazel-out/x64_windows -py3-opt/genfiles/external/eigen_archive /Iexternal/local_config_sycl /Ibazel-ou t/x64_windows-py3-opt/genfiles/external/local_config_sycl /Iexternal/com_google absl /Ibazel-out/x64_windows-py3-opt/genfiles/external/com_google_absl /Iexterna l/gif_archive /Ibazel-out/x64_windows-py3-opt/genfiles/external/gif_archive /Iex ternal/jpeg /Ibazel-out/x64_windows-py3-opt/genfiles/external/jpeg /Iexternal/pr otobuf_archive /Ibazel-out/x64_windows-py3-opt/genfiles/external/protobuf_archiv e /Iexternal/com_googlesource_code_re2 /Ibazel-out/x64_windows-py3-opt/genfiles/ external/com_googlesource_code_re2 /Iexternal/farmhash_archive /Ibazel-out/x64_w indows-py3-opt/genfiles/external/farmhash_archive /Iexternal/fft2d /Ibazel-out/x 64_windows-py3-opt/genfiles/external/fft2d /Iexternal/highwayhash /Ibazel-out/x6 4_windows-py3-opt/genfiles/external/highwayhash /Iexternal/png_archive /Ibazel-o ut/x64_windows-py3-opt/genfiles/external/png_archive /Iexternal/zlib_archive /Ib azel-out/x64_windows-py3-opt/genfiles/external/zlib_archive /Iexternal/snappy /I bazel-out/x64_windows-py3-opt/genfiles/external/snappy /Iexternal/local_config_c uda 
/Ibazel-out/x64_windows-py3-opt/genfiles/external/local_config_cuda /Iextern al/nsync/public /Ibazel-out/x64_windows-py3-opt/genfiles/external/nsync/public / Iexternal/bazel_tools/tools/cpp/gcc3 /Iexternal/eigen_archive /Ibazel-out/x64_wi ndows-py3-opt/genfiles/external/eigen_archive /Iexternal/gif_archive/lib /Ibazel -out/x64_windows-py3-opt/genfiles/external/gif_archive/lib /Iexternal/gif_archiv e/windows /Ibazel-out/x64_windows-py3-opt/genfiles/external/gif_archive/windows /Iexternal/protobuf_archive/src /Ibazel-out/x64_windows-py3-opt/genfiles/externa l/protobuf_archive/src /Iexternal/farmhash_archive/src /Ibazel-out/x64_windows-p y3-opt/genfiles/external/farmhash_archive/src /Iexternal/png_archive /Ibazel-out /x64_windows-py3-opt/genfiles/external/png_archive /Iexternal/zlib_archive /Ibaz el-out/x64_windows-py3-opt/genfiles/external/zlib_archive /Iexternal/local_confi g_cuda/cuda /Ibazel-out/x64_windows-py3-opt/genfiles/external/local_config_cuda/ cuda /Iexternal/local_config_cuda/cuda/cuda/include /Ibazel-out/x64_windows-py3- opt/genfiles/external/local_config_cuda/cuda/cuda/include /Iexternal/local_confi g_cuda/cuda/cuda/include/crt /Ibazel-out/x64_windows-py3-opt/genfiles/external/l ocal_config_cuda/cuda/cuda/include/crt /DEIGEN_MPL2_ONLY /D__CLANG_SUPPORT_DYN_A NNOTATION__ /DTENSORFLOW_USE_ABSL /DTF_USE_SNAPPY /showIncludes /MD /O2 /DNDEBUG -x cuda -DGOOGLE_CUDA=1 -nvcc_options=relaxed-constexpr -nvcc_options=ftz=true -DGOOGLE_CUDA=1 -DTENSORFLOW_MONOLITHIC_BUILD /D__VERSION__="MSVC" /DPLATFORM_WI NDOWS /DEIGEN_HAS_C99_MATH /DTENSORFLOW_USE_EIGEN_THREADPOOL /DEIGEN_AVOID_STL_A RRAY /Iexternal/gemmlowp /wd4018 /U_HAS_EXCEPTIONS /D_HAS_EXCEPTIONS=1 /EHsc /DN OGDI /DTF_COMPILE_LIBRARY
INFO: Elapsed time: 1063.446s, Critical Path: 244.98s
FAILED: Build did NOT complete successfully

@smoothdvd

smoothdvd commented Feb 18, 2018

Same error here. macOS: 10.13.2, cuda: 9.1, cudnn: 7.0.5

@ghost

ghost commented Feb 18, 2018

I am seeing the same error. macOS: 10.13.3, cuda: 9.1, cudnn: 7.0.5

I tried to reduce arena.h to something that results in a similar error, but I have not succeeded: https://gist.github.com/dtrebbien/cf50157fe61a59d2a98d780bd2c92de6

What I did to work around this issue is edit arena.h, changing line 666 to rename CreateMessageInternal to CreateMessageInternal_ and changing line 719 to call CreateMessageInternal_ instead. These changes appear to fix the compilation error; however, I am encountering another error:

INFO: From Compiling tensorflow/contrib/image/kernels/image_ops_gpu.cu.cc:
external/eigen_archive/unsupported/Eigen/CXX11/../../../Eigen/src/Core/arch/CUDA/Half.h(508): error: explicit specialization of class "std::__1::numeric_limits" must precede its first use (
(388): here)

smoothdvd commented Feb 18, 2018

@dtrebbien Yes, when I build TensorFlow 1.5, I get the same error at Eigen/src/Core/arch/CUDA/Half.h(508).

ghost commented Feb 18, 2018

The "explicit specialization" error is puzzling. The error message says that the explicit specialization of std::numeric_limits (for Eigen::half) must precede where it's used. That makes sense. However, line 388 of Half.h is the return line in:

EIGEN_STRONG_INLINE EIGEN_DEVICE_FUNC bool (isfinite)(const half& a) {
  return !(isinf EIGEN_NOT_A_MACRO (a)) && !(isnan EIGEN_NOT_A_MACRO (a));
}

I don't see how that is using std::numeric_limits<Eigen::half>.

apiszcz commented Feb 18, 2018

There are four copies of arena.h in the tree, and all four md5sums differ. Locations, then checksums:

./tensorflow/contrib/cmake/build/grpc/src/grpc/src/core/lib/support/arena.h
./tensorflow/contrib/cmake/build/grpc/src/grpc/third_party/protobuf/src/google/protobuf/arena.h
./tensorflow/contrib/cmake/build/protobuf/src/protobuf/src/google/protobuf/arena.h
./tensorflow/core/lib/core/arena.h

75d243f7510cc93ab28c088cb9602a0b *./tensorflow/contrib/cmake/build/grpc/src/grpc/src/core/lib/support/arena.h
66acc2ebe4d1d831f460330d4424d0d4 *./tensorflow/contrib/cmake/build/grpc/src/grpc/third_party/protobuf/src/google/protobuf/arena.h
768863b9a7d853bff47c2bea50aaaaa9 *./tensorflow/contrib/cmake/build/protobuf/src/protobuf/src/google/protobuf/arena.h
d78b033cb70a006c012a02690a0d46a1 *./tensorflow/core/lib/core/arena.h
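
For anyone who wants to repeat this comparison without find and md5sum (on Windows, for example), here is a minimal Python sketch. The function names are made up for illustration and are not part of TensorFlow or protobuf. It only groups files named arena.h by MD5 digest. Note that the four files listed above belong to three different projects (gRPC, protobuf, TensorFlow), so differing digests are expected for some of them.

```python
import hashlib
import os
from collections import defaultdict


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_copies_by_md5(root, filename="arena.h"):
    """Map each MD5 digest to the sorted list of matching files under root."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        if filename in filenames:
            path = os.path.join(dirpath, filename)
            groups[md5_of_file(path)].append(path)
    return {digest: sorted(paths) for digest, paths in groups.items()}
```

Calling group_copies_by_md5(".") from the top of the checkout returns one dictionary entry per distinct digest; four entries would match the listing above.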

ghost commented Feb 18, 2018

The workaround that I mentioned in a previous comment was half-baked. I have created a fork of protobuf containing all of the changes that appear to be necessary. To use this fork, I applied the following patch to tensorflow/workspace.bzl:

--- a/tensorflow/workspace.bzl
+++ b/tensorflow/workspace.bzl
@@ -353,11 +353,11 @@ def tf_workspace(path_prefix="", tf_repo_name=""):
   tf_http_archive(
       name = "protobuf_archive",
       urls = [
-          "https://mirror.bazel.build/github.com/google/protobuf/archive/396336eb961b75f03b25824fe86cf6490fb75e3a.tar.gz",
-          "https://github.com/google/protobuf/archive/396336eb961b75f03b25824fe86cf6490fb75e3a.tar.gz",
+          "https://mirror.bazel.build/github.com/dtrebbien/protobuf/archive/50f552646ba1de79e07562b41f3999fe036b4fd0.tar.gz",
+          "https://github.com/dtrebbien/protobuf/archive/50f552646ba1de79e07562b41f3999fe036b4fd0.tar.gz",
       ],
-      sha256 = "846d907acf472ae233ec0882ef3a2d24edbbe834b80c305e867ac65a1f2c59e3",
-      strip_prefix = "protobuf-396336eb961b75f03b25824fe86cf6490fb75e3a",
+      sha256 = "eb16b33431b91fe8cee479575cee8de202f3626aaf00d9bf1783c6e62b4ffbc7",
+      strip_prefix = "protobuf-50f552646ba1de79e07562b41f3999fe036b4fd0",
   )
 
   # We need to import the protobuf library under the names com_google_protobuf
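
When pointing tf_http_archive at a different commit, the urls, sha256 and strip_prefix fields all have to change together. The Python sketch below shows one way to derive them. The helper names are hypothetical, and the URL and prefix patterns are simply read off the patch above, not taken from any Bazel documentation. Whether mirror.bazel.build actually serves a given fork is not something this sketch can check.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a downloaded tarball."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def github_archive_fields(owner, repo, commit):
    """Build the urls and strip_prefix values for a GitHub commit tarball.

    The patterns mirror the entries in the patch above (an assumption,
    not an official rule).
    """
    tail = "github.com/{}/{}/archive/{}.tar.gz".format(owner, repo, commit)
    return {
        "urls": ["https://mirror.bazel.build/" + tail, "https://" + tail],
        "strip_prefix": "{}-{}".format(repo, commit),
    }
```

After downloading the tarball by hand, sha256_of_file gives the value for the sha256 field, and github_archive_fields("dtrebbien", "protobuf", "50f552646ba1de79e07562b41f3999fe036b4fd0") reproduces the two URLs and the strip_prefix shown in the patch.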

ghost commented Feb 19, 2018

I have created a pull request for eigen/eigen to fix the "explicit specialization of class "std::__1::numeric_limits" must precede its first use" error: https://bitbucket.org/eigen/eigen/pull-requests/369/

To use the patched eigen, I applied the following patch to tensorflow/workspace.bzl:

--- a/tensorflow/workspace.bzl
+++ b/tensorflow/workspace.bzl
@@ -120,11 +120,11 @@ def tf_workspace(path_prefix="", tf_repo_name=""):
   tf_http_archive(
       name = "eigen_archive",
       urls = [
-          "https://mirror.bazel.build/bitbucket.org/eigen/eigen/get/2355b229ea4c.tar.gz",
-          "https://bitbucket.org/eigen/eigen/get/2355b229ea4c.tar.gz",
+          "https://mirror.bazel.build/bitbucket.org/dtrebbien/eigen/get/374842a18727.tar.gz",
+          "https://bitbucket.org/dtrebbien/eigen/get/374842a18727.tar.gz",
       ],
-      sha256 = "0cadb31a35b514bf2dfd6b5d38205da94ef326ec6908fc3fd7c269948467214f",
-      strip_prefix = "eigen-eigen-2355b229ea4c",
+      sha256 = "fa26e9b9ff3a2692b092d154685ec88d6cb84d4e1e895006541aff8603f15c16",
+      strip_prefix = "dtrebbien-eigen-374842a18727",
       build_file = str(Label("//third_party:eigen.BUILD")),
   )
 
ningzhihui commented Feb 21, 2018

I am encountering a different error:

[958 / 973] Compiling tensorflow/core/kernels/conv_ops_3d.cc; 38028s local ... (4 actions running)
ERROR: F:/machinelearning/tensorflow/tensorflow/core/BUILD:762:1: C++ compilation of rule '//tensorflow/core:nn_grad' failed (Exit -1073741502): msvc_cl.bat failed: error executing command
cd E:/temp/_bazel_nin/aki74rxt/execroot/org_tensorflow
SET CUDA_COMPUTE_CAPABILITIE=None

Member

bignamehyp commented Feb 23, 2018

Does dtrebbien's fix work for you?

tralpha commented Feb 25, 2018

Thank you, @dtrebbien. This helped me a lot, even on Mac OS 10.13.3, tensorflow v1.6. I'm using CUDA 9.1 with cudnn 7.

marcionicolau commented Mar 5, 2018

Hi @dtrebbien and @tralpha, how did you solve the @rpath problem?
I built with these options:

bazel build --config=cuda --config=opt --verbose_failures --action_env PATH --action_env LD_LIBRARY_PATH --action_env DYLD_LIBRARY_PATH //tensorflow/tools/pip_package:build_pip_package

I'm using CUDA 9.1, cudnn 7 and Xcode 9.2 on MacOS 13.3.3.

ghost commented Mar 5, 2018

Do you mean something like:

dyld: Library not loaded: @rpath/libcudart.9.1.dylib

Take a look at this Stack Overflow answer: https://stackoverflow.com/a/40007947/196844
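
A dyld message of this form generally means that none of the run-path directories recorded in the binary contains the named library. Before changing link options, it can help to confirm where the library actually lives. The Python sketch below does that. The function name is hypothetical, and the directories in the usage note are assumptions about a typical CUDA install, not something this thread confirms for every machine.

```python
import os


def find_library(name, search_dirs):
    """Return the first existing path to name in search_dirs, or None."""
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if os.path.exists(candidate):
            return candidate
    return None
```

For example, find_library("libcudart.9.1.dylib", ["/usr/local/cuda/lib", "/usr/local/lib"]) returns the full path if the library is in either directory, and None otherwise.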

mdamic0 commented Mar 7, 2018

Hi @dtrebbien, I'm having a little trouble applying the two patches you posted and building. I get the following error:
ERROR: Analysis of target '//tensorflow/tools/pip_package:build_pip_package' failed; build aborted: error loading package 'tensorflow': Encountered error while reading extension file 'protobuf.bzl': no such package '@protobuf_archive//': java.io.IOException: Error downloading [https://mirror.bazel.build/github.com/dtrebbien/protobuf/archive/50f552646ba1de79e07562b41f3999fe036b4fd0.tar.gz, https://github.com/dtrebbien/protobuf/archive/50f552646ba1de79e07562b41f3999fe036b4fd0.tar.gz] to /private/var/tmp/_bazel_temp/737438342ae219a6c1e340312025fb82/external/protobuf_archive/50f552646ba1de79e07562b41f3999fe036b4fd0.tar.gz: All mirrors are down: []

I was able to pull down the tars from github and the checksums match. Any ideas? What commit ID did you build off of?
I'm using CUDA 9.1, cudnn 7, tensorflow 1.6, Xcode 8.2 and bazel 0.10.0 on MacOS 10.12.6. I'm trying to build off of c6a12c7, so the two patched sections of workspace.bzl seem to be around lines 127 and 361 for me.

Appreciate the help, it's nice to see others trying to get GPU support on Mac :)

Edit: it looks like I had a misconfigured proxy setting for the JDK.
I would still like to know which commit you were able to build successfully off of.

ghost commented Mar 7, 2018

I built off of TensorFlow commit 00ff491 with a few cherry-picked commits and local changes. Just now, I have committed my local changes and pushed this as a new branch to my fork; see https://github.com/dtrebbien/tensorflow/tree/tensorflow-17067-reference-branch

To use this branch:

git remote add dtrebbien-fork https://github.com/dtrebbien/tensorflow.git
git fetch dtrebbien-fork
git checkout -b tensorflow-17067-reference-branch dtrebbien-fork/tensorflow-17067-reference-branch

I am not sure that dtrebbien@993006fa764bbdecfee63f4ceead3d06a2821ce2 is needed; I merged it speculatively.

Here are the configure and build commands that I used:

PYTHON_BIN_PATH=/usr/local/bin/python2 \
USE_DEFAULT_PYTHON_LIB_PATH=1 \
TF_NEED_GCP=0 \
TF_NEED_HDFS=0 \
TF_NEED_S3=0 \
TF_NEED_KAFKA=0 \
TF_ENABLE_XLA=0 \
TF_NEED_GDR=0 \
TF_NEED_VERBS=0 \
TF_NEED_OPENCL_SYCL=0 \
TF_NEED_CUDA=1 \
TF_CUDA_VERSION=9.1 \
USE_DEFAULT_CUDA_TOOLKIT_PATH=1 \
TF_CUDNN_VERSION=7.0.5 \
USE_DEFAULT_CUDNN_INSTALL_PATH=1 \
TF_CUDA_COMPUTE_CAPABILITIES=3.0 \
TF_CUDA_CLANG=0 \
USE_DEFAULT_GCC_HOST_COMPILER_PATH=1 \
TF_NEED_MPI=0 \
USE_DEFAULT_CC_OPT_FLAGS=1 \
TF_SET_ANDROID_WORKSPACE=0 \
./configure

bazel build --config=opt --config=cuda --save_temps --explain=explain.txt --verbose_explanations --verbose_failures --linkopt=-Wl,-rpath,/usr/local/cuda/lib //tensorflow/tools/pip_package:build_pip_package

Note that I am building using Homebrew's Python 2. Also, I am running macOS 10.13.3 and Xcode 9.2.

You might see an error saying something like dyld: Library not loaded: @rpath/libcudart.9.1.dylib. If you do, follow the instructions in this Stack Overflow answer: https://stackoverflow.com/a/40007947/196844

My MacBook Pro has an NVIDIA GeForce GT 750M, which is CUDA compute capability 3.0. You might need to adjust the TF_CUDA_COMPUTE_CAPABILITIES configuration option if your Mac's NVIDIA GPU has a different compute capability.
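
If you already have any TensorFlow GPU build running on the machine, its startup log prints the compute capability of each device. A minimal Python sketch for pulling that value out of a log line follows. The function name and the regular expression are illustrative assumptions, and the sample string is copied from a device-mapping line posted later in this thread.

```python
import re

# Matches the device name and the "compute capability: X.Y" fragment
# of a TensorFlow device-mapping log line.
_DEVICE = re.compile(
    r"name: (?P<name>[^,]+), .*compute capability: (?P<cc>\d+\.\d+)"
)


def parse_device(line):
    """Return (device_name, compute_capability) or None if absent."""
    match = _DEVICE.search(line)
    if match is None:
        return None
    return match.group("name"), match.group("cc")


SAMPLE = (
    "/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, "
    "name: GeForce GTX 780M, pci bus id: 0000:01:00.0, compute capability: 3.0"
)
print(parse_device(SAMPLE))  # ('GeForce GTX 780M', '3.0')
```

The second element of the tuple is the value to pass as TF_CUDA_COMPUTE_CAPABILITIES.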

Disclaimer: I do not know if the binary produced by following these steps is stable. I have observed a SEGFAULT issue with my build, but I do not know if this is specific to my build or if this is a bug in TensorFlow: #9518 (comment)

ghost commented Mar 7, 2018

I just updated Homebrew and noticed that the 'python' formula is now Python 3, 'python2' is now Python 2, and the location of Homebrew Python 2 binaries is /usr/local/opt/python@2/bin.

marcionicolau commented Mar 8, 2018

@dtrebbien I tried the tricks from SO, but they didn't work, so I went with symbolic links in /usr/local/lib to the necessary CUDA libraries.

For the latest version, r1.6, I had to remove __align__(sizeof(T)), as described in TweakMind, because of alignment problems.

sed -i.bu 's/__align__(sizeof(T)) //g' tensorflow/core/kernels/depthwise_conv_op_gpu.cu.cc
sed -i.bu 's/__align__(sizeof(T)) //g' tensorflow/core/kernels/split_lib_gpu.cu.cc
sed -i.bu 's/__align__(sizeof(T)) //g' tensorflow/core/kernels/concat_lib_gpu_impl.cu.cc

After all that, I could compile, and a quick test shows that TF loads with the GPU. But with some real training I get a SEGFAULT. The dumps don't show any clue about the problem.

ghost commented Mar 8, 2018

Are you able to run any TensorFlow code, or does everything segfault?

I am not using the workaround of removing __align__(sizeof(T)). Instead, I use __align__(sizeof(T) > 16 ? sizeof(T) : 16) which will produce an error if T is ever more than 16 bytes; otherwise, the shared storage is always 16-byte aligned.
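
To make the effect of that expression concrete, here is a small Python sketch that evaluates it for a few element sizes. It only mirrors the arithmetic of the conditional. It says nothing about how nvcc treats the resulting value, and the function name is made up for illustration.

```python
import ctypes


def shared_alignment(element_size, minimum=16):
    """Mirror the C expression: sizeof(T) > 16 ? sizeof(T) : 16."""
    return element_size if element_size > minimum else minimum


# Sizes of a few common element types on the host, as reported by ctypes.
SIZES = {
    "float": ctypes.sizeof(ctypes.c_float),
    "double": ctypes.sizeof(ctypes.c_double),
    "int64": ctypes.sizeof(ctypes.c_int64),
}

for type_name, size in SIZES.items():
    print(type_name, size, shared_alignment(size))
```

Every type of 16 bytes or fewer maps to an alignment of 16; only a larger T would change the requested alignment.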

marcionicolau commented Mar 9, 2018

@dtrebbien I could run some initial steps, but I couldn't run any real code. When TF loads data onto the GPU, it shows the SEGFAULT message and aborts the run, even with simple computations on tensors.

P.S. Next week I will try the __align__ suggestion.

marcionicolau commented Mar 12, 2018

Latest update, @dtrebbien:

sed -i.bu 's/__align__(sizeof(T)) /__align__(sizeof(T) > 16 ? sizeof(T) : 16) /g' tensorflow/core/kernels/depthwise_conv_op_gpu.cu.cc
sed -i.bu 's/__align__(sizeof(T)) /__align__(sizeof(T) > 16 ? sizeof(T) : 16) /g' tensorflow/core/kernels/split_lib_gpu.cu.cc
sed -i.bu 's/__align__(sizeof(T)) /__align__(sizeof(T) > 16 ? sizeof(T) : 16) /g' tensorflow/core/kernels/concat_lib_gpu_impl.cu.cc
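
The sed commands above perform a literal text substitution. For readers without a BSD-style sed, the same edit can be sketched in Python. The sample declaration is hypothetical and only illustrates the before and after; check the actual lines in the three .cu.cc files before relying on it.

```python
OLD = "__align__(sizeof(T)) "
NEW = "__align__(sizeof(T) > 16 ? sizeof(T) : 16) "


def patch_alignment(source_text):
    """Apply the same literal substitution as the sed commands."""
    return source_text.replace(OLD, NEW)


# Hypothetical declaration, used only to show the effect.
SAMPLE = "extern __shared__ __align__(sizeof(T)) unsigned char smem[];"
print(patch_alignment(SAMPLE))
```

Because the replacement text no longer contains the original pattern, applying patch_alignment twice gives the same result as applying it once.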

During compilation, I saw this warning:

ld: warning: cannot export hidden symbol std::__1::__vector_base<tensorflow::graph_transforms::OpTypePattern, std::__1::allocator<tensorflow::graph_transforms::OpTypePattern> >::__destruct_at_end(tensorflow::graph_transforms::OpTypePattern*) from bazel-out/darwin-py3-opt/bin/tensorflow/tools/graph_transforms/libtransforms_lib.pic.lo(remove_nodes.pic.o)

Loading TensorFlow produces these messages:

2018-03-12 11:27:54.282823: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:859] OS X does not support NUMA - returning NUMA node zero
2018-03-12 11:27:54.282993: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212] Found device 0 with properties:
name: GeForce GTX 780M major: 3 minor: 0 memoryClockRate(GHz): 0.784
pciBusID: 0000:01:00.0
totalMemory: 4.00GiB freeMemory: 286.09MiB
2018-03-12 11:27:54.283018: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
2018-03-12 11:27:54.639034: I tensorflow/core/common_runtime/gpu/gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22 MB memory) -> physical GPU (device: 0, name: GeForce GTX 780M, pci bus id: 0000:01:00.0, compute capability: 3.0)
Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 780M, pci bus id: 0000:01:00.0, compute capability: 3.0
2018-03-12 11:27:54.649140: I tensorflow/core/common_runtime/direct_session.cc:297] Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 780M, pci bus id: 0000:01:00.0, compute capability: 3.0

With a simple example:

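import tensorflow as tf

# Pin the graph to the first GPU so the matmul exercises the CUDA build.
with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)

# TensorFlow 1.x session API: run the graph and print the 2x2 product.
with tf.Session() as sess:
    print(sess.run(c))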

The result:

2018-03-12 11:30:37.822237: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
2018-03-12 11:30:37.822477: I tensorflow/core/common_runtime/gpu/gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 11 MB memory) -> physical GPU (device: 0, name: GeForce GTX 780M, pci bus id: 0000:01:00.0, compute capability: 3.0)
[[22. 28.]
 [49. 64.]]

No more SEGFAULT!
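
As a sanity check that the GPU result is numerically right and not just crash-free, the same 2x3 by 3x2 product can be computed in plain Python with no TensorFlow dependency. This is only a reference calculation; the helper name is made up.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "inner dimensions must match"
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(len(b[0]))]
        for row in a
    ]


A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
B = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(matmul(A, B))  # [[22.0, 28.0], [49.0, 64.0]]
```

The values agree with the matrix printed by the session above.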

marcionicolau commented Mar 12, 2018

Trying to run mnist_mlp.py, I get a SEGFAULT.

To get more information, I ran:

sudo dtruss python mnist_mlp.py 2> err_mnist.txt

See attached err_mnist.txt.

ghost commented Mar 12, 2018

Hello @marcionicolau

Trying out mnist_mlp.py, I am also observing a SEGFAULT. Here is a backtrace:

Thread 26 Crashed:
0   libtensorflow_framework.so    	0x0000000102959c10 void tensorflow::gtl::InlinedVector::emplace_back(tensorflow::EventMgr::InUse const&&&) + 176
1   libtensorflow_framework.so    	0x000000010295902c tensorflow::EventMgr::PollEvents(bool, tensorflow::gtl::InlinedVector*) + 412
2   libtensorflow_framework.so    	0x00000001028f4022 tensorflow::EventMgr::ThenExecute(perftools::gputools::Stream*, std::__1::function) + 194
3   libtensorflow_framework.so    	0x00000001028f4bad tensorflow::GPUUtil::CopyCPUTensorToGPU(tensorflow::Tensor const*, tensorflow::DeviceContext const*, tensorflow::Device*, tensorflow::Tensor*, std::__1::function) + 797
4   libtensorflow_framework.so    	0x00000001028f6a05 tensorflow::GPUDeviceContext::CopyCPUTensorToDevice(tensorflow::Tensor const*, tensorflow::Device*, tensorflow::Tensor*, std::__1::function) const + 117
5   libtensorflow_framework.so    	0x00000001029098b5 tensorflow::(anonymous namespace)::CopyHostToDevice(tensorflow::Tensor const*, tensorflow::Allocator*, tensorflow::Allocator*, tensorflow::StringPiece, tensorflow::Device*, tensorflow::Tensor*, tensorflow::DeviceContext*, std::__1::function) + 437
6   libtensorflow_framework.so    	0x0000000102908b58 tensorflow::CopyTensor::ViaDMA(tensorflow::StringPiece, tensorflow::DeviceContext*, tensorflow::DeviceContext*, tensorflow::Device*, tensorflow::Device*, tensorflow::AllocatorAttributes, tensorflow::AllocatorAttributes, tensorflow::Tensor const*, tensorflow::Tensor*, std::__1::function) + 3592
7   libtensorflow_framework.so    	0x0000000102942dbe tensorflow::IntraProcessRendezvous::SameWorkerRecvDone(tensorflow::Rendezvous::ParsedKey const&, tensorflow::Rendezvous::Args const&, tensorflow::Rendezvous::Args const&, tensorflow::Tensor const&, tensorflow::Tensor*, std::__1::function) + 1102
8   libtensorflow_framework.so    	0x000000010294385d std::__1::__function::__func)::$_0, std::__1::allocator)::$_0>, void (tensorflow::Status const&, tensorflow::Rendezvous::Args const&, tensorflow::Rendezvous::Args const&, tensorflow::Tensor const&, bool)>::operator()(tensorflow::Status const&, tensorflow::Rendezvous::Args const&, tensorflow::Rendezvous::Args const&, tensorflow::Tensor const&, bool&&) + 813
9   libtensorflow_framework.so    	0x000000010244e003 tensorflow::LocalRendezvousImpl::RecvAsync(tensorflow::Rendezvous::ParsedKey const&, tensorflow::Rendezvous::Args const&, std::__1::function) + 883
10  libtensorflow_framework.so    	0x000000010294311f tensorflow::IntraProcessRendezvous::RecvAsync(tensorflow::Rendezvous::ParsedKey const&, tensorflow::Rendezvous::Args const&, std::__1::function) + 799
11  _pywrap_tensorflow_internal.so	0x000000010b1a40a9 tensorflow::RecvOp::ComputeAsync(tensorflow::OpKernelContext*, std::__1::function) + 1145
12  libtensorflow_framework.so    	0x00000001028ea458 tensorflow::BaseGPUDevice::ComputeAsync(tensorflow::AsyncOpKernel*, tensorflow::OpKernelContext*, std::__1::function) + 872
13  libtensorflow_framework.so    	0x0000000102918202 tensorflow::(anonymous namespace)::ExecutorState::Process(tensorflow::(anonymous namespace)::ExecutorState::TaggedNode, long long) + 4338
14  libtensorflow_framework.so    	0x000000010292210a std::__1::__function::__func, std::__1::allocator >, void ()>::operator()() + 58
15  libtensorflow_framework.so    	0x000000010255bdff Eigen::NonBlockingThreadPoolTempl::WorkerLoop(int) + 2047
16  libtensorflow_framework.so    	0x000000010255b4ff std::__1::__function::__func)::'lambda'(), std::__1::allocator)::'lambda'()>, void ()>::operator()() + 47
17  libtensorflow_framework.so    	0x00000001025808a0 void* std::__1::__thread_proxy >, std::__1::function > >(void*) + 48
18  libsystem_pthread.dylib       	0x00007fff53e376c1 _pthread_body + 340
19  libsystem_pthread.dylib       	0x00007fff53e3756d _pthread_start + 377
20  libsystem_pthread.dylib       	0x00007fff53e36c5d thread_start + 13

This appears similar to the other SEGFAULT issue that I have observed. There, the SEGFAULT occurs within CopyGPUTensorToCPU(). Here, the SEGFAULT occurs within CopyCPUTensorToGPU().

ggael pushed a commit to eigenteam/eigen-git-mirror that referenced this issue Mar 23, 2018

Daniel Trebbien
Move up the specialization of std::numeric_limits
This fixes a compilation error seen when building TensorFlow on macOS:
tensorflow/tensorflow#17067

Member

tensorflowbutler commented Mar 28, 2018

It has been 14 days with no activity and the awaiting response label was assigned. Is this still an issue?

dinever commented Sep 12, 2018

I believe @dtrebbien has removed his repo. If you need a protobuf version that will compile, use mine:

   tf_http_archive(
       name = "protobuf_archive",
       urls = [
-          "https://mirror.bazel.build/github.com/google/protobuf/archive/396336eb961b75f03b25824fe86cf6490fb75e3a.tar.gz",
-          "https://github.com/google/protobuf/archive/396336eb961b75f03b25824fe86cf6490fb75e3a.tar.gz",
+          "https://mirror.bazel.build/github.com/dinever/protobuf/archive/188578878eff18c2148baba0e116d87ce8f49410.tar.gz",
+          "https://github.com/dinever/protobuf/archive/188578878eff18c2148baba0e116d87ce8f49410.tar.gz",
       ],
-      sha256 = "846d907acf472ae233ec0882ef3a2d24edbbe834b80c305e867ac65a1f2c59e3",
-      strip_prefix = "protobuf-396336eb961b75f03b25824fe86cf6490fb75e3a",
+      sha256 = "7a1d96ccdf7131535828cad737a76fd65ed766e9511e468d0daa3cc4f3db5175",
+      strip_prefix = "protobuf-188578878eff18c2148baba0e116d87ce8f49410",
   )

cc112358 commented Oct 25, 2018

Quoting @dinever: I believe @dtrebbien has removed his repo. If you need a protobuf version that will compile, use mine:

   tf_http_archive(
       name = "protobuf_archive",
       urls = [
-          "https://mirror.bazel.build/github.com/google/protobuf/archive/396336eb961b75f03b25824fe86cf6490fb75e3a.tar.gz",
-          "https://github.com/google/protobuf/archive/396336eb961b75f03b25824fe86cf6490fb75e3a.tar.gz",
+          "https://mirror.bazel.build/github.com/dinever/protobuf/archive/188578878eff18c2148baba0e116d87ce8f49410.tar.gz",
+          "https://github.com/dinever/protobuf/archive/188578878eff18c2148baba0e116d87ce8f49410.tar.gz",
       ],
-      sha256 = "846d907acf472ae233ec0882ef3a2d24edbbe834b80c305e867ac65a1f2c59e3",
-      strip_prefix = "protobuf-396336eb961b75f03b25824fe86cf6490fb75e3a",
+      sha256 = "7a1d96ccdf7131535828cad737a76fd65ed766e9511e468d0daa3cc4f3db5175",
+      strip_prefix = "protobuf-188578878eff18c2148baba0e116d87ce8f49410",
   )

I got the following errors after switching to the tf_http_archive you posted:

error: patch failed: tensorflow/core/common_runtime/gpu/gpu_device.cc:920
error: tensorflow/core/common_runtime/gpu/gpu_device.cc: patch does not apply
error: patch failed: tensorflow/core/framework/variant.h:152
error: tensorflow/core/framework/variant.h: patch does not apply
error: patch failed: tensorflow/core/grappler/clusters/utils.cc:124
error: tensorflow/core/grappler/clusters/utils.cc: patch does not apply
error: patch failed: tensorflow/core/kernels/bucketize_op_gpu.cu.cc:39
error: tensorflow/core/kernels/bucketize_op_gpu.cu.cc: patch does not apply
error: patch failed: tensorflow/core/kernels/concat_lib_gpu_impl.cu.cc:69
error: tensorflow/core/kernels/concat_lib_gpu_impl.cu.cc: patch does not apply
error: patch failed: tensorflow/core/kernels/depthwise_conv_op_gpu.cu.cc:172
error: tensorflow/core/kernels/depthwise_conv_op_gpu.cu.cc: patch does not apply
error: patch failed: tensorflow/core/kernels/split_lib_gpu.cu.cc:121
error: tensorflow/core/kernels/split_lib_gpu.cu.cc: patch does not apply
error: patch failed: tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:856
error: tensorflow/stream_executor/cuda/cuda_gpu_executor.cc: patch does not apply
error: patch failed: tensorflow/workspace.bzl:330
error: tensorflow/workspace.bzl: patch does not apply
error: patch failed: third_party/gpus/cuda/BUILD.tpl:110
error: third_party/gpus/cuda/BUILD.tpl: patch does not apply

ktmud added a commit to ktmud/tensorflow-macos-gpu that referenced this issue Nov 14, 2018
