
build fail with cuda: identifier "__builtin_ia32_mwaitx" is undefined #1066

Closed
nlgranger opened this issue Feb 11, 2016 · 15 comments

@nlgranger

commented Feb 11, 2016

Using gcc 5 with CUDA support results in a compilation error:

INFO: From Compiling tensorflow/core/kernels/adjust_contrast_op_gpu.cu.cc:

# Omitting warnings

/usr/lib/gcc/x86_64-unknown-linux-gnu/5.3.0/include/mwaitxintrin.h(36): error: identifier "__builtin_ia32_monitorx" is undefined

/usr/lib/gcc/x86_64-unknown-linux-gnu/5.3.0/include/mwaitxintrin.h(42): error: identifier "__builtin_ia32_mwaitx" is undefined

2 errors detected in the compilation of "/tmp/tmpxft_00002a59_00000000-7_adjust_contrast_op_gpu.cu.cpp1.ii".

Build process is as follows:

  TF_UNOFFICIAL_SETTING=1 ./configure <<EOF
/usr/bin/python
y
7.5
/opt/cuda
4.0.7
/opt/cuda
5.2
EOF

    bazel build --jobs 4 --config=cuda -c opt --verbose_failures \
        //tensorflow/tools/pip_package:build_pip_package

The environment is:

  • bazel 0.1.5-1
  • cuda 7.5
  • cudnn 4
  • gcc 5.3.0
  • boost 1.60 (is it even relevant?)
  • python3.5 (/usr/bin/python is python3)

This issue might be related to boost 1.60 or gcc 5, as mentioned here or there.
One suggested fix is to restrict gcc to ANSI C++, which can be achieved like so:

diff --git a/third_party/gpus/crosstool/CROSSTOOL b/third_party/gpus/crosstool/CROSSTOOL
index dfde7cd..547441f 100644
--- a/third_party/gpus/crosstool/CROSSTOOL
+++ b/third_party/gpus/crosstool/CROSSTOOL
@@ -46,6 +46,8 @@ toolchain {
   # Use "-std=c++11" for nvcc. For consistency, force both the host compiler
   # and the device compiler to use "-std=c++11".
   cxx_flag: "-std=c++11"
+  cxx_flag: "-D_MWAITXINTRIN_H_INCLUDED"
+  cxx_flag: "-D__STRICT_ANSI__"
   linker_flag: "-lstdc++"
   linker_flag: "-B/usr/bin/"

(By the way, thank you @vrv for pointing out which file to change.)

Could someone please review this and integrate it (or not)? Note that I am not quite sure whether both flags are actually required, or what the side effects might be.

@vrv

Contributor

commented Feb 13, 2016

We'll try to find someone at nvidia to ask what the best practice is here -- this might work but I don't think it's something we'd want to check in unless it's the recommended practice by nvidia. We'll leave this open so others can find it in the meantime.

@KeithBrodie


commented Feb 23, 2016

For what it's worth, building TF 0.7.1 on Ubuntu 15.10 with CUDA 7.5 required
cxx_flag: "-D_MWAITXINTRIN_H_INCLUDED" but not strict ANSI.

@chemelnucfin

Contributor

commented Mar 14, 2016

I also got it to build without strict ANSI on Fedora 23.

@drufat


commented Mar 23, 2016

I have a similar Arch Linux setup.
After applying the suggested patch

diff --git a/third_party/gpus/crosstool/CROSSTOOL b/third_party/gpus/crosstool/CROSSTOOL
index dfde7cd..2482a69 100644
--- a/third_party/gpus/crosstool/CROSSTOOL
+++ b/third_party/gpus/crosstool/CROSSTOOL
@@ -46,6 +46,7 @@ toolchain {
   # Use "-std=c++11" for nvcc. For consistency, force both the host compiler
   # and the device compiler to use "-std=c++11".
   cxx_flag: "-std=c++11"
+  cxx_flag: "-D_MWAITXINTRIN_H_INCLUDED"
   linker_flag: "-lstdc++"
   linker_flag: "-B/usr/bin/"

I now run into a new error:

/usr/include/string.h: In function 'void* __mempcpy_inline(void*, const void*, size_t)':
/usr/include/string.h:652:42: error: 'memcpy' was not declared in this scope
   return (char *) memcpy (__dest, __src, __n) + __n;
                                          ^
ERROR: /home/drufat/Source/tensorflow/tensorflow/core/kernels/BUILD:858:1: output 'tensorflow/core/kernels/_objs/depth_space_ops_gpu/tensorflow/core/kernels/depthtospace_op_gpu.cu.pic.o' was not created.
ERROR: /home/drufat/Source/tensorflow/tensorflow/core/kernels/BUILD:858:1: not all outputs were created.
Target //tensorflow/tools/pip_package:build_pip_package failed to build
@chrisburr


commented Mar 23, 2016

@drufat I ran into a similar issue when compiling on Arch and found the solution in #1346. I'm unsure whether -D__STRICT_ANSI__ is actually required, but the following patch worked for me:

diff --git a/third_party/gpus/crosstool/CROSSTOOL b/third_party/gpus/crosstool/CROSSTOOL
index dfde7cd..15fa9fd 100644
--- a/third_party/gpus/crosstool/CROSSTOOL
+++ b/third_party/gpus/crosstool/CROSSTOOL
@@ -46,6 +46,9 @@ toolchain {
   # Use "-std=c++11" for nvcc. For consistency, force both the host compiler
   # and the device compiler to use "-std=c++11".
   cxx_flag: "-std=c++11"
+  cxx_flag: "-D_MWAITXINTRIN_H_INCLUDED"
+  cxx_flag: "-D_FORCE_INLINES"
+  cxx_flag: "-D__STRICT_ANSI__"
   linker_flag: "-lstdc++"
   linker_flag: "-B/usr/bin/"
@exrook


commented Apr 1, 2016

This may be a bug in glibc; the change to string.h seems to have been introduced only in the latest version (see here). I have been experiencing the same issue on Arch with other CUDA neural-network packages, and commenting out the changed lines in /usr/include/string.h resolved it.

@tpruvot


commented Apr 13, 2016

Same on Ubuntu 16.04 (beta) since the upgrade to gcc 5.3.1.

Working after deleting this ifdef block. Thanks, @exrook.

@zheng-xq zheng-xq removed their assignment Apr 18, 2016

xman added a commit to xman/tensorflow that referenced this issue Jun 8, 2016

Fix compilation problem when CUDA is enabled.
See tensorflow#1066 (comment)

Signed-off-by: ShinYee <shinyee@speedgocomputing.com>
@girving

Contributor

commented Jun 8, 2016

Is this still an issue?

@liaohaofu


commented Jun 9, 2016

@girving Yes, this problem still exists on Ubuntu 15.10 with CUDA 7.5, but @chrisburr's solution works for me, and -D__STRICT_ANSI__ is actually not required.

@girving

Contributor

commented Jun 9, 2016

I'm worried that either or both of _FORCE_INLINES and _MWAITXINTRIN_H_INCLUDED could cause strange regressions on other platforms. What do they do?

@jbruck


commented Jun 14, 2016

I came across this error on Ubuntu 16.04, with otherwise much the same configuration as others here.
Just these two flags fixed the compile errors:

  • cxx_flag: "-D_MWAITXINTRIN_H_INCLUDED"
  • cxx_flag: "-D_FORCE_INLINES"

I noticed that the force-inlines flag is also recommended by Theano for the same reason: http://deeplearning.net/software/theano/install_ubuntu.html

@enedene


commented Jun 23, 2016

Hello, I have a basic question. My knowledge of C++ and build files is limited, and I'm not sure how exactly to tell the compiler to use the cxx_flags @jbruck mentioned.

I'm using bazel, as described in the installation tutorial, running the command:
bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer

to compile TensorFlow, and I get this error when trying to compile with cuDNN 5 on Ubuntu 16.04.

Could you tell me explicitly where to put the flags and how to run the compiler with them enabled?

Thank you.

@chemelnucfin

Contributor

commented Jun 23, 2016

@enedene Hello, those flags are options passed to the compiler; they are specified in the third_party/gpus/crosstool/CROSSTOOL file shown above, so you can simply add those lines to that file.
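To make the edit concrete, here is a sketch on a scratch copy of the relevant lines (GNU sed assumed; in a real tree you would point the command at third_party/gpus/crosstool/CROSSTOOL and then rerun the bazel build):

```shell
# Work on a scratch fragment that mimics the relevant CROSSTOOL lines.
cat > CROSSTOOL.frag <<'EOF'
  cxx_flag: "-std=c++11"
  linker_flag: "-lstdc++"
EOF

# Insert the two defines right after the existing -std=c++11 line,
# preserving the file's two-space indentation.
sed -i 's/  cxx_flag: "-std=c++11"/&\n  cxx_flag: "-D_MWAITXINTRIN_H_INCLUDED"\n  cxx_flag: "-D_FORCE_INLINES"/' CROSSTOOL.frag

cat CROSSTOOL.frag
```

After the edit, the fragment contains the -std=c++11 line followed by the two new cxx_flag lines, matching the patches posted earlier in this thread.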

@enedene


commented Jun 23, 2016

@chemelnucfin thank you!

@andydavis1

Member

commented Jul 1, 2016

I added instructions in g3doc/get_started/os_setup.md for Ubuntu users to add these flags if they hit this issue (we do not want to set these flags for all OSes at this point). Closing this issue.

@andydavis1 andydavis1 closed this Jul 1, 2016

@civilman628 civilman628 referenced this issue Dec 3, 2016: build fail #39 (closed)
