
Build fails on Ubuntu 16.04 LTS, CUDA Toolkit 8.0, cuDNN 5.0.5, and Bazel 0.3.0-jdk7 #3526

adam-erickson opened this issue Jul 27, 2016 · 5 comments



@adam-erickson adam-erickson commented Jul 27, 2016

Hi Everyone,

I downgraded gcc to 5.3.0 by building it from source so that I could install CUDA Toolkit 8.0 with cuDNN 5.0.5. I also installed the OpenGL freeglut3 and mesa libraries via apt-get, built Bazel from source using the installer script, and installed the TensorFlow and Google Cloud Platform Python dependencies. I then cloned the tensorflow GitHub repository and modified the cxx_builtin_include_directory variable in the CROSSTOOL file to include the gcc 5.3.0 location. Finally, I ran ./configure with default settings and tried to build with Bazel, but the build always fails with an error like this, which appears to be a gcc issue:

WARNING: /root/.cache/bazel/_bazel_root/fbc06f9baef46cade6e35d9e4137e37c/external/protobuf/WORKSPACE:1: Workspace name in /root/.cache/bazel/_bazel_root/fbc06f9baef46cade6e35d9e4137e37c/external/protobuf/WORKSPACE (@main) does not match the name given in the repository's definition (@protobuf); this will cause a build error in future versions.

ERROR: /root/.cache/bazel/_bazel_root/fbc06f9baef46cade6e35d9e4137e37c/external/zlib_archive/BUILD:7:1: undeclared inclusion(s) in rule '@zlib_archive//:zlib'

This rule is missing dependency declarations for the following files included by 'external/zlib_archive/zlib-1.2.8/inftrees.c':
Target //tensorflow/cc:tutorials_example_trainer failed to build

If I change the gcc version to 4.8 (installed via apt-get) in ./configure and revert CROSSTOOL, I get many warnings like this:
INFO: ... warning: variable 'parsed_colon' set but not used

This warning is followed by an error:
ERROR: /opt/tensorflow/tensorflow/core/kernels/BUILD:1527:1: undeclared inclusion(s) in rule '//tensorflow/core/kernels:depth_space_ops_gpu':
this rule is missing dependency declarations for the following files included by 'tensorflow/core/kernels/':

This time, it appears to be an issue with CUDA Toolkit 8.0. Everything seems to work flawlessly up until building tensorflow from source.



@adam-erickson changed the title from "TensorFlow build fails on Ubuntu 16.04 LTS, CUDA Toolkit 8.0, cuDNN 5.0.5, and Bazel 0.3.0-jdk7" to "Build fails on Ubuntu 16.04 LTS, CUDA Toolkit 8.0, cuDNN 5.0.5, and Bazel 0.3.0-jdk7" on Jul 27, 2016

@JohnAllen JohnAllen commented Jul 27, 2016

Try the previous Bazel release, 0.2.3 or whatever it is. 0.3 never worked for me. Rolling back twice worked for me on Ubuntu 16 and 14, with the exact same CUD* versions. Make sure Bazel 0.3 is uninstalled first, obviously.


@yaroslavvb yaroslavvb commented Jul 27, 2016

I worked around similar issue by adding cxx_builtin_include_directory
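
One way to find the directories to list is to ask gcc for its built-in include search path. The sketch below is an assumption about how that could be done, not the exact change used in this thread: format_builtin_includes is a hypothetical helper that turns the search list printed by `gcc -v` into cxx_builtin_include_directory lines, which could then be pasted into the CROSSTOOL file.

```shell
# Hypothetical helper (not from the thread): reads the verbose output of
# `gcc -E -x c++ - -v` on stdin and prints one CROSSTOOL entry per
# directory in the "#include <...>" search list.
format_builtin_includes() {
  awk '
    /#include <...> search starts here:/ { in_list = 1; next }
    /End of search list\./               { in_list = 0 }
    in_list {
      sub(/^[ \t]+/, "")   # strip the leading indentation gcc adds
      printf "cxx_builtin_include_directory: \"%s\"\n", $0
    }
  '
}

# Usage, assuming the gcc you configured TensorFlow with is first on PATH:
#   gcc -E -x c++ - -v </dev/null 2>&1 | format_builtin_includes
```

Review the printed entries before adding them; only the ones Bazel reports as undeclared inclusions should be needed.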


@michaelisard michaelisard commented Jul 27, 2016

@adam-erickson please update if the above comments don't help.


@adam-erickson adam-erickson commented Jul 28, 2016

It looks like the solution of @yaroslavvb should work for the install with gcc 4.8.

Since I'm not running Pascal architecture GPUs, but rather a node with four GeForce GTX Titan X GPUs, I ended up installing the latest CUDA 367.35 display drivers from the PPA (the display drivers included with CUDA Toolkit 7.5 cause nvidia-smi to freeze on Ubuntu 16.04), CUDA Toolkit 7.5 from Ubuntu 16.04 LTS package management, and cuDNN 5.0.5 from the Nvidia site. I then built and ran the samples from source. One function appears to error in tests, but maybe that's because the samples are intended for Ubuntu 15.04 with cuDNN 4. I'm happily back to using only the standard gcc now, and TensorFlow is functioning well with the standard Python distribution. Here is my full process, after removing previous installations:

Recommended: Install OpenGL libraries
Update list:
apt-get update
Install OpenGL libraries:
apt-get install mesa-common-dev freeglut3-dev
apt-get install libxmu-dev libxi-dev

Install CUDA Toolkit 7.5 and 367.xx display driver from Ubuntu 16.04 apt-get
Install Python dependencies:
apt-get install python-pip python-dev
Remove existing CUDA installation:
apt-get purge nvidia-*
Install CUDA display driver 367.35:
add-apt-repository ppa:graphics-drivers/ppa
apt-get update
apt-get install nvidia-367
Install CUDA Toolkit 7.5:
apt-get install nvidia-cuda-toolkit
apt-get install nvidia-nsight
apt-get install nvidia-profiler
apt-get install libcupti-dev zlib1g-dev
Link files:
mkdir /usr/local/cuda
cd /usr/local/cuda
ln -s /usr/lib/x86_64-linux-gnu/ lib64
ln -s /usr/include/ include
ln -s /usr/bin/ bin
ln -s /usr/lib/x86_64-linux-gnu/ nvvm
mkdir -p extras/CUPTI
cd extras/CUPTI
ln -s /usr/lib/x86_64-linux-gnu/ lib64
ln -s /usr/include/ include
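
The linking steps above can also be sketched as a single function that does not depend on the current directory. This is only an illustration under assumptions: link_cuda_layout is a hypothetical name, and the prefix argument is added here so the layout can be tried in a scratch directory before touching /usr/local/cuda.

```shell
# Hypothetical helper: recreate the symlink layout from the steps above
# under a given prefix (defaults to /usr/local/cuda).
link_cuda_layout() {
  prefix="${1:-/usr/local/cuda}"
  mkdir -p "$prefix/extras/CUPTI"
  # -n replaces an existing link instead of descending into it
  ln -sfn /usr/lib/x86_64-linux-gnu "$prefix/lib64"
  ln -sfn /usr/include              "$prefix/include"
  ln -sfn /usr/bin                  "$prefix/bin"
  ln -sfn /usr/lib/x86_64-linux-gnu "$prefix/nvvm"
  ln -sfn /usr/lib/x86_64-linux-gnu "$prefix/extras/CUPTI/lib64"
  ln -sfn /usr/include              "$prefix/extras/CUPTI/include"
}

# Usage (as root):
#   link_cuda_layout /usr/local/cuda
```

Using absolute link names means the function can be rerun safely from any directory.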

Install cuDNN 5.0.5
cd /opt
Download cuDNN 5.0.5 from the Nvidia site.
tar xvf cudnn-8.0-linux-x64-v5.0-ga.tar
Add to ~/.bashrc: export LD_LIBRARY_PATH=/opt/cuda:$LD_LIBRARY_PATH
Copy cudnn files to the default CUDA directories and set permissions:
cp cuda/include/cudnn.h /usr/local/cuda/include/
cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
chmod a+r /usr/local/cuda/include/cudnn.h
chmod a+r /usr/local/cuda/lib64/libcudnn*
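
To confirm that the copied header is the expected release, the version macros in cudnn.h can be read back. This is a minimal sketch, assuming the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL as plain `#define` lines; cudnn_version is a hypothetical helper name.

```shell
# Hypothetical helper: print the cuDNN version recorded in a cudnn.h file
# (defaults to the location the header was copied to above).
cudnn_version() {
  header="${1:-/usr/local/cuda/include/cudnn.h}"
  awk '
    $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
    $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
    $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
    END { printf "%s.%s.%s\n", major, minor, patch }
  ' "$header"
}

# Usage:
#   cudnn_version
```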
Download and run the CUDA Toolkit 7.5 samples:
Get the full CUDA Toolkit 7.5 run file from Nvidia.
Extract to path:
mkdir /opt/cuda/cudatoolkit
sh -extract=/opt/cuda/cudatoolkit
cd /opt/cuda/cudatoolkit
Install only the samples to /opt/cuda/samples:
cd ..
rm -rf cudatoolkit
Run the Device Query tool:
cd samples/1_Utilities/deviceQuery
Run the bandwidth test:
cd /opt/cuda/samples/1_Utilities/bandwidthTest
Run the n-body sample with nvprof:
cd ../..
cd 5_Simulations/nbody
nvprof ./nbody -benchmark -numdevices=4
nvprof --print-gpu-trace ./nbody -benchmark -numdevices=4

Install TensorFlow from binary for CUDA Toolkit 7.5
Set the TF_BINARY_URL variable and install:
cd /opt
pip install --upgrade $TF_BINARY_URL
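
Because the install step depends on TF_BINARY_URL being set first, a small guard can make a missing value fail loudly rather than passing an empty argument to pip. The function name below is hypothetical, and the value of TF_BINARY_URL is deliberately not filled in here; it should be the binary URL for your Python version and CUDA Toolkit 7.5.

```shell
# Hypothetical wrapper: refuse to run pip when TF_BINARY_URL is unset or empty.
install_tensorflow_binary() {
  if [ -z "${TF_BINARY_URL:-}" ]; then
    echo "TF_BINARY_URL is not set; export it before installing" >&2
    return 1
  fi
  pip install --upgrade "$TF_BINARY_URL"
}

# Usage:
#   export TF_BINARY_URL=...   # the wheel URL for your setup
#   install_tensorflow_binary
```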


@michaelisard michaelisard commented Jul 28, 2016

Good to hear it! I'm closing the issue but please reopen if there is something that needs to be addressed.
