Bazel build missing dependencies error with MPI #17437

Closed
DavidBrayford opened this Issue Mar 5, 2018 · 4 comments

DavidBrayford commented Mar 5, 2018

OS: SLES12
Python version: 3.6
Bazel version: Build label: 0.11.0- (@non-git)
gcc version 7.2.0 (GCC)
No GPU
No CUDA

MPI is enabled in configure; every other option is disabled.

When I run the command: bazel build --config=mkl --copt="-DEIGEN_USE_VML" --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-msse4.1 --copt=-msse4.2 -s -c opt //tensorflow/tools/pip_package:build_pip_package --verbose_failures

I get the error below. I guess I could modify the BUILD file to declare the additional dependencies:

ERROR: /home/hpc/pr28fa/di72giz/TENSORFLOW/tensorflow/tensorflow/contrib/mpi_collectives/BUILD:40:1: undeclared inclusion(s) in rule '//tensorflow/contrib/mpi_collectives:python/ops/_mpi_ops.so':
this rule is missing dependency declarations for the following files included by 'tensorflow/contrib/mpi_collectives/kernels/mpi_ops.cc':
'/home/hpc/pr28fa/di72giz/TENSORFLOW/tensorflow/tensorflow/stream_executor/lib/statusor.h'
'/home/hpc/pr28fa/di72giz/TENSORFLOW/tensorflow/tensorflow/stream_executor/platform/port.h'
'/home/hpc/pr28fa/di72giz/TENSORFLOW/tensorflow/tensorflow/stream_executor/lib/error.h'
'/home/hpc/pr28fa/di72giz/TENSORFLOW/tensorflow/tensorflow/stream_executor/lib/status.h'
'/home/hpc/pr28fa/di72giz/TENSORFLOW/tensorflow/tensorflow/stream_executor/lib/stringpiece.h'
'/home/hpc/pr28fa/di72giz/TENSORFLOW/tensorflow/tensorflow/stream_executor/platform/logging.h'
tensorflow/contrib/mpi_collectives/kernels/mpi_ops.cc:128:6: warning: 'bool tensorflow::contrib::mpi_collectives::{anonymous}::IsGPUDevice() [with T = Eigen::GpuDevice]' defined but not used [-Wunused-function]
bool IsGPUDevice() {
^~~~~~~~~~~~~~~~~~~~~~
Target //tensorflow/tools/pip_package:build_pip_package failed to build
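
All six undeclared headers in the log above sit under tensorflow/stream_executor, which suggests one missing dependency rather than six unrelated ones. A minimal Python sketch that groups the reported headers by directory follows. It is not part of Bazel or TensorFlow. The function name and the `workspace_marker` default are illustrative assumptions, and the default matches this particular checkout layout only.

```python
import re
from collections import Counter

# The header list from the error output quoted above.
LOG = """\
this rule is missing dependency declarations for the following files included by 'tensorflow/contrib/mpi_collectives/kernels/mpi_ops.cc':
'/home/hpc/pr28fa/di72giz/TENSORFLOW/tensorflow/tensorflow/stream_executor/lib/statusor.h'
'/home/hpc/pr28fa/di72giz/TENSORFLOW/tensorflow/tensorflow/stream_executor/platform/port.h'
'/home/hpc/pr28fa/di72giz/TENSORFLOW/tensorflow/tensorflow/stream_executor/lib/error.h'
'/home/hpc/pr28fa/di72giz/TENSORFLOW/tensorflow/tensorflow/stream_executor/lib/status.h'
'/home/hpc/pr28fa/di72giz/TENSORFLOW/tensorflow/tensorflow/stream_executor/lib/stringpiece.h'
'/home/hpc/pr28fa/di72giz/TENSORFLOW/tensorflow/tensorflow/stream_executor/platform/logging.h'
"""


def undeclared_header_dirs(log, workspace_marker="/tensorflow/tensorflow/"):
    """Count undeclared headers per first-level directory under tensorflow/.

    workspace_marker is an assumption about the checkout layout: a repository
    cloned into a directory named 'tensorflow' that contains the 'tensorflow'
    package directory.
    """
    counts = Counter()
    for line in log.splitlines():
        # Header lines consist of one single-quoted absolute path ending in .h
        match = re.fullmatch(r"\s*'(/[^']+\.h)'\s*", line)
        if not match:
            continue
        # Keep the part of the path that follows the last workspace marker.
        _, found, rel = match.group(1).rpartition(workspace_marker)
        if not found:
            continue
        counts["tensorflow/" + rel.split("/", 1)[0]] += 1
    return counts


print(dict(undeclared_header_dirs(LOG)))  # {'tensorflow/stream_executor': 6}
```

A single directory in the result points at one candidate package to add to the rule's deps; which Bazel target in that package exports the headers still has to be checked in its BUILD file.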

tensorflowbutler commented Mar 6, 2018

Thank you for your post. We noticed you have not filled out the following fields in the issue template. Could you update them if they are relevant in your case, or leave them as N/A? Thanks.
Have I written custom code
OS Platform and Distribution
TensorFlow installed from
TensorFlow version
CUDA/cuDNN version
GPU model and memory
Exact command to reproduce

DavidBrayford commented Mar 6, 2018

OS Platform and Distribution: SLES12
TensorFlow installed from: GitHub, 05-Mar-2018
TensorFlow version: 1.6
CUDA/cuDNN version: N/A CPU only
GPU model and memory: N/A CPU only
Exact command to reproduce: bazel build --config=mkl --copt="-DEIGEN_USE_VML" --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-msse4.1 --copt=-msse4.2 -s -c opt //tensorflow/tools/pip_package:build_pip_package --verbose_failures

configure:
Anaconda Python 3.6
MPI enabled
Every other option disabled

wei-v-wang commented Mar 13, 2018

Can you please try the patch below?

--- a/tensorflow/contrib/mpi_collectives/BUILD
+++ b/tensorflow/contrib/mpi_collectives/BUILD
@@ -53,6 +53,7 @@ tf_custom_op_library(
         ":mpi_defines",
         ":mpi_message_proto_cc",
         "//third_party/mpi",
+        "//tensorflow/stream_executor",
     ],
 )

diff --git a/tensorflow/tensorflow.bzl b/tensorflow/tensorflow.bzl
index 23d11c88ed..12512ae6df 100644
--- a/tensorflow/tensorflow.bzl
+++ b/tensorflow/tensorflow.bzl
@@ -1247,7 +1247,7 @@ def tf_custom_op_library(name, srcs=[], gpu_srcs=[], deps=[], linkopts=[]):
       deps=deps + if_cuda(cuda_deps),
       disallowed_deps=[
           clean_dep("//tensorflow/core:framework"),
-          clean_dep("//tensorflow/core:lib")
+# clean_dep("//tensorflow/core:lib")
       ])
   tf_cc_shared_object(
       name=name,

Credit goes to https://github.com/aburden5 in the comments of a similar issue: baidu-research/tensorflow-allreduce#5

DavidBrayford commented Mar 19, 2018

Thanks, the patch fixed the build errors.