Cannot launch TensorBoard from source due to debugger plugin #431

Closed
wchargin opened this Issue Aug 28, 2017 · 24 comments

Member

wchargin commented Aug 28, 2017

TensorBoard master, with TensorFlow 1.3.0 from pip, cannot run: it fails to import a Python library related to gRPC.

The error is:

Traceback (most recent call last):
  File "/home/wchargin/.cache/bazel/_bazel_wchargin/3f99396cfb979f2f5a2059c1fd233f92/execroot/org_tensorflow_tensorboard/bazel-out/local-fastbuild/bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/main.py", line 38, in <module>
    from tensorboard.plugins.debugger import debugger_plugin as debugger_plugin_lib
  File "/home/wchargin/.cache/bazel/_bazel_wchargin/3f99396cfb979f2f5a2059c1fd233f92/execroot/org_tensorflow_tensorboard/bazel-out/local-fastbuild/bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/plugins/debugger/debugger_plugin.py", line 35, in <module>
    from tensorboard.plugins.debugger import debugger_server_lib
  File "/home/wchargin/.cache/bazel/_bazel_wchargin/3f99396cfb979f2f5a2059c1fd233f92/execroot/org_tensorflow_tensorboard/bazel-out/local-fastbuild/bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/plugins/debugger/debugger_server_lib.py", line 33, in <module>
    from tensorflow.python.debug.lib import grpc_debug_server
ImportError: cannot import name grpc_debug_server

The first bad commit is (unsurprisingly) a856e61, which I identified by using git bisect with the following script:

#!/bin/bash
! bazel run tensorboard 2>&1 | grep -F 'cannot import name grpc_debug_server'
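For readers unfamiliar with `git bisect run`: it treats exit status 0 as "good" and status 1 as "bad", which is why the script above negates `grep`. A rough Python equivalent of that predicate is sketched below. The name `bisect_status` is made up for illustration, and a real driver would still need to capture the output of `bazel run tensorboard` itself.

```python
# Hypothetical helper, not part of TensorBoard: mirrors the shell predicate
# `! ... | grep -F 'cannot import name grpc_debug_server'`.
ERROR_MARKER = "cannot import name grpc_debug_server"


def bisect_status(command_output):
    """Return the exit status that `git bisect run` should see.

    0 marks the commit as good (marker absent); 1 marks it as bad.
    """
    return 1 if ERROR_MARKER in command_output else 0
```

A driver script could finish with `sys.exit(bisect_status(output))` once it has the combined stdout and stderr in `output`.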

Steps to reproduce:

$ virtualenv /tmp/tensorflow-1.3.0-fresh
$ source /tmp/tensorflow-1.3.0-fresh/bin/activate
$ pip install tensorflow==1.3.0
$ git checkout b1a4d2586a0eae1ce7f3a18b4db188b62c4daaee  # current origin/master
$ bazel run tensorboard -- --logdir /tmp/data

The following patch fixes the problem:

diff --git a/tensorboard/main.py b/tensorboard/main.py
index ec84e25..fb5d2cd 100644
--- a/tensorboard/main.py
+++ b/tensorboard/main.py
@@ -35,7 +35,7 @@ from tensorboard.backend import application
 from tensorboard.backend.event_processing import event_file_inspector as efi
 from tensorboard.plugins.audio import audio_plugin
 from tensorboard.plugins.core import core_plugin
-from tensorboard.plugins.debugger import debugger_plugin as debugger_plugin_lib
+#from tensorboard.plugins.debugger import debugger_plugin as debugger_plugin_lib
 from tensorboard.plugins.distribution import distributions_plugin
 from tensorboard.plugins.graph import graphs_plugin
 from tensorboard.plugins.histogram import histograms_plugin
@@ -240,11 +240,12 @@ def main(unused_argv=None):
     efi.inspect(FLAGS.logdir, event_file, FLAGS.tag)
     return 0
   else:
-    def ConstructDebuggerPluginWithGrpcPort(context):
-      debugger_plugin = debugger_plugin_lib.DebuggerPlugin(context)
-      if FLAGS.debugger_data_server_grpc_port is not None:
-        debugger_plugin.listen(FLAGS.debugger_data_server_grpc_port)
-      return debugger_plugin
+    pass
+    #def ConstructDebuggerPluginWithGrpcPort(context):
+    #  debugger_plugin = debugger_plugin_lib.DebuggerPlugin(context)
+    #  if FLAGS.debugger_data_server_grpc_port is not None:
+    #    debugger_plugin.listen(FLAGS.debugger_data_server_grpc_port)
+    #  return debugger_plugin
 
     plugins = [
         core_plugin.CorePlugin,
@@ -258,7 +259,7 @@ def main(unused_argv=None):
         projector_plugin.ProjectorPlugin,
         text_plugin.TextPlugin,
         profile_plugin.ProfilePlugin,
-        ConstructDebuggerPluginWithGrpcPort,
+        #ConstructDebuggerPluginWithGrpcPort,
     ]
 
     tb = create_tb_app(plugins)

Versions:

$ bazel version
Build label: 0.5.4
Build target: bazel-out/local-fastbuild/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Fri Aug 25 10:00:00 2017 (1503655200)
Build timestamp: 1503655200
Build timestamp as int: 1503655200
$ pip --version
pip 9.0.1 from /tmp/tensorflow-1.3.0-fresh/local/lib/python2.7/site-packages (python 2.7)
$ lsb_release -a
No LSB modules are available.
Distributor ID:	LinuxMint
Description:	Linux Mint 18.2 Sonya
Release:	18.2
Codename:	sonya

Member

wchargin commented Aug 28, 2017

Member

chihuahua commented Aug 28, 2017

Hmm, I'm trying to repro. I ran

pip install https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.3.0rc2-cp27-none-linux_x86_64.whl

and TensorBoard at master HEAD seems to run fine.

Member

chihuahua commented Aug 28, 2017

One thing to note: TensorBoard used to fail to start for me, but I fixed it by pip-installing grpcio. However, the error I got looked different.

INFO: Running command line: bazel-bin/tensorboard/tensorboard '--logdir=~/Desktop/pr_curve_demo'
Traceback (most recent call last):
  File "/private/var/tmp/_bazel_chizeng/1b1399fef0aaaae96df4708880f141bb/execroot/org_tensorflow_tensorboard/bazel-out/darwin_x86_64-fastbuild/bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/main.py", line 38, in <module>
    from tensorboard.plugins.debugger import debugger_plugin as debugger_plugin_lib
  File "/private/var/tmp/_bazel_chizeng/1b1399fef0aaaae96df4708880f141bb/execroot/org_tensorflow_tensorboard/bazel-out/darwin_x86_64-fastbuild/bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/plugins/debugger/debugger_plugin.py", line 35, in <module>
    from tensorboard.plugins.debugger import debugger_server_lib
  File "/private/var/tmp/_bazel_chizeng/1b1399fef0aaaae96df4708880f141bb/execroot/org_tensorflow_tensorboard/bazel-out/darwin_x86_64-fastbuild/bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/plugins/debugger/debugger_server_lib.py", line 33, in <module>
    from tensorflow.python.debug.lib import grpc_debug_server
  File "/Users/chizeng/anaconda/lib/python3.6/site-packages/tensorflow/python/debug/lib/grpc_debug_server.py", line 27, in <module>
    import grpc
ModuleNotFoundError: No module named 'grpc'

The error you noted seems to instead indicate that the grpc_debug_server module is unavailable.

Member

wchargin commented Aug 28, 2017

Using 1.3.0rc2 instead of 1.3.0, with the link that you provided, does not fix the problem.

Additionally installing grpcio does not fix the problem.

In my site packages, the tensorflow.python.debug.lib package contains no file grpc_debug_server.py, so it is no wonder that the import fails. You don't seem to have this problem: could you please post your output for

from tensorflow.python.debug.lib import grpc_debug_server
print(grpc_debug_server.__file__)
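A variant of this check that reports a missing module instead of raising could be sketched with the standard library's `importlib.util.find_spec`. This is Python 3 only, so it would not run in the Python 2.7 virtualenv used above, and `module_origin` is a hypothetical helper name. Note that resolving a dotted name imports its parent packages, so this still imports `tensorflow` when it is installed.

```python
import importlib.util


def module_origin(name):
    """Return the file a module would be loaded from, or None if not found."""
    try:
        spec = importlib.util.find_spec(name)
    except ImportError:
        # Raised when a parent package of a dotted name is itself missing.
        return None
    return spec.origin if spec is not None else None


print(module_origin("tensorflow.python.debug.lib.grpc_debug_server"))
```

A result of `None` would correspond to the situation described above, where the package exists but contains no `grpc_debug_server.py`.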

Note that this file does exist in nightly TensorFlow. However, (a) I'd thought that we no longer wanted to depend on nightly since the 1.3 release (correct me if wrong?), and (b) the import still fails because a transitive dependency is missing: if I write

$ virtualenv /tmp/tensorflow-nightly-20170828
$ source /tmp/tensorflow-nightly-20170828/bin/activate
$ pip install 'https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-cpu/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/lastSuccessfulBuild/artifact/pip_test/whl/tensorflow-1.3.0-cp27-none-linux_x86_64.whl'
$ bazel run tensorboard

then the error is

Traceback (most recent call last):
  File "/home/wchargin/.cache/bazel/_bazel_wchargin/3f99396cfb979f2f5a2059c1fd233f92/execroot/org_tensorflow_tensorboard/bazel-out/local-fastbuild/bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/main.py", line 38, in <module>
    from tensorboard.plugins.debugger import debugger_plugin as debugger_plugin_lib
  File "/home/wchargin/.cache/bazel/_bazel_wchargin/3f99396cfb979f2f5a2059c1fd233f92/execroot/org_tensorflow_tensorboard/bazel-out/local-fastbuild/bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/plugins/debugger/debugger_plugin.py", line 35, in <module>
    from tensorboard.plugins.debugger import debugger_server_lib
  File "/home/wchargin/.cache/bazel/_bazel_wchargin/3f99396cfb979f2f5a2059c1fd233f92/execroot/org_tensorflow_tensorboard/bazel-out/local-fastbuild/bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/plugins/debugger/debugger_server_lib.py", line 33, in <module>
    from tensorflow.python.debug.lib import grpc_debug_server
  File "/tmp/tensorflow-nightly-20170828/local/lib/python2.7/site-packages/tensorflow/python/debug/lib/grpc_debug_server.py", line 26, in <module>
    from concurrent import futures
ImportError: No module named concurrent

Member

wchargin commented Aug 28, 2017

To summarize, the only configuration that I have found to work is to install both TensorFlow nightly and the separate grpcio package, which provides the concurrent package. The former might be acceptable, but the latter isn't and should be fixed.

Contributor

ioeric commented Aug 28, 2017

FYI, I ran into the same problem, and I did pip install grpc which seemed to fix the problem.

Contributor

caisq commented Aug 28, 2017

I think this may have to do with the recent update in the tensorboard version that tensorflow 1.3.0 depends on. The new version includes the PR that open-sourced plugin/debugger: #310.

But plugin/debugger depends on grpc_debug_server, which is not available in tensorflow 1.3.0. It is available in tensorflow HEAD, though.

So we have two options:

  1. Put out a patch release of tensorboard with the PR reverted.
  2. Put out a patch release of tensorflow with the grpc_debug_server cherry picked.

@jart

Contributor

caisq commented Aug 28, 2017

@wchargin, I may have misunderstood the issue in my previous comment. Now I realize that the issue happens only for developers working at tensorboard master HEAD. For this developer workflow, the way to resolve this issue is to install the nightly tensorflow, instead of tensorflow 1.3.0. tensorflow 1.3.0 doesn't have the grpc_debug_server. The nightly install instructions can be found here:
https://github.com/tensorflow/tensorflow#installation

Note that the Travis testing we have is performed against nightly tensorflow, not latest-release tensorflow.

luchensk commented Aug 29, 2017

I also hit this issue before and fixed it by using the master branch of tensorflow, as @caisq said above.

luchensk commented Aug 29, 2017

BTW, if you work on macOS, please refer to tensorflow/tensorflow#12123, which includes a workaround for compiling tensorflow on macOS: replace -Werror with -Wno-excessive-errors in add_boringssl_s390x.patch.

RenatoUtsch commented Sep 1, 2017

Just update to Bazel 0.5.4; the -Werror hack is not needed anymore.

Member

wchargin commented Oct 3, 2017

Bump—this issue continues to occur on a fresh clone (repro below), and using TF nightly does not fix the issue. @caisq

Here is a revised repro script:

#!/bin/sh
set -eux
tmpdir="$(mktemp -d --suffix _tensorflow)"
virtualenv "${tmpdir}"
. "${tmpdir}/bin/activate"
pip install 'https://ci.tensorflow.org/view/tf-nightly/job/tf-nightly-linux/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/52/artifact/pip_test/whl/tf_nightly-1.head-cp27-none-linux_x86_64.whl'
# pip install futures
# pip install grpc
bazel build //tensorboard
./bazel-bin/tensorboard/tensorboard --logdir ~/data/

This yields:

Traceback (most recent call last):
  File "/home/wchargin/git/tensorboard/bazel-bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/main.py", line 38, in <module>
    from tensorboard.plugins.debugger import debugger_plugin as debugger_plugin_lib
  File "/home/wchargin/git/tensorboard/bazel-bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/plugins/debugger/debugger_plugin.py", line 35, in <module>
    from tensorboard.plugins.debugger import debugger_server_lib
  File "/home/wchargin/git/tensorboard/bazel-bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/plugins/debugger/debugger_server_lib.py", line 33, in <module>
    from tensorflow.python.debug.lib import grpc_debug_server
  File "/tmp/tmp.xP0p6ZLUpx_tensorflow/local/lib/python2.7/site-packages/tensorflow/python/debug/lib/grpc_debug_server.py", line 26, in <module>
    from concurrent import futures
ImportError: No module named concurrent

Uncommenting the first commented line yields:

Traceback (most recent call last):
  File "/home/wchargin/git/tensorboard/bazel-bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/main.py", line 38, in <module>
    from tensorboard.plugins.debugger import debugger_plugin as debugger_plugin_lib
  File "/home/wchargin/git/tensorboard/bazel-bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/plugins/debugger/debugger_plugin.py", line 35, in <module>
    from tensorboard.plugins.debugger import debugger_server_lib
  File "/home/wchargin/git/tensorboard/bazel-bin/tensorboard/tensorboard.runfiles/org_tensorflow_tensorboard/tensorboard/plugins/debugger/debugger_server_lib.py", line 33, in <module>
    from tensorflow.python.debug.lib import grpc_debug_server
  File "/tmp/tmp.2juuukpm8w_tensorflow/local/lib/python2.7/site-packages/tensorflow/python/debug/lib/grpc_debug_server.py", line 27, in <module>
    import grpc
ImportError: No module named grpc

Uncommenting the second line works, although there is still a spurious log entry:

Import grpc:No module named gevent.socket

Note that I've had to go back to TensorFlow build 52 because of a regression introduced recently (#595 (comment)).

Surely this must be fixed. We have dependencies that we are failing to express; I just don't know what the right place to put them is. cc @jart

Member

jart commented Oct 3, 2017

It's assumed that, when working from source, you'll pip install futures and grpcio manually into your virtualenv, because it's nontrivial to express them in our Bazel build.

It's hard to integrate futures because, in the pip world, installing that package on Python3 is treated as a no-op. I'm not quite certain how to express that in a Bazel build. Integrating grpcio would require a lot of BUILD configuration and a lot of time spent compiling on Travis. It's not a beautiful thing.

I will however note that I've encountered some other strange errors relating to the debugger plugin and grpc. Please see this comment. It seems like our Travis build might be broken and I'm not sure why.
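One way to picture the Python 2/3 asymmetry for `futures`: the backport is only needed where `concurrent.futures` is missing from the standard library. The sketch below computes what would have to be pip-installed by hand for a given interpreter version. The helper name `manual_dev_requirements` is hypothetical, and the package list is taken from this thread rather than from any authoritative file.

```python
import sys


def manual_dev_requirements(version_info=sys.version_info):
    """List packages to pip install by hand before running from source.

    Assumption from this thread: grpcio is always needed, and the `futures`
    backport is only needed on Python 2, where `concurrent.futures` is absent
    from the standard library.
    """
    requirements = ["grpcio"]
    if version_info[0] < 3:
        requirements.insert(0, "futures")
    return requirements


print(manual_dev_requirements())
```

In a pip requirements file the same condition can be written as an environment marker, for example `futures; python_version < "3"`. Whether something similar can be expressed cleanly in the Bazel build is the open question here.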

Member

wchargin commented Oct 3, 2017

@jart: Thanks for the summary. That's quite unfortunate. I'll add that to DEVELOPMENT.md, but I propose that this issue remain open: if we have some opportunity to fix it (a fixit day, or someone just feels like it some time), then that will be nice.

I linked to that comment of yours near the end of my comment; I can reproduce the issues when using TensorFlow nightly, and I have not found a resolution (though I have not looked too deeply, either).

Contributor

caisq commented Oct 3, 2017

@wchargin, @jart: futures and grpcio are listed as dependencies of the tensorboard pip package in setup.py. setup.py obviously does not affect bazel runs, which is the reason for the ImportErrors that @wchargin mentioned. The ImportErrors do not occur when the pip package is built and installed in a virtualenv.

As for the weird issue that @jart mentioned, I just ran bazel test tensorboard/... on my machine in a virtualenv with futures and grpcio installed. I saw some breakage related to SummaryMetadata, but not the one that @jart pasted:

AttributeError: 'SymbolDatabase' object has no attribute 'RegisterServiceDescriptor'

@jart, can you let me know which test shows this particular error?

Member

wchargin commented Oct 3, 2017

@caisq I can reproduce @jart's exact error with the script in #431 (comment) by changing the build number from 52 to 56. Moreover, 56 is the earliest bad build. Simply running bazel run tensorboard triggers the error.

That is, the following script reproduces:

#!/bin/sh
set -eux
tmpdir="$(mktemp -d --suffix _tensorflow)"
virtualenv "${tmpdir}"
. "${tmpdir}/bin/activate"
pip install 'https://ci.tensorflow.org/view/tf-nightly/job/tf-nightly-linux/TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=cpu-slave/56/artifact/pip_test/whl/tf_nightly-1.head-cp27-none-linux_x86_64.whl'
pip install futures
pip install grpc
bazel build //tensorboard
./bazel-bin/tensorboard/tensorboard --logdir ~/data/

Member

wchargin commented Oct 3, 2017

Here's the commit diff from 54→56 (there is no build 55); one of these changes causes the regression: tensorflow/tensorflow@e3ceea3...64f0ebd

Member

chihuahua commented Oct 3, 2017

Have we tried changing the version of protobuf?
GoogleCloudPlatform/google-cloud-python#3967

I think I've seen that AttributeError before while using TensorFlow, and I resolved it by installing protobuf 3.1.0. https://www.tensorflow.org/versions/r0.12/get_started/os_setup#protobuf_library_related_issues

Member

wchargin commented Oct 3, 2017

@chihuahua downgrading protobuf from 3.4.0 to 3.1.0 does not fix the issue.

Member

wchargin commented Oct 3, 2017

I observe the following commit in the list: "Update protobuf to 3.4.1" (tensorflow/tensorflow@d16262d). It seems probable that this is related.

Contributor

caisq commented Oct 3, 2017

I have some rough ideas of what might be the cause and how to fix it from the tensorflow side. Will give it a shot tomorrow.

Member

jart commented Oct 3, 2017

Upgrading grpc and protobuf doesn't fix the issue either. How stable is grpc? I'm concerned that issues like these could cause problems for TensorBoard and TensorFlow users if we make it a dependency. Should we rework the debugger code so that it can survive if importing grpc fails? Then have an "inactive plugin" page that tells the user to pip install grpc if he/she wants to use it?
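A minimal sketch of what such a fallback could look like, assuming nothing about TensorBoard's actual plugin API: the `load_optional` and `debugger_status` helpers and the message text are hypothetical, and only `importlib.import_module` is a real standard-library call.

```python
import importlib


def load_optional(module_name):
    """Import a module by name, returning None instead of raising if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def debugger_status(grpc_module):
    """Describe whether the debugger plugin could be activated (illustrative only)."""
    if grpc_module is None:
        return "inactive: install the gRPC Python package to enable the debugger plugin"
    return "active"


print(debugger_status(load_optional("grpc")))
```

The point of the design is that a failed import only downgrades one plugin to an inactive state rather than preventing TensorBoard from starting.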

Contributor

caisq commented Oct 3, 2017

@jart That sounds good to me, too. I will look into that rework.

Contributor

caisq commented May 24, 2018

This issue is obsolete now. Closing it.

@caisq caisq closed this May 24, 2018
