Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide a way to build a Tensorflow wheel without a dependency on Tensorboard #20280

Closed
orf opened this issue Jun 25, 2018 · 16 comments
Closed
Assignees
Labels
stale This label marks the issue/pr stale - to be closed automatically if no activity stat:awaiting response Status - Awaiting response from author stat:contribution welcome Status - Contributions welcome type:build/install Build and install issues

Comments

@orf
Copy link
Contributor

orf commented Jun 25, 2018

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): N/A
  • TensorFlow installed from (source or binary): Source
  • TensorFlow version (use command below): 1.8
  • Python version: 3
  • Bazel version (if compiling from source): 0.14.1
  • GCC/Compiler version (if compiling from source): 6.3.0
  • CUDA/cuDNN version: N/A
  • GPU model and memory: N/A
  • Exact command to reproduce: N/A

Describe the problem

When building a custom Tensorflow wheel from source there is currently no way to disable the resulting wheels dependency on tensorboard. Tensorboard is great during development but is not useful when running in a headless server environment. It brings with it a large number of dependencies, and as I believe a custom tensorflow build is likely to be used in this kind of environment and it would be great to be able to disable the install_requires dependency on tensorboard in this case.

@marshalhayes
Copy link
Contributor

Hi, @orf. I don't believe that TensorBoard will ever be separated from TensorFlow. The TensorFlow team has worked hard to keep the package small in size, and they want TensorFlow to come with a nice visualization tool.

Please see tensorflow#12567. It doesn't solve your question, but it may be useful in deciding how to continue.

@orf
Copy link
Contributor Author

orf commented Jun 26, 2018

Hey @marshalhayes, thank you for your quick reply. Let me just elaborate a bit.

I have no problem with Tensorboard being bundled with the default Tensorflow package but in the context of a server environment specifically I've elaborated on a few issues below. I only ask for the idea that when building a custom Tensorflow package the option is given to disable the dependency link somehow.

  1. The current stable tensorflow, via tensorboard, pulls in vulnerable versions of html5lib and bleach, in the case of html5lib there is a CVE from 2016 (CVE-2016-9909). This causes headaches with compliance - "yes sir we are indeed running vulnerable data sanitization packages with known CVEs, but trust us it's not used. we swear".

  2. These sub dependencies can cause conflicts in a server environment, especially with bleach and htm5lib, but also with werkzeug and markdown.

  3. Python packaging itself is a headache, and if you use something like pipenv it's not easy or even supported to exclude a subdependency in the way we want. Running pip uninstall xyz afterwards is a hack at best and one only exclusively used for this situation. Not great.

All of these are exacerbated in my mind as Tensorboard is not used or required in the situations we are using it. As we are building a custom Tensorflow package that matches our CPU features I would simply love a build flag to disable the dependency link.

@tensorflowbutler
Copy link
Member

Nagging Assignee @michaelisard: It has been 14 days with no activity and this issue has an assignee. Please update the label and/or status accordingly.

@martinwicke
Copy link
Member

This would be an opportunity to create two packages: tensorflow-core, which only installs tensorflow (and which may in fact result in some tensorflow python endpoints not working) and tensorflow (which is a virtual package that has dependencies on tensorflow-core and tensorboard. In the future, we may split tensorflow-core into more packages.

I am not fundamentally opposed to this, but

  • there haven't been terribly compelling reasons (beyond uninstalling after installing, there are other ugly methods to exclude dependencies during installation)
  • some functionality in TensorFlow does depend on TensorBoard. All of the imports are guarded and should not lead to defects beyond the actual things that won't work, but we'd like to distribute a fully functional version. This can be fixed/minimized further.

Most importantly though, I am hesitant to offer another package that people will come to depend on for a fixed set of functionality. For example, I wouldn't want people to get used to tensorflow-core coming with Estimators. We're in the process of separating tf.estimator into its own package. In the future, more such splits may happen (kernel libraries, ops libraries), and I think the exact dependency tree of all those packages needs to be an implementation detail. Maybe that's ok by calling them tensorflow_private_seekrit_kernels_do_not_depend_directly_on_this, but I think depending directly on this is exactly what you want to do.

I think a tensorflow-server-only package which only contains what you need to run a tensorflow server is really what you want? As in, none of the Python API either?

@orf
Copy link
Contributor Author

orf commented Jul 18, 2018

This would be an opportunity to create two packages: tensorflow-core, which only installs tensorflow (and which may in fact result in some tensorflow python endpoints not working) and tensorflow (which is a virtual package that has dependencies on tensorflow-core and tensorboard. In the future, we may split tensorflow-core into more packages.

I think this might be going too far, all I'm asking for is a TENSORFLOW_DONT_INCLUDE_TENSORBOARD bazel config option or something to just remove it from the build setup.py. Nothing too complex. Splitting them like you suggest does sound like a good idea in theory, but as you said it can come with problems. For now just allow us to remove a dependency if we so require.

You distribute Tensorflow as a lowest common denominator binary. Ok, that's fine and makes sense. So we should build it ourselves for our specific CPU features when running it at any scale. Ok, that's also fine and makes sense. But we should always always always include Tensorboard as a dependency? That's not fine, especially when it leads to conflicting dependencies and those dependencies have known CVE's against them.

I'd love to switch to tensorflow-server, and I'm pushing for it, but... you know. Time.

As a side note, the new Tensorflow release (congratulations!) triggers a defect in the pipenv dependency resolver. The culprit? ... tensorboard (and a non-backtracking resolver). Please let me remove this from our custom builds without having to resort to horrible sed post build hacks.

@martinwicke martinwicke added the stat:contribution welcome Status - Contributions welcome label Jul 23, 2018
@martinwicke
Copy link
Member

Yeah, that's reasonable. @yifeif do you have a sense of how easy this is to achieve using bazel? I suppose we could give setup.py a flag which we set using some bazel option, or something.

Or, we could simply add a flag to setup.py, and you execute setup.py bdist_wheel yourself? That's definitely feasible and looks easy to do.

I've added contributions welcome if you want to propose a specific way to do it.

@yifeif
Copy link
Contributor

yifeif commented Jul 23, 2018

That should be possible. We are already using flags to enable/disable things in setup.py for nightly. Is that something you are looking for @orf?

@orf
Copy link
Contributor Author

orf commented Jul 23, 2018

Yep, that's exactly what I am looking for. Thanks!

Maybe this is a separate ticket but customizing the name of the package via a bazel flag would be awesome as well. Currently I am relying on a particular bash script that is sourced before setup.py is executed to overwrite a bash variable in the script that calls it.

I'm currently a bit short on time but I will try and propose a PR to do both, if that is OK?

@martinwicke
Copy link
Member

That sounds good, thanks!

@tensorflowbutler
Copy link
Member

Please remove the assignee, as this issue is inviting external contributions. Otherwise, remove the contributions welcome label. Thank you.

@mohantym mohantym self-assigned this Nov 3, 2022
@mohantym
Copy link
Contributor

mohantym commented Nov 16, 2022

Hi @orf!
Could you test the instruction in this thread which suggests removing the Tensorboard dependency from pip_package/build file and re-build it from source using Bazel.

Please post on TF-Forum for further assistance.

Thank you!

@mohantym mohantym added type:build/install Build and install issues stat:awaiting response Status - Awaiting response from author labels Nov 16, 2022
@google-ml-butler
Copy link

This issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Thank you.

@google-ml-butler google-ml-butler bot added the stale This label marks the issue/pr stale - to be closed automatically if no activity label Nov 23, 2022
@tekumara
Copy link

bump

@google-ml-butler
Copy link

Closing as stale. Please reopen if you'd like to work on this further.

@google-ml-butler
Copy link

Are you satisfied with the resolution of your issue?
Yes
No

@mohantym
Copy link
Contributor

mohantym commented Nov 30, 2022

Hi @tekumara !
You can route support or custom build issues to TF forum .(It is actively monitored and answered by respective code owners/maintainers).
Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale This label marks the issue/pr stale - to be closed automatically if no activity stat:awaiting response Status - Awaiting response from author stat:contribution welcome Status - Contributions welcome type:build/install Build and install issues
Projects
None yet
Development

No branches or pull requests

9 participants