Requesting information on prebuild strategy #2206

Closed
kmalakoff opened this issue Dec 8, 2016 · 9 comments
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: feature request

Comments

@kmalakoff

I've been using bazel to build tensorflow in docker and on macOS as I'm experimenting with porting it to Node.js.

Ideally, I would like to pull down the prebuilt headers and libraries when they have been published for my platform at the released version (e.g. r0.10.0, r0.11.0, etc) with the desired variants (e.g. gpu, non-gpu, etc), possibly using tooling similar to that in the Node.js ecosystem: node-pre-gyp, node-pre-gyp-github, and electron-prebuilt.

I was wondering if there is a strategy for prebuilding published releases in the bazel ecosystem as a release / distribution mechanism. I understand that it might not be the responsibility of the core of bazel, but as I've been waiting for builds to complete, it is something I have grown to appreciate the need for...just wondering if this is planned, in the works, etc.
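
To make the request concrete, here is a minimal TypeScript sketch of how a prebuilt archive could be keyed by release, platform, and variant. Everything in it is an assumption for illustration: the `PrebuiltTarget` type, the `artifactName` function, the "cpu" label for the non-gpu variant, and the file naming scheme are hypothetical and are not provided by bazel or tensorflow.

```typescript
// Hypothetical description of one prebuilt artifact.
// None of these names come from bazel or tensorflow.
type Variant = "cpu" | "gpu";

interface PrebuiltTarget {
  version: string;  // released version, e.g. "r0.11.0"
  platform: string; // e.g. "darwin" or "linux"
  arch: string;     // e.g. "x64"
  variant: Variant; // "cpu" stands in for the non-gpu build
}

// Build a deterministic archive name so that whoever publishes the
// binaries and whoever installs them agree on the same key.
function artifactName(t: PrebuiltTarget): string {
  return `tensorflow-${t.version}-${t.platform}-${t.arch}-${t.variant}.tar.gz`;
}

const example = artifactName({
  version: "r0.11.0",
  platform: "darwin",
  arch: "x64",
  variant: "cpu",
});
```

With a key like this, an installer only needs to know the release and the user's configuration to decide which archive to ask for.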

@iirina iirina added category: misc > release / binary P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) type: feature request labels Dec 9, 2016
@damienmg damienmg added P2 We'll consider working on this in future. (Assignee optional) and removed P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) labels Dec 12, 2016
@damienmg damienmg added this to the 0.8 milestone Dec 12, 2016
@damienmg
Contributor

TL;DR: there is no strategy yet, but we want to have one.

@damienmg
Contributor

/cc @klimek

@damienmg damienmg removed their assignment Dec 12, 2016
@ghost

ghost commented Dec 12, 2016

There are multiple questions that you could be asking here:

  1. how to include a dependent bazel project in your own bazel project easily
  2. how to download / install a released bazel project

Those will require significantly different answers.

For 1) we need to make the c++ builds work better in a virtualized environment; I believe bazel team is working on that in various ways.

For 2) I expect this will mainly stay "install via your distribution", be that macports, ubuntu, or a bsd flavor; that'll require having a good "bazel install" story though, which we don't have yet.

I expect you asked for the second, as you mentioned prebuilt headers, but wanted to be sure :)

@kmalakoff
Author

@klimek Thank you for confirming! I am more interested in 2.

To clarify on "install via your distribution": I am not looking for an install from source, but for a repository of prebuilt libraries that can be pulled from, avoiding 1) the need for bazel to be installed on the user's machine and 2) the compile time. The use case is like node-pre-gyp: the user runs "npm install" and the libraries are simply copied down to their machine if they have been released for the user's configuration; if they are unavailable, the compilation / install from source is triggered (where they would need to install bazel, etc).
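
As a rough illustration of that fallback, here is a minimal TypeScript sketch of the install decision. It is only loosely modelled on the node-pre-gyp behaviour described above and is not its actual API: the `InstallPlan` type, the `planInstall` function, the `published` index, and the archive names are all hypothetical.

```typescript
// Hypothetical install decision: download a prebuilt archive if one was
// released for this configuration, otherwise fall back to a source build.
type InstallPlan =
  | { action: "download"; artifact: string }
  | { action: "build-from-source"; reason: string };

// `published` stands in for whatever index of released archives a real
// registry would expose; here it is just an in-memory set of names.
function planInstall(wanted: string, published: ReadonlySet<string>): InstallPlan {
  if (published.has(wanted)) {
    // Prebuilt archive exists: no bazel on the user's machine, no compile time.
    return { action: "download", artifact: wanted };
  }
  // Nothing released for this configuration: compile locally,
  // which is the only path where bazel would need to be installed.
  return { action: "build-from-source", reason: `no prebuilt artifact named ${wanted}` };
}

// Illustrative archive names only.
const published = new Set(["tensorflow-r0.11.0-darwin-x64-cpu.tar.gz"]);
const hit = planInstall("tensorflow-r0.11.0-darwin-x64-cpu.tar.gz", published);
const miss = planInstall("tensorflow-r0.11.0-linux-x64-gpu.tar.gz", published);
```

Here `hit` describes a plain download, while `miss` describes the source build that would require bazel. The point of the design is that most users only ever take the first branch.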

FYI: I have a script for 1 and vendor folder structure that you might find interesting since I spent quite a bit of time trying to figure out how to build tensorflow using bazel without adding my project to the tensorflow repo. I was able to use a simple command like "bazel build @org_tensorflow//tensorflow:all_files" to bootstrap the source code in my bazel cache.

@ghost

ghost commented Dec 13, 2016

@kmalakoff - yep, for that we need to basically allow a bazel built repository to be "bazel install"able (that's what binary distributors require), then the distributions will start to pick up the projects you're interested in by themselves (in my experience).

@kmalakoff
Author

@klimek interesting...I'm still trying to connect the dots on what an ideal solution would look like.

When I started to play around with tensorflow, it sent me down the path of learning about bazel and then trying to find a way to configure a project to build the binaries. What I really wanted was the binaries and headers for specific releases that I could just use, but I needed to spend time on learning tooling, setting up build scripts, trial and error to get them working in a non-intrusive way, and waiting for builds with each iteration...and I'm still not satisfied, because if I distribute my library, which has a dependency on tensorflow, other people will have to go down the same learning path (e.g. what is bazel, how do I install it, etc).

If there is already a common solution for the distribution and installation of binaries, with wrappers in various languages, that would be perfect. Unfortunately, I am not very familiar with the ecosystem around binary distribution. Is there a binary distributor today where I can find and install the tensorflow binaries and headers without python and bazel? If there isn't, it seems like there is a missing solution around distribution.

More recently I have been working with Node.js, which focusses on packages, so my thinking is coming from that perspective: more on the distribution process than the build process. In this issue, there is a discussion on python dependencies in node-gyp which spawned a project called cmake-js and a discussion around binaries.

Perhaps the solution space is in having bazel-as-a-service to build the binaries off machine, cache them, serve the binaries and headers, and have light client libraries in various languages to fetch dependencies.

From the opening of this issue, I'm wondering what the "strategy is for prebuilding published releases in the bazel ecosystem as a release / distribution mechanism". Unfortunately, I'm not very familiar with the ecosystem around bazel, but this seems like a great opportunity to connect bazel with binary distributors, to make a package manager for bazel, etc.

@ghost

ghost commented Dec 14, 2016

@kmalakoff - which platform are you talking about? For linux, we have a long history of the distributions (debian, ubuntu, etc) shipping all the binary packages you want. You just install them and you're done.

If you want to install something from source, the process usually is git clone, configure, make install (or similar); this is something that bazel needs to learn. This will probably also be a precondition for the packages being picked up by the linux distributions (as the scripts there work with that process).

For mac, some projects do their own binary distributions, others rely on macports. On windows, you usually get prepackaged things.

@kmalakoff
Author

kmalakoff commented Dec 14, 2016

@klimek all platforms that use npm / yarn to install Node.js modules. The interesting part is that the target could be more than just the machine the module is installed on: in the case of React Native, for example, it could be an install for iOS or Android targets (e.g. npm install tensorflow-react-native-ios).

This request for clarification on strategy is definitely about avoiding the need to build from source during installation, as opposed to during development, where it is more acceptable (although binaries and headers for linking against a specific released version would still be an important use case and a speedup in development).

If you try to run brew install tensorflow, there is no released module and the tensorflow team only gives prebuilt binaries in Python packages. Unfortunately, the platform install method wouldn't cover the case of distributing React Native modules that depend on "other platform" code.

Also, if you look at how developers who are trying to write tensorflow extensions are adding their projects to a clone of the whole tensorflow repository (often under contrib) because of the bazel workflow, this gives an indication of the importance of the distribution problem. This was an unacceptable solution for me which is why I worked on the install script using the vendor directory as I mentioned above.

There are three really important philosophies that npm-based use cases are built on:

  1. project local installs - avoiding global system installations, pulling down prebuilt binaries whenever possible for the application at hand, etc

  2. keeping things simple - it is a much more accessible experience when people do not have to fuss around with native binary distributions. For example, if you have a project where artists and designers also need to install an application to implement and test their work, you want to make it as simple as possible for them to get up and running. Some teams are now resorting to Docker for development just to avoid the setup time, which indicates that there is a problem here that still needs to be solved.

  3. cross-platform - it should be simple to write and distribute a module that runs on multiple platforms. I personally develop on a Mac, but with bazel being cross-platform, I would like to be able to easily compile for all targets without having to learn much about the tooling for each platform, e.g. treat native like a cross-platform JavaScript module whenever possible.

IMHO, the important aspect of the bazel distribution strategy is to focus on the installation use cases for a variety of package managers (not global platform installers), for prebuilt binaries with headers that can be linked to locally and in off-platform (e.g. mobile) builds, across a wide variety of package consumers with different levels of technical background. I think that would be a much better user experience.

@aiuto aiuto added P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Bazel General Bazel product/strategy issues and removed P2 We'll consider working on this in future. (Assignee optional) category: misc > release / binary labels Dec 7, 2018
@jmmv jmmv removed this from the 0.8 milestone Mar 14, 2019
@dslomov dslomov added team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. and removed team-Bazel General Bazel product/strategy issues labels Jul 23, 2019
@philwo philwo added the team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website label Jun 15, 2020
@philwo philwo removed the team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website label Nov 29, 2021
@sgowroji
Member

Hi there! We're doing a clean up of old issues and will be closing this one. Please reopen if you’d like to discuss anything further. We’ll respond as soon as we have the bandwidth/resources to do so.


8 participants