Enable unbundling dependencies and linking to the system libraries instead. #20284
@angersson so you can make sure this doesn't conflict with your changes. |
@jart Can you have a look at this as well? The gist of this looks good to me, but I'm not certain of the implications for other systems, etc. I'm working on updating my internal proposal to better work with this. |
  fail("\n%sSystem Library Configuration Error:%s %s\n" % (red, no_color, msg))

def _enable_syslibs(repository_ctx):
Since this is a Linux-specific feature, can you disable it on Windows?
You can check if it's building on Windows with a function like:
tensorflow/third_party/py/python_configure.bzl
Lines 31 to 36 in fb3ce04
def _is_windows(repository_ctx):
  """Returns true if the host operating system is windows."""
  os_name = repository_ctx.os.name.lower()
  if os_name.find("windows") != -1:
    return True
  return False
Good call, I added this
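For readers following along, here is a minimal sketch of what such a guard could look like, written in plain Python as a stand-in for Starlark. The `enabled_system_libs` helper and the fake context object are hypothetical and only illustrate the idea; the actual change made to `_enable_syslibs` is not shown in this thread.

```python
from types import SimpleNamespace


def is_windows(repository_ctx):
    """Returns True if the host operating system name mentions Windows."""
    return "windows" in repository_ctx.os.name.lower()


def enabled_system_libs(repository_ctx, requested):
    """Hypothetical guard: ignore any requested system libs on Windows."""
    if is_windows(repository_ctx):
        return []
    return list(requested)


def fake_ctx(os_name):
    """Stand-in for Bazel's repository_ctx, exposing only os.name."""
    return SimpleNamespace(os=SimpleNamespace(name=os_name))
```

With this shape, a Windows host always falls back to the bundled dependencies regardless of what was requested.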
Just a comment about Windows.
LGTM from Bazel side, thanks for the great work!
I pushed an update:
The header file paths for jsoncpp are different from the system paths, so I had to symlink them. bazel complained when I tried doing includes=["/usr/include/jsoncpp"]. Maybe in a future bazel version we can add some configs to allow non-hermetic builds using the system headers, or searching pkg-config or something. I've run this patchset in the gentoo ebuild both bundled and unbundled and it's working so far :)
Thank you for working on this! I've been trying to use this for Anaconda's packages, too. I'm having some difficulty with the hardcoding of paths, such as with cython: https://github.com/tensorflow/tensorflow/pull/20284/files#diff-c8176cf1a58a40560176239b30ef7e52R8 Is there a good way to have that be more dynamic? For example, all of our build paths are relative to an environment variable, |
I have pushed sci-libs/tensorflow-1.9.0_rc1-r2 to the gentoo tree with these patches. The system-libs USE flag enables the system ones; disabling the flag uses the bundled stuff like before in case there are any issues. Hopefully I don't get too many bugs filed because of it :D
@msarahan yeah, the prefixing is something that I foresaw as problematic. Normal gentoo uses $ROOT=/, but we definitely do support both cross-compiling (eg $ROOT=/usr/arm-linux-gnu/) and also installing things as a whole alongside another distro/machine/whatever (eg $PREFIX=/home/jason/root/). Currently those cases are probably also broken.

I was thinking of writing some simple scripts like find-binary.sh and find-python.sh and changing the genrule()'s to use those instead of directly cp'ing fixed paths. Once there are scripts it would be quite easy to add extra logic per-distro or to search pkg-config for whatever flags. Not every distro puts binaries in the same place either, so scripts would be more robust.

About $PREFIX specifically, bazel currently strips basically everything from the environment. There are issues filed with the bazel team and it will eventually pass through more variables. In the meantime you'd need to add something like build --action_env=PREFIX.

This patch set is already getting pretty large, so I think we should get this basic functionality merged in; then later patchsets will be smaller and much easier to review :)

Other than editing some paths in third_party/systemlibs/*.BUILD, has the rest mostly been working for you? If some of the core bits fail horribly for your case, that would probably warrant fixing first. You can look at the gentoo ebuilds for inspiration too but they aren't pretty yet: https://github.com/gentoo/gentoo/tree/master/sci-libs/tensorflow
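As a rough illustration of the find-binary idea, the lookup could check `$PREFIX/bin` first and then fall back to the regular `PATH`. This is only a sketch in Python rather than shell; the `find_binary` name and the "prefix first, then PATH" order are assumptions, not something settled in this thread.

```python
import os
import shutil


def find_binary(name, prefix=None):
    """Locate an executable, preferring <prefix>/bin over the normal PATH.

    Returns the full path, or None if the binary cannot be found.
    """
    if prefix:
        # Restrict the search to the prefix's bin directory first.
        found = shutil.which(name, path=os.path.join(prefix, "bin"))
        if found:
            return found
    # Fall back to whatever PATH the calling environment provides.
    return shutil.which(name)
```

A genrule wrapper could then call something like `find_binary("cython", os.environ.get("PREFIX"))`, provided PREFIX is passed through with build --action_env=PREFIX.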
Thanks @perfinion. The problem with hardcoded paths in general is that we don't really use them for conda packages. $PREFIX points to some path that looks like /usr or /usr/local (with include, bin, and lib subdirs), but it is not a fixed path. For example, during a package build, it will be something like
with (pad) representing enough placeholder characters to get us to a 255-character padding for any fixed paths that get baked into binaries. That's essential to relocatability. Not having dynamic paths is a pretty big non-starter for us. Adding --action_env=PREFIX is definitely something I knew we'd need to use - I'm just curious about how to incorporate that into the BUILD files. To put headers and libs onto the search paths, I first symlinked my $PREFIX var in my build script to a location within the TensorFlow third_party folder, and added that folder to the include and link paths:
That worked OK, seemingly. Next I was hung up on the hard-coded paths as I mentioned before. I began to try something a little different, based on what I found at https://stackoverflow.com/questions/43928653/call-llvm-config-prefix-and-use-it-in-a-build-rule/43936381#43936381. I removed the BUILD files completely, and instead tried to generate them more dynamically:
I have not gotten this working yet, but I think it might work. It would still require --action_env=PREFIX, but that's fine with me. Any thoughts on the viability of this approach? It need not be part of this PR, if it is better to get this initial work in and then adapt. |
PS: for the executables that needed to be copied, I symlinked after using which to find them:
|
@msarahan for cython specifically, it used to be prefixed with $PYTHON_BIN_PATH, which made things not work on gentoo since we have a special thing to select which python version is the default. That has been reverted, so it looks like it's really easy now. You don't need to bother with any of the generating stuff; the cmd=, copts and linkopts can all do make and bash variable expansion. You should be able to have just cython.BUILD with this:
Alternatively, in .bazelrc set putting an In copts and linkopts you can probably do "-I$(PREFIX)/include". |
Seems very close to working, but I'm confused on an error with the external grpc. I have put up my patch (I am applying it to 1.9.0rc1 right now): https://gist.github.com/msarahan/e38ccd45521617356a099d50b172f2b8 The error is:
My build script calls bazel with:
Before I was passing in --define as you recommended, it was dying in earlier externals, such as png. That gives me confidence in the overall approach, but it seems I have something wrong. If I remove grpc from the list of externals, it gets further, but dies on missing the license from LMDB. It should not be necessary to list actual files in the external stuff, right? |
@msarahan I rebased everything and also added the $(which foo) stuff. About your grpc error: it's probably because you need to use double $$'s here: I added those parts already though, so hopefully it should work if you just add the linkopts and includes. |
94e87b9
to
140c34b
I think I'm satisfied now with this patchset so far. It should be ready for review then merging when the workflow rejig is done. I've pushed the older version with the gentoo _rc1-r2 ebuild. The _rc2 ebuild with the latest patches will be pushed to gentoo once some of the deps are bumped too. |
Can you fix the buildifier failure? |
@perfinion Thanks for keeping up with updating this! Would you mind commenting with some commands to demonstrate how to test this on Ubuntu? It'll double as documentation for later. I'm far from an expert on builds, and as a user I wouldn't know how to go from 'jpeg is one of the packages that can come unbundled' to 'I need to |
Signed-off-by: Jason Zaman <jason@perfinion.com>
…em_lib macros Signed-off-by: Jason Zaman <jason@perfinion.com>
Signed-off-by: Jason Zaman <jason@perfinion.com>
The grpc license files are deeper in the hierarchy so are hard to nop out in the system version of the BUILD file. Signed-off-by: Jason Zaman <jason@perfinion.com>
The jsoncpp headers are included with a different path so we have to symlink them so they are in the dir structure that is expected. Signed-off-by: Jason Zaman <jason@perfinion.com>
…resh Signed-off-by: Jason Zaman <jason@perfinion.com>
@martinwicke @angersson I fixed the buildifier style failure and rebased. Can you re-trigger the tests? I think the nobuild error was unrelated; it looks like this commit fixes it. The rebase should pick that up too.
The changes here need manual intervention to work with the new PR import process, which I am currently working on handling. |
PiperOrigin-RevId: 204539752
Done and merged! Thanks! |
Yeah! Thank you all for vastly improving the TensorFlow experience on Gentoo Linux. |
/cc @dslomov |
tensorflow#20284 Added the framework to unbundle deps and use system libraries. The TF_SYSTEM_LIBS variable needed to be added manually. This makes configure handle it for convenience. There is no prompt yet, that will be added later. Signed-off-by: Jason Zaman <jason@perfinion.com>
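To make the follow-up concrete, a configure step could turn a TF_SYSTEM_LIBS environment variable into the .bazelrc line used by this PR. The sketch below is only an illustration under that assumption; the `system_libs_bazelrc_line` helper is hypothetical and is not the code from the referenced commit.

```python
def system_libs_bazelrc_line(environ):
    """Build the .bazelrc line for TF_SYSTEM_LIBS, or None if it is unset.

    `environ` is any mapping, for example os.environ.
    """
    raw = environ.get("TF_SYSTEM_LIBS", "")
    # Normalise "a, b ,c" into "a,b,c" and drop empty entries.
    names = [name.strip() for name in raw.split(",") if name.strip()]
    if not names:
        return None
    return 'build --action_env TF_SYSTEM_LIBS="%s"' % ",".join(names)
```

Returning None when the variable is unset or empty keeps the default, fully bundled build unchanged.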
@perfinion do you observe a significant change in build time on gentoo when using system libraries? I tried to adopt this for the nix package, but to my surprise the build time barely changed (around 1h on a 12 core machine, 3.5h on a 4 core). |
@timokau Sorry, didn't see this earlier. Yeah, it's a fair bit faster if you unbundle all of them. GRPC, protobuf and several others are decently big, but you're right that stuff like libpng is fairly small so doesn't take much compile time. The biggest part I've noticed though is that the linking stage is a lot faster and uses a lot less RAM than when static linking. Also the python deps were really annoying unless unbundled. Also feel free to email me (jason at perfinion dot com) if you have any issues with packaging, I'd love to help but I might not see this thread. :) |
Thank you for your feedback and the offer :) We have since managed to get the source build working and made use of most of the decoupled dependencies. Bundled deps are not quite as painful with nix (as everything has its own FHS etc.), but it's still nice to be able to unbundle them. Thank you for your work on this! |
This series adds a framework for unbundling the dependencies from tensorflow. Previously, bazel rebuilt every single dep from scratch. This allows distro packages to disable bundling on a per-dep basis and link with the libraries that already exist on the system.
For more information on why deps should be unbundled, see these links: https://wiki.gentoo.org/wiki/Why_not_bundle_dependencies
https://fedoraproject.org/wiki/Bundled_Libraries
The deps I have unbundled so far work on my Gentoo machine. Thanks to dennisjenkins@google.com for testing this on Gentoo as well. There are still more that need to be unbundled, but this covers enough to be useful already. This has not been tested on any distro other than Gentoo, but I don't foresee any big incompatibilities.
The libs that are unbundled are configured by setting
build --action_env TF_SYSTEM_LIBS="comma,sep,list"
eg:
Once configured, the tarball for the dep will not be downloaded or extracted, and the BUILD file will be swapped to the system_build_file instead which contains different rules to compile and link against the system package.
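The swap described above can be pictured roughly like this. It is a plain-Python sketch of the decision only; `plan_repository` and the dictionary it returns are hypothetical and do not mirror the actual repository rule in the patch.

```python
def plan_repository(name, urls, build_file, system_build_file, system_libs):
    """Decide how a dependency is set up, given the enabled system libs.

    If `name` is listed in `system_libs` and a system BUILD file exists,
    nothing is downloaded and the system BUILD file is used instead.
    """
    if name in system_libs and system_build_file:
        return {"download": False, "urls": [], "build_file": system_build_file}
    # Default: fetch the bundled tarball and use the regular BUILD file.
    return {"download": True, "urls": list(urls), "build_file": build_file}
```

A dep that has no system_build_file stays bundled even if it is listed, which is one possible way to handle deps that have not been unbundled yet.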
There are also macros if_system_lib(name, a, b) and if_not_system_lib(name, a, b) to configure other things in the build.
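A minimal sketch of how the two macros might behave, assuming `a` is chosen when the condition holds and `b` otherwise. It is written in plain Python; the `make_system_lib_macros` factory and the library names in the usage note are illustrative assumptions, not the generated Starlark from the patch.

```python
def make_system_lib_macros(system_libs):
    """Return (if_system_lib, if_not_system_lib) bound to a set of lib names."""
    enabled = frozenset(system_libs)

    def if_system_lib(name, a, b=None):
        # Pick `a` when `name` is taken from the system, else `b`.
        return a if name in enabled else b

    def if_not_system_lib(name, a, b=None):
        # Pick `a` when `name` is still bundled, else `b`.
        return a if name not in enabled else b

    return if_system_lib, if_not_system_lib
```

For example, with only a hypothetical "grpc" entry enabled, `if_system_lib("grpc", ["-lgrpc"], [])` would yield the linker flag while the same call for "png" would yield the empty list.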
DO NOT MERGE THIS YET. I'm opening this for comments and review here before it's ready for merging.