Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Control linking of dependencies for cc_library #492

Closed
ghost opened this issue Oct 2, 2015 · 24 comments
Closed

Control linking of dependencies for cc_library #492

ghost opened this issue Oct 2, 2015 · 24 comments
Labels
P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) team-Rules-CPP Issues for C++ rules type: feature request

Comments

@ghost
Copy link

ghost commented Oct 2, 2015

cc_library currently does not link its dependencies, but cc_binary does. It whould be nice if I could control this behaviour for each dependency to create cc_libraries that do contain the symbols from their dependencies. For example by using a flag like linkwithlibrary=true in each dependency or giving a list of "must-link" dependencies inside the final cc_library definition

See also: http://stackoverflow.com/questions/32845940/symbols-from-static-cc-library-dependency-in-so-missing

@ulfjack
Copy link
Contributor

ulfjack commented Oct 2, 2015

Why not use cc_binary if you want to build a dynamic library?

On Fri, Oct 2, 2015 at 9:33 AM, jcremer notifications@github.com wrote:

cc_library currently does not link its dependencies, but cc_binary does.
It whould be nice if I could control this behaviour for each dependency to
create cc_libraries that do contain the symbols from their dependencies.
For example by using a flag like linkwithlibrary=true in each dependency or
giving a list of "must-link" dependencies inside the final cc_library
definition

See also:
http://stackoverflow.com/questions/32845940/symbols-from-static-cc-library-dependency-in-so-missing


Reply to this email directly or view it on GitHub
#492.

@ghost
Copy link
Author

ghost commented Oct 2, 2015

This seems to work for me, but i am not sure if the generation of cc_binary is always the wanted behaviour; For example if you want to have more control over the linked libraries or generate intermediates that i can then use as a build result and additionally to generate more dependent libraries or binaries.
Additionally cc_binary seems to already bind a specifing glibc version.

@hanwen
Copy link
Contributor

hanwen commented Oct 8, 2015

The relevant documentation is here: http://bazel.io/docs/build-encyclopedia.html#cc_binary.linkstatic

Intermediates for what? If you are building an intermediate for Bazel, you should be using cc_library, and everything should work. If you are building an intermediate for use elsewhere, it would help if you further specified what you want.

@ghost
Copy link
Author

ghost commented Oct 28, 2015

I have a main application that loads some other .so files as modules. These modules use symbols already available inside the main application and i dont want to have these symbols also inside every module. But I have some other dependencies inside the module whose symbols I do want inside the module binary. So I need to control which symbols may be undefined inside my module lib at compile time and which are not.

@hanwen
Copy link
Contributor

hanwen commented Oct 28, 2015

I'm pretty sure this is possible, I think it would work like this:

cc_binary(name = "main",
srcs = ["main.cc"],
deps = [ ":lib"])

cc_binary(name ="module",
srcs = ["mod.cc",],
deps = [":lib"])

cc_library(name ="lib",
srcs = [ "lib.so" ]
hdrs = [ "lib.h" ])

adding @ulfjack in case I'm totally off.

@ulfjack
Copy link
Contributor

ulfjack commented Oct 30, 2015

Going through .so files may work, but seems a bit roundabout and increases number of pieces to deploy. That said, we don't have a really good mechanism for this; I think the most obvious way to do it is to have a header-only library that doesn't have implementation .cc files, and compile against that in the modules. (And of course, make sure that the implementation is in the binary that wants to load the modules.)

@damienmg
Copy link
Contributor

Is there anything left for addressing this issue?

@ghost
Copy link
Author

ghost commented Nov 18, 2015

As ulfjack already said, the current options are suboptimal, but it is possible to get the wanted behaviour by defining some cc_libraries twice: complete and only-header.
It whould be nice if i was somehow able to specifically exclude/include symbol links from each dependency inside my artifact. Now the behaviour seems to be: link everything for cc_binary and leave unnamed everything for cc_library which often makes sense but not allways.

@damienmg damienmg added type: feature request P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) Native rules (Java / C++ / Python) and removed under investigation labels Nov 18, 2015
@zonr
Copy link

zonr commented May 28, 2016

I have the same problem when I added linker_flag: "-Wl,--no-undefined" to my CROSSTOOL and then trying to build a cc_library alone. I'm quite surprised that cc_library doesn't link its deps. Removing that flag can workaround this issue (like we did in #67 (efd5d31) before).

IMHO, linking with the dependent libraries (specified in deps) for cc_library target would be a desired behavior. This can also pull those dependencies into DT_NEEDED entries of output .so so the output .so can be loaded with dlopen(). Currently, this is achieved by using cc_binary with linkshared (this flag imposes a limitation on binary name). What do you think to change cc_library's behavior and deprecates the latter approach? Using cc_binary to build a library (more specifically ... complement the missing behaviors in cc_library) seems a hack to me.

@ulfjack
Copy link
Contributor

ulfjack commented Jun 7, 2016

Use a cc_binary (right now, we may introduce more / better rules to do that) to link dependencies.

For larger code bases, linking the transitive closure at every cc_library would be prohibitively expensive (I haven't actually collected data on this, but I'm very confident). cc_library isn't intended as a top-level rule for external consumption - it's not a 'library' in the same sense as, say, openssl is a library. We generally expect cc_library rules to be much more fine-grained, in the limit down to having a single .cc file per cc_library.

@zonr
Copy link

zonr commented Jun 8, 2016

I agree with you that linking takes more time and more memory for more dependencies. This becomes a serious problem when dealing with large code bases. That's why incremental linking is included in many people's most-wanted feature list.

Thanks for reiterating the semantic and design of cc_library. IIRC, cc_library is used to produce an intermediate result for a collection of files that share the same build configuration (e.g., compiler flags). However, the intermediate result comes in 2 forms on Linux:

  1. When building a cc_binary that depends on the cc_library: the cc_library is built to a .a which by its nature doesn't make sense to link against the dependencies.
  2. When building a cc_test that depends on the cc_library without setting linkstatic: the cc_library is built to a .so. In this case, I hope that all dependencies could be pulled to produce an intact dynamic library (i.e., exhaustive list of DT_NEEDED entries such that there's no undefined symbols).

IMHO, whether cc_library links the dependencies depends on what it is producing.

Besides, given such design of cc_library and the following project,

  WORKSPACE
  app_a/
    BUILD
    main.c
  app_b/
    BUILD
    main.c
  libfoo/
    BUILD
    foo.c

Both app_a and app_b depends on libfoo. Is it possible to build libfoo.so such that it can be shared by both app_a and app_b in Bazel? I.e., libfoo.so is no longer an intermediate result. It should be included in the output image and is linked by the executables built from app_a and app_b.

@ulfjack
Copy link
Contributor

ulfjack commented Jun 15, 2016

We tried incremental linking, and it was a small performance win, but not significant enough to warrant the complexity.

We don't actually build .a files on Linux, where we instead pass the original .o files to the final link (or at least, we have code to do that, and I think it's enabled by default). This is faster than building intermediate .a files and uses less disk space.

In the given example, it's not really possible right now to build libfoo.so such that you can install all three independently. One of the problems with dynamic linking is that you have to carefully make sure to never statically link common dependencies into multiple outputs.

For example, consider two binaries A and B, depending on a library LIB, which depends on a library LIB2. Let's say A also directly depends on LIB2. If you link LIB as a .so, what do you do with LIB2? You could link it into LIB.so, but then you have to make sure to not link it into A. Or you could require linking it into it's own LIB2.so, in which case you have to make sure that both LIB and A link it dynamically, and the downside is that you then can't use multiple rules for LIB2 - you have to force a single rule to contain everything that you want linked into LIB2.so.

A simple, and usually workable policy is to never link dependencies into libraries. At Google, we don't ship pieces of applications independently of each other, we always ship the entire app. This works great for production systems, but doesn't match how Linux distributions work today. Though note that there are efforts to move closer to that, e.g., Ubuntu snappy packages.

For Bazel, we'd like to offer a mode that covered this case better, but note that it's inherently brittle under changes to the structure of the dependency graph. We haven't yet been able to come up with a good model: in order to avoid it being brittle, we need to enforce global properties on the build graph, which turns it into a scalability issue.

We briefly discussed allowing users to determine the subdivision between linked units manually, and Bazel enforcing correctness at the binary level. It's still brittle, but at least the build will fail if you break it, rather than discovering issues at runtime.

@knightmj
Copy link

If you have multiple dependent shared libs is there a way to maintain those dependencies? It seems like I would need to list all needed shared libs for any rule. for example look at test. Is there a work around for this?

cc_test(
    name="test",
    srcs=glob(["tests/*.cc"]) + [":lib2.so", ":lib1.so"],
    deps=[":lib_hdrs"]
)
cc_binary(
    name="lib1.so",
    srcs=glob(["lib1.cc", "lib1.hh"]),
    visibility=["//visibility:public"],
    linkshared=1
)

cc_binary(
    name="lib2.so",
    srcs=glob(["lib2.cc", "lib2.hh", "lib1.hh"]) + [":lib1.so"],
    visibility=["//visibility:public"],
    linkshared=1
)

@lberki lberki added P2 We'll consider working on this in future. (Assignee optional) and removed P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) labels Mar 3, 2017
@hlopko
Copy link
Member

hlopko commented Apr 13, 2017

After 2 years we're going to make this happen! Please take a look and comment on the transitive libraries design doc: https://docs.google.com/document/d/1-tv0_79zGyBoDmaP_pYWaBVUwHUteLpAs90_rUl-VY8/edit?usp=sharing

@mwoehlke-kitware
Copy link

See also https://stackoverflow.com/questions/44723821/why-does-bazel-under-link-and-how-do-i-fix-it. This is just broken; Bazel should be able to produce one .so that can be used by both internal (i.e. within Bazel) and external (i.e. not using Bazel) consumers. (Bonus points for not forcing the Bazel target to have a stupid name.)

@EricCousineau-TRI
Copy link
Contributor

My takeaway from this is that for development, Bazel-style internal library linking is awesome because of the granularity it yields; however, for deployment, if you have all your Bazel internal libraries / binaries use linkstatic = 0 / linkshared = 1, and if you install these items, you'll have the weird dependencies like @mwoehlke-kitware mentioned, such as _solib_k8/_U@vtk _S_S_Cvtkglew___Uexternal_Svtk_Slib/libvtkglew-8.0.so.1, which probably would not have been listed in some install rule / manifest (and you probably wouldn't want it to be).

Since patchelf supports --remove-needed and --add-needed, and it's available in Homebrew on Mac, could the following be done for shared library deployment, keeping Bazel's granular targets for development, but linking against deployment libraries at install time?

  • Ensure that all internal libraries which might have diamond dependency problems are linkstatic = 0, to avoid ODR via the *.sos
  • Define common deployment libraries which consume these internal libraries, and link them into one *.so.
    • Whitelisting seems like it'd be the easiest for static dependencies into the deployment library - perhaps "reexport_deps" would be the solution here?
      * In Bazel development land, when you run the binary, it will run everything as normal; it will link against the internal library solibs without a hitch. This still enables granular development (you don't have to build a whole submodule when testing just a small potion; keep it Bazel-style rather than CMake-style.)
      * In a custom / built-in deployment step (like @drake//:install), when you install an artifact, you check the upstream solibs. For each internal version, you check the other deployed libraries (much less granular), remove the solibs from the binary (be it an executable or another solib) via patchelf --remove-needed, and add the containing deployed libraries via patchelf --add-needed.

My naive understanding is that this is similar to 5.d. in the Attack Plan, but instead of distributing the top-level library with its upstream components (possibly with or without odd names) in a solib directory, this would use the merged solib with the upstream solibs, and any components downstream of the original solibs would now use the merged deployment solib.

Can I also ask if this understanding is correct? (Specifically, would this be a valid usage of reexport_deps for shared libraries?)

Quick example (leveraging Matt's install setup in Drake as an example deployment mechanism):

# In //common:
cc_library(name = "math_utils", ...)
cc_library(name = "io_utils", ...)
# Deployment library. This would include transitive headers too.
cc_binary(name = "common", output = "libmy_project_common.so", reexport_deps = [":math_utils", ":io_utils"])
# - Deployment mechanism
install(name = "install", targets = [":common"])

# In //apps/example
cc_binary(name = "main", deps = ["//common:math_utils", "//common:io_utils"])
# ^ Running this would link directly against the internal Bazel libraries
# - Deployment mechanism. When "main" is installed, it is now linked directly against "libmy_project_common.so"; no other artifacts are needed for deployment.
install(name = "install", targets = [":main"], deps = ["//common:install"])

@filippobrizzi
Copy link

Hello! Two years have passed since the last message and I was wandering if there is any plan to add the cc_static_library rule to Bazel.
Otherwise, does any one have a custom implementation of the rule that can share?
Thanks!

@hlopko
Copy link
Member

hlopko commented Oct 25, 2019

Hi @filippobrizzi, it shouldn't be too hard to tweak https://github.com/bazelbuild/rules_cc/blob/master/examples/my_c_archive/my_c_archive.bzl to collect transitive static libraries from CcInfo and create an archiving action for all of them.

@sgowroji
Copy link
Member

Hi there! We're doing a clean up of old issues and will be closing this one. Please reopen if you’d like to discuss anything further. We’ll respond as soon as we have the bandwidth/resources to do so.

@sgowroji sgowroji closed this as not planned Won't fix, can't repro, duplicate, stale Jan 31, 2023
@goldencz
Copy link

@sgowroji I think we should keep this open. cc_shared_library resolves the transitive depdency particially. I think there are still many issues to address:

  1. cc_shared_library can't be dependent of other cc_rules
  2. the static library issue is not solved.

@fmeum
Copy link
Collaborator

fmeum commented Apr 3, 2024

I think we can close this now:

  1. cc_shared_library can't be dependent of other cc_rules

cc_binary has the dynamic_deps attribute which accepts cc_shared_librarys. For cc_library, depending on transitive libraries isn't generally safe as it could result in duplicate symbols. If needed, the user can take on the risk by wrapping a cc_shared_library in a cc_import.

  1. the static library issue is not solved.

This is covered by #1920, which has an open PR pending implementing an accepted design.

If there are any use cases that aren't covered by other open issues, it would be very helpful to have dedicated issues with smaller scope opened for them. These large umbrella issues just tend to get stale as they aren't actionable.

@fmeum fmeum closed this as not planned Won't fix, can't repro, duplicate, stale Apr 3, 2024
@GabrielDracula
Copy link

GabrielDracula commented Apr 16, 2024

@fmeum Hello, I'm building a dynamic library for my proto message cause there is a proto instance problem happend when we collaborate two dynamic libraries that static link to this proto. In order to reduce the difficulty of managing the library, I want to build a dynamic proto library which static link to its dependencies for proto.
I'm trying to use cc_shared_library wrapped in a cc_import, but I have been troubled by the following problem for a long time and have not found a solution, while bazel-out/k8-opt/bin/some_path/proto/test_proto_message.pb.h has been created. Is it feasible to use this method? I sincerely want to get some advice.

ERROR: /some_path/BUILD:6:11: Compiling some_path/test_target.cc failed: undeclared inclusion(s) in rule '//some_path:test_target':
this rule is missing dependency declarations for the following files included by 'some_path/test_target.cc':
 'bazel-out/k8-opt/bin/some_path/proto/test_proto_message.pb.h'

We use version 6.5.0 of bazel. Here is my BUILD under some_path/proto and some_path.

In some_path/proto/BUILD:

proto_library(
    name = test_proto_message_proto,
    srcs = ["test_proto_message.proto"],
)
cc_proto_library(
    name = "cc_test_proto_message_proto",
    deps = [":test_proto_message_proto"],
)
cc_shared_library(
    name = "test_proto_dynamic_lib",
    deps = [
        ":cc_test_proto_message_proto",
    ],
)
cc_import(
    name = "test_proto_lib",
    # hdrs = ["test_proto_message.pb.h"], # I tried to add hdrs in different ways but it also failed.
    shared_library = ":test_proto_dynamic_lib",
)

In some_path/BUILD:

cc_library(
    name = "test_target",
    srcs = ["test_target.cc"],
    hdrs = ["test_target.h"],
    deps = [
         "//some_path/proto:test_proto_lib",
    ],
)

@GabrielDracula
Copy link

GabrielDracula commented Apr 17, 2024

Well, I add ":cc_test_proto_message_proto" to deps of cc_import and it works. There is no redundant dynamic library built, and proto instance is unified.
Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) team-Rules-CPP Issues for C++ rules type: feature request
Projects
None yet
Development

No branches or pull requests