Add support for Bazel build system #3129

Closed
jwnimmer-tri opened this Issue Aug 11, 2016 · 38 comments

Collaborator

jwnimmer-tri commented Aug 11, 2016

The Bazel build system is a relatively new tool that provides correct, reproducible, fast builds. It is the open-source version of Google's own build tool.

I believe that Bazel would solve many of TRI's challenges in using and supporting Drake, and thus we should experiment with adding Bazel support into Drake. For starters, it should be enough to merely add Bazel support for the newer C++ core of Drake (common, math, systems_framework).

This would start as a workstation-only test, then graduate to a post-merge build only, and not in the officially supported set. If and when we demonstrate that it is reliable and useful for developers, we can consider making Bazel support official.

It would be nice to start on this soon, in order to help prepare and test the upcoming Bazel 0.5 official support for Windows in a few months, so that we can help drive it to meet our needs.

david-german-tri Aug 11, 2016

Contributor

I'm in.

jamiesnape Aug 12, 2016

Contributor
  1. CI system would need to be completely refactored.
  2. Multiple build systems are a bad idea.
  3. Adding features to CMake that you may need in future is easy, for Bazel?

Most of your problems are due to legacy code and the unusual PODs conventions therein. If you rewrote the build system from scratch in any language it would be an improvement.

mwoehlke-kitware Aug 16, 2016

Contributor

Some critical features that I'm not seeing offhand in Bazel:

  • Ability to locate and use external libraries. ("Use" seems to be there, at least partly, but "locate" is missing.)
  • Ability to install things.
  • Any Windows support at all. (It looks like Bazel may assume a GCC-like compiler.)

To fully take advantage of it, we'd probably also have to port all of our dependencies.

jwnimmer-tri Aug 16, 2016

Collaborator

> Multiple build systems are a bad idea.

This is a compelling argument against attempting a port. Supporting two systems in parallel during the transition will be somewhat painful. We can perhaps localize the pain to only the bazel-porters (i.e., have a separate CI that is non-authoritative, and nobody but the porters care about).

Having a good plan of action here would be an important part of this pursuit (and I don't have one yet).

> CI system would need to be completely refactored.

I'm not too worried about this. It will be effort and cost, but not risk.

> Adding features to CMake that you may need in future is easy, for Bazel?

As with SCons, Bazel lets you escape into (limited) Python if you really want to, and is open-source should you need to modify the actual core.

> Most of your problems are due to legacy code and the unusual PODs conventions therein. If you rewrote the build system from scratch in any language it would be an improvement.

I agree that's a big part of the problem, and that nuking the current build system code with a rewrite is a big win. Still, if we stick with CMake, the lack of reproducibility and caching for builds and tests is a big hole in CMake that Bazel and SCons both fill.

> [Lacks the] Ability to locate and use external libraries. ("Use" seems to be there, at least partly, but "locate" is missing.)

Bazel often tends towards "build your libraries from source" (as we do with drake externals), which actually helps reproducibility, but you can also get required libraries from the system without much trouble.

And for "locate" (searching), I actually think that's a misfeature. There should be one way to build Drake-the-entirety on a given platform, which means that the dependency will be in one well-known, hard-coded place. Rooting around in system paths, the user's homedir, and finding something that "smells like" the right version of a library or tool only leads to confusing bug reports and wasted time. Or if the developer really wants a different path to some library, they can update their Bazel file to explicitly reference it.

> [Lacks the] Ability to install things.

Hmm? Are you saying we'd have to roll our own "zip up these files into a release" logic, but in the CMake case that's already built-in? That's fair, but doesn't seem like a huge difference, compared to the effort of ongoing build system upkeep and developer downtime.

> [Lacks] Any Windows support at all. (It looks like Bazel may assume a GCC-like compiler.)

This is in progress (http://bazel.io/roadmap.html), scheduled for a couple months out. Probably slips a bit more, but still relatively near-term compared to the scope of this ticket.

> To fully take advantage of it, we'd probably also have to port all of our dependencies.

Yes, this is worth a bit of effort-assessment. Dependencies that are just a pile of C++ code are easy to port (I've done it). Dependencies that have bespoke -DTHIS_AND_THAT autoconf-like logic are harder, but there are ways to cope.

jwnimmer-tri Oct 14, 2016

Collaborator

FYI, an active development prototype is ongoing at https://github.com/jwnimmer-tri/drake/commits/bazelspike. It is able to build the automotive slice of the code.

The current "off the top of my head" plan is something like "allow BUILD files to be PR'd in-tree without CI yet" as we bring this up. This lets other bazel workspaces use Drake-as-a-library, without Drake-as-a-project needing to adopt bazel for its demos, tests, etc. Probably the next step after that is to get some kind of CI for Drake-as-a-library using bazel, but still rely on CMake for tests, demos, matlab, etc. I plan to iterate with David's review on a real plan offline, and then post here for comments.

jwnimmer-tri Oct 20, 2016

Collaborator

Next-up proposal... for C++ libraries that bazel knows about, teach CMake to obtain the list of sources from the BUILD file, instead of repeating the list in two places.

WIP at https://github.com/jwnimmer-tri/drake/tree/bazel-reuse2
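
For readers who want a feel for what "obtain the list of sources from the BUILD file" could involve, here is a minimal sketch. It is not the code on the bazel-reuse2 branch. It assumes the BUILD file only uses literal string lists for `srcs`, and it leans on the fact that simple BUILD syntax also parses as Python. The file contents, target names, and helper names (`extract_sources`, `to_cmake_list`) are made up for illustration.

```python
import ast

# Hypothetical BUILD file content; target and file names are illustrative only.
BUILD_TEXT = '''
cc_library(
    name = "common",
    srcs = ["drake_assert.cc", "text_logging.cc"],
    hdrs = ["drake_assert.h", "text_logging.h"],
)

cc_library(
    name = "math",
    srcs = ["gradient.cc"],
    hdrs = ["gradient.h"],
    deps = [":common"],
)
'''


def extract_sources(build_text):
    """Map each cc_library name to its literal srcs list."""
    result = {}
    tree = ast.parse(build_text)
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        if not isinstance(node.func, ast.Name) or node.func.id != "cc_library":
            continue
        kwargs = {kw.arg: kw.value for kw in node.keywords}
        if "name" not in kwargs or "srcs" not in kwargs:
            continue
        try:
            name = ast.literal_eval(kwargs["name"])
            srcs = ast.literal_eval(kwargs["srcs"])
        except ValueError:
            # Non-literal values such as glob(...) would need Bazel itself to resolve.
            continue
        result[name] = list(srcs)
    return result


def to_cmake_list(srcs):
    """CMake represents lists as semicolon-separated strings."""
    return ";".join(srcs)


if __name__ == "__main__":
    for name, srcs in sorted(extract_sources(BUILD_TEXT).items()):
        print(f"{name}={to_cmake_list(srcs)}")
```

A CMake listfile could presumably capture this output with `execute_process` at configure time. The sketch does nothing to tell CMake when the BUILD file has changed, so it does not by itself settle how re-configures would be triggered.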

jamiesnape Oct 20, 2016

Contributor

Not sure that is going to work. I am trying to work out if you would get extra cmake re-configures when file lists have not changed or not enough re-configures when they do change.

This is all kind of reversed when the purpose of CMake is to generate the build files for other systems. There is no bazel support now, but there could be.

jwnimmer-tri Oct 20, 2016

Collaborator

When we discussed Bazel vs CMake a few months ago during the visit, the Kitware consensus in the room at that time was that Bazel is higher-level than CMake, and the one-feeds-the-other direction should go as I've done here.

In any case, what is the best solution in CMake for a list of library sources to be dynamically computed? I had done 6a1a042 originally -- which asks a python program to emit a list of sources -- but the approach in the reuse2 branch seemed like it would scale to generating even more of the listfile content directly from the BUILD.

jwnimmer-tri Oct 20, 2016

Collaborator

The other option I've considered is to fully omit the listfiles for core Drake (common, math, systems, solvers, etc.) and just have CMake delegate to bazel to build the core. That is probably better long term, but I was hoping to get there incrementally.

jamiesnape Oct 20, 2016

Contributor

That would certainly be cleanest.

jamiesnape Oct 20, 2016

Contributor

Then CMake would provide an external interface for CMake consumers, build CMake externals, and other legacy code. We have integrated with external tools like Ant and Gradle for Java before, so this may not be that different.

jamiesnape Oct 21, 2016

Contributor

(Might be an idea to create a ticket for the CMake side and assign it to Kitware since we have done similar many times before.)

jwnimmer-tri Oct 21, 2016

Collaborator

Yeah. I have to give more thought and discussion to the optimal path forward. For now perhaps I'll just keep the two spellings of the list-of-sources duplicated.

jamiesnape Oct 21, 2016

Contributor

Sure. My general opinion is that whatever the merits of Bazel or CMake, having two build systems duplicating each other or working in unusual ways to accommodate each other is going to cause a lot of extra maintenance at best, confusion or bugs at the worst. I also think there is some scope for upstreaming some useful Bazel support to CMake which might make things easier.

jamiesnape Jun 27, 2017

Contributor

CMake Removal TODO

drake-superbuild:

  • cmake (refactor) - #6443 (hooks, packaging, tools)

externals:

drake:

  • automotive - #6445
  • bindings
  • common
  • doc (refactor)
  • examples - #6441 (except Atlas, kuka_iiwa_arm)
  • geometry - #6475
  • lcm
  • lcmtypes
  • manipulation - #6469
  • math
  • matlab (refactor)
  • multibody
  • systems
  • thirdParty
  • util

jamiesnape Jun 28, 2017

Contributor

Three proposals regarding CI for PRs:

  1. Switch LSan builds from CMake to Bazel.
  2. Switch off standalone cpplint builds (or replace with a Bazel one, but lint runs on all Bazel builds currently, anyway).
  3. Switch on at least one "everything" Bazel build.

mwoehlke-kitware Jun 28, 2017

Contributor

I was thinking that a Bazel lint-only build might be useful if we could also switch off the lint tests on the other Bazel CI builds. This would make it easier to tell lint failures from "real" failures.

...Just a thought; feel free to hate it 😄.

jamiesnape Jun 28, 2017

Contributor

At the moment switching off lint gets ugly because --test_tag_filters do not accumulate intelligently.
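
To illustrate the complaint, here is a small Python model of the behavior as I read this comment: when `--test_tag_filters` is given more than once, the last value replaces the earlier ones instead of being merged with them. This is a sketch of that assumption, not Bazel's option parser. The `gurobi` tag and both helper names are hypothetical.

```python
def effective_tag_filters(argv):
    """Model the assumed behavior: the last occurrence replaces earlier ones."""
    value = ""
    for arg in argv:
        if arg.startswith("--test_tag_filters="):
            value = arg.split("=", 1)[1]
    return [tag for tag in value.split(",") if tag]


def merged_tag_filters(argv):
    """Hypothetical accumulating behavior: union of every occurrence, in order."""
    merged = []
    for arg in argv:
        if arg.startswith("--test_tag_filters="):
            for tag in arg.split("=", 1)[1].split(","):
                if tag and tag not in merged:
                    merged.append(tag)
    return merged


# Suppose a shared rc file already excludes a (made-up) "gurobi" tag,
# and a CI job appends another flag to exclude "lint" as well.
argv = ["--test_tag_filters=-gurobi", "--test_tag_filters=-lint"]

print(effective_tag_filters(argv))  # ['-lint']
print(merged_tag_filters(argv))     # ['-gurobi', '-lint']
```

Under that assumption, a CI job that wants to drop lint has to restate every filter that was already in effect, which is the "ugly" part.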

jwnimmer-tri Jun 28, 2017

Collaborator

To me, it's important that CI builds match user builds, so we shouldn't add or remove tests in CI. The build variants should merely cover the matrix of supported platforms and documented command-line flags (such as --compiler=).

jamiesnape Jun 28, 2017

Contributor

Any opinion on the three proposals, and do you have a preference for the particular "everything" build? We can add others once CMake is turned down.

jwnimmer-tri Jun 28, 2017

Collaborator

> Switch LSan builds from CMake to Bazel.

Sure.

> Switch off standalone cpplint builds

Sure.

> Switch on at least one "everything" Bazel build.

Anything Xenial & Everything would be fine by me.

jwnimmer-tri Jan 31, 2018

Collaborator

The checklists upthread probably need a refresh at some point.

jamiesnape Jan 31, 2018

Contributor

We could probably split off the remaining Jenkins TODOs into separate issues and close this, I think. Other than those, I just see hermetic Gurobi remaining, and that does not seem worth keeping the issue open for now (or necessarily even bothering about).

jwnimmer-tri Jan 31, 2018

Collaborator

I agree we don't need an issue tracking Gurobi stuff. If it's a problem in some context, we can fix it if / when needed.

@jwnimmer-tri jwnimmer-tri assigned jamiesnape and unassigned stonier Apr 23, 2018

@jwnimmer-tri jwnimmer-tri removed the team: cars label Jun 9, 2018

RussTedrake Sep 23, 2018

Contributor

@jamiesnape, @jwnimmer-tri -- i think it's safe to close this? ;-)
