Convert "plan build" code over to Rust #6258

Open
christophermaier opened this issue Mar 4, 2019 · 14 comments
Labels
Focus:Architecture Focus:Supervisor Related to the Habitat Supervisor (core/hab-sup) component

Comments

@christophermaier
Contributor

christophermaier commented Mar 4, 2019

This is work that's been on our long wish list for a while, but the advent of #6257 has crystallized the fact that it's time to revisit the decision to implement hab-plan-build in Bash (and PowerShell on Windows, though the linked issue pertains specifically to Bash).

The initial decision to implement hab plan build in Bash was a pragmatic one. It's easy to get started, it provides us a plan.sh syntax that's familiar and easy to write (and gives us parsing for free), it has low runtime and binary size overhead, etc. However, it is also an incredibly subtle language full of edge cases, many of which are non-obvious. The excellent shellcheck has helped us greatly in improving our code (honestly, use of that tool should be mandatory for all shell scripts everywhere). However, having excellent linting cannot magically give you safe code. While Bash is easy to get started with, becoming well-versed in the subtleties of its more advanced features (many of which we use) is not easy. Additionally, it does not have many of the most basic features that one needs to build non-trivial applications (the fact that we can't return data from a function, have to rely on global data structures everywhere, and have to resort to esoteric gymnastics to use something approximating a map data structure are just three red flags.)

The irony has not escaped us that the Habitat project is using one of the safest programming languages available (Rust) and one of the most dangerous programming languages (Bash) at the same time, each for key components of the Habitat ecosystem.

Aside from the unsafe nature of Bash, there is also the fact that any change to the plan build logic must necessarily be re-implemented in PowerShell for Windows builds. In addition to being an unfortunate duplication of effort, we also have a far shallower bench of PowerShell expertise than we do for Bash.

Given that we now have considerable expertise in Rust, and that we have already implemented in Rust a number of things that hab-plan-build does (e.g. reading and taking action on package metadata files), it is time to give serious thought to rewriting hab-plan-build entirely in Rust.

This would yield a number of benefits, some of which have already been mentioned, but bear repeating:

  • Increased safety
  • Better testability
  • Greater familiarity among current team members
  • No duplication of effort (single codebase)

We're actually in a good position for a rewrite, and much of it can be transparent to outside users. One way we could proceed would be to wrap library calls in Rust in bespoke binaries to begin replacing the Bash (and PowerShell) functions we currently have. Our existing Bash infrastructure would simply call Rust binaries rather than their shell script counterparts. In this way, we could easily offload the most complicated operations into Rust, giving us quick safety and stability returns. This would also allow us to centralize more and more of our logic, reducing the current Bash and PowerShell duplication.
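
As a rough illustration of the "bespoke binary" idea, here is a minimal sketch of the logic such a helper might contain. Everything named here is an assumption made for the example: the `plan-helper` binary, its `ident-origin` / `ident-name` subcommands, and the `parse_ident` and `run` functions are hypothetical and are not part of Habitat. The point is only that the parsing lives in Rust, and Bash captures the result from stdout.

```rust
// Hypothetical sketch: the core of a small helper binary that an existing
// Bash build script could call, for example:
//     name=$(plan-helper ident-name "core/glibc/2.27")
// The binary name, subcommands, and function names are all assumptions.

/// Splits an `origin/name[/version[/release]]` identifier into its parts.
fn parse_ident(ident: &str) -> Result<Vec<&str>, String> {
    let parts: Vec<&str> = ident.split('/').collect();
    if parts.len() < 2 || parts.len() > 4 || parts.iter().any(|p| p.is_empty()) {
        return Err(format!("invalid package identifier: {:?}", ident));
    }
    Ok(parts)
}

/// Dispatches one subcommand and returns the text to print on stdout.
/// A real `main` would pass the process arguments here, print the `Ok`
/// value, and exit non-zero on `Err` so the calling script can react.
fn run(args: &[&str]) -> Result<String, String> {
    match args {
        ["ident-origin", ident] => Ok(parse_ident(ident)?[0].to_string()),
        ["ident-name", ident] => Ok(parse_ident(ident)?[1].to_string()),
        _ => Err("usage: plan-helper <ident-origin|ident-name> <ident>".to_string()),
    }
}
```

Keeping the dispatch in a plain function that returns a `Result` means the same logic can be unit tested directly, without spawning a process, which is part of the testability argument above.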

This would likely be a short-term situation, though we could go quite a long way with this approach.

In the long term, we will also have to contend with the fact that Habitat plans themselves are written in Bash / PowerShell, and can include non-trivial callback functions written by end-users. I think our experience at Chef has shown that having the power of a programming language for such situations is a Good Thing, as opposed to using a data-only approach (e.g., creating a plan.yml file or something similar). Only a sadist would force users to write their plans in Rust, though! 😄 One thought has been to look at something like Lua as a plan language. Lua is commonly used for these kinds of pluggable scripting tasks, is multi-platform, and (importantly for this discussion), has Rust bindings.

As terrifying as it sounds, something like shellfn might also be useful in transitioning from Bash / PowerShell to Rust.

(There may be other viable approaches, as well; using Lua immediately springs to mind, but that shouldn't be the end of the discussion.)

As stated earlier, I think we can greatly improve the safety of plan building code by re-implementing the most complex bits in Rust, and this does not have to take place at the same time as implementing a new plan language. Both things are important to consider in the scope of a rewrite, however, and both will be necessary if there is ever a day when the plan building process is completely implemented in Rust. This is also a change that would greatly impact existing users of Habitat; we would need to devote considerable thought to migration strategies, and would realistically need to straddle both worlds for some period of time. Providing users an easy way to opt into the new plan syntax would be a great help (this could also probably be done automatically; after all, if you've got a plan.lua file in place, that's probably a pretty strong signal that you'd want the new version of the code 😄)
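
To make the "opt in automatically" idea concrete, here is a small sketch of how a build tool might pick a plan flavor from the files present in a plan directory. The `PlanKind` enum, the `detect_plan_kind` function, and the preference order (a `plan.lua` winning over an existing shell plan) are assumptions for illustration only; `plan.lua` itself is just the hypothetical mentioned above.

```rust
/// The plan flavors a build tool might recognize.
/// `Lua` is purely hypothetical; no such plan format exists today.
#[derive(Debug, PartialEq)]
enum PlanKind {
    Bash,
    PowerShell,
    Lua,
}

/// Chooses a plan flavor from the file names found in a plan directory.
/// Assumption: the presence of `plan.lua` is treated as an explicit opt-in
/// and therefore takes precedence over the legacy shell plans.
fn detect_plan_kind(file_names: &[&str]) -> Option<PlanKind> {
    if file_names.contains(&"plan.lua") {
        Some(PlanKind::Lua)
    } else if file_names.contains(&"plan.sh") {
        Some(PlanKind::Bash)
    } else if file_names.contains(&"plan.ps1") {
        Some(PlanKind::PowerShell)
    } else {
        None
    }
}
```

A real implementation would also need to account for the build platform when both `plan.sh` and `plan.ps1` are present; that is left out of this sketch.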

(It should go without saying that as more functionality is implemented in Rust, we should take the time to really strengthen the testing --- unit, integration, and functional --- around this core component of our platform.)

UPDATE: As @baumanj pointed out in chat, another issue not explicitly called out here is the brittleness of the Bash code. Making changes is far more difficult and fraught than it needs to be. Moving to Rust would be a big step in the right direction for reducing the brittleness of the code.

@baumanj
Contributor

baumanj commented Mar 4, 2019

Rather than force a particular language like bash or lua on plan authors, what about taking the same approach to callbacks we do for lifecycle hooks? Implement them as standalone programs which can be implemented with the authors’ language of choice. The remaining parts of the plan would be pure data and amenable to a format like TOML.

@christophermaier
Contributor Author

christophermaier commented Mar 4, 2019

@baumanj definitely worth digging into... the separation of static from active data is a nice aspect of that.

@baumanj
Contributor

baumanj commented Mar 4, 2019

Another issue I don’t see mentioned here is performance. It’s not clear that moving to Rust would be a big performance win for plan build, but since Rust gives us the potential for higher performance, it may be worth seeing whether we could achieve the kind of gains that would make it practical to use Habitat itself for local Habitat development.

@irvingpop

Please don't get rid of bash/powershell plans, this could majorly set back adoption of Habitat. I would like to argue that a hybrid approach makes more sense for our users.

Rationale:

Bash and PowerShell are accessible to plan writers new and experienced. That's a huge part of what makes Habitat easy to get started with: it puts experienced ops/sysadmins and developers who would say "just let me write a Dockerfile" on an even footing and then shows them an excellent way forward.

We cannot discount this fact, particularly in light of our experience teaching people Chef and hearing that "Chef is hard" largely because of its Ruby-based DSL but also the Ruby ecosystem that comes with it.

It's not just the "getting started" experience: more advanced plan writers also really appreciate how easy it is to read and understand hab-plan-build.sh/ps1 and to override functions as needed. There are countless plans inside and outside of core-plans that depend on this behavior.

Proposal:

I agree that there are some really brittle and difficult parts of hab-plan-build, the dep solving functions are a great example of that. What if we replaced the guts of many of these functions with single-purpose binaries (compiled from Rust)?

For example, take _resolve_dependencies() and all of its dependent functions: what if we replaced the guts of those with calls to Habitat-provided binaries? We could then set all of these arrays from the output of one or more compiled binaries.

# * `$pkg_build_deps_resolved`: A package-path array of all direct build
#    dependencies, declared in `$pkg_build_deps`.
# * `$pkg_build_tdeps_resolved`: A package-path array of all direct build
#    dependencies and the run dependencies for each direct build dependency.
# * `$pkg_deps_resolved`: A package-path array of all direct run dependencies,
#    declared in `$pkg_deps`.
# * `$pkg_tdeps_resolved`:  A package-path array of all direct run dependencies
#    and the run dependencies for each direct run dependency.
# * `$pkg_all_deps_resolved`: A package-path array of all direct build and
#    run dependencies, declared in `$pkg_build_deps` and `$pkg_deps`.
# * `$pkg_all_tdeps_resolved`: An ordered package-path array of all direct
#    run and build dependencies, and the run dependencies for each direct
#    dependency. Further details in the `_populate_dependency_arrays()`
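
For a sense of what one of those binaries could compute, here is a minimal sketch of building a "tdeps"-style list as described in the comments above: the direct dependencies plus the run dependencies of each direct dependency. The function name `resolve_tdeps`, the in-memory `run_deps` map, and the first-seen ordering with duplicates removed are assumptions for the example; the real script reads this information from package metadata and may order results differently.

```rust
use std::collections::{HashMap, HashSet};

/// Returns each direct dependency followed by its run dependencies,
/// with duplicates removed and first-seen order preserved.
/// Assumption: `run_deps` already holds the run dependencies recorded for
/// each package, so no recursive walk is done here.
fn resolve_tdeps<'a>(
    direct: &[&'a str],
    run_deps: &HashMap<&'a str, Vec<&'a str>>,
) -> Vec<&'a str> {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut resolved = Vec::new();
    for &dep in direct {
        // `insert` returns false when the package was already recorded.
        if seen.insert(dep) {
            resolved.push(dep);
        }
        if let Some(deps) = run_deps.get(dep) {
            for &pkg in deps {
                if seen.insert(pkg) {
                    resolved.push(pkg);
                }
            }
        }
    }
    resolved
}
```

Unlike the Bash version, the result is an ordinary return value rather than a global array, and the de-duplication uses a real set type.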

@christophermaier
Contributor Author

@irvingpop That's good feedback regarding the plan language, thanks!

This issue is more of an umbrella issue to capture discussion and, ultimately, concrete tasks around moving to a less brittle and more maintainable plan build process. The first stage of that will involve refactorings that will be transparent to end users, and will in all likelihood proceed as you have described, offloading complex tasks to custom Rust binaries that can be called from the existing Bash script framework.

In the course of this, it is also worth thinking about how further changes to the plan build process may or may not affect how plans are written, which is why I called it out. It could be Lua plans, it could be a hybrid approach as advocated by @baumanj in this comment, or it could be something else entirely. Your feedback here will be an important piece of input as we work through this, though.

@irvingpop

irvingpop commented Mar 4, 2019

@christophermaier @baumanj thank you both so much for all the things you do to make Habitat awesome.

sending hugs

@mwrock
Contributor

mwrock commented Mar 4, 2019

Just to add another down vote on breaking away from bash and PowerShell: please include me as passionately against that. However, I TOTALLY AGREE with the sentiment of this issue. My feeling is in line with @irvingpop's, in that we could move a ton of code into a hab build CLI that can produce and consume JSON, and move to very minimal PS/bash scripts.

@baumanj
Contributor

baumanj commented Mar 4, 2019

@mwrock and @irvingpop: would your concerns be addressed so long as plan authors still have the option to write plan callback logic in bash/pwsh?

For the stuff that's strictly data, is it important to have that in shell as well, or can things like pkg_name and pkg_deps just as easily be TOML (or YAML or some other non-executable data format)?

@mwrock
Contributor

mwrock commented Mar 4, 2019

yes @baumanj I think that could work as long as both user supplied data and some hab generated data (like PKG_PREFIX) are exposed/accessible to the shell script.
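
One way this could look from the Rust side, sketched under assumptions: the build tool prepares the callback as a child process and passes both user-supplied and generated values through the environment. The function name `callback_command` and the choice of environment variables as the transport are hypothetical; only `pkg_name` and `PKG_PREFIX` come from the discussion above.

```rust
use std::process::Command;

/// Builds (but does not run) the command for a plan callback script,
/// exposing plan data to it through environment variables.
/// Assumption: environment variables are the chosen way to share data.
fn callback_command(program: &str, pkg_name: &str, pkg_prefix: &str) -> Command {
    let mut cmd = Command::new(program);
    cmd.env("pkg_name", pkg_name) // user-supplied plan data
        .env("PKG_PREFIX", pkg_prefix); // value generated by the build tool
    cmd
}
```

A caller would then invoke `.status()` on the returned `Command` and treat a non-zero exit code as a failed callback, much as lifecycle hooks are handled.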

@smacfarlane
Contributor

I'm in agreement with @irvingpop and @mwrock. I think moving away from bash/powershell for the language would be a huge setback for plan authors. I am in 100% agreement, though, that moving shared functionality (like the dep solving) into a Rust binary so it can leverage our libraries and share implementation between platforms is a good move.

@baumanj I've thought about proposing moving metadata to TOML (for consistency with the rest of our kit) in the past, but IMO enough of the metadata needs to be mutable inside of the plan.sh/plan.ps1 that you'd only be able to move a subset of the metadata out. In addition, any gains from moving it aren't worth what, to me, would be a worse user experience when authoring and reading the plan.sh/plan.ps1.

@raskchanky
Contributor

I wonder if there would be a need to force people to choose between plan languages.

Maybe it would be possible to support bash and powershell for people that want to stay with what they know, and also support something cross platform (e.g. Lua) for those that don't mind learning something new, know they will need to support multiple platforms and aren't interested in maintaining multiple plans, or want the greater safety guarantees that can be offered by something that can be embedded in Rust.

@baumanj
Contributor

baumanj commented Mar 8, 2019

In thinking about how much code (as opposed to data) exists in plans, I did a quick analysis of the 626 plan.sh files in the core-plans repo, and here's what I found:

[Screenshot: results of the plan.sh callback analysis]

Outside of do_build, do_install, do_prepare and do_check, the callback functions are fairly rarely used. And when you rule out implementations that just return 0 to disable the default functionality (there are probably more, as this can be written many ways, but my search was simple), there are even fewer. If a data-based approach allowed the disabling of specified callbacks, and we put a bit of effort into sussing out some common patterns that plans could opt into at the data level, the data/code separation approach starts to look a bit more reasonable.
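
A minimal sketch of what "disabling callbacks at the data level" might look like once the plan data has been loaded. The `Callback` and `Action` types and the `action_for` function are hypothetical names for this example, and how the overrides would be written in a plan file (TOML or otherwise) is deliberately left open.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// What a plan may declare for a given callback (hypothetical model).
#[derive(Debug, Clone, PartialEq)]
enum Callback {
    /// Replaces the "just return 0" idiom for switching a phase off.
    Disabled,
    /// A standalone program, written in any language, that runs the phase.
    Program(PathBuf),
}

/// What the build tool should do for a phase.
#[derive(Debug, PartialEq)]
enum Action {
    RunDefault,
    Skip,
    Exec(PathBuf),
}

/// Decides how to handle a phase such as `do_check`.
/// Phases the plan does not mention fall back to the default behavior.
fn action_for(overrides: &HashMap<String, Callback>, phase: &str) -> Action {
    match overrides.get(phase) {
        None => Action::RunDefault,
        Some(Callback::Disabled) => Action::Skip,
        Some(Callback::Program(path)) => Action::Exec(path.clone()),
    }
}
```

With this shape, a plan that only disables `do_check` needs no code at all, which matches the observation that most callbacks are rarely overridden.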

@stale

stale bot commented Apr 2, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. We value your input and contribution. Please leave a comment if this issue still affects you.

@stale stale bot added the Stale label Apr 2, 2020
@christophermaier christophermaier added Platform: Linux Deals with Linux-specific behavior and removed I-linux labels Jul 24, 2020
@stale stale bot removed the Stale label Jul 24, 2020
@christophermaier christophermaier added Platform: Windows Deals with Windows-specific behavior Type: Feature Issues that describe a new desired feature and removed I-windows labels Jul 24, 2020
@rahulgoel1 rahulgoel1 added Focus:Architecture Focus:Supervisor Focus:Supervisor Related to the Habitat Supervisor (core/hab-sup) component and removed Focus :Plan Build Type:Technical Debt No functional changes; just about cleaning up and reorganizing Epic L-rust Platform: Linux Deals with Linux-specific behavior Platform: Windows Deals with Windows-specific behavior Type:Additional Discussion Type:Stability Type: Feature Issues that describe a new desired feature labels Jul 23, 2021
@stale

stale bot commented Oct 16, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. We value your input and contribution. Please leave a comment if this issue still affects you.
