New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Consolidate binaries #7471
Comments
That perhaps makes sense for the images that contain multiple large binaries. But could you still make it such that small client-side binaries remain small, and don't contain all of the unnecessary server-side code? e.g. -rwxr-xr-x 1 vitess vitess 16M Jan 26 16:29 vtctlclient
-rwxr-xr-x 1 vitess vitess 32M Jan 26 16:29 vtexplain |
I'm not sure what $ docker run vitess/lite:v9.0.0 ls -lhSr /vt/bin/
total 416M
-rwxr-xr-x 1 vitess vitess 16M Jan 26 13:53 vtctlclient
-rwxr-xr-x 1 vitess vitess 17M Jan 26 13:53 vtctldclient
-rwxr-xr-x 1 vitess vitess 25M Jan 26 13:53 mysqlctld
-rwxr-xr-x 1 vitess vitess 41M Jan 26 13:53 vtbackup
-rwxr-xr-x 1 vitess vitess 54M Jan 26 13:53 vtgate
-rwxr-xr-x 1 vitess vitess 55M Jan 26 13:53 vtorc
-rwxr-xr-x 1 vitess vitess 58M Jan 26 13:53 vtworker
-rwxr-xr-x 1 vitess vitess 74M Jan 26 13:53 vtctld
-rwxr-xr-x 1 vitess vitess 80M Jan 26 13:53 vttablet Other small programs, from base: $ docker run vitess/base:v9.0.0 ls -lhSr /vt/bin/
-rwxr-xr-x 1 vitess vitess 3.8M Jan 26 16:29 vttlstest
-rwxr-xr-x 1 vitess vitess 6.2M Jan 26 16:26 query_analyzer
-rwxr-xr-x 1 vitess vitess 6.5M Jan 26 16:29 codegen
-rwxr-xr-x 1 vitess vitess 16M Jan 26 16:26 automation_client
-rwxr-xr-x 1 vitess vitess 16M Jan 26 16:29 vtworkerclient
-rwxr-xr-x 1 vitess vitess 16M Jan 26 16:29 vtctlclient
-rwxr-xr-x 1 vitess vitess 17M Jan 26 16:29 vtctldclient
-rwxr-xr-x 1 vitess vitess 18M Jan 26 16:26 automation_server
-rwxr-xr-x 1 vitess vitess 18M Jan 26 16:28 vtclient
-rwxr-xr-x 1 vitess vitess 19M Jan 26 16:28 vtbench
-rwxr-xr-x 1 vitess vitess 19M Jan 26 16:28 vtaclcheck
-rwxr-xr-x 1 vitess vitess 20M Jan 26 16:29 vttestserver
-rwxr-xr-x 1 vitess vitess 21M Jan 26 16:28 vtadmin
-rwxr-xr-x 1 vitess vitess 24M Jan 26 16:29 vtgateclienttest
-rwxr-xr-x 1 vitess vitess 32M Jan 26 16:29 vtexplain |
I think it's reasonable to have a smaller client, and is a reasonable expectation from mysql users. I'd leave it at |
I really like this proposal. It will make it much easier for new users to get started with Vitess. |
I'm a fan of this idea. We circulated a similar idea internally a few months ago. I'll add some writeup early next week with preliminary thoughts we had. Somewhat related, we can also work on reducing the binary size. Some initial look I took back when: #6578. I've had mixed result with |
I'd like to clarify that I don't have any bandwidth to tackle this, I've just had this thought for a while and wanted to start a discussion. This may be a decent project to have an intern start that would help them gain an understanding for the breadth of Vitess.
That's going to take a little work to combine, but once the legwork is done, it'd be much easier for other maintainers to make suggestions on commands that should stay/go, flags, etc. |
One of our thoughts was that we could create the single binary, e.g. With this in mind, we can just do a big Just a thought. |
Sorry I'm late to this party, but I'm not convinced this actually addresses the stated goal. Let's say I want to run a vtadmin container. I need the core Vitess libraries ( If we adopt this RFC, then in order to run vtadmin (this is true of any individual vitess component, I'm just picking my personal example), I have to pull in all that extra code, which I don't need and will never call, in order to only ever execute a subset of it, namely the If we take @shlomi-noach's approach, and make the I think both of these things make writing vitess binaries (or, the singular binary, as the case may be) easier, but I don't think they make them friendlier to use for new users. My hypothesis is that the underlying issue is there is no clear separation between CLI code and library code. Essentially we're missing a boundary between a functional core and an imperative shell. Core libraries like I think if we do a lengthy refactor to draw a clear boundary between flags and the libraries and disentangle these two fundamentally different concerns, we'll end up with cleaner, well-scoped, and consequently, more user-friendly binaries. And they might end up being smaller, too. |
That's the opposite of what Shlomi is suggesting. He is saying that you can alias
Without actually looking at what your dependencies are, there's a decent chance you're already pulling in a lot of vtctld / vtgate anyways, given how connected all our imports end up being. Even if it doesn't, the vtadmin binary alone is probably still going to be ~30 MB.
This is targeted mostly for users and not for writing binaries. We currently have 20+ binaries, many of which are almost exact duplicates of each other (vttestserver / vtcombo) because core Vitess maintainers were unaware that the other binaries existed. I have personally never seen or heard of probably half of them until this post, despite running Vitess in production and being a maintainer for a few years. IMO, it's a lot easier for a new user to run The core tradeoff of this proposal is that a single larger binary is easier to discover and operate than many binaries. Yes, you may technically know that it has 50 MB of "unused" code if you only intend it to run vtadmin, but that is the least important metric IMO. It's a one-time download cost for users, and if you are running containers in k8s, having a single container referenced means that you're much more likely to have the container cached locally, likely speeding up your container start time vs the extra bandwidth slowing it down. As I'm typing this out, it should be very easy to add build flags for each subcommand, so if you really wanted to be specific, you could run
I'm in total agreement, which is why I proposed moving to cobra and auditing flags as the long-term solution, like you already started with vtadmin.
That's a great idea, but just watching your many many PRs roll in as you're doing some of that refactoring to make vtadmin work underscores just how big of a project that would be. I'm not sure that the project is likely to have that bandwidth in the near future. At the end of the day, it's less about the size of the binaries than the sheer quantity of them. |
tl;dr: I agree with @ajm188 moving to a single binary doesn't feel like it would address many of the primary concerns that we seem to have.
I'm pretty sure we're all are aligned that the flag situation needs a bunch of love and even on how we address that (audit -> necessary refactoring to get away from The other question: should we build a uni-binary feels like a combination of several questions:
1 + 3 both seem like a documentation / paved road problem more than something a technical solution is going to address. Let's say that we go with the uni-binary. That simplifies the question of how to find out what binaries/commands are available (to a degree -- I believe all binaries should already be included in the tagged release assets) but nothing about that helps me understand how to run vitess. I still need to read through docs except the guidance is now Once I do that however I don't know what
And then an infinitely long list of flags that I don't understand and (some of which) aren't relevant / applicable bc of our flag situation. All told that leaves us in a "we need to fix our documentation and help text" territory which isn't a "how do we bundle" question. I don't have a lot of experience with Docker builds and will concede that it may make things easier to manage for the project with a single binary if y'all say it is.
Let's look at another project which is functionally a collection of an arbitrarily large / complex list of sub-commands: git! I'm not going to claim it's easy to learn/use but it is certainly thoughtful in its construction and this is their approach to handling a unified "entry point" + 8 billion actual commands (ref)
We could take an approach like that if we wanted to provide a cohesive story about "what can vitess do" and it's a significantly lighter (and safer, I believe the refactoring necessary to unify the binaries would be fairly painful/dangerous). An extremely simplistic approach to get us started might look something like:
Per earlier conversations around binary size I don't think reductions of this type are worth the any potential debugability trade off. |
In your opinion, what are the primary concerns? Is your main argument against unifying binaries the amount of work / possibly unsafe refactoring that would entail?
I appreciate the context of your comment, especially git as an example. I was definitely unaware that git shelled out to other commands. As a counterpoint, AWS, GCP, and Azure ship each a mostly unified CLI (I'm more familiar with GCP, which does have standalone binaries for BigQuery and GCS). Those have orders of magnitude more subcommands, flags, etc., but wherever you find a command on the internet, there's not a question of which binary needs to be invoked. |
My reading of the RFC is that these were the concerns we wanted to address with the single binary:
Nah, I deeply believe that we need to do 90% of the work of unifying just to get the flag situation under control + better separation of concerns between "component runtime" and "common library" as both you and @ajm188 call out above. My main argument is that as a solution it doesn't actually do much to address the ergonomics / discoverability around running Vitess which feels a lot more like a documentation/onboarding problem than a binary packaging problem. |
I may be off base here but I think most of of those PaaS binaries are primarily CLI tools that wrap interactions with an infrastructure API? The share a unity of purpose that isn't there between Also going from "how to run this command from stackoverflow" to "how do I get this binary in my PATH" feels like a fairly minimal part of the complexity of running vitess. I do think we can/should make a lot of improvements in packaging, distribution, and documentation to make understanding what binaries are available / what they do available. All that work is still necessary even if we have everything living as |
The main issue with the multiple binaries is that the build packages most of the code into every single binary anyway so they ballon in size. Here is the file size for a recent build of Vitess
This really isn't vitess's fault as go is designed to statically link the whole world in each binary. Right now the
Also from a usability standpoint being able to tell someone to download one 70Mb file and drop it in their path, and then they have the whole of Vitess is a much easier getting started path then having to install 300Mb of different binaries, and still not having all the dependencies to get started. Multiple users have compared our getting started to other database systems which deploy as a single binary and have a |
Ah, if the primary driver here is binary size then I don't have any better ideas than bundling them together 😀. @derekperkins Does it make sense to bake a formal suggestion that we adopt something similar to the factoring that is used in https://github.com/vitessio/vitess/blob/master/go/cmd/rulesctl/main.go as a pattern for what a path on the road to the uni-binary may look like? |
Just to follow up, the more I think about this the more I like something resembling that approach. It walks the line making both build styles viable and a nice route to migration into a single binary if that's the path that folks want. Specifically:
In this world folks who want to focus on a single binary containing all component entrypoints can There is also a story about how we maintain both states as we start digging into the long process of porting thing into a new approach to flags/binaries. |
I 100% agree, I love your approach you suggested in #7712 (comment), and baking a formal suggestion sounds the like the next step. I'm not sure what elevates a proposal to VEP status - @deepthi? Other questions that we need to answer in the proposal:
vitess
|
We haven't discussed that, and so far we have had VEPs only for Release cycle and upgrade order. Functional enhancements / new features have all been proposed as RFCs. |
My initial thought on this is to make an incremental step from implicit flag definitions to explicit, which comes at the cost of some increased verbosity. I think/hope that will be worth it though. Essentially, we would move all the
Then, every binary (or, subcommand of the uni-binary) would need to be aware of exactly which flags it requires, and explicitly declare them (in I can file an issue for this part of the approach in the enhancements repo and get started on a VEP for it. |
As long as we are in design mode, should we just plan on going full viper from the beginning? That feels good to me, especially since we have already started moving some config into yaml. Not sure if we could make that backwards compatible or not, though admittedly I haven't used it or seen the code for it, but it seems like it should be doable. |
viper isn't quite backwards compatible, which is why i structured that example in a very particular way. if you want to bind a config var to both a cli flag and a viper-backed config, you need to access it via viper, so it can resolve all the possible input sources and give you the final, concrete value. so, by moving all our flag vars first to package-private and accessing them via getters, allows to swap the implementation of those getters from func GetX() string { return x } to func GetX() string { return viper.GetString("some-flag") } if we want to use viper, but we also get the incremental improvement of well-scoped flags if we should decide halfway through that we don't want viper (or, if viper v2 comes out with a different API, for example). I think this is potentially orthogonal to the binary consolidation, though ,in that we should revisit how flags in vitess work regardless of the outcome of this RFC, so does it make more sense to keep the two discussions separate? |
I'm on board with your proposal and your hesitation on full viper integration makes sense. I'm in favor of keeping the binary consolidation / flag scoping as a single topic as they are tightly connected. My original hope was that we came out of this process with something that felt very intentionally designed vs the organic sprawl that we have now, and I think we're really close. |
As a newcomer I think this would be a game changer for adoption- CockroachDB and Citus' strengths are simplicity of deployment in comparison to Vitess. Having to coordinate all of these binaries setting up your cluster is high friction. The side benefits of not overlapping development/discovery of features also sounds really useful. Love the proposal. |
From a recent conversation in Slack
|
One Binary to Rule Them All
This came up in #7433 about consolidating Docker images. We're artificially inflating the amount of installation and operational complexity by having all the separate binaries, since the majority of each binary is all the same code and a ton of dependencies. If we had a single binary that "shelled" out to all the existing binaries: e.g.
vitess vtgate
,vitess vttablet
, etc, you'd have everything you needed in a single 100 MB binary instead of 10 80 MB files. It would take some orchestration, but could simplify things considerably, both for new users and for orchestration / cluster management.Installation
There's a lot of value on the Cockroach side of having the install be a single binary vs trying to navigate through the many components Vitess has, with varying availability in Docker.
Discoverability
There are often questions on Slack about how to check query compatibility, that inevitably lead to having the user download
vtexplain
, vs just runningvitess --help
and seeing that there is avitess explain
options. I've been using Vitess for 3-4 years now, and I just learned about 6-7 binaries from the list @jmoldow pasted below. In that same vein, @dkhenry ended up working onvtcombo
and giving it near feature parity withvttestserver
simply because he didn't know that the latter existed.Simpler build / release process + customization
There's an open question about how much external vendor support should exist inside Vitess core. The same thing is true about different
toposerver
implementations, as thek8s
topo adds significant weight to the binaries due to bringing inclient-go
dependencies. We don't do much with build tags today, probably because of our build complexities. Unifying everything should simplify the build both for Vitess itself and for users with their own release pipelines. We could allow more "plugins" into Vitess, but potentially only compile subsets of them by default. On the Docker side, there could be a handful of configurations we tag::all-plugins
,no-k8s-topo
, etc. If we wanted to get really fancy, we could work out a build system like Caddy that does dynamic builds with specific plugins. https://caddyserver.com/downloadSeparate client binary
As brought up in the comments, it's probably a good idea to have a smaller client available, similar to
mysql-server
vsmysql-client
packages.vitess
- main server binary with all code includedvitessclient
- client binary that is mainly a gRPC stub to the serverDeprecation period
We are currently rewriting the interface to
vtctld
to be a real gRPC api, so we already are going to have a need for deprecating the current communication layer. If we are going to do that, it would make sense to widen that deprecation for all the older individual binaries in favor of these two. Given our deprecation policies, that probably puts a target for this to land in 13.0 next January.Docker implications
Per #7433, we have a very unfocused Docker strategy with a lot of historical cruft. We would simultaneously deprecate most of the existing repos in favor of new
vitess
andvitessclient
repos, which will be far less confusing to new users than seeing names likelite
,base
,k8s
, andboostrap
. An operational benefit of using a single Docker image for the various components is caching if you are using k8s. When spinning up a new component, this increases the likelihood that the image has already been pulled to the node, reducing startup time and cost, while reducing cache pollution with all the individual images.Work to be done
Making this functional probably wouldn't take a huge amount of effort for a PoC. We could just copy the
main
files from each binary and load them into one. We already have a serious cross-binary flag contamination problem, so that wouldn't necessarily have to be fixed in the initial scope. Longer term, it probably makes sense to follow whatvtadmin
is doing with https://github.com/spf13/cobra and properly scope all the existing flags.cc @deepthi @enisoc
The text was updated successfully, but these errors were encountered: