Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFE: Split CLI, Agent, Proxy, and Server binaries #73

Open
cipherboy opened this issue Jan 27, 2024 · 10 comments
Open

RFE: Split CLI, Agent, Proxy, and Server binaries #73

cipherboy opened this issue Jan 27, 2024 · 10 comments
Labels
help wanted Extra attention is needed

Comments

@cipherboy
Copy link
Member

Upstream has heretofore avoided splitting the binaries. IMO, there are four candidates for separation:

  • The user-facing CLI, which operators and users interact with to talk to OpenBao instances directly.
  • The service-facing CLI, which runs the server and should probably allow linting of server configurations.
  • The (auto-authing) Proxy, which transparently handles authentication on behalf of local apps.
  • The templating Agent, which fetches secrets on behalf of local apps and writes them out.

The latter two could be left as a single unit as they were only recently differentiated in the code base.

I believe there was mostly a reluctance to change and desire to avoid breaking users' workflows. OpenBao may or may not have changes anyways (depending on #55 and #64), so it probably warrants another discussion here about it. This could potentially serve to decrease binary sizes again (as e.g., things like Cloud SDKs are reshuffled).

Additionally, we might be able to support both modes (a fat binary and several smaller binaries).

@joewxboy
Copy link
Contributor

EdgeX has proposed a smaller binaries approach, but I am unsure if what they want is compatible with what you have outlined, or is orthogonal.

@cipherboy
Copy link
Member Author

@joewxboy Do you happen to have a link to that proposal?

@joewxboy
Copy link
Contributor

@joewxboy Do you happen to have a link to that proposal?

That was purely verbal and informal. Let me open a conversation with James, Lenny, and team.

@bnevis-i
Copy link

bnevis-i commented Jan 29, 2024

Joe and Lenny asked that I contribute to the discussion regarding EdgeX usage of OpenBao:

  • User-facing CLI: EdgeX scripts all its interactions via custom Go code, thus splitting it out into a separate binary won't make that big an impact. It is nice when debugging to be able to enter the server and have the user CLI available. If it reduced the runtime footprint considerably that might be a good reason for splitting it out. A split-out user CLI might be a good candidate for static linking.
  • Auto-authing proxy: I didn't even know that existed.
  • Templating agent: We would not have had to write so much code in our security-bootstrapper if we had that as a small statically-linked standalone binary. (Read: didn't use that either.) It is very useful to have a templating agent to write config files with Vault-injected secrets that can be called from a container's entrypoint script.
  • Service-facing CLI. I assume this is the "vault server" subset of the CLI? I think a "vault server" interface to spin up the server is nice. Would it make much difference to runtime memory usage to split it out?

@cipherboy
Copy link
Member Author

\o Hi @bnevis-i, thanks for the comments :-)

It is very useful to have a templating agent to write config files with Vault-injected secrets that can be called from a container's entrypoint script.

I think this can be done with a Kubernetes operator (as discussed in #40) or similar, without needing to embed the agent in each container image. I've not used this myself, but if it sounds like there's general interest in it, its probably worth starting the forking process into our namespace here.

Service-facing CLI. I assume this is the "vault server" subset of the CLI? I think a "vault server" interface to spin up the server is nice. Would it make much difference to runtime memory usage to split it out?

I doubt it'd help (much) with runtime memory usage. I'm thinking it would mostly help with binary size, especially if #64 is adopted to prune the number of shipped plugins. There are other tricks (with #77 in particular) that would help server image size.

@bnevis-i
Copy link

bnevis-i commented Jan 29, 2024

I think this can be done with a Kubernetes operator (as discussed in #40) or similar, without needing to embed the agent in each container image. I've not used this myself, but if it sounds like there's general interest in it, its probably worth starting the forking process into our namespace here.

EdgeX can run on plain 'ol Docker. We don't embed the agent in each image. Rather, we populate a docker volume with startup scripts and tools, and point the entrypoint scripts of 3rd party containers at it. Anything on that volume can only be a script or a binary that depends on only the kernel ABI in order to survive glibc/musl variations across containers.

I doubt it'd help (much) with runtime memory usage. I'm thinking it would mostly help with binary size, especially if #64 is adopted to prune the number of shipped plugins. There are other tricks (with #77 in particular) that would help server image size.

My vote would be to prioritize that "low" (let those other issues decide the direction).

@cipherboy
Copy link
Member Author

Anything on that volume can only be a script or a binary that depends on only the kernel ABI in order to survive glibc/musl variations across containers.

Ah, very good. Existing binaries should already be statically linked, hence the size issues and I wouldn't expect this to change any time soon.

My vote would be to prioritize that "low".

This would likely come easily with splitting the other binaries off. Currently it is a single monolithic binary for everything. The upstream binary is around 365MB, so reducing this is rather important, IMO.

naphelps pushed a commit that referenced this issue Feb 2, 2024
* workflows: add bulk dep update job

* remove dependabot
@naphelps naphelps added this to the GA milestone Feb 5, 2024
@cipherboy
Copy link
Member Author

Syncing some commentary from matrix by me:


Hmmm, splicing all the binaries up will be a little interesting without tree shaking. We'll probably need to restructure more than a few files to actually see size benefits, given that there's a few files with broad imports...

$ ls -alh bao*
-rwxr-xr-x. 1 cipherboy cipherboy 139M Feb  5 22:12 bao
-rwxr-xr-x. 1 cipherboy cipherboy 107M Feb  5 22:11 bao-agent
-rwxr-xr-x. 1 cipherboy cipherboy 131M Feb  5 22:11 bao-cli
-rwxr-xr-x. 1 cipherboy cipherboy 104M Feb  5 22:11 bao-proxy
-rwxr-xr-x. 1 cipherboy cipherboy 132M Feb  5 22:11 bao-server

A very, very blunt:

//go:build !agent && !cli && !proxy

in vault/ and http/ (minus some required constants in http)

reduces the binary size to:

-rwxr-xr-x. 1 cipherboy cipherboy 94M Feb  6 17:36 bao-agent

(for agent only)

But this still includes all of /builtin and other places, so we'd really need to work on tag conditionals throughout nearly the entire code base for this to be useful.


But on second thought, perhaps we could take a top-down approach and rather than adding build tag limits in the destination packages, do the opposite and limit it in the command/ package only. This might require some additional splitting of code out into separate packages, but is worth a try.

@cipherboy
Copy link
Member Author

cipherboy commented Mar 1, 2024

Cross-posting my comment from #162:

I had started down this path earlier before realizing that it will be substantial work. I think what needs to happen is, for each file in command/, the file is evaluated for needing to be in a particular binary, and adding tags to the effect of removing it from the ones that don't need it. I don't think that's sufficient though, as binary sizes will still be large... I think we've gotta evaluate the imports of the files as well, and where the import isn't necessary for all modes that requires a file, split the file, so that the import isn't present for the other modes (and thus, isn't included in the build for the other modes that don't need it).

What I'm hoping to avoid is having to push build tags all the way into the other parts of Vault (past command/ into builtin/ or vault/)...

If anyone is interested in helping out, happy to to have it! Feel free to use (or not use, whichever is better :D) my code in that commit.

@naphelps naphelps added the help wanted Extra attention is needed label Mar 1, 2024
@cipherboy
Copy link
Member Author

cipherboy commented Mar 16, 2024

Some more notes: I think configuration needs to be split out from Agent/Proxy and Server; the latter includes seal wrapping which brings in go-kms-wrapping and thus the large cloud SDKs, but Agent/Proxy do not need these. Similar judicious pruning of imports will likely let us get to smaller binary sizes.

But the branch has been updated if anyone wants to keep experimenting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

4 participants