Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Socket args from the CLI #6747

Open
sipsma opened this issue Feb 27, 2024 · 5 comments
Open

Support Socket args from the CLI #6747

sipsma opened this issue Feb 27, 2024 · 5 comments
Assignees
Milestone

Comments

@sipsma
Copy link
Contributor

sipsma commented Feb 27, 2024

Right now you can't invoke Functions that accept args of type Socket from the CLI. In theory, this should be as simple as supporting e.g. dagger call fn --sock unix:///var/run/docker/docker.sock or dagger fn --sock tcp://localhost:1234, etc.

However, we need to be very careful when implementing this to not accidentally create a backdoor where any (malicious) module could gain access to any host socket. I think this will require explicit tracking of which modules have access to a given socket based on being passed it as an arg (or being passed an arg like a Container that has a socket embedded somewhere in its DAG definition).

  • Also need to make sure this is done in such a way that we don't invalidate cache with any random IDs

Related: #6726 (comment)
Also related (plugs previous hole that made it possible to access these sockets from a module): #6748

@jedevc
Copy link
Member

jedevc commented Feb 27, 2024

I think this will require explicit tracking of which modules have access to a given socket based on being passed it as an arg

I think this sounds similar to what I was suggesting for secrets as well - #6601 (comment). Like I mentioned there, I think the tricky part is needing to handle the case where you pass objects around that contain secret fields (in a potentially nested way).

But I do really like this as the way that we perform isolation between these sensitive types, Sockets, Secrets, potentially even Files/Containers (not sure if these can be accessed via raw graphql today).

Regardless, kinda neat to have both thought up the same way of performing better module isolation, it definitely has my vote.

@sipsma
Copy link
Contributor Author

sipsma commented Mar 20, 2024

Just to update, been slowly chipping away at the pre-reqs needed to support this (#6806 and #6836). There's one last PR needed to add support; it involves some internal refactoring that I need to coordinate with other refactorings going on now, but should make it to a release in the near future.

@helderco
Copy link
Contributor

@sipsma, can you give an update on the pre-reqs here?

@sipsma
Copy link
Contributor Author

sipsma commented May 10, 2024

@helderco Yeah sorry there's been a ton of work going on that I've been mentioning in other various PRs but I will summarize the status here.

Basically, in order to support this while also not creating a backdoor that would allow any malicious module to access to any socket on the host, we need fine grain isolation of session resources. But before I started this effort, the lack of isolation was a very deeply embedded assumption over almost the entire codebase.

  • Worth noting that this same problem applies to secrets, host services, registry auth, etc. so those holes are all gonna get plugged alongside this effort. The same general problem also applies to cache volumes once we want to isolate those.

There was a path to do this that just layered more hacks and confusing code on top of our existing pile of hacks and confusing code (which had grown "organically" during our various efforts over the past year, as is natural), but this felt like the right time to do things cleanly, so I have been going with that.

Thankfully it's already benefitted quite a few seemingly unrelated issues too.

The things already merged as part of this effort:

What I'm working on now:

Other things that will be made possible once this is finished:

  • We will no longer need the shim at all, which is currently a source of performance overhead and extreme convolution
  • We can stop doing insane grpc-in-grpc tunneling (something we've picked up from buildkit) just for modules/nested execs to connect back to the engine, which is another source of performance overhead and confusing code/behavior.
  • Storage drivers, specifically cache volumes, have a much simpler path to implementation via the new buildkit worker added to support all this

cc @aweris @shykes I know you have been eagerly anticipating this and it's been taking a while, but it's all nearing it's end now and in the big scheme of things will benefit quite a bit besides the issue itself

@helderco
Copy link
Contributor

Outstanding! 🙌

Thank you for that amazing update (and amazing work!). I’ve seen those issues and PRs popup here and there but didn’t have time to dig into everything, so I very much appreciate the birds-eye view here. It's exactly what I needed.

There was a path to do this that just layered more hacks and confusing code on top of our existing pile of hacks and confusing code (which had grown "organically" during our various efforts over the past year, as is natural), but this felt like the right time to do things cleanly, so I have been going with that.

Definitely! I always appreciate investing in some housekeeping to keep things more maintainable, otherwise we drown in technical debt.

@sipsma sipsma mentioned this issue May 13, 2024
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants