
monitor: Enable to run build and invoke in background #1296

Merged: jedevc merged 1 commit into docker:master from the monitor-list branch on Jan 31, 2023

Conversation

@ktock (Collaborator) commented Aug 29, 2022

#1104

This commit makes it possible to launch builds in a background process and adds some related commands.
Please give me feedback on the design of the detaching behavior and the CLI.

This covers PR4 and PR5 of #1104.

PR4
Refactor the build command to invoke builds in a background process. This is important because we don't want the lifecycle of a "debugging session" to be locked to a single process. A socket should be created under ~/.buildx, and even if the current process (unexpectedly) dies, its state can be accessed again via the socket.

PR5
Add list and attach commands to the monitor mode. list shows all currently active sessions (via the socket described in the previous section). attach makes the specified session active in the current process. If the session is already active in another process, that process detaches and drops into monitor mode.

Example:

BUILDX_EXPERIMENTAL=1 is required to enable detached buildx.

FROM ghcr.io/stargz-containers/busybox:1.32.0-org
RUN echo hello > /hello

buildx build

$ BUILDX_EXPERIMENTAL=1 /tmp/out/buildx build --invoke sh /tmp/ctx
[+] Building 3.8s (6/6) FINISHED                                                                                                             
 => [internal] load build definition from Dockerfile                                                                                    0.1s
 => => transferring dockerfile: 111B                                                                                                    0.0s
 => [internal] load .dockerignore                                                                                                       0.1s
 => => transferring context: 2B                                                                                                         0.0s
 => [internal] load metadata for ghcr.io/stargz-containers/busybox:1.32.0-org                                                           2.6s
 => [auth] stargz-containers/busybox:pull token for ghcr.io                                                                             0.0s
 => [1/2] FROM ghcr.io/stargz-containers/busybox:1.32.0-org@sha256:bde48e1751173b709090c2539fdf12d6ba64e88ec7a4301591227ce925f3c678     0.9s
 => => resolve ghcr.io/stargz-containers/busybox:1.32.0-org@sha256:bde48e1751173b709090c2539fdf12d6ba64e88ec7a4301591227ce925f3c678     0.0s
 => => sha256:ea97eb0eb3ec0bbe00d90be500431050af63c31dc0706c2e8fb53e16cff7761f 764.63kB / 764.63kB                                      0.8s
 => => extracting sha256:ea97eb0eb3ec0bbe00d90be500431050af63c31dc0706c2e8fb53e16cff7761f                                               0.1s
 => [2/2] RUN echo hello > /hello                                                                                                       0.1s
Launching interactive container. Press Ctrl-a-c to switch to monitor console
/ #

The above command launches two buildx processes: a client and a server.
The server is detached into a separate session.
The detached buildx server actually runs the build, while the foreground client forwards I/O and provides the monitor prompt.

Output of the ps command on the host:

$ ps -C buildx -o pid,ppid,pgid,sess,tty,comm
    PID    PPID    PGID    SESS TT       COMMAND
2175912 2172353 2175912 2172353 pts/7    buildx
2175934    3904 2175923 2175923 ?        buildx

The detached server lives on even after the client exits.

$ ps -C buildx -o pid,ppid,pgid,sess,tty,comm
    PID    PPID    PGID    SESS TT       COMMAND
2175934    3904 2175923 2175923 ?        buildx

list

list lists the available server instances.
USING will be true if the current monitor is connected to that instance.

(buildx) list
ID			PID		STARTED		USING
buildx3846920277    	2176183 	40 seconds ago	true
buildx902825017     	2175934 	3 minutes ago	false

attach

attach attaches to the specified instance.
Monitor commands like reload and rollback can be issued to the newly connected instance.

(buildx) attach buildx902825017
Interactive container was restarted. Press Ctrl-a-c to switch to the new container
(buildx) list
ID			PID		STARTED			USING
buildx3846920277    	2176183 	About a minute ago	false
buildx902825017     	2175934 	3 minutes ago		true
(buildx) reload
[+] Building 0.4s (5/5) FINISHED                                                                                                             
 => [internal] load .dockerignore                                                                                                       0.1s
 => => transferring context: 2B                                                                                                         0.0s
 => [internal] load build definition from Dockerfile                                                                                    0.0s
 => => transferring dockerfile: 111B                                                                                                    0.0s
 => [internal] load metadata for ghcr.io/stargz-containers/busybox:1.32.0-org                                                           0.3s
 => [1/2] FROM ghcr.io/stargz-containers/busybox:1.32.0-org@sha256:bde48e1751173b709090c2539fdf12d6ba64e88ec7a4301591227ce925f3c678     0.0s
 => => resolve ghcr.io/stargz-containers/busybox:1.32.0-org@sha256:bde48e1751173b709090c2539fdf12d6ba64e88ec7a4301591227ce925f3c678     0.0s
 => CACHED [2/2] RUN echo hello > /hello                                                                                                0.0s

Interactive container was restarted. Press Ctrl-a-c to switch to the new container
(buildx) Switched IO

/ # cat hello
hello

kill

kill kills the specified instance.
If no argument is provided, it kills the currently connected instance.

(buildx) list
ID			PID		STARTED			USING
buildx3846920277    	2176183 	About a minute ago	false
buildx902825017     	2175934 	4 minutes ago		true
(buildx) kill buildx3846920277
(buildx) list
ID			PID		STARTED		USING
buildx902825017     	2175934 	4 minutes ago	true
(buildx) kill
(buildx) list
ID	PID	STARTED	USING

How it works

buildx build forks and detaches itself as a server daemon. After launching the detached server, the foreground process continues running as the client.
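
For illustration, here is a minimal sketch of the detaching step (not the exact buildx code; the hidden "serve" subcommand name and the log handling are assumptions):

// Minimal sketch of detaching a server on linux: re-exec the current binary
// in a new session so that it survives the client's exit.
package main

import (
	"os"
	"os/exec"
	"syscall"
)

func launchDetachedServer(logFile string) error {
	f, err := os.OpenFile(logFile, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0600)
	if err != nil {
		return err
	}
	defer f.Close()

	cmd := exec.Command(os.Args[0], "serve") // hypothetical hidden subcommand
	cmd.Stdout = f
	cmd.Stderr = f
	// Setsid puts the child in a new session with no controlling terminal,
	// matching the separate SESS and "?" TTY seen in the ps output above.
	cmd.SysProcAttr = &syscall.SysProcAttr{Setsid: true}
	return cmd.Start()
}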

Each buildx server instance exposes a gRPC API over a socket (e.g. ~/.local/share/buildx/<instance-name>/buildx.sock) that serves the build and invoke functionality. The buildx client provides the monitor prompt and forwards I/O. It issues builds and invokes to the server every time the user requests them from the monitor.
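
As a rough sketch, a client could reach a server instance like this (illustrative only; the socket path follows the example above, and the actual controller service is elided):

// Minimal sketch of dialing a per-instance unix socket over gRPC.
package main

import (
	"context"
	"net"
	"os"
	"path/filepath"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func dialInstance(ctx context.Context, instanceName string) (*grpc.ClientConn, error) {
	home, err := os.UserHomeDir()
	if err != nil {
		return nil, err
	}
	// e.g. ~/.local/share/buildx/<instance-name>/buildx.sock
	sock := filepath.Join(home, ".local", "share", "buildx", instanceName, "buildx.sock")
	return grpc.DialContext(ctx, sock,
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		grpc.WithContextDialer(func(ctx context.Context, addr string) (net.Conn, error) {
			var d net.Dialer
			return d.DialContext(ctx, "unix", addr)
		}),
	)
}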

I/O for the build (e.g. a Dockerfile via stdin) and the container (e.g. stdio) is forwarded over the gRPC API, similarly to BuildKit's NewContainer gateway API.

Please note that the server doesn't support multiple clients as of now. Every time the user runs buildx build, it launches a new server process with a newly created socket. If a client requests a build while another client's build is ongoing, the old build is cancelled and the new one starts. In the future, the server could support multiple parallel builds and invokes; we may need some additional refactoring of the buildx build implementation to support parallel calls.

TODOs

  • Considerations

    • The attach command internally runs the rollback command for detaching the old client and attaching the new client. Should we allow this without restarting the container?
  • TODOs

    • Discuss and determine the design
    • Discuss the design of CI
      • The following files have been copied from BuildKit for generating protobuf-related code.
        • hack/dockerfiles/generated-files.Dockerfile
        • hack/update-generated-files
        • hack/validate-generated-files
    • Add tests

@ktock (Collaborator, Author) commented Aug 29, 2022

cc @tonistiigi @jedevc

(Three outdated review threads on commands/build.go - resolved)

@@ -502,6 +743,9 @@ func buildCmd(dockerCli command.Cli, rootOpts *rootOptions) *cobra.Command {
 
 	if isExperimental() {
 		flags.StringVar(&options.invoke, "invoke", "", "Invoke a command after the build [experimental]")
+		flags.StringVar(&options.root, "root", "", "Specify buildx server root [experimental]")
+		flags.BoolVar(&options.debug, "debug", false, "Print debug logs [experimental]")
+		flags.BoolVar(&options.detach, "detach", runtime.GOOS == "linux", "Detach buildx server (supported only on linux) [experimental]")

Member:

What is the linux specific here? I assume macos shouldn't have much issues?

@crazy-max Maybe you can check windows viability?

@ktock (Collaborator, Author) replied:

The location of buildx root dir and the logic to detach server (commands/launch_linux.go) currently assumes linux.

@tonistiigi (Member) left a comment:

I did think that the server process would be shared for all builds. I wonder which one would be preferred. Instance per build uses more memory but might be simpler and maybe also better for upgrades.

return err
}

// connect to buidlx and get client

Member:

typo: buidlx

if err := os.MkdirAll(rootDir, 0700); err != nil {
	return err
}
instanceRoot, err := os.MkdirTemp(rootDir, "buildx")

Member:

why is this called instanceRoot, doesn't look like it is instance specific?

Member:

nvm, I think this is invocation instance, not builder instance. Bit confusing to use the same terms.

@ktock (Collaborator, Author) commented Sep 1, 2022

I did think that the server process would be shared for all builds. I wonder which one would be preferred. Instance per build uses more memory but might be simpler and maybe also better for upgrades.

Yes, I agree with this. Both sound good to me, though I'm currently taking the approach of a server per build because it was easier to implement.
How should the shared buildx server be managed? Should we let users manage & upgrade the buildx server on their own (using something like systemd)?

@crazy-max (Member) commented:

FROM ghcr.io/stargz-containers/busybox:1.32.0-org
RUN echo hello > /hello
$ BUILDX_EXPERIMENTAL=1 docker buildx build --invoke sh .
#1 [internal] load .dockerignore
#1 transferring context: 2B done
#1 DONE 0.0s

#2 [internal] load build definition from Dockerfile
#2 transferring dockerfile: 111B done
#2 DONE 0.0s

#3 [internal] load metadata for ghcr.io/stargz-containers/busybox:1.32.0-org
#3 DONE 0.2s

#4 [1/2] FROM ghcr.io/stargz-containers/busybox:1.32.0-org@sha256:bde48e1751173b709090c2539fdf12d6ba64e88ec7a4301591227ce925f3c678
#4 resolve ghcr.io/stargz-containers/busybox:1.32.0-org@sha256:bde48e1751173b709090c2539fdf12d6ba64e88ec7a4301591227ce925f3c678 0.0s done
#4 DONE 0.0s

#5 [2/2] RUN echo hello > /hello
#5 CACHED
Launching interactive container. Press Ctrl-a-c to switch to monitor console
/ # 

Ctrl-a-c doesn't seem to switch to the monitor console:

Launching interactive container. Press Ctrl-a-c to switch to monitor console
/ # ^C

But if I exit, it switches though:

Launching interactive container. Press Ctrl-a-c to switch to monitor console
/ # exit
Switched IO
(buildx) 
$ uname -a
Linux pc-sff 5.10.102.1-microsoft-standard-WSL2 #1 SMP Wed Mar 2 00:30:59 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

@ktock Let me know if you need more info

@ktock (Collaborator, Author) commented Sep 20, 2022

@crazy-max Thank you for trying this.

Let me know if you need more info

  • Is this reproduced on the master branch version's buildx as well?
  • And please make sure that the toggle keys are not Ctrl-a + Ctrl-c but Ctrl-a + c.
  • Could you provide docker buildx inspect ?

@jedevc (Collaborator) commented Sep 20, 2022

A couple of notes while experimenting 🎉 (awesome work btw)

Ctrl-a-c doesn't seem to switch to the monitor console

Also having this.

  • Is this reproduced on the master branch version's buildx as well? No, works fine on master.

  • And please make sure that the toggle keys are not Ctrl-a + Ctrl-c but Ctrl-a + c. Confirmed, yup.

  • Could you provide docker buildx inspect ? (from a docker-desktop setup, but can also repro with a native docker install)

    $ buildx inspect container
    Name:   container
    Driver: docker-container
    
    Nodes:
    Name:      container0
    Endpoint:  desktop-linux
    Status:    running
    Buildkit:  v0.10.4
    Platforms: linux/amd64, linux/amd64/v2, linux/amd64/v3, linux/arm64, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/mips64le, linux/mips64, linux/arm/v7, linux/arm/v6
    

A socket should be created under ~/.buildx, and even if the current process (unexpectedly) dies, its state can be accessed again via the socket.

We don't seem to do this - the server process seems to remain active indefinitely until kill-ed. I think we should probably aim to have the client send a disconnect message if it exits cleanly (separate from the kill message, since when we have multiple clients to a single server, they should all disconnect before dying).

Please note that the server doesn't support multiple clients as of now. Every time the user runs buildx build, it launches a new server process with a newly created socket

The error messages aren't super clear for this atm. If we disconnect a client, ideally the server should send some reason to print out.

I did think that the server process would be shared for all builds.

I'd prefer something like this as well, though the split approach would keep crashes isolated to one build - a broken server taking out all the debugging sessions in process would definitely be a frustrating experience.

If we do go shared, I think we should go minimal to start, and simply manage the server by starting it if it's not already running, otherwise, connecting to the existing instance. If we detect a version mismatch, we should restart the server automatically after asking the user to disconnect all their debugging sessions. We could look into managing it with systemd/etc as an advanced use case later, but I think out-of-the-box experience should be low-friction (e.g. buildx can run inside a container, where systemd might not be available).
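
Roughly something like this, as a sketch only (the well-known socket path and the hidden serve subcommand are made up, and the Setsid detach is linux-style):

// Sketch: connect to the server socket if one is listening; otherwise
// launch a detached server and retry until it comes up.
package main

import (
	"context"
	"net"
	"os/exec"
	"syscall"
	"time"
)

func connectOrLaunch(ctx context.Context, buildxBin, sock string) (net.Conn, error) {
	if conn, err := net.Dial("unix", sock); err == nil {
		return conn, nil // a server is already running
	}
	cmd := exec.Command(buildxBin, "serve") // hypothetical hidden subcommand
	cmd.SysProcAttr = &syscall.SysProcAttr{Setsid: true}
	if err := cmd.Start(); err != nil {
		return nil, err
	}
	// Poll until the freshly launched server starts listening.
	var d net.Dialer
	for {
		if conn, err := d.DialContext(ctx, "unix", sock); err == nil {
			return conn, nil
		}
		select {
		case <-ctx.Done():
			return nil, ctx.Err()
		case <-time.After(100 * time.Millisecond):
		}
	}
}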

The attach command internally runs the rollback command for detaching the old client and attaching the new client. Should we allow this without restarting the container?

I think this depends - if we want to allow multiple debugging clients for a single server, and those should both be able to open shells, then they need to be separate shells (or stdin/stdout/stderr will get messy), so we'd always restart the container. If we're ok with only a single shell at the same time, I think it would be nice to switch over without restarting.

@ktock (Collaborator, Author) commented Oct 12, 2022

Thank you for the review. Updated the patch based on the comments.

Ctrl-a-c doesn't seem to switch to the monitor console:

Fixed this.

I did think that the server process would be shared for all builds.

Fixed to share one server among builds.

disconnect

Added.

detect a version mismatch

Fixed to print a warning on version mismatch.


FROM ghcr.io/stargz-containers/busybox:1.32.0-org
RUN echo hello > /hello

buildx build

  • One shared buildx server
  • Launched automatically when no buildx server is found.
$ BUILDX_EXPERIMENTAL=1 /tmp/out/buildx build --invoke sh /tmp/ctx
INFO: no buildx server found; launching...
[+] Building 0.4s (5/5) FINISHED                                                                                      
 => [internal] load .dockerignore                                                                                0.0s
 => => transferring context: 2B                                                                                  0.0s
 => [internal] load build definition from Dockerfile                                                             0.0s
 => => transferring dockerfile: 111B                                                                             0.0s
 => [internal] load metadata for ghcr.io/stargz-containers/busybox:1.32.0-org                                    0.4s
 => [1/2] FROM ghcr.io/stargz-containers/busybox:1.32.0-org@sha256:bde48e1751173b709090c2539fdf12d6ba64e88ec7a4  0.0s
 => => resolve ghcr.io/stargz-containers/busybox:1.32.0-org@sha256:bde48e1751173b709090c2539fdf12d6ba64e88ec7a4  0.0s
 => CACHED [2/2] RUN echo hello > /hello                                                                         0.0s
Launching interactive container. Press Ctrl-a-c to switch to monitor console
/ # 

list

(buildx) list
ID				CURRENT_SESSION
k0y9wsvamlh6rt7rzhsb3g3vx	false
xtwwz5l2tonpbqodc27uf3ztz	true

attach

(buildx) list
ID				CURRENT_SESSION
k0y9wsvamlh6rt7rzhsb3g3vx	false
xtwwz5l2tonpbqodc27uf3ztz	true
(buildx) attach k0y9wsvamlh6rt7rzhsb3g3vx
(buildx) list
ID				CURRENT_SESSION
k0y9wsvamlh6rt7rzhsb3g3vx	true
xtwwz5l2tonpbqodc27uf3ztz	false

disconnect

Ctrl-D disconnects the session.

(buildx) [Ctrl-D]

From another terminal, we can see that npvjmblid3xyill4mltoum00k is disconnected:

(buildx) list
ID				CURRENT_SESSION
s1ainu0yjs3pgt9oplurlaqpc	true
npvjmblid3xyill4mltoum00k	false
(buildx) list
ID				CURRENT_SESSION
s1ainu0yjs3pgt9oplurlaqpc	true

A disconnect command is also provided for explicitly disconnecting.

(buildx) list
ID				CURRENT_SESSION
k0y9wsvamlh6rt7rzhsb3g3vx	true
xtwwz5l2tonpbqodc27uf3ztz	false
(buildx) disconnect 
(buildx) list
ID				CURRENT_SESSION
xtwwz5l2tonpbqodc27uf3ztz	false

version mismatch

A version mismatch warning is printed when the client and server versions differ.

$ BUILDX_EXPERIMENTAL=1 /tmp/out/buildx build --invoke sh /tmp/ctx
WARNING: version mismatch (server: "github.com/docker/buildx 0.0.0+unknown ", client: "github.com/docker/buildx 0.0.0+unknown test"); please kill and restart buildx server
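
For reference, the check behind this warning can be as simple as an exact string comparison - a minimal sketch, not the actual implementation:

// Minimal sketch of a client/server version check.
package main

import (
	"fmt"
	"os"
)

func warnOnVersionMismatch(serverVersion, clientVersion string) {
	if serverVersion != clientVersion {
		fmt.Fprintf(os.Stderr,
			"WARNING: version mismatch (server: %q, client: %q); please kill and restart buildx server\n",
			serverVersion, clientVersion)
	}
}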

(Outdated review thread on monitor/monitor.go - resolved)
Comment on lines 1 to 5
#!/usr/bin/env bash
# Forked from: https://github.com/moby/buildkit/blob/v0.10.4/hack/update-generated-files
# Copyright The BuildKit Authors.
# Copyright The Buildx Authors.
# Licensed under the Apache License, Version 2.0

Collaborator:

cc @crazy-max do you have any ideas of how we could do this so that we don't need to copy this over?

@crazy-max (Member) commented Oct 25, 2022:

I think it's fine for now, but in a follow-up we could just build using the hack/dockerfiles/generated-files.Dockerfile from the BuildKit repo, fetched using the BuildKit reference from go.mod, so we would not need an extra Dockerfile.

I will also take a look at cherry-picking moby/buildkit@3fd0785 from moby/buildkit#3204 so we use the go tools build constraint, avoid hardcoded versions in the Dockerfile, and fully rely on the ones from go.mod for protobuf/gogo tooling.

| [`-f`](https://docs.docker.com/engine/reference/commandline/build/#specify-a-dockerfile--f), [`--file`](https://docs.docker.com/engine/reference/commandline/build/#specify-a-dockerfile--f) | `string` | | Name of the Dockerfile (default: `PATH/Dockerfile`) |
| `--iidfile` | `string` | | Write the image ID to the file |
| `--invoke` | `string` | | Invoke a command after the build [experimental] |
| `--label` | `stringArray` | | Set metadata for an image |
| [`--load`](#load) | | | Shorthand for `--output=type=docker` |
| `--log-file` | `string` | | Specify file to output buildx server log [experimental] |

Collaborator:

Not sure about the command-line args; I think we should maybe group the client/server controls together somehow, or maybe we should have a config file?

--debug also feels like something we might want to keep for the debugging functionality, instead of for increasing logging verbosity.

@ktock (Collaborator, Author) replied:

Introduced a config file for the server.

(Outdated review threads on commands/builder/generate.go, commands/builder/client.go, and commands/build.go - resolved)
}
go wait()
return nil
case controlModeServe:

Collaborator:

If we can, I think it makes sense to actually have a separate internal sub-command for this, instead of trying to distinguish by env var - only the client-side parts should need to be in this function.

@ktock (Collaborator, Author) replied:

Fixed to use hidden subcommands.

@@ -434,7 +456,263 @@ func buildCmd(dockerCli command.Cli, rootOpts *rootOptions) *cobra.Command {
options.contextPath = args[0]
options.builder = rootOpts.builder
cmd.Flags().VisitAll(checkWarnedFlags)
return runBuild(dockerCli, options)
if !isExperimental() || !options.detach {

Collaborator:

I'm not sure I understand this condition 😄

If --detach is set, then BUILDX_EXPERIMENTAL must be enabled, since --detach is an experimental option. I assume it's there explicitly to run the build on the server, but I think we should always default to that in experimental mode - so I'm not sure we need to keep the detach option?

@ktock (Collaborator, Author) replied:

Currently, --detach is supported only on linux, so could we preserve the --detach=false option for non-linux platforms?

@jedevc (Collaborator) commented Oct 12, 2022

I like the refactor to use a single server 🎉 Left a couple of high-level comments, plus a few notes I found while playing:

  • list, reload, etc. feel clunky to me. I think we should go with commands that feel more "familiar" - maybe reload -> build and list -> status. attach and disconnect don't match up; we should have attach and detach, or connect and disconnect. kill kills the whole server - that was unexpected when other clients were connected!
  • I keep looking around for a command that will just drop into monitor mode directly, without running a build - I think the only way to get there atm is to start a dummy build, then disconnect.
  • Connecting multiple clients to a single session is weird. I think only one is supported, but when attaching a second client, it appears to work - with both lists in each client saying they're connected - even though the output has already been switched away from that first client. I think we should be more explicit with disconnecting clients when someone else connects.
  • We can leak sessions fairly easily 😄 If I have two separate clients in two separate sessions, then connect client 1 to session 2, then session 1 is owned by no one, then closing all the clients leaves that session unowned. Not sure what we should do here - should we kill a session the moment everyone has called disconnect from it (so it should still be preserved from a client crash), or should we leave it open until the client that opened it has exited?

I'm happy if we circle back and address these later, just wanted to list them out so I don't forget!

@ktock force-pushed the monitor-list branch 2 times, most recently from a88b3f5 to 9e534bc (October 21, 2022)
@ktock (Collaborator, Author) commented Oct 21, 2022

@jedevc Thank you for the review.

We can leak sessions fairly easily 😄 If I have two separate clients in two separate sessions, then connect client 1 to session 2, then session 1 is owned by no one, then closing all the clients leaves that session unowned. Not sure what we should do here - should we kill a session the moment everyone has called disconnect from it (so it should still be preserved from a client crash), or should we leave it open until the client that opened it has exited?

The disconnect command can take an argument to forcibly disconnect a specific session, so we can manually kill a specific session.

list, reload, etc. feel clunky to me. I think we should go with commands that feel more "familiar" - maybe reload -> build and list -> status. attach and disconnect don't match up; we should have attach and detach, or connect and disconnect. kill kills the whole server - that was unexpected when other clients were connected!

I keep looking around for a command that will just drop into monitor mode directly, without running a build - I think the only way to get there atm is to start a dummy build, then disconnect.

Connecting multiple clients to a single session is weird. I think only one is supported, but when attaching a second client, it appears to work - with both lists in each client saying they're connected - even though the output has already been switched away from that first client. I think we should be more explicit with disconnecting clients when someone else connects.

Thank you for the suggestions. Could we address them in follow-up patches, as this PR is already very big?

if in.pull != nil {
	pull = *in.pull
}
_, err = runBuildContext(ctx, dockerCli, in.BuildOptions, os.Stdin, in.progress, in.invoke != "", nil)

Collaborator:

If my reading of the code here is right, we only support --invoke when going through the runBuildOnBuilder path? That seems fine to me, especially as it looks like we'll make this the default eventually?

If so, I think we might be able to remove the allowNoOutput flag at this point.

@ktock (Collaborator, Author) replied:

Thank you for the review. Fixed.

)
}

func launchCmd(dockerCli command.Cli, rootOpts *rootOptions) *cobra.Command {

Collaborator:

Why do we need both launchCmd and serveCmd? It seems like launch just calls serve, but I might be missing something as to why it's necessary.

@ktock (Collaborator, Author) replied:

This was for immediately detaching the buildx server process from the buildx (client) process and reparenting it to the init process. But it's not strictly necessary, because reparenting happens after the buildx (client) exits.
Fixed to directly exec the buildx server.

@jedevc (Collaborator) commented Oct 25, 2022

Hey @ktock sorry it's taking so long to review 🎉 I'm (personally) happy with the design, and would be happy to approve soon.

I think the design is good atm, the interface feels quite natural, but there is some complexity there we should address in a follow-up (IMO). My comments are mostly about making the code clear so I'd feel comfortable working on it in the future myself 😄

A couple notes:

  • I'm having difficulty understanding the ioset package, especially the SingleForwarder. Some godoc and some inline comments explaining what's going on and what it's for would be really helpful (I've spent quite a while trying to work it out, and it's not clicking for me!)
  • ioset's In/Out should probably be called PipeReader/PipeWriter, since they're created by a Pipe method (ideally they'd also have some way of making sure that the caller can't accidentally call Close on the Stdin/Stdout/Stderr directly). Again, some godoc would be nice ❤️
  • There's a few functions all with similar names and behaviors runBuild/runBuildOnBuilder/runBuildContext/buildTargets. This is kind of tricky to track through the code - some clearer names (and godoc) would be nice here.
  • We should avoid calling these "builders" - that name is already used for when we create drivers, and makes some of the code tricky to follow. We've already been using controllers (with BuildxController) which I quite like, but we should make sure that the whole PR follows the same standard.

Some things for follow-ups:

  • Things mentioned here. Yup this PR is getting long, it'll be easier to work on these in-tree (and then we can multitask a bit easier) 🎉
  • We should refactor out the different controllers (local and remote) into a separate package. I tried having a look at this myself, but as is, there's some strong coupling between the cli and the controllers. Even if that gets nicely separated out, we end up in a scenario where build and bake still use the same utilities, which makes things tricky. This ended up being kind of a time sink, and IMO we should probably first work out how we want this design to interact with bake before going too much further.
    I'm happy to take a look at this once we merge; I already spent some time getting familiar with what will make this tricky.
  • I wonder if we should prefer a config file in a known location like ~/.docker/buildx - instead of requiring it be passed each time 🤔

@ktock (Collaborator, Author) commented Nov 7, 2022

@jedevc Thank you for your comments.

I'm having difficulty understanding the ioset package, especially the SingleForwarder. Some godoc and some inline comments explaining what's going on and what it's for would be really helpful (I've spent quite a while trying to work it out, and it's not clicking for me!)
ioset's In/Out should probably be called PipeReader/PipeWriter, since they're created by a Pipe method (ideally they'd also have some way of making sure that the caller can't accidentally call Close on the Stdin/Stdout/Stderr directly). Again, some godoc would be nice ❤️

Added godoc to the ioset package.

There's a few functions all with similar names and behaviors runBuild/runBuildOnBuilder/runBuildContext/buildTargets. This is kind of tricky to track through the code - some clearer names (and godoc) would be nice here.

Tried to make it clearer by renaming some of them as follows.

  • runBuildContext -> runBuildWithContext
  • runBuildOnBuilder -> launchControllerAndRunBuild
  • runBuild -> runBuild (unchanged)
  • buildTargets -> buildTargets (unchanged)

Please feel free to apply more changes (maybe in follow-up PRs) if you come up with better names.

We should avoid calling these "builders" - that name is already used for when we create drivers, and makes some of the code tricky to follow. We've already been using controllers (with BuildxController) which I quite like, but we should make sure that the whole PR follows the same standard.

Renamed builder to controller.

Some things for follow-ups:

Things mentioned here. Yup this PR is getting long, it'll be easier to work on these in-tree (and then we can multitask a bit easier) 🎉
We should refactor out the different controllers (local and remote) into a separate package. I tried having a look at this myself, but as is, there's some strong coupling between the cli and the controllers. Even if that gets nicely separated out, we end up in a scenario where build and bake still use the same utilities, which makes things tricky. This ended up being kind of a time sink, and IMO we should probably first work out how we want this design to interact with bake before going too much further.
I'm happy to take a look at this once we merge, I already spent some time getting familiar with what will make this tricky.

Let's work on these in follow-up PRs.

@jedevc jedevc added this to the v0.10.0 milestone Dec 6, 2022

@crazy-max (Member) left a comment:

Thanks for this @ktock, looks great!

Related to my review about proto gen/validation: I'm fine with doing this in a follow-up, not a blocker.

Also needs rebase.

Comment on lines 229 to 237
func rootDataDir() (string, error) {
	if xdh := os.Getenv("XDG_DATA_HOME"); xdh != "" {
		return filepath.Join(xdh, "buildx"), nil
	}
	home := os.Getenv("HOME")
	if home == "" {
		return "", fmt.Errorf("environment variable HOME is not set")
	}
	return filepath.Join(home, ".local/share/buildx"), nil
}

@crazy-max (Member):

I wonder if we should not just use the current buildx config dir:

func ConfigDir(dockerCli command.Cli) string {
	if buildxConfig := os.Getenv("BUILDX_CONFIG"); buildxConfig != "" {
		logrus.Debugf("using config store %q based in \"$BUILDX_CONFIG\" environment variable", buildxConfig)
		return buildxConfig
	}
	buildxConfig := filepath.Join(filepath.Dir(dockerCli.ConfigFile().Filename), "buildx")
	logrus.Debugf("using default config store %q", buildxConfig)
	return buildxConfig
}

And create a controller folder in it?

@ktock (Collaborator, Author) replied:

Thanks for the review. Fixed.

@@ -0,0 +1,53 @@
# Forked from https://github.com/moby/buildkit/blob/v0.10.4/hack/dockerfiles/generated-files.Dockerfile

@crazy-max (Member):

This has changed a bit, see moby/buildkit#3236

I think we can do the same here.

@ktock (Collaborator, Author) replied:

Thanks for the suggestion. Updated this file.

Comment on lines 1 to 5
#!/usr/bin/env bash
# Forked from: https://github.com/moby/buildkit/blob/v0.10.4/hack/update-generated-files
# Copyright The BuildKit Authors.
# Copyright The Buildx Authors.
# Licensed under the Apache License, Version 2.0
@@ -0,0 +1,35 @@
#!/usr/bin/env bash
# Forked from: https://github.com/moby/buildkit/blob/v0.10.4/hack/validate-generated-files
@jedevc jedevc removed this from the v0.10.0 milestone Jan 6, 2023
@ktock force-pushed the monitor-list branch 2 times, most recently from a65c4b2 to 59e2ca0 (January 26, 2023)
Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
@jedevc jedevc merged commit cb94298 into docker:master Jan 31, 2023
@jedevc (Collaborator) commented Jan 31, 2023

Heya @ktock sorry this took so long 😢 Hopefully we can open up some bandwidth now to get some follow-ups done ❤️

@ktock (Collaborator, Author) commented Jan 31, 2023

@jedevc I'm sorry for my slow updates, and thank you for taking the time to review this patch!

@ktock ktock deleted the monitor-list branch January 31, 2023 14:34
@jedevc (Collaborator) commented Jan 31, 2023

Aha, so, I just found something right off the bat 😢

BUILDX_EXPERIMENTAL=1 buildx build . --builder=dev --invoke=sh
INFO: connecting to buildx server
[+] Building 0.0s (2/2) FINISHED                                                                                                                                  
 => [internal] load build definition from Dockerfile                                                                                                         0.0s
 => => transferring dockerfile: 2B                                                                                                                           0.0s
 => [internal] load .dockerignore                                                                                                                            0.0s
 => => transferring context: 2B                                                                                                                              0.0s
ERROR: failed to build: rpc error: code = Unknown desc = failed to solve: failed to read dockerfile: open /tmp/buildkit-mount3665150422/Dockerfile: no such file or directory

Any ideas as to why we're no longer able to find the Dockerfile?

@jedevc (Collaborator) commented Jan 31, 2023

Ah, the detached server has a different working directory than the client - so I think we need to do something special to get the paths to resolve correctly?

@ktock (Collaborator, Author) commented Jan 31, 2023

@jedevc Sorry for that bug. Maybe we need something like the following (not tested). Can I open a follow-up PR?

import (
	"os"
	"path/filepath"
)

// resolveAbsolutePaths makes local paths in BuildOptions absolute on the
// client side, before they are sent to the detached server (which has a
// different working directory).
func resolveAbsolutePaths(options *controllerapi.BuildOptions) (_ *controllerapi.BuildOptions, err error) {
	options.ContextPath, err = absIfLocal(options.ContextPath)
	if err != nil {
		return nil, err
	}
	// ... Do the same for DockerfileName, etc.
	return options, nil
}

// absIfLocal converts p to an absolute path only if it exists locally;
// anything else (e.g. a remote context URL) is passed through unchanged.
func absIfLocal(p string) (string, error) {
	if _, err := os.Stat(p); err == nil {
		return filepath.Abs(p)
	}
	return p, nil
}

@jedevc
Copy link
Collaborator

jedevc commented Jan 31, 2023

Yeah sure! Go for it 🎉
