
Inconsistent Marshalling Order in client/llb #4695

Closed
adamperlin opened this issue Feb 26, 2024 · 6 comments

@adamperlin
Contributor

We are building a frontend for BuildKit, and in testing we have noticed inconsistent behavior in the BuildKit client LLB marshaller. Specifically, when an ExecOp has multiple source nodes, the sources are not marshaled in a consistent order. I've tracked the issue to the Inputs implementations in client/llb/exec.go and client/llb/fileop.go: both iterate over map keys to return the inputs of ExecOp and FileOp while the marshaller traverses the state graph, so the traversal order is nondeterministic and the ops preceding any ExecOp or FileOp may be marshaled in a different order on each run.
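For illustration, here is a minimal Go sketch of that pattern (not the actual BuildKit code; the Output type and both helper functions are hypothetical): deduplicating inputs through a map and then ranging over that map yields a nondeterministic order, while an order-preserving dedup does not.

package main

import "fmt"

type Output string

// nondeterministicInputs mirrors the problematic shape described above:
// collect unique outputs in a map, then range over the map to build the
// slice. Go's map iteration order is unspecified, so the result varies.
func nondeterministicInputs(outputs []Output) []Output {
	seen := map[Output]struct{}{}
	for _, o := range outputs {
		seen[o] = struct{}{}
	}
	inputs := make([]Output, 0, len(seen))
	for o := range seen { // iteration order is randomized per run
		inputs = append(inputs, o)
	}
	return inputs
}

// deterministicInputs uses the map only for deduplication and preserves
// the first-seen order of the input slice.
func deterministicInputs(outputs []Output) []Output {
	seen := map[Output]struct{}{}
	inputs := make([]Output, 0, len(outputs))
	for _, o := range outputs {
		if _, ok := seen[o]; ok {
			continue
		}
		seen[o] = struct{}{}
		inputs = append(inputs, o)
	}
	return inputs
}

func main() {
	outs := []Output{"image", "local-context", "image"}
	fmt.Println(nondeterministicInputs(outs)) // order varies run to run
	fmt.Println(deterministicInputs(outs))    // always [image local-context]
}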

I just wanted to ask:

  1. Is this behavior intended? If so, we can certainly work around it in our testing.
  2. Might this behavior have caching implications? Could this behavior in the marshaller be causing any issues with the solver's caching strategy? (cc @cpuguy83)

I've included the following program, which reproduces the behavior.

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/moby/buildkit/client/llb"
	"github.com/moby/buildkit/solver/pb"
)

func main() {
	// Two source ops feed this exec op: the busybox image and the
	// local context, mounted at /a and /b.
	st := llb.Image("busybox:latest").
		Run(
			llb.Args([]string{"ls /a; ls /b"}),
			llb.AddMount("/b", llb.Local(".")),
		).AddMount("/a", llb.Local("."))

	ctx := context.Background()

	def, err := st.Marshal(ctx)
	if err != nil {
		log.Fatal(err) // log.Fatal already exits the process
	}

	out := make([]*pb.Op, 0, len(def.Def)-1)

	// Skip the final entry, which is the terminal op of the definition.
	for _, dt := range def.Def[:len(def.Def)-1] {
		op := &pb.Op{}
		if err := op.Unmarshal(dt); err != nil {
			log.Fatal(err)
		}

		out = append(out, op)
	}

	// In the marshaled output, sometimes the source docker image comes
	// first, and sometimes the local context comes first.
	for _, op := range out {
		fmt.Println(op)
		fmt.Println()
	}
}
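Because Go randomizes map iteration order between runs, executing this program several times should show the two source ops (the busybox image and the local context) appearing in either order in the printed definition, while the exec op itself lists its inputs in a fixed order.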
@tonistiigi
Member

tonistiigi commented Feb 27, 2024

This is not intended and can be fixed directly. If I understand correctly, this does not affect the cache, since the indexing of inputs is consistent. But it may have an impact on parallel requests, where some nodes would need to do extra work to compute cache keys instead of simply being deduplicated in the LLB graph.

@tonistiigi tonistiigi added this to the v0.13.0 milestone Feb 27, 2024
@adamperlin
Contributor Author

adamperlin commented Feb 27, 2024

Got it, thanks for the prompt response! Just to clarify: were you planning on fixing this? I do have a fix I was looking at and can open a PR.

@tonistiigi
Member

If you have a PR already, then open it. If not, then I can carry this as well.

@adamperlin
Contributor Author

adamperlin commented Feb 27, 2024

I have opened #4706. I'd like to see if it passes CI before requesting a full review, as I've had some trouble running the full test suite locally.

@tonistiigi
Member

Actually, I think I looked at it wrong. The code you pointed out only seems to change the marshaling order, i.e. the order of elements in the definition array. The actual input order for each vertex, and therefore the content-addressable digest of each operation, already has guaranteed ordering.

We can still make it consistent, but it should not have any effect on builds at all. On the daemon side these are all read into a map keyed by their digest, so the ordering of individual entries has no impact.
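To make that point concrete, here is a hedged sketch (the op type and digest helper are assumptions for illustration, not BuildKit internals) of why per-op digests are unaffected by definition-array order: each op's digest covers its own bytes, including its input references in a fixed order, and the receiving side keys ops by that digest.

package main

import (
	"crypto/sha256"
	"fmt"
)

type op struct {
	name   string
	inputs []string // digests of input ops; their order is part of the identity
}

// digest hashes the op's own content, including its ordered input list.
func (o op) digest() string {
	h := sha256.New()
	h.Write([]byte(o.name))
	for _, in := range o.inputs {
		h.Write([]byte(in))
	}
	return fmt.Sprintf("sha256:%x", h.Sum(nil))
}

func main() {
	img := op{name: "docker-image://busybox:latest"}
	local := op{name: "local://context"}
	exec := op{name: "exec", inputs: []string{img.digest(), local.digest()}}

	// The daemon-side view: ops keyed by digest. The order in which img
	// and local appeared in the definition array is irrelevant here.
	byDigest := map[string]op{
		img.digest():   img,
		local.digest(): local,
		exec.digest():  exec,
	}
	fmt.Println(len(byDigest), "ops; exec digest:", exec.digest())
}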

@tonistiigi tonistiigi removed this from the v0.13.0 milestone Feb 28, 2024
@adamperlin
Contributor Author

I believe this can be closed now due to #311.
