Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

filesync: fix handling non-ascii in file paths #3946

Merged
merged 3 commits into from
Jul 7, 2023

Conversation

tonistiigi
Copy link
Member

fixes #3927

Another option would be to create a second set of keys that use encoded values, but I think this should be safe enough and maintain backward compatibility.

jedevc and others added 2 commits June 9, 2023 17:40
Signed-off-by: Justin Chadwell <me@jedevc.com>
(cherry picked from commit f9f2a00)
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
@tonistiigi tonistiigi requested a review from jedevc June 10, 2023 00:55
@jedevc
Copy link
Member

jedevc commented Jun 12, 2023

A couple funky edge cases I'm a bit worried about:

  1. What if the filename has a query escape string in it. That would be weird, but I think possible (e.g. an automatically generated name from a wgeted file). Then, we wouldn't encode it, since it's in the ASCII range, but we'd decode it, since it's run through url.QueryEscape.
  2. What if we have mismatched client and server versions. An older client is fine, since we just fallback to the old error, but an older server would write URL escaped filenames from a newer client. I think there should only be two behaviors: either we error as before, or both client and server support these filenames, and it works as expected - ideally we shouldn't have broken output.

Another option would be to create a second set of keys that use encoded values

It's a bit messy, but I think this is probably the way to go 😢 We can fully encode/decode on both sides, so avoids issue 1, and since the old property is still as before, we avoid issue 2.

There could also be a very hacky way to do this with a new capability? Though I'm less of a fan of that, since it would involve inspecting pathnames to check if they have unicode characters on the client side.

@tonistiigi
Copy link
Member Author

What if the filename has a query escape string in it

What case do you think is problematic? https://go.dev/play/p/TQgcNwLizWp

an older server would write URL escaped filenames from a newer client.

Is there a practical example for this? If there is, then for this, we don't need a full set of new fields. Just one field saying encoded=true would do, and the client would require that to decode.

@tonistiigi tonistiigi added this to the v0.12.0 milestone Jun 28, 2023
@jedevc
Copy link
Member

jedevc commented Jun 29, 2023

What case do you think is problematic? go.dev/play/p/TQgcNwLizWp

https://go.dev/play/p/o2VQ9OtsJo4:

    fmt.Println(decode(encode("foo?abc=aa&def=bb%2F")))
    // foo?abc=aa&def=bb/ <nil>

If the original path contains these escape strings, we reconstruct an incorrect path on the client.

Is there a practical example for this? If there is, then for this, we don't need a full set of new fields. Just one field saying encoded=true would do, and the client would require that to decode.

Low risk here, but again becomes an issue with the above as well.

Assuming two files on the client: test-äöü.txt and test-%C3%A4%C3%B6%C3%BC.txt (the encoded version of the previous).

If the buildkit server has this patch, doing COPY test-äöü.txt . causes different client behavior depending on whether it has the patch or not:

  • Old client gives test-%C3%A4%C3%B6%C3%BC.txt (since it doesn't perform decoding)
  • New client gives test-äöü.txt (since it does)

Previously though, this behavior would have just errored in buildkit - so maybe the change doesn't matter hugely?

@tonistiigi
Copy link
Member Author

Previously though, this behavior would have just errored in buildkit - so maybe the change doesn't matter hugely?

The cases that previously errored do not matter. But I guess if someone has an actual file test-%C3%A4%C3%B6%C3%BC.txt then that would have worked before. Not sure how realistic though.

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
@tonistiigi
Copy link
Member Author

Added a commit to detect old versions with raw values and not try to decode in that case.

Copy link
Member

@crazy-max crazy-max left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Not related but I don't think I've ever seen this test fail: https://github.com/moby/buildkit/actions/runs/5472440199/jobs/9964675293?pr=3946#step:7:2870

=== Failed
=== FAIL: worker/runc TestRuncWorkerExec (0.85s)
    common.go:77: Stdout: hello
    common.go:78: Stderr: 
    common.go:114: Stdout: 
    common.go:115: Stderr: 
    common.go:116: 
        	Error Trace:	/src/worker/tests/common.go:116
        	Error:      	Received unexpected error:
        	            	exit code: 1
        	            	github.com/moby/buildkit/util/stack.Enable
        	            		/src/util/stack/stack.go:77
        	            	github.com/moby/buildkit/executor/runcexecutor.exitError
        	            		/src/executor/runcexecutor/executor.go:372
        	            	github.com/moby/buildkit/executor/runcexecutor.(*runcExecutor).Run
        	            		/src/executor/runcexecutor/executor.go:336
        	            	github.com/moby/buildkit/worker/tests.TestWorkerExec.func2
        	            		/src/worker/tests/common.go:87
        	            	golang.org/x/sync/errgroup.(*Group).Go.func1
        	            		/src/vendor/golang.org/x/sync/errgroup/errgroup.go:75
        	            	runtime.goexit
        	            		/usr/local/go/src/runtime/asm_amd64.s:1598
        	            	container gt4ph8x3w9ki0dxvcvq81xnda has exited with error
        	            	github.com/moby/buildkit/executor/runcexecutor.(*runcExecutor).Exec
        	            		/src/executor/runcexecutor/executor.go:407
        	            	github.com/moby/buildkit/worker/tests.TestWorkerExec
        	            		/src/worker/tests/common.go:107
        	            	github.com/moby/buildkit/worker/runc.TestRuncWorkerExec
        	            		/src/worker/runc/runc_test.go:218
        	            	testing.tRunner
        	            		/usr/local/go/src/testing/testing.go:1576
        	            	runtime.goexit
        	            		/usr/local/go/src/runtime/asm_amd64.s:1598
        	Test:       	TestRuncWorkerExec

Wonder if this is not related to the runner the test runs on.

@tonistiigi tonistiigi merged commit 5e24548 into moby:master Jul 7, 2023
56 checks passed
var output strings.Builder
for _, runeVal := range input {
// Only encode non-ASCII characters.
if runeVal > unicode.MaxASCII {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

		if runeVal > unicode.MaxASCII || runeVal == '%' {

would fix this case I think #3946 (comment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BuildKit can't COPY paths with non-ASCII characters
3 participants