Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LCOW: lazycontext: Use correct lstat, fix archive check #37356

Merged
merged 1 commit into from
Jun 28, 2018

Conversation

lowenna
Copy link
Member

@lowenna lowenna commented Jun 27, 2018

Signed-off-by: John Howard jhoward@microsoft.com

Fixes #36793. Replacement for #37316

First - thanks @yusuf-gunaydin for debugging and coming up with a fix. Yours was so very close :)

Here's the output using the same files as described in #36793 with the updated fix:

PS E:\docker\build\36793> docker build --platform=linux -t testcachelinux 1
Sending build context to Docker daemon  2.048kB
Step 1/2 : FROM alpine
latest: Pulling from library/alpine
ff3a5c916c92: Pull complete
Digest: sha256:e1871801d30885a610511c867de0d6baca7ed4e6a2573d506bbec7fd3b03873f
Status: Downloaded newer image for alpine:latest
 ---> 3fd9065eaf02
Step 2/2 : RUN echo 1 > /test.txt
 ---> Running in 260838f99d98
Removing intermediate container 260838f99d98
 ---> e09ed90c53b3
Successfully built e09ed90c53b3
Successfully tagged testcachelinux:latest
PS E:\docker\build\36793> docker build --platform=linux -t testusecachelinux 3
Sending build context to Docker daemon  2.048kB
Step 1/3 : FROM alpine
 ---> 3fd9065eaf02
Step 2/3 : COPY --from=testcachelinux /test.txt /t.txt
 ---> a68b85ea733c
Step 3/3 : RUN cat /t.txt
 ---> Running in 49834d78be86
1
Removing intermediate container 49834d78be86
 ---> be1ac7fc3569
Successfully built be1ac7fc3569
Successfully tagged testusecachelinux:latest
PS E:\docker\build\36793> docker build --platform=linux -t testcachelinux 2
Sending build context to Docker daemon  2.048kB
Step 1/2 : FROM alpine
 ---> 3fd9065eaf02
Step 2/2 : RUN echo 2 > /test.txt
 ---> Running in 004c92e53c89
Removing intermediate container 004c92e53c89
 ---> c6ed3a426e67
Successfully built c6ed3a426e67
Successfully tagged testcachelinux:latest
PS E:\docker\build\36793> docker build --platform=linux -t testusecachelinux 3
Sending build context to Docker daemon  2.048kB
Step 1/3 : FROM alpine
 ---> 3fd9065eaf02
Step 2/3 : COPY --from=testcachelinux /test.txt /t.txt
 ---> a02aa0e250d1
Step 3/3 : RUN cat /t.txt
 ---> Running in fa75974f48e8
2
Removing intermediate container fa75974f48e8
 ---> bce0113cecf4
Successfully built bce0113cecf4
Successfully tagged testusecachelinux:latest
PS E:\docker\build\36793>

Note on the last build step:

  • Step 2/3 correctly does NOT use the cache.
  • Step 3/3 correctly outputs 2 and tags bce0113cef4 correctly.

During debugging this (happened to be building the Microsoft/opengcs repo as that exercises LCOW code quite well in lieu of format CI yet), as mentioned in the issue, I noticed that as the Lstat call is now being made, and can succeed, there's an incorrect assertion in generating the hash (prepareHash function) when it calls into the archiver to Canonicalise the path, which causes build failures to occur on a step such as COPY --from=runc /usr/bin/runc /usr/bin. Have removed the incorrect assertion, updated the function to no longer return an error, and updated unit tests/call-sites not to handle the error there.

@lowenna
Copy link
Member Author

lowenna commented Jun 27, 2018

@johnstep PTAL

@lowenna
Copy link
Member Author

lowenna commented Jun 27, 2018

@cpuguy83 This addresses your feedback on the original PR: #37316 (comment)

return "", fmt.Errorf("Windows path contains forward slash: %s", p)
}
return strings.Replace(p, string(os.PathSeparator), "/", -1), nil
func CanonicalTarNameForPath(p string) string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems same as filepath.ToSlash()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha, yes, indeed. Not sure how long it's been like that. Will update.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed update

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought just removing this function completely but that can be a follow-up.

@codecov
Copy link

codecov bot commented Jun 27, 2018

Codecov Report

Merging #37356 into master will increase coverage by 0.01%.
The diff coverage is 85.71%.

@@            Coverage Diff             @@
##           master   #37356      +/-   ##
==========================================
+ Coverage   34.83%   34.85%   +0.01%     
==========================================
  Files         610      610              
  Lines       44986    44981       -5     
==========================================
+ Hits        15672    15677       +5     
+ Misses      27201    27196       -5     
+ Partials     2113     2108       -5

}{
{"foo", "foo", false},
{"foo/bar", "___", true}, // unix-styled windows path must fail
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we keep this case?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cpuguy83 Is is a desired behavior for these paths to fail? I would think not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not fail but stay the same at least.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, added back in and updated to check they stay the same.

Signed-off-by: John Howard <jhoward@microsoft.com>
@tonistiigi
Copy link
Member

LGTM

Copy link
Member

@cpuguy83 cpuguy83 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lowenna
Copy link
Member Author

lowenna commented Jun 27, 2018

@thaJeztah Is this a known failure in experimental and Janky - both failed the same, but restarted:

01:12:21.330 PASS: docker_api_swarm_test.go:201: DockerSwarmSuite.TestAPISwarmPromoteDemote	6.708s
01:13:30.804 
01:13:30.804 ----------------------------------------------------------------------
01:13:30.805 FAIL: check_test.go:347: DockerSwarmSuite.TearDownTest
01:13:30.805 
01:13:30.805 check_test.go:352:
01:13:30.805     d.Stop(c)
01:13:30.805 /go/src/github.com/docker/docker/internal/test/daemon/daemon.go:401:
01:13:30.805     t.Fatalf("Error while stopping the daemon %s : %v", d.id, err)
01:13:30.805 ... Error: Error while stopping the daemon df4a0aa9ac2f8 : exit status 130
01:13:30.805 
01:13:30.805 
01:13:30.805 ----------------------------------------------------------------------
01:13:30.805 PANIC: docker_api_swarm_test.go:359: DockerSwarmSuite.TestAPISwarmRaftQuorum
01:13:30.806 
01:13:30.806 [df4a0aa9ac2f8] waiting for daemon to start
01:13:30.806 [df4a0aa9ac2f8] daemon started
01:13:30.806 
01:13:30.806 [dda6fed517a8f] waiting for daemon to start
01:13:30.806 [dda6fed517a8f] daemon started
01:13:30.806 
01:13:30.806 [da0cd054b9b3e] waiting for daemon to start
01:13:30.806 [da0cd054b9b3e] daemon started
01:13:30.806 
01:13:30.806 [dda6fed517a8f] exiting daemon
01:13:30.806 [da0cd054b9b3e] exiting daemon
01:13:30.806 [dda6fed517a8f] waiting for daemon to start
01:13:30.806 [dda6fed517a8f] daemon started
01:13:30.806 
01:13:30.806 [df4a0aa9ac2f8] daemon started
01:13:30.806 Attempt #2: daemon is still running with pid 12513
01:13:30.806 Attempt #3: daemon is still running with pid 12513
01:13:30.807 Attempt #4: daemon is still running with pid 12513
01:13:30.807 [df4a0aa9ac2f8] exiting daemon
01:13:30.807 ... Panic: Fixture has panicked (see related PANIC)
01:13:30.807 
01:13:30.807 ----------------------------------------------------------------------

@thaJeztah
Copy link
Member

Janky https://jenkins.dockerproject.org/job/Docker-PRs/49887/console failing on a flaky test; tracked through #36501

02:12:32 FAIL: docker_api_swarm_service_test.go:33: DockerSwarmSuite.TestAPIServiceUpdatePort
02:12:32 
02:12:32 [d7565f4084109] waiting for daemon to start
02:12:32 [d7565f4084109] daemon started
02:12:32 
02:12:32 docker_api_swarm_service_test.go:39:
02:12:32     waitAndAssert(c, defaultReconciliationTimeout, d.CheckActiveContainerCount, checker.Equals, 1)
02:12:32 docker_utils_test.go:435:
02:12:32     c.Assert(v, checker, args...)
02:12:32 ... obtained int = 0
02:12:32 ... expected int = 1
02:12:32 
02:12:32 [d7565f4084109] exiting daemon


02:37:31 FAIL: docker_cli_swarm_test.go:1605: DockerSwarmSuite.TestSwarmPublishDuplicatePorts
02:37:31 
02:37:31 [d46e0af83d092] waiting for daemon to start
02:37:31 [d46e0af83d092] daemon started
02:37:31 
02:37:31 docker_cli_swarm_test.go:1613:
02:37:31     // make sure task has been deployed.
02:37:31     waitAndAssert(c, defaultReconciliationTimeout, d.CheckActiveContainerCount, checker.Equals, 1)
02:37:31 docker_utils_test.go:435:
02:37:31     c.Assert(v, checker, args...)
02:37:31 ... obtained int = 0
02:37:31 ... expected int = 1
02:37:31 
02:37:31 [d46e0af83d092] exiting daemon

Experimental https://jenkins.dockerproject.org/job/Docker-PRs-experimental/41137 also failing on a flaky test (tracked through #37246)

01:58:43 FAIL: docker_cli_daemon_test.go:1490: DockerDaemonSuite.TestRunContainerWithBridgeNone
01:58:43 
01:58:43 [dac3baa3784d0] waiting for daemon to start
01:58:43 [dac3baa3784d0] daemon started
01:58:43 
01:58:43 docker_cli_daemon_test.go:1514:
01:58:43     c.Assert(out, check.Equals, fmt.Sprintf("%s", stdout),
01:58:43         check.Commentf("The network interfaces in container should be the same with host when --net=host when bridge network is disabled: %s", out))

@thaJeztah
Copy link
Member

None of those failures are related in any way, so I'll go ahead and merge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/builder area/lcow Issues and PR's related to the experimental LCOW feature platform/windows status/4-merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

LCOW builder uses wrong cache layer.
5 participants