Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement destination inference for sync of dockerfile artifacts [2/3] #2084

Merged
merged 12 commits into from Jun 7, 2019

Conversation

@corneliusweig
Copy link
Collaborator

corneliusweig commented May 7, 2019

The sync feature is getting refurbished, see design proposal #1844 .

This PR implements step 2 of the implementation plan for the sync improvements. Notable changes include:

  • adding a new method SyncMap to the builder interface. This new method should provide sync destinations for all relevant source files.
  • implementing SyncMap for docker artifacts.

Note that SyncMap is intentionally not wired up yet. The only way to test it is with skaffold diagnose. The speed of GetDependencies vs SyncMap (which do similar things) is almost the same. In a sample project with 4.7k files GetDependencies and SyncMap took ~30ms and ~32ms, respectively.

I tried to (reasonably) avoid code duplication. Yet, the directory walking code (pkg/skaffold/docker/dependencies.go and pkg/skaffold/docker/syncmap.go) is almost twice the same. Nevertheless, I think this is the right approach, because CustomArtifact also relies on docker.WalkWorkspace and this one is a little faster than docker.walkWorkspaceWithDestinations.

Possible issues

  1. Overlapping copies to the same destination are now resolved correctly. This was not the case in the original PR #1812. For example:

    COPY foo /foo
    COPY bar/ /   # bar contains a file called "foo"

    In that case /foo will have the content from bar/foo, as it is expected.

  2. Transitive copies across stages of multi-stage dockerfiles are not resolved. For example:

    FROM scratch
    ADD foo .
    FROM scratch
    COPY --from=0 foo bar

    The current implementation will resolve foo -> foo and not foo -> bar. As discussed in the design
    proposal #1844, this edge case will be ignored for now.

based-on #1847
Supersedes #1812
CC design shepherd @tejal29

@googlebot googlebot added the cla: yes label May 7, 2019
@corneliusweig corneliusweig force-pushed the corneliusweig:sync/step-2 branch from 91eea9c to cd79f32 May 7, 2019
@corneliusweig corneliusweig changed the title Implement destination inference for sync of dockerfile artifacts Implement destination inference for sync of dockerfile artifacts [2/3] May 7, 2019
@corneliusweig corneliusweig force-pushed the corneliusweig:sync/step-2 branch from cd79f32 to a24c70f May 10, 2019
@corneliusweig

This comment has been minimized.

Copy link
Collaborator Author

corneliusweig commented May 10, 2019

@balopat rebase is done

@codecov-io

This comment has been minimized.

Copy link

codecov-io commented May 10, 2019

Codecov Report

Merging #2084 into master will decrease coverage by <.01%.
The diff coverage is 68.98%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2084      +/-   ##
==========================================
- Coverage    58.7%   58.69%   -0.01%     
==========================================
  Files         188      191       +3     
  Lines        7849     8012     +163     
==========================================
+ Hits         4608     4703      +95     
- Misses       2873     2900      +27     
- Partials      368      409      +41
Impacted Files Coverage Δ
pkg/skaffold/runner/timings.go 14% <ø> (ø) ⬆️
pkg/skaffold/build/build.go 0% <0%> (ø)
pkg/skaffold/build/local/local.go 36.76% <0%> (-0.55%) ⬇️
pkg/skaffold/runner/diagnose.go 0% <0%> (ø) ⬆️
pkg/skaffold/build/gcb/types.go 0% <0%> (ø) ⬆️
pkg/skaffold/build/cluster/types.go 0% <0%> (ø) ⬆️
cmd/skaffold/app/cmd/deploy.go 68.75% <0%> (+4.46%) ⬆️
cmd/skaffold/app/cmd/run.go 60% <0%> (+6.15%) ⬆️
cmd/skaffold/app/cmd/delete.go 66.66% <0%> (+11.11%) ⬆️
cmd/skaffold/app/cmd/dev.go 23.91% <0%> (+1.18%) ⬆️
... and 53 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9935f9e...5d95a01. Read the comment docs.

@balopat balopat added kokoro:run and removed needs-rebase labels May 13, 2019
@kokoro-team kokoro-team removed the kokoro:run label May 13, 2019
Copy link
Member

dgageot left a comment

I'm ok with the design.
Because some functions are both moved and modified, I find it hard to review the code changes.

Is there a way to break this PR in two pieces. One that moves the code and one that actually adds the new feature?

if err != nil {
// if the context was cancelled act as if all is well
// TODO(dgageot): this should be even higher in the call chain.
if ctx.Err() == context.Canceled {

This comment has been minimized.

Copy link
@dgageot

dgageot May 16, 2019

Member

Can we fix that first to simplify the code here?

This comment has been minimized.

Copy link
@corneliusweig

corneliusweig May 16, 2019

Author Collaborator

I moved the checks to the outermost call site. Please check carefully :)

@@ -61,6 +60,11 @@ func (t *TestBench) TestDependencies() ([]string, error) { return n
func (t *TestBench) Dependencies() ([]string, error) { return nil, nil }
func (t *TestBench) Cleanup(ctx context.Context, out io.Writer) error { return nil }
func (t *TestBench) Prune(ctx context.Context, out io.Writer) error { return nil }

func (t *TestBench) SyncMap(ctx context.Context, artifact *latest.Artifact) (map[string][]string, error) {

This comment has been minimized.

Copy link
@dgageot

dgageot May 16, 2019

Member

This could be a oneliner

This comment has been minimized.

Copy link
@corneliusweig

corneliusweig May 16, 2019

Author Collaborator

goimports won't let me. I can only remove the blank lines

@@ -102,7 +102,7 @@ type Builder interface {
DependenciesForArtifact(ctx context.Context, artifact *latest.Artifact) ([]string, error)
// ADDED
SyncMap() (map[string][]string, error)
SyncMap(ctx context.Context, artifact *latest.Artifact) (map[string][]string, error)

This comment has been minimized.

Copy link
@dgageot

dgageot May 16, 2019

Member

should this return a syncMap? (Not sure it's better)

This comment has been minimized.

Copy link
@corneliusweig

corneliusweig May 16, 2019

Author Collaborator

@balopat had the same idea in pkg/skaffold/sync/sync.go. I'm also very unsure what is better. My main issue is that a type alias binds the using sites tighter to the place where type SyncMap is defined (probably builder.go?). Besides, I don't particularly like that there will be two SyncMap symbols in builder.go: 1 the function, 2 its return type.

Overall me feeling is slightly towards keeping it as is. On the other hand, the local type alias in sync.go makes somewhat sense, even though it cannot substitute all usages of map[string][]string due to visibility.

@@ -442,3 +383,10 @@ func hasMultiStageFlag(flags []string) bool {
}
return false
}

func changeDir(cur, to string) string {

This comment has been minimized.

Copy link
@dgageot

dgageot May 16, 2019

Member

It's hard to understand what this function is doing. It doesn't change the directory.

This comment has been minimized.

Copy link
@corneliusweig

corneliusweig May 16, 2019

Author Collaborator

I renamed it to resolveDir and added some godoc.

@@ -396,42 +349,30 @@ func retrieveImage(image string, insecureRegistries map[string]bool) (*v1.Config
return localDaemon.ConfigFile(context.Background(), image)
}

func processCopy(value *parser.Node, envs map[string]string) ([]string, error) {
var copied []string
func retrieveWorkingDir(tagged string, insecureRegistries map[string]bool) (string, error) {

This comment has been minimized.

Copy link
@dgageot

dgageot May 16, 2019

Member

There's a copy of this code in sync.go

This comment has been minimized.

Copy link
@corneliusweig

corneliusweig May 16, 2019

Author Collaborator

😓 I shamelessly copied from there. I think the best place for this will be in the docker package.

@corneliusweig corneliusweig force-pushed the corneliusweig:sync/step-2 branch from a24c70f to 610ff7a May 16, 2019
@corneliusweig

This comment has been minimized.

Copy link
Collaborator Author

corneliusweig commented May 16, 2019

@dgageot Thanks for taking this up so quickly after bringing it to your attention!

Because some functions are both moved and modified, I find it hard to review the code changes.

This is an artifact of a humongous rebase/squash of the original PR. Sorry for that 🙇 . As suggested, I squeezed in a pure move commit. This should make reviewing possible.

return nil, build.ErrSyncMapNotSupported{}
}

syncMap, err := docker.SyncMap(ctx, a.Workspace, a.DockerArtifact.DockerfilePath, a.DockerArtifact.BuildArgs, b.insecureRegistries)

This comment has been minimized.

Copy link
@nkubala

nkubala May 16, 2019

Member

can we not just return docker.SyncMap(....) here? I think it's more idiomatic to check if err != nil in the caller and then conditionally use the returned syncMap.

This comment has been minimized.

Copy link
@corneliusweig

corneliusweig May 16, 2019

Author Collaborator

oh yeah, sure! David made me pull other code out from there, but I missed that this can be further simplified..

// No local Docker is available
cf, err = RetrieveRemoteConfig(tagged, insecureRegistries)
} else {
cf, err = localDocker.ConfigFile(context.Background(), tagged)

This comment has been minimized.

Copy link
@nkubala

nkubala May 16, 2019

Member

I realized after making this comment you didn't write this code, but going to leave it here in case anyone has opinions on it:

do we not want to try and retrieve the image config from remote if we were able to get a local docker client, but couldn't find the image in the local daemon?

This comment has been minimized.

Copy link
@corneliusweig

corneliusweig May 16, 2019

Author Collaborator

good catch. Please have another look.

)

// NormalizeDockerfilePath returns the absolute path to the dockerfile.
func NormalizeDockerfilePath(context, dockerfile string) (string, error) {

This comment has been minimized.

Copy link
@nkubala

nkubala May 16, 2019

Member

this was copied from pkg/skaffold/docker/parse.go right? can we not move it to a common location (pkg/skaffold/docker/util/util.go?) and reuse it?

This comment has been minimized.

Copy link
@corneliusweig

corneliusweig May 16, 2019

Author Collaborator

@nkubala So the function was not copied but moved. It is therefore already reused.

I'm not so happy with moving it to util. Such packages/files quickly attract completely unrelated code. But if you feel strongly about it, I won't start to argue.

@@ -192,3 +193,11 @@ func SetUpLogs(out io.Writer, level string) error {
logrus.SetLevel(lvl)
return nil
}

func alwaysSucceedWhenCancelled(ctx context.Context, err error) error {

This comment has been minimized.

Copy link
@tejal29

tejal29 May 16, 2019

Member

is this related to sync destination inference?

This comment has been minimized.

Copy link
@corneliusweig

corneliusweig May 16, 2019

Author Collaborator

Only indirectly. @dgageot recommended to resolve an earlier todo item in order to simplify the code (#2084 (comment)).

Would you prefer if this change was submitted as a separate PR?

@@ -58,6 +58,21 @@ func (r *SkaffoldRunner) DiagnoseArtifacts(out io.Writer) error {
fmt.Fprintln(out, " - Dependencies:", len(deps), "files")
fmt.Fprintf(out, " - Time to list dependencies: %v (2nd time: %v)\n", timeDeps1, timeDeps2)

timeSyncMap1, err := timeToConstructSyncMap(ctx, r.Builder, artifact)

This comment has been minimized.

Copy link
@tejal29

tejal29 May 16, 2019

Member

why do we do this 2 times?
Also, is there a way to add tests for the L63 to L65?

This comment has been minimized.

Copy link
@corneliusweig

corneliusweig May 16, 2019

Author Collaborator

I think it's run twice so that the OS can warm the inode cache. The second run should normally be more representative.

Since there was no test for diagnose at all, I didn't bother adding one. Since this is creating (random) timings, the best we can achieve here is a smoke test. Do you feel strongly about it?

This comment has been minimized.

Copy link
@tejal29

tejal29 May 16, 2019

Member

i think the runtime object inspection could refactored and unit tested to make sure we dont' break this in future. But i dont feel strongly regarding it.

if _, isNotSupported := err.(build.ErrSyncMapNotSupported); !isNotSupported {
				return errors.Wrap(err, "construct artifact dependencies")
}

This comment has been minimized.

Copy link
@corneliusweig

corneliusweig May 20, 2019

Author Collaborator

@tejal29 Can you explain a little more how you would like that code to be refactored?

I opened #2157 for some smoke tests of the diagnose subcommand. I think this has advantages over unit testing here, because we would have to set up a lot of mock objects for unit tests.

@balopat balopat added the kokoro:run label May 22, 2019
@kokoro-team kokoro-team removed the kokoro:run label May 22, 2019
@corneliusweig corneliusweig force-pushed the corneliusweig:sync/step-2 branch from 953744c to a4e8318 May 23, 2019
@corneliusweig

This comment has been minimized.

Copy link
Collaborator Author

corneliusweig commented May 27, 2019

@dgageot At you convenience could you take another look?

@dgageot

This comment has been minimized.

Copy link
Member

dgageot commented Jun 6, 2019

@corneliusweig Could you rebase this PR? I'll then take a look at it to make sure it's merged quickly

@corneliusweig corneliusweig force-pushed the corneliusweig:sync/step-2 branch from 2bf304b to 232d119 Jun 6, 2019
@corneliusweig

This comment has been minimized.

Copy link
Collaborator Author

corneliusweig commented Jun 6, 2019

@dgageot Thanks! Rebase is done.

@dgageot
dgageot approved these changes Jun 6, 2019
@dgageot dgageot added the kokoro:run label Jun 6, 2019
@kokoro-team kokoro-team removed the kokoro:run label Jun 6, 2019
@dgageot

This comment has been minimized.

Copy link
Member

dgageot commented Jun 6, 2019

@corneliusweig sorry, I LGTMed but it needs rebase again

corneliusweig added 12 commits Feb 8, 2019
Signed-off-by: Cornelius Weig <22861411+corneliusweig@users.noreply.github.com>
Before, ONBUILD instructions were collected over the whole Dockerfile and prepended to all instructions. This has drawbacks when the sequence of commands matters, for example onbuild WORKDIR instructions.
This prepares for extracting the workdir during parsing.

Signed-off-by: Cornelius Weig <22861411+corneliusweig@users.noreply.github.com>
Signed-off-by: Cornelius Weig <22861411+corneliusweig@users.noreply.github.com>
Signed-off-by: Cornelius Weig <22861411+corneliusweig@users.noreply.github.com>
Ignore when the context was cancelled

Signed-off-by: Cornelius Weig <22861411+corneliusweig@users.noreply.github.com>
Preserve order of copy commands

Refactor to preserve order of copy commands

Add godoc, improve naming

Do not forcibly include the Dockerfile

Do not forcibly exclude `.dockerignore`

Disambiguate overwrites of the same destination location

Only COPY/ADD into the latest image are syncable

Reduce code duplication and reorganize code

- rely on a single implementation to extract the COPY/ADD commands from the Dockerfile
- introduce more helper functions to reduce code duplication

Signed-off-by: Cornelius Weig <22861411+corneliusweig@users.noreply.github.com>
Signed-off-by: Cornelius Weig <22861411+corneliusweig@users.noreply.github.com>
Note: not all occurrences of map[string][]string can replaced, because syncMap's visibility is restricted
Signed-off-by: Cornelius Weig <22861411+corneliusweig@users.noreply.github.com>
Signed-off-by: Cornelius Weig <22861411+corneliusweig@users.noreply.github.com>
Before, this was done on a much smaller scope in GetDependencies. Now this check is performed in the outermost call site, so that more places are covered.

Signed-off-by: Cornelius Weig <22861411+corneliusweig@users.noreply.github.com>
Signed-off-by: Cornelius Weig <22861411+corneliusweig@users.noreply.github.com>
@corneliusweig corneliusweig force-pushed the corneliusweig:sync/step-2 branch from 232d119 to 5d95a01 Jun 6, 2019
@corneliusweig

This comment has been minimized.

Copy link
Collaborator Author

corneliusweig commented Jun 6, 2019

@dgageot no problem, the rebase is done already. Thanks for seeing this through!

@tejal29 tejal29 added the kokoro:run label Jun 6, 2019
@kokoro-team kokoro-team removed the kokoro:run label Jun 6, 2019
@dgageot dgageot dismissed nkubala’s stale review Jun 7, 2019

your review was taken into account

@dgageot dgageot merged commit 6cb1d51 into GoogleContainerTools:master Jun 7, 2019
3 checks passed
3 checks passed
cla/google All necessary CLAs are signed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
kokoro CI build successful.
Details
@corneliusweig corneliusweig deleted the corneliusweig:sync/step-2 branch Jun 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
8 participants
You can’t perform that action at this time.