Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --chown flag to ADD/COPY commands #34263

Merged
merged 2 commits into from Aug 28, 2017
Merged

Conversation

@estesp
Copy link
Contributor

@estesp estesp commented Jul 26, 2017

Carry #28499 (which was a carry of #27303)

This adds --chown function to the ADD and COPY commands in Dockerfile. The format of the --chown flag allows any uid/username and gid/groupname combination separated by a colon, e.g.:

--chown=someuser:555
--chown=anyuser:anygroup
--chown=1001:1002
--chown=333:agroupname

Lookups (to translate name -> ID integer) will use the /etc/passwd and /etc/group files found in the container filesystem. Currently if they do not exist, then the docker build will fail. This PR probably needs a way to fail gracefully (or ignore failures) on Windows as those containers will not have these files. If only numeric IDs are used, no lookups will be performed.

User namespaces are supported and IDs will be shifted (after translated names to integers) based on the user namespace mappings set in the daemon.

I also think a few more tests are necessary to fully test the behavior, but wanted to get the rebased code available for review initially.

Rebased by @estesp

Signed-off-by: Kara Alexandra <kalexandra@us.ibm.com>
Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com>
@@ -56,6 +56,7 @@ type copyInstruction struct {
cmdName string
infos []copyInfo
dest string
flChown string

This comment has been minimized.

@shouze

shouze Jul 26, 2017
Contributor

does fl stands for flag?

This comment has been minimized.

@estesp

estesp Jul 26, 2017
Author Contributor

yes, used commonly in the codebase; although in this struct the flag value is being stored, so toss up whether I should be using the fl prefix at this point :)

This comment has been minimized.

@dnephin

dnephin Jul 26, 2017
Member

Ya, maybe this should be user or something like that?

This comment has been minimized.

@shouze

shouze Jul 26, 2017
Contributor

@dnephin could be a string serialized uid:gid id pair, maybe it worth to parse the chown option earluer & split into uid & gid in the struct in place of a single string?

This comment has been minimized.

@shouze

shouze Jul 26, 2017
Contributor

If yes could be named destUid & destGid? Or simply uid & gid of course.

This comment has been minimized.

@dnephin

dnephin Jul 26, 2017
Member

We have an object for that, idtools.IDPair

@CWSpear
Copy link
Contributor

@CWSpear CWSpear commented Jul 27, 2017

This was quite a journey. I Googled my issue (the whole "having to chown" stuff) and found an issue from 2 years ago.

Then I had to keep reading through massive threads, with both Docker contributors and the community supporting this, but kept having to pick out "continuing this thread [here]," and about 7 or 8 threads later, I find this, which is obviously a very recent PR.

So... where are we with all this? :)

@estesp
Copy link
Contributor Author

@estesp estesp commented Jul 27, 2017

Hopefully as close as it has ever been to reality! With the maintainers discussion and comments (like this one from issue #30110 I would expect it is just a matter of review and cleaning up a few things with my implementation.

Given @justincormack wants it, I assume he will be a huge supporter and reviewer of my PR, right Justin! =)

return false
})
if err != nil {
return -1, err

This comment has been minimized.

@justincormack

justincormack Jul 28, 2017
Contributor

Minor thing, but it is kind of odd returning -1 here, as -1 in chown specifically means "do not chown this part" eg so you can chown a user but not a group, but you are not actually using that, so returning 0 ie the empty int is probably clearer. I had to double check you were not using the value regardless of the error code...

This comment has been minimized.

@estesp

estesp Jul 28, 2017
Author Contributor

Sure; was somewhat of a weird paranoia considering some future caller assuming that the translation was a valid uid/gid (0) by ignoring the err, but I agree that there is no actual valid reason to pass -1 given the err.

RUN echo 'test2:x:1002:' >> /etc/group
ADD --chown=test1:1002 . /new_dir
RUN ls -l /
RUN [ $(ls -l / | grep new_dir | awk '{print $3":"$4}') = 'test1:test2' ]`)),

This comment has been minimized.

@justincormack

justincormack Jul 28, 2017
Contributor

I would be inclined to add a test that does ls -ln to check the numeric values too as well as the names...

This comment has been minimized.

@estesp

estesp Jul 28, 2017
Author Contributor

sounds good

@justincormack
Copy link
Contributor

@justincormack justincormack commented Jul 28, 2017

Some minor comments but otherwise looks good. I think it is ok to fail on named users on Windows for now, maybe have a test for platform rather than trying to look for /etc/passwd.

// to the requested --chown flags as provided to the original command
if err := os.Lchown(dest, chownPair.UID, chownPair.GID); err != nil {
return errors.Wrapf(err, "failed to chown target directory")
}

This comment has been minimized.

@dnephin

dnephin Jul 28, 2017
Member

This (and the isExistingDirectory() call above) is unnnecessary. It's already handled by fixPermissions()

This comment has been minimized.

@estesp

estesp Jul 28, 2017
Author Contributor

except that it doesn't work for the chown case. I took the longest time trying to decide it if was worth adding complexity to fixPermissions() or not. The case that doesn't work is when the new directory is actually created by archive.CopyWithTar, so when checked for existence in fixPermissions, it does exist, so the wrong codepath is taken for the chown case. What I wanted was to be able to make an archiver with ChownOpts just for this ADD or COPY command, but after spending time in that code, it isn't possible without significant changes to add TarOptions cascading through a bunch of functions that don't take it today.

This comment has been minimized.

@dnephin

dnephin Jul 28, 2017
Member

Ah that makes sense, which means this is probably a bug introduced when this was moved out of daemon.

I think the logic should still be in fixPermission(). We can pass in the argument as a param, which is what the old code did I believe.

@@ -56,6 +56,7 @@ type copyInstruction struct {
cmdName string
infos []copyInfo
dest string
flChown string

This comment has been minimized.

@dnephin

dnephin Jul 28, 2017
Member

s/flChown/idPair/ or something like that. At this point I don't think it's relevant that this came from a flag.

@@ -157,6 +158,7 @@ func add(req dispatchRequest) error {
if err != nil {
return err
}
copyInstruction.flChown = flChown.Value

This comment has been minimized.

@dnephin

dnephin Jul 28, 2017
Member

Are we sure we want to add this to ADD as well? We only have --from on COPY, and I think that's by design.

I thought ADD was kind of being deprecated/frozen.

This comment has been minimized.

@estesp

estesp Jul 28, 2017
Author Contributor

Both prior PRs were adding the same feature to both so I simply followed suit. I didn't see anything in recent discussions narrowing the focus, but I'm willing to hear arguments either way.

This comment has been minimized.

@thaJeztah

thaJeztah Aug 18, 2017
Member

I think we should add it to both (even though we don't want to make ADD more magical); if we don't add it to ADD, people using it for adding remote files won't be helped by this PR.

This comment has been minimized.

@dnephin

dnephin Aug 18, 2017
Member

Shouldn't people using ADD for remote files be converting things to RUN curl + COPY anyway? So by not adding it to ADD we're helping nudge people in the right direction. It doesn't really make sense to have two almost identical instructions.

@@ -187,6 +190,7 @@ func dispatchCopy(req dispatchRequest) error {
if err != nil {
return err
}
copyInstruction.flChown = flChown.Value

This comment has been minimized.

@dnephin

dnephin Jul 28, 2017
Member

I think we should parse the Value here so we can fail fast when it's wrong. That way we can store an IDPair in copyInstruction as well.

This comment has been minimized.

@estesp

estesp Jul 28, 2017
Author Contributor

we have to be at a spot where we have the mount information for the container fs.. hence where it happens in this PR. Otherwise there is no source of truth to translate name -> IDs. I can look again and make sure I do that translation as early as we have that info.

if inst.flChown != "" {
commentStr = fmt.Sprintf("%s --chown=%s %s in %s ", inst.cmdName, inst.flChown, srcHash, inst.dest)
} else {
commentStr = fmt.Sprintf("%s %s in %s ", inst.cmdName, srcHash, inst.dest)

This comment has been minimized.

@dnephin

dnephin Jul 28, 2017
Member

minor nit, these lines are mostly the same so this could just be a variable for chownComment and use %s %s%s %s in %s as the format.

Could even be a separate function to keep it from cluttering the logic in performCopy()

This comment has been minimized.

@estesp

estesp Jul 28, 2017
Author Contributor

I should have known this comment was coming! :) Even though I factored it out further per your comments on prior PRs for this, there is more trimming possible for sure :)

if err != nil {
return errors.Wrapf(err, "unable to convert uid/gid to host mapping")
}
}

This comment has been minimized.

@dnephin

dnephin Jul 28, 2017
Member

Please extract this whole block (parsing the uid/gid pair) into a new function, and maybe call that function from dispatchCopy() instead of here.

return parts[0], parts[1]
}

func lookupUser(userStr, filepath string) (int, error) {

This comment has been minimized.

@dnephin

dnephin Jul 28, 2017
Member

I think this already exists as idtools.LookupUser() / LookupUID()? Can we re-use either of those?

This comment has been minimized.

@estesp

estesp Jul 28, 2017
Author Contributor

No, per my prior comment

return users[0].Uid, nil
}

func lookupGroup(groupStr, filepath string) (int, error) {

This comment has been minimized.

@dnephin

dnephin Jul 28, 2017
Member

I think this already exists as idtools.LookupGroup() / LookupGID()? Can we re-use either of those?

This comment has been minimized.

@estesp

estesp Jul 28, 2017
Author Contributor

No

@@ -137,6 +171,53 @@ func (b *Builder) performCopy(state *dispatchState, inst copyInstruction) error
return b.exportImage(state, imageMount, runConfigWithCommentCmd)
}

func parseChownFlag(flChown string) (string, string) {

This comment has been minimized.

@dnephin

dnephin Jul 28, 2017
Member

I'm surprised this doesn't exist in pkg/idtools. I think it should probably move to idtools.ParseIDPair(string) (IDPair, error)

This comment has been minimized.

@estesp

estesp Jul 28, 2017
Author Contributor

because the context is quite different and requires different knowledge in each situation where it is used. pkg/idtools only considers the case where the host /etc/{passwd,group} files are used for translation, as well as possible use of getent. That would be totally invalid for this case, where we need the container files to be used.

This comment has been minimized.

@dnephin

dnephin Jul 28, 2017
Member

Could we refactor it further to share more of the code?

This comment has been minimized.

@shouze

shouze Jul 28, 2017
Contributor

would need to introduce some context yup, but at the price of complexity?

This comment has been minimized.

@dnephin

dnephin Jul 28, 2017
Member

I don't think the complexity changes much, it's just being moved somewhere more appropriate, so there's less duplicate code to contain.

@dnephin
Copy link
Member

@dnephin dnephin commented Jul 28, 2017

Looks like windows failures are real because of lchown. I think maybe fixed by removing the code I suggested is not necessary.

@estesp
Copy link
Contributor Author

@estesp estesp commented Jul 28, 2017

Yes, the Windows failures are real--my delay in properly handling here yet is on the design point of whether to fail or silently ignore; seems ignore (by using some sort of _windows.go variation of the lookups/chown codepaths) is probably better if you can see a case for Dockerfile sharing, but that seems unlikely in lots of cases anyway. Do we want to fail on a --chown flag on ADD/COPY on Windows?

@shouze
Copy link
Contributor

@shouze shouze commented Jul 31, 2017

@estesp @dnephin can/will this PR fix #32816?

@estesp estesp force-pushed the estesp:chown-flag-add-copy branch from 6c11df6 to 8f9561c Aug 10, 2017
@estesp estesp requested a review from tonistiigi as a code owner Aug 10, 2017
@estesp estesp force-pushed the estesp:chown-flag-add-copy branch from 8f9561c to 1996f01 Aug 10, 2017
@estesp
Copy link
Contributor Author

@estesp estesp commented Aug 10, 2017

@dnephin and others; PR has been updated. A bit of vacation delay, sorry :)

At this point I believe the only review comments not handled is the recommended consolidation of pkg/idtools parsing of host /etc/{passwd,group} files and the new parsing here of container specific /etc/ files.

After looking through the idtools package use of the opencontainers/runc/libcontainer/user package and this new code's use, I don't see any effective way of combining these without simply adding complexity. In essence, the user package offers two paths:

  1. the "easy" default path of letting that package read host default files; and
  2. the more complex path of providing that code filepaths or open readers along with a custom filter function to find a specific entry.

The idtools package uses the "easy" path, because that fits the need of that package, and this new PR code uses the other path, because we want to reference specific paths inside the container we're using during build. Unless we force idtools to take the more complex path, and basically rewrite the "easy path" as a consumer of the functions already in that user package. Either way we end up duplicating code by copying functions that currently exist in user. I believe the way it is now, we have two consumers of the user package from libcontainer; they just use two different layers of that functionality.

@g13013
Copy link

@g13013 g13013 commented Aug 13, 2017

Any news on that ?

// if no group specified, use the user spec as group as well
userStr, grpStr = parts[0], parts[0]
}
userStr, grpStr = parts[0], parts[1]

This comment has been minimized.

@dnephin

dnephin Aug 15, 2017
Member

This line will will panic when len(parts) == 1 , it's not in an else.

This function should have unit tests.

@@ -602,6 +602,36 @@ RUN [ $(cat "/test dir/test_file6") = 'test6' ]`, command, command, command, com
}
}

func (s *DockerSuite) TestBuildChownFlagAdd(c *check.C) {

This comment has been minimized.

@dnephin

dnephin Aug 15, 2017
Member

I'm still not convinced we want to add this to ADD.

We definitely don't need two slow integration tests for this. The code path is 99% the same. Let's just have a single one for COPY. Even better would be to add it to an existing integration test that does COPY, but I'm not sure if we have one in the API suite yet.

integration-cli is deprecated, so these should be API tests. We just merged integration/ so there isn't a lot of work there yet, but at least this should be in docker_api_build_test.go

return chownPair, nil
}

func lookupUser(userStr, filepath string) (int, error) {

This comment has been minimized.

@dnephin

dnephin Aug 15, 2017
Member

This (and lookupGroup) should have unit tests for each case

@dnephin
Copy link
Member

@dnephin dnephin commented Aug 15, 2017

I believe the way it is now, we have two consumers of the user package from libcontainer; they just use two different layers of that functionality.

Thanks for looking into it

@estesp estesp force-pushed the estesp:chown-flag-add-copy branch from 1996f01 to a0885ff Aug 16, 2017
@estesp
Copy link
Contributor Author

@estesp estesp commented Aug 16, 2017

@dnephin bugs created during the re-work of those functions are fixed and unit tests added that test all formats of the input string and with/without userns mappings. This tests the parseChownFlag and also the lookupUser/lookupGroup functions. Let me know if you see any holes there.

I understand the comments re: testing, but as you note integration is barely started, and I don't see any movement of cli_* content to api_* yet. I could see potentially adding this to build API, but that will take some extra work as there is very poor scaffolding for standard use of test files, etc. Before I do that can we resolve the ADD and COPY discussion? I wasn't in the maintainer's meeting where it was recommended this work proceed, so I haven't been given any clue as to whether there as a debate around just COPY vs. COPY + ADD. If you mean only one test given the code paths are very similar, that makes sense to me.

@dnephin
Copy link
Member

@dnephin dnephin commented Aug 17, 2017

here is very poor scaffolding for standard use of test files

Not true. All the same scaffolding works for the api tests. See https://github.com/moby/moby/blob/master/integration-cli/docker_api_build_test.go#L288-L306

I don't see any movement of cli_* content to api_*

There should be. Every PR I have seen has called out to change the cli test to an api test

@mwarkentin
Copy link

@mwarkentin mwarkentin commented Oct 26, 2017

@vincentwoo yes, it’s working in docker 17.09-ce.

@thaJeztah
Copy link
Member

@thaJeztah thaJeztah commented Oct 26, 2017

See docker/cli#467 for the docs; that PR was a bit delayed 😅

@vipseixas
Copy link

@vipseixas vipseixas commented Nov 13, 2017

--chown does not work for compressed files (eg: tgz), it seems to only change the owner of the compressed file itself and then ignore the extracted content.

Is it the intended behavior?

@thaJeztah
Copy link
Member

@thaJeztah thaJeztah commented Nov 13, 2017

@vipseixas I'd say; yes, because it'll only change permissions on the file you add. Archives themselves can have permissions set as well, which are preserved (if I'm not mistaken).

Given that automatically extracting archives only works for local files; can you explain your use-case?
i.e., instead of compressing local files, adding them to the image, and then decompressing, why not add the uncompressed files directly?

@vipseixas
Copy link

@vipseixas vipseixas commented Nov 13, 2017

@thaJeztah I have a 3rd party installation (PHP code) that comes as a 180MB tar.gz file and I have to open it inside Apache's /var/www. The uncompressed files grows to ~570MB so it would be easier to store the compressed file with my Dockerfile. But if I ADD the compressed file and use a "RUN chown" I end up with 2 huge layers in my image.

@derikson
Copy link

@derikson derikson commented Nov 13, 2017

@vipseixas as a workaround, you could use multi-stage build to extract the files in one build stage, then COPY with chown in the second stage. This would prevent the final image from having the extra layer.

@vipseixas
Copy link

@vipseixas vipseixas commented Nov 13, 2017

@derikson Your suggestion worked perfectly! I was not aware of this new feature, thanx.

@marius311
Copy link

@marius311 marius311 commented Nov 17, 2017

This is awesome, thanks guys! Probably one of my most desired features for years.

At the risk of being one those never-satisfied Docker users though :) ...

@thaJeztah:

It's not a bug; variable expansion in options is not supported at this moment

Is this planned at all and if so is there an issue we could track this at?

@thaJeztah
Copy link
Member

@thaJeztah thaJeztah commented Nov 17, 2017

Is this planned at all and if so is there an issue we could track this at?

A feature request for that was opened here: #35018 (but not decided on yet 😅)

@khatribharat
Copy link

@khatribharat khatribharat commented Dec 27, 2017

This isn't yet reflected in the Dockerfile doc

@thaJeztah
Copy link
Member

@thaJeztah thaJeztah commented Dec 27, 2017

@khatribharat docs took some time to get merged (see docker/cli#467), but will be updated once Docker 17.12 has been released

@thaJeztah thaJeztah mentioned this pull request Jan 26, 2018
aglorei added a commit to aglorei/pokemon_battle_rails that referenced this pull request Feb 2, 2018
Ever since moby/moby/pull/34263 became merged, there's no need to
shuffle users in order to get the correct ownership on non-root users
sinhyeok-cp added a commit to EOSYS-io/eoshub.io that referenced this pull request Jun 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.