Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --chown flag to ADD/COPY commands #34263

Merged
merged 2 commits into from
Aug 28, 2017

Conversation

estesp
Copy link
Contributor

@estesp estesp commented Jul 26, 2017

Carry #28499 (which was a carry of #27303)

This adds --chown function to the ADD and COPY commands in Dockerfile. The format of the --chown flag allows any uid/username and gid/groupname combination separated by a colon, e.g.:

--chown=someuser:555
--chown=anyuser:anygroup
--chown=1001:1002
--chown=333:agroupname

Lookups (to translate name -> ID integer) will use the /etc/passwd and /etc/group files found in the container filesystem. Currently if they do not exist, then the docker build will fail. This PR probably needs a way to fail gracefully (or ignore failures) on Windows as those containers will not have these files. If only numeric IDs are used, no lookups will be performed.

User namespaces are supported and IDs will be shifted (after translated names to integers) based on the user namespace mappings set in the daemon.

I also think a few more tests are necessary to fully test the behavior, but wanted to get the rebased code available for review initially.

Rebased by @estesp

Signed-off-by: Kara Alexandra <kalexandra@us.ibm.com>
Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com>
@@ -56,6 +56,7 @@ type copyInstruction struct {
cmdName string
infos []copyInfo
dest string
flChown string
Copy link
Contributor

@shouze shouze Jul 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does fl stands for flag?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, used commonly in the codebase; although in this struct the flag value is being stored, so toss up whether I should be using the fl prefix at this point :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya, maybe this should be user or something like that?

Copy link
Contributor

@shouze shouze Jul 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dnephin could be a string serialized uid:gid id pair, maybe it worth to parse the chown option earluer & split into uid & gid in the struct in place of a single string?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If yes could be named destUid & destGid? Or simply uid & gid of course.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have an object for that, idtools.IDPair

@CWSpear
Copy link
Contributor

CWSpear commented Jul 27, 2017

This was quite a journey. I Googled my issue (the whole "having to chown" stuff) and found an issue from 2 years ago.

Then I had to keep reading through massive threads, with both Docker contributors and the community supporting this, but kept having to pick out "continuing this thread [here]," and about 7 or 8 threads later, I find this, which is obviously a very recent PR.

So... where are we with all this? :)

@estesp
Copy link
Contributor Author

estesp commented Jul 27, 2017

Hopefully as close as it has ever been to reality! With the maintainers discussion and comments (like this one from issue #30110 I would expect it is just a matter of review and cleaning up a few things with my implementation.

Given @justincormack wants it, I assume he will be a huge supporter and reviewer of my PR, right Justin! =)

return false
})
if err != nil {
return -1, err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor thing, but it is kind of odd returning -1 here, as -1 in chown specifically means "do not chown this part" eg so you can chown a user but not a group, but you are not actually using that, so returning 0 ie the empty int is probably clearer. I had to double check you were not using the value regardless of the error code...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure; was somewhat of a weird paranoia considering some future caller assuming that the translation was a valid uid/gid (0) by ignoring the err, but I agree that there is no actual valid reason to pass -1 given the err.

RUN echo 'test2:x:1002:' >> /etc/group
ADD --chown=test1:1002 . /new_dir
RUN ls -l /
RUN [ $(ls -l / | grep new_dir | awk '{print $3":"$4}') = 'test1:test2' ]`)),
Copy link
Contributor

@justincormack justincormack Jul 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be inclined to add a test that does ls -ln to check the numeric values too as well as the names...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good

@justincormack
Copy link
Contributor

Some minor comments but otherwise looks good. I think it is ok to fail on named users on Windows for now, maybe have a test for platform rather than trying to look for /etc/passwd.

// to the requested --chown flags as provided to the original command
if err := os.Lchown(dest, chownPair.UID, chownPair.GID); err != nil {
return errors.Wrapf(err, "failed to chown target directory")
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (and the isExistingDirectory() call above) is unnnecessary. It's already handled by fixPermissions()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

except that it doesn't work for the chown case. I took the longest time trying to decide it if was worth adding complexity to fixPermissions() or not. The case that doesn't work is when the new directory is actually created by archive.CopyWithTar, so when checked for existence in fixPermissions, it does exist, so the wrong codepath is taken for the chown case. What I wanted was to be able to make an archiver with ChownOpts just for this ADD or COPY command, but after spending time in that code, it isn't possible without significant changes to add TarOptions cascading through a bunch of functions that don't take it today.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah that makes sense, which means this is probably a bug introduced when this was moved out of daemon.

I think the logic should still be in fixPermission(). We can pass in the argument as a param, which is what the old code did I believe.

@@ -56,6 +56,7 @@ type copyInstruction struct {
cmdName string
infos []copyInfo
dest string
flChown string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/flChown/idPair/ or something like that. At this point I don't think it's relevant that this came from a flag.

@@ -157,6 +158,7 @@ func add(req dispatchRequest) error {
if err != nil {
return err
}
copyInstruction.flChown = flChown.Value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure we want to add this to ADD as well? We only have --from on COPY, and I think that's by design.

I thought ADD was kind of being deprecated/frozen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both prior PRs were adding the same feature to both so I simply followed suit. I didn't see anything in recent discussions narrowing the focus, but I'm willing to hear arguments either way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should add it to both (even though we don't want to make ADD more magical); if we don't add it to ADD, people using it for adding remote files won't be helped by this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't people using ADD for remote files be converting things to RUN curl + COPY anyway? So by not adding it to ADD we're helping nudge people in the right direction. It doesn't really make sense to have two almost identical instructions.

@@ -187,6 +190,7 @@ func dispatchCopy(req dispatchRequest) error {
if err != nil {
return err
}
copyInstruction.flChown = flChown.Value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should parse the Value here so we can fail fast when it's wrong. That way we can store an IDPair in copyInstruction as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have to be at a spot where we have the mount information for the container fs.. hence where it happens in this PR. Otherwise there is no source of truth to translate name -> IDs. I can look again and make sure I do that translation as early as we have that info.

if inst.flChown != "" {
commentStr = fmt.Sprintf("%s --chown=%s %s in %s ", inst.cmdName, inst.flChown, srcHash, inst.dest)
} else {
commentStr = fmt.Sprintf("%s %s in %s ", inst.cmdName, srcHash, inst.dest)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit, these lines are mostly the same so this could just be a variable for chownComment and use %s %s%s %s in %s as the format.

Could even be a separate function to keep it from cluttering the logic in performCopy()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have known this comment was coming! :) Even though I factored it out further per your comments on prior PRs for this, there is more trimming possible for sure :)

if err != nil {
return errors.Wrapf(err, "unable to convert uid/gid to host mapping")
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please extract this whole block (parsing the uid/gid pair) into a new function, and maybe call that function from dispatchCopy() instead of here.

return parts[0], parts[1]
}

func lookupUser(userStr, filepath string) (int, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this already exists as idtools.LookupUser() / LookupUID()? Can we re-use either of those?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, per my prior comment

return users[0].Uid, nil
}

func lookupGroup(groupStr, filepath string) (int, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this already exists as idtools.LookupGroup() / LookupGID()? Can we re-use either of those?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No

@@ -137,6 +171,53 @@ func (b *Builder) performCopy(state *dispatchState, inst copyInstruction) error
return b.exportImage(state, imageMount, runConfigWithCommentCmd)
}

func parseChownFlag(flChown string) (string, string) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised this doesn't exist in pkg/idtools. I think it should probably move to idtools.ParseIDPair(string) (IDPair, error)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because the context is quite different and requires different knowledge in each situation where it is used. pkg/idtools only considers the case where the host /etc/{passwd,group} files are used for translation, as well as possible use of getent. That would be totally invalid for this case, where we need the container files to be used.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we refactor it further to share more of the code?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would need to introduce some context yup, but at the price of complexity?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the complexity changes much, it's just being moved somewhere more appropriate, so there's less duplicate code to contain.

@dnephin
Copy link
Member

dnephin commented Jul 28, 2017

Looks like windows failures are real because of lchown. I think maybe fixed by removing the code I suggested is not necessary.

@estesp
Copy link
Contributor Author

estesp commented Jul 28, 2017

Yes, the Windows failures are real--my delay in properly handling here yet is on the design point of whether to fail or silently ignore; seems ignore (by using some sort of _windows.go variation of the lookups/chown codepaths) is probably better if you can see a case for Dockerfile sharing, but that seems unlikely in lots of cases anyway. Do we want to fail on a --chown flag on ADD/COPY on Windows?

@shouze
Copy link
Contributor

shouze commented Jul 31, 2017

@estesp @dnephin can/will this PR fix #32816?

@estesp
Copy link
Contributor Author

estesp commented Aug 10, 2017

@dnephin and others; PR has been updated. A bit of vacation delay, sorry :)

At this point I believe the only review comments not handled is the recommended consolidation of pkg/idtools parsing of host /etc/{passwd,group} files and the new parsing here of container specific /etc/ files.

After looking through the idtools package use of the opencontainers/runc/libcontainer/user package and this new code's use, I don't see any effective way of combining these without simply adding complexity. In essence, the user package offers two paths:

  1. the "easy" default path of letting that package read host default files; and
  2. the more complex path of providing that code filepaths or open readers along with a custom filter function to find a specific entry.

The idtools package uses the "easy" path, because that fits the need of that package, and this new PR code uses the other path, because we want to reference specific paths inside the container we're using during build. Unless we force idtools to take the more complex path, and basically rewrite the "easy path" as a consumer of the functions already in that user package. Either way we end up duplicating code by copying functions that currently exist in user. I believe the way it is now, we have two consumers of the user package from libcontainer; they just use two different layers of that functionality.

@g13013
Copy link

g13013 commented Aug 13, 2017

Any news on that ?

// if no group specified, use the user spec as group as well
userStr, grpStr = parts[0], parts[0]
}
userStr, grpStr = parts[0], parts[1]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line will will panic when len(parts) == 1 , it's not in an else.

This function should have unit tests.

@@ -602,6 +602,36 @@ RUN [ $(cat "/test dir/test_file6") = 'test6' ]`, command, command, command, com
}
}

func (s *DockerSuite) TestBuildChownFlagAdd(c *check.C) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not convinced we want to add this to ADD.

We definitely don't need two slow integration tests for this. The code path is 99% the same. Let's just have a single one for COPY. Even better would be to add it to an existing integration test that does COPY, but I'm not sure if we have one in the API suite yet.

integration-cli is deprecated, so these should be API tests. We just merged integration/ so there isn't a lot of work there yet, but at least this should be in docker_api_build_test.go

return chownPair, nil
}

func lookupUser(userStr, filepath string) (int, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (and lookupGroup) should have unit tests for each case

@dnephin
Copy link
Member

dnephin commented Aug 15, 2017

I believe the way it is now, we have two consumers of the user package from libcontainer; they just use two different layers of that functionality.

Thanks for looking into it

@estesp
Copy link
Contributor Author

estesp commented Aug 16, 2017

@dnephin bugs created during the re-work of those functions are fixed and unit tests added that test all formats of the input string and with/without userns mappings. This tests the parseChownFlag and also the lookupUser/lookupGroup functions. Let me know if you see any holes there.

I understand the comments re: testing, but as you note integration is barely started, and I don't see any movement of cli_* content to api_* yet. I could see potentially adding this to build API, but that will take some extra work as there is very poor scaffolding for standard use of test files, etc. Before I do that can we resolve the ADD and COPY discussion? I wasn't in the maintainer's meeting where it was recommended this work proceed, so I haven't been given any clue as to whether there as a debate around just COPY vs. COPY + ADD. If you mean only one test given the code paths are very similar, that makes sense to me.

@dnephin
Copy link
Member

dnephin commented Aug 17, 2017

here is very poor scaffolding for standard use of test files

Not true. All the same scaffolding works for the api tests. See https://github.com/moby/moby/blob/master/integration-cli/docker_api_build_test.go#L288-L306

I don't see any movement of cli_* content to api_*

There should be. Every PR I have seen has called out to change the cli test to an api test

@vipseixas
Copy link

--chown does not work for compressed files (eg: tgz), it seems to only change the owner of the compressed file itself and then ignore the extracted content.

Is it the intended behavior?

@thaJeztah
Copy link
Member

@vipseixas I'd say; yes, because it'll only change permissions on the file you add. Archives themselves can have permissions set as well, which are preserved (if I'm not mistaken).

Given that automatically extracting archives only works for local files; can you explain your use-case?
i.e., instead of compressing local files, adding them to the image, and then decompressing, why not add the uncompressed files directly?

@vipseixas
Copy link

@thaJeztah I have a 3rd party installation (PHP code) that comes as a 180MB tar.gz file and I have to open it inside Apache's /var/www. The uncompressed files grows to ~570MB so it would be easier to store the compressed file with my Dockerfile. But if I ADD the compressed file and use a "RUN chown" I end up with 2 huge layers in my image.

@derikson
Copy link

@vipseixas as a workaround, you could use multi-stage build to extract the files in one build stage, then COPY with chown in the second stage. This would prevent the final image from having the extra layer.

@vipseixas
Copy link

@derikson Your suggestion worked perfectly! I was not aware of this new feature, thanx.

@marius311
Copy link

This is awesome, thanks guys! Probably one of my most desired features for years.

At the risk of being one those never-satisfied Docker users though :) ...

@thaJeztah:

It's not a bug; variable expansion in options is not supported at this moment

Is this planned at all and if so is there an issue we could track this at?

@thaJeztah
Copy link
Member

Is this planned at all and if so is there an issue we could track this at?

A feature request for that was opened here: #35018 (but not decided on yet 😅)

@khatribharat
Copy link

This isn't yet reflected in the Dockerfile doc

@thaJeztah
Copy link
Member

@khatribharat docs took some time to get merged (see docker/cli#467), but will be updated once Docker 17.12 has been released

@thaJeztah thaJeztah mentioned this pull request Jan 26, 2018
aglorei added a commit to aglorei/pokemon_battle_rails that referenced this pull request Feb 2, 2018
Ever since moby/moby/pull/34263 became merged, there's no need to
shuffle users in order to get the correct ownership on non-root users
sinhyeok-cp added a commit to EOSYS-io/eoshub.io that referenced this pull request Jun 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/builder impact/changelog impact/dockerfile impact/documentation kind/feature Functionality or other elements that the project doesn't currently have. Features are new and shiny status/2-code-review
Projects
None yet
Development

Successfully merging this pull request may close these issues.