tar files added in dockerfile don't use gname/uname attribute for gid/uid mapping #37777
Comments
The motivation for this is to include (for example) a MySQL or Postgres database within a container without having to worry about which uid/gid was assigned at installation time (say, via apt-get). We know that a mysql or postgres (etc.) user/group will be created, so we know that the tarball will be installed with the correct ownership attributes, much as would happen with regular GNU tar when installing onto a more traditional system. While one could use --chown, that only works if all the files have exactly the same ownership. If some files can be owned by root but others have to be owned by a different user (or multiple users), it doesn't work, or it becomes much more complicated because one needs multiple tar files, one for each set of ownership attributes.
This is a POC patch; lookupUser() and lookupGroup() are stolen from builder/dockerfile/internal_linux.go.
diff --git a/pkg/archive/archive.go b/pkg/archive/archive.go
index 070dccb756..4dd44564a0 100644
--- a/pkg/archive/archive.go
+++ b/pkg/archive/archive.go
@@ -7,6 +7,7 @@ import (
"compress/bzip2"
"compress/gzip"
"context"
+ "errors"
"fmt"
"io"
"io/ioutil"
@@ -19,6 +20,8 @@ import (
"syscall"
"time"
+ lcUser "github.com/opencontainers/runc/libcontainer/user"
+
"github.com/docker/docker/pkg/fileutils"
"github.com/docker/docker/pkg/idtools"
"github.com/docker/docker/pkg/ioutils"
@@ -650,7 +653,15 @@ func createTarFile(path, extractDir string, hdr *tar.Header, reader io.Reader, L
// Lchown is not supported on Windows.
if Lchown && runtime.GOOS != "windows" {
if chownOpts == nil {
- chownOpts = &idtools.Identity{UID: hdr.Uid, GID: hdr.Gid}
+ uid, err := lookupUser(hdr.Uname, "/etc/passwd")
+ if err != nil {
+ uid = hdr.Uid
+ }
+ gid, err := lookupGroup(hdr.Gname, "/etc/group")
+ if err != nil {
+ gid = hdr.Gid
+ }
+ chownOpts = &idtools.Identity{UID: uid, GID: gid}
}
if err := os.Lchown(path, chownOpts.UID, chownOpts.GID); err != nil {
return err
@@ -711,6 +722,45 @@ func createTarFile(path, extractDir string, hdr *tar.Header, reader io.Reader, L
return nil
}
+func lookupUser(userStr, filepath string) (int, error) {
+ // if the string is actually a uid integer, parse to int and return
+ // as we don't need to translate with the help of files
+ uid, err := strconv.Atoi(userStr)
+ if err == nil {
+ return uid, nil
+ }
+ users, err := lcUser.ParsePasswdFileFilter(filepath, func(u lcUser.User) bool {
+ return u.Name == userStr
+ })
+ if err != nil {
+ return 0, err
+ }
+ if len(users) == 0 {
+ return 0, errors.New("no such user: " + userStr)
+ }
+ return users[0].Uid, nil
+}
+
+func lookupGroup(groupStr, filepath string) (int, error) {
+ // if the string is actually a gid integer, parse to int and return
+ // as we don't need to translate with the help of files
+ gid, err := strconv.Atoi(groupStr)
+ if err == nil {
+ return gid, nil
+ }
+ groups, err := lcUser.ParseGroupFileFilter(filepath, func(g lcUser.Group) bool {
+ return g.Name == groupStr
+ })
+ if err != nil {
+ return 0, err
+ }
+ if len(groups) == 0 {
+ return 0, errors.New("no such group: " + groupStr)
+ }
+ return groups[0].Gid, nil
+}
+
+
// Tar creates an archive from the directory at `path`, and returns it as a
// stream of bytes.
func Tar(path string, compression Compression) (io.ReadCloser, error) {
One thing I dislike about my patch is that lookupUser/lookupGroup are called for every single file, hitting the file system even if that uname/gname has already been looked up. GNU tar caches this data in a per-execution cache, but I'm not 100% sure how to do that here, i.e. I don't want the cache leaking from one invocation to the next.
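One way the per-invocation cache described above could look is sketched below. This is only an illustration, not code from the Docker tree: idCache, newIDCache, resolve, lookupFunc, tableLookup and errNotFound are all hypothetical names, and the real lookup would be something like the lookupUser/lookupGroup helpers from the patch. The idea is that each extraction creates its own cache value, so nothing is shared between invocations, and both successful and failed lookups are remembered.

```go
package main

import "errors"

// errNotFound is returned by a lookup when a name has no mapping.
// (Hypothetical; the patch above builds its own error strings.)
var errNotFound = errors.New("name not found")

// lookupFunc resolves a user or group name to a numeric ID.
// Hypothetical signature standing in for lookupUser/lookupGroup.
type lookupFunc func(name string) (int, error)

// idCache memoizes name-to-ID lookups for one extraction.
// Create a fresh idCache per untar call so state never leaks
// from one invocation to the next.
type idCache struct {
	lookup lookupFunc
	hits   map[string]int  // names that resolved, with their IDs
	misses map[string]bool // names known not to resolve
}

func newIDCache(lookup lookupFunc) *idCache {
	return &idCache{
		lookup: lookup,
		hits:   make(map[string]int),
		misses: make(map[string]bool),
	}
}

// resolve returns the ID for name, or fallback (the numeric ID from the
// tar header) when the name is empty or cannot be resolved. The underlying
// lookup runs at most once per distinct name.
func (c *idCache) resolve(name string, fallback int) int {
	if name == "" || c.misses[name] {
		return fallback
	}
	if id, ok := c.hits[name]; ok {
		return id
	}
	id, err := c.lookup(name)
	if err != nil {
		c.misses[name] = true
		return fallback
	}
	c.hits[name] = id
	return id
}

// tableLookup builds a lookupFunc backed by a fixed in-memory table.
// It exists only to exercise the cache without touching /etc/passwd.
func tableLookup(table map[string]int) lookupFunc {
	return func(name string) (int, error) {
		if id, ok := table[name]; ok {
			return id, nil
		}
		return 0, errNotFound
	}
}
```

In an extraction loop, one cache for users and one for groups would be created before the first entry and consulted with something like users.resolve(hdr.Uname, hdr.Uid) for each file. Failed lookups are cached too, since a tarball with thousands of files owned by an unknown name would otherwise re-read the passwd file for every entry.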
If we are going to add this feature to Maybe we can just suggest using
COPY/RUN is what we do now, but it basically doubles the size of the image (we could squash it, but that defeats the point of layers and the cache). The point of the POC code above isn't to be a PR (I think it would actually break layer pulling, since diff.go calls the same code and only wants to follow gid/uid for chown); it's more a proof that this can be implemented fairly simply.
I encountered the same issue. Is there a better solution in 2024?
Description
If one extracts a tar file with a version of tar that understands the gname/uname string attributes, and the gname/uname exist in the group/user database, tar uses them to set the gid/uid of the file instead of the gid/uid stored numerically in the tar file. However, if one adds a tar file within a Dockerfile, it ignores gname/uname and uses only the gid/uid.
This would make sense if the claim were that a Dockerfile context cannot know the name-to-id mapping within a container image, but that doesn't seem to be true: the --chown argument to the same Dockerfile directive can take a name and map it to an id if it's defined within the container image.
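To make the distinction concrete, here is a small sketch showing that a tar entry carries both the numeric uid/gid and the uname/gname strings, so an extractor has the information it needs to choose between them. It uses only Go's standard archive/tar package; roundTripOwner is a hypothetical helper written for this illustration, and the values 1001 and spotter are just the ones mentioned in this report.

```go
package main

import (
	"archive/tar"
	"bytes"
)

// roundTripOwner writes one empty regular-file entry with the given numeric
// and symbolic owner into an in-memory tar archive, then reads the header
// back. Hypothetical helper, for illustration only.
func roundTripOwner(uid int, uname string, gid int, gname string) (*tar.Header, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	hdr := &tar.Header{
		Typeflag: tar.TypeReg,
		Name:     "data/file",
		Mode:     0644,
		Uid:      uid,
		Gid:      gid,
		Uname:    uname,
		Gname:    gname,
	}
	if err := tw.WriteHeader(hdr); err != nil {
		return nil, err
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	// Next returns the header of the first (and only) entry.
	return tar.NewReader(&buf).Next()
}
```

Calling roundTripOwner(1001, "spotter", 1001, "spotter") yields a header in which Uid/Gid and Uname/Gname are all populated. The behavior reported here corresponds to an extractor that reads only the numeric pair.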
Steps to reproduce the issue:
Describe the results you expected:
expect to see file owned by user spotter
Describe the results you received:
instead see it owned by user with id 1001.
Additional information you deem important (e.g. issue happens only occasionally):
Output of docker version:
Output of docker info: