uclibc BusyBox in Distroless debug images fails with "ls: Value too large for defined data type" #225

Closed
xinau opened this issue Aug 1, 2018 · 59 comments · Fixed by #437 or #513

Comments

@xinau

xinau commented Aug 1, 2018

UPDATE by @chanseokoh: this issue is specifically about the pre-built, buggy BusyBox binary compiled using uClibc embedded in the Distroless :debug Docker images (e.g., gcr.io/distroless/java:debug). The root cause is Bug 11651 in BusyBox.


It seems like LargeFileSupport is disabled in the debug images.

$ docker run --entrypoint=/busybox/sh --rm -it gcr.io/distroless/base:debug 
/ # ls /
ls: can't open '/': Value too large for defined data type

According to the coreutils docs this error is caused by not enabling LargeFileSupport when compiling.

The default busybox Docker image doesn't seem to have this problem:

$ docker run --entrypoint=/bin/sh --rm -it busybox
/ # ls
bin   dev   etc   home  proc  root  sys   tmp   usr   var
@dlorenc
Contributor

dlorenc commented Aug 6, 2018

cc @tejal29

@tejal29
Member

tejal29 commented Aug 7, 2018

hmm not able to reproduce this

docker run --entrypoint=/busybox/sh --rm -it gcr.io/distroless/base:debug 
Unable to find image 'gcr.io/distroless/base:debug' locally
debug: Pulling from distroless/base
8f125ded1b48: Pull complete 
8ca8724e5ffb: Pull complete 
Digest: sha256:333a18407c0e2d46b264a240175284b3bd2482892f4273bacb63a5d37a7adea2
Status: Downloaded newer image for gcr.io/distroless/base:debug
/ # ls
busybox  etc      lib      proc     tmp      var
dev      home     lib64    sys      usr

@dlorenc
Contributor

dlorenc commented Aug 7, 2018

Same here. @xinau do you have any other info on how you reproduced this?

@xinau
Author

xinau commented Aug 8, 2018

Since then I've done several package updates (kernel, util-linux, ...), and they seem to have magically fixed the problem. I can't reproduce it anymore :(

Sorry for the bother.

@xinau xinau closed this as completed Aug 8, 2018
@dgageot

dgageot commented Sep 5, 2018

I have the same issue on Docker for Mac Version 18.06.0-ce-mac69 (26398)

docker run --entrypoint=sh --rm gcr.io/distroless/base:debug -c ls
ls: can't open '.': Value too large for defined data type

@dgageot dgageot reopened this Sep 5, 2018
@dgageot

dgageot commented Sep 5, 2018

Works on Linux with Docker 18.03.1-ce & Docker 18.06.1-ce

@dgageot

dgageot commented Sep 5, 2018

Funny: on Docker for Mac, it sometimes works right after I restart Docker or remove all the images, and then it breaks again.

@dgageot

dgageot commented Sep 5, 2018

See docker/for-mac#3203

@reegnz

reegnz commented Oct 15, 2018

I ran into the same issue on Docker for Windows:

reegnz@DOOD:~/distroless
$ docker version
Client:
 Version:           18.06.1-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        e68fc7a
 Built:             Tue Aug 21 17:24:51 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.06.1-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       e68fc7a
  Built:            Tue Aug 21 17:29:02 2018
  OS/Arch:          linux/amd64
  Experimental:     false
reegnz@DOOD:~/distroless
$ docker image ls gcr.io/distroless/java:debug
REPOSITORY               TAG                 IMAGE ID            CREATED             SIZE
gcr.io/distroless/java   debug               a2b123fd9877        48 years ago        119MB
reegnz@DOOD:~/distroless
$ docker run -it --rm --entrypoint=sh gcr.io/distroless/java:debug -c ls
ls: can't open '.': Value too large for defined data type

After I restart docker it seems to work, just as @dgageot already explained. (didn't have to remove any images).
Then while I am performing commands within the container it suddenly starts breaking again.

@emaildanwilson

I'm seeing this when running against a k8s cluster. It is intermittent: some pods work fine, others give this error. Perhaps it is related to the size of the filesystem the container is on?

@nfk

nfk commented Nov 8, 2018

I confirm this issue with docker 18.06.1-ce and gcr.io/distroless/java:debug (0507ea3dccb1). The restart of the docker daemon solves the issue.

@gf1730

gf1730 commented Dec 4, 2018

In my experience, the error "Value is too large for defined data type" is raised when a 32-bit application/library tries to access a file that has a 64-bit inode and overflows the 32-bit integer used to store the inode number.
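
The overflow described in this comment can be illustrated with a minimal Python sketch. It assumes the 32-bit inode limit suggested above; the helper names `fits_in_u32` and `find_wide_inodes` are made up for illustration and are not part of BusyBox or Docker.

```python
import os
from typing import List, Tuple

U32_MAX = 2**32 - 1  # largest value an unsigned 32-bit inode field can hold


def fits_in_u32(value: int) -> bool:
    """Return True if value can be stored in an unsigned 32-bit field."""
    return 0 <= value <= U32_MAX


def find_wide_inodes(directory: str) -> List[Tuple[str, int]]:
    """List (name, inode) pairs in directory whose inode needs more than 32 bits."""
    wide = []
    with os.scandir(directory) as entries:
        for entry in entries:
            ino = entry.inode()
            if not fits_in_u32(ino):
                wide.append((entry.name, ino))
    return sorted(wide)


if __name__ == "__main__":
    # Scan the current directory; nothing is printed when every inode fits.
    for name, ino in find_wide_inodes("."):
        print(f"{name}: inode {ino} does not fit in 32 bits")
```

Running this on a host where the error shows up would be one way to test the inode theory, since it reports only entries whose inode number could not be represented by a 32-bit caller.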

@daisy-ycguo

I confirm this issue with docker 18.06.1-ce on Mac.

$ docker run --entrypoint=/busybox/sh --rm -it gcr.io/distroless/base:debug
/ # ls
ls: can't open '.': Value too large for defined data type
$ docker version
Client:
 Version:           18.06.1-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        e68fc7a
 Built:             Tue Aug 21 17:21:31 2018
 OS/Arch:           darwin/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.06.1-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       e68fc7a
  Built:            Tue Aug 21 17:29:02 2018
  OS/Arch:          linux/amd64
  Experimental:     true

@donbowman
Contributor

donbowman commented Jan 28, 2019

The issue is the 64-bit inode.

Busybox would need to be compiled with e.g. _FILE_OFFSET_BITS=64, but then it would (likely) need to be 64-bit as well, since off_t etc. would change and wouldn't fit in an int anymore.
UPDATE by @chanseokoh: the busybox dev said the above isn't correct: https://bugs.busybox.net/show_bug.cgi?id=11651#c2

This intermittently breaks on Microsoft Azure AKS Kubernetes with the distroless/java image w/ debug.

A workaround is to do something like:

FROM amd64/busybox:1.30.0-glibc as busybox
FROM gcr.io/distroless/java

COPY --from=busybox /bin/busybox /busybox/busybox
RUN ["/busybox/busybox", "--install", "/bin"]

donbowman added a commit to Agilicus/incubator-druid that referenced this issue Jan 28, 2019
The 32-bit uclibc busybox does not support 64-bit inodes
(see GoogleContainerTools/distroless#225)

Signed-off-by: Don Bowman <don@agilicus.com>
@chanseokoh
Member

I've filed a bug against busybox: https://bugs.busybox.net/show_bug.cgi?id=11651

@donbowman Distroless is pulling in the 1.27.1 version, and I wonder if recent binaries have fixed this issue. I have trouble reproducing this in-house, so is it possible for you to test the most recent 1.30.0 binary from https://busybox.net/downloads/binaries/?

@donbowman
Contributor

I've confirmed 1.30.0 from busybox.net has the same issue.
I believe the issue is uclibc; I've confirmed that the 1.30.0 glibc build works.

@chanseokoh
Member

The busybox dev is asking for more info for debugging: https://bugs.busybox.net/show_bug.cgi?id=11651#c5. If anyone can still reproduce this issue, please update the bug tracker. @donbowman

@donbowman
Contributor

Done. I need to find a static strace and get a privileged container up to do this, though.

@donbowman
Contributor

/proc # strace ls -l
execve("/busybox/ls", ["ls", "-l"], 0x7ffcdc43ede8 /* 16 vars */) = 0
strace: [ Process PID=33 runs in 32 bit mode. ]
strace: WARNING: Proper structure decoding for this personality is not supported, please consider building strace with mpers support enabled.
ioctl(0, TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
getuid32()                              = 0
time([1551486393 /* 2019-03-02T00:26:33+0000 */]) = 1551486393 (2019-03-02T00:26:33+0000)
ioctl(0, TIOCGWINSZ, {ws_row=38, ws_col=120, ws_xpixel=0, ws_ypixel=0}) = 0
ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
brk(NULL)                               = 0x9768000
brk(0x9769000)                          = 0x9769000
lstat64(".", 0xffa6e374)                = 0
open(".", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
fstat(3, 0xffa6e378)                    = -1 EOVERFLOW (Value too large for defined data type)
close(3)                                = 0
write(2, "ls: can't open '.': Value too la"..., 58ls: can't open '.': Value too large for defined data type
) = 58
write(1, "total 0\n", 8total 0
)                = 8
exit(1)                                 = ?
+++ exited with 1 +++

@donbowman
Contributor

# ./strace_static_x86_64 -v -v ls -l /proc
num_quals=312
--- stopped by SIGSTOP ---
--- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=73, si_uid=0} ---
execve("/busybox/ls", ["ls", "-l", "/proc"], ["KUBERNETES_SERVICE_PORT=443", "KUBERNETES_PORT=tcp://noc-noc-18"..., "HOSTNAME=test", "SHLVL=1", "OLDPWD=/usr", "HOME=/home", "SSL_CERT_FILE=/etc/ssl/certs/ca-"..., "TERM=xterm", "KUBERNETES_PORT_443_TCP_ADDR=noc"..., "PATH=/usr/local/sbin:/usr/local/"..., "KUBERNETES_PORT_443_TCP_PORT=443", "KUBERNETES_PORT_443_TCP_PROTO=tc"..., "KUBERNETES_SERVICE_PORT_HTTPS=44"..., "KUBERNETES_PORT_443_TCP=tcp://no"..., "PWD=/tmp", "KUBERNETES_SERVICE_HOST=noc-noc-"...]) = 0
[ Process PID=76 runs in 32 bit mode. ]
ioctl(0, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, {c_iflags=0x500, c_oflags=0x5, c_cflags=0xbf, c_lflags=0x8a3b, c_line=0, c_cc="\x03\x1c\x7f\x15\x04\x00\x01\x00\x11\x13\x1a\x00\x12\x0f\x17\x16\x00\x00\x00"}) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, {c_iflags=0x500, c_oflags=0x5, c_cflags=0xbf, c_lflags=0x8a3b, c_line=0, c_cc="\x03\x1c\x7f\x15\x04\x00\x01\x00\x11\x13\x1a\x00\x12\x0f\x17\x16\x00\x00\x00"}) = 0
getuid32()                              = 0
time([1551486719])                      = 1551486719
ioctl(0, TIOCGWINSZ, {ws_row=38, ws_col=120, ws_xpixel=0, ws_ypixel=0}) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, {c_iflags=0x500, c_oflags=0x5, c_cflags=0xbf, c_lflags=0x8a3b, c_line=0, c_cc="\x03\x1c\x7f\x15\x04\x00\x01\x00\x11\x13\x1a\x00\x12\x0f\x17\x16\x00\x00\x00"}) = 0
brk(0)                                  = 0x9a97000
brk(0x9a98000)                          = 0x9a98000
lstat64("/proc", {st_dev=makedev(0, 259), st_ino=1, st_mode=S_IFDIR|0555, st_nlink=299, st_uid=0, st_gid=0, st_blksize=1024, st_blocks=0, st_size=0, st_atime=2019/03/02-00:23:48, st_mtime=2019/03/02-00:23:48, st_ctime=2019/03/02-00:23:48}) = 0
open("/proc", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
fstat(3, 0xffe905e8)                    = -1 EOVERFLOW (Value too large for defined data type)
close(3)                                = 0
write(2, "ls: can't open '/proc': Value to"..., 62ls: can't open '/proc': Value too large for defined data type
) = 62
write(1, "total 0\n", 8total 0
)                = 8
_exit(1)                                = ?
+++ exited with 1 +++
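
One way to read the two traces: `lstat64` succeeds while the plain `fstat` fails with EOVERFLOW, which suggests that some field does not fit a smaller legacy `stat` layout. The Python sketch below checks the values printed by the `lstat64("/proc", ...)` line against a set of assumed limits. The limits are assumptions for illustration only (a 32-bit inode, as suggested earlier in the thread, plus 8-bit device major and minor numbers as in old-style 16-bit device encodings), and `fields_out_of_range` is a hypothetical helper. The thread itself does not confirm which field overflows.

```python
from typing import Dict, List

# Assumed limits of a legacy 32-bit stat layout (illustrative, not from the thread).
LEGACY_LIMITS: Dict[str, int] = {
    "st_ino": 2**32 - 1,
    "dev_major": 255,
    "dev_minor": 255,
}


def fields_out_of_range(values: Dict[str, int]) -> List[str]:
    """Return the names of fields whose value exceeds its assumed legacy limit."""
    return sorted(
        name
        for name, value in values.items()
        if name in LEGACY_LIMITS and value > LEGACY_LIMITS[name]
    )


# Values copied from the lstat64("/proc", ...) line in the trace:
# st_dev=makedev(0, 259), st_ino=1.
proc_stat = {"st_ino": 1, "dev_major": 0, "dev_minor": 259}

if __name__ == "__main__":
    print(fields_out_of_range(proc_stat))
```

Under these assumed limits the inode value from the trace is in range and only the device minor number is flagged, so it may be worth looking at more than the inode when collecting data for the BusyBox bug.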

@chanseokoh
Member

@donbowman I saw your update on the busybox bug. Thanks for reporting. Let's see what the dev will say.

@chanseokoh
Member

I see the busybox bug got the lowest priority, P5. If anyone is severely affected by this issue, I suggest going to the busybox bug and raising awareness there.

https://bugs.busybox.net/show_bug.cgi?id=11651

@donbowman
Contributor

donbowman commented Mar 5, 2019

The workaround is to use the glibc version: https://github.com/docker-library/busybox/tree/master/glibc

Across the board, I find a lot of things on Kubernetes (e.g. Helm charts) that use 'busybox' as an init container and intermittently fail in Azure because of this issue; it's not just the debug distroless images.

@chanseokoh
Member

chanseokoh commented Mar 5, 2019

@donbowman Sounds like this would go away if we just pulled in and installed a busybox binary built against glibc. Does busybox provide compiled glibc binaries at https://busybox.net/downloads/binaries/? It doesn't seem like they do.

I kind of heard that Kubernetes might be using distroless for some part of its workings (although I'm not sure).

@sesto

sesto commented May 22, 2019

I am experiencing the same issue when running on GKE cluster

 Kernel Version:             4.14.94+
 OS Image:                   Container-Optimized OS from Google
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://18.9.3
 Kubelet Version:            v1.13.5-gke.10
 Kube-Proxy Version:         v1.13.5-gke.10

No issue running on Rancher 2.0 clusters

 Kernel Version:             4.4.0-119-generic
 OS Image:                   Ubuntu 16.04.6 LTS
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://18.9.5
 Kubelet Version:            v1.13.5
 Kube-Proxy Version:         v1.13.5

@alex1989hu

We are also affected - upvote +1

@chanseokoh
Member

chanseokoh commented May 20, 2020

This is an issue in BusyBox. If you are affected, please go to BusyBox Bug 11651 and upvote there.

@alex1989hu

@chanseokoh
It is not clear to me why it works if we replace the busybox in the kaniko image with another busybox. A different variant?

@chanseokoh
Member

chanseokoh commented May 20, 2020

Likely. The way I see it, this is a bug in BusyBox compiled with uclibc. Sounds like using a glibc BusyBox should be fine. And is Kaniko using or based on a debug Distroless image? Why are you using a debug image?

@chanseokoh chanseokoh changed the title ls: Value too large for defined data type uclibc BusyBox in Distroless debug images fails with "ls: Value too large for defined data type" May 20, 2020
@chanseokoh
Member

I think some people are reporting their issues on the wrong repository after googling the error message "Value too large for defined data type", without realizing that this is the Distroless repo. I've updated the issue title and hidden some comments that seemed unrelated to Distroless.

@donbowman
Contributor

is there some compelling reason we can't use the glibc busybox in here? It works, it's compatible, and those that use :debug for whatever reason are happy.
This has been open for 2+ years. The uclibc approach was designed for small embedded systems; it may never see a reason to fix this, and there could be other kernel changes coming that it doesn't keep up with.

@chanseokoh
Member

chanseokoh commented May 20, 2020

is there some compelling reason we can't use the glibc busybox in here?

@donbowman I don't think there is. I think whatever approach that resolves this issue should be fine. Someone put up a PR (which has merge conflicts) to get BusyBox from the Debian package. However, it downgrades the version and misses some commands; maybe it is no longer the case with Debian 10, I don't know. Even so, it's for debug images, so it may not matter much. But maybe, just downloading the glibc version would be a simpler and easier way. Another person suggested switching to ToyBox.

I think either solution can work. We appreciate community contributions. (I'm not going to work on fixing this myself, BTW.)

@alex1989hu

@chanseokoh: there are many people who want to use kaniko:debug.
People tracked down that the issue is caused by the distroless-provided busybox, which is part of kaniko:debug image:
https://github.com/GoogleContainerTools/kaniko/blob/0522fe2485c25f55d9d96a028058da0b180e0bec/deploy/Dockerfile_debug#L36-L41

Q: Why would anybody want to use the debug image?
A: I can only speak for my own needs, but I have seen many others with the same need: to be able to build images with kaniko:debug securely (non-privileged). A great example is written up here: https://gitlab.com/guided-explorations/containers/kaniko-docker-build.

Q: How did I run into this issue?
A: I wanted to try out an image build with kaniko:debug following the referenced GitLab guide. My Kubernetes cluster runs on Talos.

Q: How did I manage to solve it?
A: I had to apply the already provided workaround #225 (comment)

I would be happy to help in changing the busybox. Do you know any other maintainer who can assist us?

@chanseokoh
Member

@alex1989hu thanks for the detailed explanation. It's really good to know that Kaniko is based on Distroless; didn't know that.

I would be happy to help in changing the busybox. Do you know any other maintainer who can assist us?

Thanks a lot. I am the most active "moderator" (not "maintainer"); I can gladly assist you. What's your plan? I think downloading the precompiled glibc BusyBox is the easiest and most straightforward fix.

@chanseokoh
Member

chanseokoh commented May 20, 2020

which is part of kaniko:debug image:
https://github.com/GoogleContainerTools/kaniko/blob/0522fe2485c25f55d9d96a028058da0b180e0bec/deploy/Dockerfile_debug#L36-L41

Unrelated, but the way Kaniko fetches BusyBox is inexplicably convoluted. Why get it from a Distroless image after building that image with bazel build, when it could simply download the file directly from busybox.net (which is what Distroless does)? It's a single binary file. It doesn't make much sense to me.

@alex1989hu

/cc @tejal29 @tstromberg @samos123 what do you think? I think most of us want to replace busybox, but @chanseokoh raises a good question. Is it mandatory to use the distroless-provided busybox?

One of the kaniko-related issues: GoogleContainerTools/kaniko#985 (comment)

@donbowman
Contributor

is there some compelling reason we can't use the glibc busybox in here?

@donbowman I don't think there is. I think whatever approach that resolves this issue should be fine. Someone put up a PR (which has merge conflicts) to get BusyBox from the Debian package. However, it downgrades the version and misses some commands; maybe it is no longer the case with Debian 10, I don't know. Even so, it's for debug images, so it may not matter much. But maybe, just downloading the glibc version would be a simpler and easier way. Another person suggested switching to ToyBox.

I think either solution can work. We appreciate community contributions. (I'm not going to work on fixing this myself, BTW.)

Actually it's not just debug, i believe the java distroless has busybox in it for non-debug.

@donbowman
Contributor

so If I update #380 to not conflict, and use current busybox from busybox.net (w/ glibc), all are good w/ that? I can take a look.

@chanseokoh
Member

i believe the java distroless has busybox in it for non-debug.

No, none of the non-debug images have busybox. The Java images in particular have tests for this:

- name: no-busybox
  path: "/busybox/sh"
  shouldExist: false
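
The test entry above asserts that `/busybox/sh` is absent from the image. A rough Python sketch of that kind of check follows, assuming each entry is a dict with the `name`, `path`, and `shouldExist` keys shown in the entry. `failed_existence_tests` and its `root` parameter are hypothetical and are not part of the actual test tooling.

```python
import os
from typing import Dict, List


def failed_existence_tests(tests: List[Dict], root: str = "/") -> List[str]:
    """Return names of tests whose expectation does not match the filesystem."""
    failures = []
    for test in tests:
        # Resolve the absolute in-image path against the chosen root directory.
        target = os.path.join(root, test["path"].lstrip("/"))
        if os.path.lexists(target) != test["shouldExist"]:
            failures.append(test["name"])
    return failures


tests = [{"name": "no-busybox", "path": "/busybox/sh", "shouldExist": False}]

if __name__ == "__main__":
    # An empty list means every expectation held for the filesystem under "/".
    print(failed_existence_tests(tests))
```

Pointing `root` at an unpacked image filesystem would let the same entries be checked outside a running container.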

so If I update #380 to not conflict, and use current busybox from busybox.net (w/ glibc), all are good w/ that?

If you get the glibc version from busybox.net, I think we can forget about #380. (#380 is for getting busybox from Debian.) Thanks for your help.

@chanseokoh
Member

Instead of waiting for BusyBox to fix Bug 11651, we switched from the uClibc BusyBox to the musl-libc BusyBox (#513). Earlier in this thread we talked about using the glibc BusyBox, but I don't think it matters that we use musl instead, as long as it resolves this issue.
