Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fchownat not respecting AT_EMPTY_PATH on file descriptors opened with O_PATH #9453

Closed
thundergolfer opened this issue Oct 4, 2023 · 7 comments
Assignees
Labels
type: bug Something isn't working

Comments

@thundergolfer
Copy link
Contributor

thundergolfer commented Oct 4, 2023

Description

apt-get install systemd was failing within gvisor, and using the debugger I think I've tracked it down to a failed fchownat syscall.

fchownat(0x4 /var/log/journal, 0x55937991a725 , 0x0, 0x66, 0x1000) = 0 (0x0) errno=9 (bad file number)

To implement the syscall gvisor is using the gvisor.dev/gvisor/pkg/sentry/vfs.opathFD FD implementation (which expectedly doesn't support the operation) even though the AT_EMPTY_PATH flag has been passed. Per fchownat (2) I think gvisor should operate on the file referred to be by the FD in order to fulfil the fchownat operation successfully.

Here is the relevant strace output from the systemd install:

[pid   603] close(4)                    = 0
[pid   603] openat(3, "journal", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|O_PATH) = 4
[pid   603] fstat(4, {st_mode=S_IFDIR|0755, st_size=40, ...}) = 0
[pid   603] close(3)                    = 0
[pid   603] fstat(4, {st_mode=S_IFDIR|0755, st_size=40, ...}) = 0
[pid   603] writev(2, [{iov_base="\"/var/log/journal\" matches tempo"..., iov_len=54}, {iov_base="\n", iov_len=1}], 2"/var/log/journal" matches temporary mode 755 already.
) = 55
[pid   603] writev(2, [{iov_base="Changing \"/var/log/journal\" to o"..., iov_len=42}, {iov_base="\n", iov_len=1}], 2Changing "/var/log/journal" to owner 0:101
) = 43
[pid   603] fchownat(4, "", 0, 101, AT_EMPTY_PATH) = -1 EBADF (Bad file descriptor)
[pid   603] writev(2, [{iov_base="fchownat() of /var/log/journal f"..., iov_len=58}, {iov_base="\n", iov_len=1}], 2fchownat() of /var/log/journal failed: Bad file descriptor
) = 59
[pid   603] close(4)                    = 0

Here is the relevant delve debugger output

This is acquired by attaching a debugger to the docker container given below:

image
(dlv) p dirfile.impl
gvisor.dev/gvisor/pkg/sentry/vfs.FileDescriptionImpl(*gvisor.dev/gvisor/pkg/sentry/vfs.opathFD) *{
	vfsfd: gvisor.dev/gvisor/pkg/sentry/vfs.FileDescription {
		FileDescriptionRefs: (*"gvisor.dev/gvisor/pkg/sentry/vfs.FileDescriptionRefs")(0xc001993b90),
		flagsMu: (*"gvisor.dev/gvisor/pkg/sync.Mutex")(0xc001993b98),
		statusFlags: (*"gvisor.dev/gvisor/pkg/atomicbitops.Uint32")(0xc001993ba0),
		asyncHandler: gvisor.dev/gvisor/pkg/sentry/vfs.FileAsync nil,
		epollMu: (*"gvisor.dev/gvisor/pkg/sentry/vfs.epollMutex")(0xc001993bb8),
		epolls: map[*gvisor.dev/gvisor/pkg/sentry/vfs.epollInterest]struct {} nil,
		vd: (*"gvisor.dev/gvisor/pkg/sentry/vfs.VirtualDentry")(0xc001993bc8),
		opts: (*"gvisor.dev/gvisor/pkg/sentry/vfs.FileDescriptionOptions")(0xc001993bd8),
		readable: true,
		writable: false,
		usedLockBSD: (*"gvisor.dev/gvisor/pkg/atomicbitops.Uint32")(0xc001993be0),
		impl: gvisor.dev/gvisor/pkg/sentry/vfs.FileDescriptionImpl(*gvisor.dev/gvisor/pkg/sentry/vfs.opathFD) ...,},
	FileDescriptionDefaultImpl: gvisor.dev/gvisor/pkg/sentry/vfs.FileDescriptionDefaultImpl {},
	BadLockFD: gvisor.dev/gvisor/pkg/sentry/vfs.BadLockFD {},}

Steps to reproduce

The following is sufficient to reproduce the error, and is how the problem was originally discovered:

docker run -it --runtime=runsc python:3.9.15-slim-bullseye bash -c "apt-get update && apt-get install --yes systemd"

The installation of systemd will not succeed, stemming from installed systemd package post-installation script subprocess returned error exit status 73. The cause of the failure in the post-installation script is fchownat() of /var/log/journal failed: Bad file descriptor.

The problem being the post-installation script was confirmed by unpacking the install into separate steps:

apt-get install --yes libapparmor1 libcap2 libcryptsetup12 libip4tc2 libkmod2 libsystemd0
apt-get download systemd
dpkg --unpack systemd*.deb
rm -f /var/lib/dpkg/info/systemd.postinst
dpkg --configure systemd
apt-get install -yf
systemd --help # should succeed

runsc version

`runsc version release-20230920.0-21-ge81e0c72a70b
spec: 1.1.0-rc.1`

Also reproducible on latest `master` (b6671307).

docker version (if using docker)

Client: Docker Engine - Community
 Version:           24.0.6
 API version:       1.43
 Go version:        go1.20.7
 Git commit:        ed223bc
 Built:             Mon Sep  4 12:32:12 2023
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          24.0.6
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.7
  Git commit:       1a79695
  Built:            Mon Sep  4 12:32:12 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.22
  GitCommit:        8165feabfdfe38c65b599c4993d227328c231fca
 runc:
  Version:          1.1.8
  GitCommit:        v1.1.8-0-g82f18fe
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

uname

Linux ip-10-1-8-45 5.15.0-1044-aws #49~20.04.1-Ubuntu SMP Mon Aug 21 17:09:32 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

kubectl (if using Kubernetes)

No response

repo state (if built from source)

No response

runsc debug logs (if available)

This is the relevant slice grep -C100 "fchownat(0x4 /var/log/journal, 0x55937991a725 , 0x0, 0x66, 0x1000) = 0 (0x0) errno=9" of the boot logs. Overall file was 98MB and too large for Github's 25MB limit.

boot_logs_slice.txt

@thundergolfer thundergolfer added the type: bug Something isn't working label Oct 4, 2023
@EtiennePerot
Copy link
Contributor

Small note: systemd itself, even if it were installed successfully, would probably not run in gVisor. Last time this was attempted, gVisor was still missing the implementation of some syscalls and functionality (cgroups perhaps?) to make it work properly.

@thundergolfer
Copy link
Contributor Author

@EtiennePerot thanks for the reply. I'm not too surprised to be honest. The install of systemd was actually resulting from apt-get install npm, and it's possible that npm would work correctly if the systemd dep succeeded.

@ayushr2
Copy link
Collaborator

ayushr2 commented Oct 4, 2023

Again, thanks for the detailed bug report and doing a lot of the debugging on our behalf. I am sending a fix.

copybara-service bot pushed a commit that referenced this issue Oct 4, 2023
This way, we get the correct implementation for Stat, SetStat, StatFS and Xattr
for free.

Fixes #9453

PiperOrigin-RevId: 570753850
@ayushr2
Copy link
Collaborator

ayushr2 commented Oct 4, 2023

Hmm, open(2) says:

       O_PATH (since Linux 2.6.39)
              Obtain a file descriptor that can be used for two
              purposes: to indicate a location in the filesystem tree
              and to perform operations that act purely at the file
              descriptor level.  The file itself is not opened, and
              other file operations (e.g., read(2), write(2), fchmod(2),
              fchown(2), fgetxattr(2), ioctl(2), mmap(2)) fail with the
              error EBADF.

But it seems like O_PATH FD is allowed to make fchown(2) with AT_EMPTY_PATH?

@thundergolfer
Copy link
Contributor Author

O_PATH FD is allowed to make fchown(2) with AT_EMPTY_PATH?

Yes that's my interpretation of the open(2) and fchownat(2) docs.

AT_EMPTY_PATH (since Linux 2.6.39)
              If  pathname is an empty string, operate on the file referred to
              by dirfd (which may have been obtained using the [open(2)](http://man.he.net/man2/open)  O_PATH
              flag).
              ....

@ayushr2
Copy link
Collaborator

ayushr2 commented Oct 5, 2023

#9468 should be the right fix.

@thundergolfer
Copy link
Contributor Author

Nice one!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bug Something isn't working
Projects
None yet
3 participants