systemd-tmpfiles should skip mountpoints #12692

Closed
casipw opened this issue May 29, 2019 · 9 comments
Labels
needs-reporter-feedback ❓ There's an unanswered question, the reporter needs to answer tmpfiles

Comments

@casipw

casipw commented May 29, 2019

I quickly mounted a remote location (with sshfs) into /tmp/foo. Then systemd-tmpfiles-clean.timer kicked in and systemd-tmpfiles-clean.service executed systemd-tmpfiles, which deleted all files older than 14 days in my mount.

Of course /tmp is not the perfect place for mounts. But systemd-tmpfiles should really skip mountpoints.

Fedora 27, systemd 234-11.

@cdown
Member

cdown commented May 31, 2019

Can you please show systemd-tmpfiles --cat-config? I assume this is something Fedora-specific, since we don't ship anything that clears /tmp after 14 days.

I think it's fine to have a removal option that doesn't introspect past mountpoints, but I don't think systemd-tmpfiles should do that by default.

cdown added the needs-discussion 🤔, needs-reporter-feedback ❓ and tmpfiles labels May 31, 2019
@casipw
Author

casipw commented Jun 6, 2019

Hi, thanks for your reply. Apparently systemd 234 does not have systemd-tmpfiles --cat-config yet. I can show the service and timer files:

$ cat /usr/lib/systemd/system/systemd-tmpfiles-clean.timer
#  This file is part of systemd.
#
#  systemd is free software; you can redistribute it and/or modify it
#  under the terms of the GNU Lesser General Public License as published by
#  the Free Software Foundation; either version 2.1 of the License, or
#  (at your option) any later version.

[Unit]
Description=Daily Cleanup of Temporary Directories
Documentation=man:tmpfiles.d(5) man:systemd-tmpfiles(8)

[Timer]
OnBootSec=15min
OnUnitActiveSec=1d
$ rpm -qf /usr/lib/systemd/system/systemd-tmpfiles-clean.timer
systemd-234-11.git5f8984e.fc27.x86_64
$ cat /usr/lib/systemd/system/systemd-tmpfiles-clean.service
#  This file is part of systemd.
#
#  systemd is free software; you can redistribute it and/or modify it
#  under the terms of the GNU Lesser General Public License as published by
#  the Free Software Foundation; either version 2.1 of the License, or
#  (at your option) any later version.

[Unit]
Description=Cleanup of Temporary Directories
Documentation=man:tmpfiles.d(5) man:systemd-tmpfiles(8)
DefaultDependencies=no
Conflicts=shutdown.target
After=local-fs.target time-sync.target
Before=shutdown.target

[Service]
Type=oneshot
ExecStart=/usr/bin/systemd-tmpfiles --clean
IOSchedulingClass=idle
$ rpm -qf /usr/lib/systemd/system/systemd-tmpfiles-clean.service
systemd-234-11.git5f8984e.fc27.x86_64

@cdown
Member

cdown commented Jun 7, 2019

--cat-config shows the config files that tmpfiles reads, rather than the service file for tmpfiles itself. It's probably in /usr/lib/tmpfiles.d/.

@casipw
Author

casipw commented Jun 11, 2019

Thanks for your patience. There are three *tmp* files in /usr/lib/tmpfiles.d/:

$ cat /usr/lib/tmpfiles.d/gvfsd-fuse-tmpfiles.conf
# This is a systemd tmpfiles.d configuration file
# [...]

x /run/user/*/gvfs
$ cat /usr/lib/tmpfiles.d/ostree-tmpfiles.conf 
# Copyright (C) 2017 Colin Walters <walters@verbum.org>
# [...]

# https://github.com/ostreedev/ostree/issues/393
R! /var/tmp/ostree-unlock-ovl.*
$ cat /usr/lib/tmpfiles.d/tmp.conf
#  This file is part of systemd.
#  [...]

# Clear tmp directories separately, to make them easier to override
q /tmp 1777 root root 10d
q /var/tmp 1777 root root 30d

# Exclude namespace mountpoints created with PrivateTmp=yes
x /tmp/systemd-private-%b-*
X /tmp/systemd-private-%b-*/tmp
x /var/tmp/systemd-private-%b-*
X /var/tmp/systemd-private-%b-*/tmp

# Remove top-level private temporary directories on each boot
R! /tmp/systemd-private-*
R! /var/tmp/systemd-private-*
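
For readers unfamiliar with the format: per tmpfiles.d(5), each non-comment line is a whitespace-separated record of type, path, mode, user, group, age and argument. A minimal Python sketch of how the two `q` lines above read follows. It is not systemd's parser; `parse_line` and `age_seconds` are hypothetical helpers, and only a subset of the age suffixes is handled.

```python
import re

# Seconds per unit for a subset of the age suffixes listed in tmpfiles.d(5).
UNITS = {"s": 1, "m": 60, "min": 60, "h": 3600, "d": 86400, "w": 604800}

FIELDS = ("type", "path", "mode", "user", "group", "age", "argument")

def parse_line(line):
    """Split one tmpfiles.d line into named fields; missing ones are None."""
    parts = line.split(None, len(FIELDS) - 1)
    parts += [None] * (len(FIELDS) - len(parts))
    return dict(zip(FIELDS, parts))

def age_seconds(age):
    """Convert an age such as '10d' to seconds (a bare integer is seconds)."""
    if age.isdigit():
        return int(age)
    total = 0
    for number, unit in re.findall(r"(\d+)([a-z]+)", age):
        total += int(number) * UNITS[unit]
    return total

entry = parse_line("q /tmp 1777 root root 10d")
print(entry["path"], age_seconds(entry["age"]) // 86400, "days")
```

Lines such as `x /run/user/*/gvfs` carry only a type and a path, so the remaining fields come back as `None` in this sketch.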

@poettering
Member

We do check for mount points in the "file aging" code, in fact always have.

There are even two checks in that code:

https://github.com/systemd/systemd/blob/master/src/tmpfiles/tmpfiles.c#L551

First we check that st_dev of all files we look at is the same as the st_dev of the top-level dir we are operating on. This check should filter out most cases, except those where you bind mount an fs on itself, i.e. where st_dev of a dir and its child match even though the latter is a mount point.

This is then followed by a test with dir_is_mount_point(), which uses name_to_handle_at(). That syscall returns us the mnt id of a path, which is different for each actual mount point. Sadly, that syscall is not implemented on all file systems, and I figure it is not implemented on the two you chose for /tmp and your sshfs mount.

That said, the first check should already have figured out everything and skipped the entry. What is the output of stat /tmp /tmp/foo on your system? i.e. what are the backing devices reported?
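
The first check described above can be approximated in a few lines. This is a Python sketch of the idea only, not the code in tmpfiles.c: `aged_files` is a hypothetical helper, and it looks at the modification time alone, which is a simplification.

```python
import os
import time

def aged_files(top, max_age_seconds, now=None):
    """Yield files under `top` older than max_age_seconds, never crossing
    onto another filesystem (entries whose st_dev differs are skipped)."""
    now = time.time() if now is None else now
    top_dev = os.lstat(top).st_dev
    stack = [top]
    while stack:
        directory = stack.pop()
        with os.scandir(directory) as entries:
            for entry in entries:
                st = entry.stat(follow_symlinks=False)
                if st.st_dev != top_dev:
                    continue  # different backing device: a mount point, skip it
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif now - st.st_mtime > max_age_seconds:
                    yield entry.path
```

As noted above, a comparison like this cannot detect a file system bind mounted onto itself, because parent and child then report the same st_dev.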

@casipw
Author

casipw commented Jun 18, 2019

$ mkdir /tmp/foo
$ sshfs foo:/scratch/bar /tmp/foo
$ stat /tmp /tmp/foo
  File: /tmp
  Size: 640       	Blocks: 0          IO Block: 4096   directory
Device: 2bh/43d	Inode: 19807       Links: 19
Access: (1777/drwxrwxrwt)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2019-06-18 12:41:00.853699915 +0200
Modify: 2019-06-18 12:49:01.683053387 +0200
Change: 2019-06-18 12:49:01.683053387 +0200
 Birth: -
  File: /tmp/foo
  Size: 4096      	Blocks: 8          IO Block: 4096   directory
Device: 3eh/62d	Inode: 1           Links: 1
Access: (0755/drwxr-xr-x)  Uid: ( 1002/ xxxx)   Gid: (  501/    xxxx)
Access: 2019-06-11 15:47:26.000000000 +0200
Modify: 2019-05-16 12:59:25.000000000 +0200
Change: 2019-05-16 12:59:25.000000000 +0200
 Birth: -

@poettering
Member

So dev_t in the two dirs is different: 2bh vs. 3eh. Hence I have no idea why our check didn't work for you.
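
For reference, the `Device:` field in the stat output prints the same device number twice, once in hexadecimal and once in decimal, so `2bh/43d` and `3eh/62d` really are two distinct devices. A small Python sketch makes the comparison explicit (`parse_device` is a hypothetical helper):

```python
def parse_device(field):
    """Parse a stat(1) device field such as '2bh/43d' into an integer."""
    hex_part, dec_part = field.split("/")
    number = int(hex_part.rstrip("h"), 16)
    # Both halves encode the same value; fail loudly if they disagree.
    if number != int(dec_part.rstrip("d")):
        raise ValueError(f"inconsistent device field: {field!r}")
    return number

tmp_dev = parse_device("2bh/43d")  # /tmp
foo_dev = parse_device("3eh/62d")  # /tmp/foo
print(tmp_dev, foo_dev, tmp_dev != foo_dev)  # prints: 43 62 True
```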

Can you reproduce this? If so, can you step through it (with gdb) and see what the if check I linked above evaluates to in your case on the relevant mount? But before that, please update to a current systemd version; we actually don't track issues here with anything older than the two most recently released systemd versions, i.e. 241 or 242 right now. Even better, please test this with a tmpfiles version that includes #12822, which should improve the second check (even though the first check should have already caught your case).

@casipw
Author

casipw commented Jun 21, 2019

I could reproduce the issue - and the culprit probably isn't systemd.

I rebooted my machine, mounted a remote folder with sshfs and created files with different modification timestamps in it:

[user@workstation ~]$ mkdir /tmp/foo
[user@workstation ~]$ sshfs server:/scratch/user/foo /tmp/foo/
[user@workstation ~]$ ssh server
[user@server ~]$ cd /scratch/user/foo/
[user@server foo]$ touch -d  "1 day ago"  01d
[user@server foo]$ touch -d  "2 days ago" 02d
[user@server foo]$ touch -d  "3 days ago" 03d
[user@server foo]$ touch -d  "4 days ago" 04d
[user@server foo]$ touch -d  "5 days ago" 05d
[user@server foo]$ touch -d  "6 days ago" 06d
[user@server foo]$ touch -d  "7 days ago" 07d
[user@server foo]$ touch -d  "8 days ago" 08d
[user@server foo]$ touch -d  "9 days ago" 09d
[user@server foo]$ touch -d "10 days ago" 10d
[user@server foo]$ touch -d "11 days ago" 11d
[user@server foo]$ touch -d "12 days ago" 12d
[user@server foo]$ touch -d "13 days ago" 13d
[user@server foo]$ touch -d "14 days ago" 14d
[user@server foo]$ touch -d "15 days ago" 15d
[user@server foo]$ touch -d "16 days ago" 16d
[user@server foo]$ ls -la
total 8
drwxr-xr-x 2 user workgroup 4096 21. Jun 15:16 .
drwxr-xr-x 5 user department     4096 21. Jun 13:58 ..
-rw-r--r-- 1 user workgroup    0 20. Jun 15:16 01d
-rw-r--r-- 1 user workgroup    0 19. Jun 15:16 02d
-rw-r--r-- 1 user workgroup    0 18. Jun 15:16 03d
-rw-r--r-- 1 user workgroup    0 17. Jun 15:16 04d
-rw-r--r-- 1 user workgroup    0 16. Jun 15:16 05d
-rw-r--r-- 1 user workgroup    0 15. Jun 15:16 06d
-rw-r--r-- 1 user workgroup    0 14. Jun 15:16 07d
-rw-r--r-- 1 user workgroup    0 13. Jun 15:16 08d
-rw-r--r-- 1 user workgroup    0 12. Jun 15:16 09d
-rw-r--r-- 1 user workgroup    0 11. Jun 15:16 10d
-rw-r--r-- 1 user workgroup    0 10. Jun 15:16 11d
-rw-r--r-- 1 user workgroup    0  9. Jun 15:16 12d
-rw-r--r-- 1 user workgroup    0  8. Jun 15:16 13d
-rw-r--r-- 1 user workgroup    0  7. Jun 15:16 14d
-rw-r--r-- 1 user workgroup    0  6. Jun 15:16 15d
-rw-r--r-- 1 user workgroup    0  5. Jun 15:16 16d
[user@server ~]$ exit
logout
Connection to server closed.

Then, on my workstation, I waited for the first run of systemd-tmpfiles-clean 15 minutes after boot. All files were still there:

[user@workstation ~]$ cd /tmp/foo/
[user@workstation foo]$ systemctl status systemd-tmpfiles-clean
● systemd-tmpfiles-clean.service - Cleanup of Temporary Directories
   Loaded: loaded (/usr/lib/systemd/system/systemd-tmpfiles-clean.service; static; vendor preset: disabled)
   Active: inactive (dead) since Fri 2019-06-21 15:25:36 CEST; 3min 2s ago
     Docs: man:tmpfiles.d(5)
           man:systemd-tmpfiles(8)
  Process: 3218 ExecStart=/usr/bin/systemd-tmpfiles --clean (code=exited, status=0/SUCCESS)
 Main PID: 3218 (code=exited, status=0/SUCCESS)
[user@workstation foo]$ uptime
 15:28:44 up 18 min,  1 user,  load average: 0,04, 0,03, 0,01
[user@workstation foo]$ ls -la
total 4
drwxr-xr-x  1 user workgroup 4096 21. Jun 15:16 .
drwxrwxrwt 17 root    root      400 21. Jun 15:20 ..
-rw-r--r--  1 user workgroup    0 20. Jun 15:16 01d
-rw-r--r--  1 user workgroup    0 19. Jun 15:16 02d
-rw-r--r--  1 user workgroup    0 18. Jun 15:16 03d
-rw-r--r--  1 user workgroup    0 17. Jun 15:16 04d
-rw-r--r--  1 user workgroup    0 16. Jun 15:16 05d
-rw-r--r--  1 user workgroup    0 15. Jun 15:16 06d
-rw-r--r--  1 user workgroup    0 14. Jun 15:16 07d
-rw-r--r--  1 user workgroup    0 13. Jun 15:16 08d
-rw-r--r--  1 user workgroup    0 12. Jun 15:16 09d
-rw-r--r--  1 user workgroup    0 11. Jun 15:16 10d
-rw-r--r--  1 user workgroup    0 10. Jun 15:16 11d
-rw-r--r--  1 user workgroup    0  9. Jun 15:16 12d
-rw-r--r--  1 user workgroup    0  8. Jun 15:16 13d
-rw-r--r--  1 user workgroup    0  7. Jun 15:16 14d
-rw-r--r--  1 user workgroup    0  6. Jun 15:16 15d
-rw-r--r--  1 user workgroup    0  5. Jun 15:16 16d

I checked the files at irregular intervals, until, after one and a half hours, files older than 14 days were missing. However, systemd-tmpfiles-clean had not been running in the meantime!

[user@workstation foo]$ uptime
 16:45:13 up  1:35,  1 user,  load average: 0,00, 0,00, 0,00
[user@workstation foo]$ ls -la
total 4
drwxr-xr-x  1 user workgroup 4096 21. Jun 16:10 .
drwxrwxrwt 17 root    root      400 21. Jun 16:21 ..
-rw-r--r--  1 user workgroup    0 20. Jun 15:16 01d
-rw-r--r--  1 user workgroup    0 19. Jun 15:16 02d
-rw-r--r--  1 user workgroup    0 18. Jun 15:16 03d
-rw-r--r--  1 user workgroup    0 17. Jun 15:16 04d
-rw-r--r--  1 user workgroup    0 16. Jun 15:16 05d
-rw-r--r--  1 user workgroup    0 15. Jun 15:16 06d
-rw-r--r--  1 user workgroup    0 14. Jun 15:16 07d
-rw-r--r--  1 user workgroup    0 13. Jun 15:16 08d
-rw-r--r--  1 user workgroup    0 12. Jun 15:16 09d
-rw-r--r--  1 user workgroup    0 11. Jun 15:16 10d
-rw-r--r--  1 user workgroup    0 10. Jun 15:16 11d
-rw-r--r--  1 user workgroup    0  9. Jun 15:16 12d
-rw-r--r--  1 user workgroup    0  8. Jun 15:16 13d
[user@workstation foo]$ systemctl status systemd-tmpfiles-clean
● systemd-tmpfiles-clean.service - Cleanup of Temporary Directories
   Loaded: loaded (/usr/lib/systemd/system/systemd-tmpfiles-clean.service; static; vendor preset: disabled)
   Active: inactive (dead) since Fri 2019-06-21 15:25:36 CEST; 1h 20min ago
     Docs: man:tmpfiles.d(5)
           man:systemd-tmpfiles(8)
  Process: 3218 ExecStart=/usr/bin/systemd-tmpfiles --clean (code=exited, status=0/SUCCESS)
 Main PID: 3218 (code=exited, status=0/SUCCESS)

Looking into journalctl -e revealed another suspect: gsd-housekeeping threw some /tmp-related error messages exactly one hour after booting:

Jun 21 16:10:44 workstation gsd-housekeepin[1968]: Failed to enumerate children of /tmp/xxx: Error opening directory '/tmp/xxx': Permission denied
Jun 21 16:10:44 workstation gsd-housekeepin[1968]: Failed to enumerate children of /tmp/systemd-private-xxx-fwupd.service-t1esvB: Error open[...]

So Gnome probably is the culprit. systemd-tmpfiles-clean and its safety measures worked as intended. Please excuse the noise, and thanks for your great work!
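
The timing can be cross-checked from the timestamps quoted above. This is a short Python sketch, assuming `uptime` rounds down to whole minutes, so the boot time is only known to within a minute:

```python
from datetime import datetime, timedelta

day = dict(year=2019, month=6, day=21)
uptime_sample = datetime(**day, hour=15, minute=28, second=44)  # "up 18 min"
tmpfiles_run = datetime(**day, hour=15, minute=25, second=36)   # service went inactive
gsd_errors = datetime(**day, hour=16, minute=10, second=44)     # journal entries

# Latest possible boot time, i.e. if exactly 18 minutes had elapsed.
boot = uptime_sample - timedelta(minutes=18)

def minutes_after_boot(moment):
    """Minutes between the estimated boot time and `moment`."""
    return (moment - boot).total_seconds() / 60

print(f"tmpfiles ran {minutes_after_boot(tmpfiles_run):.1f} min after boot")
print(f"gsd errors   {minutes_after_boot(gsd_errors):.1f} min after boot")
```

Under that assumption the cleanup service ran roughly 15 minutes after boot, matching OnBootSec=15min, while the gsd-housekeeping messages fall about 60 minutes after boot, when no tmpfiles run had taken place.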

@casipw casipw closed this as completed Jun 21, 2019
@poettering
Member

Urks. There was (is?) a bug about that, asking gsd to just drop their clean-up logic, at least on systemd-based systems. It's not just worse than tmpfiles with regard to mount point detection, it's actively detrimental to the system: their code runs unprivileged, which means they cannot use O_NOATIME, and that in turn means they keep touching the atime of all dirs they look at and thus break the aging logic in tmpfiles.

There's really no point in having a second implementation of the clean-up logic in gnome when we already have it better, and their logic even breaks the system logic we need anyway.

Gah.
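
To illustrate the O_NOATIME point: on Linux, open(2) only permits that flag when the caller owns the file or holds CAP_FOWNER, and otherwise fails with EPERM. Below is a minimal Python sketch of listing a directory while trying not to disturb its access time. It is not taken from tmpfiles or gsd, and `list_dir_quietly` is a hypothetical helper.

```python
import errno
import os

def list_dir_quietly(path):
    """List `path`, asking the kernel not to update its atime if allowed.

    Returns (names, noatime_used).
    """
    flags = os.O_RDONLY | os.O_DIRECTORY
    noatime = getattr(os, "O_NOATIME", 0)  # Linux-only flag
    used = bool(noatime)
    try:
        fd = os.open(path, flags | noatime)
    except OSError as exc:
        # EPERM: we neither own the directory nor hold CAP_FOWNER.
        if exc.errno != errno.EPERM or not noatime:
            raise
        fd = os.open(path, flags)  # fallback: this read may bump the atime
        used = False
    try:
        return sorted(os.listdir(fd)), used
    finally:
        os.close(fd)
```

An unprivileged cleaner walking directories owned by other users ends up on the fallback path, which is the atime-touching behaviour described above.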


3 participants