Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

systemd-time-wait-sync.service seems to be failing to start inside user namespace containers #8535

Closed
evverx opened this issue Mar 21, 2018 · 9 comments
Labels
Milestone

Comments

@evverx
Copy link
Member

evverx commented Mar 21, 2018

I haven't read #8494 where systemd-time-wait-sync.service was introduced yet, so I'm not sure whether it is a bug or not:

bash-4.4# systemd-detect-virt
systemd-nspawn

bash-4.4# systemd-detect-virt --private-users
bash-4.4# echo $?
0

bash-4.4# systemctl --failed --no-pager
  UNIT                           LOAD   ACTIVE SUB    DESCRIPTION
● systemd-time-wait-sync.service loaded failed failed Wait Until Kernel Time Synchronized

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.

1 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.

bash-4.4# journalctl -b | grep -i time-wait-sync
Mar 20 13:34:27 systemd-testsuite systemd[23]: systemd-time-wait-sync.service: Executing: /usr/lib/systemd/systemd-time-wait-sync
Mar 20 13:34:27 systemd-testsuite systemd-time-wait-sync[23]: Failed to read adjtimex state: Operation not permitted
@poettering
Copy link
Member

Hmm, I figure adjitemex() will fail in various container setups. Let's handle that gracefully, and simply assume that if we get EPERM or EACCES on adjtimex() we run inside a container, and the clock is set correctly anyway.

@pabigot any chance you can hack up a quick patch for this?

Maybe add a brief log_debug() message explaining the assumption when we exit early if adjtimex() fails like this.

@poettering poettering added this to the v239 milestone Mar 21, 2018
@pabigot
Copy link
Contributor

pabigot commented Mar 21, 2018

Would disabling it in a container as is done with timesyncd be the right solution? I'll take a look...

@poettering
Copy link
Member

@pabigot we generally want that our images likely boot as well on are metal as in VMs or containers. And that means for auxiliary stuff like this we should handle things gracefully. Of course if some major part of a system can't work in a container, by all means it should fail, but this isn't like that. It's more auxiliary.

@pabigot
Copy link
Contributor

pabigot commented Mar 21, 2018

From memory, it is in fact EPERM that shows up in this case.

I'm very naive in system.unit configuration, but I'm looking at the various extra stuff that timesyncd has in its unit, and should be able to make something work.

@poettering
Copy link
Member

That said, maybe we should not solve this in code but simply via a condition:

ConditionCapability=CAP_SYS_TIME
ConditionVirtualization=!container

The same two conditions we already have in timesyncd. I think we should add this here too. It would mean the service is skipped silently if CAP_SYS_TIME is missing or we run in a container, which really is the behaviour we want here.

Fixing the unit file like this should be trivial.

@pabigot
Copy link
Contributor

pabigot commented Mar 21, 2018

Great; that's exactly what I was looking at.

@poettering
Copy link
Member

From memory, it is in fact EPERM that shows up in this case.

Well, container managers might block adjtimes via various different ways, for example via seccomp. If they do, then all bets are off, they might as well use EACCES… But yes you are right the caps/userns based blocking would mean we'd get EPERM.

but anyway, this is moot, I am pretty sure we should go the condition way.

@pabigot
Copy link
Contributor

pabigot commented Mar 21, 2018

FWIW, I don't believe CAP_SYS_TIME should be necessary for this service. adjtimex(2) is never invoked by it with any flags that attempt to change state; the way it's invoked is normally allowed for unprivileged processes.

@poettering
Copy link
Member

FWIW, I don't believe CAP_SYS_TIME should be necessary for this service. adjtimex(2) is never invoked by it with any flags that attempt to change state; the way it's invoked is normally allowed for unprivileged processes.

It's not necessary. However, given that the ConditionCapability checks PID1's caps, and given that timesyncd has the same condition, we know that there's no point in waiting for an NTP sync if the NTP sync can't happen anyway. Hence, even though this service won't need CAP_SYS_TIME itself, it's still a good check

evverx added a commit to evverx/systemd that referenced this issue Aug 27, 2018
This should make it much easier to catch regressions like
systemd#9914 and
systemd#8535.
yuwata pushed a commit that referenced this issue Aug 30, 2018
This should make it much easier to catch regressions like
#9914 and
#8535.
mrc0mmand pushed a commit to mrc0mmand/systemd-rhel that referenced this issue Mar 10, 2019
This should make it much easier to catch regressions like
systemd/systemd#9914 and
systemd/systemd#8535.

(cherry picked from commit 746fbd9)
mrc0mmand pushed a commit to lnykryn/systemd-rhel that referenced this issue Mar 10, 2019
This should make it much easier to catch regressions like
systemd/systemd#9914 and
systemd/systemd#8535.

(cherry picked from commit 746fbd9)
mrc0mmand pushed a commit to mrc0mmand/rhel-8 that referenced this issue Feb 10, 2020
This should make it much easier to catch regressions like
systemd/systemd#9914 and
systemd/systemd#8535.

(cherry picked from commit 746fbd9)
(cherry picked from commit 91bd0b9)
mrc0mmand pushed a commit to mrc0mmand/rhel-8 that referenced this issue Feb 11, 2020
This should make it much easier to catch regressions like
systemd/systemd#9914 and
systemd/systemd#8535.

Related: #1794787

(cherry picked from commit 746fbd9)
(cherry picked from commit 91bd0b9)
systemd-rhel-bot pushed a commit to redhat-plumbers/systemd-rhel8 that referenced this issue Feb 12, 2020
This should make it much easier to catch regressions like
systemd/systemd#9914 and
systemd/systemd#8535.

Related: #1794787

(cherry picked from commit 746fbd9)
(cherry picked from commit 91bd0b9)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

3 participants