v239 failure to establish watch on symbolic link of /etc/localtime #9602

Closed
gvlx opened this Issue Jul 16, 2018 · 11 comments

Comments

3 participants
@gvlx

gvlx commented Jul 16, 2018

systemd version the issue has been seen with

v239

Used distribution

Archlinux ARM

Expected behaviour you didn't see

normal startup

Unexpected behaviour you saw

[ 14.429708] systemd[1]: Detected architecture arm.
[ 14.466434] systemd[1]: Set hostname to .
[ 14.519163] systemd[1]: Failed to create timezone change event source: Too many levels of symbolic links
[ 14.535323] systemd[1]: Failed to allocate manager object: Too many levels of symbolic links
[ 14.556312] systemd[1]: Freezing execution.

Steps to reproduce the problem

  • Install v239
  • timedatectl set-timezone UTC
  • restart

Cause
Commit https://github.com/systemd/systemd/blame/1e75824cb005f7fa0089792e45c2747c4d059601/src/core/manager.c#L424 does not follow symbolic links properly.

Workaround

  • remove disk from original machine and mount it on another system
  • # mount /dev/sdXY /mnt
  • # ls -l /mnt/etc/ | grep time

lrwxrwxrwx 1 root root 25 Jul 1 22:26 localtime -> ../usr/share/zoneinfo/UTC

  • # rm /mnt/etc/localtime
  • # cp /mnt/usr/share/zoneinfo/UTC /mnt/etc/localtime
  • # umount /mnt
  • remove disk from other system and mount it back on original machine
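The manual steps above could also be wrapped in a small script. This is only a minimal sketch, not a tested recovery tool: the function name replace_localtime_symlink and its single argument (the mount point of the affected root file system) are made up for illustration, and it assumes the link target lives on that same file system.

```shell
#!/bin/sh
# Hypothetical helper: replace ROOT/etc/localtime (a symlink) with a plain
# copy of the zone file it points to, mirroring the manual rm + cp steps.
replace_localtime_symlink() {
    root=$1
    link="$root/etc/localtime"

    if [ ! -L "$link" ]; then
        echo "$link is not a symlink, nothing to do" >&2
        return 1
    fi

    target=$(readlink "$link")
    case $target in
        /*) src="$root$target" ;;        # absolute target: resolve inside ROOT
        *)  src="$root/etc/$target" ;;   # relative target: resolve from ROOT/etc
    esac

    if [ ! -f "$src" ]; then
        echo "zone file $src not found" >&2
        return 1
    fi

    # Same as the manual workaround: drop the link, copy the zone file in.
    rm "$link" && cp "$src" "$link"
}
```

With the disk mounted on /mnt as in the steps above, this would be called as: replace_localtime_symlink /mnt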

@gvlx gvlx changed the title from "v239 system freeze on startup because of /etc/localtime symbolic link" to "v239 system freeze on startup caused by symbolic link of /etc/localtime" Jul 17, 2018

@poettering poettering added the bug 🐛 label Jul 18, 2018

@poettering poettering added this to the v240 milestone Jul 18, 2018

@poettering

Member

poettering commented Oct 13, 2018

Hmm, I am a bit puzzled by this one. I don't see how we could ever get ELOOP (which is what "Too many levels of symbolic links" means) here, as we pass IN_DONT_FOLLOW to inotify_add_watch(), as you can see in the sources you linked. This means we watch the link itself instead of what it points to, which is exactly what we want here. We expect this to be a symlink.

If inotify_add_watch() returns ELOOP then this can only happen afaics if within the path /etc/localtime there's a symlink cycle or so (maybe /etc itself?), or maybe that the kernel is too old (what is it?)

I mean, I run with /etc/localtime being a symlink myself too, it's what we really push people to do, it's what timedated actually sets things up as, hence I am really confused by this, and can't see a way how this could be triggered.

Is this reproducible for you? Is there anything interesting regarding setup of /etc or the file system in general?

(That said, even if this fails, we really shouldn't bail out on it. Will prep a patch for that shortly. But this will only fix the effect of the ELOOP issue, not the ELOOP issue itself, and if this is reproducible I'd really like to fix that too, but first need to understand that.)
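The symlink-cycle hypothesis raised in this comment could be checked from a shell before rebooting. The following is a rough sketch under some assumptions: the function name list_symlinks_in_path is made up for illustration, the path is expected to be absolute, and a readlink that supports -f (as in GNU coreutils or BusyBox) is assumed to be available. It only inspects the path as written and says nothing about what inotify_add_watch() does internally.

```shell
#!/bin/sh
# Hypothetical diagnostic: list every symlink among the components of an
# absolute path, then check whether the whole path can still be resolved.
list_symlinks_in_path() {
    path=$1
    cur=""

    set -f                      # no globbing while splitting the path
    old_ifs=$IFS
    IFS=/
    set -- $path                # split into components on "/"
    IFS=$old_ifs
    set +f

    for part in "$@"; do
        [ -n "$part" ] || continue
        cur="$cur/$part"
        if [ -L "$cur" ]; then
            printf '%s -> %s\n' "$cur" "$(readlink "$cur")"
        fi
    done

    # readlink -f follows every link on the way and fails on a symlink cycle
    # (it also fails if an intermediate component is missing).
    if ! readlink -f "$path" >/dev/null 2>&1; then
        echo "cannot resolve $path (symlink cycle or missing component)"
        return 1
    fi
}
```

For the case discussed here it would be run as: list_symlinks_in_path /etc/localtime. On a disk mounted elsewhere, absolute link targets would resolve against the host system rather than the mounted root, so the result needs to be read with that in mind.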

@poettering

Member

poettering commented Oct 13, 2018

(oh, and for me with /etc/localtime being exactly the symlink you suggest everything works fine and perfectly)

one more question: is /usr split into a separate file system for you, or something like that?

@poettering

Member

poettering commented Oct 13, 2018

hmm, any chance you can strace this somehow?

@poettering

Member

poettering commented Oct 13, 2018

#10392 addresses the freeze issue.

@keszybz keszybz changed the title from "v239 system freeze on startup caused by symbolic link of /etc/localtime" to "v239 ELOOP caused by symbolic link of /etc/localtime" Oct 15, 2018

@keszybz keszybz changed the title from "v239 ELOOP caused by symbolic link of /etc/localtime" to "v239 failure to establish watch on symbolic link of /etc/localtime" Oct 15, 2018

@zonidjan

zonidjan commented Oct 18, 2018

This happened to me after upgrading Debian to unstable (from Jessie or perhaps Stretch). No separate filesystems or anything. I'm happy to put the symlink back in place and strace if you can tell me how to strace it, given that it occurs at (and prevents) boot.

BTW, the only error printed in my case was "Failed to allocate manager object: Too many levels of symbolic links.". I'm not sure if that's because I wasn't passing some boot options or such.

@poettering

Member

poettering commented Oct 18, 2018

> This happened to me after upgrading Debian to unstable (from Jessie or perhaps Stretch). No separate filesystems or anything. I'm happy to put the symlink back in place and strace if you can tell me how to strace it, given that it occurs at (and prevents) boot.

hmm, that depends. one simple way might be to replace /usr/sbin/init with a shell script that does something like this:

#!/bin/sh
exec strace -s 500 -o /tmp/init.strace -D /usr/sbin/init.real "$@"

Under the assumption you moved the real init binary to /usr/sbin/init.real first, and installed strace.

@poettering

Member

poettering commented Oct 18, 2018

(and the -D switch to strace is what is relevant here, as it makes sure that strace gets a PID != 1 and the real init remains PID 1 — normally strace starts processes as its own children which wouldn't work for an init system which must be PID 1)
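The setup described in the two comments above (move the real init aside, then drop the wrapper in its place) could look roughly like the sketch below. The function name install_strace_wrapper and its ROOT argument are made up for illustration. It assumes strace is already installed on the target system and that /tmp is writable at the point where init starts, which may not hold this early in boot.

```shell
#!/bin/sh
# Hypothetical helper: move ROOT/usr/sbin/init aside and install the strace
# wrapper from the comment above in its place. ROOT is the mount point of the
# affected root file system (use "" for the running system).
install_strace_wrapper() {
    root=$1
    init="$root/usr/sbin/init"

    if [ -e "$init.real" ] || [ -L "$init.real" ]; then
        echo "$init.real already exists, refusing to overwrite" >&2
        return 1
    fi

    mv "$init" "$init.real" || return 1

    # Quoted heredoc delimiter: "$@" is written literally into the wrapper.
    cat > "$init" <<'EOF'
#!/bin/sh
exec strace -s 500 -o /tmp/init.strace -D /usr/sbin/init.real "$@"
EOF
    chmod 755 "$init"
}
```

To undo it, move init.real back over init. Because a broken init can leave the machine unbootable, trying this on a disk mounted from another system (for example install_strace_wrapper /mnt) is probably the safer route.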

@poettering poettering removed this from the v240 milestone Oct 23, 2018

@poettering

Member

poettering commented Oct 23, 2018

Let's drop this from the v240 milestone for now, as this is currently not actionable

keszybz added a commit to keszybz/systemd that referenced this issue Oct 28, 2018

fpletz pushed a commit to NixOS/systemd that referenced this issue Oct 31, 2018

@zonidjan

zonidjan commented Jan 23, 2019

FYI, I got a kernel panic trying to strace it this way, but I don't have time to investigate further.

@poettering

Member

poettering commented Feb 26, 2019

Closing due to lack of response.

@poettering poettering closed this Feb 26, 2019

@gvlx

Author

gvlx commented Mar 5, 2019

v241.7-2 (archlinux armv7l with kernel 3.5.7-14) seems to work correctly.
