-
-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rework unit loading to take into account all aliases #13119
Conversation
This pull request introduces 4 alerts when merging cfa3958 into f7e7bb6 - view on LGTM.com new alerts:
|
CentOS CI (Arch in KVM) is failing with:
It seems arch might be doing something special with how rpcbind is defined... No idea. |
This pull request introduces 4 alerts when merging 1e023f5 into a505166 - view on LGTM.com new alerts:
|
@keszybz I tweaked a little the sanity boot check in the CentOS 7 job (by adding |
This pull request introduces 4 alerts when merging 2b1e0f4 into a505166 - view on LGTM.com new alerts:
|
According to the CentOS CI, systemd doesn't detect newly created units without calling TEST-03-JOBS:
TEST-12-ISSUE-3171 ...
U=/run/systemd/system/test.socket
cat <<'EOL' >$U
...
systemctl start test.socket
...
etc. |
Duh, a type ( I'm now happy with the latest version. It seems pretty clean. There's still some minor details to work out, but in general this should be mergeable. |
This pull request introduces 3 alerts when merging 6a08991 into 47685d9 - view on LGTM.com new alerts:
|
src/shared/unit-file.c
Outdated
"%s points to \"%s\" which is not a valid unit name: %m", filename, dst); | ||
if (r2 == UNIT_NAME_INSTANCE) | ||
return log_notice_errno(SYNTHETIC_ERRNO(EXDEV), | ||
"%s: unit symlink target \"%s\" has instance name, rejecting.", filename, dst); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, this is a new limitation that I am not sure I agree with? i mean, we previously said as long as the instance name matches you can have aliases, and that when searching for a file we'd always look for the name with the instance first, and without it as fallback. Why take this away? Are you sure this is nowhere used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I'll remove this limitation. (BTW, thanks, this is the kind of comments I was looking for).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I remember why this is a problem. Normally aliases are functions (in the mathematical sense, i.e. there can by only one y for any given x). This is nicely enforced by the fact that a symlink can only point one way, and once we find the symlink, we don't look for any fragments with lower priority. But once we allow symlinks for aliases, we can have a@foo.service → b@foo.service
and at the same time a@.service → c@.service
, while both b@.service
and c@service
can have fragments on disk. So are now a@.service
, b@.service
, c@.service
now all aliases? If yes, do we pick the fragment for b@.service
or c@.service
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, this can be even simpler: a@.service
may exist on disk, and b@foo.service
a@foo.service
may be a symlink to b@foo.service
. Now we have two candidate fragments again.
Maybe we should allow this, check that the instance alias matches the main alias, but warn?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say the rule is that if a@foo.service
→ b@foo.service
exists that a@.service
is never itself read (but the drop-ins for that are.)
i.e. in your example, I think we should read (and in this order):
b@foo.service
a@.service.d/*.conf
a@foo.service.d/*.conf
b@.service.d/*.conf
b@foo.service.d/*.conf
But not:
a@.service
(i.e. that this is a symlink is irrelevant to us, we already found a unit with thea@foo.service
chain)c@.service
c@foo.service
c@foo.service.d/*.conf
c@.service.d/*.conf
i.e. we only collect the drop-ins on the symlink chain we actually follow, and we stop searching as soon as we found the first unit file that is not a symlink. that means we'd never be tempted to resolve a@.service
at all, since we already found a unit file by traversing a@foo.service
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dunno, the fact that we read a@.service.d/*.conf
but not a@.service
seems very arbitrary. Before, it didn't matter if a setting was defined in the fragment or in the drop-in (apart from ordering issues). But now the dropins start living a life independent of the main unit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I can't convince myself that this is OK.
We either consider the instanced unit b@foo
independent of the template a@.service
, or not.
If the first, then we shouldn't load a@.service
or a@.service.d
. If the second, we should load both.
In this part of the code: systemd/test/TEST-15-DROPIN/test-dropin.sh Lines 149 to 150 in 6a08991
With appropriate description here: systemd/test/TEST-15-DROPIN/test-dropin.sh Lines 132 to 139 in 6a08991
|
This pull request introduces 3 alerts when merging f74e4b4 into 6fd79cc - view on LGTM.com new alerts:
|
This pull request introduces 3 alerts when merging bb0e86a into 6fd79cc - view on LGTM.com new alerts:
|
src/shared/unit-file.c
Outdated
if (src_name_type < 0) | ||
return log_notice_errno(src_name_type, | ||
"%s: not a valid unit name \"%s\": %m", filename, src); | ||
if (src_name_type == UNIT_NAME_INSTANCE) | ||
return log_notice_errno(SYNTHETIC_ERRNO(EINVAL), | ||
"%s: unit symlink has instance name, rejecting.", filename); | ||
|
||
src_type = unit_name_to_type(src); | ||
assert(src_type >= 0); /* unit_name_classify() checked the suffix already */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment needs update
Without this, repeated runs of "make -C TEST/... setup" fail when trying to create the symlink.
I adjusted the tests to pass. I don't think the behaviour makes much sense, even if we ignore the issue with "lazy loading" of aliases. E.g. in the last section, the fact that dropins for yup@.service and yup@3.service are not loaded seems to be a plain old bug.
It turns out most possible symlinks are invalid, because the type has to match, and template units can only be linked to template units. I'm not sure if the existing code made the same checks consistently. At least I don't see the same rules expressed in a single place.
This reworks how we load units from disk. Instead of chasing symlinks every time we are asked to load a unit by name, we slurp all symlinks from disk and build two hashmaps: 1. from unit name to either alias target, or fragment on disk (if an alias, we put just the target name in the hashmap, if a fragment we put an absolute path, so we can distinguish both). 2. from a unit name to all aliases Reading all this data can be pretty costly (40 ms) on my machine, so we keep it around for reuse. The advantage is that we can reliably know what all the aliases of a given unit are. This means we can reliably load dropins under all names. This fixes systemd#11972.
I'm not convinced that this is useful enough to be included... But it is certainly nice when debugging.
v2: - do not watch mtime of transient and generated dirs We'd reload the map after every transient unit we created, which we don't need to do, since we create those units ourselves and know their fragment path.
Useful for manual debugging.
Another update. I added an extensive test to verify the unit loading behaviour. It is first added in a version that passes before any changes, and then when the code is changed, the test is updated again. Somewhat surprisingly, the results for the old code don't make much sense. The final result is not quite as you describe in #13119 (comment) (difference is crossed out):
But not:
My thinking is: if we aliased a unit "away" from some template, this unit does not use this template, and we load neither the template fragment, nor any dropins declared for the template. OTOH, we load both the fragment and the dropins for the new template. Of course, we load all dropins for the instance, under all names. This means that declaring something directly in the template fragment and in the template dropin is equivalent. I corrected some minor bugs on the way and cleaned up the debugging stmts and FIXMEs. (One remains, but that's something I'm leaving for later). PTAL. |
This pull request introduces 1 alert when merging 8027654 into 2d1b928 - view on LGTM.com new alerts:
|
hmm, lots's of CI fialures? |
It seems to be timeouts, e.g. on systemd-update-hwdb.service. I restarted the tests. |
|
|
@keszybz @poettering could you at least cc @ddstreet when something like this happens? It looks like that particular issue can be fixed by passing Another option (especially after #13221) would be to stop pretending anyone looks at Ubuntu CI when PRs are opened and simply turn it off. |
@keszybz I'm not sure how you restarted Ubuntu CI but it seems the tests were run against the "master" branch of the Debian package where @mbiebl reverted the commit where |
Oh, I just use the |
Yes please do - I'm happy to look at any Ubuntu CI failures/problems. I'm working on getting the Ubuntu CI more stable/reliable, as well. |
This mostly reuses existing checkers used by pid1, so handling of aliases should be consistent. Hopefully, with the test it'll be clearer what it happening. Support for .wants/.requires "aliases" is restored. Those are still used in the wild quite a bit, so we need to support them. See systemd#13119 for a discussion of aliases with an instance that point to a different template: this is allowed.
This mostly reuses existing checkers used by pid1, so handling of aliases should be consistent. Hopefully, with the test it'll be clearer what it happening. Support for .wants/.requires "aliases" is restored. Those are still used in the wild quite a bit, so we need to support them. See systemd#13119 for a discussion of aliases with an instance that point to a different template: this is allowed.
This mostly reuses existing checkers used by pid1, so handling of aliases should be consistent. Hopefully, with the test it'll be clearer what it happening. Support for .wants/.requires "aliases" is restored. Those are still used in the wild quite a bit, so we need to support them. See systemd#13119 for a discussion of aliases with an instance that point to a different template: this is allowed.
Only the last 7 patches are new, the rest is the same as in #13096. I'll keep rebasing this until the other one is merged.For #11972.