Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set DynamicUser=no for networkd, resolved, timesyncd #10117

Merged
merged 6 commits into from
Oct 5, 2018

Conversation

keszybz
Copy link
Member

@keszybz keszybz commented Sep 18, 2018

Dynamic users don't work well if the service requires matching configuration in
other places, for example dbus policy. This is true for those three services.
In effect, distros create the user statically [1, 2]. Dynamic users make more sense
for "add-on" services where not creating the user, or more precisely, creating the
user lazily, can save resources. For "basic" services, if we are going to create the
user on package installation anyway, setting DynamicUser= just creates unneeded
confusion. The only case where it is actually used is when somebody forgets to
do system configuration. But it's better to have the service fail cleanly in
this case too. If we want to turn on some side-effect of DynamicUser=yes for those
services, we should just do that directly through fine-grained options.

[1] https://salsa.debian.org/systemd-team/systemd/commit/bd9bf307274faca24699c0c2d67cb86f18c0b2cb
[2] https://src.fedoraproject.org/rpms/systemd/blob/48ac1cebdedb055d9daf3dfe28c7bde80103f7a1/f/systemd.spec#_473
(Fedora does not create systemd-timesync user.)

Fixes #10025, possibly helps with #9700.

@evverx
Copy link
Member

evverx commented Sep 18, 2018

It's unlikely to help with #9700, but I'm pretty sure it fixes #10011.

@evverx
Copy link
Member

evverx commented Sep 18, 2018

To be precise, It partially fixes #10011 because the other services fail to start due to unsuccessful bind mounts. @mbiebl could you run the tests to see if at least networkd-test.py and timedated pass?

@yuwata
Copy link
Member

yuwata commented Sep 18, 2018

I thought dropping static users for the services actually causes several bad situations. So I've dropped that. But, is it necessary to disable DynamicUser= from those services? I guess #9700 and #10011 can be solved by 'correctly' configuring LXC or AppArmor. Also #10025 can be, at least partially, fixed by #10115, though, I know you do not like the PR...

I respect your decision, but that is a sad news for me...

@yuwata yuwata added the units label Sep 18, 2018
@yuwata
Copy link
Member

yuwata commented Sep 18, 2018

Please do not just disable DynamicUser= but revert d4e9e57, 0187368, 48d3e88 and be80154.

@evverx
Copy link
Member

evverx commented Sep 18, 2018

'correctly' configuring LXC or AppArmor

@yuwata I think the default AppArmor profile blocking fishy operations is good enough and it's unclear (at least to me) why it should be relaxed just to make unnecessary (in that particular environment) features work again to simply start the services that had always worked before.

@keszybz keszybz added the reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks label Sep 18, 2018
@yuwata
Copy link
Member

yuwata commented Sep 18, 2018

@evverx Hmm... Good point.

BTW, I do not think this 'fixes' #10025. I do not strongly disagree with this, disabling DynamicUser= for the basic services. But, DynamicUser= itself is a useful concept especially for container environments, no? And also StateDirectory= or friends are useful settings especially with DynamicUser=. But these may not be possible to be combined on some environment. I think we should find a way (a workaround?) to set both setting simultaneously.

@yuwata
Copy link
Member

yuwata commented Sep 18, 2018

What do you think about introducing a new setting, e.g., MountNamespace=yes, auto, no?
When it is set to yes, then mount namespace setup failure is critical. If auto, some failures are ignored to work both fully privileged environment and some unprivileged containers. And DynamicUser=yes implies MountNamespace=auto.

@evverx
Copy link
Member

evverx commented Sep 18, 2018

@yuwata I agree they all can be helpful and that they can even be combined seamlessly sometimes, but I'm not sure if it's possible to make them work everywhere without losing what makes them useful. The workarounds I personally was able to come up with are unsatisfactory in the sense that they are all based on ignoring errors, but I strongly believe that's not the right thing to do.

Regarding MountNamespace, it would help to solve some internal systemd issues. Though, it's unclear to me why it would be useful generally.

@yuwata
Copy link
Member

yuwata commented Sep 18, 2018

Note for timesyncd. For timesyncd, DynamicUser= was enabled slightly earlier than networkd or resolved. And if I remember correctly, no critical issue is posted. Maybe because usually it cannot be used in containers. So, I think it may not necessary to disable DynamicUser= for timesyncd. One more point, if we will drop DynamicUser= from timesyncd, then rpm or other package need some care to 'downgrade' private state directory.

@evverx
Copy link
Member

evverx commented Sep 18, 2018

#9583 was serious enough, but I suppose it was just decided that it would be better to keep relaxing the SELinux policy even further.

@evverx
Copy link
Member

evverx commented Sep 18, 2018

On second thought, removing just DynamicUser won't magically make everything else work everywhere and will most likely break something. I'm inclined to say that at this point it would probably be better and easier to leave everything as is. Those who think that they need these features will adapt their security policies and stuff like that. Others just manually clean up the unit files as I assume they usually do. The only thing is to somehow mention (maybe, in the documentation) that those settings aren't quite portable and YMMV.

This reverts commit 48d3e88.

I kept the follow-symlink=false → follow-symlink=true change instact, since
we're likely to have existing installations with a symlink now.
@keszybz
Copy link
Member Author

keszybz commented Sep 19, 2018

Please do not just disable DynamicUser= but revert d4e9e57, 0187368, 48d3e88 and be80154.

Thanks for this list. I now did proper reverts.

DynamicUser= itself is a useful concept especially for container environments, no?

For me, DynamicUser= is not related in any way to containers (implementation troubles aside). In fact, I would expect a container to be custom-made for a service, so pre-creating the user there is simpler than in a normal machine. DynamicUser= makes the most sense in long-lived installations where packages may be added and removed and unused users will accumulate. I think container images need to be constructed as carefully as any installation, so there's no reason that setup functions will be skipped in a container installation.

On second thought, removing just DynamicUser won't magically make everything else work everywhere and will most likely break something. I'm inclined to say that at this point it would probably be better and easier to leave everything as is.

Not a good sign. If we (core systemd contributors) have trouble getting a grasp of the situation, that means that we reached an unmaintainable level of complexity. I think DynamicUser= is a great concept, but it just doesn't seem to be a good fit for basic system services. I don't want to give up on the idea that basic systemd installation should be portable. Every time we have a basic systemd service which needs to be tweaked between Debian/Ubuntu/Fedora/Arch, or between a container and VM and real machine, or between AA/SELinux/Tomoyo, ..., we should treat this as a bug. Possibly not our bug, and not something to fix immediately, but certainly a wart to be removed at some point.

I'm pushing an updated version and I'll do some testing of upgrades on previous installations.

@evverx
Copy link
Member

evverx commented Sep 19, 2018

I'll just try to clarify what I was trying to say. I don't think that removing just DynamicUser will make the basic services portable because, well, every setting breaks something every time it's introduced or its functionality is extended, the most recent example being ProtectKernelTunables preventing some services from starting. So, basically, there were two options I was thinking about: to leave everything as is or to remove all the settings. I picked the first one because at this point it would be least painful. I wouldn't be opposed to the second one either, but, as far as I know, nobody is going to stop using everything.

@keszybz
Copy link
Member Author

keszybz commented Sep 19, 2018

So this doesn't seem to cause any great problems. When upgrading on a system that already has the users statically defined, the directories are already owned by the right user. On Fedora, systemd-timesync user did not exist, so the state directory has wrong ownership and timesyncd cannot write its state file. It's a minor glitch, and anyway, this should be solved by a %postinstall scriptlet in the rpm. So upgradability looks OK.

@evverx
Copy link
Member

evverx commented Sep 20, 2018

networkd-test.py seems to have started to fail. From what I can see in the log, the test is having trouble running systemd-resolved:

Sep 19 11:18:48 autopkgtest systemd[718]: systemd-resolved.service: Executing: /lib/systemd/systemd-resolved
Sep 19 11:18:49 autopkgtest systemd-resolved[718]: Directory "/run/systemd/resolve" already exists, but is owned by 0:0 (101:103 was requested), refusing.
Sep 19 11:18:49 autopkgtest systemd-resolved[718]: Could not create runtime directory: File exists

@evverx evverx added the ci-fails/needs-rework 🔥 Please rework this, the CI noticed an issue with the PR label Sep 20, 2018
@evverx
Copy link
Member

evverx commented Sep 20, 2018

Apart from networkd-test.py, I'd also wait for @mbiebl to run the tests in a real LXC container because I have a suspicion that even with this series of patches applied, at least systemd-hostnamed won't start. I figure the unit files should be trimmed even further to make the basic services work.

@keszybz
Copy link
Member Author

keszybz commented Sep 20, 2018

I pushed with what-I-hope-is a fix for networkd-test.py.

@mbiebl
Copy link
Contributor

mbiebl commented Sep 20, 2018

@evverx since running the autopkgtests under LXC/AA is pretty much broken in git master atm, I don't think this PR will make it worse

@yuwata yuwata removed ci-fails/needs-rework 🔥 Please rework this, the CI noticed an issue with the PR reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks labels Sep 20, 2018
@evverx
Copy link
Member

evverx commented Sep 20, 2018

@mbiebl the point, as I see it, is to make things better. It seems a little bit strange to merge PRs that don't really fix anything and are based on the assumption that just DynamicUser is problematic.

@evverx
Copy link
Member

evverx commented Sep 20, 2018

@mbiebl more precisely, this PR is supposed to make all the tests except test-execute pass. Note "fixes #10011" in the commit message.

@mbiebl
Copy link
Contributor

mbiebl commented Sep 20, 2018

Under unconfined LXC, currently only test-bpf is failing. This PR doesn't change that

Under AA-confined LXC, this PR doesn't seem to help fixing any failures I see:
log.txt

@evverx
Copy link
Member

evverx commented Sep 20, 2018

@mbiebl thank you. That means that, as I expected, just removing DynamicUser isn't enough to make the basic services work in confined LXC containers. Though, frankly, I expected that only systemd-hostnamed wouldn't start.

This reverts commit 0187368.
(systemd.conf.m4 part was already reverted in 5b5d826.)
This reverts commit d4e9e57.
(systemd.conf.m4 part was already reverted in 5b5d826.)

Together those reverts should "fix" systemd#10025 and systemd#10011. ("fix" is in quotes
because this doesn't really fix the underlying issue, which is combining
DynamicUser= with strict container sandbox, but it avoids the problem by not
using that feature in our default installation.)

Dynamic users don't work well if the service requires matching configuration in
other places, for example dbus policy. This is true for those three services.
In effect, distros create the user statically [1, 2]. Dynamic users make more
sense for "add-on" services where not creating the user, or more precisely,
creating the user lazily, can save resources. For "basic" services, if we are
going to create the user on package installation anyway, setting DynamicUser=
just creates unneeded confusion. The only case where it is actually used is
when somebody forgets to do system configuration. But it's better to have the
service fail cleanly in this case too. If we want to turn on some side-effect
of DynamicUser=yes for those services, we should just do that directly through
fine-grained options. By not using DynamicUser= we also avoid the need to
restart dbus.

[1] https://salsa.debian.org/systemd-team/systemd/commit/bd9bf307274faca24699c0c2d67cb86f18c0b2cb
[2] https://src.fedoraproject.org/rpms/systemd/blob/48ac1cebdedb055d9daf3dfe28c7bde80103f7a1/f/systemd.spec#_473
(Fedora does not create systemd-timesync user.)
@keszybz
Copy link
Member Author

keszybz commented Sep 20, 2018

I pushed yet another version to get networkd-test.py to pass. (Previous version chown-ned /run/systemd/resolve, but did not change the group. This one does both.)

Please note that I never said that the PR is motivated by the tests. My concern is with having a setting which is effectively a noop under normal installations and the resulting complexity. If the tests are fixed, or just require less special setup, this is only a pleasant side effect.

@keszybz
Copy link
Member Author

keszybz commented Sep 20, 2018

Hmm, Fedora Rawhide CI has + exit 4. Not sure what's going on here.

@evverx
Copy link
Member

evverx commented Sep 20, 2018

@keszybz thank you for the clarification. I seem to have missed the point entirely. Sorry about that.

@poettering
Copy link
Member

Hmm so, what's the status quo here? Of course I find it a bit disappointing if we have to turn DynamicUser= off again for these services, but then again, I must acknowledge that there isn't the biggest benefit in turning it on for three static system services, so I am inclined to merge this patchset.

Does everybody agree on this now? If not, I'll just hit the green button...

@yuwata
Copy link
Member

yuwata commented Oct 5, 2018

Now I agree with this PR.

@poettering poettering merged commit dacd723 into systemd:master Oct 5, 2018
@poettering
Copy link
Member

ok, merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

5 participants