New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refuse to load some units #6300

Merged
merged 3 commits into from Jul 12, 2017

Conversation

4 participants
@keszybz
Member

keszybz commented Jul 6, 2017

Be nice to users which make syntax error in various crucial directives, and refuse to start the unit at all. This seems to be a better solution then starting the unit and having it execute only a subset of commands or under a different directory / user / group / root.

This only works if the configuration option name is spelled properly, and is located in the right section, so it's not the panacea for all or even most configuration syntax errors.

We might consider doing something similar e.g. for [Match] sections in .network files — if there are [Match] directives present, but some of them failed to parse, refuse the whole unit.

@keszybz keszybz added the pid1 label Jul 6, 2017

@noloader

This comment has been minimized.

Show comment
Hide comment
@noloader

noloader Jul 6, 2017

It may be prudent or useful to be able to designate some options as "critical" to ensure systemd stops when instead of continuing with a warning. There's nothing new or revolutionary about the "critical" concept. X.509 uses it for some options.

noloader commented Jul 6, 2017

It may be prudent or useful to be able to designate some options as "critical" to ensure systemd stops when instead of continuing with a warning. There's nothing new or revolutionary about the "critical" concept. X.509 uses it for some options.

@tixxdz

This comment has been minimized.

Show comment
Hide comment
@tixxdz

tixxdz Jul 9, 2017

Member

@keszybz thanks! I think we have to document this new behaviour and list the directives that will fail. Patch lgtm! the xenial fail seems unrelated but I can re-check later.

Member

tixxdz commented Jul 9, 2017

@keszybz thanks! I think we have to document this new behaviour and list the directives that will fail. Patch lgtm! the xenial fail seems unrelated but I can re-check later.

@keszybz

This comment has been minimized.

Show comment
Hide comment
@keszybz

keszybz Jul 9, 2017

Member

Do you really think this needs documenting? We didn't document previous behaviour…, and I'm not sure it's easy to draw a meaningful distinction between syntax error and value errors for the user. What would the docs would say? "If a syntax error occurs in the following directives (ExecStart=, ExecStartPre=, ExecStartPost=, ExecReload=, RootDirectory=, etc, etc) the unit will not be started. This list is not final and might change at any time."?

Member

keszybz commented Jul 9, 2017

Do you really think this needs documenting? We didn't document previous behaviour…, and I'm not sure it's easy to draw a meaningful distinction between syntax error and value errors for the user. What would the docs would say? "If a syntax error occurs in the following directives (ExecStart=, ExecStartPre=, ExecStartPost=, ExecReload=, RootDirectory=, etc, etc) the unit will not be started. This list is not final and might change at any time."?

@tixxdz

This comment has been minimized.

Show comment
Hide comment
@tixxdz

tixxdz Jul 9, 2017

Member

@keszybz hmm you are right and we already document that the prefix "-" that covers these changes downgrades what should be a fatal to a non fatal, thanks!

Member

tixxdz commented Jul 9, 2017

@keszybz hmm you are right and we already document that the prefix "-" that covers these changes downgrades what should be a fatal to a non fatal, thanks!

@poettering

This comment has been minimized.

Show comment
Hide comment
@poettering

poettering Jul 11, 2017

Member

This looks mostly OK, but I don't like the change for the PrivateTmp=, PrivateNetwork=, PrivateUsers= stuff, and I think that part should be dropped. Note that in the past we added additional values to existing settings just like these, and I think we should be able to do that in the future too for these specific ones. In particular for PrivateNetwork= or PrivateUsers= I see that we might want to add new values (think: PrivateNetwork=down or so, which could have the same effect as "yes", but actually set the iface down or so, so that apps can't even bind to it).

Moreover, these three options are designed to provide sandboxing where we can but we already degrade them gracefully if we can't, for example because we are running in a container (or on a kernel compiled without namespace) or so, where namespacing is unavailable. These sandboxing concepts are really designed to be "best-effort", i.e. take effect if the execution environment allows that, but otherwise really degrade gracefully, hence I think it's also OK if we skip them on syntax errors, as long as we continue to log about this case.

None of these three options really alter the execution environment drastically, all they do is make a bit more stuff unavailable. I hope that makes sense?

Member

poettering commented Jul 11, 2017

This looks mostly OK, but I don't like the change for the PrivateTmp=, PrivateNetwork=, PrivateUsers= stuff, and I think that part should be dropped. Note that in the past we added additional values to existing settings just like these, and I think we should be able to do that in the future too for these specific ones. In particular for PrivateNetwork= or PrivateUsers= I see that we might want to add new values (think: PrivateNetwork=down or so, which could have the same effect as "yes", but actually set the iface down or so, so that apps can't even bind to it).

Moreover, these three options are designed to provide sandboxing where we can but we already degrade them gracefully if we can't, for example because we are running in a container (or on a kernel compiled without namespace) or so, where namespacing is unavailable. These sandboxing concepts are really designed to be "best-effort", i.e. take effect if the execution environment allows that, but otherwise really degrade gracefully, hence I think it's also OK if we skip them on syntax errors, as long as we continue to log about this case.

None of these three options really alter the execution environment drastically, all they do is make a bit more stuff unavailable. I hope that makes sense?

@poettering

This comment has been minimized.

Show comment
Hide comment
@poettering

poettering Jul 11, 2017

Member

(btw, the patch should probably say "Closes: #6277")

Member

poettering commented Jul 11, 2017

(btw, the patch should probably say "Closes: #6277")

@keszybz

This comment has been minimized.

Show comment
Hide comment
@keszybz

keszybz Jul 11, 2017

Member

PrivateUsers= — OK, I can see "soft degradation" to normal user namespace. But PrivateTmp= and PrivateNetwork= I think are different — they provide fairly strong sandboxing that the service might not be prepared to work correctly without, either creating vulnerabilities or allowing externally visible actions to be undertaken by the service unexpectedly. Continuing with your example of PrivateNetwork=down: if we add this in systemd-235, and the unit is executed on systemd-234, I think the expected action is to fail to run the unit at all, rather then "softly downgrade" to full network.

And also note: this patch is only about parsing, and we parsing for those fields is always compiled in, even if we are running on a kernel without some namespace support. So I think it's OK to fail the unit with PrivateTmp=yess or PrivateNetwork=always.

In other words: I think that a stronger check is OK, and it'll allow us and our users to catch more errors early.

Member

keszybz commented Jul 11, 2017

PrivateUsers= — OK, I can see "soft degradation" to normal user namespace. But PrivateTmp= and PrivateNetwork= I think are different — they provide fairly strong sandboxing that the service might not be prepared to work correctly without, either creating vulnerabilities or allowing externally visible actions to be undertaken by the service unexpectedly. Continuing with your example of PrivateNetwork=down: if we add this in systemd-235, and the unit is executed on systemd-234, I think the expected action is to fail to run the unit at all, rather then "softly downgrade" to full network.

And also note: this patch is only about parsing, and we parsing for those fields is always compiled in, even if we are running on a kernel without some namespace support. So I think it's OK to fail the unit with PrivateTmp=yess or PrivateNetwork=always.

In other words: I think that a stronger check is OK, and it'll allow us and our users to catch more errors early.

@poettering

This comment has been minimized.

Show comment
Hide comment
@poettering

poettering Jul 11, 2017

Member

But PrivateTmp= and PrivateNetwork= I think are different — they provide fairly strong sandboxing that the service might not be prepared to work correctly without, either creating vulnerabilities or allowing externally visible actions to be undertaken by the service unexpectedly.

Both privatetmp= and privatenetwork= are implemented in a way that the service shouldn't notice anything has changed: /tmp and "lo" work the same way as they did before, the one change being that all the other stuff happening in /tmp, and all the other interfaces silently disappear. Hence, it's sandboxing in terms of "visibility" not in terms "access prohibition". As such its the friendliest way of sandboxing: you don't see EPERM suddenly that weren't there before, you just get a slightly altered, reduced view of the world. And because things are written in this "defensive" style I think ignoring the option if a value we don't know is assigned is fully OK.

Continuing with your example of PrivateNetwork=down: if we add this in systemd-235, and the unit is executed on systemd-234, I think the expected action is to fail to run the unit at all, rather then "softly downgrade" to full network.

I think actually that that'd be the expectation.

I mean, the namespacing code is already written in a way to degrade softly:

https://github.com/systemd/systemd/blob/master/src/core/execute.c#L2015

And that's done precisely because we know that the seeing more stuff is fine, only seeing more would be a problem...

(that all said, there's a misfeature in the current code: if BindPaths=/BindReadOnlyPaths= are used we should not permit starting of the unit, since these two options do a lot more than hide stuff: they rearrange stuff in non-binary way, and hence if they can't be applied we should fail the service)

In other words: I think that a stronger check is OK, and it'll allow us and our users to catch more errors early.

Still not convinced: I still think the compat path is more relevant for these specific options...

Any chance we can split this discussion up in two? i.e. leave the privatenetwork/privateusers/privatetmp stuff out for now, and do that in a later PR? I mean, apparently the current namespacing code shouldn't be as defensive as it is (see above), hence we should rework that anyway in more detail.

Member

poettering commented Jul 11, 2017

But PrivateTmp= and PrivateNetwork= I think are different — they provide fairly strong sandboxing that the service might not be prepared to work correctly without, either creating vulnerabilities or allowing externally visible actions to be undertaken by the service unexpectedly.

Both privatetmp= and privatenetwork= are implemented in a way that the service shouldn't notice anything has changed: /tmp and "lo" work the same way as they did before, the one change being that all the other stuff happening in /tmp, and all the other interfaces silently disappear. Hence, it's sandboxing in terms of "visibility" not in terms "access prohibition". As such its the friendliest way of sandboxing: you don't see EPERM suddenly that weren't there before, you just get a slightly altered, reduced view of the world. And because things are written in this "defensive" style I think ignoring the option if a value we don't know is assigned is fully OK.

Continuing with your example of PrivateNetwork=down: if we add this in systemd-235, and the unit is executed on systemd-234, I think the expected action is to fail to run the unit at all, rather then "softly downgrade" to full network.

I think actually that that'd be the expectation.

I mean, the namespacing code is already written in a way to degrade softly:

https://github.com/systemd/systemd/blob/master/src/core/execute.c#L2015

And that's done precisely because we know that the seeing more stuff is fine, only seeing more would be a problem...

(that all said, there's a misfeature in the current code: if BindPaths=/BindReadOnlyPaths= are used we should not permit starting of the unit, since these two options do a lot more than hide stuff: they rearrange stuff in non-binary way, and hence if they can't be applied we should fail the service)

In other words: I think that a stronger check is OK, and it'll allow us and our users to catch more errors early.

Still not convinced: I still think the compat path is more relevant for these specific options...

Any chance we can split this discussion up in two? i.e. leave the privatenetwork/privateusers/privatetmp stuff out for now, and do that in a later PR? I mean, apparently the current namespacing code shouldn't be as defensive as it is (see above), hence we should rework that anyway in more detail.

@keszybz keszybz added this to the v234 milestone Jul 11, 2017

keszybz added some commits Jul 6, 2017

core/load-fragment: refuse units with errors in certain directives
If an error is encountered in any of the Exec* lines, WorkingDirectory,
SELinuxContext, ApparmorProfile, SmackProcessLabel, Service (in .socket
units), User, or Group, refuse to load the unit. If the config stanza
has support, ignore the failure if '-' is present.

For those configuration directives, even if we started the unit, it's
pretty likely that it'll do something unexpected (like write files
in a wrong place, or with a wrong context, or run with wrong permissions,
etc). It seems better to refuse to start the unit and have the admin
clean up the configuration without giving the service a chance to mess
up stuff.

Note that all "security" options that restrict what the unit can do
(Capabilities, AmbientCapabilities, Restrict*, SystemCallFilter, Limit*,
PrivateDevices, Protect*, etc) are _not_ treated like this. Such options are
only supplementary, and are not always available depending on the architecture
and compilation options, so unit authors have to make sure that the service
runs correctly without them anyway.

Fixes #6237, #6277.
core/load-fragment: refuse units with errors in RootDirectory/RootIma…
…ge/DynamicUser

Behaviour of the service is completely different with the option off, so the
service would probably mess up state on disk and do unexpected things.

@poettering poettering merged commit 6297d07 into systemd:master Jul 12, 2017

4 of 5 checks passed

xenial-s390x autopkgtest finished (failure)
Details
default Build finished.
Details
semaphoreci The build passed on Semaphore.
Details
xenial-amd64 autopkgtest finished (success)
Details
xenial-i386 autopkgtest finished (success)
Details

@keszybz keszybz deleted the keszybz:refuse-to-load-some-units branch Jul 13, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment