The default position of critical infrastructure cannot be complete system failure #341

Open
luebking opened this issue Jan 24, 2024 · 6 comments

Comments

@luebking

Status quo: dbus-broker self-terminates when it fails to parse a service config.
The stated reason is the security concern that restrictive instructions might otherwise be ignored.

Because modern software stacks, including logind, rely heavily on a functional system bus, a fatal failure of the system bus leads to an effectively dysfunctional system that the user will have to analyze and fix offline.
Recent threads on the archlinux bbs¹ have shown that this condition causes users to engage in uninformed mitigation attempts (ranging from a return to dbus-daemon, through re-installation, to wild actions on the filesystem), likely because they are overwhelmed by the error, the urge to restore basic functionality, and the need to fix it offline.

Flawed strategies include:

  1. Hope that Google gets better at finding correct solutions.
    The user will still be confronted with a complete system failure, and there is no guarantee they will be able to isolate the proper approaches from the noise (see the first bbs thread, where another user picked up the flawed approach right after the situation had been explained ad nauseam).
  2. Hope that legacy services get fixed upstream.
    They might no longer be maintained, and the problem can also occur through new bugs in services, downstream mistakes (a bad "touch"), or even bugs in dbus-broker.

Necessary changes:

dbus-broker is a system-critical process. It has to
a) be more resilient and
b) communicate the problem, its cause and correct treatment better than "no login for you, lol"
Both imply that the affected system continues to function as much as possible.

Taking the security concerns somewhat into consideration, the broker certainly can detect that a syntactically flawed config indicates some problem (whether security related or not).
But that problem does not affect the broker directly.
It is very much capable of containing it, by denying the specific service access to the bus, whilst maintaining general functionality.

This might still lead to a partially misbehaving system (the flawed service fails) but will on average allow
a) the user to login and inspect the running system and
b) the dbus-broker process, that is now aware of the problem, to communicate this to the user, eg.

  1. through any instance of org.freedesktop.Notifications appearing on any bus or
  2. with an aggressive error message and stalls during the boot (though splash systems and the mere absence of the user might spoil that)
  3. through the system journal, which might be consulted to identify the failing service and the resulting partial misbehavior

It is IMO, however, not the broker's job to fix the world by refusing action unless the system is in pristine condition.
In this regard it also cannot, eg., guarantee the logical correctness of the service configurations.


¹ https://bbs.archlinux.org/viewtopic.php?id=292174
https://bbs.archlinux.org/viewtopic.php?id=292149
https://bbs.archlinux.org/viewtopic.php?id=292126
https://bbs.archlinux.org/viewtopic.php?id=292200

@dvdhrm
Member

dvdhrm commented Jan 25, 2024

Thank you for taking part in the upstream dbus-broker community! Before I will engage in this discussion I want to encourage you to take a day or two and then maybe try to rewrite this report. I do not feel like it is written in good faith, nor does the wording feel like a promising strategy to convince us of your position. This is my personal opinion and you are very much allowed to disagree. Yet I do not feel like this report values our time and effort spent on this project.

@teg
Contributor

teg commented Jan 26, 2024

It is very much capable of containing it, by denying the specific service access to the bus, whilst maintaining general functionality.

This is not true, and that is the crux of the matter. I would be perfectly fine with replacing a non-parsing config file with a "maximally restrictive" version of the file for the scope it can affect. The problem is that config files (despite their names) are not scoped, so a "maximally restrictive" config file would restrict all access to the whole bus, which is exactly the same as our current behaviour.

The alternative is to ignore broken files, which would be the same as saying "we don't know what the intended policy is, but we know that we are less restrictive than the creator of the system intended". If that is really what you want, make a helper in your distro package that moves broken xml files out of the way, but it does not sound like a responsible general policy.
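
For illustration, such a distro-side helper could look roughly like the following. This is a minimal sketch, not anything dbus-broker ships: the function name and the quarantine directory are hypothetical, and "broken" is assumed here to mean "not well-formed XML according to Python's parser", which may be looser or stricter than what the broker itself rejects. As noted above, moving a file away means running with a policy that may be less restrictive than intended.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def quarantine_broken_configs(config_dir, quarantine_dir):
    """Move *.conf files that are not well-formed XML into quarantine_dir.

    Hypothetical helper: returns the names of the files that were moved.
    """
    config_dir = Path(config_dir)
    quarantine_dir = Path(quarantine_dir)
    moved = []
    for path in sorted(config_dir.glob("*.conf")):
        try:
            ET.parse(path)
        except (ET.ParseError, OSError):
            # Unparsable or unreadable: move it aside instead of deleting it,
            # so the admin can still inspect what was there.
            quarantine_dir.mkdir(parents=True, exist_ok=True)
            shutil.move(str(path), str(quarantine_dir / path.name))
            moved.append(path.name)
    return moved
```

A package hook could call this on the system.d policy directory before the bus is (re)started and log the returned names, so the user at least learns which files were set aside.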

@Scimmia22

Scimmia22 commented Jan 26, 2024

Thank you for taking part in the upstream dbus-broker community! Before I will engage in this discussion I want to encourage you to take a day or two and then maybe try to rewrite this report. I do not feel like it is written in good faith, nor does the wording feel like a promising strategy to convince us of your position. This is my personal opinion and you are very much allowed to disagree. Yet I do not feel like this report values our time and effort spent on this project.

I'm sorry, this sounds like a way to hand wave away someone's concerns. Being dismissive and saying you refuse to engage when someone has an issue is insulting and is bad faith in itself.

As it stands, as more people switch to dbus-broker, it's going to get the reputation for being fragile, and in all honesty, it IS right now. If things aren't tested with this specific implementation, they could completely crash dbus, leaving the system minimally functional. From the user's perspective, there was no issue with freedesktop dbus, but now everything stops working. Quite honestly, this makes it a bad choice for general adoption.

@rgudwin

rgudwin commented Jan 26, 2024

Instead of a full failure, it would be much better to print a warning indicating the failing file, freeze the system for 30 seconds, and then just ignore the failing file and start without failure. A freeze of 30 seconds would be enough to show that there is something wrong, without the hassle of being stuck with a non-functional system and no clue as to why it is non-functional.

@luebking
Author

Tom raises an important detail that I had not anticipated:
The entire policy config structure is essentially unspecified.

It's implausible, but not impossible, that eg. com.rastersoft.panther.remotecontrol.conf holds policies concerning org.freedesktop.login1.
That's in and of itself a problem and raises some interesting questions (how are cascade conflicts correctly resolved if there's no preference specified? Catastrophic failure as well?), but it is what it is and unfortunately means that a clear-cut solution isn't possible.
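
To make the scoping problem concrete, here is a small sketch. The XML content is invented for illustration (it is not the real content of that file), and the helper name is hypothetical; it only looks at the own and send_destination attributes of allow/deny rules, which is an assumption about what is enough to show the point.

```python
import xml.etree.ElementTree as ET

# Hypothetical content for com.rastersoft.panther.remotecontrol.conf.
# Invented for illustration only, not taken from a real package.
EXAMPLE = """
<busconfig>
  <policy context="default">
    <allow own="com.rastersoft.panther.remotecontrol"/>
    <deny send_destination="org.freedesktop.login1"/>
  </policy>
</busconfig>
"""


def names_touched(xml_text):
    """Return the bus names referenced by own/send_destination rules."""
    root = ET.fromstring(xml_text)
    names = set()
    for rule in root.iter():
        if rule.tag in ("allow", "deny"):
            for attr in ("own", "send_destination"):
                if attr in rule.attrib:
                    names.add(rule.attrib[attr])
    return names


print(sorted(names_touched(EXAMPLE)))
```

Nothing ties the set of names a file touches to the file's own name, which is why a broken file cannot simply be replaced by "deny everything for the service it is named after".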

Simply implementing an assumed specification as an implementation detail is not an option either: it risks backward incompatibility, with the additional risk of silent security issues when existing configuration is ignored for violating that internal specification (eg. I have a dbus-1/system.d/wpa_supplicant.conf here).

For a risk/reward analysis it's important to remember that the security provision of even the present rigid approach is still limited.

dbus-broker can detect syntactical errors, but not:

  • entirely missing files
  • logical mistakes
  • typos in eg. service names misdirecting the scope

Estimating what can go wrong on the target system, possible flubs will include

  1. dangling symlinks
  2. empty files
  3. garbage content
  4. ill-formed xml configs
  5. insecure paths

(1) is a strong indicator for a missing file and an actual problem, since no further assumptions can be made. It'll hopefully also be rare outside development contexts.

For (2) and (3) the spec is luckily very clear:

The configuration file is an XML document. It must have the following doctype declaration:
<!DOCTYPE busconfig PUBLIC "-//freedesktop//DTD D-Bus Bus Configuration 1.0//EN" "http://www.freedesktop.org/standards/dbus/1.0/busconfig.dtd">

So anything that doesn't include that, and therefore doesn't even make an effort to configure anything, can imo be disregarded as an obvious but harmless error. There should certainly be a warning, but no further action.
For an actual implementation, the check could be error tolerant, so any tag <!DOCTYPE.*bus.*> would indicate the purpose and trigger actual inspection.
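
A rough sketch of such a tolerant check, purely as illustration: the function name is hypothetical, and the pattern is a slightly tightened variant of the one suggested above ([^>]* instead of .* so the match stays inside a single tag).

```python
import re

# Any DOCTYPE tag that mentions "bus" counts as an attempt at a bus config.
# Tightened variant of <!DOCTYPE.*bus.*>: the match may not run past the ">".
DOCTYPE_HINT = re.compile(r"<!DOCTYPE[^>]*bus[^>]*>", re.IGNORECASE)


def looks_like_bus_config(text):
    """True if the text at least tries to declare itself a bus config."""
    return DOCTYPE_HINT.search(text) is not None
```

Empty files and random garbage fail this check and would only trigger a warning, while a file with a mistyped doctype such as "busconfg" still passes and would go on to actual inspection.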

(4) is a clear error in the configuration attempt and must be addressed:

  1. how relaxed is the xml parser atm (eg. wrt empty tags quoting tokens)?
  2. even if the config is ill-formed, do you think the intended scope is still sufficiently detectable by pattern analysis, so that every occurring interface could then be denied?

From a strict security perspective, (5) is the ultimate disaster scenario, because no file can be trusted.
Should that still lead to catastrophic failure on non-hardened systems where the most likely explanation is an inept chown/chmod [-R]?
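
For reference, the kind of check meant by (5) could be sketched as below. This is an assumption about what "insecure" means (wrong owner, or writable by group/others, on the file or its directory), not a description of what dbus-broker actually checks; the function name and the trusted_uid parameter are hypothetical.

```python
import os
import stat


def insecure_reasons(path, trusted_uid=0):
    """Return human-readable reasons why `path` should not be trusted.

    Checks the file itself and its containing directory only.
    """
    reasons = []
    for candidate in (path, os.path.dirname(os.path.abspath(path))):
        st = os.stat(candidate)
        if st.st_uid != trusted_uid:
            reasons.append(
                f"{candidate}: owned by uid {st.st_uid}, expected {trusted_uid}"
            )
        if st.st_mode & (stat.S_IWGRP | stat.S_IWOTH):
            reasons.append(f"{candidate}: writable by group or others")
    return reasons
```

Returning reasons rather than a bare yes/no leaves room for either policy: fail hard on a hardened system, or warn loudly and carry on where a stray chmod is the likelier explanation.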

@C0rn3j

C0rn3j commented Jan 29, 2024

Just making sure relevant issues are linked.
This is a continuation of #337 and is related to #342, which touches on the same issue. The solutions proposed there aim to enable the user to remedy the situation instead of getting stuck, while this one is more about not failing the service in the first place.


6 participants