Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

resolved: various monitor fixes #24853

Merged
merged 16 commits into from Sep 30, 2022
Merged

Conversation

poettering
Copy link
Member

No description provided.

@poettering
Copy link
Member Author

This should address all remaining issues I see with #22845

Copy link
Member

@bluca bluca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a big fan of removing the config flag, but ok. But this needs test updates - new helpers need unit tests, and TEST-75 needs adjustments and coverage for the monitor command.

src/libsystemd/sd-event/sd-event.c Outdated Show resolved Hide resolved
man/rules/meson.build Show resolved Hide resolved
src/resolve/resolved-varlink.c Outdated Show resolved Hide resolved
Copy link
Member

@keszybz keszybz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks nice. I didn't review the monitor output in detail. @surajkrishnan14 can you review those parts to make sure that the output looks OK?

man/sd_event_add_signal.xml Outdated Show resolved Hide resolved
man/sd_event_add_signal.xml Outdated Show resolved Hide resolved
man/sd_event_set_signal_exit.xml Outdated Show resolved Hide resolved
man/sd_event_set_signal_exit.xml Outdated Show resolved Hide resolved
man/sd_event_set_signal_exit.xml Outdated Show resolved Hide resolved
man/sd_event_set_signal_exit.xml Outdated Show resolved Hide resolved
man/sd_event_set_signal_exit.xml Outdated Show resolved Hide resolved
man/org.freedesktop.resolve1.xml Show resolved Hide resolved
man/resolvectl.xml Outdated Show resolved Hide resolved
man/resolvectl.xml Outdated Show resolved Hide resolved
@keszybz keszybz added the reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks label Sep 29, 2022
@keszybz
Copy link
Member

keszybz commented Sep 29, 2022

This will need another round of reviews after all the comments…

@yuwata yuwata added ci-fails/needs-rework 🔥 Please rework this, the CI noticed an issue with the PR and removed reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks labels Sep 29, 2022
@bluca
Copy link
Member

bluca commented Sep 29, 2022

Not a big fan of removing the config flag, but ok. But this needs test updates - new helpers need unit tests, and TEST-75 needs adjustments and coverage for the monitor command.

This is still unaddressed, and tests are failing because of it

@poettering
Copy link
Member Author

This is still unaddressed, and tests are failing because of it

Now addressed. PTAL.

@poettering
Copy link
Member Author

I also added some better support for following CNAME chains: each element in the chain will show up as a separate query. But each query will also contain the questions of previous elements of the chain in a new collectedQuestions metadata field. This means we will report for each element of the chain the results once we have it, but we'll also tell you how we came there, and the final answer will give you complete information about the chain we followed.

@poettering poettering added please-review and removed ci-fails/needs-rework 🔥 Please rework this, the CI noticed an issue with the PR labels Sep 30, 2022
@poettering poettering added this to the v252 milestone Sep 30, 2022
@bluca bluca added good-to-merge/waiting-for-ci 👍 PR is good to merge, but CI hasn't passed at time of review. Please merge if you see CI has passed and removed please-review labels Sep 30, 2022
@mrc0mmand
Copy link
Member

[ 3783.010976] resolvectl[326]: ../build/src/shared/json.c:4277:116: runtime error: applying non-zero offset 140134595963136 to null pointer
[ 3783.115350] resolvectl[326]:     #0 0x7f73a3af5738 in json_dispatch /systemd-meson-build/../build/src/shared/json.c:4277:116
[ 3783.115350] resolvectl[326]:     #1 0x5606782e79ed in monitor_query_dump /systemd-meson-build/../build/src/resolve/resolvectl.c:2625:13
[ 3783.115350] resolvectl[326]:     #2 0x5606782e79ed in monitor_reply /systemd-meson-build/../build/src/resolve/resolvectl.c:2677:17
[ 3783.115350] resolvectl[326]:     #3 0x7f73a3be2e00 in varlink_dispatch_reply /systemd-meson-build/../build/src/shared/varlink.c:758:29
[ 3783.115350] resolvectl[326]:     #4 0x7f73a3be2e00 in varlink_process /systemd-meson-build/../build/src/shared/varlink.c:949:13
[ 3783.115350] resolvectl[326]:     #5 0x7f73a3bf5c15 in defer_callback /systemd-meson-build/../build/src/shared/varlink.c:1888:16
[ 3783.115350] resolvectl[326]:     #6 0x7f73a3f97b62 in source_dispatch /systemd-meson-build/../build/src/libsystemd/sd-event/sd-event.c
[ 3783.115350] resolvectl[326]:     #7 0x7f73a3f967e9 in sd_event_dispatch /systemd-meson-build/../build/src/libsystemd/sd-event/sd-event.c:4246:21
[ 3783.115350] resolvectl[326]:     #8 0x7f73a3f9a054 in sd_event_run /systemd-meson-build/../build/src/libsystemd/sd-event/sd-event.c:4307:21
[ 3783.115350] resolvectl[326]:     #9 0x7f73a3f9a72a in sd_event_loop /systemd-meson-build/../build/src/libsystemd/sd-event/sd-event.c:4328:21
[ 3783.115350] resolvectl[326]:     #10 0x5606782d9062 in verb_monitor /systemd-meson-build/../build/src/resolve/resolvectl.c:2720:13
[ 3783.115350] resolvectl[326]:     #11 0x7f73a3bfdb83 in dispatch_verb /systemd-meson-build/../build/src/shared/verbs.c:103:24
[ 3783.115350] resolvectl[326]:     #12 0x5606782cf033 in native_main /systemd-meson-build/../build/src/resolve/resolvectl.c:3486:16
[ 3783.115350] resolvectl[326]:     #13 0x5606782cf033 in run /systemd-meson-build/../build/src/resolve/resolvectl.c:3623:16
[ 3783.115350] resolvectl[326]:     #14 0x5606782cd4d7 in main /systemd-meson-build/../build/src/resolve/resolvectl.c:3626:1
[ 3783.115350] resolvectl[326]:     #15 0x7f73a323c28f  (/usr/lib/libc.so.6+0x2328f) (BuildId: 26c81e7e05ebaf40bac3523b7d76be0cd71fad82)
[ 3783.115350] resolvectl[326]:     #16 0x7f73a323c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349) (BuildId: 26c81e7e05ebaf40bac3523b7d76be0cd71fad82)
[ 3783.115350] resolvectl[326]:     #17 0x5606782ca1f4 in _start /build/glibc/src/glibc/csu/../sysdeps/x86_64/start.S:115
[ 3783.115350] resolvectl[326]: SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../build/src/shared/json.c:4277:116 in

@mrc0mmand mrc0mmand added ci-fails/needs-rework 🔥 Please rework this, the CI noticed an issue with the PR and removed good-to-merge/waiting-for-ci 👍 PR is good to merge, but CI hasn't passed at time of review. Please merge if you see CI has passed labels Sep 30, 2022
@poettering
Copy link
Member Author

grr, this is UndefinedBehaviorSanitizer being stupid here... will add work-around

@poettering poettering added the good-to-merge/waiting-for-ci 👍 PR is good to merge, but CI hasn't passed at time of review. Please merge if you see CI has passed label Sep 30, 2022
@bluca bluca merged commit 697f082 into systemd:main Sep 30, 2022
45 checks passed
@keszybz keszybz removed the good-to-merge/waiting-for-ci 👍 PR is good to merge, but CI hasn't passed at time of review. Please merge if you see CI has passed label Oct 10, 2022
@evverx
Copy link
Member

evverx commented Nov 19, 2022

I've just reported the first remote DOS with the monitor enabled: #25449.

@evverx
Copy link
Member

evverx commented Nov 24, 2022

FWIW it's going to get its CVE in a couple of days I think. Technically there should be two CVEs because there are two different codepaths that can lead to the crash but I don't think it matters much in this particular case.

@bluca
Copy link
Member

bluca commented Nov 24, 2022

There is no need for any CVE, given it requires local root access to reproduce, a bug report is more than enough

@evverx
Copy link
Member

evverx commented Nov 24, 2022

It requires local root access to subscribe to the notifications. Once there is at least one subscriber resolved can be crashed remotely without any access to the box. Those conditions are mentioned.

@evverx
Copy link
Member

evverx commented Nov 24, 2022

I have also almost gotten around to #23873 and #23894 (those are triggered via adjacent networks though).

@bluca
Copy link
Member

bluca commented Nov 24, 2022

Local root access is a necessary prerequisite, hence "can be crashed remotely" is completely nonsensical, as anything including ssh fits into that definition

@evverx
Copy link
Member

evverx commented Nov 24, 2022

hence "can be crashed remotely" is completely nonsensical

By this logic resolved can never be crashed remotely because root should enable it in the first place and that's nonsensical.

@bluca
Copy link
Member

bluca commented Nov 24, 2022

No, resolved is enabled by default in many distros and images

@evverx
Copy link
Member

evverx commented Nov 24, 2022

Let me get this right if resolved on one machine with at least one subscriber can be crashed by resource records coming from remote servers it's actually just a local crash that requires admin privileges?

@bluca
Copy link
Member

bluca commented Nov 24, 2022

with at least one subscriber

There are zero subscribers for this debug tool by default, adding one requires real-time local root access, hence this is at best a minor bug causing a local crash, and even that's a stretch

@evverx
Copy link
Member

evverx commented Nov 24, 2022

There are zero subscribers for this debug tool by default

Right and nobody is saying that there are more than zero subscribers by default. It affects only machines where this feature is actually used and they can be crashed by resource records coming from remote machines.

@bluca
Copy link
Member

bluca commented Nov 24, 2022

Which makes it a minor bug at best, but most definitely not worth of wasting anybody's time with a cve, because the security significance is non-existent given it requires real-time root access

@evverx
Copy link
Member

evverx commented Nov 24, 2022

it requires real-time root access

Well, by this logic #23873 (where mdns should be turned on at some point somehow by privileged users by editing the config file or at runtime) it can't be triggered via adjacent networks either and I don't agree with that.

Anyway I get what you are saying. It seems that by that logic only stuff like #19584 can be considered a remote DOS.

@evverx
Copy link
Member

evverx commented Nov 24, 2022

I sent a link to this thread. It should be added to the description eventually.

@bluca
Copy link
Member

bluca commented Nov 24, 2022

Sent to whom? What description?

@evverx
Copy link
Member

evverx commented Nov 24, 2022

Sent to whom?

MITRE

What description?

The place where references to bugs are placed.

@bluca
Copy link
Member

bluca commented Nov 24, 2022

Are you aware of the process defined at https://github.com/systemd/systemd/security/policy ?

@evverx
Copy link
Member

evverx commented Nov 24, 2022

I'm aware of that.

@bluca
Copy link
Member

bluca commented Nov 24, 2022

And you decided to ignore it and bypass it because...?

@evverx
Copy link
Member

evverx commented Nov 24, 2022

It's a long story but if you haven't noticed I always report systemd bugs like that publicly. I think it started when I was told that fuzzing stuff shouldn't be sent to that mailing list for some reason. I didn't think it was a good idea but basically when I opened PRs like #10200 all those overflows were immediately public. I didn't request CVEs though because at the time I was able to backport those fixes without that.

@evverx
Copy link
Member

evverx commented Nov 24, 2022

Bugs like #19584 are already public so there is nothing to coordinate anymore.

@bluca
Copy link
Member

bluca commented Nov 24, 2022

Opening github bug reports for issues of any kind is one thing, and most of the times it's fine to do so, especially for the fuzzing stuff. Contacting MITRE to ask for a CVE to be assigned, bypassing the established security process and without consulting anybody else, is very much a different matter altogether.

@evverx
Copy link
Member

evverx commented Nov 24, 2022

especially for the fuzzing stuff

Fuzzing stuff is exactly what should be covered by the security process. When infinite loops like 86b06c6 can be triggered in dhcp-client on receiving some weird packets it should go through that process. And if projects agree I of course participate in CVD. dbus is a great example where issues like that are taken seriously: https://seclists.org/oss-sec/2022/q4/7

Contacting MITRE to ask for a CVE to be assigned

Well, what else should I do? Let's revisit the policy of not sending fuzzing stuff to that mailing list. I would be more than happy.

@bluca
Copy link
Member

bluca commented Nov 25, 2022

Well, what else should I do?

  • contact MITRE again and withdraw the request
  • if you think something warrants a CVE, follow the process instead of bypassing it, and if there's consensus that a CVE is required then you can request one - and if there is not, a ticket on github will have to be enough

@evverx
Copy link
Member

evverx commented Nov 25, 2022

I did contact MITRE and sent links to this thread and #25518 a few hours ago and for now "NOTE: there is some debate about the security relevance of this report because there are zero subscribers by default." ended up in some other CVE. I think it should be fixed soon. I also put use-after-frees, infinite loops and stuff like that that can be triggered remotely on hold because based on #25518 (comment) it seems if a feature like mdns happens to be off by default it doesn't matter whether use-after-frees can be triggered on receiving packets when it's on or something like that. I don't agree with that but I don't want to annoy anyone either.

@evverx
Copy link
Member

evverx commented Dec 23, 2022

if there is not, a ticket on github will have to be enough

Having thought about it a bit more I think I'm just going to keep reporting issues publicly but instead of vague bug reports like #22480 I'm just going to be more specific. In that particular case there is a reachable assertion in systemd-resolved that can be triggered by any local user by writing crafted messages to that world-writable socket. Restart doesn't help much obviously because it can be wrapped in a loop (and I frankly have no idea why so much trust is put in "restart" because if stuff is crashed fast enough it gets rate-limited by systemd https://www.freedesktop.org/software/systemd/man/systemd.unit.html#StartLimitIntervalSec=interval). I've just checked and Debian Stable still ships resolved where that and a bunch of other local and not-so-local issues are present. The actual Debian security team tracks issues like that (for example lathiat/avahi#338). I'm a bit conflicted about being specific about how to trigger various memory corruptions but my understanding is that it should be fine because apparently by the systemd security logic to trigger something in, say, dhcp-server root should enable it somehow so in the end it's downplayed to something requiring user-interaction with the highest privileges so there should be no harm in full public disclosure.

Yamakuzure pushed a commit to elogind/elogind that referenced this pull request Dec 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

7 participants