Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set blk-mq I/O schedulers when available #7

Closed

Conversation

@develop7
Copy link

commented Dec 28, 2017

This adds blk-mq support effectively setting BFQ scheduler for HDD and mq-deadline for SSD.

Minor improvements: add clarifying comments and explicitly set CFQ for HDDs despite being it a default for now.

Here's how it works:

image

fcrozat and others added 30 commits Oct 29, 2012
parse /etc/insserv.conf.dd content and /etc/insserv.conf and generate
systemd unit drop-in files to add dependencies
The values of the interactive locale configuration are read from those
files in this order:

  - the kernel command line
  - /etc/locale.conf
  - /etc/sysconfig/language

The special case ROOT_USES_LANG=ctype allows to set LC_CTYPE even if
LANG for root is set to e.g. POSIX. But do this only if no LC_CTYPE
has been set in /etc/locale.conf and on the kernel's command line.

[ fixes: boo#927250 ]
[fbui: bsc#973626]
[fbui: imported in SLE12-SP2 from aa09962]
[fbui: this patch replaces completely 3849917]
[fbui: make use of isempty()]
…nfig

Consider settings in /etc/sysconfig/keyboard and
/etc/sysconfig/console

It's a partial import of 7974978, it
still misses the handling of disable_caplock and kbd_rate.

[fbui: fixes bsc#973425]
Fix regression in the default for tmp auto-deletion (FATE#314974).
[fbui: fixes bnc#820213]
[fbui: forwardported from bfd2462]
[fbui: adjust context and make sure that strip of the domain name is
       only done when setting the system host name. Therefore it's
       still possible to pass an FQDN to hostnamectl]
[fbui: I'm still not sure that it was the right thing to do. Other
       possibility was to fix the installer to create a correct
       /etc/hostname file. Need to investigate...]
[fbui: probably a workaround for a kernel shortcoming. Udev maintainer
       doesn't want to carry it so we'll have to maintain it for an
       undefined period of time. See links for some discussions:

       https://bugzilla.redhat.com/show_bug.cgi?id=979257
       https://bugs.freedesktop.org/show_bug.cgi?id=78478]

[fixes bnc#703100]
[fixes fate#311831]
If any devices are marked with 'SYSTEMD_READY=0' then
we shouldn't run any btrfs check on them.

[fixes: bnc#872929]
[fbui: forward-ported from SLE12-SP1 commit 2fff43938e50ae50ccf4a84ed]
Imported from SLE12-SP1, commit 4f8bacf.

[bnc#783054]
[fbui: side note:
 The max value was changed a couple of times it seems:

   -------------------------------------------------------------------
   Fri Nov 28 13:26:21 UTC 2014 - rmilasan@suse.com

   - Change the maximum number of children from CPU_COUNT * 256 to
     CPU_COUNT * 64.
     Update 1097-udevd-increase-maximum-number-of-children.patch

   -------------------------------------------------------------------
   Thu Nov 27 20:30:35 UTC 2014 - rmilasan@suse.com

   - Increase number of children/workers to CPU_COUNT * 256 to avoid
     'maximum number of children reached' (bnc#907393).
     Add 1097-udevd-increase-maximum-number-of-children.patch

 It was introduced to fix bnc#907393 but it had actually no
 effects. However it might be needed to fix bnc#944812: more worker
 are needed in this case. It worths to be noted that a too high value
 can lead to OOM or even to saturation of IOs bottleneck (see:
 https://lists.freedesktop.org/archives/systemd-devel/2014-November/025656.html)]
NetworkManager and wicked does this already. This is needed by yast2
and other parts of the system.

[fixes boo#933092]
SUSE supported tmpfile cleanups based on file ownership before systemd.
So this feature needs to be available in systemd.
This was part of fate#314974

[tblume: suse-only patch ported from SLES12-SP1 commit e769a63]
[tblume: part of fate#314974]
…he the crypt device

Upstream declined to add this patch, see: systemd/systemd#2979

Still we need to add it, at least to SLES12 because the behaviour is a regression
compared to SLES11.

[ tblume: fixes boo#742774 ]
[ tblume: forward ported from 40e6d61 ]
Otherwise all processes part of a user's session wouldn't get time to
exit cleanly during shutdown.

This is now needed since we allow Delegate=yes for user@.service units
since we still want those services to be allowed to create
subhierarchy beneath their control group path, specially since now the
init.scope is created by default.

However for the root user (user@0.service) the Delegate property is
tuned off because we don't want to enable all controller to avoid
performance regressions (bsc#954765). This is not an issue in this
case since root as a privileged user will still be allowed to create
subhierarchy beneath its control group path.

[fbui: fixes bnc#886599]
[tblume: Port of SLES12SP1 patch 0018-Make-LSB-Skripts-know-about-Required-and-Should.patch]

[fbui: this is needed probably because insserv's behavior has been
       sadly changed since SLE11: it now doesn't failed if a
       dependency listed by Required-Start is missing.]

[fbui: according to Werner "This should fix bnc#858864 and
       bnc#857204."  (see Base:System changelog)]
This is a partial import of commit
88013ca part of SUSE/v210 branch.

However /var/lock/{subsys,lockdev} are left alone and will be created
because:

- a bug was opened requesting /var/lock/subsys, see commit
0671c57.

- creating /var/lock/lockdev shouldn't hurt.

[fixes: bnc#733523]
For the record, the upstream support was removed by commit
3cdebc2.

The sysv-generator has some weirdos: for example a service at the rc0
runlevel won't be started during shutdown since it will get both
"WantedBy=poweroff.target" and "Conflicts=shutdown.target".

Anyways what's the current patch implements the following:

 - a symlink /etc/init.d/boot.d/S??boot.foo will add
   "WantedBy/Before=sysinit.target" constraints and make sure that the
   default dependencies added by systemd are turned off.

 - a symlink /etc/init.d/boot.d/K??boot.foo will add
   "Conflicts/Before=shutdown.target" so "foo" service will be stopped
   like any other regular services. If this symlink is not installed
   however, "foo" will be stopped lately during the systemd killing
   spree.

This is a forward-port of commit 29db853 in SP1.

[Since v232]

Support for S* symlinks in runlevel 0 or 6 has been completely and silently
removed by 788d2b0. Since it was already
broken as pointed out above, this probably wasn't really used and therefore
no one will really care. So let's drop it too.

However this has the side effect to make the support of early sysv scripts more
difficult. To make things easy, the support of K* symlinks in boot.d/ has been
removed too: this is probably not used (anymore) (at least intentionally).

The consequence is that early sysv services are stopped during shutdown at
the same time as 'normal' services.
The 3270 console on S/390 can do color but not the 3215 console.

Partial forward port of
0001-On_s390_con3270_disable_ANSI_colour_esc.patch from SLE12-SP1. A
bunch of the previous code has been dropped since some changes
imported from upsteam made them uneeded.

The remaining bits are probably hackish but at least they are now
minimal.

It was an attempt to address bnc#860937. And yes turning the console
color mode off by passing $TERM=dumb via the kernel command line would
have been much more easier and enough.

This is actually implemented by recent systemd. There's also another
command line option: systemd.log_color=off.

See also a short discussion which happened on @systemd-maintainers
whose $subject is "[PATCH] support conmode setting on command line".

[ fbui: fixes bsc#860937 ]
This patch not only prevents journald to enable audit system
unconditionally very early at boot but also prevents it to receive
audit messages for the audit netlink and to push them into the
journal.

The first reason is that when journald enables kernel audit, it does
not disable syscall audit (it doesn't load the audit rules), which
introduced a global performance hit. This can be minimized if audit
service is started but that's not the case for all systems.

The second reason is that for systems where audit was disabled by
default they will suddenly have audit enabled (unless audit=0 was
already passed to the kernel command line). This means tons of audit
messages will be sent to dmesg, syslog, journal files, etc...

Note also that audit messages are duplicated in the journal since they
are received both from kmsg and from the audit netlink. A related bug
report can be found here:
https://bugzilla.redhat.com/show_bug.cgi?id=1160046.

This basically reverts the following upstream commits:

 - 875c2e2
 - 4d9ced9

Upstream issue:
systemd/systemd#959

So disable all of this for now until a better option is found or
someone comes up with a real use case.

[fbui: bsc#984034]
On z/VM where the hotplug memory is always present (if set).

As a consequence, standby memory on z/VM is automatically onlined
whereas it should stay offline until it is manually onlined later on
when needed.

Origin of that solution comes from
https://bugzilla.redhat.com/show_bug.cgi?id=1370161

[fbui: fixes bsc#997682]
2d79a0b did that for TimeoutSec=,
89beff8 did that for JobTimeoutSec=,
and 0004f69 did that for
x-systemd.device-timeout=. But after parsing x-systemd.device-timeout=xxx
we write it out as JobRunningTimeoutSec=xxx. Two options:
- write out JobRunningTimeoutSec=<a very big number>,
- change JobRunningTimeoutSec= to behave like the other options.

I think it would be confusing for JobRunningTimeoutSec= to have different
syntax then TimeoutSec= and JobTimeoutSec=, so this patch implements the
second option.

Fixes #6264, https://bugzilla.redhat.com/show_bug.cgi?id=1462378.

(cherry picked from commit 4a06cbf)
… (#5139)"

This reverts commit 2d058a8.

When we add another name to a unit (by following an alias), we need to
reload all drop-ins. This is necessary to load any additional dropins
found in the dirs created from the alias name.

Fixes #6334.

(cherry picked from commit 9e4ea9c)
… machined is disabled

Commit 78d1039 only installs the mount unit
when machined is enabled but installed unconditionally the link between the
mount unit and remote-fs.target.

This patch adds a condition when creating the hook.

The meson build system is not affected and therefore this patch could be
dropped when we'll upgrade to v235 because the automake support has been
dropped since this version.
fbuihuu and others added 12 commits Sep 28, 2017
This is primarly useful to support escaped double quotes in PROGRAM or
IMPORT{program} directives.

The only possibilty before this patch was to use an external shell script but
this seems too cumbersome for trivial logics such as

 PROGRAM=="/bin/sh -c 'FOO=\"%s{model}\"; echo ${FOO:0:4}'"

or any similar shell constructs that needs to deals with patterns including
whitespaces.

As it's the case for single quote and for directives running a program, words
within escaped double quotes will be considered as a single argument.

Fixes: #6835
(cherry picked from commit 7e760b7)
Reported by Karim Hossen & Thomas Imbert from Sogeti ESEC R&D.

https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1725351
(cherry picked from commit 9f93933)

[fbui: fixes bsc#1065276]
[fbui: fixes CVE-2017-15908]
Enable systemd-firstboot to set the keymap.

RFE:

systemd/systemd#6346
(cherry picked from commit ed457f1)

[tblume: fixes bsc#1046436]
…fig (#7328)

It is possible that kernel will reject our netlink message that
configures the link. However, we should always make sure that interface
will be named properly otherwise we can leave interfaces having
unpredictable kernel names. Thus we don't return early and continue to
export name and link file properties.

Suggested-by: Tom Gundersen <teg@jklm.no>
(cherry picked from commit 56692c5)
Currently if tmpfiles is run with force on symlink creation but there already
exists a directory at that location, the creation will fail. This change
updates the behavior to remove the directory with rm_fr and then attempts to
create the symlink again.

(cherry picked from commit b3f5897)
…ected to be unresolvable (#6860)

In chroot environments, /etc might not be fully initialized: /etc/machine-id
can be missing for example. This makes the expansions of affected specifiers
impossible at that time.

These cases should not be considered as errors and such failures shouldn't be
logged at an error level therefore this patch downgrades the level used to
LOG_NOTICE in such cases.

Also this is logged at LOG_NOTICE only the first time and then downgrade to
LOG_DEBUG for the rest. That way, if debugging is enabled we get the full
output, but otherwise we only see only one message.

The expansion of specifiers is now self contained in a dedicated function
instead of being spread all over the place.

(cherry picked from commit 4cef192)

[fbui: fixes bsc#1055664]
(cherry picked from commit a7353b4)

[fbui: fixes bsc#1070124]
This adds BLK_MQ support effectively setting BFQ scheduler for HDD
and mq-deadline for SSD.

Minor improvements: add clarifying comments, explicitly set CFQ
for HDDs despite being it a default for now, rename rules file to reflect
it is not ssd-specific anymore.
@develop7 develop7 force-pushed the develop7:feature-optimal-schedulers branch from 9f98d2f to 8463b5c Dec 28, 2017
@develop7 develop7 changed the title Set BLK_MQ I/O schedulers when available Set blk-mq I/O schedulers when available Dec 28, 2017
fbuihuu added a commit to fbuihuu/systemd-opensuse-next that referenced this pull request Jan 4, 2018
Fixes:
       Message: Process 806 (systemd-importd) of user 0 dumped core.

                Stack trace of thread 806:
                #0  0x00007f5eaeff7227 raise (libc.so.6)
                openSUSE#1  0x00007f5eaeff8e8a abort (libc.so.6)
                openSUSE#2  0x000055b6d3418f4f log_assert_failed (systemd-importd)
                openSUSE#3  0x000055b6d3409daf safe_close (systemd-importd)
                openSUSE#4  0x000055b6d33c25ea closep (systemd-importd)
                openSUSE#5  0x000055b6d33c38d9 setup_machine_directory (systemd-importd)
                openSUSE#6  0x000055b6d33b8536 method_pull_tar_or_raw (systemd-importd)
                openSUSE#7  0x000055b6d33ed097 method_callbacks_run (systemd-importd)
                openSUSE#8  0x000055b6d33ef929 object_find_and_run (systemd-importd)
                openSUSE#9  0x000055b6d33eff6b bus_process_object (systemd-importd)
                openSUSE#10 0x000055b6d3447f77 process_message (systemd-importd)
                openSUSE#11 0x000055b6d344815a process_running (systemd-importd)
                openSUSE#12 0x000055b6d3448a10 bus_process_internal (systemd-importd)
                openSUSE#13 0x000055b6d3448ae1 sd_bus_process (systemd-importd)
                openSUSE#14 0x000055b6d3449779 time_callback (systemd-importd)
                #15 0x000055b6d3454ff4 source_dispatch (systemd-importd)
                #16 0x000055b6d34562b9 sd_event_dispatch (systemd-importd)
                #17 0x000055b6d34566f8 sd_event_run (systemd-importd)
                #18 0x000055b6d33ba72a bus_event_loop_with_idle (systemd-importd)
                #19 0x000055b6d33b95bc manager_run (systemd-importd)
                #20 0x000055b6d33b9766 main (systemd-importd)
                #21 0x00007f5eaefe2a00 __libc_start_main (libc.so.6)
                #22 0x000055b6d33b5569 _start (systemd-importd)

(cherry picked from commit 579afbe)

[fbui: fixes bsc#1053595]
fbuihuu pushed a commit that referenced this pull request Jan 16, 2018
In general we'd leak anything that was allocated in the first parsing of
netdev, e.g. netdev name, host name, etc. Use normal netdev_unref to make sure
everything is freed.

--- command ---
/home/zbyszek/src/systemd/build2/test-network
--- stderr ---
/etc/systemd/network/wg0.netdev:3: Failed to parse netdev kind, ignoring: wireguard
/etc/systemd/network/wg0.netdev:5: Unknown section 'WireGuard'. Ignoring.
/etc/systemd/network/wg0.netdev:9: Unknown section 'WireGuardPeer'. Ignoring.
NetDev has no Kind configured in /etc/systemd/network/wg0.netdev. Ignoring
/etc/systemd/network/br0.network:13: Unknown lvalue 'NetDev' in section 'Network'
br0: netdev ready

=================================================================
==11666==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 4 byte(s) in 1 object(s) allocated from:
    #0 0x7f3a314cf238 in __interceptor_strdup (/lib64/libasan.so.4+0x77238)
    #1 0x7f3a30e71ad1 in free_and_strdup ../src/basic/string-util.c:870
    #2 0x7f3a30d34fba in config_parse_ifname ../src/shared/conf-parser.c:981
    #3 0x7f3a30d2f5b0 in next_assignment ../src/shared/conf-parser.c:155
    #4 0x7f3a30d30303 in parse_line ../src/shared/conf-parser.c:273
    #5 0x7f3a30d30dee in config_parse ../src/shared/conf-parser.c:390
    #6 0x7f3a30d310a5 in config_parse_many_files ../src/shared/conf-parser.c:428
    #7 0x7f3a30d3181c in config_parse_many ../src/shared/conf-parser.c:487
    #8 0x55b4200f9b00 in netdev_load_one ../src/network/netdev/netdev.c:634
    #9 0x55b4200fb562 in netdev_load ../src/network/netdev/netdev.c:778
    #10 0x55b4200c607a in manager_load_config ../src/network/networkd-manager.c:1299
    #11 0x55b4200818e0 in test_load_config ../src/network/test-network.c:128
    #12 0x55b42008343b in main ../src/network/test-network.c:254
    #13 0x7f3a305f8889 in __libc_start_main (/lib64/libc.so.6+0x20889)

SUMMARY: AddressSanitizer: 4 byte(s) leaked in 1 allocation(s).
-------
@mwilck

This comment has been minimized.

Copy link
Contributor

commented on fc5362d Feb 23, 2018

I think this rule is wrong.

  1. I see no reason to make a scheduler decision for SCSI devices only
  2. The rule fails for customers who activate SCSI-mq with scsi_mod.use_blk_mq=1
  3. It's at least questionable to override the "smart" decision, which is based on device properties, by the coarse "elevator" command line option.

I'd propose something like the following instead (fixing 1. and 2. but not 3.):

ACTION!="add", GOTO="ssd_scheduler_end"
SUBSYSTEM!="block", GOTO="ssd_scheduler_end"

TEST=="%S%p/mq", ENV{.IS_MQ}="1"
ENV{.IS_MQ}=="1", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="kyber"
ENV{.IS_MQ}=="1", ATTR{queue/rotational}!="0", ATTR{queue/scheduler}="bfq"
ENV{.IS_MQ}!="1", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="deadline"
ENV{.IS_MQ}!="1", ATTR{queue/rotational}!="0", ATTR{queue/scheduler}="cfq"

LABEL="ssd_scheduler_end"

This comment has been minimized.

Copy link
Collaborator

replied Feb 27, 2018

Hi Martin, these changes should be reviewed by the kernel folks, I cannot really comment here...

This comment has been minimized.

Copy link
Contributor

replied Feb 27, 2018

I discussed with @hreinecke shortly today, he said my choice of "kyber" was wrong and we should use "mq-deadline" instead. Otherwise, he agreed.

This comment has been minimized.

Copy link
Collaborator

replied Mar 12, 2018

Hi Martin,

Sorry for the delay.

Can you submit your changes to Base:System/systemd ? the rule file has been migrated there.

Since the changes are coming from you and have been also reviewed by @hreinecke, I'll (blindly) apply them.

Thanks.

@fbuihuu fbuihuu force-pushed the openSUSE:openSUSE-Factory branch from 295ead0 to 78221ca Mar 1, 2018
fbuihuu pushed a commit that referenced this pull request Mar 15, 2018
Some versions of asan report the following false positive
when strict_string_checks=1 is passed:

=================================================================
==3297==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f64e4090286 bp 0x7ffe46acd9a0 sp 0x7ffe46acd118 T0)
==3297==The signal is caused by a READ memory access.
==3297==Hint: address points to the zero page.
    #0 0x7f64e4090285 in __strlen_sse2 (/lib64/libc.so.6+0xaa285)
    #1 0x7f64e5a51e46  (/lib64/libasan.so.4+0x41e46)
    #2 0x7f64e4e5e3a0  (/lib64/libglib-2.0.so.0+0x383a0)
    #3 0x7f64e4e5e536 in g_dgettext (/lib64/libglib-2.0.so.0+0x38536)
    #4 0x7f64e48fac5f  (/lib64/libgio-2.0.so.0+0xc1c5f)
    #5 0x7f64e4c03978 in g_type_class_ref (/lib64/libgobject-2.0.so.0+0x30978)
    #6 0x7f64e4be9567 in g_object_new_with_properties (/lib64/libgobject-2.0.so.0+0x16567)
    #7 0x7f64e4be9fd0 in g_object_new (/lib64/libgobject-2.0.so.0+0x16fd0)
    #8 0x7f64e48fd43e in g_dbus_message_new_from_blob (/lib64/libgio-2.0.so.0+0xc443e)
    #9 0x564a6aa0de52 in main ../src/libsystemd/sd-bus/test-bus-marshal.c:228
    #10 0x7f64e4007009 in __libc_start_main (/lib64/libc.so.6+0x21009)
    #11 0x564a6aa0a569 in _start (/home/vagrant/systemd/build/test-bus-marshal+0x5569)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/lib64/libc.so.6+0xaa285) in __strlen_sse2
==3297==ABORTING

It's an external library and errors in external libraries are generally not very
useful for looking for internal bugs.

It would be better not to change the code and use standard suppression
techinques decribed at
https://clang.llvm.org/docs/AddressSanitizer.html#suppressing-reports-in-external-libraries,
but, unfortunaley, none of them seems to be able to suppress fatal errors in asan intself.
fbuihuu pushed a commit that referenced this pull request Mar 15, 2018
Fuzzing with AddressSanitizer reports an error here:
==11==ERROR: AddressSanitizer: global-buffer-overflow on address 0x7fe53f5497d8 at pc 0x7fe53ef055c9 bp 0x7ffd344e9380 sp 0x7ffd344e9378
READ of size 4 at 0x7fe53f5497d8 thread T0
SCARINESS: 27 (4-byte-read-global-buffer-overflow-far-from-bounds)
    #0 0x7fe53ef055c8 in bus_error_name_to_errno /work/build/../../src/systemd/src/libsystemd/sd-bus/bus-error.c:118:24
    #1 0x7fe53ef0577b in bus_error_setfv /work/build/../../src/systemd/src/libsystemd/sd-bus/bus-error.c:274:17
    #2 0x7fe53ef0595a in sd_bus_error_setf /work/build/../../src/systemd/src/libsystemd/sd-bus/bus-error.c:284:21
    #3 0x561059 in manager_load_unit_prepare /work/build/../../src/systemd/src/core/manager.c
    #4 0x560680 in manager_load_unit /work/build/../../src/systemd/src/core/manager.c:1773:13
    #5 0x5d49a6 in unit_add_dependency_by_name /work/build/../../src/systemd/src/core/unit.c:2882:13
    #6 0x538996 in config_parse_unit_deps /work/build/../../src/systemd/src/core/load-fragment.c:152:21
    #7 0x6db771 in next_assignment /work/build/../../src/systemd/src/shared/conf-parser.c:155:32
    #8 0x6d697e in parse_line /work/build/../../src/systemd/src/shared/conf-parser.c:273:16
    #9 0x6d5c48 in config_parse /work/build/../../src/systemd/src/shared/conf-parser.c:390:21
    #10 0x535678 in LLVMFuzzerTestOneInput /work/build/../../src/systemd/src/fuzz/fuzz-unit-file.c:41:16
    #11 0x73bd60 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/libfuzzer/FuzzerLoop.cpp:517:13
    #12 0x73a39f in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool*) /src/libfuzzer/FuzzerLoop.cpp:442:3
    #13 0x73d9bc in fuzzer::Fuzzer::MutateAndTestOne() /src/libfuzzer/FuzzerLoop.cpp:650:19
    #14 0x73fa05 in fuzzer::Fuzzer::Loop(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, fuzzer::fuzzer_allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) /src/libfuzzer/FuzzerLoop.cpp:773:5
    #15 0x71f75d in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/libfuzzer/FuzzerDriver.cpp:754:6
    #16 0x71285c in main /src/libfuzzer/FuzzerMain.cpp:20:10
    #17 0x7fe53da0482f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #18 0x430e68 in _start (/out/fuzz-unit-file+0x430e68)

0x7fe53f5497d8 is located 8 bytes to the right of global variable 'bus_common_errors' defined in '../../src/systemd/src/libsystemd/sd-bus/bus-common-errors.c:28:51' (0x7fe53f549300) of size 1232
SUMMARY: AddressSanitizer: global-buffer-overflow /work/build/../../src/systemd/src/libsystemd/sd-bus/bus-error.c:118:24 in bus_error_name_to_errno
Shadow bytes around the buggy address:
  0x0ffd27ea12a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ffd27ea12b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ffd27ea12c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ffd27ea12d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ffd27ea12e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0ffd27ea12f0: 00 00 00 00 00 00 00 00 00 00 f9[f9]f9 f9 f9 f9
  0x0ffd27ea1300: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0ffd27ea1310: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0ffd27ea1320: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ffd27ea1330: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ffd27ea1340: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==11==ABORTING

but I think it's a false positive because of our low-level magic in how this
area is constructed.
@develop7

This comment has been minimized.

Copy link
Author

commented Mar 28, 2018

Moved to https://build.opensuse.org/request/show/591774 since the file in question became distribution-only

@develop7 develop7 closed this Mar 28, 2018
fbuihuu pushed a commit that referenced this pull request May 31, 2018
`fuzz-journal-remote` seems to be failing under `msan` as soon as it starts:

$ sudo infra/helper.py run_fuzzer systemd fuzz-journal-remote
Running: docker run --rm -i --privileged -e FUZZING_ENGINE=libfuzzer -v /home/vagrant/oss-fuzz/build/out/systemd:/out -t gcr.io/oss-fuzz-base/base-runner run_fuzzer fuzz-journal-remote
Using seed corpus: fuzz-journal-remote_seed_corpus.zip
/out/fuzz-journal-remote -rss_limit_mb=2048 -timeout=25 /tmp/fuzz-journal-remote_corpus -max_len=65536 < /dev/null
INFO: Seed: 3380449479
INFO: Loaded 2 modules   (36336 inline 8-bit counters): 36139 [0x7ff36ea31d39, 0x7ff36ea3aa64), 197 [0x9998c8, 0x99998d),
INFO: Loaded 2 PC tables (36336 PCs): 36139 [0x7ff36ea3aa68,0x7ff36eac7d18), 197 [0x999990,0x99a5e0),
INFO:        2 files found in /tmp/fuzz-journal-remote_corpus
INFO: seed corpus: files: 2 min: 4657b max: 7790b total: 12447b rss: 97Mb
Uninitialized bytes in __interceptor_pwrite64 at offset 24 inside [0x7fffdd4d7230, 240)
==15==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x7ff36e685e8a in journal_file_init_header /work/build/../../src/systemd/src/journal/journal-file.c:436:13
    #1 0x7ff36e683a9d in journal_file_open /work/build/../../src/systemd/src/journal/journal-file.c:3333:21
    #2 0x7ff36e68b8f6 in journal_file_open_reliably /work/build/../../src/systemd/src/journal/journal-file.c:3520:13
    #3 0x4a3f35 in open_output /work/build/../../src/systemd/src/journal-remote/journal-remote.c:70:13
    #4 0x4a34d0 in journal_remote_get_writer /work/build/../../src/systemd/src/journal-remote/journal-remote.c:136:21
    #5 0x4a550f in get_source_for_fd /work/build/../../src/systemd/src/journal-remote/journal-remote.c:183:13
    #6 0x4a46bd in journal_remote_add_source /work/build/../../src/systemd/src/journal-remote/journal-remote.c:235:13
    #7 0x4a271c in LLVMFuzzerTestOneInput /work/build/../../src/systemd/src/fuzz/fuzz-journal-remote.c:36:9
    #8 0x4f27cc in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/libfuzzer/FuzzerLoop.cpp:524:13
    #9 0x4efa0b in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool*) /src/libfuzzer/FuzzerLoop.cpp:448:3
    #10 0x4f8e96 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, fuzzer::fuzzer_allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) /src/libfuzzer/FuzzerLoop.cpp:732:7
    #11 0x4f9f73 in fuzzer::Fuzzer::Loop(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, fuzzer::fuzzer_allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) /src/libfuzzer/FuzzerLoop.cpp:752:3
    #12 0x4bf329 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/libfuzzer/FuzzerDriver.cpp:756:6
    #13 0x4ac391 in main /src/libfuzzer/FuzzerMain.cpp:20:10
    #14 0x7ff36d14982f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #15 0x41f9d8 in _start (/out/fuzz-journal-remote+0x41f9d8)

  Uninitialized value was stored to memory at
    #0 0x7ff36e61cd41 in sd_id128_randomize /work/build/../../src/systemd/src/libsystemd/sd-id128/sd-id128.c:288:16
    #1 0x7ff36e685cec in journal_file_init_header /work/build/../../src/systemd/src/journal/journal-file.c:426:13
    #2 0x7ff36e683a9d in journal_file_open /work/build/../../src/systemd/src/journal/journal-file.c:3333:21
    #3 0x7ff36e68b8f6 in journal_file_open_reliably /work/build/../../src/systemd/src/journal/journal-file.c:3520:13
    #4 0x4a3f35 in open_output /work/build/../../src/systemd/src/journal-remote/journal-remote.c:70:13
    #5 0x4a34d0 in journal_remote_get_writer /work/build/../../src/systemd/src/journal-remote/journal-remote.c:136:21
    #6 0x4a550f in get_source_for_fd /work/build/../../src/systemd/src/journal-remote/journal-remote.c:183:13
    #7 0x4a46bd in journal_remote_add_source /work/build/../../src/systemd/src/journal-remote/journal-remote.c:235:13
    #8 0x4a271c in LLVMFuzzerTestOneInput /work/build/../../src/systemd/src/fuzz/fuzz-journal-remote.c:36:9
    #9 0x4f27cc in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/libfuzzer/FuzzerLoop.cpp:524:13
    #10 0x4efa0b in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool*) /src/libfuzzer/FuzzerLoop.cpp:448:3
    #11 0x4f8e96 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, fuzzer::fuzzer_allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) /src/libfuzzer/FuzzerLoop.cpp:732:7
    #12 0x4f9f73 in fuzzer::Fuzzer::Loop(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, fuzzer::fuzzer_allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) /src/libfuzzer/FuzzerLoop.cpp:752:3
    #13 0x4bf329 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/libfuzzer/FuzzerDriver.cpp:756:6
    #14 0x4ac391 in main /src/libfuzzer/FuzzerMain.cpp:20:10
    #15 0x7ff36d14982f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

  Uninitialized value was created by an allocation of 't' in the stack frame of function 'sd_id128_randomize'
    #0 0x7ff36e61cb00 in sd_id128_randomize /work/build/../../src/systemd/src/libsystemd/sd-id128/sd-id128.c:274

SUMMARY: MemorySanitizer: use-of-uninitialized-value /work/build/../../src/systemd/src/journal/journal-file.c:436:13 in journal_file_init_header
Exiting
MS: 0 ; base unit: 0000000000000000000000000000000000000000
artifact_prefix='./'; Test unit written to ./crash-847911777b3096783f4ee70a69ab6d28380c810b
[vagrant@localhost oss-fuzz]$ sudo infra/helper.py check_build --sanitizer=memory systemd
Running: docker run --rm -i --privileged -e FUZZING_ENGINE=libfuzzer -e SANITIZER=memory -v /home/vagrant/oss-fuzz/build/out/systemd:/out -t gcr.io/oss-fuzz-base/base-runner test_all
INFO: performing bad build checks for /out/fuzz-dhcp-server.
INFO: performing bad build checks for /out/fuzz-journal-remote.
INFO: performing bad build checks for /out/fuzz-unit-file.
INFO: performing bad build checks for /out/fuzz-dns-packet.
4 fuzzers total, 0 seem to be broken (0%).
Check build passed.

It's a false positive which is most likely caused by
google/sanitizers#852. I think it could be got around
by avoiding `getrandom` when the code is compiled with `msan`
fbuihuu pushed a commit that referenced this pull request Sep 3, 2018
…g it

This fixes a memory leak:
```
d5070e2f67ededca022f81f2941900606b16f3196b2268e856295f59._openpgpkey.gmail.com: resolve call failed: 'd5070e2f67ededca022f81f2941900606b16f3196b2268e856295f59._openpgpkey.gmail.com' not found

=================================================================
==224==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 65 byte(s) in 1 object(s) allocated from:
    #0 0x7f71b0878850 in malloc (/usr/lib64/libasan.so.4+0xde850)
    #1 0x7f71afaf69b0 in malloc_multiply ../src/basic/alloc-util.h:63
    #2 0x7f71afaf6c95 in hexmem ../src/basic/hexdecoct.c:62
    #3 0x7f71afbb574b in string_hashsum ../src/basic/gcrypt-util.c:45
    #4 0x56201333e0b9 in string_hashsum_sha256 ../src/basic/gcrypt-util.h:30
    #5 0x562013347b63 in resolve_openpgp ../src/resolve/resolvectl.c:908
    #6 0x562013348b9f in verb_openpgp ../src/resolve/resolvectl.c:944
    #7 0x7f71afbae0b0 in dispatch_verb ../src/basic/verbs.c:119
    #8 0x56201335790b in native_main ../src/resolve/resolvectl.c:2947
    #9 0x56201335880d in main ../src/resolve/resolvectl.c:3087
    #10 0x7f71ad8fcf29 in __libc_start_main (/lib64/libc.so.6+0x20f29)

SUMMARY: AddressSanitizer: 65 byte(s) leaked in 1 allocation(s).
```
fbuihuu pushed a commit that referenced this pull request Oct 1, 2018
==14==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60200055fa9c at pc 0x0000005458f1 bp 0x7ffc78940d90 sp 0x7ffc78940d88
READ of size 1 at 0x60200055fa9c thread T0
    #0 0x5458f0 in dhcp6_option_parse_domainname /work/build/../../src/systemd/src/libsystemd-network/dhcp6-option.c:555:29
    #1 0x54706e in dhcp6_lease_set_domains /work/build/../../src/systemd/src/libsystemd-network/sd-dhcp6-lease.c:242:13
    #2 0x53fce0 in client_parse_message /work/build/../../src/systemd/src/libsystemd-network/sd-dhcp6-client.c:984:29
    #3 0x53f3bc in client_receive_advertise /work/build/../../src/systemd/src/libsystemd-network/sd-dhcp6-client.c:1083:13
    #4 0x53d57f in client_receive_message /work/build/../../src/systemd/src/libsystemd-network/sd-dhcp6-client.c:1182:21
    #5 0x7f0f7159deee in source_dispatch /work/build/../../src/systemd/src/libsystemd/sd-event/sd-event.c:3042:21
    #6 0x7f0f7159d431 in sd_event_dispatch /work/build/../../src/systemd/src/libsystemd/sd-event/sd-event.c:3455:21
    #7 0x7f0f7159ea8d in sd_event_run /work/build/../../src/systemd/src/libsystemd/sd-event/sd-event.c:3512:21
    #8 0x531f2b in fuzz_client /work/build/../../src/systemd/src/fuzz/fuzz-dhcp6-client.c:44:9
    #9 0x531bc1 in LLVMFuzzerTestOneInput /work/build/../../src/systemd/src/fuzz/fuzz-dhcp6-client.c:53:9
    #10 0x57bec8 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/libfuzzer/FuzzerLoop.cpp:570:15
    #11 0x579d67 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool*) /src/libfuzzer/FuzzerLoop.cpp:479:3
    #12 0x57dc92 in fuzzer::Fuzzer::MutateAndTestOne() /src/libfuzzer/FuzzerLoop.cpp:707:19
    #13 0x580ca6 in fuzzer::Fuzzer::Loop(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, fuzzer::fuzzer_allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) /src/libfuzzer/FuzzerLoop.cpp:838:5
    #14 0x55e968 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/libfuzzer/FuzzerDriver.cpp:764:6
    #15 0x551a1c in main /src/libfuzzer/FuzzerMain.cpp:20:10
    #16 0x7f0f701a082f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #17 0x41e928 in _start (/out/fuzz-dhcp6-client+0x41e928)
fbuihuu pushed a commit that referenced this pull request Oct 1, 2018
==14==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020001c761a at pc 0x000000540abc bp 0x7ffd0caf2c50 sp 0x7ffd0caf2c48
READ of size 2 at 0x6020001c761a thread T0
    #0 0x540abb in client_parse_message /work/build/../../src/systemd/src/libsystemd-network/sd-dhcp6-client.c:849:73
    #1 0x53f3bc in client_receive_advertise /work/build/../../src/systemd/src/libsystemd-network/sd-dhcp6-client.c:1083:13
    #2 0x53d57f in client_receive_message /work/build/../../src/systemd/src/libsystemd-network/sd-dhcp6-client.c:1182:21
    #3 0x7f71d8c3eeee in source_dispatch /work/build/../../src/systemd/src/libsystemd/sd-event/sd-event.c:3042:21
    #4 0x7f71d8c3e431 in sd_event_dispatch /work/build/../../src/systemd/src/libsystemd/sd-event/sd-event.c:3455:21
    #5 0x7f71d8c3fa8d in sd_event_run /work/build/../../src/systemd/src/libsystemd/sd-event/sd-event.c:3512:21
    #6 0x531f2b in fuzz_client /work/build/../../src/systemd/src/fuzz/fuzz-dhcp6-client.c:44:9
    #7 0x531bc1 in LLVMFuzzerTestOneInput /work/build/../../src/systemd/src/fuzz/fuzz-dhcp6-client.c:53:9
    #8 0x57bef8 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/libfuzzer/FuzzerLoop.cpp:570:15
    #9 0x579d97 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool*) /src/libfuzzer/FuzzerLoop.cpp:479:3
    #10 0x57dcc2 in fuzzer::Fuzzer::MutateAndTestOne() /src/libfuzzer/FuzzerLoop.cpp:707:19
    #11 0x580cd6 in fuzzer::Fuzzer::Loop(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, fuzzer::fuzzer_allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) /src/libfuzzer/FuzzerLoop.cpp:838:5
    #12 0x55e998 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/libfuzzer/FuzzerDriver.cpp:764:6
    #13 0x551a4c in main /src/libfuzzer/FuzzerMain.cpp:20:10
    #14 0x7f71d784182f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #15 0x41e928 in _start (/out/fuzz-dhcp6-client+0x41e928)
fbuihuu pushed a commit that referenced this pull request Nov 22, 2018
This is a follow-up to 8857fb9 that prevents the fuzzer from crashing with
```
==220==ERROR: AddressSanitizer: ABRT on unknown address 0x0000000000dc (pc 0x7ff4953c8428 bp 0x7ffcf66ec290 sp 0x7ffcf66ec128 T0)
SCARINESS: 10 (signal)
    #0 0x7ff4953c8427 in gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x35427)
    #1 0x7ff4953ca029 in abort (/lib/x86_64-linux-gnu/libc.so.6+0x37029)
    #2 0x7ff49666503a in log_assert_failed_realm /work/build/../../src/systemd/src/basic/log.c:805:9
    #3 0x7ff496614ecf in safe_close /work/build/../../src/systemd/src/basic/fd-util.c:66:17
    #4 0x548806 in server_done /work/build/../../src/systemd/src/journal/journald-server.c:2064:9
    #5 0x5349fa in LLVMFuzzerTestOneInput /work/build/../../src/systemd/src/fuzz/fuzz-journald-kmsg.c:26:9
    #6 0x592755 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/libfuzzer/FuzzerLoop.cpp:571:15
    #7 0x590627 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool*) /src/libfuzzer/FuzzerLoop.cpp:480:3
    #8 0x594432 in fuzzer::Fuzzer::MutateAndTestOne() /src/libfuzzer/FuzzerLoop.cpp:708:19
    #9 0x5973c6 in fuzzer::Fuzzer::Loop(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, fuzzer::fuzzer_allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) /src/libfuzzer/FuzzerLoop.cpp:839:5
    #10 0x574541 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/libfuzzer/FuzzerDriver.cpp:764:6
    #11 0x5675fc in main /src/libfuzzer/FuzzerMain.cpp:20:10
    #12 0x7ff4953b382f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #13 0x420f58 in _start (/out/fuzz-journald-kmsg+0x420f58)
```
fbuihuu pushed a commit that referenced this pull request Jan 29, 2019
This function returns 0 on success and a negative value on failure. On success,
it writes the parsed action to the address passed in its third argument.

`bus_set_transient_emergency_action` does this:

 r = parse_emergency_action(s, system, &v);
 if (v < 0)
     // handle failure

However, `v` is not updated if the function fails, and this should be checking
`r` instead of `v`.

The result of this is that if an invalid failure (or success) action is
specified, systemd ends up creating the unit anyway and then misbehaves if it
tries to run the failure action because the action value comes from
uninitialized stack data. In my case, this resulted in a failed assertion:

 Program received signal SIGABRT, Aborted.
 0x00007fe52cca0d7f in raise () from /snap/usr/lib/libc.so.6
 (gdb) bt
 #0  0x00007fe52cca0d7f in raise () from /snap/usr/lib/libc.so.6
 #1  0x00007fe52cc8b672 in abort () from /snap/usr/lib/libc.so.6
 #2  0x00007fe52d66f169 in log_assert_failed_realm (realm=LOG_REALM_SYSTEMD, text=0x56177ab8e000 "action < _EMERGENCY_ACTION_MAX", file=0x56177ab8dfb8 "../src/core/emergency-action.c", line=33, func=0x56177ab8e2b0 <__PRETTY_FUNCTION__.14207> "emergency_action") at ../src/basic/log.c:795
 #3  0x000056177aa98cf4 in emergency_action (m=0x56177c992cb0, action=2059118610, options=(unknown: 0), reboot_arg=0x0, exit_status=1, reason=0x7ffdd2df4290 "unit run-u0.service failed") at ../src/core/emergency-action.c:33
 #4  0x000056177ab2b739 in unit_notify (u=0x56177c9eb340, os=UNIT_ACTIVE, ns=UNIT_FAILED, flags=(unknown: 0)) at ../src/core/unit.c:2504
 #5  0x000056177aaf62ed in service_set_state (s=0x56177c9eb340, state=SERVICE_FAILED) at ../src/core/service.c:1104
 #6  0x000056177aaf8a29 in service_enter_dead (s=0x56177c9eb340, f=SERVICE_SUCCESS, allow_restart=true) at ../src/core/service.c:1712
 #7  0x000056177aaf9233 in service_enter_signal (s=0x56177c9eb340, state=SERVICE_FINAL_SIGKILL, f=SERVICE_SUCCESS) at ../src/core/service.c:1854
 #8  0x000056177aaf921b in service_enter_signal (s=0x56177c9eb340, state=SERVICE_FINAL_SIGTERM, f=SERVICE_SUCCESS) at ../src/core/service.c:1852
 #9  0x000056177aaf8eb3 in service_enter_stop_post (s=0x56177c9eb340, f=SERVICE_SUCCESS) at ../src/core/service.c:1788
 #10 0x000056177aaf91eb in service_enter_signal (s=0x56177c9eb340, state=SERVICE_STOP_SIGKILL, f=SERVICE_SUCCESS) at ../src/core/service.c:1850
 #11 0x000056177aaf91bc in service_enter_signal (s=0x56177c9eb340, state=SERVICE_STOP_SIGTERM, f=SERVICE_FAILURE_EXIT_CODE) at ../src/core/service.c:1848
 #12 0x000056177aaf9759 in service_enter_running (s=0x56177c9eb340, f=SERVICE_FAILURE_EXIT_CODE) at ../src/core/service.c:1941
 #13 0x000056177ab005b7 in service_sigchld_event (u=0x56177c9eb340, pid=112, code=1, status=1) at ../src/core/service.c:3296
 #14 0x000056177aad84b5 in manager_invoke_sigchld_event (m=0x56177c992cb0, u=0x56177c9eb340, si=0x7ffdd2df48f0) at ../src/core/manager.c:2444
 #15 0x000056177aad88df in manager_dispatch_sigchld (source=0x56177c994710, userdata=0x56177c992cb0) at ../src/core/manager.c:2508
 #16 0x00007fe52d72f807 in source_dispatch (s=0x56177c994710) at ../src/libsystemd/sd-event/sd-event.c:2846
 #17 0x00007fe52d730f7d in sd_event_dispatch (e=0x56177c993530) at ../src/libsystemd/sd-event/sd-event.c:3229
 #18 0x00007fe52d73142e in sd_event_run (e=0x56177c993530, timeout=18446744073709551615) at ../src/libsystemd/sd-event/sd-event.c:3286
 #19 0x000056177aad9f71 in manager_loop (m=0x56177c992cb0) at ../src/core/manager.c:2906
 #20 0x000056177aa7c876 in invoke_main_loop (m=0x56177c992cb0, ret_reexecute=0x7ffdd2df4bff, ret_retval=0x7ffdd2df4c04, ret_shutdown_verb=0x7ffdd2df4c58, ret_fds=0x7ffdd2df4c70, ret_switch_root_dir=0x7ffdd2df4c48, ret_switch_root_init=0x7ffdd2df4c50, ret_error_message=0x7ffdd2df4c60) at ../src/core/main.c:1792
 #21 0x000056177aa7f251 in main (argc=2, argv=0x7ffdd2df4e78) at ../src/core/main.c:2573

Fix this by checking the correct variable.
fbuihuu pushed a commit that referenced this pull request Apr 12, 2019
I had a test machine with ulimit -n set to 1073741816 through pam
("session required pam_limits.so set_all", which copies the limits from PID 1,
left over from testing of #10921).

test-execute would "hang" and then fail with a timeout when running
exec-inaccessiblepaths-proc.service. It turns out that the problem was in
close_all_fds(), which would go to the fallback path of doing close()
1073741813 times. Let's just fail if we hit this case. This only matters
for cases where both /proc is inaccessible, and the *soft* limit has been
raised.

  (gdb) bt
  #0  0x00007f7e2e73fdc8 in close () from target:/lib64/libc.so.6
  #1  0x00007f7e2e42cdfd in close_nointr ()
     from target:/home/zbyszek/src/systemd-work3/build-rawhide/src/shared/libsystemd-shared-241.so
  #2  0x00007f7e2e42d525 in close_all_fds ()
     from target:/home/zbyszek/src/systemd-work3/build-rawhide/src/shared/libsystemd-shared-241.so
  #3  0x0000000000426e53 in exec_child ()
  #4  0x0000000000429578 in exec_spawn ()
  #5  0x00000000004ce1ab in service_spawn ()
  #6  0x00000000004cff77 in service_enter_start ()
  #7  0x00000000004d028f in service_enter_start_pre ()
  #8  0x00000000004d16f2 in service_start ()
  #9  0x00000000004568f4 in unit_start ()
  #10 0x0000000000416987 in test ()
  #11 0x0000000000417632 in test_exec_inaccessiblepaths ()
  #12 0x0000000000419362 in run_tests ()
  #13 0x0000000000419632 in main ()
fbuihuu pushed a commit that referenced this pull request Sep 3, 2019
We would not send the property because we'd call sd_bus_get_current_message()
which would return NULL. If there is no message, we cannot support /self or
/auto, but things are still OK if a path with a session name is given.

Traceback when the issue is triggered:

 #2  we'd call sd_bus_get_current_message() here, which would return NULL, and
     session_object_find() would immediately return 0.
 #3  0x00000000004289b7 in session_object_find (bus=0x9f1110, path=0xa160b0 "/org/freedesktop/login1/session/c2",
     interface=0x9efda0 "org.freedesktop.login1.Session", userdata=0x9852f0, found=0x7ffe3e975fe8, error=0x7ffe3e9760b0)
     at ../src/login/logind-session-dbus.c:620
 #4  0x00007ff74bfdde39 in node_vtable_get_userdata (bus=0x9f1110, path=0xa160b0 "/org/freedesktop/login1/session/c2",
     c=0x9f6d58, userdata=0x7ffe3e976070, error=0x7ffe3e9760b0) at ../src/libsystemd/sd-bus/bus-objects.c:37
 #5  0x00007ff74bfe49af in emit_properties_changed_on_interface (bus=0x9f1110,
     prefix=0xa133a0 "/org/freedesktop/login1/session", path=0xa160b0 "/org/freedesktop/login1/session/c2",
     interface=0x43f9f8 "org.freedesktop.login1.Session", require_fallback=true, found_interface=0x7ffe3e976163,
     names=0x7ffe3e9761b0) at ../src/libsystemd/sd-bus/bus-objects.c:2088
 #6  0x00007ff74bfe56a4 in sd_bus_emit_properties_changed_strv (bus=0x9f1110,
     path=0xa160b0 "/org/freedesktop/login1/session/c2", interface=0x43f9f8 "org.freedesktop.login1.Session",
     names=0x7ffe3e9761b0) at ../src/libsystemd/sd-bus/bus-objects.c:2291
 #7  0x00000000004292ea in session_send_changed (s=0xa16e10, properties=0x43ee27 "Active")
    at ../src/login/logind-session-dbus.c:730
 #8  0x0000000000424cd7 in seat_set_active (s=0x9ee280, session=0xa16e10) at ../src/login/logind-seat.c:249
 #9  0x00000000004251cf in seat_active_vt_changed (s=0x9ee280, vtnr=3) at ../src/login/logind-seat.c:361
 #10 0x000000000042547b in seat_read_active_vt (s=0x9ee280) at ../src/login/logind-seat.c:395
 #11 0x000000000040ab5c in manager_dispatch_console (s=0x9f0320, fd=8, revents=8, userdata=0x9852f0)
     at ../src/login/logind.c:588
 #12 0x00007ff74c042d5f in source_dispatch (s=0x9f0320) at ../src/libsystemd/sd-event/sd-event.c:2828
 #13 0x00007ff74c04469f in sd_event_dispatch (e=0x9ef340) at ../src/libsystemd/sd-event/sd-event.c:3241
 #14 0x00007ff74c044b58 in sd_event_run (e=0x9ef340, timeout=18446744073709551615)
     at ../src/libsystemd/sd-event/sd-event.c:3299
 #15 0x000000000040d7e8 in manager_run (m=0x9852f0) at ../src/login/logind.c:1186
 #16 0x000000000040db58 in run (argc=1, argv=0x7ffe3e976728) at ../src/login/logind.c:1234
 #17 0x000000000040dc30 in main (argc=1, argv=0x7ffe3e976728) at ../src/login/logind.c:1244

Fixes #13437. Bug introduced in 3b92c08.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
You can’t perform that action at this time.