-
-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
travis: also rebuild everything with ASan and UBSan and install dependencies necessary for running parse-hwdb.py #10678
Conversation
The most important part here is the second commit where |
I ran into https://docs.travis-ci.com/user/common-build-problems/#log-length-exceeded :-) I'll move it into a separate stage. So SYSTEMD_KBD_MODEL_MAP='/build/src/locale/kbd-model-map' PATH='/build/build:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin' SYSTEMD_LANGUAGE_FALLBACK_MAP='/build/src/locale/language-fallback-map' /build/build/test-sd-hwdb
--- stderr ---
Found container virtualization container-other.
/* test_failed_enumerate */
hwdb.bin does not exist, please run 'systemd-hwdb update'
=================================================================
==8268==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000000140 at pc 0x7fed4822ac27 bp 0x7ffcdc756c30 sp 0x7ffcdc756c20
WRITE of size 4 at 0x606000000140 thread T0
#0 0x7fed4822ac26 in sd_hwdb_unref ../src/libsystemd/sd-hwdb/sd-hwdb.c:373
#1 0x55a5a493e290 in sd_hwdb_unrefp ../src/systemd/sd-hwdb.h:41
#2 0x55a5a493e61e in test_failed_enumerate ../src/test/test-sd-hwdb.c:8
#3 0x55a5a493ee09 in main ../src/test/test-sd-hwdb.c:67
#4 0x7fed47c5e412 in __libc_start_main (/lib64/libc.so.6+0x24412)
#5 0x55a5a493e18d in _start (/build/build/test-sd-hwdb+0x218d)
0x606000000140 is located 0 bytes inside of 64-byte region [0x606000000140,0x606000000180)
freed by thread T0 here:
#0 0x7fed48667480 in free (/lib64/libasan.so.5+0xef480)
#1 0x7fed480f201e in freep ../src/basic/alloc-util.h:65
#2 0x7fed480f2726 in proc_cmdline_parse ../src/basic/proc-cmdline.c:86
#3 0x7fed480d7406 in log_parse_environment_realm ../src/basic/log.c:1121
#4 0x7fed480180af in test_setup_logging ../src/shared/tests.c:104
#5 0x55a5a493ee04 in main ../src/test/test-sd-hwdb.c:65
#6 0x7fed47c5e412 in __libc_start_main (/lib64/libc.so.6+0x24412)
previously allocated by thread T0 here:
#0 0x7fed48667c88 in realloc (/lib64/libasan.so.5+0xefc88)
#1 0x7fed48029796 in greedy_realloc ../src/basic/alloc-util.c:55
#2 0x7fed480f4824 in get_process_cmdline ../src/basic/process-util.c:147
#3 0x7fed480f21a4 in proc_cmdline ../src/basic/proc-cmdline.c:37
#4 0x7fed480f26bb in proc_cmdline_parse ../src/basic/proc-cmdline.c:91
#5 0x7fed480d7406 in log_parse_environment_realm ../src/basic/log.c:1121
#6 0x7fed480180af in test_setup_logging ../src/shared/tests.c:104
#7 0x55a5a493ee04 in main ../src/test/test-sd-hwdb.c:65
#8 0x7fed47c5e412 in __libc_start_main (/lib64/libc.so.6+0x24412)
SUMMARY: AddressSanitizer: heap-use-after-free ../src/libsystemd/sd-hwdb/sd-hwdb.c:373 in sd_hwdb_unref
Shadow bytes around the buggy address:
0x0c0c7fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c0c7fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c0c7fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0c0c7fff8000: fa fa fa fa 00 00 00 00 00 00 05 fa fa fa fa fa
0x0c0c7fff8010: fd fd fd fd fd fd fd fd fa fa fa fa fd fd fd fd
=>0x0c0c7fff8020: fd fd fd fd fa fa fa fa[fd]fd fd fd fd fd fd fd
0x0c0c7fff8030: fa fa fa fa fd fd fd fd fd fd fd fd fa fa fa fa
0x0c0c7fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c0c7fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c0c7fff8060: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c0c7fff8070: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==8268==ABORTING
------- |
0c98ac9
to
c570084
Compare
…s never allocated See systemd#10678 (comment).
6cdeea3
to
e42f84b
Compare
@mrc0mmand could you take a look? I made it work in the sense that the code compiles, the tests are run and pretty much all known issues have been successfully detected. But I wouldn't say that it's perfect :-) Feel free to push commits on top of this branch if you think that it would be better to rearrange stages (or, maybe, run two jobs in parallel), rename things and anything else you think could be improved. By the way, I also think that it would be useful to add something like that to CentOS CI. |
Will do!
I guess that's going to be an issue, as that works only for maintainers (i.e. people with write access to systemd/systemd)
Definitely. |
I thought that write access to the "systemd-centos-ci" repository you have somehow transferred to this one. @poettering @keszybz would it be possible to invite @mrc0mmand as a collaborator? |
e42f84b
to
ae7c480
Compare
…s never allocated See systemd#10678 (comment).
Anyway, I rebased the branch on top of master where as far as I can see all the issues are fixed so Travis CI should be green and, in theory, the PR is ready to roll :-) |
travis-ci/managers/fedora.sh
Outdated
$DOCKER_EXEC git clean -dxff | ||
$DOCKER_EXEC meson -Db_sanitize=address,undefined build | ||
$DOCKER_EXEC ninja -v -C build | ||
$DOCKER_EXEC sh -c "printf '#!/bin/sh\necho The test is failing for unknown reason, skipping; exit 77' >/build/build/test-capability" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one should be removed after #10687 is merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's part of the PR, so we should be fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's the second copy of the workaround. Kludges tend to sprawl :-) I'll wait for your PR to be merged to remove it altogether and to make sure that the test passes under ASan+UBsan too just in case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, I see. Thanks!
.travis.yml
Outdated
@@ -36,6 +36,25 @@ jobs: | |||
after_script: | |||
- $CI_MANAGERS/fedora.sh CLEANUP | |||
|
|||
- stage: Build & test with ASan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion it would be great if both standard and ASAN runs ran in parallel - dropping the stage definition here should do the trick.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried something like that but it didn't work out. Travis just ran the ASan part skipping the "basic" part altogether. Maybe, I got the indentation wrong. I'll try again to see how it goes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify what I meant by "later". My idea was to make Travis work and merge it as soon as possible so that the CI could start catching issues that are easily detectable by ASan+UBsan and the stages/jobs could be rearranged later in a way that wouldn't require catching up with new bugs lurking around :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a working fix for the parallel runs (e.g. https://travis-ci.org/mrc0mmand/systemd/jobs/452344355)
From a084480b15b817b95f5db4e1605d88635c1fdd99 Mon Sep 17 00:00:00 2001
From: Frantisek Sumsal <frantisek@sumsal.cz>
Date: Thu, 8 Nov 2018 12:47:07 +0100
Subject: [PATCH] travis: run ASan job in parallel with the standard one
---
.travis.yml | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/.travis.yml b/.travis.yml
index add0576d4..eaeb95051 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -36,8 +36,7 @@ jobs:
after_script:
- $CI_MANAGERS/fedora.sh CLEANUP
- - stage: Build & test with ASan
- name: Fedora Rawhide
+ - name: Fedora Rawhide (ASan)
language: bash
env:
- FEDORA_RELEASE="rawhide"
--
2.17.2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! Now I'm a bit puzzled about what went wrong when I tried to do the same thing :-) Anyway, I applied the patch. Let's see how it goes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks for the info. I'm looking into the test-capability
fail right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the meantime, I'm going to put the kludge there again and use meson test -t 3
to run the tests with 3 times the normal timeout. The PR is already useful as it is even without test-capability
, which can be fixed later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh, this is, once again, some black magic stuff.
Let's take a look at a more verbose version of the log:
==5649==AddressSanitizer Init done
have ambient caps: yes
Capabilities:= cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_resource,cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,cap_audit_control,cap_setfcap,cap_mac_override,cap_mac_admin,cap_syslog,cap_wake_alarm,cap_block_suspend,cap_audit_read+eip
Capabilities:= cap_dac_override,cap_net_raw+ep
==5656==Could not attach to thread 5655 (errno 1).
==5656==Failed suspending threads.
==5655==LeakSanitizer has encountered a fatal error.
==5655==HINT: For debugging, try setting environment variable LSAN_OPTIONS=verbosity=1:log_threads=1
==5655==HINT: LeakSanitizer does not work under ptrace (strace, gdb, etc)
From the issues and documentation I found so far it means that Docker is missing the SYS_PTRACE
capability. However, as shown in the output above, the capability is there (and docker run --cap-add SYS_PTRACE
makes no difference).
And to make matters worse, it, of course, works as expected on both CentOS and Fedora...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, after another painfully long debug session I finally found a fix (or workaround, still not sure). The fail is caused by test_drop_privileges_fail
, as it drops all privileges, even SYS_PTRACE
, which is necessary for ASan. I have no idea why it works on Fedora/CentOS, but the proposed patch below seems to fix the issue:
diff --git a/src/test/test-capability.c b/src/test/test-capability.c
index 730dbf8cb..fc0d347bb 100644
--- a/src/test/test-capability.c
+++ b/src/test/test-capability.c
@@ -21,7 +21,7 @@ static uid_t test_uid = -1;
static gid_t test_gid = -1;
/* We keep CAP_DAC_OVERRIDE to avoid errors with gcov when doing test coverage */
-static uint64_t test_flags = 1ULL << CAP_DAC_OVERRIDE;
+static uint64_t test_flags = (1ULL << CAP_DAC_OVERRIDE) | (1ULL << CAP_SYS_PTRACE);
@evverx - If it's a relatively sane solution, I'll open another PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mrc0mmand I've just opened #10692. Let's continue the discussion there.
ae7c480
to
20786e4
Compare
…s never allocated See systemd#10678 (comment).
4325a9e
to
45c7951
Compare
It should help to catch issues like systemd#10677.
…s never allocated See systemd#10678 (comment).
…stemd with it too
The test is currently failing when run under ASan in a docker container: ``` --- command --- SYSTEMD_KBD_MODEL_MAP='/build/src/locale/kbd-model-map' PATH='/build/build:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin' SYSTEMD_LANGUAGE_FALLBACK_MAP='/build/src/locale/language-fallback-map' /build/build/test-capability --- stderr --- have ambient caps: yes Capabilities:= cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_linux_immutable,cap_net_bind_service,cap_net_broadcast,cap_net_admin,cap_net_raw,cap_ipc_lock,cap_ipc_owner,cap_sys_module,cap_sys_rawio,cap_sys_chroot,cap_sys_ptrace,cap_sys_pacct,cap_sys_admin,cap_sys_boot,cap_sys_nice,cap_sys_resource,cap_sys_time,cap_sys_tty_config,cap_mknod,cap_lease,cap_audit_write,cap_audit_control,cap_setfcap,cap_mac_override,cap_mac_admin,cap_syslog,cap_wake_alarm,cap_block_suspend,cap_audit_read+eip Capabilities:= cap_dac_override,cap_net_raw+ep ==7021==LeakSanitizer has encountered a fatal error. ==7021==HINT: For debugging, try setting environment variable LSAN_OPTIONS=verbosity=1:log_threads=1 ==7021==HINT: LeakSanitizer does not work under ptrace (strace, gdb, etc) Assertion 'WIFEXITED(status) && WEXITSTATUS(status) == 0' failed at ../src/test/test-capability.c:71, function fork_test(). Aborting. ------- ``` https://api.travis-ci.org/v3/job/452349948/log.txt
45c7951
to
cdd116b
Compare
@poettering @keszybz @yuwata I think the PR is ready to merge. Could anyone press the merge button? |
I'm going to merge the PR so that the CI can start to catch issues easily detectable by ASan+UBSan. I could always revert it if anyone thinks that it's a bad idea. |
No description provided.