Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: ddappsec startup warnings #2501

Open
amberovsky opened this issue Feb 1, 2024 · 9 comments
Open

[Bug]: ddappsec startup warnings #2501

amberovsky opened this issue Feb 1, 2024 · 9 comments
Labels
🐛 bug Something isn't working

Comments

@amberovsky
Copy link

amberovsky commented Feb 1, 2024

Bug report

Ubuntu 20.04 in Docker

PHP Warning: PHP Startup: [ddappsec] Failed to retrieve root span meta in Unknown on line 0
PHP Warning: PHP Startup: [ddappsec] Failed to add tag _dd.appsec.event_rules.version in Unknown on line 0
PHP Warning: PHP Startup: [ddappsec] Failed to retrieve root span meta in Unknown on line 0
PHP Warning: PHP Startup: [ddappsec] Failed to add _dd.runtime_family to root span in Unknown on line 0

PHP version

PHP 8.2.15 (cli) (built: Jan 20 2024 14:16:39) (NTS)

Tracer or profiler version

0.97.0

Installed extensions

[PHP Modules]
bcmath
calendar
Core
ctype
curl
date
dom
exif
FFI
fileinfo
filter
ftp
gettext
hash
iconv
intl
json
libxml
mbstring
mongodb
openssl
pcntl
pcre
PDO
Phar
posix
random
readline
Reflection
session
shmop
SimpleXML
sockets
sodium
SPL
standard
sysvmsg
sysvsem
sysvshm
tokenizer
xml
xmlreader
xmlwriter
xsl
Zend OPcache
zlib

[Zend Modules]
Zend OPcache

Output of phpinfo()

Unfortunately I'm unable to provide either phpinfo() (internal service without direct http access) nor CLI SAPI (disabled)

There are no changes to the default config apart from these:

extension = ddtrace.so
extension = datadog-profiling.so
extension = ddappsec.so
datadog.appsec.enabled = On

Upgrading from

No response

@amberovsky amberovsky added the 🐛 bug Something isn't working label Feb 1, 2024
@Anilm3
Copy link
Contributor

Anilm3 commented Feb 1, 2024

Hello @amberovsky ,

This typically happens when ddtrace is not available:

  • Can you confirm that ddtrace is enabled and working as expected?
  • Do you have any other logs, such as warnings or errors?
  • Does this only happen at startup?

On the other hand, anything you could let us know about your setup could be useful for us to reproduce the issue, for example are you using apache or fpm? Any specific framework?

@amberovsky
Copy link
Author

amberovsky commented Feb 1, 2024

Hi @Anilm3 ,

  • ddtrace is enabled and working (although I'm not sure if it is working "as expected" - we've just started evaluating DD)
  • happens only at deployment/startup
  • there are also php-fpm SIGSEGVs happening during the workload, please see logs below. I don't have coredumps at the moment, and I'm trying to understand whether they are related to appsec or profiling extensions.

Symfony 6.4, nginx, php-fpm, OpCache

[28-Jan-2024 19:55:39] WARNING: [pool www] child 26668 exited on signal 11 (SIGSEGV) after 5197.846840 seconds from start
[29-Jan-2024 11:41:57] WARNING: [pool www] child 26718 exited on signal 11 (SIGSEGV) after 8.092824 seconds from start
[29-Jan-2024 20:23:59] WARNING: [pool www] child 16897 exited on signal 11 (SIGSEGV) after 596.541010 seconds from start
[29-Jan-2024 20:24:06] WARNING: [pool www] child 17184 exited on signal 11 (SIGSEGV) after 7.301436 seconds from start
[29-Jan-2024 22:08:50] WARNING: [pool www] child 16032 exited on signal 11 (SIGSEGV) after 7546.904623 seconds from start
[30-Jan-2024 13:54:52] WARNING: [pool www] child 102 exited on signal 11 (SIGSEGV) after 417.388427 seconds from start
[30-Jan-2024 15:00:33] WARNING: [pool www] child 98 exited on signal 11 (SIGSEGV) after 4357.623340 seconds from start
[30-Jan-2024 15:00:33] WARNING: [pool www] child 95 exited on signal 11 (SIGSEGV) after 4358.157075 seconds from start
[30-Jan-2024 15:13:45] WARNING: [pool www] child 113 exited on signal 11 (SIGSEGV) after 5149.703371 seconds from start
[30-Jan-2024 21:52:23] WARNING: [pool www] child 22142 exited on signal 11 (SIGSEGV) after 153.467050 seconds from start
[31-Jan-2024 07:10:49] WARNING: [pool www] child 28520 exited on signal 11 (SIGSEGV) after 25816.944905 seconds from start
[31-Jan-2024 08:38:40] WARNING: [pool www] child 28202 exited on signal 11 (SIGSEGV) after 31568.360108 seconds from start

And

Jan 28 19:55:39 kernel: php-fpm8.2[14862]: segfault at 56160000030e ip 00007fe3982d6438 sp 00007ffd4c8deb80 error 4 in datadog-profiling.so[7fe397f98000+4f9000]
Jan 29 11:41:57 kernel: traps: php-fpm8.2[18537] general protection fault ip:561634482f76 sp:7ffd4c8dedd0 error:0 in zero (deleted)[561634400000+6400000]
Jan 29 20:23:59 kernel: traps: php-fpm8.2[23992] general protection fault ip:561634482f76 sp:7ffd4c8dedd0 error:0 in zero (deleted)[561634400000+6400000]
...

UPD: Just saw the segfault is related to datadog-profiling.so

@Anilm3
Copy link
Contributor

Anilm3 commented Feb 1, 2024

I'm not sure if the segfaults are related at all, but perhaps that's a more pressing problem.

@realFlowControl can you have a look?

@bwoebi
Copy link
Collaborator

bwoebi commented Feb 1, 2024

@amberovsky Is there a possibility that you could get core dumps and post us at least the stacktraces? That would be probably helpful in triaging. Thanks!

@amberovsky
Copy link
Author

@bwoebi @Anilm3 does this help?

(gdb) thread apply all bt full

Thread 4 (Thread 0x7f0f5adfd700 (LWP 328)):
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
No locals.
#1  0x00007f0f5c54d985 in std::sys::unix::futex::futex_wait () at library/std/src/sys/unix/futex.rs:62
No locals.
#2  std::sys_common::thread_parking::futex::Parker::park () at library/std/src/sys_common/thread_parking/futex.rs:52
No locals.
#3  std::thread::park () at library/std/src/thread/mod.rs:978
No locals.
#4  0x00007f0f5c310e3d in crossbeam_channel::context::Context::wait_until () at /rust/cargo/registry/src/index.crates.io-6f17d22bba15001f/crossbeam-channel-0.5.8/src/context.rs:177
No locals.
#5  0x00007f0f5c311102 in crossbeam_channel::flavors::array::Channel<T>::recv::{{closure}} () at /rust/cargo/registry/src/index.crates.io-6f17d22bba15001f/crossbeam-channel-0.5.8/src/flavors/array.rs:431
No locals.
#6  crossbeam_channel::context::Context::with::{{closure}} () at /rust/cargo/registry/src/index.crates.io-6f17d22bba15001f/crossbeam-channel-0.5.8/src/context.rs:52
No locals.
#7  0x00007f0f5c311c17 in crossbeam_channel::context::Context::with::{{closure}} () at /rust/cargo/registry/src/index.crates.io-6f17d22bba15001f/crossbeam-channel-0.5.8/src/context.rs:60
No locals.
#8  std::thread::local::LocalKey<T>::try_with () at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/thread/local.rs:270
No locals.
#9  crossbeam_channel::context::Context::with () at /rust/cargo/registry/src/index.crates.io-6f17d22bba15001f/crossbeam-channel-0.5.8/src/context.rs:55
No locals.
#10 crossbeam_channel::flavors::array::Channel<T>::recv () at /rust/cargo/registry/src/index.crates.io-6f17d22bba15001f/crossbeam-channel-0.5.8/src/flavors/array.rs:420
No locals.
#11 0x00007f0f5c343caa in crossbeam_channel::channel::Receiver<T>::recv () at /rust/cargo/registry/src/index.crates.io-6f17d22bba15001f/crossbeam-channel-0.5.8/src/channel.rs:814
No locals.
#12 0x00007f0f5c34e811 in datadog_php_profiling::profiling::uploader::Uploader::run () at profiling/src/profiling/uploader.rs:101
No locals.
#13 0x00007f0f5c31bf07 in datadog_php_profiling::profiling::Profiler::new::{{closure}} () at profiling/src/profiling/mod.rs:527
No locals.
#14 datadog_php_profiling::profiling::thread_utils::spawn::{{closure}} () at profiling/src/profiling/thread_utils.rs:38
No locals.
#15 std::sys_common::backtrace::__rust_begin_short_backtrace () at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/sys_common/backtrace.rs:135
No locals.
#16 0x00007f0f5c33019d in std::thread::Builder::spawn_unchecked_::{{closure}}::{{closure}} () at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/thread/mod.rs:529
No locals.
#17 <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once () at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/core/src/panic/unwind_safe.rs:271
No locals.
#18 std::panicking::try::do_call () at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panicking.rs:500
No locals.
#19 std::panicking::try () at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panicking.rs:464
No locals.
#20 std::panic::catch_unwind () at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panic.rs:142
--Type <RET> for more, q to quit, c to continue without paging--
No locals.
#21 std::thread::Builder::spawn_unchecked_::{{closure}} () at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/thread/mod.rs:528
No locals.
#22 core::ops::function::FnOnce::call_once{{vtable-shim}} () at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/core/src/ops/function.rs:250
No locals.
#23 0x00007f0f5c557c66 in <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once () at library/alloc/src/boxed.rs:1985
No locals.
#24 <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once () at library/alloc/src/boxed.rs:1985
No locals.
#25 std::sys::unix::thread::Thread::new::thread_start () at library/std/src/sys/unix/thread.rs:108
No locals.
#26 0x00007f0f6167f609 in start_thread (arg=<optimized out>) at pthread_create.c:477
        ret = <optimized out>
        pd = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {139703925856000, -8450623212753878221, 140726012776414, 140726012776415, 140726012776608, 139703925853440, 8405950328496882483, 8405854604155192115}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = 0
#27 0x00007f0f617b9353 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
No locals.

Thread 3 (Thread 0x7f0f5affe700 (LWP 327)):
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
No locals.
#1  0x00007f0f5c5573a4 in std::sys::unix::futex::futex_wait () at library/std/src/sys/unix/futex.rs:62
No locals.
#2  0x00007f0f5c54d9fc in std::sys_common::thread_parking::futex::Parker::park_timeout () at library/std/src/sys_common/thread_parking/futex.rs:72
No locals.
#3  std::thread::park_timeout () at library/std/src/thread/mod.rs:1045
No locals.
#4  0x00007f0f5c30a0ce in crossbeam_channel::context::Context::wait_until () at src/context.rs:168
No locals.
#5  crossbeam_channel::select::run_select::{{closure}} () at src/select.rs:260
No locals.
#6  crossbeam_channel::context::Context::with::{{closure}} () at src/context.rs:52
No locals.
#7  0x00007f0f5c30a420 in crossbeam_channel::context::Context::with::{{closure}} () at src/context.rs:60
No locals.
#8  std::thread::local::LocalKey<T>::try_with () at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/thread/local.rs:270
No locals.
#9  crossbeam_channel::context::Context::with () at src/context.rs:55
No locals.
#10 crossbeam_channel::select::run_select () at src/select.rs:212
No locals.
--Type <RET> for more, q to quit, c to continue without paging--
#11 0x00007f0f5c334e89 in crossbeam_channel::select::select () at /rust/cargo/registry/src/index.crates.io-6f17d22bba15001f/crossbeam-channel-0.5.8/src/select.rs:474
No locals.
#12 datadog_php_profiling::profiling::TimeCollector::run () at profiling/src/profiling/mod.rs:430
No locals.
#13 0x00007f0f5c31c127 in datadog_php_profiling::profiling::Profiler::new::{{closure}} () at profiling/src/profiling/mod.rs:523
No locals.
#14 datadog_php_profiling::profiling::thread_utils::spawn::{{closure}} () at profiling/src/profiling/thread_utils.rs:38
No locals.
#15 std::sys_common::backtrace::__rust_begin_short_backtrace () at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/sys_common/backtrace.rs:135
No locals.
#16 0x00007f0f5c33046d in std::thread::Builder::spawn_unchecked_::{{closure}}::{{closure}} () at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/thread/mod.rs:529
No locals.
#17 <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once () at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/core/src/panic/unwind_safe.rs:271
No locals.
#18 std::panicking::try::do_call () at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panicking.rs:500
No locals.
#19 std::panicking::try () at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panicking.rs:464
No locals.
#20 std::panic::catch_unwind () at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/panic.rs:142
No locals.
#21 std::thread::Builder::spawn_unchecked_::{{closure}} () at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/thread/mod.rs:528
No locals.
#22 core::ops::function::FnOnce::call_once{{vtable-shim}} () at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/core/src/ops/function.rs:250
No locals.
#23 0x00007f0f5c557c66 in <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once () at library/alloc/src/boxed.rs:1985
No locals.
#24 <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once () at library/alloc/src/boxed.rs:1985
No locals.
#25 std::sys::unix::thread::Thread::new::thread_start () at library/std/src/sys/unix/thread.rs:108
No locals.
#26 0x00007f0f6167f609 in start_thread (arg=<optimized out>) at pthread_create.c:477
        ret = <optimized out>
        pd = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {139703927957248, -8450623212753878221, 140726012776398, 140726012776399, 140726012776592, 139703927954688, 8405950054155846451, 8405854604155192115}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = 0
#27 0x00007f0f617b9353 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
No locals.

Thread 2 (Thread 0x7f0f5b9ff700 (LWP 326)):
#0  futex_abstimed_wait_cancelable (private=<optimized out>, abstime=0x7f0f5b9fea90, clockid=<optimized out>, expected=0, futex_word=0x562c80d34010) at ../sysdeps/nptl/futex-internal.h:320
        __ret = -516
        clockbit = <optimized out>
--Type <RET> for more, q to quit, c to continue without paging--
        op = <optimized out>
        __ret = <optimized out>
        oldtype = 0
        err = <optimized out>
        oldtype = <optimized out>
        err = <optimized out>
        __ret = <optimized out>
        clockbit = <optimized out>
        op = <optimized out>
        __ret = <optimized out>
        resultvar = <optimized out>
        __arg6 = <optimized out>
        __arg5 = <optimized out>
        __arg4 = <optimized out>
        __arg3 = <optimized out>
        __arg2 = <optimized out>
        __arg1 = <optimized out>
        _a6 = <optimized out>
        _a5 = <optimized out>
        _a4 = <optimized out>
        _a3 = <optimized out>
        _a2 = <optimized out>
        _a1 = <optimized out>
#1  __pthread_cond_wait_common (abstime=0x7f0f5b9fea90, clockid=<optimized out>, mutex=0x562c80d33f18, cond=0x562c80d33fe8) at pthread_cond_wait.c:520
        spin = 0
        buffer = {__routine = 0x7f0f61686050 <__condvar_cleanup_waiting>, __arg = 0x7f0f5b9fe9d0, __canceltype = 1590350032, __prev = 0x0}
        cbuffer = {wseq = 334, cond = 0x562c80d33fe8, mutex = 0x562c80d33f18, private = 0}
        err = <optimized out>
        g = 0
        flags = <optimized out>
        g1_start = <optimized out>
        maxspin = 0
        signals = <optimized out>
        result = 0
        wseq = 334
        seq = 167
        private = <optimized out>
        maxspin = <optimized out>
        err = <optimized out>
        result = <optimized out>
        wseq = <optimized out>
        g = <optimized out>
        seq = <optimized out>
--Type <RET> for more, q to quit, c to continue without paging--
        flags = <optimized out>
        private = <optimized out>
        signals = <optimized out>
        g1_start = <optimized out>
        spin = <optimized out>
        buffer = <optimized out>
        cbuffer = <optimized out>
        s = <optimized out>
#2  __pthread_cond_timedwait (cond=0x562c80d33fe8, mutex=0x562c80d33f18, abstime=0x7f0f5b9fea90) at pthread_cond_wait.c:665
        flags = <optimized out>
        clockid = <optimized out>
#3  0x00007f0f5ca51178 in _dd_writer_loop (_=<optimized out>) at /home/circleci/datadog/tmp/build_extension/ext/coms.c:986
        wait_deadline = {tv_sec = 1707909696, tv_nsec = 526932000}
        interval = <optimized out>
        stack = <optimized out>
        processed_stacks = <optimized out>
        curl = <optimized out>
        running = true
        __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {139703938443912, 8405948680481440563, 140726012777774, 140726012777775, 140726012777776, 139703938444544, 8405948680504509235, 8405937626107642675}, __mask_was_saved = 0}}, __pad = {0x7f0f5b9fec50, 0x0, 0x0, 0x0}}
        __cancel_routine = 0x7f0f5ca4fe40 <_dd_writer_loop_cleanup>
        __cancel_arg = <optimized out>
        __not_first_call = <optimized out>
        sigset = {__val = {67127815, 0 <repeats 15 times>}}
        writer = 0x7f0f5d0d1620 <global_writer>
        caphdrp = {version = 537396514, pid = 0}
        capdatap = {{effective = 0, permitted = 0, inheritable = 0}, {effective = 0, permitted = 0, inheritable = 0}}
#4  0x00007f0f6167f609 in start_thread (arg=<optimized out>) at pthread_create.c:477
        ret = <optimized out>
        pd = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {139703938447104, -8450623212753878221, 140726012777774, 140726012777775, 140726012777776, 139703938444544, 8405948680303182643, 8405854604155192115}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = 0
#5  0x00007f0f617b9353 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
No locals.

Thread 1 (Thread 0x7f0f5f6375c0 (LWP 191)):
#0  0x0000562c7f4fb5cb in ?? ()
No symbol table info available.
#1  0x00007f0f5f5de480 in ?? ()
No symbol table info available.
#2  0x00007f0f58c33178 in ?? ()
No symbol table info available.
--Type <RET> for more, q to quit, c to continue without paging--
#3  0x0000000000000308 in ?? ()
No symbol table info available.
#4  0x0000562c82366968 in ?? ()
No symbol table info available.
#5  0x00007f0f58c33178 in ?? ()
No symbol table info available.
#6  0x0000562c910a5d2c in ?? ()
No symbol table info available.
#7  0x00007f0f58a82a10 in ?? ()
No symbol table info available.
#8  0x0000000000000001 in ?? ()
No symbol table info available.
#9  0x000000035f402eb0 in ?? ()
No symbol table info available.
#10 0x00007ffd540074b0 in ?? ()
No symbol table info available.
#11 0x00007f0f58c583a0 in ?? ()
No symbol table info available.
#12 0x0000000000000001 in ?? ()
No symbol table info available.
#13 0x00007f0f5f5de480 in ?? ()
No symbol table info available.
#14 0x00007f0f00000001 in ?? ()
No symbol table info available.
#15 0x0000000000000002 in ?? ()
No symbol table info available.
#16 0x0000000000000001 in ?? ()
No symbol table info available.
#17 0x00007f0f5f4980d8 in ?? ()
No symbol table info available.
#18 0x0000562c7f582c81 in ?? ()
No symbol table info available.
#19 0x00007f0f5f413020 in ?? ()
No symbol table info available.
#20 0x9cddec2256084100 in ?? ()
No symbol table info available.
#21 0x00007f0f5f413020 in ?? ()
No symbol table info available.
#22 0x00007f0f5f466a00 in ?? ()
No symbol table info available.
#23 0x00007f0f5f413020 in ?? ()
No symbol table info available.
#24 0x0000562c7f776900 in ?? ()
--Type <RET> for more, q to quit, c to continue without paging--
No symbol table info available.
#25 0x0000000000000000 in ?? ()
No symbol table info available.
(gdb)

@amberovsky
Copy link
Author

I can upload the coredump itself if there is a secure way of doing so (E.g. if I can msg you the s3 path?)

@bwoebi
Copy link
Collaborator

bwoebi commented Feb 16, 2024

Hm, the crashing thread is thread #1 and it's full of question marks. There are three options:
a) you don't have debug symbols for PHP itself
b) it's JIT code and it crashes in JIT
c) It's a crash in the rust code.

For c) you could try https://output.circle-artifacts.com/output/job/280b1883-9ac5-42e7-b9fb-6c19b6101509/artifacts/0/datadog-setup.php - we have now debug symbols in our rust code too. If a), you may try to install them. If b) it's going to be hard...

@realFlowControl
Copy link
Collaborator

Hey there @amberovsky,

did you have a chance to try the suggestions?

You may also reach out to us in private via https://chat.datadoghq.com/ and find me as https://datadoghq.slack.com/team/U04MJBH48A2 so we can try and help you debug the crash

Kind regards
Florian

@amberovsky
Copy link
Author

Hi @realFlowControl ,
I'm testing on php 8.3 and the latest plugin version now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🐛 bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants