Possible bug #5188

Closed
CodeKirin-dragon opened this issue Nov 12, 2023 · 4 comments

@CodeKirin-dragon

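The program runs fine under normal conditions, but after running for a long time it starts consuming a large amount of CPU. My initial suspicion was an infinite loop in the PHP code, but my investigation turned up nothing. How should I go about tracking this down?

Following the documentation, I tried debugging with GDB.
I am not familiar with it, so below is the stack trace it printed. I would appreciate it if someone could take a look. The other processes print similar output.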

(gdb) bt
#0  0x00007f477ab0d91e in __libc_recv (fd=272, buf=0x7f47142cc358, len=1, flags=0)
    at ../sysdeps/unix/sysv/linux/recv.c:28
#1  0x00007f476ff1ca49 in recv (__flags=0, __n=1, __buf=0x7f47142cc358, __fd=<optimized out>)
    at /usr/include/x86_64-linux-gnu/bits/socket2.h:44
#2  swoole::network::Socket::recv (this=0x55fa85b73ef0, __buf=0x7f47142cc358, __n=1, 
    __flags=__flags@entry=0) at /root/swoole/swoole-src-4.8.13/src/network/socket.cc:709
#3  0x00007f476ff022e3 in swoole::coroutine::Socket::recv_all (this=0x55fa851644a0, __buf=<optimized out>, 
    __buf@entry=0x7f47142cc358, __n=<optimized out>)
    at /root/swoole/swoole-src-4.8.13/src/coroutine/socket.cc:1026
#4  0x00007f476fee12b2 in swoole_socket_coro_recv (type=SOCKET_RECV_ALL, return_value=0x7f46d4e09da0, 
    execute_data=0x7f46d4e09e10) at /root/swoole/swoole-src-4.8.13/ext-src/swoole_socket_coro.cc:1299
#5  zim_swoole_socket_coro_recvAll (execute_data=0x7f46d4e09e10, return_value=0x7f46d4e09da0)
    at /root/swoole/swoole-src-4.8.13/ext-src/swoole_socket_coro.cc:1328
#6  0x000055fa8022cac0 in execute_ex ()
#7  0x00007f476fe68536 in swoole::PHPCoroutine::main_func (arg=0x7fff79974dd0)
    at /root/swoole/swoole-src-4.8.13/ext-src/swoole_coroutine.cc:801
#8  0x00007f476fef6ebd in std::function<void (void*)>::operator()(void*) const (__args#0=<optimized out>, 
    this=0x55fa850eafa0) at /usr/include/c++/7/bits/std_function.h:706
---Type <return> to continue, or q <return> to quit---


  1. What version of Swoole are you using (show your php --ri swoole)?
root@S58-181:~# php --ri swoole

swoole

Swoole => enabled
Author => Swoole Team <team@swoole.com>
Version => 4.8.13
Built => Mar 27 2023 23:02:04
coroutine => enabled with boost asm context
epoll => enabled
eventfd => enabled
signalfd => enabled
cpu_affinity => enabled
spinlock => enabled
rwlock => enabled
openssl => OpenSSL 1.1.1  11 Sep 2018
dtls => enabled
curl-native => enabled
pcre => enabled
zlib => 1.2.11
mutex_timedlock => enabled
pthread_barrier => enabled
futex => enabled
async_redis => enabled

Directive => Local Value => Master Value
swoole.enable_coroutine => On => On
swoole.enable_library => On => On
swoole.enable_preemptive_scheduler => Off => Off
swoole.display_errors => On => On
swoole.use_shortname => Off => Off
swoole.unixsock_buffer_size => 8388608 => 8388608
  2. What is your machine environment used (show your uname -a & php -v & gcc -v) ?
root@S58-181:~# uname -a
Linux S58-181 4.15.0-20-generic #21-Ubuntu SMP Tue Apr 24 06:16:15 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
root@S58-181:~# php -v
PHP 8.0.11 (cli) (built: Sep 25 2021 08:23:49) ( NTS )
Copyright (c) The PHP Group
Zend Engine v4.0.11, Copyright (c) Zend Technologies
root@S58-181:~# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 7.5.0-3ubuntu1~18.04' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-7 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04) 
root@S58-181:~# 
@NathanFreeman
Member

Run strace -p <pid> to see what the process is currently doing.

@CodeKirin-dragon
Author

Run strace -p <pid> to see what the process is currently doing.

openat(AT_FDCWD, "/proc/1077/stat", O_RDONLY) = 412
read(412, "1077 (redis-server) R 1 1077 107"..., 4096) = 363
close(412)                              = 0
wait4(-1, 0x7ffc10b51654, WNOHANG, NULL) = 0
write(111, "*2\r\n$6\r\n103974\r\n*0\r\n", 20) = 20
write(136, "$11\r\nd:48657820;\r\n", 18) = 18
write(393, "*2\r\n$5\r\n29060\r\n*0\r\n", 19) = 19
epoll_wait(5, [{EPOLLIN, {u32=239, u64=239}}, {EPOLLIN, {u32=173, u64=173}}], 10128, 100) = 2
read(239, "*6\r\n$4\r\nSCAN\r\n$6\r\n151011\r\n$5\r\nCO"..., 16384) = 124
read(173, "*6\r\n$4\r\nSCAN\r\n$6\r\n103974\r\n$5\r\nCO"..., 16384) = 124
write(173, "*2\r\n$5\r\n67233\r\n*0\r\n", 19) = 19
write(239, "*2\r\n$6\r\n195735\r\n*0\r\n", 20) = 20
epoll_wait(5, [{EPOLLIN, {u32=243, u64=243}}, {EPOLLIN, {u32=382, u64=382}}, {EPOLLIN, {u32=395, u64=395}}], 10128, 77) = 3
read(243, "*6\r\n$4\r\nSCAN\r\n$5\r\n29060\r\n$5\r\nCOU"..., 16384) = 123
read(382, "*6\r\n$4\r\nSCAN\r\n$5\r\n67233\r\n$5\r\nCOU"..., 16384) = 123
read(395, "*6\r\n$4\r\nSCAN\r\n$6\r\n195735\r\n$5\r\nCO"..., 16384) = 124
write(395, "*2\r\n$1\r\n0\r\n*0\r\n", 15) = 15
write(382, "*2\r\n$6\r\n207461\r\n*0\r\n", 20) = 20
write(243, "*2\r\n$6\r\n233666\r\n*0\r\n", 20) = 20
epoll_wait(5, [{EPOLLIN, {u32=199, u64=199}}], 10128, 47) = 1
read(199, "*2\r\n$3\r\nDEL\r\n$47\r\ncache.file.inf"..., 16384) = 67
write(199, ":0\r\n", 4)                 = 4
epoll_wait(5, [{EPOLLIN, {u32=190, u64=190}}, {EPOLLIN, {u32=384, u64=384}}], 10128, 47) = 2
read(190, "*6\r\n$4\r\nSCAN\r\n$6\r\n207461\r\n$5\r\nCO"..., 16384) = 124
read(384, "*6\r\n$4\r\nSCAN\r\n$6\r\n233666\r\n$5\r\nCO"..., 16384) = 124
write(384, "*2\r\n$6\r\n103974\r\n*0\r\n", 20) = 20
write(190, "*2\r\n$6\r\n151011\r\n*0\r\n", 20) = 20
epoll_wait(5, [{EPOLLIN, {u32=204, u64=204}}, {EPOLLIN, {u32=240, u64=240}}], 10128, 31) = 2
read(204, "*2\r\n$6\r\nEXISTS\r\n$56\r\nswoole.prox"..., 16384) = 79
read(240, "*6\r\n$4\r\nSCAN\r\n$6\r\n103974\r\n$5\r\nCO"..., 16384) = 124
write(240, "*2\r\n$5\r\n67233\r\n*0\r\n", 19) = 19
write(204, ":0\r\n", 4)                 = 4
epoll_wait(5, [{EPOLLIN, {u32=355, u64=355}}, {EPOLLIN, {u32=393, u64=393}}], 10128, 23) = 2
read(355, "*6\r\n$4\r\nSCAN\r\n$6\r\n151011\r\n$5\r\nCO"..., 16384) = 124
read(393, "*6\r\n$4\r\nSCAN\r\n$5\r\n67233\r\n$5\r\nCOU"..., 16384) = 123
getpid()                                = 1077
openat(AT_FDCWD, "/proc/1077/stat", O_RDONLY) = 412
read(412, "1077 (redis-server) R 1 1077 107"..., 4096) = 362
close(412)                              = 0
wait4(-1, 0x7ffc10b51654, WNOHANG, NULL) = 0
write(393, "*2\r\n$6\r\n207461\r\n*0\r\n", 20) = 20
write(355, "*2\r\n$6\r\n195735\r\n*0\r\n", 20) = 20
epoll_wait(5, [{EPOLLIN, {u32=27, u64=27}}, {EPOLLIN, {u32=419, u64=419}}, {EPOLLIN, {u32=175, u64=175}}], 10128, 100) = 3
read(27, "*6\r\n$4\r\nSCAN\r\n$1\r\n0\r\n$5\r\nCOUNT\r\n"..., 16384) = 119
read(419, "*2\r\n$3\r\nGET\r\n$36\r\ncache.system.n"..., 16384) = 56
read(175, "*2\r\n$3\r\nGET\r\n$31\r\ncache.system.n"..., 16384) = 51
write(175, "$11\r\nd:48657820;\r\n", 18) = 18
write(419, "$23\r\ns:15:\"181497319767611\";\r\n", 30) = 30
write(27, "*2\r\n$5\r\n29060\r\n*0\r\n", 19) = 19
epoll_wait(5, [{EPOLLIN, {u32=243, u64=243}}, {EPOLLIN, {u32=383, u64=383}}], 10128, 89) = 2
read(243, "*6\r\n$4\r\nSCAN\r\n$6\r\n207461\r\n$5\r\nCO"..., 16384) = 124
read(383, "*6\r\n$4\r\nSCAN\r\n$6\r\n195735\r\n$5\r\nCO"..., 16384) = 124
write(383, "*2\r\n$1\r\n0\r\n*0\r\n", 15) = 15
write(243, "*2\r\n$6\r\n151011\r\n*0\r\n", 20) = 20
epoll_wait(5, [{EPOLLIN, {u32=156, u64=156}}, {EPOLLIN, {u32=884, u64=884}}], 10128, 70) = 2
read(156, "*6\r\n$4\r\nSCAN\r\n$5\r\n29060\r\n$5\r\nCOU"..., 16384) = 123
read(884, "*4\r\n$5\r\nSETEX\r\n$31\r\ncache.system"..., 16384) = 83
write(884, "+OK\r\n", 5)                = 5
write(156, "*2\r\n$6\r\n233666\r\n*0\r\n", 20) = 20
epoll_wait(5, [{EPOLLIN, {u32=111, u64=111}}, {EPOLLIN, {u32=384, u64=384}}, {EPOLLIN, {u32=410, u64=410}}], 10128, 59) = 3
read(111, "*2\r\n$3\r\nDEL\r\n$47\r\ncache.file.inf"..., 16384) = 67
read(384, "*6\r\n$4\r\nSCAN\r\n$6\r\n151011\r\n$5\r\nCO"..., 16384) = 124
read(410, "*4\r\n$5\r\nSETEX\r\n$36\r\ncache.system"..., 16384) = 100
write(410, "+OK\r\n", 5)                = 5
write(384, "*2\r\n$6\r\n195735\r\n*0\r\n", 20) = 20
write(111, ":0\r\n", 4)                 = 4
epoll_wait(5, [{EPOLLIN, {u32=349, u64=349}}], 10128, 47) = 1
read(349, "*6\r\n$4\r\nSCAN\r\n$6\r\n233666\r\n$5\r\nCO"..., 16384) = 124
^Cstrace: Process 1077 detached

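Thanks for the reply. The process keeps printing output like this. It looks like a scheduled task in our business logic is operating on Redis in a loop, so I think I know where the problem lies. Is there a way to trace this down to the specific code? I could work backwards from this output, but the same keys are used in many places, so that would be slow.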
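One way to narrow the search from a trace like the one above is to tally which Redis commands appear most often. The following is a minimal sketch in Python, not something from the thread: the function name `tally_redis_commands` is hypothetical, and it assumes the trace was saved as text in strace's default format, where the read() lines carry the incoming commands (the /proc/1077/stat lines suggest the traced process here is redis-server).

```python
import re
from collections import Counter

# strace prints "\r\n" as literal backslash sequences, so the pattern matches
# those characters rather than real control characters. It captures the first
# element of a RESP array, which is the Redis command name.
RESP_COMMAND = re.compile(
    r'^read\(\d+, "\*\d+\\r\\n\$\d+\\r\\n([A-Za-z]+)\\r\\n'
)


def tally_redis_commands(lines):
    """Count Redis command names found in the read() calls of an strace log."""
    counts = Counter()
    for line in lines:
        match = RESP_COMMAND.match(line.strip())
        if match:
            counts[match.group(1).upper()] += 1
    return counts
```

Calling `tally_redis_commands(open("trace.log"))` on a saved trace (for example one written with strace's `-o` option) and printing `most_common()` would show which command dominates. Key names are truncated at strace's default string length, so a larger `-s` value would be needed to see them in full.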

@NathanFreeman
Member

NathanFreeman commented Nov 13, 2023

It looks like Swoole\Coroutine\Socket::recvAll() is being called with a buffer of length 1 each time, so every call fetches only 1 byte of data.
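To illustrate why reading one byte per call is expensive, here is a minimal sketch in plain Python. It uses a local socket pair as a stand-in and is not Swoole code; the helper name `count_recv_calls` and the payload size are assumptions made for the illustration. It counts how many recv() calls are needed to drain the same payload with different buffer sizes.

```python
import socket


def count_recv_calls(payload: bytes, chunk_size: int) -> int:
    """Send payload over a local socket pair and count the recv() calls
    needed to read all of it with the given buffer size."""
    writer, reader = socket.socketpair()
    with writer, reader:
        writer.sendall(payload)
        received = 0
        calls = 0
        while received < len(payload):
            data = reader.recv(chunk_size)
            if not data:  # peer closed before everything arrived
                break
            received += len(data)
            calls += 1
    return calls


payload = b"x" * 2048
per_byte_calls = count_recv_calls(payload, 1)            # one call per byte
buffered_calls = count_recv_calls(payload, len(payload))  # far fewer calls
```

With a 1-byte buffer the number of calls equals the payload size, and each call is a separate system call. A hot loop of that kind would be consistent with the recv(len=1) frame at the top of the backtrace.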

@matyhtf
Member

matyhtf commented Nov 14, 2023

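For problems like this, you can trace and debug with tools such as strace, gdb (zbacktrace), and perf.

The gdbinit scripts provided by php-src and swoole-src include many helper commands for inspecting the relevant information.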
