Intermittent hang at startup after defining __STDC_NO_ATOMICS__ #1353

Closed
HappyMiluo opened this issue Mar 6, 2021 · 5 comments

Comments


HappyMiluo commented Mar 6, 2021

image

I added these two lines to the Makefile to turn off atomics, and now startup hangs intermittently.
The gdb stack is as follows:

(gdb) info threads
28 Thread 0x7f57d623b700 (LWP 1727) 0x000000000040769a in skynet_handle_grab (handle=3) at skynet-src/skynet_handle.c:147
27 Thread 0x7f57d5a3a700 (LWP 1728) 0x000000000040769a in skynet_handle_grab (handle=23) at skynet-src/skynet_handle.c:147
26 Thread 0x7f57d5239700 (LWP 1729) 0x00007f57d71bc1c3 in epoll_wait () from /lib64/libc.so.6
25 Thread 0x7f57d4a38700 (LWP 1730) 0x00007f57d7b0268c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
24 Thread 0x7f57d4237700 (LWP 1731) 0x000000000040769a in skynet_handle_grab (handle=9) at skynet-src/skynet_handle.c:147
23 Thread 0x7f57d3a36700 (LWP 1732) 0x00007f57d7b0268c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
22 Thread 0x7f57d3235700 (LWP 1733) 0x00007f57d7b0268c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
21 Thread 0x7f57d2a34700 (LWP 1734) 0x00007f57d7b0268c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
20 Thread 0x7f57d2233700 (LWP 1735) 0x00007f57d7b0268c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
19 Thread 0x7f57d1a32700 (LWP 1736) 0x00007f57d7b0268c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
18 Thread 0x7f57d1231700 (LWP 1737) 0x00007f57d7b0268c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
17 Thread 0x7f57d0a30700 (LWP 1738) 0x000000000040769a in skynet_handle_grab (handle=1) at skynet-src/skynet_handle.c:147
16 Thread 0x7f57cbfff700 (LWP 1739) 0x00007f57d7b0268c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
15 Thread 0x7f57cb7fe700 (LWP 1740) 0x00007f57d7b0268c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
14 Thread 0x7f57caffd700 (LWP 1741) 0x00007f57d7b0268c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
13 Thread 0x7f57ca7fc700 (LWP 1742) 0x00007f57d7b0268c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
12 Thread 0x7f57c9ffb700 (LWP 1743) 0x00007f57d7b0268c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
11 Thread 0x7f57c97fa700 (LWP 1744) 0x000000000040762e in skynet_handle_register (ctx=0x7f57a802ba30) at skynet-src/skynet_handle.c:59
10 Thread 0x7f57c8ff9700 (LWP 1745) 0x000000000040769a in skynet_handle_grab (handle=57) at skynet-src/skynet_handle.c:147
9 Thread 0x7f57c87f8700 (LWP 1746) 0x00007f57d7b0268c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
8 Thread 0x7f57c7ff7700 (LWP 1747) 0x00007f57d7b0268c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
7 Thread 0x7f57c77f6700 (LWP 1748) 0x000000000040769a in skynet_handle_grab (handle=13) at skynet-src/skynet_handle.c:147
6 Thread 0x7f57c6ff5700 (LWP 1749) 0x00007f57d7b0268c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
5 Thread 0x7f57c67f4700 (LWP 1750) 0x00007f57d7b0268c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
4 Thread 0x7f57c5ff3700 (LWP 1751) 0x00007f57d7b0268c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
3 Thread 0x7f57c57f2700 (LWP 1752) 0x00007f57d7b0268c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
2 Thread 0x7f57c4ff1700 (LWP 1753) 0x00007f57d7b0268c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
1 Thread 0x7f57d7f2a700 (LWP 1726) 0x00007f57d7aff2fd in pthread_join () from /lib64/libpthread.so.0
(gdb) Thread 11
[Switching to thread 11 (Thread 0x7f57c97fa700 (LWP 1744))]#0 0x000000000040762e in skynet_handle_register (ctx=0x7f57a802ba30) at skynet-src/skynet_handle.c:59
59 skynet-src/skynet_handle.c: No such file or directory.
in skynet-src/skynet_handle.c
(gdb) bt
#0 0x000000000040762e in skynet_handle_register (ctx=0x7f57a802ba30) at skynet-src/skynet_handle.c:59
#1 0x00000000004097e2 in skynet_context_new (name=0x7f57c97f9320 "snlua", param=0x7f57c97f9326 "serverStartService 7") at skynet-src/skynet_server.c:155
#2 0x0000000000409941 in cmd_launch (context=0x7f57cc014460, param=0x7f57a8041cd0 "snlua serverStartService 7") at skynet-src/skynet_server.c:492
#3 0x00007f57d001916f in lcommand (L=0x7f57cc0c8458) at lualib-src/lua-skynet.c:135
#4 0x0000000000414b3d in luaD_precall (L=0x7f57cc0c8458, func=0x7f57cc0dc430, nresults=1) at ldo.c:503
#5 0x0000000000422b28 in luaV_execute (L=, ci=) at lvm.c:1617
#6 0x00000000004148a3 in unroll (L=0x7f57cc0c8458, ud=) at ldo.c:613
#7 0x0000000000413c8c in luaD_rawrunprotected (L=0x7f57cc0c8458, f=0x414c90 , ud=0x7f57c97f964c) at ldo.c:147
#8 0x0000000000414fe4 in lua_resume (L=0x7f57cc0c8458, from=, nargs=6, nresults=0x7f57c97f96cc) at ldo.c:716
#9 0x00007f57d623e478 in lua_resumeX (L=0x7f57cc005d58, co_index=1, n=6) at service-src/service_snlua.c:90
#10 auxresume (L=0x7f57cc005d58, co_index=1, n=6) at service-src/service_snlua.c:146
#11 timing_resume (L=0x7f57cc005d58, co_index=1, n=6) at service-src/service_snlua.c:198
#12 0x00007f57d623e780 in luaB_coresume (L=0x7f57cc005d58) at service-src/service_snlua.c:217
#13 0x0000000000414b3d in luaD_precall (L=0x7f57cc005d58, func=0x7f57cc0ba1e0, nresults=-1) at ldo.c:503
#14 0x0000000000422c1b in luaV_execute (L=, ci=) at lvm.c:1649
#15 0x0000000000414c55 in ccall (L=0x7f57cc005d58, func=, nResults=, inc=65537) at ldo.c:548
#16 0x0000000000413c8c in luaD_rawrunprotected (L=0x7f57cc005d58, f=0x4113f0 <f_call>, ud=0x7f57c97f9a10) at ldo.c:147
#17 0x0000000000414def in luaD_pcall (L=0x7f57cc005d58, func=, u=, old_top=192, ef=) at ldo.c:784
#18 0x0000000000411309 in lua_pcallk (L=0x7f57cc005d58, nargs=, nresults=-1, errfunc=, ctx=,
k=) at lapi.c:1033
#19 0x000000000042a7b0 in luaB_pcall (L=0x7f57cc005d58) at lbaselib.c:455
#20 0x0000000000414b3d in luaD_precall (L=0x7f57cc005d58, func=0x7f57cc0ba060, nresults=2) at ldo.c:503
#21 0x0000000000422b28 in luaV_execute (L=, ci=) at lvm.c:1617
#22 0x0000000000414c55 in ccall (L=0x7f57cc005d58, func=, nResults=, inc=65537) at ldo.c:548
#23 0x0000000000413c8c in luaD_rawrunprotected (L=0x7f57cc005d58, f=0x4113f0 <f_call>, ud=0x7f57c97f9d30) at ldo.c:147
#24 0x0000000000414def in luaD_pcall (L=0x7f57cc005d58, func=, u=, old_top=48, ef=) at ldo.c:784
#25 0x0000000000411309 in lua_pcallk (L=0x7f57cc005d58, nargs=, nresults=0, errfunc=, ctx=,
k=) at lapi.c:1033
#26 0x00007f57d0018f73 in _cb (context=0x7f57cc014460, ud=0x7f57cc005d58, type=10, session=88, source=8, msg=0x7f57a8041b30, sz=34) at lualib-src/lua-skynet.c:75
#27 0x0000000000408d8d in dispatch_message (ctx=0x7f57cc014460, msg=0x7f57c97f9e10) at skynet-src/skynet_server.c:276
#28 0x000000000040908f in skynet_context_message_dispatch (sm=0xc5b8f0, q=0x7f57cc06a320, weight=1) at skynet-src/skynet_server.c:336
#29 0x0000000000409ebd in thread_worker (p=) at skynet-src/skynet_start.c:163
#30 0x00007f57d7afeaa1 in start_thread () from /lib64/libpthread.so.0
#31 0x00007f57d71bbbcd in clone () from /lib64/libc.so.6

Owner

cloudwu commented Mar 6, 2021

skynet_handle_register() in skynet-src/skynet_handle.c does not look like a function that could get stuck in an infinite loop and never return.

Owner

cloudwu commented Mar 6, 2021

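If your code is up to date, then
skynet-src/skynet_handle.c:147
skynet-src/skynet_handle.c:59

it is abnormal for threads to be at both of these locations at the same time, because line 59 runs while holding the write lock and line 147 runs while holding the read lock.

A write lock should be exclusive.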

Author

HappyMiluo commented Mar 7, 2021

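I checked the code: skynet-src/skynet_handle.c has not been modified for a long time, so my code is indeed up to date. My compiler is gcc 4.4.7.
After calling skynet.abort(), the process also occasionally fails to exit. The stack is as follows: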

(gdb) info thread
2 Thread 0x7f9a719a8700 (LWP 30306) 0x000000000040769a in skynet_handle_grab (handle=1) at skynet-src/skynet_handle.c:147
1 Thread 0x7f9a74c97700 (LWP 30301) 0x00007f9a7486d2fd in pthread_join () from /lib64/libpthread.so.0
(gdb) thread 2
[Switching to thread 2 (Thread 0x7f9a719a8700 (LWP 30306))]#0 0x000000000040769a in skynet_handle_grab (handle=1) at skynet-src/skynet_handle.c:147
147 skynet-src/skynet_handle.c: No such file or directory.
in skynet-src/skynet_handle.c
(gdb) bt
#0 0x000000000040769a in skynet_handle_grab (handle=1) at skynet-src/skynet_handle.c:147
#1 0x0000000000409044 in skynet_context_message_dispatch (sm=0x1cdb7e0, q=0x1cd5140, weight=-1) at skynet-src/skynet_server.c:308
#2 0x0000000000409ebd in thread_worker (p=) at skynet-src/skynet_start.c:163
#3 0x00007f9a7486caa1 in start_thread () from /lib64/libpthread.so.0
#4 0x00007f9a73f29bcd in clone () from /lib64/libc.so.6
(gdb) thread 1
[Switching to thread 1 (Thread 0x7f9a74c97700 (LWP 30301))]#0 0x00007f9a7486d2fd in pthread_join () from /lib64/libpthread.so.0
(gdb) bt
#0 0x00007f9a7486d2fd in pthread_join () from /lib64/libpthread.so.0
#1 0x0000000000409ca3 in start (thread=24) at skynet-src/skynet_start.c:227
#2 0x0000000000409e1e in skynet_start (config=0x7ffefdbe4d70) at skynet-src/skynet_start.c:279
#3 0x0000000000407138 in main (argc=, argv=) at skynet-src/skynet_main.c:166

Owner

cloudwu commented Mar 7, 2021

This appears to be a problem that only occurs in your environment; you can try to solve it yourself.

Author

HappyMiluo commented Mar 7, 2021 via email
