Core dump when calling skynet.abort() #588

Closed
ghost319 opened this issue Mar 14, 2017 · 7 comments

Comments

@ghost319

#0 _dl_close_worker (map=, force=) at dl-close.c:160
#1 0x00002aedd668c40e in _dl_close (_map=0x2aedd79ad400) at dl-close.c:840
#2 0x00002aedd66864f4 in _dl_catch_error (objname=0x2aedd7932a90, errstring=0x2aedd7932a98, mallocedp=0x2aedd7932a88,
operate=0x2aedd6dbefb0 <dlclose_doit>, args=0x2aedd79ad400) at dl-error.c:187
#3 0x00002aedd6dbf521 in _dlerror_run (operate=operate@entry=0x2aedd6dbefb0 <dlclose_doit>, args=0x2aedd79ad400) at dlerror.c:163
#4 0x00002aedd6dbefdf in __dlclose (handle=) at dlclose.c:46
#5 0x00000000004301b3 in gctm ()
#6 0x0000000000412800 in luaD_precall ()
#7 0x0000000000412ab3 in luaD_call ()
#8 0x0000000000412b11 in luaD_callnoyield ()
#9 0x0000000000411f6c in luaD_rawrunprotected ()
#10 0x0000000000412e1d in luaD_pcall ()
#11 0x000000000041404a in GCTM ()
#12 0x000000000041575a in luaC_freeallobjects ()
#13 0x0000000000419c5e in close_state ()
#14 0x00002aedd820379c in snlua_release (l=0x2aedda976f70) at service-src/service_snlua.c:192
#15 0x00000000004090ea in delete_context (ctx=0x2aedda997880) at skynet-src/skynet_server.c:213
#16 skynet_context_release (ctx=ctx@entry=0x2aedda997880) at skynet-src/skynet_server.c:223
#17 0x0000000000407b40 in skynet_handle_retire (handle=21) at skynet-src/skynet_handle.c:102
#18 0x0000000000407bc5 in skynet_handle_retireall () at skynet-src/skynet_handle.c:122
#19 0x0000000000408c7b in cmd_abort (context=, param=) at skynet-src/skynet_server.c:531

I have been chasing this for two days and still have no leads. cloudwu, can you suggest a good way to track it down?

@cloudwu
Owner

cloudwu commented Mar 14, 2017

Since it crashes inside dlclose, check which specific .so is being closed when it fails. It is usually a C module loaded from Lua code.
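
One way to see which shared object a given handle refers to is to ask the dynamic loader directly. The following is a minimal sketch, assuming Linux with glibc, where `dlinfo()` with `RTLD_DI_LINKMAP` is available. The names `handle_path` and `logged_dlclose` are hypothetical and are not part of Lua or skynet; the idea is only that a `dlclose()` call site could be swapped for a logging variant while debugging.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for dlinfo() and RTLD_DI_LINKMAP on glibc */
#endif
#include <assert.h>
#include <dlfcn.h>
#include <link.h>
#include <stdio.h>

/* Hypothetical helper: return the path the dynamic loader recorded for
 * `handle`, or NULL if it cannot be determined. For the main program the
 * loader reports an empty string. */
const char *handle_path(void *handle) {
    struct link_map *map = NULL;
    if (handle == NULL || dlinfo(handle, RTLD_DI_LINKMAP, &map) != 0 || map == NULL)
        return NULL;
    return map->l_name;
}

/* Hypothetical replacement for a dlclose() call site: log the object
 * that is about to be unloaded, then close it as usual. */
int logged_dlclose(void *handle) {
    assert(handle != NULL); /* closing a NULL handle is a caller bug */
    const char *path = handle_path(handle);
    fprintf(stderr, "dlclose(%p): %s\n", handle, path ? path : "(unknown)");
    return dlclose(handle);
}
```

The last path logged before the crash would then point at the module to inspect. On older glibc versions the program may need to be linked with `-ldl`.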

@ghost319
Author

It does not always crash in dlclose:

#0 0x0000000000414e21 in sweeplist ()
#1 0x0000000000415791 in luaC_freeallobjects ()
#2 0x0000000000419c5e in close_state ()
#3 0x00002afd83c0379c in snlua_release (l=0x2afd8841d930) at service-src/service_snlua.c:192
#4 0x00000000004090ea in delete_context (ctx=0x2afd8841bd00) at skynet-src/skynet_server.c:213
#5 skynet_context_release (ctx=ctx@entry=0x2afd8841bd00) at skynet-src/skynet_server.c:223
#6 0x0000000000407b40 in skynet_handle_retire (handle=41) at skynet-src/skynet_handle.c:102
#7 0x0000000000407bc5 in skynet_handle_retireall () at skynet-src/skynet_handle.c:122
#8 0x0000000000408c7b in cmd_abort (context=, param=) at skynet-src/skynet_server.c:531
#9 0x00002afd85412190 in lcommand (L=0x2afd872f9808) at lualib-src/lua-skynet.c:113

@ghost319
Author

I checked the two services. They share only two C libraries: a MySQL library and one I wrote myself, and the one I wrote never calls malloc or free.

@cloudwu
Owner

cloudwu commented Mar 14, 2017

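Check whether there is a double free somewhere. https://github.com/cloudwu/skynet/wiki/MemoryHook

Add some detection code of your own, or switch to a memory allocator that can detect double frees. The one that ships with glibc will do.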

@ghost319
Author

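Thanks, cloudwu, I found the problem. It was a double free.
For a production project that uses jemalloc, what is a good way to track down this kind of double free? With jemalloc the crash location is no longer accurate, whereas with glibc I could pinpoint where the double free happened.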

@cloudwu
Owner

cloudwu commented Mar 15, 2017

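Add a thin wrapper in memory_hook.c: allocate a few extra bytes wherever je_malloc is called, and wherever je_free is called write a marker into those extra bytes, so that the next je_free on the same block can detect it.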
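
The wrapper idea above can be sketched roughly as follows. This is an illustration under stated assumptions, not skynet's actual hook: the standard `malloc`/`free` stand in for `je_malloc`/`je_free`, and every `guard_*` name and both tag values are hypothetical. The check reads the header of a block that may already have been released, so it is a debugging heuristic that relies on the allocator not overwriting that memory. The optional quarantine mode (also an assumption) never returns memory to the allocator, which makes detection deterministic at the cost of leaking.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical marker values; any two distinct, unlikely patterns work. */
#define TAG_LIVE UINT64_C(0x600DB10C600DB10C)
#define TAG_DEAD UINT64_C(0xDEADB10CDEADB10C)

/* Extra bytes placed in front of every allocation. */
typedef struct {
    uint64_t tag;  /* TAG_LIVE while allocated, TAG_DEAD once freed */
    uint64_t size; /* requested size, handy when inspecting a core */
} guard_header;

static_assert(sizeof(guard_header) == 16, "header must keep 16-byte alignment");

static int quarantine = 0; /* when set, freed blocks are never released */

void guard_set_quarantine(int on) { quarantine = on; }

void *guard_malloc(size_t size) {
    if (size > SIZE_MAX - sizeof(guard_header))
        return NULL;
    guard_header *h = malloc(sizeof(guard_header) + size); /* je_malloc in skynet */
    if (h == NULL)
        return NULL;
    h->tag = TAG_LIVE;
    h->size = size;
    return h + 1; /* the caller sees the memory just past the header */
}

/* Returns 0 on success, -1 if the block was already freed or did not
 * come from guard_malloc. A real hook would more likely abort() here so
 * the core dump points at the offending call. */
int guard_free(void *ptr) {
    if (ptr == NULL)
        return 0;
    guard_header *h = (guard_header *)ptr - 1;
    if (h->tag != TAG_LIVE) {
        fprintf(stderr, "guard_free: %s of %p\n",
                h->tag == TAG_DEAD ? "double free" : "invalid free", ptr);
        return -1;
    }
    h->tag = TAG_DEAD; /* mark first so a second free can be recognized */
    if (!quarantine)
        free(h); /* je_free in skynet */
    return 0;
}
```

Placing the marker in a header keeps the check cheap, but some allocators reuse the first bytes of a freed block for their own bookkeeping, in which case a second free may be reported as an invalid free or missed entirely unless quarantine mode is on.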

@ghost319
Author

Got it, thanks, cloudwu.
