[Segmentation fault] in luaS_new: where might the error be? #866

Closed
Yanjinux opened this issue Jul 30, 2018 · 7 comments

Comments

@Yanjinux

An occasional core dump that I cannot reproduce. The details are as follows:
#0 0x000000000041d3e3 in luaS_new (L=L@entry=0x7fa230a74ba8, str=0x7fa22f9c6725 "on_message_lua") at lstring.c:231
231 lstring.c: No such file or directory.
[Current thread is 1 (Thread 0x7fa2335e4700 (LWP 1563))]
(gdb) where
#0 0x000000000041d3e3 in luaS_new (L=L@entry=0x7fa230a74ba8, str=0x7fa22f9c6725 "on_message_lua") at lstring.c:231
#1 0x00000000004115d2 in auxgetstr (L=0x7fa230a74ba8, t=0x54434e4954534984, k=) at lapi.c:590
#2 0x00007fa22f9c540b in on_message (mosq=0x7fa22f9c6725, userdata=0x7fa22f9c6725, message=0x7fa22e95dd98) at mosquitto_skynet.c:31
#3 0x00007fa22f9c3337 in _mosquitto_handle_publish (mosq=0x7fa230b35000) at read_handle.c:126
#4 0x00007fa22f9c2fd7 in _mosquitto_packet_read (mosq=0x7fa230b35000) at net_mosq.c:1084
#5 0x00007fa22f9c1111 in mosquitto_loop_read (mosq=0x7fa230b35000, max_packets=798779173, max_packets@entry=1) at mosquitto.c:1191
#6 0x00007fa22f9c51ac in lhandle_mqtt (L=) at mosquitto_skynet.c:252
#7 0x0000000000415339 in luaD_precall (L=L@entry=0x7fa230a73da8, func=func@entry=0x7fa22e9ff7c0, nresults=nresults@entry=0) at ldo.c:434
#8 0x0000000000420b8e in luaV_execute (L=L@entry=0x7fa230a73da8) at lvm.c:1146
#9 0x00000000004150c0 in unroll (L=0x7fa230a73da8, ud=) at ldo.c:556
#10 0x0000000000414a6c in luaD_rawrunprotected (L=L@entry=0x7fa230a73da8, f=f@entry=0x415510 , ud=ud@entry=0x7fa2335e204c) at ldo.c:142
#11 0x00000000004156ff in lua_resume (L=L@entry=0x7fa230a73da8, from=from@entry=0x7fa230ada208, nargs=nargs@entry=4) at ldo.c:664
#12 0x0000000000428d97 in auxresume (L=L@entry=0x7fa230ada208, co=co@entry=0x7fa230a73da8, narg=4) at lcorolib.c:39
#13 0x00000000004290c7 in luaB_coresume (L=0x7fa230ada208) at lcorolib.c:60
#14 0x0000000000415339 in luaD_precall (L=L@entry=0x7fa230ada208, func=func@entry=0x7fa22df5d170, nresults=nresults@entry=-1) at ldo.c:434
#15 0x0000000000420b8e in luaV_execute (L=L@entry=0x7fa230ada208) at lvm.c:1146
#16 0x000000000041560f in luaD_call (L=L@entry=0x7fa230ada208, func=, nResults=) at ldo.c:499
#17 0x0000000000415661 in luaD_callnoyield (L=0x7fa230ada208, func=, nResults=) at ldo.c:509
#18 0x0000000000414a6c in luaD_rawrunprotected (L=L@entry=0x7fa230ada208, f=f@entry=0x411740 <f_call>, ud=ud@entry=0x7fa2335e2340) at ldo.c:142
#19 0x000000000041592d in luaD_pcall (L=L@entry=0x7fa230ada208, func=func@entry=0x411740 <f_call>, u=u@entry=0x7fa2335e2340, old_top=176, ef=) at ldo.c:729
#20 0x0000000000412d5c in lua_pcallk (L=L@entry=0x7fa230ada208, nargs=5, nresults=nresults@entry=-1, errfunc=errfunc@entry=0, ctx=ctx@entry=0, k=k@entry=0x427e20 ) at lapi.c:969
#21 0x0000000000427fb0 in luaB_pcall (L=0x7fa230ada208) at lbaselib.c:424
#22 0x0000000000415339 in luaD_precall (L=L@entry=0x7fa230ada208, func=func@entry=0x7fa22df5d090, nresults=nresults@entry=2) at ldo.c:434
#23 0x0000000000420b8e in luaV_execute (L=L@entry=0x7fa230ada208) at lvm.c:1146
#24 0x000000000041560f in luaD_call (L=L@entry=0x7fa230ada208, func=, nResults=) at ldo.c:499
#25 0x0000000000415661 in luaD_callnoyield (L=0x7fa230ada208, func=, nResults=) at ldo.c:509
#26 0x0000000000414a6c in luaD_rawrunprotected (L=L@entry=0x7fa230ada208, f=f@entry=0x411740 <f_call>, ud=ud@entry=0x7fa2335e25f0) at ldo.c:142
#27 0x000000000041592d in luaD_pcall (L=L@entry=0x7fa230ada208, func=func@entry=0x411740 <f_call>, u=u@entry=0x7fa2335e25f0, old_top=48, ef=) at ldo.c:729
#28 0x0000000000412d5c in lua_pcallk (L=L@entry=0x7fa230ada208, nargs=nargs@entry=5, nresults=nresults@entry=0, errfunc=errfunc@entry=1, ctx=ctx@entry=0, k=k@entry=0x0) at lapi.c:969
#29 0x00007fa2307daeb9 in _cb (context=0x7fa230ab9900, ud=0x7fa230ada208, type=6, session=0, source=0, msg=0x7fa22fc0d780, sz=24) at lualib-src/lua-skynet.c:52
#30 0x000000000040aac5 in dispatch_message (ctx=0x7fa230ab9900, msg=0x7fa2335e26c0) at skynet-src/skynet_server.c:274
#31 0x000000000040


While debugging, I found that the address pointed to by l_G in lua_State is invalid.
#0 0x000000000041d3e3 in luaS_new (L=L@entry=0x7fa230a74ba8, str=0x7fa22f9c6725 "on_message_lua") at lstring.c:231
i = 40
j = 0
p = 0x54434e4954534bac
(gdb) p p[0]
Cannot access memory at address 0x54434e4954534bac
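
The value of `p` above, like `t=0x54434e4954534984` in frame #1, does not look like the heap addresses elsewhere in the trace (`0x7fa2...`). One quick check is to read the pointer's bytes as text. The following is a minimal sketch, assuming a little-endian x86-64 target; the helper name `pointer_as_text` is hypothetical and not part of Lua or skynet.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Write the 8 bytes of `value`, least significant byte first (the order in
 * which they sit in memory on a little-endian machine), into `out` as text.
 * Bytes outside the printable ASCII range are shown as '.'. */
static void pointer_as_text(uint64_t value, char out[9]) {
    int i;
    for (i = 0; i < 8; i++) {
        unsigned char b = (unsigned char)((value >> (8 * i)) & 0xff);
        out[i] = (b >= 0x20 && b <= 0x7e) ? (char)b : '.';
    }
    out[8] = '\0';
}
```

Called with `0x54434e4954534984`, this produces `.ISTINCT`: seven of the eight bytes are printable ASCII. That may hint that string data was written over the structure, or that the memory was freed and reused, although the trace alone does not confirm either.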

@cloudwu
Owner

cloudwu commented Jul 30, 2018

Enable MEMORY_CHECK https://github.com/cloudwu/skynet/wiki/MemoryHook and check whether memory is being allocated and freed correctly.

@cloudwu
Owner

cloudwu commented Jul 30, 2018

If you have any Lua C extensions, enable LUA_USE_APICHECK to check them.

https://github.com/cloudwu/skynet/blob/master/3rd/lua/luaconf.h#L702-L709

@srdgame
Contributor

srdgame commented Aug 13, 2018

@Yanjinux Is this a C service built with lib mosq? Is it stable now? Are you considering open-sourcing it?

@cloudwu
Owner

cloudwu commented Aug 13, 2018

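In this call stack, you can see that the L at frame #6, where Lua enters the C library, is not the same as the L at frame #1, where the Lua API is called. This is unusual. As a rule, we should not store any L pointer in a C data structure.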
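
To illustrate the rule, here is a minimal sketch of one way a binding can hand the current state to a library callback for the duration of a single call, instead of caching a state pointer in a long-lived C struct. Everything in it is a stand-in: `fake_state` replaces `lua_State` so the example compiles without Lua headers, and `struct client`, `call_ctx`, `on_message`, and `handle_read` are hypothetical names, not the real mosquitto or skynet API.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for lua_State (hypothetical; avoids depending on Lua headers). */
typedef struct fake_state { int id; } fake_state;

/* Hypothetical client library: one message callback plus a userdata slot. */
typedef void (*message_cb)(void *userdata, const char *topic);

struct client {
    message_cb cb;
    void *userdata;
};

/* Context that exists only while one call from the script side is running. */
struct call_ctx {
    fake_state *L;   /* the state that is executing right now */
    int handled_by;  /* records which state the callback actually used */
};

/* Callback: uses the state supplied by the current caller, never a cached one. */
static void on_message(void *userdata, const char *topic) {
    struct call_ctx *ctx = userdata;
    (void)topic;
    ctx->handled_by = ctx->L->id;
}

/* Entry point called from the script side with the currently running state. */
static int handle_read(struct client *c, fake_state *L) {
    struct call_ctx ctx = { L, -1 };
    assert(c->cb != NULL);
    c->userdata = &ctx;               /* valid only during this call */
    c->cb(c->userdata, "demo/topic"); /* the library would invoke this on a message */
    c->userdata = NULL;               /* do not keep the pointer afterwards */
    return ctx.handled_by;
}
```

Calling `handle_read` from two different states makes the callback see each caller's own state. A binding that saved L once at initialization would instead keep using that first state, which may belong to a coroutine that has since been collected.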

@Yanjinux
Author

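@srdgame Yes, it is a C library wrapped around mosq. It mainly handles event dispatch and implements a request/response pattern. It is stable now. It probably won't be open-sourced, though. Sorry, it belongs to the company.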

@Yanjinux
Author

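@cloudwu I'm a bit puzzled. It reproduced every time under stability testing, and then it went away after I updated skynet and Lua to the latest versions. So I suppose it counts as solved, even though I never found the exact cause.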

@cloudwu
Owner

cloudwu commented Aug 14, 2018

I think you need to check whether the binding library follows the rules. For example, do not store any L in the binding library.

@cloudwu cloudwu closed this as completed Oct 11, 2018