Problem with a package that is required but never used in Skynet. #628

Closed
CandyMi opened this issue Apr 26, 2017 · 5 comments

CandyMi commented Apr 26, 2017

watchdog code
`local skynet = require "skynet"
local socket = require "socket"

skynet.start(function()
	local id = socket.listen("", 8080)
	socket.start(id, function(fd, addr)
		-- spawn one agent service per accepted connection
		local agent = skynet.newservice "agent"
		skynet.send(agent, "lua", "SOCKET", fd, addr)
		-- give up the fd in this service; the agent calls socket.start(fd) itself
		socket.abandon(fd)
	end)
end)`

Server-side agent code
`local skynet = require "skynet"
local socket = require "socket"
local mysql = require "mysql" -- imported only, never used

local CMD = {}

function CMD.SOCKET(fd, addr)
	socket.start(fd)
	while true do
		local data = socket.read(fd)
		if not data then
			-- Reading EOF means the socket has been closed.
			socket.close(fd)
			break
		end
		-- do something
	end
	--skynet.error("[",os.date("%c",os.time()),"],","client ",addr,"close this session.")
	skynet.exit()
end

skynet.start(function()
	skynet.dispatch("lua", function(_, _, cmd, ...)
		local func = assert(CMD[cmd])
		if type(func) == "function" then
			func(...)
		end
	end)
end)`

Client code (Python)

    from gevent import socket

    def start():
        while 1:
            try:
                s = socket.socket()
                s.connect(("192.168.40.2", 8080))
            except:
                pass
            finally:
                s.close  # as posted: no parentheses, so close() is never actually called

    start()
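
For comparison, here is a minimal sketch of the same connect-and-close loop using only Python's standard `socket` module, so that every connection is closed explicitly. The function name `hammer` and the `attempts` and timeout parameters are illustrative assumptions, not part of the original test.

```python
import socket


def hammer(host, port, attempts):
    """Open and immediately close `attempts` TCP connections.

    Returns how many of them connected successfully.
    """
    connected = 0
    for _ in range(attempts):
        try:
            # On success the `with` block closes the socket on exit;
            # on failure create_connection() closes it before raising.
            with socket.create_connection((host, port), timeout=1.0):
                connected += 1
        except OSError:
            # Refused or timed-out connections are expected under load.
            pass
    return connected
```

Calling `hammer("192.168.40.2", 8080, 1000)` would approximate one bounded round of the loop in the client above.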

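Test OS: FreeBSD 10.3 (VMware)
CPU: 4 cores / 4 threads
Memory: 2 GB

After stress testing for a while, skynet goes down for no apparent reason!
My skills are limited, and I could not locate the problem by debugging skynet with gdb.

However, I found that the problem occurs whenever the agent service that gets started requires a package and never uses it.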

CandyMi changed the title from "local" to "Problem with a package that is required but never used in Skynet." Apr 26, 2017

cloudwu (Owner) commented Apr 26, 2017

I cannot reproduce this. I suggest you look for the problem carefully in the environment where it can be reproduced.

CandyMi (Author) commented Apr 26, 2017

I found many entries like the following in the system's /var/log/debug log file:

Apr 26 15:10:32 localhost kernel: sonewconn: pcb 0xfffff80011202188: Listen queue overflow: 193 already in queue awaiting acceptance (5227 occurrences)

Could this be related?

CandyMi (Author) commented Apr 26, 2017

This is the content of the /var/log/message log:
Apr 26 15:25:23 localhost kernel: Limiting closed port RST response from 390 to 200 packets/sec
Apr 26 15:25:28 localhost kernel: Limiting closed port RST response from 330 to 200 packets/sec
Apr 26 15:25:29 localhost kernel: Limiting closed port RST response from 310 to 200 packets/sec
Apr 26 15:25:34 localhost kernel: Limiting closed port RST response from 413 to 200 packets/sec
Apr 26 15:25:35 localhost kernel: Limiting closed port RST response from 437 to 200 packets/sec
Apr 26 15:25:41 localhost kernel: Limiting closed port RST response from 416 to 200 packets/sec
Apr 26 15:25:44 localhost kernel: Limiting closed port RST response from 384 to 200 packets/sec
Apr 26 15:25:48 localhost kernel: sonewconn: pcb 0xfffff8001c18bab8: Listen queue overflow: 193 already in queue awaiting acceptance (34362 occurrences)
Apr 26 15:25:50 localhost sshd[1030]: error: Received disconnect from 192.168.40.13 port 64011:0:
Apr 26 15:27:57 localhost kernel: sonewconn: pcb 0xfffff8001c18bab8: Listen queue overflow: 193 already in queue awaiting acceptance (868 occurrences)
Apr 26 15:30:03 localhost kernel: sonewconn: pcb 0xfffff8001c18bab8: Listen queue overflow: 193 already in queue awaiting acceptance (2992 occurrences)
Apr 26 15:31:50 localhost kernel: sonewconn: pcb 0xfffff8001c18bab8: Listen queue overflow: 193 already in queue awaiting acceptance (807 occurrences)
Apr 26 15:32:29 localhost kernel: pid 1609 (skynet), uid 0: exited on signal 6 (core dumped)
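
To get a rough sense of how often the listen queue overflowed before the crash, the counts in the kernel messages above can be tallied with a short script. This is only a sketch: the regular expression and the name `tally_overflows` are assumptions made for illustration, and the sample lines are copied from the log excerpt.

```python
import re

# Matches the FreeBSD kernel message quoted in the log excerpt.
OVERFLOW_RE = re.compile(
    r"Listen queue overflow: (\d+) already in queue awaiting acceptance "
    r"\((\d+) occurrences\)"
)


def tally_overflows(lines):
    """Return (largest queue length seen, total occurrences) over matching lines."""
    max_queued = 0
    total = 0
    for line in lines:
        match = OVERFLOW_RE.search(line)
        if match:
            max_queued = max(max_queued, int(match.group(1)))
            total += int(match.group(2))
    return max_queued, total


SAMPLE = [
    "Apr 26 15:25:48 localhost kernel: sonewconn: pcb 0xfffff8001c18bab8: Listen queue overflow: 193 already in queue awaiting acceptance (34362 occurrences)",
    "Apr 26 15:27:57 localhost kernel: sonewconn: pcb 0xfffff8001c18bab8: Listen queue overflow: 193 already in queue awaiting acceptance (868 occurrences)",
    "Apr 26 15:30:03 localhost kernel: sonewconn: pcb 0xfffff8001c18bab8: Listen queue overflow: 193 already in queue awaiting acceptance (2992 occurrences)",
    "Apr 26 15:31:50 localhost kernel: sonewconn: pcb 0xfffff8001c18bab8: Listen queue overflow: 193 already in queue awaiting acceptance (807 occurrences)",
]
```

For the four overflow lines in this excerpt, `tally_overflows(SAMPLE)` gives a queue length of 193 and 39029 reported occurrences in total.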

This core dump happened suddenly during a stress test right after a fresh start. Opening the core file in gdb shows the following:

root@localhost:~/skynet # gdb skynet skynet.core
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...
Core was generated by `skynet'.
Program terminated with signal 6, Aborted.
Reading symbols from /lib/libthr.so.3...done.
Loaded symbols for /lib/libthr.so.3
Reading symbols from /lib/libm.so.5...done.
Loaded symbols for /lib/libm.so.5
Reading symbols from /usr/lib/librt.so.1...done.
Loaded symbols for /usr/lib/librt.so.1
Reading symbols from /lib/libc.so.7...done.
Loaded symbols for /lib/libc.so.7
Reading symbols from ./cservice/logger.so...done.
Loaded symbols for ./cservice/logger.so
Reading symbols from ./cservice/snlua.so...done.
Loaded symbols for ./cservice/snlua.so
Reading symbols from ./luaclib/skynet.so...done.
Loaded symbols for ./luaclib/skynet.so
Reading symbols from ./luaclib/profile.so...done.
Loaded symbols for ./luaclib/profile.so
Reading symbols from ./luaclib/memory.so...done.
Loaded symbols for ./luaclib/memory.so
Reading symbols from ./cservice/harbor.so...done.
Loaded symbols for ./cservice/harbor.so
Reading symbols from ./luaclib/socketdriver.so...done.
Loaded symbols for ./luaclib/socketdriver.so
Reading symbols from ./luaclib/mysqlaux.so...done.
Loaded symbols for ./luaclib/mysqlaux.so
Reading symbols from /libexec/ld-elf.so.1...done.
Loaded symbols for /libexec/ld-elf.so.1
#0 0x0000000800658d6a in _rtld_atfork_post () from /libexec/ld-elf.so.1
[New Thread 801808800 (LWP 100164/)]
[New Thread 801808400 (LWP 100163/)]
[New Thread 801808000 (LWP 100162/)]
[New Thread 801807c00 (LWP 100161/)]
[New Thread 801807800 (LWP 100160/)]
[New Thread 801807400 (LWP 100159/)]
[New Thread 801807000 (LWP 100158/)]
[New Thread 801806400 (LWP 100112/)]
(gdb) bt
#0 0x0000000800658d6a in _rtld_atfork_post () from /libexec/ld-elf.so.1
#1 0x0000000800654759 in _rtld_atfork_post () from /libexec/ld-elf.so.1
#2 0x0000000800648cc4 in dlclose () from /libexec/ld-elf.so.1
#3 0x000000080064860b in dlclose () from /libexec/ld-elf.so.1
#4 0x00000000004359d5 in gctm ()
#5 0x0000000000414095 in luaD_precall ()
#6 0x00000000004143e6 in luaD_callnoyield ()
#7 0x0000000000413726 in luaD_rawrunprotected ()
#8 0x0000000000414962 in luaD_pcall ()
#9 0x00000000004168d3 in GCTM ()
#10 0x0000000000415b5a in luaC_freeallobjects ()
#11 0x000000000041c192 in close_state ()
#12 0x0000000802602651 in snlua_release (l=0x80490daf0) at service-src/service_snlua.c:192
#13 0x00000000004092f5 in skynet_context_message_dispatch (sm=, q=,
weight=) at skynet-src/skynet_server.c:213
#14 0x000000000040a82e in thread_worker (p=) at skynet-src/skynet_start.c:162
#15 0x000000080086a855 in pthread_create () from /lib/libthr.so.3
#16 0x0000000000000000 in ?? ()

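When I filed an issue last time, it was suggested that this might be a double free, but now the problem shows up even without pulling in any other third-party packages?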

cloudwu (Owner) commented Apr 26, 2017

A double free is easy to track down, so find it first. Don't move on to the next problem before the current one is solved.

CandyMi (Author) commented Apr 27, 2017

@cloudwu I found the cause: it was a system tuning problem! A system limit caused the process to be killed.
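
The thread does not say which system limit was involved. As a hedged sketch of how per-process limits can be inspected on a Unix-like system, Python's standard `resource` module reports the soft and hard values; the four limits listed here and the helper names `describe` and `current_limits` are chosen purely for illustration.

```python
import resource

# Limits picked for illustration only; the thread does not name the one that was hit.
LIMITS = {
    "open files (RLIMIT_NOFILE)": resource.RLIMIT_NOFILE,
    "core file size (RLIMIT_CORE)": resource.RLIMIT_CORE,
    "stack size (RLIMIT_STACK)": resource.RLIMIT_STACK,
    "data segment (RLIMIT_DATA)": resource.RLIMIT_DATA,
}


def describe(value):
    """Render a raw limit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return {label: (soft, hard)} for the limits above as readable strings."""
    result = {}
    for label, res in LIMITS.items():
        soft, hard = resource.getrlimit(res)
        result[label] = (describe(soft), describe(hard))
    return result
```

Printing `current_limits()` from the same shell environment that launches the server shows which soft limits that environment would hand to the process.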

cloudwu closed this as completed Oct 11, 2018