Client with conn=4 uses a large amount of memory after running for one day #42

Closed
Cye3s opened this issue Jun 22, 2016 · 34 comments

Comments

@Cye3s

Cye3s commented Jun 22, 2016

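Client version: v20160620 amd64 (versions 15 and 16 have the same problem)
OS version: OpenWrt x64

If I don't specify the conn parameter and use the default value, memory usage stays normal even after running for 4 days.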
[screenshot: qq 20160622083541]

@xtaci
Owner

xtaci commented Jun 22, 2016

Did you compile it yourself?

@Cye3s
Author

Cye3s commented Jun 22, 2016

@xtaci
Owner

xtaci commented Jun 22, 2016

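Hmm, this one is probably tricky. Go is a garbage-collected language: even if the program isn't actually using that much memory, the runtime doesn't necessarily hand it back to the system, so RES stays high. I'll check whether there is a goroutine leak.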
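
One rough way to look for a goroutine leak is to compare `runtime.NumGoroutine()` before and after a piece of work. The sketch below is only an illustration of that idea: `leakyWorker` and `goroutineGrowth` are hypothetical names made up for this example, not kcptun code.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// leakyWorker is a hypothetical stand-in for a handler that never exits,
// because nobody ever sends on or closes its channel.
func leakyWorker(ch <-chan struct{}) {
	<-ch
}

// goroutineGrowth reports how many more goroutines exist after f has run
// than before it ran.
func goroutineGrowth(f func()) int {
	before := runtime.NumGoroutine()
	f()
	// Give goroutines that are going to finish a moment to do so.
	time.Sleep(10 * time.Millisecond)
	return runtime.NumGoroutine() - before
}

func main() {
	growth := goroutineGrowth(func() {
		for i := 0; i < 4; i++ {
			go leakyWorker(make(chan struct{}))
		}
	})
	fmt.Println("goroutines still alive after the call:", growth)
}
```

A count that keeps climbing across repeated calls would point at a leak; a stable count would suggest the memory growth comes from somewhere else.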

@xtaci
Owner

xtaci commented Jun 22, 2016

This problem is most likely caused by memory fragmentation.

http://stackoverflow.com/questions/24863164/how-to-analyse-golang-memory

@xtaci
Owner

xtaci commented Jun 22, 2016

@Cye3s Also, this is my client after running for one day with -conn 4:
16371 ubuntu 20 0 168M 37464 2784 S 0.7 0.9 26:35.80 ./client_linux_amd64 -key xxxxx -rcvwnd 2048 -sndwnd 256 -r xxxxx

Memory is 37M. Could this be related to sysctl parameters, such as overcommit?

@Cye3s
Author

Cye3s commented Jun 22, 2016

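The strange thing is that without the conn parameter, memory usage is fairly normal. Or does conn=4 multiply the rate at which memory fragmentation builds up?
I'll leave the conn parameter off for now.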

@xtaci
Owner

xtaci commented Jun 22, 2016

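Right, more goroutines will inevitably speed up fragmentation. Also, the release was compiled with Go 1.7beta2; you could try compiling with 1.6 stable to see whether the GC behaves differently.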

@Cye3s
Author

Cye3s commented Jun 22, 2016

I'll try it when I have time; I've never worked with Go.
If you want to test on OpenWrt, you can run a virtual machine:
https://wiki.openwrt.org/doc/howto/vmware
https://downloads.openwrt.org/chaos_calmer/15.05.1/x86/64/

@xtaci
Owner

xtaci commented Jun 22, 2016

OK, I really don't have such an environment at hand.

@jannson
Contributor

jannson commented Jun 22, 2016

@xtaci If you need one, I can mail you a 1900AC v2 that supports OpenWrt. I don't normally use it; it's only for testing.

@xtaci
Owner

xtaci commented Jun 22, 2016

@jannson Thanks, but there's no need; a virtual machine will do.

@wxyzh

wxyzh commented Jun 22, 2016

Will test results on x86 OpenWrt and ARM OpenWrt be consistent?

@Cye3s
Author

Cye3s commented Jun 22, 2016

@jannson Your avatar looked familiar. After thinking about it, I realized you're Xiaobao (小宝) from the ks forum, hahaha.

@wxyzh

wxyzh commented Jun 22, 2016

@Cye3s Yes. Unfortunately, Merlin on Netgear isn't very stable.

@jannson
Contributor

jannson commented Jun 22, 2016

@wxyzh What is unstable?

@wxyzh

wxyzh commented Jun 22, 2016

@jannson I used an old version on the r6300v2 before, and things like policy routing wouldn't run. Looks like we've gone off topic...

@xtaci
Owner

xtaci commented Jun 23, 2016

@Cye3s Try the 0623 version. Memory usage is slightly lower, and the improvement may be greater over long-term use.

xtaci closed this as completed Jun 24, 2016
@Cye3s
Author

Cye3s commented Jun 24, 2016

I tried the new version with conn=2. After just 2 hours it was using 440M, so I'll go back to conn=1.
[screenshot: kcp]

@xtaci
Owner

xtaci commented Jun 24, 2016

@Cye3s Could you post your sysctl.conf?

@Cye3s
Author

Cye3s commented Jun 24, 2016

root@OpenWrt:/etc# cat sysctl.conf
kernel.panic=3
net.ipv4.conf.default.arp_ignore=1
net.ipv4.conf.all.arp_ignore=1
net.ipv4.ip_forward=1
net.ipv4.icmp_echo_ignore_broadcasts=1
net.ipv4.icmp_ignore_bogus_error_responses=1
net.ipv4.igmp_max_memberships=100
net.ipv4.tcp_ecn=0
net.ipv4.tcp_fin_timeout=30
net.ipv4.tcp_keepalive_time=120
net.ipv4.tcp_syncookies=1
net.ipv4.tcp_timestamps=1
net.ipv4.tcp_sack=1
net.ipv4.tcp_dsack=1

net.ipv6.conf.default.forwarding=1
net.ipv6.conf.all.forwarding=1

net.netfilter.nf_conntrack_acct=1
net.netfilter.nf_conntrack_checksum=0
net.netfilter.nf_conntrack_max=16384
net.netfilter.nf_conntrack_tcp_timeout_established=7440
net.netfilter.nf_conntrack_udp_timeout=60
net.netfilter.nf_conntrack_udp_timeout_stream=180

net.bridge.bridge-nf-call-arptables=0
net.bridge.bridge-nf-call-ip6tables=0
net.bridge.bridge-nf-call-iptables=0

@Cye3s
Author

Cye3s commented Jun 24, 2016

Also, my software router has four Intel 82583V NICs, so I enabled RPS with a script. I don't know whether that has any effect.

#!/bin/sh
# Enable RPS (Receive Packet Steering)
rfc=4096
proc_num=$(grep -c processor /proc/cpuinfo)
if [ "$proc_num" -eq 1 ]
then
  echo "RPS/RFS disabled because of single cpu."
  exit 0
elif [ "$proc_num" -le 3 ]
then
  ## 11(bit) = 3(0x3)
  rc=3
else
  ## 1111(bit) = f(0xf)
  rc="f"
fi
# Global flow table size: per-queue flow count times the number of CPUs
rsfe=$((proc_num * rfc))
sysctl -w net.core.rps_sock_flow_entries="$rsfe"
# Iterate over the glob directly instead of parsing ls output
for fileRps in /sys/class/net/eth*/queues/rx-*/rps_cpus
do
  echo "$rc" > "$fileRps"
done
for fileRfc in /sys/class/net/eth*/queues/rx-*/rps_flow_cnt
do
  echo "$rfc" > "$fileRfc"
done
#tail /sys/class/net/eth*/queues/rx-*/rps_cpus
#tail /sys/class/net/eth*/queues/rx-*/rps_flow_cnt

@xtaci
Owner

xtaci commented Jun 24, 2016

OK, I've enabled Go profiling in my environment. I expect to find the problem soon.

@xtaci
Owner

xtaci commented Jun 24, 2016

# runtime.MemStats
# Alloc = 13962616
# TotalAlloc = 1995719528
# Sys = 83282168
# Lookups = 3537
# Mallocs = 2100757
# Frees = 2077134
# HeapAlloc = 13962616
# HeapSys = 74711040
# HeapIdle = 59260928
# HeapInuse = 15450112
# HeapReleased = 0
# HeapObjects = 23623
# Stack = 2097152 / 2097152
# MSpan = 222720 / 1130496
# MCache = 4800 / 16384
# BuckHashSys = 1478123
# NextGC = 19569903

It is indeed Go not returning memory to the system: HeapSys is large, HeapInuse is small, and there is a lot of HeapIdle.

@Cye3s Please download the 0623 version again. I switched the allocation to sync.Pool, which is less aggressive than the previous allocation.
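
The actual change is not shown in the thread. As a minimal sketch of the general `sync.Pool` pattern, assuming fixed-size byte buffers are what gets reused, the idea looks roughly like this. `bufSize`, `bufPool`, and `process` are hypothetical names for illustration only.

```go
package main

import (
	"fmt"
	"sync"
)

// bufSize is a hypothetical buffer size chosen for this example.
const bufSize = 4096

// bufPool hands out reusable buffers so that each packet does not need a
// fresh allocation. Pointers are stored to avoid an extra allocation when
// the slice header is boxed into an interface.
var bufPool = sync.Pool{
	New: func() interface{} {
		b := make([]byte, bufSize)
		return &b
	},
}

// process copies payload into a pooled buffer, returns the number of bytes
// copied, and puts the buffer back into the pool when done.
func process(payload []byte) int {
	bp := bufPool.Get().(*[]byte)
	defer bufPool.Put(bp)
	return copy(*bp, payload)
}

func main() {
	n := process([]byte("hello"))
	fmt.Println("bytes handled with a pooled buffer:", n)
}
```

Reusing buffers lowers the allocation rate, which in turn gives the heap fewer reasons to grow; it does not by itself make the runtime return memory to the OS.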

@Cye3s
Author

Cye3s commented Jun 24, 2016

A question: with conn=2, should these two parameters be halved?
-sndwnd 256 -rcvwnd 2048

@xtaci
Owner

xtaci commented Jun 24, 2016

Under full load they definitely need to be halved; otherwise packets will certainly be dropped, which makes things worse.
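
The arithmetic being discussed can be sketched as follows: split a total window budget evenly across the connections. `perConnWindow` is a hypothetical helper written for this example; kcptun itself is not stated to do this automatically.

```go
package main

import "fmt"

// perConnWindow divides a total window budget evenly across conn
// connections. It never returns less than 1 and treats conn < 1 as 1.
func perConnWindow(total, conn int) int {
	if conn < 1 {
		conn = 1
	}
	w := total / conn
	if w < 1 {
		w = 1
	}
	return w
}

func main() {
	// Totals taken from the flags quoted in the question.
	const sndwnd, rcvwnd = 256, 2048
	for _, conn := range []int{1, 2, 4} {
		fmt.Printf("conn=%d -> -sndwnd %d -rcvwnd %d\n",
			conn, perConnWindow(sndwnd, conn), perConnWindow(rcvwnd, conn))
	}
}
```

For conn=2 this gives -sndwnd 128 -rcvwnd 1024, which is the halving described in the reply.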

@Cye3s
Author

Cye3s commented Jun 24, 2016

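Got it. It's a bit like multi-threaded downloading: with 100 Mbps of bandwidth, you either run 1 thread at 100 Mbps or 2 threads at 50 Mbps each.
Could you explain this on the home page, or have these two parameter values automatically divided by the conn count for each physical connection?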

@xtaci
Owner

xtaci commented Jun 24, 2016

@Cye3s Right. Also, you need to download 0623 once more and check the memory problem. I just updated it.

@Cye3s
Author

Cye3s commented Jun 28, 2016

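Version 0627:
With conn=2, it uses over 300M after running for 15 hours. That's an improvement over the previous version, but I expect it will still fill up memory if it runs long enough, so it needs periodic restarts.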
[screenshot: 11]

@xtaci
Owner

xtaci commented Jun 28, 2016

@Cye3s
My result after running for 25 hours: 85M.
[screenshot: 7e9b6e55-3d49-45b0-8834-81e02349e44a]

I still suspect some of your sysctl parameters. Overcommit memory, for example, could cause allocation to get out of control.

@Cye3s
Author

Cye3s commented Jun 28, 2016

I checked with sysctl -a:
vm.overcommit_kbytes = 0
vm.overcommit_memory = 0
vm.overcommit_ratio = 50

@xtaci
Owner

xtaci commented Jun 28, 2016

@Cye3s I really don't know the cause anymore. On my side memory usage dropped a great deal (4G / Ubuntu 14.04).

@hangaj

hangaj commented Aug 25, 2016

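@Cye3s Could you tell me how you set kcptun to start at boot on OpenWrt x64?
Mine is OpenWrt x86. I wrote a run script:
#!/bin/sh
usr/sbin/kcptunclient386 -r "server-address:port" -l ":12000" -mode fast2 -mtu 1400 -sndwnd 256 -rcvwnd 1146 2>/dev/null &
Then I added this script to rc.local, but it doesn't start at boot. If I log in with PuTTY and run the script directly, it works. I don't know why; any advice is appreciated.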

@Cye3s
Author

Cye3s commented Aug 26, 2016

@hangaj
I added one command line to rc.local:

/root/kcptun_client -l "0.0.0.0:1585" -r "vps:1755" -mtu 1480 -sndwnd 128 -rcvwnd 1024 -key "xxxxxxxx" -conn 2 -dscp 46 >/dev/null 2>&1 &

If you copied your script correctly, isn't there a / missing before usr?

@hangaj

hangaj commented Aug 26, 2016

@Cye3s Thanks for taking the time to answer. It suddenly started working today... so strange...


5 participants