server memory highly load due to too many connections? #9812

Closed
chgitcrazy opened this issue Jun 6, 2018 · 1 comment

chgitcrazy commented Jun 6, 2018

Our Environment and Config:
os: centos 6
etcd Version: 3.2.11
Git SHA: 1e1dbb2
Go Version: go1.8.5
Go OS/Arch: linux/amd64
Etcd config:
name: 'etcdserver-test02'
data-dir: /usr/etcd/data
wal-dir: /usr/etcd/data/wal
snapshot-count: 5000
heartbeat-interval: 1000
election-timeout: 5000
quota-backend-bytes: 0
listen-peer-urls: https://192.168.1.100:2380
listen-client-urls: https://192.168.1.100:2379
max-snapshots: 5
max-wals: 5
cors:
initial-advertise-peer-urls: https://192.168.1.100:2380
advertise-client-urls: https://192.168.1.100:2379
discovery:
discovery-fallback: 'proxy'
discovery-proxy:
discovery-srv:
initial-cluster: 'etcdserver-test01=https://192.168.1.101:2380,etcdserver-test02=https://192.168.1.100:2380,etcdserver-test03=https://192.168.1.102:2380,etcdserver-test04=https://192.168.1.103:2380,etcdserver-test05=https://192.168.1.104:2380,etcdserver-test06=https://192.168.1.105:2380,etcdserver-test07=https://192.168.1.106:2380'
initial-cluster-token: 'etcd-cluster-kfefeiifeNHHEfeifek'
initial-cluster-state: 'existing'
strict-reconfig-check: false
enable-v2: false
enable-pprof: true
proxy: 'off'
proxy-failure-wait: 5000
proxy-refresh-interval: 30000
proxy-dial-timeout: 1000
proxy-write-timeout: 5000
proxy-read-timeout: 0
client-transport-security:
  ca-file: '/usr/etcd/config/ssl/ca.pem'
  cert-file: '/usr/etcd/config/ssl/server.pem'
  key-file: '/usr/etcd/config/ssl/server-key.pem'
  client-cert-auth: true
  trusted-ca-file: '/usr/etcd/config/ssl/ca.pem'
  auto-tls: true
peer-transport-security:
  ca-file: '/usr/etcd/config/ssl/ca.pem'
  cert-file: '/usr/etcd/config/ssl/member2.pem'
  key-file: '/usr/etcd/config/ssl/member2-key.pem'
  peer-client-cert-auth: true
  trusted-ca-file: '/usr/etcd/config/ssl/ca.pem'
  auto-tls: true
debug: true
log-package-levels:
log-output: default
force-new-cluster: false

This happened when more than 10,000 clients were connected to the etcd cluster (7 nodes). After a period of time, some clients failed to connect with the error "context deadline exceeded". The monitoring system then showed that memory usage on two of the hosts had exceeded 30G (each of those two nodes has 32G of memory in total), while Grafana showed the etcd process on those hosts using no more than 5G. The hosts were not responding.
Screenshots (images not preserved): test1111, test2222, test3333

In addition, the etcd log contains many entries like this one:
etcdmain: rejected connection from "192.168.1.104:32427" (error "read tcp 192.168.1.102:2380->192.168.1.104:32427: i/o timeout", ServerName "")
Why does this happen?
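To see which peers account for most of these rejections, the log entries can be tallied per remote host. The following is a minimal sketch rather than an etcd tool: the function name `count_rejections` and the regular expression are illustrative assumptions, and the pattern assumes the log uses plain ASCII double quotes around the remote address, as in the entry quoted above.

```python
import re
from collections import Counter

# Matches the remote "host:port" in a "rejected connection from" log entry.
# Assumption: the address is wrapped in plain ASCII double quotes.
REJECTED = re.compile(r'rejected connection from "(?P<host>[^":]+):(?P<port>\d+)"')


def count_rejections(log_lines):
    """Return a Counter mapping remote host -> number of rejected connections."""
    counts = Counter()
    for line in log_lines:
        match = REJECTED.search(line)
        if match:
            counts[match.group("host")] += 1
    return counts
```

Passing the lines of the etcd log file to `count_rejections` and printing `most_common()` on the result lists the peers with the most rejected connections first.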

Within the cluster, each member also holds many repeated connections to other members on port 2380, which is strange. For example:
ESTAB 0 0 192.168.1.102:2380 192.168.1.104:14203
ESTAB 0 0 192.168.1.102:2380 192.168.1.104:20579
ESTAB 0 0 192.168.1.102:2380 192.168.1.104:19975
ESTAB 0 0 192.168.1.102:2380 192.168.1.104:15983
ESTAB 0 0 192.168.1.102:2380 192.168.1.104:20296
ESTAB 0 0 192.168.1.102:2380 192.168.1.105:60271
ESTAB 0 0 192.168.1.102:2380 192.168.1.104:18111
ESTAB 0 0 192.168.1.102:2380 192.168.1.105:60273
ESTAB 0 0 192.168.1.102:2380 192.168.1.104:18323
ESTAB 0 0 192.168.1.102:2380 192.168.1.104:20451
ESTAB 0 0 192.168.1.102:2380 192.168.1.104:18217
ESTAB 0 0 192.168.1.102:2380 192.168.1.104:19466
ESTAB 0 0 192.168.1.102:2380 192.168.1.104:15917
ESTAB 0 0 192.168.1.102:2380 192.168.1.104:18318
ESTAB 0 0 192.168.1.102:2380 192.168.1.105:60194
ESTAB 0 0 192.168.1.102:2380 192.168.1.104:20152
ESTAB 0 0 192.168.1.102:2380 192.168.1.103:16269
ESTAB 0 0 192.168.1.102:2380 192.168.1.103:16615
ESTAB 0 0 192.168.1.102:2380 192.168.1.103:17397
ESTAB 0 0 192.168.1.102:2380 192.168.1.103:17397
ESTAB 0 0 192.168.1.102:2380 192.168.1.103:17397
ESTAB 0 0 192.168.1.102:2380 192.168.1.103:17397
ESTAB 0 0 192.168.1.102:2380 192.168.1.103:17397
... (many more connections)

On host 192.168.1.104, I used the 'ss' command to count the connections from the local host to each remote address:
ss |awk '{print $5}'|grep ":2380"|awk -F ":" "{print $1}"|sort |uniq -c|sort
     3 192.168.1.101:2380
     3 192.168.1.102:2380
     3 192.168.1.103:2380
     3 192.168.1.104:2380
     3 192.168.1.100:2380
  3498 192.168.1.106:2380
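
The same per-peer count can be reproduced on saved `ss` output. This is a rough sketch under stated assumptions: `count_peer_connections` is a hypothetical helper, and it expects the five-column layout seen in this issue (State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port). Newer `ss` versions may prepend a Netid column, in which case the field index would need adjusting.

```python
from collections import Counter


def count_peer_connections(ss_lines, port="2380"):
    """Count connections per peer host whose peer port equals `port`.

    Assumes five whitespace-separated columns per connection line:
    State Recv-Q Send-Q Local-Address:Port Peer-Address:Port
    """
    counts = Counter()
    for line in ss_lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        # Split "host:port" on the last colon of the fifth field.
        host, _, peer_port = fields[4].rpartition(":")
        if peer_port == port:
            counts[host] += 1
    return counts
```

Header lines are ignored because their fifth field does not end in the requested port.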

Here there are 3498 connections to 192.168.1.106, which is very confusing!

So my questions are as follows:

  1. Why does a member hold so many repeated connections to another member?
  2. Why does the etcd process use no more than 5G of memory while host memory usage has exceeded 30G? Is this due to the large number of connections?
    @gyuho @xiang90
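
One way to check whether kernel socket buffers contribute to the gap between host memory and etcd process memory is to read `/proc/net/sockstat` on the affected host. The sketch below is only a diagnostic aid and does not establish the cause in this issue. The helper names `parse_sockstat` and `tcp_buffer_bytes` are hypothetical, and the 4096-byte page size is an assumption that should be confirmed on the host (for example with `getconf PAGESIZE`).

```python
def parse_sockstat(text):
    """Parse /proc/net/sockstat text into {protocol: {counter: value}}."""
    stats = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "PROTO: key value ..." line
        tokens = rest.split()
        # Counters appear as alternating "key value" pairs.
        stats[name.strip()] = {
            key: int(value) for key, value in zip(tokens[0::2], tokens[1::2])
        }
    return stats


def tcp_buffer_bytes(text, page_size=4096):
    """Estimate bytes held by TCP socket buffers.

    The TCP "mem" counter is reported in pages, so it is multiplied by
    `page_size` (assumed to be 4096 bytes here).
    """
    return parse_sockstat(text).get("TCP", {}).get("mem", 0) * page_size
```

Calling `tcp_buffer_bytes` on the contents of `/proc/net/sockstat` gives a rough figure to compare against the memory that is not attributed to the etcd process.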
chgitcrazy changed the title from "machine memory highly load due to too many connections?" to "server memory highly load due to too many connections?" on Jun 6, 2018

cfc4n commented Jul 26, 2018

@gyuho Can you close this issue? It is the same problem as #9911, and I recorded the details of why it happened on my blog: https://www.cnxct.com/etcd-lease-keepalive-debug-note/

@gyuho gyuho closed this as completed Jul 27, 2018