
giant memory usage overhead on leader when recovering progress of dead follower(40x) #2662

Closed
yichengq opened this issue Apr 13, 2015 · 12 comments

Comments

@yichengq
Contributor

This brings even more memory pressure than #2657 (20x):

1. start a 3-member cluster
2. keep writing 200k 128-byte keys to the leader randomly for several minutes (a loader sketch follows below)
3. leader uses 700MB+ RES
4. kill one follower
5. keep writing single 128-byte keys to the leader randomly, and wait for ~1 min
6. leader uses 1128MB RES (actual data is 128B * 200k ≈ 25MB)

The overhead is around 40x
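
For anyone who wants to repeat this, a rough loader along the lines of step 2 could look like the sketch below. This is only a sketch: the endpoint, key prefix, and lack of concurrency are assumptions, not part of the original test setup. RES on the leader process can then be read off with `top` or `ps`.

```go
// Hypothetical load generator approximating step 2: it PUTs 200k random keys
// with 128-byte values through the etcd v2 HTTP API.
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

func main() {
	leader := "http://127.0.0.1:2379" // assumed leader client URL
	value := strings.Repeat("x", 128) // 128-byte value

	for i := 0; i < 200000; i++ {
		buf := make([]byte, 8)
		rand.Read(buf) // random key name so writes spread over the key space
		key := hex.EncodeToString(buf)

		body := url.Values{"value": {value}}.Encode()
		req, err := http.NewRequest("PUT",
			fmt.Sprintf("%s/v2/keys/bench/%s", leader, key),
			strings.NewReader(body))
		if err != nil {
			panic(err)
		}
		req.Header.Set("Content-Type", "application/x-www-form-urlencoded")

		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			panic(err)
		}
		resp.Body.Close()
	}
}
```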

@xiang90
Contributor

xiang90 commented Apr 13, 2015

I think this should be made clearer... The reason is that the leader is keeping a pointer to the old snapshot for the follower, since our snapshot is not MVCC. It is not caused by the death of the follower, but by the snapshot sending.
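
To make that concrete, here is a heavily simplified sketch of the situation. The names and types are invented for illustration and do not match the real etcd/raft structures; the point is only that the per-follower progress pins a full serialized copy of the store (the v2 store is not MVCC, so this copy is separate from the live data) until the snapshot has been delivered.

```go
package sketch

// followerProgress is an invented, simplified stand-in for the leader's
// per-follower bookkeeping.
type followerProgress struct {
	match, next     uint64
	pendingSnapshot []byte // full serialized store, pinned until the follower applies it
}

type leader struct {
	progress map[uint64]*followerProgress
}

// sendSnapshot records the snapshot being sent to a slow follower. The copy
// referenced here cannot be garbage-collected until the follower reports it
// has applied the snapshot, which is where the extra resident memory comes from.
func (l *leader) sendSnapshot(to uint64, snap []byte) {
	l.progress[to].pendingSnapshot = snap
}
```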

@xiang90
Contributor

xiang90 commented Apr 13, 2015

@yichengq Also, I cannot reproduce this. If I put 200k 128-byte keys into a 3-member etcd cluster, the memory usage is more than 500MB (not 300MB).

Moreover, it is unclear why I need to

write one 128-byte key 10,000 times
leader uses 600MB RES
kill one follower, while writing one 128-byte key 50,000 times

(I understand you want to trigger a snapshot, but it is unclear to people who are not familiar with the internals.)

@yichengq
Contributor Author

@xiang90 That run restarts from a 3-member cluster that already has 200k keys in the store.

Here is another set of instructions that bootstraps a brand-new cluster:

  1. start a 3-member cluster
  2. keep writing 200k 128-byte keys to the leader randomly for several minutes
  3. leader uses 700MB+ RES
  4. kill one follower
  5. keep writing 200k 128-byte keys to the leader randomly, and wait for ~1 min
  6. leader uses 1128MB RES

@xiang90
Contributor

xiang90 commented Apr 13, 2015

@yichengq Can you update the issue to

1. start a 3-member cluster
2. keep writing 200k 128-byte keys to the leader randomly for several minutes
3. leader uses 700MB+ RES
4. keep writing 200k 128-byte keys to the leader randomly, and wait for ~1 min
5. leader uses 1128MB RES

This is clearer.

@yichengq yichengq changed the title giant memory usage overhead on leader when follower dead(40x) giant memory usage overhead on leader when recovering progress of dead follower(40x) Apr 13, 2015
@garo

garo commented Apr 21, 2015

We might be affected by this, as one of our primary etcd instances takes around 2 GiB of RES memory while the actual amount of stored data is only about 100 KiB.

@yichengq
Contributor Author

@garo Could you share with us anything special that happened, the log of the etcd instance, and the size of the snapshot under $data_dir/member/snap?

@garo

garo commented Apr 21, 2015

@yichengq The snap dir size is 24 MiB. I was using 2.0.5 previously and have just upgraded to 2.0.9 to see if the problem persists, so I don't have the logs anymore. I will update later.

@yichengq
Contributor Author

@garo Our testing was done on the master branch, and the overhead there is 40x. We are wondering how it performs on 2.0.9 too, and I will measure/confirm the number tomorrow.
Does your cluster have a dead member? Does it have many watch operations on it? What is the average size of your keys?
Please open a new issue if you have more data.

@ohlinux

ohlinux commented Jun 10, 2015

@garo @yichengq Any progress? I'm interested in the data.

@xiang90
Contributor

xiang90 commented Dec 30, 2015

@heyitsanthony Perhaps we can test this again with the v3 storage? I assume the memory pressure on the leader would be much lower, since we do not make any copy while sending a snapshot.

@heyitsanthony
Contributor

Ran the benchmark on 3 nodes with V3.

Initialization: 18MB RSS
200K 128-byte keys loaded: 159MB RSS (followers 156MB); took about a minute and a half on a laptop
A minute of random updates to the keys after killing a follower: 174MB RSS (follower 167MB)

Overall, looks good.
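
For reference, an equivalent load against the v3 API can be generated with the clientv3 package; a minimal sketch is below. The endpoints, key names, and the use of sequential rather than random keys are assumptions, and the tools/benchmark utility in the etcd repository can produce a similar workload.

```go
// Hypothetical v3 loader: writes 200k 128-byte values through clientv3.
// The import paths reflect the repository layout at the time (coreos/etcd).
package main

import (
	"fmt"
	"strings"

	"github.com/coreos/etcd/clientv3"
	"golang.org/x/net/context"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints: []string{"127.0.0.1:2379"}, // assumed client URL
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	value := strings.Repeat("x", 128)
	for i := 0; i < 200000; i++ {
		// sequential keys for simplicity; the original test used random keys
		if _, err := cli.Put(context.Background(), fmt.Sprintf("bench/%d", i), value); err != nil {
			panic(err)
		}
	}
}
```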

@xiang90
Contributor

xiang90 commented Dec 30, 2015

@yichengq @heyitsanthony Closing this since v3 looks good and we are not going to improve the v2 store.

@xiang90 xiang90 closed this as completed Dec 30, 2015