
Large snapshots prevent the addition of new managers to the cluster #2374

Closed
nishanttotla opened this issue Sep 15, 2017 · 6 comments

@nishanttotla
Contributor

nishanttotla commented Sep 15, 2017

This issue has been seen in a couple of production clusters and is considered critical. We must fix it in SwarmKit.

Summary

When the raft snapshot becomes larger than 4 MB, adding a new manager to the cluster becomes problematic. This is because the default gRPC message size limit is 4 MB, so sending the snapshot to the newly joining manager fails. As a result, the new manager does not end up with proper cluster state. This can also happen if a manager in an existing cluster falls behind and needs to receive a snapshot from a raft peer.
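For context on where the limit comes from, here is a minimal grpc-go illustration (not SwarmKit code): with no explicit options, both the server and the dialing client cap received messages at 4 MiB, so a snapshot above that size is rejected with a ResourceExhausted error instead of being delivered.

```go
// Minimal sketch, not SwarmKit code: grpc-go's default maximum size for
// received messages is 4 MiB on both ends of a connection.
package main

import "google.golang.org/grpc"

// With no explicit options, incoming messages larger than 4 MiB are
// rejected with codes.ResourceExhausted before the handler sees them.
func newDefaultServer() *grpc.Server {
	return grpc.NewServer()
}

// The dialing side applies the same 4 MiB default to messages it receives,
// so an oversized snapshot fails in either direction.
func dialDefault(addr string) (*grpc.ClientConn, error) {
	return grpc.Dial(addr, grpc.WithInsecure())
}
```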

What Makes the Snapshot Large

Running a large number of services/tasks, possibly connected to many networks, can increase the size of the snapshot. If the task history retention limit is particularly high, a lot of old tasks can stick around, bloating it further. Having a large number of (possibly large) secrets can also cause this problem.

Possible Fixes

There are several possible fixes that have been discussed. Let's use this issue to discuss pros and cons.

  1. Increase the gRPC message size limit to something higher and more reasonable. (How to decide on that limit is unclear.)
  2. Stream the snapshot instead of trying to send it as one gRPC message. (A rough sketch follows below.)
  3. Don't keep task history in the raft log, since it is not as critical. (This may alleviate the problem but not necessarily fix it.)
  4. Compress the snapshot when writing it to disk, and have a new manager decompress it upon reception. (This may alleviate the problem but not necessarily fix it; a sketch of this also follows below.)

We may have to do a combination of these things.
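
To make option 2 concrete, here is a rough sketch of streaming the snapshot in fixed-size chunks and reassembling it on the receiving side; SnapshotChunk, chunkSender, and chunkSize are hypothetical names for illustration, not SwarmKit's actual raft transport API.

```go
// Hedged sketch of option 2: no single gRPC message ever exceeds the limit
// because the snapshot travels as a stream of small chunks.
package main

// SnapshotChunk is a hypothetical wire message carrying one slice of the
// serialized snapshot plus a flag marking the final chunk.
type SnapshotChunk struct {
	Data []byte
	Last bool
}

// chunkSender stands in for the client side of a hypothetical
// streaming snapshot RPC.
type chunkSender interface {
	Send(*SnapshotChunk) error
}

const chunkSize = 1 << 20 // 1 MiB per chunk, well under the 4 MiB default

// streamSnapshot splits the snapshot into chunks and sends them in order;
// the receiver concatenates Data fields until it sees Last.
func streamSnapshot(snapshot []byte, stream chunkSender) error {
	if len(snapshot) == 0 {
		// Still signal completion so the receiver does not wait forever.
		return stream.Send(&SnapshotChunk{Last: true})
	}
	for off := 0; off < len(snapshot); off += chunkSize {
		end := off + chunkSize
		if end > len(snapshot) {
			end = len(snapshot)
		}
		if err := stream.Send(&SnapshotChunk{
			Data: snapshot[off:end],
			Last: end == len(snapshot),
		}); err != nil {
			return err
		}
	}
	return nil
}
```

For option 4, compressing the snapshot bytes is straightforward with the standard library; note that this only shrinks the payload, it does not bound it.

```go
// Hedged sketch of option 4: gzip the serialized snapshot before writing it
// to disk (or sending it), and decompress it on the receiving side.
package main

import (
	"bytes"
	"compress/gzip"
	"io"
)

// compressSnapshot returns the gzip-compressed form of the snapshot bytes.
func compressSnapshot(snapshot []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(snapshot); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decompressSnapshot reverses compressSnapshot on the receiving manager.
func decompressSnapshot(compressed []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}
```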

cc @wsong @anshulpundir @stevvooe @aluzzardi @aaronlehmann @jlhawn

@anshulpundir
Contributor

@wsong and I discussed this; whichever fix we go with should also include better error reporting for the related failure scenarios.

@aaronlehmann
Collaborator

aaronlehmann commented Sep 15, 2017 via email

@anshulpundir
Contributor

anshulpundir commented Sep 15, 2017

3 is really just a better separation of critical vs. non-critical data, IMO. 4 is an optimization. 1 can possibly be done in the short term. 2 is the long-term approach.

@anshulpundir anshulpundir self-assigned this Sep 15, 2017
@anshulpundir
Contributor

anshulpundir commented Sep 15, 2017

Opened #2375 to increase the gRPC message size to 128 MB, in case it is needed.
I haven't had a chance to test it yet, though.
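
For reference, raising the limits with the standard grpc-go options looks roughly like the sketch below; this is an illustration of the approach, not the actual diff in #2375, and the 128 MB value simply mirrors the number above.

```go
// Hedged sketch: raise the gRPC send/receive message size limits on both
// the server and the dialing side so a large snapshot fits in one message.
package main

import "google.golang.org/grpc"

const maxMsgSize = 128 << 20 // 128 MiB, illustrative value

func newManagerServer() *grpc.Server {
	return grpc.NewServer(
		grpc.MaxRecvMsgSize(maxMsgSize),
		grpc.MaxSendMsgSize(maxMsgSize),
	)
}

func dialManager(addr string) (*grpc.ClientConn, error) {
	return grpc.Dial(addr,
		grpc.WithInsecure(),
		grpc.WithDefaultCallOptions(
			grpc.MaxCallRecvMsgSize(maxMsgSize),
			grpc.MaxCallSendMsgSize(maxMsgSize),
		),
	)
}
```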

@anshulpundir
Contributor

The short-term fix has landed. Reducing the priority to P1, since the long-term solution is not as urgent.

@anshulpundir
Contributor

anshulpundir commented Dec 6, 2017

Fixed in #2458
