Implement dynamic cluster configuration #1016

Closed
4 of 5 tasks
jonboulle opened this issue Sep 8, 2014 · 13 comments

@jonboulle
Contributor

Dynamic cluster resizing will be performed manually by cluster operators. Expose the necessary means (e.g. API operations) to do so.

Tracker ticket - work remaining:

@jonboulle jonboulle added this to the v0.5.0-RC milestone Sep 8, 2014
@xiang90
Contributor

xiang90 commented Sep 8, 2014

@jonboulle @unihorn The first thing is to think about the different scenarios in which people need this:
growing or shrinking a cluster for a very solid reason, or replacing dead machines. Then we can design a good API for all of these.

@sbward

sbward commented Sep 11, 2014

Would it be feasible to implement self-configuration when machines are added and removed for any reason (even unexpectedly)? Please see #941. This would make things much easier for operators.

@xiang90
Contributor

xiang90 commented Sep 11, 2014

@sbward Not really. Operators should be involved in every configuration change.

etcd itself should never make a configuration change (we have deprecated the standby mode, which could auto-promote and auto-demote). The reason is that etcd is the source of trust; if there is a problem inside the etcd cluster, the operator must treat it seriously.

In the future, we might provide a way to do self-coordination (by introducing non-voting members, so you can have a 13-node cluster).

@txomon

txomon commented Sep 19, 2014

The idea I commented on in #941 was to make clusters easier to configure.

My idea was to make etcd instances easily configurable between themselves. For example, I had thought about the following use case:

Machine A has etcd running and machine B wants to join A, so B is launched with an env var containing A's address.

After some time, C wants to join A and B, and it is launched with an env var pointing to A. It syncs, and then contact with A is suddenly lost.

Thanks to the new implementation, C would also have B's address and would be able to communicate with it till A joins the group again.

The danger is that running two parallel etcd clusters in this automatic mode has to be done carefully: if two nodes belonging to different clusters happen to connect, the two clusters will automatically merge.

Anyway, I think this is a great feature that, if it could be enabled through a configuration file, would make etcd nearly a droplet application.

@xiang90
Contributor

xiang90 commented Sep 19, 2014

@txomon We will have a bootstrapping and dynamic configuration proposal very soon (within the next week). We hope to get your feedback and make etcd easier to operate. Thanks!

@jonboulle
Contributor Author

Also depends on #1370

@yichengq yichengq modified the milestones: v0.5.0, v0.5.0-alpha Oct 24, 2014
@txomon

txomon commented Oct 31, 2014

Are there any docs on how to use this feature?

@jonboulle
Contributor Author

@txomon

txomon commented Nov 4, 2014

This is really nice!

Feedback: Adding a member seems tedious to me. Is there any idea to simplify the API?

I would try to suppress all the options in the member's launch, meaning that we would be able to add a member to the cluster by just knowing one of the known members. I can't think of a use case where the Error case could be justified or could not be automated. If the cluster doesn't exist, it gets created, with no need to specify that in the etcd launch command.

I would indeed think of that as the default work mode. By this I mean that the only membership change that can't be "assumed" is the deletion of a member, although etcd could send a command to the cluster when it is stopped to announce its stop.

I will try to write down a use case.

@xiang90
Contributor

xiang90 commented Nov 4, 2014

@txomon

Feedback: Adding a member seems tedious to me. Is there any idea to simplify the API?

We could, but we probably will not. We want the dynamic configuration process to be as explicit as possible.

I would try to suppress all the options in the member's launch, meaning that we would be able to add a member to the cluster by just knowing one of the known members.

That will work. Again, we want dynamic configuration to work step by step and to be as explicit as possible. We ask the newly joining member to provide an expected configuration that matches the current cluster configuration, to ensure it joins the cluster it intends to join.
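
The matching check described above can be sketched roughly as follows. This is only an illustration, not etcd's actual implementation: the function names are hypothetical, and the `name=url,name=url` string format is assumed to be the one used for the new member's expected cluster configuration.

```python
from __future__ import annotations


def parse_initial_cluster(value: str) -> dict[str, str]:
    """Parse 'name1=url1,name2=url2' into {name: peer_url} (assumed format)."""
    members: dict[str, str] = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, url = item.partition("=")
        if not sep or not name or not url:
            raise ValueError(f"malformed member entry: {item!r}")
        members[name] = url
    return members


def config_mismatch(expected: str, current_peer_urls: set[str]) -> tuple[set[str], set[str]]:
    """Compare the joining member's expected configuration with the cluster.

    Returns (missing, unexpected): peer URLs the cluster has but the new
    member did not list, and URLs the new member listed but the cluster
    does not have. Both sets are empty when the configurations match.
    """
    expected_urls = set(parse_initial_cluster(expected).values())
    return current_peer_urls - expected_urls, expected_urls - current_peer_urls


if __name__ == "__main__":
    cluster = {"http://10.0.1.10:2380", "http://10.0.1.11:2380"}
    missing, unexpected = config_mismatch("infra0=http://10.0.1.10:2380", cluster)
    if missing or unexpected:
        print("refusing to join: missing", missing, "unexpected", unexpected)
```

A new member whose expected configuration disagrees with the cluster would refuse to start, so the mismatch is visible to the operator.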

We will not make this process simpler by sacrificing explicitness. The reasons are:

  1. Dynamic configuration should happen very seldom, since you can now bootstrap a static multi-member cluster.
  2. When you want to do dynamic configuration, you should have a solid reason and follow the doc step by step. We have helped a lot of people recover their cluster or replace a member in it. Based on our experience, most people expect human involvement in this process, and most want a step-by-step guide in which each step has observable output.
  3. We still give you the flexibility to join a cluster knowing only one of its members. If you want the simplicity, you can write a 10-line script that calls etcdctl first and then passes the environment variables etcdctl gives you to etcd to start the process. That way, users know exactly what happens and can actually control it. We hope our users will use a concrete API rather than let us do the magic. They should be able to create the magic themselves with our API if they want.
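
A script along the lines of point 3 might look like the sketch below. It rests on assumptions: that `etcdctl member add <name> <peerURL>` prints `ETCD_*` variable assignments after a confirmation line, and that a bare `etcd` invocation picks them up from its environment. The helper names are hypothetical, and a real launch would typically need further flags (listen and advertise URLs, data directory).

```python
from __future__ import annotations

import os
import shlex
import subprocess


def parse_member_add_output(output: str) -> dict[str, str]:
    """Collect the ETCD_* assignments from etcdctl's output (assumed format)."""
    env: dict[str, str] = {}
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("ETCD_") or "=" not in line:
            continue  # skip the confirmation message and blank lines
        key, _, raw = line.partition("=")
        parts = shlex.split(raw)  # strips optional surrounding quotes
        env[key] = parts[0] if parts else ""
    return env


def add_and_start(name: str, peer_url: str) -> subprocess.Popen:
    """Register the member through an existing cluster, then start etcd."""
    result = subprocess.run(
        ["etcdctl", "member", "add", name, peer_url],
        check=True, capture_output=True, text=True,
    )
    env = parse_member_add_output(result.stdout)
    # Print each step so the operator can observe what is happening.
    for key, value in sorted(env.items()):
        print(f"{key}={value}")
    return subprocess.Popen(["etcd"], env={**os.environ, **env})
```

Calling `add_and_start("infra3", "http://10.0.1.13:2380")` on the new machine would perform both steps while keeping each one visible, which is the explicitness argued for above.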

@txomon

txomon commented Nov 4, 2014

Ok! It's clear to me now. Anyway, I was thinking on environments where you want to resize the datacenter.

Thanks for the info!

@xiang90
Contributor

xiang90 commented Nov 4, 2014

Anyway, I was thinking on environments where you want to resize the datacenter.

You are probably thinking of running etcd proxies on a lot of machines in the DC. That requires frequent changes. We will have a plan for that. :)

@jonboulle
Contributor Author

The only thing remaining for this is integration tests, which will be covered by #1399 and #1562.


5 participants